These days, most of us think little of grabbing our smartphone or tablet to watch a video, whether we’re bingeing a television series, scrolling through social media, or participating in the virtual gatherings that have kept us all connected during the Covid-19 pandemic. Not so long ago, however, compressing video onto a handheld device presented a considerable challenge—one that Vivienne Sze SM ’06, PhD ’10 took on while earning her PhD at MIT. Now an associate professor at the Institute, Sze has shifted her focus to designing artificial intelligence systems that can interpret video—which could have applications ranging from robotics to health care.
It was a pivotal time for video when, in the early 2000s, Sze became a graduate student in the group of Anantha Chandrakasan (now MIT’s dean of engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science). The first iPhone was several years away, and “the whole idea of being able to watch video on a device that can sit in your pocket was super exciting,” Sze recalls. Yet obstacles remained—most significantly, figuring out how to compress the video so that it was high quality, yet didn’t drain the device’s battery.
To overcome this obstacle, Sze realized that she needed to cross the boundary between hardware and software, simultaneously developing more energy-efficient circuits and designing better algorithms to support those circuits. “If you do the two together, taking into consideration how you make the algorithm friendly for the hardware and how the hardware might affect the algorithm, you can do much better than each on its own,” Sze explains. Ultimately, this approach enabled her to build a system for low-power video compression.
In a bit of fortuitous timing, Sze finished her PhD and went to work at Texas Instruments just as the industry was developing new international standards for video compression—a process that typically happens once a decade. “Basically, the standards involve a bunch of companies coming together and agreeing upon how to compress and decompress videos,” Sze says. This allows video to be transferred across devices. Drawing on her PhD work, Sze ensured that the new standards incorporated hardware-friendly, energy-efficient algorithms. The High Efficiency Video Coding standard that she and her colleagues developed is currently used across television channels and viewing devices—and in 2017 the team won an Engineering Emmy Award from the Television Academy for its work.
As Sze’s time at Texas Instruments wound down, she felt a pull back to academia—so, in 2013, she joined the Electrical Engineering and Computer Science (EECS) faculty at MIT; she was granted tenure in 2020. Now she is interested in a new problem: developing better systems for interpreting the information videos contain—and making such systems as ubiquitous as those she helped develop for video compression. One promising strategy involves deep neural networks, a type of artificial intelligence that can be trained to understand the content of images. Sze’s research centers on designing more-efficient versions of these networks that are capable of parsing video, in conjunction with hardware systems to efficiently process them.
One application of her research is autonomous navigation for low-energy robots, which she is working on with Sertac Karaman SM ’09, PhD ’12, an associate professor in the Department of Aeronautics and Astronautics. As a robot moves around its environment and decides where to go next, it relies on deep neural networks to continuously interpret video of its surroundings. However, these networks must be efficient, or computation will dominate the robot’s energy budget, leaving little power for anything else.
Sze is also collaborating with Thomas Heldt PhD ’04, an associate professor in EECS and the Institute for Medical Engineering and Science, to improve evaluation of patients with neurodegenerative diseases. One way of assessing these patients involves recording their eye movements—so Sze is exploring how to build a program for a smartphone or tablet that can gather this information. The program must contain deep neural networks that are advanced enough to track small, subtle motions, yet energy-efficient enough to run on a handheld device.
“The ultimate goal is to use this technology to enable some of these exciting applications to take off and to be able to deploy them in the real world,” Sze says. “It’s one thing to say, ‘We’re going to do energy-efficient artificial intelligence,’ but how do you actually use it in the context of these applications so they can make an impact on solving challenging, real-world problems? What are the design choices and trade-offs that you can or need to make?”
Beyond her specific research projects, Sze is interested in communicating the fundamental principles of her work to the broader community. To this end, she will teach a two-day online MIT Professional Education course this June: Designing Efficient Deep Learning Systems (registration is open to the public, including MIT alumni). “If we can distill down these principles and teach them, then it’s not just us designing this technology—other folks can also design and use this technology for a broad range of applications,” Sze says. With EECS faculty colleague Joel Emer, she has coauthored a new book on the topic, Efficient Processing of Deep Neural Networks.
Sze is also committed to supporting women in her field through EECS Rising Stars, a workshop for female electrical engineers and computer scientists who are interested in careers in academia. Sze participated in the first workshop, which was held at MIT in 2012, and she has since been a mentor, a panelist, and—in 2018 when the program returned to MIT—a cochair.
What continues to motivate Sze in her work? “Wanting to solve meaningful, high-impact problems, collaborating with really great people, and always learning something new,” she says. “It’s a combination of those three aspects that makes this job great.”
Portrait of Vivienne Sze by Lillie Paquette/MIT School of Engineering.