CAMBRIDGE, Mass., Oct. 26, 2022 — MIT researchers have developed a method for computing directly on smart home devices that dramatically reduces the latency that can cause such devices to delay a response to a command or an answer to a question. One reason for this delay is that connected devices don’t have enough memory or power to store and run the massive machine learning models required to understand a question. Instead, the question is sent to a data center that can be hundreds of miles away, where an answer is calculated and sent back to the device.
The MIT researchers’ technique moves the memory-intensive steps of running a machine learning model to a central server, where components of the model are encoded onto lightwaves. The waves are transmitted to a connected device via fiber optics, making it possible to send large amounts of data over a network at high speed. The receiver then uses a simple optical device to rapidly perform computations with the parts of the model carried by those light waves.
The technique resulted in a more than 100-fold improvement in energy efficiency compared to other methods. It could also improve security since a user’s data doesn’t have to be transmitted to a central location for computation.
In addition, the method could enable a self-driving car to make real-time decisions while using only a tiny percentage of the energy currently required by power-hungry computers. It could also be used for live video processing over cellular networks, or even enable high-speed image classification on a spacecraft millions of kilometers from Earth.
Senior author Dirk Englund, an associate professor in the Department of Electrical Engineering and Computer Science (EECS) and a member of the MIT Research Laboratory of Electronics, said: “Anytime you want to run a neural network, you have to run the program, and how fast you can run the program depends on how fast you can pipe the program in from memory. Our pipe is massive – it’s roughly equivalent to sending a full feature-length movie across the internet every millisecond or so. That’s how fast data comes into our system. And it can compute that fast.”
A smart transceiver uses silicon photonics technology to dramatically accelerate one of the most memory-intensive steps of running a machine learning model. This enables an edge device, such as a smart home speaker, to perform computations with a more than hundredfold improvement in energy efficiency. Courtesy of Alexander Sludds.
According to lead author and EECS graduate student Alexander Sludds, one of the biggest limiting factors for speed and energy efficiency is the process of retrieving data – in this case, the neural network’s “weights” – from memory and moving it to the parts of a computer that do the actual computation. “So our thought was, why don’t we take all of that heavy lifting – the process of fetching billions of weights from memory – off the edge device and move it somewhere where we have abundant access to power and memory, which gives us the ability to stream those weights to the device quickly?” said Sludds.
To address this data retrieval bottleneck, the team designed a new neural network architecture. Neural networks can contain billions of weight parameters – numeric values that transform input data as it is processed. These weights must be stored in memory. At the same time, the data transformation involves billions of computations, which require a great deal of power to execute.
The neural network architecture the team developed, Netcast, involves storing weights in a central server connected to a smart transceiver. The smart transceiver, a thumb-sized chip that can receive and send data, uses silicon photonics to fetch trillions of weights from memory every second. Weights are received as electrical signals and then encoded onto light waves. Since the weight data is encoded as bits – 1s and 0s – the transceiver converts it by switching lasers: a laser is switched on for a 1 and off for a 0. It combines these light waves and then periodically transmits them over a fiber-optic network, so a client device doesn’t have to poll the server to receive them.
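As a rough illustration of the on-off keying described above, the sketch below quantizes a weight to bits and maps each bit to a laser state (1 → on, 0 → off). The function names and the 8-bit width are illustrative assumptions for this sketch, not details of the Netcast hardware:

```python
def weight_to_bits(weight, n_bits=8):
    """Quantize a weight in [0, 1) to an n-bit unsigned level and
    return its bits, most significant first. (Bit width is an
    illustrative assumption, not the Netcast encoding.)"""
    level = min(int(weight * (2 ** n_bits)), 2 ** n_bits - 1)
    return [(level >> i) & 1 for i in reversed(range(n_bits))]


def bits_to_laser_states(bits):
    """On-off keying: a 1 switches the laser on, a 0 switches it off."""
    return ["on" if b else "off" for b in bits]
```

For example, `weight_to_bits(0.5)` yields `[1, 0, 0, 0, 0, 0, 0, 0]`, which the transceiver would transmit as one laser-on slot followed by seven laser-off slots.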
Once the light waves arrive at the client device, a broadband Mach-Zehnder modulator uses them to perform ultrafast analog computation. Input data from the device, such as sensor information, is encoded onto the weights. The modulator then sends each individual wavelength to a receiver, which detects the light and measures the result of the computation.
The researchers devised a way to tune the modulator so that it performs trillions of multiplications per second, vastly increasing computation speed on the device while consuming only a tiny amount of power.
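Conceptually, the modulator-plus-detector step amounts to an analog multiply-accumulate: each incoming weight is scaled by the locally encoded input, and the photodetector integrates the products into one dot-product result. A minimal numerical sketch of that idea (hypothetical names, plain NumPy, no optics):

```python
import numpy as np

def client_mac(weights, inputs):
    """Multiply-accumulate, in spirit: the modulator multiplies each
    received weight by the client's input value, and the detector
    integrates (sums) the products into one dot-product readout."""
    products = np.asarray(weights, dtype=float) * np.asarray(inputs, dtype=float)
    return products.sum()  # detector integration ~ accumulation
```

For instance, `client_mac([1, 2, 3], [4, 5, 6])` returns `32.0` – one output of a neural network layer, computed as a single stream of multiplications.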
“To make something faster, you have to make it more energy efficient,” Sludds said. “But there is a trade-off. We’ve built a system that can operate on about a milliwatt of power but still perform trillions of multiplications per second. In terms of both speed and energy efficiency, this is a gain of orders of magnitude.”
The researchers tested the architecture by sending weights down an 86 km fiber linking their lab to MIT’s Lincoln Laboratory. Netcast enabled machine learning with high accuracy – 98.7% for image classification and 98.8% for digit recognition – at blistering speeds.
Now the researchers want to iterate on the smart transceiver chip to achieve even better performance. They also want to miniaturize the receiver, currently the size of a shoebox, down to the size of a single chip so that it could fit onto a smart device like a cellphone.
Euan Allen, a Research Fellow from the Royal Academy of Engineering at the University of Bath, who was not involved in this work, said: “Using photonics and light as a platform for computing is a really exciting area of research with potentially huge implications for the speed and energy efficiency of our information technology landscape. The work of Sludds et al. is an exciting step toward real-world implementations of such devices, introducing a new and practical edge-computing scheme and exploring some of the fundamental limits of computation at very low (single-photon) light levels.”
The research is funded in part by NTT Research, the National Science Foundation, the Air Force Office of Scientific Research, the Air Force Research Laboratory, and the Army Research Office.
The study was published in Science (www.doi.org/10.1126/science.abq8271).