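Dirk Englund, Saumil Bandyopadhyay and their research team at the Massachusetts Institute of Technology (MIT) recently demonstrated a single-chip photonic deep neural network with forward-only training. The results were published in the journal Nature Photonics on December 2, 2024.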
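The team realized the system in a scalable photonic integrated circuit that monolithically integrates, on a single chip, multiple coherent optical processor units for matrix algebra and nonlinear activation functions. The researchers experimentally demonstrated this fully integrated coherent optical neural network architecture for a deep neural network with six neurons and three layers. The network computes both linear and nonlinear functions optically with a latency of 410 ps, opening the way to new applications that require ultrafast, direct processing of optical signals.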
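The researchers implemented backpropagation-free in situ training on this system and reached 92.5% accuracy on a six-class vowel classification task, comparable to the accuracy obtained on a digital computer. The work lends experimental evidence to theoretical proposals for in situ training, enabling orders-of-magnitude improvements in the throughput of training data. In addition, the fully integrated coherent optical neural network opens the path to inference at nanosecond latency and femtojoule-per-operation energy efficiency.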
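To illustrate what training with forward passes only can look like, here is a minimal sketch in plain Python. It is not the authors' method: the summary says only that training is backpropagation-free and in situ, so the central-difference gradient estimate used here is a generic stand-in. The real-valued weights, the tanh activation, the synthetic one-sample-per-class data, and all names (`forward`, `loss`, `forward_only_grad`, `lr`, `eps`) are assumptions chosen for illustration. Only the layer count (three) and width (six) follow the text.

```python
import math
import random

random.seed(0)
N = 6  # neurons per layer; three layers, as in the network described above

def matvec(W, x):
    """Multiply an N x N matrix (list of rows) by a length-N vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(params, x):
    """Forward pass: two tanh layers followed by a linear readout (assumed)."""
    W1, W2, W3 = params
    h1 = [math.tanh(v) for v in matvec(W1, x)]
    h2 = [math.tanh(v) for v in matvec(W2, h1)]
    return matvec(W3, h2)

def loss(params, data):
    """Mean squared-error loss over the dataset, using forward passes only."""
    total = 0.0
    for x, t in data:
        y = forward(params, x)
        total += 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, t))
    return total / len(data)

def forward_only_grad(params, data, eps=1e-4):
    """Estimate the gradient by central differences: perturb one weight,
    run the forward pass twice, and never propagate errors backwards."""
    grads = []
    for W in params:
        G = [[0.0] * N for _ in range(N)]
        for i in range(N):
            for j in range(N):
                old = W[i][j]
                W[i][j] = old + eps
                up = loss(params, data)
                W[i][j] = old - eps
                down = loss(params, data)
                W[i][j] = old  # restore the weight
                G[i][j] = (up - down) / (2 * eps)
        grads.append(G)
    return grads

# Synthetic stand-in for a six-class task: one random input per class,
# with a one-hot target. This is NOT the vowel dataset from the paper.
data = []
for c in range(N):
    x = [random.uniform(-1, 1) for _ in range(N)]
    t = [1.0 if k == c else 0.0 for k in range(N)]
    data.append((x, t))

# Three N x N weight matrices with small random initial values.
params = [[[random.uniform(-0.5, 0.5) for _ in range(N)] for _ in range(N)]
          for _ in range(3)]

initial_loss = loss(params, data)
lr = 0.05  # learning rate (assumed)
for step in range(20):
    grads = forward_only_grad(params, data)
    for W, G in zip(params, grads):
        for i in range(N):
            for j in range(N):
                W[i][j] -= lr * G[i][j]
final_loss = loss(params, data)

print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Per-weight finite differences need two forward passes for every parameter, which is affordable only because this toy network has 108 weights. The point of the sketch is that every quantity used for the update comes from evaluating the network in the forward direction.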
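By way of background, deep neural networks have revolutionized machine learning, but energy consumption and throughput are emerging as fundamental limitations of complementary metal-oxide-semiconductor (CMOS) electronics. This has motivated a search for new hardware architectures optimized for artificial intelligence, such as electronic systolic arrays, memristor crossbar arrays and optical accelerators. Optical systems can perform linear matrix operations at an exceptionally high rate and efficiency, which has driven recent demonstrations of low-latency matrix accelerators and optoelectronic image classifiers. However, coherent, ultralow-latency optical processing of deep neural networks has remained an outstanding challenge.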
Attached: original English text
Title: Single-chip photonic deep neural network with forward-only training
Author: Bandyopadhyay, Saumil, Sludds, Alexander, Krastanov, Stefan, Hamerly, Ryan, Harris, Nicholas, Bunandar, Darius, Streshinsky, Matthew, Hochberg, Michael, Englund, Dirk
Issue&Volume: 2024-12-02
Abstract: As deep neural networks revolutionize machine learning, energy consumption and throughput are emerging as fundamental limitations of complementary metal–oxide–semiconductor (CMOS) electronics. This has motivated a search for new hardware architectures optimized for artificial intelligence, such as electronic systolic arrays, memristor crossbar arrays and optical accelerators. Optical systems can perform linear matrix operations at an exceptionally high rate and efficiency, motivating recent demonstrations of low-latency matrix accelerators and optoelectronic image classifiers. However, demonstrating coherent, ultralow-latency optical processing of deep neural networks has remained an outstanding challenge. Here we realize such a system in a scalable photonic integrated circuit that monolithically integrates multiple coherent optical processor units for matrix algebra and nonlinear activation functions into a single chip. We experimentally demonstrate this fully integrated coherent optical neural network architecture for a deep neural network with six neurons and three layers that optically computes both linear and nonlinear functions with a latency of 410 ps, unlocking new applications that require ultrafast, direct processing of optical signals. We implement backpropagation-free in situ training on this system, achieving 92.5% accuracy on a six-class vowel classification task, which is comparable to the accuracy obtained on a digital computer. This work lends experimental evidence to theoretical proposals for in situ training, enabling orders of magnitude improvements in the throughput of training data. Moreover, the fully integrated coherent optical neural network opens the path to inference at nanosecond latency and femtojoule per operation energy efficiency.
DOI: 10.1038/s41566-024-01567-z
Source: https://www.nature.com/articles/s41566-024-01567-z