Samborska, V. Scaling up: how increasing inputs has made artificial intelligence more capable. Our World in Data https://ourworldindata.org/scaling-up-ai (2025).

Sebastian, A., Le Gallo, M., Khaddam-Aljameh, R. & Eleftheriou, E. Memory devices and applications for in-memory computing. Nat. Nanotechnol. 15, 529–544 (2020).

Wetzstein, G. et al. Inference in artificial intelligence with deep optics and photonics. Nature 588, 39–47 (2020).

Wright, L. G. et al. Deep physical neural networks trained with backpropagation. Nature 601, 549–555 (2022).

Tanaka, G. et al. Recent advances in physical reservoir computing: a review. Neural Netw. 115, 100–123 (2019).

Hughes, T. W., Williamson, I. A., Minkov, M. & Fan, S. Wave physics as an analog recurrent neural network. Sci. Adv. 5, eaay6946 (2019).

Onodera, T. et al. Scaling on-chip photonic neural processors using arbitrarily programmable wave propagation. Preprint at https://arxiv.org/abs/2402.17750 (2024).

Momeni, A., Rahmani, B., Malléjac, M., del Hougne, P. & Fleury, R. Backpropagation-free training of deep physical neural networks. Science 382, 1297–1303 (2023).

Xu, Z. et al. Large-scale photonic chiplet Taichi empowers 160-TOPS/W artificial general intelligence. Science 384, 202–209 (2024).

Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).

Lin, X. et al. All-optical machine learning using diffractive deep neural networks. Science 361, 1004–1008 (2018).

Le Gallo, M. et al. A 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference. Nat. Electron. 6, 680–693 (2023).

Chen, Z. et al. Deep learning with coherent VCSEL neural networks. Nat. Photon. 17, 723–730 (2023).

Mengu, D. et al. Misalignment resilient diffractive optical networks. Nanophotonics 9, 4207–4219 (2020).

Matsushima, K. & Shimobaba, T. Band-limited angular spectrum method for numerical simulation of free-space propagation in far and near fields. Opt. Express 17, 19662–19673 (2009).

Launay, J., Poli, I., Boniface, F. & Krzakala, F. Direct feedback alignment scales to modern deep learning tasks and architectures. Adv. Neural Inf. Process. Syst. 33, 9346–9360 (2020).


Cramer, B. et al. Surrogate gradients for analog neuromorphic computing. Proc. Natl Acad. Sci. 119, e2109194119 (2022).

Spall, J., Guo, X. & Lvovsky, A. I. Hybrid training of optical neural networks. Optica 9, 803–811 (2022).

Lillicrap, T. P., Cownden, D., Tweed, D. B. & Akerman, C. J. Random synaptic feedback weights support error backpropagation for deep learning. Nat. Commun. 7, 13276 (2016).

Brunton, S. L. & Kutz, J. N. Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control (Cambridge Univ. Press, 2022).

Hinton, G. The forward-forward algorithm: some preliminary investigations. Preprint at https://arxiv.org/abs/2212.13345 (2022).

Laydevant, J., Lott, A., Venturelli, D. & McMahon, P. L. The benefits of self-supervised learning for training physical neural networks. In Proc. First Workshop on Machine Learning with New Compute Paradigms at NeurIPS 2023 (MLNPCP 2023) https://openreview.net/forum?id=Fik4cO7FXd (OpenReview, 2023).

Refinetti, M., d’Ascoli, S., Ohana, R. & Goldt, S. Align, then memorise: the dynamics of learning with feedback alignment. In Proc. 38th International Conference on Machine Learning, 8925–8935 (MLR Press, 2021).

Lillicrap, T. P., Cownden, D., Tweed, D. B. & Akerman, C. J. Random feedback weights support learning in deep neural networks. Preprint at https://arxiv.org/abs/1411.0247 (2014).

Launay, J. et al. Hardware beyond backpropagation: a photonic co-processor for direct feedback alignment. Preprint at https://arxiv.org/abs/2012.06373 (2020).

Nakajima, M. et al. Physical deep learning with biologically inspired training method: gradient-free approach for physical hardware. Nat. Commun. 13, 7847 (2022).

Hinton, G. E., Dayan, P., Frey, B. J. & Neal, R. M. The “wake-sleep” algorithm for unsupervised neural networks. Science 268, 1158–1161 (1995).

Löwe, S., O’Connor, P. & Veeling, B. Putting an end to end-to-end: gradient-isolated learning of representations. In Proc. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), 3039–3051 (ACM, 2019).

Nøkland, A. & Eidnes, L. H. Training neural networks with local error signals. In Proc. 36th International Conference on Machine Learning, 4839–4850 (MLR Press, 2019).

Siddiqui, S. A., Krueger, D., LeCun, Y. & Deny, S. Blockwise self-supervised learning at scale. Preprint at https://arxiv.org/abs/2302.01647v1 (2023).

Oguz, I. et al. Forward–forward training of an optical neural network. Opt. Lett. 48, 5249–5252 (2023).

Xue, Z. et al. Fully forward mode training for optical neural networks. Nature 632, 280–286 (2024).

Spall, J. C. Multivariate stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Trans. Autom. Control 37, 332–341 (1992).

McCaughan, A. N. et al. Multiplexed gradient descent: fast online training of modern datasets on hardware neural networks without backpropagation. APL Mach. Learn. 1, 026118 (2023).

Bandyopadhyay, S. et al. Single-chip photonic deep neural network with forward-only training. Nat. Photon. 18, 1335–1343 (2024).

Oguz, I. et al. Programming nonlinear propagation for efficient optical learning machines. Adv. Photonics 6, 016002 (2024).

Skalli, A. et al. Annealing-inspired training of an optical neural network with ternary weights. Commun. Phys. 8, 68 (2025).

Bueno, J. et al. Reinforcement learning in a large-scale photonic recurrent neural network. Optica 5, 756–760 (2018).

Kanno, K., Naruse, M. & Uchida, A. Adaptive model selection in photonic reservoir computing by reinforcement learning. Sci. Rep. 10, 10062 (2020).

Hermans, M., Burm, M., Van Vaerenbergh, T., Dambre, J. & Bienstman, P. Trainable hardware for dynamical computing using error backpropagation through physical media. Nat. Commun. 6, 6729 (2015).

Burr, G. W. et al. Neuromorphic computing using non-volatile memory. Adv. Phys. X 2, 034092 (2017).


Pai, S. et al. Experimentally realized in situ backpropagation for deep learning in photonic neural networks. Science 380, 398–404 (2023).

Morichetti, F. et al. Non-invasive on-chip light observation by contactless waveguide conductivity monitoring. IEEE J. Sel. Top. Quantum Electron. 20, 292–301 (2014).

Zhou, T. et al. In situ optical backpropagation training of diffractive optical neural networks. Photonics Res. 8, 940–953 (2020).

Guo, X., Barrett, T. D., Wang, Z. M. & Lvovsky, A. Backpropagation through nonlinear units for the all-optical training of neural networks. Photonics Res. 9, B71–B80 (2021).

Wanjura, C. C. & Marquardt, F. Fully nonlinear neuromorphic computing with linear wave scattering. Nat. Phys. 20, 1434–1440 (2024).

Yildirim, M., Dinc, N. U., Oguz, I., Psaltis, D. & Moser, C. Nonlinear processing with linear optics. Nat. Photon. 18, 1076–1082 (2024).

Xia, F. et al. Nonlinear optical encoding enabled by recurrent linear scattering. Nat. Photon. 18, 1067–1075 (2024).

Scellier, B. & Bengio, Y. Equilibrium propagation: bridging the gap between energy-based models and backpropagation. Front. Comput. Neurosci. 11, 24 (2017).

Ackley, D. H., Hinton, G. E. & Sejnowski, T. J. A learning algorithm for Boltzmann machines. Cogn. Sci. 9, 147–169 (1985).


Stern, M., Hexner, D., Rocks, J. W. & Liu, A. J. Supervised learning in physical networks: from machine learning to learning machines. Phys. Rev. X 11, 021045 (2021).

Scellier, B., Ernoult, M., Kendall, J. & Kumar, S. Energy-based learning algorithms for analog computing: a comparative study. In Proc. 37th International Conference on Neural Information Processing Systems (NIPS ’23), 52705–52731 (ACM, 2023).

Kendall, J., Pantone, R., Manickavasagam, K., Bengio, Y. & Scellier, B. Training end-to-end analog neural networks with equilibrium propagation. Preprint at https://arxiv.org/abs/2006.01981 (2020).

Wang, Q., Wanjura, C. C. & Marquardt, F. Training coupled phase oscillators as a neuromorphic platform using equilibrium propagation. Neuromorph. Comput. Eng. 4, 034014 (2024).

Yi, S.-i, Kendall, J. D., Williams, R. S. & Kumar, S. Activity-difference training of deep neural networks using memristor crossbars. Nat. Electron. 6, 45–51 (2023).


Laydevant, J., Marković, D. & Grollier, J. Training an Ising machine with equilibrium propagation. Nat. Commun. 15, 3671 (2024).

Altman, L. E., Stern, M., Liu, A. J. & Durian, D. J. Experimental demonstration of coupled learning in elastic networks. Phys. Rev. Appl. 22, 024053 (2024).

Dillavou, S., Stern, M., Liu, A. J. & Durian, D. J. Demonstration of decentralized physics-driven learning. Phys. Rev. Appl. 18, 014040 (2022).

Dillavou, S. et al. Machine learning without a processor: emergent learning in a nonlinear analog network. Proc. Natl Acad. Sci. 121, e2319718121 (2024).

Stern, M., Dillavou, S., Jayaraman, D., Durian, D. J. & Liu, A. J. Training self-learning circuits for power-efficient solutions. APL Mach. Learn. 2, 016114 (2024).

Anisetti, V. R., Kandala, A., Scellier, B. & Schwarz, J. Frequency propagation: multimechanism learning in nonlinear physical networks. Neural Comput. 36, 596–620 (2024).

Murugan, A., Strupp, A., Scellier, B. & Falk, M. Contrastive learning through non-equilibrium memory. In APS March Meeting Abstracts 2023, F02.005 (APS, 2023).

Laborieux, A. & Zenke, F. Holomorphic equilibrium propagation computes exact gradients through finite size oscillations. In Proc. 36th International Conference on Neural Information Processing Systems (NIPS ’22), 12950–12963 (ACM, 2022).

Scellier, B., Mishra, S., Bengio, Y. & Ollivier, Y. Agnostic physics-driven deep learning. Preprint at https://arxiv.org/abs/2205.15021 (2022).

Lopez-Pastor, V. & Marquardt, F. Self-learning machines based on Hamiltonian echo backpropagation. Phys. Rev. X 13, 031020 (2023).

Touvron, H. et al. LLaMA: open and efficient foundation language models. Preprint at https://arxiv.org/abs/2302.13971 (2023).

Chowdhery, A. et al. PaLM: scaling language modeling with pathways. J. Mach. Learn. Res. 24, 1–113 (2023).


Achiam, J. et al. GPT-4 technical report. Preprint at https://arxiv.org/abs/2303.08774v1 (2023).

Gemini Team. Gemini: a family of highly capable multimodal models. Preprint at https://arxiv.org/abs/2312.11805v1 (2024).

Radford, A. et al. Learning transferable visual models from natural language supervision. In Proc. 38th International Conference on Machine Learning, 8748–8763 (MLR Press, 2021).

Liu, H., Li, C., Wu, Q. & Lee, Y. J. Visual instruction tuning. In Proc. 37th Conference on Neural Information Processing Systems (NeurIPS 2023) https://openreview.net/forum?id=w0H2xGHlkw (OpenReview, 2023).

Radford, A. et al. Language models are unsupervised multitask learners. OpenAI Blog 1, 9 (2019).


Katharopoulos, A., Vyas, A., Pappas, N. & Fleuret, F. Transformers are RNNs: fast autoregressive transformers with linear attention. In Proc. 37th International Conference on Machine Learning, 5156–5165 (MLR Press, 2020).

Gu, A. & Dao, T. Mamba: linear-time sequence modeling with selective state spaces. Preprint at https://arxiv.org/abs/2312.00752v1 (2023).

Wang, H. et al. BitNet: scaling 1-bit transformers for large language models. Preprint at https://arxiv.org/abs/2310.11453 (2023).

Hu, E. J. et al. LoRA: low-rank adaptation of large language models. Preprint at https://arxiv.org/abs/2106.09685 (2021).

Dao, T., Fu, D., Ermon, S., Rudra, A. & Ré, C. FlashAttention: fast and memory-efficient exact attention with IO-awareness. In Proc. 36th Conference on Neural Information Processing Systems (NeurIPS 2022) 35, 16344–16359 (ACM, 2022).

Juravsky, J. et al. Hydragen: high-throughput LLM inference with shared prefixes. Preprint at https://arxiv.org/abs/2402.05099 (2024).

Anderson, M. G., Ma, S.-Y., Wang, T., Wright, L. G. & McMahon, P. L. Optical transformers. Preprint at https://arxiv.org/abs/2302.10360 (2023).

Shen, Y. et al. Deep learning with coherent nanophotonic circuits. Nat. Photon. 11, 441–446 (2017).

Hamerly, R., Bernstein, L., Sludds, A., Soljačić, M. & Englund, D. Large-scale optical neural networks based on photoelectric multiplication. Phys. Rev. X 9, 021032 (2019).

Tait, A. N. Quantifying power in silicon photonic neural networks. Phys. Rev. Appl. 17, 054029 (2022).

Laydevant, J., Wright, L. G., Wang, T. & McMahon, P. L. The hardware is the software. Neuron 112, 180–183 (2024).

Hooker, S. The hardware lottery. Commun. ACM 64, 58–65 (2021).

Stroev, N. & Berloff, N. G. Analog photonics computing for information processing, inference, and optimization. Adv. Quantum Technol. 6, 2300055 (2023).

Cerezo, M., Verdon, G., Huang, H.-Y., Cincio, L. & Coles, P. J. Challenges and opportunities in quantum machine learning. Nat. Comput. Sci. 2, 567–576 (2022).

Kashif, M. & Shafique, M. HQNET: harnessing quantum noise for effective training of quantum neural networks in NISQ era. Preprint at https://arxiv.org/abs/2402.08475v1 (2024).

Zhou, M.-G. et al. Quantum neural network for quantum neural computing. Research 6, 0134 (2023).

Tian, J. et al. Recent advances for quantum neural networks in generative learning. IEEE Trans. Pattern. Anal. Mach. Intell. 45, 12321–12340 (2023).

Cerezo, M. et al. Variational quantum algorithms. Nat. Rev. Phys. 3, 625–644 (2021).

Niazi, S. et al. Training deep Boltzmann networks with sparse Ising machines. Nat. Electron. 7, 610–619 (2024).

Ma, S.-Y., Wang, T., Laydevant, J., Wright, L. G. & McMahon, P. L. Quantum-limited stochastic optical neural networks operating at a few quanta per activation. Nat. Commun. 16, 359 (2025).

Pierangeli, D., Marcucci, G., Brunner, D. & Conti, C. Noise-enhanced spatial-photonic Ising machine. Nanophotonics 9, 4109–4116 (2020).

McMahon, P. L. The physics of optical computing. Nat. Rev. Phys. 5, 717–734 (2023).

Keeling, J. & Berloff, N. G. Exciton–polariton condensation. Contemp. Phys. 52, 131–151 (2011).

Berloff, N. G. et al. Realizing the classical XY Hamiltonian in polariton simulators. Nat. Mater. 16, 1120–1126 (2017).

Johnston, A. & Berloff, N. G. Macroscopic noise amplification by asymmetric dyads in non-Hermitian optical systems for generative diffusion models. Phys. Rev. Lett. 132, 096901 (2024).

Wang, T. et al. Image sensing with multilayer nonlinear optical neural networks. Nat. Photon. 17, 408–415 (2023).

Zhou, F. & Chai, Y. Near-sensor and in-sensor computing. Nat. Electron. 3, 664–671 (2020).

del Hougne, P., Imani, M. F., Diebold, A. V., Horstmeyer, R. & Smith, D. R. Learned integrated sensing pipeline: reconfigurable metasurface transceivers as trainable physical layer in an artificial neural network. Adv. Sci. 7, 1901913 (2020).

Vaswani, A. et al. Attention is all you need. In Proc. 31st International Conference on Neural Information Processing Systems (NIPS ’17), 6000–6010 (ACM, 2017).

Wu, C. et al. Harnessing optoelectronic noises in a photonic generative network. Sci. Adv. 8, eabm2956 (2022).

Bonnet, D. et al. Bringing uncertainty quantification to the extreme-edge with memristor-based Bayesian neural networks. Nat. Commun. 14, 7530 (2023).

Olin-Ammentorp, W., Beckmann, K., Schuman, C. D., Plank, J. S. & Cady, N. C. Stochasticity and robustness in spiking neural networks. Neurocomputing 419, 23–36 (2021).
