In late 2015, Andrey Kurenkov wrote a four-part series on deep learning that provides an excellent overview of the field from the 1950s through the early 21st century, with an emphasis on the dramatic improvements since 2006.
Also in 2015, Yann LeCun, Yoshua Bengio, and Geoffrey Hinton published an important academic review article in Nature. They are three of the most well-known names in artificial intelligence research. LeCun divides his time between NYU and Facebook, Bengio teaches at Université de Montréal, and Hinton works at the University of Toronto and Google. Bengio advised Maluuba, an AI startup acquired by Microsoft in 2017. Along with Ian Goodfellow and Aaron Courville, Bengio co-authored my favorite single resource on deep learning.
It’s useful to read Jürgen Schmidhuber’s critique of the 2015 LeCun, Bengio, and Hinton (LBH) article in Nature. Schmidhuber co-directs the Dalle Molle Institute for Artificial Intelligence Research and is another major figure in AI, best known for co-creating the long short-term memory (LSTM) architecture with Sepp Hochreiter. Schmidhuber has been critical of histories that leave out the contributions of other pioneers in AI.
Finally, Haohan Wang, Bhiksha Raj, and Eric P. Xing at Carnegie Mellon University wrote a more advanced perspective on the evolution of deep learning. This 2017 paper requires a bit more mathematical background. I especially appreciate that section 7 explores optimization of neural networks beyond “vanilla backpropagation”, covering techniques such as Rprop, AdaGrad, AdaDelta, and Adaptive Moment Estimation (Adam). It also gives a concise description of the powerful dropout regularization technique. The original dropout paper by Geoffrey Hinton, Nitish Srivastava, et al. was published in 2012 and is already one of the most cited deep learning papers; see the PDF here.
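To make the flavor of those two techniques concrete, here is a minimal NumPy sketch, written for illustration rather than taken from the paper, of a single Adam parameter update and an inverted-dropout mask. The hyperparameter defaults (learning rate 0.001, beta1 0.9, beta2 0.999, keep probability 0.5) are the commonly cited values, and the function names are my own.

```python
import numpy as np

def adam_update(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step for a parameter array (t is the 1-based step count)."""
    m = beta1 * m + (1 - beta1) * grad        # first moment: running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment: running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

def dropout(activations, keep_prob=0.5, training=True):
    """Inverted dropout: zero out random units at train time and rescale the
    survivors by 1/keep_prob, so no extra scaling is needed at inference."""
    if not training:
        return activations
    mask = (np.random.rand(*activations.shape) < keep_prob) / keep_prob
    return activations * mask
```

The rescaling inside `dropout` is what lets the same network run unchanged at test time; the bias-correction terms in `adam_update` keep the moment estimates from being too small during the first few steps.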