Machine Learning’s Arc Of History
Focusing on the long arc of the history of machine learning, and not just on what’s happening now, brings this moment in time into sharp relief. Machine learning-enabled artificial intelligence has a long and storied past, particularly when it comes to neural networks, including periods when the phrase “neural networks” was anathema. But unlike fields such as physics, where the key historical figures (say, those responsible for quantum physics) are long gone, ML’s past is still very much extant. Barring some luminaries (Frank Rosenblatt comes to mind; he died in a tragic accident when he was in his 40s), many of the seminal figures in the development of machine learning are not just still around to tell their stories, but are still actively engaged in research.
One of the joys of writing WHY MACHINES LEARN was being able to talk to many of them. Here are some of the researchers I had the pleasure of interviewing:
BERNIE WIDROW
Widrow, along with Ted Hoff, developed the least mean squares (LMS) algorithm, an extremely noisy formulation of stochastic gradient descent, which he used to train his single artificial neurons, called ADALINE (for adaptive linear neuron).
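To give a flavor of how simple the rule is, here is a minimal sketch of a Widrow-Hoff style LMS update for a single linear neuron. The data, learning rate, and function name are illustrative assumptions of mine, not Widrow’s original notation.

```python
import numpy as np

def lms_train(X, y, lr=0.01, epochs=50):
    """Train a single linear neuron with the LMS (Widrow-Hoff) rule.

    Each update uses the error on one example at a time, which is why the
    trajectory is a noisy approximation of gradient descent on the mean
    squared error.
    """
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.01, size=X.shape[1])  # weights
    b = 0.0                                      # bias
    for _ in range(epochs):
        for xi, target in zip(X, y):
            error = target - (xi @ w + b)  # error on this one example
            w += lr * error * xi           # LMS weight update
            b += lr * error                # LMS bias update
    return w, b

# Toy usage: recover y = 2*x1 - 3*x2 from samples
X = np.random.default_rng(1).uniform(-1, 1, size=(200, 2))
y = 2 * X[:, 0] - 3 * X[:, 1]
w, b = lms_train(X, y)
print(w, b)  # w should approach [2, -3], b should approach 0
```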
PETER HART
Hart worked out the rigorous math behind the Cover-Hart k-nearest neighbor rule, one of the seminal machine learning algorithms. The work formed part of his PhD thesis, done with his advisor Thomas Cover, who was barely a few years older than him.
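The rule itself is easy to state: classify a new point by a majority vote among its k closest training examples. Here is a minimal sketch; the Euclidean distance metric and the function name are my illustrative choices, not the notation of the Cover-Hart paper.

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    distances = np.linalg.norm(X_train - x, axis=1)  # distance to every training point
    nearest = np.argsort(distances)[:k]              # indices of the k closest points
    votes = Counter(y_train[i] for i in nearest)     # tally the class labels among them
    return votes.most_common(1)[0][0]                # return the majority label

# Toy usage: two small clusters with labels 0 and 1
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_classify(X_train, y_train, np.array([0.95, 0.9])))  # prints 1
```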
JOHN HOPFIELD
Reluctant to talk at first, Hopfield turned out to be a delightful interviewee. We talked about his turn from physics to neural networks and his design of Hopfield networks, for which he won the 2024 Nobel Prize in Physics.
ISABELLE GUYON
Guyon, together with Vladimir Vapnik and Bernhard Boser, was one of the brains behind support vector machines (SVMs). She did not get enough credit for SVMs during the 1990s, but the community would eventually recognize her contributions.
GEORGE CYBENKO
Cybenko is credited with proving the first universal approximation theorem: a neural network with a single hidden layer, given enough neurons, can approximate any continuous function to any desired accuracy. This and the theorems that followed gave people reason to believe in neural networks.
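In code, the object the theorem talks about is nothing more than a weighted sum of sigmoids applied to affine functions of the input. Here is a minimal sketch of that form; the parameters below are random placeholders, since the theorem only guarantees that suitable parameters exist for a large enough hidden layer; it does not say how to find them.

```python
import numpy as np

def one_hidden_layer(x, W, b, v, c):
    """A single-hidden-layer network of the kind in Cybenko's theorem:
    a weighted sum of sigmoidal units applied to affine maps of the input."""
    hidden = 1.0 / (1.0 + np.exp(-(W @ x + b)))  # one sigmoid activation per hidden neuron
    return v @ hidden + c                        # linear combination of hidden activations

# Placeholder parameters for a net with 3 inputs and 10 hidden neurons
rng = np.random.default_rng(0)
W, b = rng.normal(size=(10, 3)), rng.normal(size=10)
v, c = rng.normal(size=10), 0.0
print(one_hidden_layer(np.array([0.1, -0.2, 0.3]), W, b, v, c))
```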
GEOFFREY HINTON
Hinton is the glue that connects the first AI winter, when almost no one besides a few researchers like him was working on neural networks, to today, when everyone is working on deep nets. Backpropagation, AlexNet, and much else. He is, of course, a Nobel laureate along with Hopfield.
YANN LeCUN
LeCun was Hinton’s postdoc and went on to found his own lab. He designed the first convolutional neural networks and so much more. Again, a seminal figure in the history of modern deep neural networks, and a Turing Award winner along with Hinton and Yoshua Bengio.
ALETHEA POWER
Power’s team at OpenAI stumbled upon and analyzed the phenomenon of grokking: a neural network trained way past the point of interpolation can end up discovering a simpler, general solution; it ‘groks’, or in a sense becomes, the solution.
MISHA BELKIN
Belkin, a mathematician and deep learning theorist, has been trying to make sense of why deep neural networks generalize despite being over-parameterized, and has analyzed, in particular, the so-called double descent behavior of deep nets.