From Rosenblatt to Claude
Getting a modern large language model to generate code that implements and visualizes Frank Rosenblatt’s perceptron algorithm is, in a way, a tribute to Rosenblatt’s visionary ideas. In 1958, he designed the first learnable artificial neurons and the first single-layer artificial neural networks. LLMs are descendants of those early networks, which makes it particularly sweet to get an LLM to help us code and visualize Rosenblatt’s perceptron.
When I started writing WHY MACHINES LEARN, one of the first algorithms I coded for the book was the perceptron algorithm. I designed a simple, interactive user interface that would allow me to select my data points on the 2D X-Y plane, so that I could visualize the algorithm as it tried to find a line separating two clusters of data.
The figures in the book are images generated with the same UI, which is built on the Python plotting library matplotlib. I did all this sometime in late 2020, well before ChatGPT came on the scene, and certainly well before any LLM-based coding assistants such as Copilot.
But it’s a different world now. As Harvard professor Boaz Barak said in a recent tweet: “Just realized that the next time I teach my ML foundations course, the primary programming language we use will likely be English. (Students will still need to know math, and be able to read model-generated python.)”
I have been thinking along the same lines: creating a Codebook for WHY MACHINES LEARN using code assistants. Interested readers could read about the algorithms and basic mathematical ideas in WHY MACHINES LEARN, then prompt an LLM to generate the code and learn how the algorithms work in code. (I used Anthropic’s Claude 3.5 Sonnet, the paid version, but I’m sure there are many open-source models out there that would do the job just as well.)
This post is about the process of generating Python code, so that you can engage with the perceptron algorithm and see it working. Details of Rosenblatt’s work, the history and the math, etc., can be found in the first two chapters of WHY MACHINES LEARN.
Some lessons I learned regarding code generation: It really helps if you know exactly what you want, so that your prompts can be precise. You also need to be reasonably familiar with coding, to be able to understand the coding mistakes made by the LLM, so that you can ask it to correct the errors.
The first thing I did was take one of the images of the perceptron algorithm from the book, which shows a linearly separating hyperplane (in this case a line, as the data is 2D), drop it into Claude’s context window, and give it my first prompt. (I find myself being weirdly polite while interacting with an LLM, hence the over-the-top usage of “please”!)
Prompt: Can you modify the code such that you draw every 3rd line the perceptron finds. Show the wrong lines as gray dotted lines, and the final correct line as a solid, black line. But plot it slowly, so that there is a 1-second delay between the plotting of each line.
Prompt: Something is not right. The code is creating a separate plot for each line. Please don't redraw the plot each time, but use the same canvas. It should seem like an animation.
Prompt: Also, for drawing the line, please use the same fig and ax you use for drawing the circles and triangles. This means your perceptron class will need extra arguments: to take in the fig and ax. Once you have the fig and ax inside the perceptron class, then use the artist to draw the line.
Prompt: Also, the code doesn't have a check to see if the perceptron has found a solution. Modify code to check if the perceptron has found a solution and then terminate the loop.
Prompt: Instead of drawing the perceptron's lines for every 3 iterations, do it for every iteration. Also, make the circles and triangles a little bigger.
Prompt: You removed the 1-second pause between drawing the perceptron's lines. Reintroduce the pause, but keep it to 0.5 seconds.
Prompt: So, everything is great, except for one detail. You have used values of 0 for circles and 1 for triangles, for the classification. The perceptron algorithm requires it to be -1 for circles and 1 for triangles. Can you redo the code with this change?
Prompt: After the perceptron has converged and you have drawn the black solid line, can you turn the entire sequence of lines drawn to convergence into a GIF file?
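The prompts above describe the behavior I wanted rather than the code itself, so here is a minimal sketch of the training loop they add up to. It is not the code Claude generated; it is a plain-Python illustration that assumes 2D points, labels of -1 for circles and 1 for triangles, and a stop as soon as a full pass over the data produces no mistakes. The function name train_perceptron, the max_epochs cap, and the toy data are all made up for this example.

```python
# Minimal perceptron sketch (pure Python, no plotting).
# Assumption: labels are -1 for circles and +1 for triangles.

def train_perceptron(points, labels, max_epochs=1000):
    """Return (w, b, history, converged).

    history holds one (w1, w2, b) tuple per update, i.e. every
    intermediate line w1*x + w2*y + b = 0 the perceptron tried.
    """
    w = [0.0, 0.0]
    b = 0.0
    history = []
    for _ in range(max_epochs):
        mistakes = 0
        for (x, y), label in zip(points, labels):
            # A point is misclassified when label * (w.x + b) <= 0.
            if label * (w[0] * x + w[1] * y + b) <= 0:
                # Perceptron update: nudge the line toward the point's side.
                w[0] += label * x
                w[1] += label * y
                b += label
                history.append((w[0], w[1], b))
                mistakes += 1
        if mistakes == 0:
            # A full pass with no mistakes: a separating line was found.
            return w, b, history, True
    # Gave up; the data may not be linearly separable.
    return w, b, history, False


# Hypothetical toy data: two linearly separable clusters.
points = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0),
          (4.0, 4.5), (5.0, 4.0), (4.5, 5.0)]
labels = [-1, -1, -1, 1, 1, 1]

w, b, history, converged = train_perceptron(points, labels)
print("converged:", converged)
print("lines tried:", len(history))
print("final line: %.1f*x + %.1f*y + %.1f = 0" % (w[0], w[1], b))
```

Each entry in history is one of the “wrong” lines from the prompts, and the last entry is the final separating line. A plotting loop could draw these one at a time on a single matplotlib Axes, pausing with plt.pause(0.5) between them, which is roughly what the animation prompts ask for.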
For readers of WHY MACHINES LEARN: I’ll be writing a series of blog posts detailing my attempts to generate code using Claude or, preferably, some open-source code assistant. I think it’s a great way to learn the conceptual and mathematical basics of machine learning, which is the subject of WHY MACHINES LEARN. It is also a way to learn how to use code assistants, inspect the generated code, and understand HOW the machines work, by coding the algorithms and seeing them at work.