From Rosenblatt to Claude

Getting a modern large language model to generate code to implement and visualize Frank Rosenblatt’s perceptron algorithm is in some way paying tribute to Rosenblatt’s visionary ideas. In 1958, he designed the first learnable artificial neurons and the first single-layer artificial neural networks. LLMs are descendants of those early networks. It’s particularly sweet to get an LLM to help us code/visualize Rosenblatt’s perceptron.

When I started writing WHY MACHINES LEARN, one of the first algorithms I coded for the book was the perceptron algorithm. I designed a simple, interactive user interface that would allow me to select my data points on the 2D X-Y plane, so that I could visualize the algorithm as it tried to find a line separating two clusters of data.

The figures in the book are images generated using the same UI (using the Python plotting library matplotlib). I did all this sometime in late 2020, well before ChatGPT came on the scene, and certainly well before any LLM-based coding assistants such as GitHub Copilot.

But it’s a different world now. As Harvard professor Boaz Barak said in a recent tweet: “Just realized that the next time I teach my ML foundations course, the primary programming language we use will likely be English. (Students will still need to know math, and be able to read model-generated python.)”

I have been thinking along the same lines: creating a Codebook for WHY MACHINES LEARN using code assistants. Interested readers could read about the algorithms and basic mathematical ideas in the book, then prompt an LLM to generate the code and learn how the algorithms work in code. (I used Anthropic’s Claude 3.5 Sonnet, the paid version, but I’m sure there are many open-source models out there that would do the job just as well.)

This post is about the process of generating Python code, so that you can engage with the perceptron algorithm and see it working. Details of Rosenblatt’s work, the history and the math, etc., can be found in the first two chapters of WHY MACHINES LEARN.

Some lessons I learned regarding code generation: It really helps if you know exactly what you want, so that your prompts can be precise. You also need to be reasonably familiar with coding, to be able to understand the coding mistakes made by the LLM, so that you can ask it to correct the errors.

The first thing I did was take one of the images of the perceptron algorithm from the book, which shows a linearly separating hyperplane (in this case a line, as the data is 2D), drop it into Claude’s context window, and give it my first prompt (I find myself being weirdly polite while interacting with an LLM, hence the over-the-top usage of “please”!).

Prompt

Please look at the image provided. Can you write code that does the following:

Provide a matplotlib interactive user interface that allows the user to click on a 2D graph. The first 5 clicks should be used for circles, the second 5 clicks should be used for triangles.

Output of Claude’s code

Claude generated code that worked without any errors. I was able to interact with the UI and select 10 data points, 5 for circles and 5 for triangles. But you can see that the plot doesn’t look exactly like what I asked for. So, I prompted it a little more, to create code that could generate a plot with solid lines for the axes, no bounding box, etc.
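For readers who want a feel for what this kind of code looks like, here is a minimal sketch of a click-driven matplotlib UI. It is my own illustration, not Claude's actual output. The names `marker_for_click` and `run_ui`, the axis limits of -10 to 10, and the marker sizes are all assumptions made for the example.

```python
def marker_for_click(click_index, per_class=5):
    """Map a 0-based click count to a marker: circles first, then triangles.

    Returns None once both classes are full (hypothetical helper).
    """
    if click_index < per_class:
        return "o"   # circle
    if click_index < 2 * per_class:
        return "^"   # triangle
    return None


def run_ui(per_class=5):
    """Open a window, collect clicks, and return them as (x, y, marker) tuples."""
    # Imported here so that marker_for_click can be used without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.set_xlim(-10, 10)  # assumed plotting range
    ax.set_ylim(-10, 10)

    # Solid axes through the origin, no bounding box.
    ax.spines["left"].set_position("zero")
    ax.spines["bottom"].set_position("zero")
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)

    points = []

    def on_click(event):
        # Ignore clicks that land outside the plotting area.
        if event.inaxes is not ax:
            return
        marker = marker_for_click(len(points), per_class)
        if marker is None:
            return  # both classes already have enough points
        points.append((event.xdata, event.ydata, marker))
        ax.plot(event.xdata, event.ydata, marker,
                color="black", markerfacecolor="gray", markersize=10)
        fig.canvas.draw_idle()

    fig.canvas.mpl_connect("button_press_event", on_click)
    plt.show()  # blocks until the window is closed
    return points
```

Calling `run_ui()` opens the window; the first five clicks draw circles and the next five draw triangles. Keeping the click-to-marker rule in its own small function makes it easy to check without opening a window.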

Output of Claude’s code

This is the output after a couple of iterations of simple prompting. Okay, close enough. Ideally, I should have asked Claude to give the circles and triangles a gray “fill”, but I can work with this. So, I gave Claude a new prompt.

Prompt: Great. Now, once the user has finished clicking 10 times and generating the circles and triangles, when the user clicks next, use that input to kick off a perceptron algorithm, to find a straight line that separates the circles from the triangles. Once the perceptron finds the line, please draw the line

Output of Claude’s code

Okay. This was a big change. The code that Claude generated had significantly more functionality than the previous version, which simply let me select the data points. This time, it actually implemented a perceptron algorithm and plotted the linearly separating hyperplane.
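To make the idea concrete, here is a minimal sketch of a perceptron training loop in plain Python. It is not the code Claude produced. It assumes the labels are -1 for one class and +1 for the other, and the function name `train_perceptron`, the `max_epochs` cap, and the weight layout `[bias, w_x, w_y]` are choices made for this illustration.

```python
def train_perceptron(points, labels, max_epochs=1000):
    """Train a 2D perceptron.

    points: list of (x, y) tuples; labels: list of -1 or +1.
    Returns (weights, history, converged), where weights is [bias, w_x, w_y]
    and history holds a copy of the weights after every update.
    """
    w = [0.0, 0.0, 0.0]
    history = []
    for _ in range(max_epochs):
        mistakes = 0
        for (x, y), label in zip(points, labels):
            activation = w[0] + w[1] * x + w[2] * y
            # A point is misclassified when the sign of the activation
            # disagrees with its label (or the point sits on the line).
            if label * activation <= 0:
                # Nudge the weights toward classifying this point correctly.
                w[0] += label
                w[1] += label * x
                w[2] += label * y
                mistakes += 1
                history.append(tuple(w))
        if mistakes == 0:
            # A full pass with no mistakes: the line separates the data.
            return w, history, True
    return w, history, False  # gave up; data may not be linearly separable
```

With labels of -1 and +1, the update is simply "add label times input" to the weights. Note that a label of 0 would make that update zero, so this particular form of the rule would never learn from points in that class. The pass with zero mistakes is the check that the perceptron has found a solution.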

Next, I wanted to visualize the algorithm’s progress as an animation, plotting some of the incorrect hyperplanes along the way and ending with the correct one. Getting this to work took some prompting. Below is the series of prompts that got it to work (I show only the important prompts; I haven’t included the simple ones to do with the look-and-feel of the UI, which aren’t that important).

Prompt: Can you modify the code such that you draw every 3rd line the perceptron finds. Show the wrong lines as gray dotted lines, and the final correct line as a solid, black line. But plot it slowly, so that there is a 1-second delay between the plotting of each line.

Prompt: Something is not right. The code is creating a separate plot for each line. Please don't redraw the plot each time, but use the same canvas. It should seem like an animation.

Prompt: Also, for drawing the line, please use the same fig and ax you use for drawing the circles and triangles. This means your perceptron class will need extra arguments: to take in the fig and ax. Once you have the fig and ax inside the perceptron class, then use the artist to draw the line.

Prompt: Also, the code doesn't have a check to see if the perceptron has found a solution. Modify code to check if the perceptron has found a solution and then terminate the loop.

Prompt: Instead of drawing the perceptron's lines for every 3 iterations, do it for every iteration. Also, make the circles and triangles a little bigger.

Prompt: You removed the 1-second pause between drawing the perceptron's lines. Reintroduce the pause, but keep it to 0.5 seconds.

Prompt: So, everything is great, except for one detail. You have used values of 0 for circles and 1 for triangles, for the classification. The perceptron algorithm requires it to be -1 for circles and 1 for triangles. Can you redo the code with this change?

Prompt: After the perceptron has converged and you have drawn the black solid line, can you turn the entire sequence of lines drawn to convergence into a GIF file?

This is the final GIF generated by the Claude-generated code. The code allows you to select your data points, and it then uses the perceptron algorithm to find a line that separates the circles from the triangles.
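For the curious, here is one way the animation and GIF steps could be sketched. Again, this is an illustration rather than Claude's code. It assumes a list of weight triples `(bias, w_x, w_y)`, one per perceptron update, and it assumes the Pillow library is installed so that matplotlib's `PillowWriter` can write the GIF. The names `line_endpoints` and `save_gif` are hypothetical.

```python
def line_endpoints(w, x_min=-10.0, x_max=10.0):
    """Endpoints of the line bias + w_x*x + w_y*y = 0 between x_min and x_max.

    Returns None when w_y is 0, since the line is then vertical (or undefined)
    and cannot be written as y in terms of x.
    """
    bias, w_x, w_y = w
    if w_y == 0:
        return None
    return [(x, -(bias + w_x * x) / w_y) for x in (x_min, x_max)]


def save_gif(history, path="perceptron.gif", interval_ms=500):
    """Draw each line in history on one canvas and save the frames as a GIF."""
    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation, PillowWriter

    fig, ax = plt.subplots()
    ax.set_xlim(-10, 10)  # assumed plotting range
    ax.set_ylim(-10, 10)

    def draw(i):
        ends = line_endpoints(history[i])
        if ends is None:
            return  # this sketch skips vertical lines
        (x0, y0), (x1, y1) = ends
        is_final = i == len(history) - 1
        # Wrong lines are gray and dotted; the final line is solid black.
        ax.plot([x0, x1], [y0, y1],
                color="black" if is_final else "gray",
                linestyle="-" if is_final else ":")

    anim = FuncAnimation(fig, draw, frames=len(history),
                         interval=interval_ms, repeat=False)
    anim.save(path, writer=PillowWriter(fps=1000 / interval_ms))
    plt.close(fig)
```

Reusing a single `fig` and `ax` for every frame is what makes the output look like an animation instead of a stack of separate plots.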

For readers of WHY MACHINES LEARN: I’ll be writing a series of blog posts detailing my attempts to generate code using Claude or, preferably, some open-source code assistant. I think it’s a great way to learn the conceptual and mathematical basics of machine learning, which is the subject of WHY MACHINES LEARN, and also to learn how to use code assistants, inspect the generated code, and understand HOW the machines work by seeing and coding the algorithms at work.
