# Reproduction of one of the first Convolutional Neural Networks
In 1989, Yann LeCun and his team at AT&T Bell Laboratories published a landmark paper in the history of deep learning. They demonstrated that a neural network could learn to recognize handwritten digits directly from raw pixels, without any hand-crafted features. The network was trained on zip code digits taken from real U.S. Postal Service mail and was used to read zip codes automatically. Its architecture became the foundation of the Convolutional Neural Networks (CNNs) that dominate modern computer vision.
| Layer | Type | Output | Params |
|---|---|---|---|
| Input | Image | 1×16×16 | — |
| H1 | Conv 5×5, stride 2 | 12×8×8 | 1,068 |
| H2 | Sparse Conv 5×5, stride 2 | 12×4×4 | 2,592 |
| H3 | Fully Connected | 30 | 5,790 |
| Output | Fully Connected | 10 | 310 |
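To make the H1 row concrete, here is a minimal sketch (illustrative, not the repo's actual code) of one feature map: a 5×5 kernel slides over the 16×16 input with stride 2 and a tanh squashing nonlinearity. Assuming, as in the paper's setup, that pixels outside the image are treated as background (-1), padding by 2 on each side yields the 8×8 output in the table.

```javascript
// One H1-style feature map: 5x5 convolution, stride 2, tanh activation.
// The 16x16 input is padded by 2 with the background value -1 (an assumption
// matching the paper's [-1, +1] pixel encoding), giving an 8x8 output.
function convMap(input, kernel, bias, stride = 2, pad = 2) {
  const n = input.length;   // 16
  const k = kernel.length;  // 5

  // Pad with -1 (background) on all sides.
  const padded = [];
  for (let i = 0; i < n + 2 * pad; i++) {
    padded.push([]);
    for (let j = 0; j < n + 2 * pad; j++) {
      const y = i - pad, x = j - pad;
      padded[i].push(y >= 0 && y < n && x >= 0 && x < n ? input[y][x] : -1);
    }
  }

  // Slide the kernel with the given stride.
  const outSize = Math.floor((n + 2 * pad - k) / stride) + 1; // 8
  const out = [];
  for (let i = 0; i < outSize; i++) {
    out.push([]);
    for (let j = 0; j < outSize; j++) {
      let sum = bias;
      for (let ki = 0; ki < k; ki++)
        for (let kj = 0; kj < k; kj++)
          sum += kernel[ki][kj] * padded[i * stride + ki][j * stride + kj];
      out[i].push(Math.tanh(sum)); // squashing nonlinearity
    }
  }
  return out;
}
```

With a 16×16 input, any 5×5 kernel produces an 8×8 map; twelve such maps give the 12×8×8 H1 output.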
Total: only 9,760 parameters, tiny compared to modern models with millions to billions of parameters.
| Metric | Paper (1989) | Karpathy (PyTorch) | Ours (Node.js) |
|---|---|---|---|
| Train loss | 2.5e-3 | 4.07e-3 | 4.85e-3 |
| Train error | 0.14% | 0.62% | 0.86% |
| Test loss | 1.8e-2 | 2.84e-2 | 3.02e-2 |
| Test error | 5.0% | 4.09% | 4.19% |
| Test misses | 102 | 82 | 84 |
Small differences are expected due to different datasets (MNIST vs. original zip codes) and different PRNG implementations.
The network is reproduced in pure Node.js with no ML frameworks: convolution, backpropagation, and SGD are all written from scratch. After training for 23 epochs (~27 seconds), the learned weights are saved to JSON. The browser then runs the forward pass directly: your drawing is resized to 16×16 pixels, normalized to [-1, +1], and fed through the four-layer CNN exactly as described in the original paper. Prediction happens in real time in the browser, with no server round-trip.
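The browser-side normalization step can be sketched as follows (an illustrative helper, not the repo's exact code, assuming grayscale values in [0, 255] read from the resized 16×16 canvas):

```javascript
// Map grayscale pixel values [0, 255] linearly onto [-1, +1],
// the input range the 1989 network expects.
function normalize(gray) {
  // gray: Uint8ClampedArray or number[] of length 256 (16x16)
  return Array.from(gray, (v) => (v / 255) * 2 - 1);
}
```

In the browser, the `gray` array would typically come from `canvas.getContext("2d").getImageData(...)` after drawing the user's sketch scaled down to 16×16.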