1. The Goal: Reading Handwriting
NeuralDigit is a web-based application that recognizes handwritten digits (0 to 9) drawn
directly onto a browser canvas. It serves as an interactive demonstration of machine learning.
When you draw on the canvas, the tool determines which digit your pattern of pixels
most closely resembles.
2. Data Preparation (Preprocessing)
Before the neural network can examine the drawing, the raw canvas data must be converted into the
format the model expects, which mirrors the dataset it was trained on (the well-known MNIST
dataset of handwritten digits).
- Bounding Box: The app finds the exact edges of your drawing, ignoring empty space.
- Resizing: Your drawing is shrunk down and centered precisely into a 28x28 pixel
grid.
- Flattening: The grid of 28x28 pixels is "flattened" row by row into a single
one-dimensional array of exactly 784 numerical values ranging from 0 (black/empty) to 1 (white/painted).
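The three steps above can be sketched roughly as follows. This is a minimal illustration, not
NeuralDigit's actual code: the function name preprocess, its parameters, the 0.1 threshold, and the
nearest-neighbor resizing are all assumptions made to keep the example short. It expects the canvas
to have already been converted to grayscale values between 0 and 1.

```javascript
// Sketch only: `preprocess` and its parameters are hypothetical names,
// not NeuralDigit's actual API.
const SIZE = 28;

// pixels: flat, row-major array of width * height grayscale values in [0, 1],
// where 0 is empty and 1 is fully painted.
function preprocess(pixels, width, height, threshold = 0.1) {
  // 1. Bounding box: find the extent of all painted pixels.
  let minX = width, minY = height, maxX = -1, maxY = -1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      if (pixels[y * width + x] > threshold) {
        if (x < minX) minX = x;
        if (x > maxX) maxX = x;
        if (y < minY) minY = y;
        if (y > maxY) maxY = y;
      }
    }
  }
  const out = new Float32Array(SIZE * SIZE); // 784 values, initially all 0
  if (maxX < 0) return out; // blank canvas: nothing to resize

  // 2. Resizing: scale the box to fit inside 28x28 while keeping its
  //    aspect ratio, then center it (nearest-neighbor sampling).
  const boxW = maxX - minX + 1;
  const boxH = maxY - minY + 1;
  const scale = SIZE / Math.max(boxW, boxH);
  const newW = Math.max(1, Math.round(boxW * scale));
  const newH = Math.max(1, Math.round(boxH * scale));
  const offX = Math.floor((SIZE - newW) / 2);
  const offY = Math.floor((SIZE - newH) / 2);
  for (let y = 0; y < newH; y++) {
    for (let x = 0; x < newW; x++) {
      const srcX = minX + Math.min(boxW - 1, Math.floor(x / scale));
      const srcY = minY + Math.min(boxH - 1, Math.floor(y / scale));
      // 3. Flattening: write straight into the row-major 784-element array.
      out[(offY + y) * SIZE + (offX + x)] = pixels[srcY * width + srcX];
    }
  }
  return out;
}
```

A real implementation would more likely let the canvas API do the scaling with smoothing; the
explicit loops here are only meant to make each step visible.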
3. The Neural Network Architecture
NeuralDigit uses a fully connected, feedforward artificial neural network (also known as a
Multi-Layer Perceptron) made up of three layers.
- Input Layer (784 nodes): Each node corresponds directly to one pixel from the 28x28
resized image.
- Hidden Dense Layer (128 neurons): This layer acts as the "brain". Every neuron here
is connected to all 784 input pixels. Each neuron computes a weighted sum of its inputs
plus a bias, then applies ReLU (Rectified Linear Unit), which passes positive sums through
unchanged and outputs 0 otherwise. A neuron therefore activates ("fires") only when the
drawing matches the pattern it has learned to detect.
- Output Dense Layer (10 neurons): The final layer has exactly 10 nodes, one for each
of the digits 0 through 9. A Softmax function converts these 10 output values into
probabilities that together add up to 100%. The digit with the highest probability
is the model's final prediction.
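As an illustration of the output stage, the sketch below turns 10 raw output values into
probabilities with Softmax and picks the most likely digit. The function names softmax and
predictDigit are made up for this example and are not taken from NeuralDigit's source.

```javascript
// Sketch only: function names are illustrative, not NeuralDigit's API.
function softmax(logits) {
  // Subtracting the maximum avoids overflow in Math.exp
  // without changing the resulting probabilities.
  const max = Math.max(...logits);
  const exps = Array.from(logits, (v) => Math.exp(v - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

function predictDigit(logits) {
  const probs = softmax(logits);
  let best = 0;
  for (let d = 1; d < probs.length; d++) {
    if (probs[d] > probs[best]) best = d;
  }
  // digit: index of the largest probability; confidence: that probability.
  return { digit: best, confidence: probs[best], probs };
}
```

Because Softmax preserves the ordering of its inputs, the predicted digit is the same as the index
of the largest raw output; the probabilities mainly serve to show how confident the model is.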
4. 100% Client-Side Inference
Many AI tools upload your data to a remote server. NeuralDigit does not. It uses embedded
model weights and a custom JavaScript engine to perform the linear algebra (matrix multiplications)
directly in the browser, on your own device.
This means inference runs locally within milliseconds and, once the page has loaded, needs no
internet connection. Your drawing never leaves your device.
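// A snippet of the logic powering the inference: the hidden layer
for (let j = 0; j < 128; j++) {          // one pass per hidden neuron
  let sum = bias[j];
  for (let i = 0; i < 784; i++) {        // one term per input pixel
    sum += input[i] * weight[i * 128 + j]; // weight linking pixel i to neuron j
  }
  hidden[j] = Math.max(0, sum); // ReLU activation
}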
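The same loop can be generalized so that one helper serves both the hidden and the output layer.
The sketch below is one possible way to do that, not NeuralDigit's actual engine: dense, relu, and
forward are hypothetical names, and the assumption that the output layer stores its weights in the
same layout as the snippet above (weights[i * outSize + j]) is not confirmed by the source.

```javascript
// Sketch only: `dense`, `relu`, and `forward` are hypothetical helpers.
// Weight layout follows the snippet above: weights[i * outSize + j]
// links input i to output neuron j.
function dense(input, weights, bias, outSize, activation) {
  const out = new Float32Array(outSize);
  for (let j = 0; j < outSize; j++) {
    let sum = bias[j];
    for (let i = 0; i < input.length; i++) {
      sum += input[i] * weights[i * outSize + j];
    }
    out[j] = activation ? activation(sum) : sum;
  }
  return out;
}

const relu = (x) => Math.max(0, x);

// Runs both layers and returns the raw output scores (before Softmax).
// For NeuralDigit's sizes: input has 784 values, b1 has 128, b2 has 10.
function forward(input, w1, b1, w2, b2) {
  const hidden = dense(input, w1, b1, b1.length, relu);
  return dense(hidden, w2, b2, b2.length, null);
}
```

With the sizes described in this document, one prediction costs 784 x 128 + 128 x 10 = 101,632
multiply-add operations, which is small enough for plain JavaScript loops to finish quickly.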