The Forward-Forward Algorithm

Forward-Forward (Hinton, 2022) replaces backpropagation's forward-then-backward sweep with two forward passes, training each layer by its own local objective — no global loss, no gradients flowing between layers. A positive pass runs on real data and a negative pass on corrupted or mislabeled data; every layer adapts its weights to make its goodness — the sum of its squared activities, Σy² — high for positive data and low for negative. Activity length (which carries the goodness) is normalized away between layers, so each layer must learn features from the orientation it receives rather than reusing the previous layer's score. For supervised tasks the label is embedded in the input, and a sample is classified by choosing the label that maximizes total goodness. The result needs no backward pass, no stored activations, and no differentiable model of the forward computation.
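The procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the local objective is assumed to be a logistic loss on goodness minus a threshold `theta`, the label is assumed to be embedded by overwriting the first `num_classes` input features with a one-hot vector, and negative data is assumed to be real inputs paired with a wrong label. The names `FFLayer`, `embed_label`, `train`, and `predict`, and the values of `theta` and `lr`, are hypothetical choices for this example.

```python
import numpy as np

def sigmoid(z):
    # Clipped to avoid overflow warnings for extreme goodness values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -50.0, 50.0)))

def normalize(h, eps=1e-8):
    """Scale each row to unit length so only its orientation is passed on."""
    return h / (np.linalg.norm(h, axis=1, keepdims=True) + eps)

def embed_label(x, labels, num_classes):
    """Assumed encoding: overwrite the first num_classes features with a one-hot label."""
    out = x.copy()
    out[:, :num_classes] = 0.0
    out[np.arange(len(x)), labels] = 1.0
    return out

class FFLayer:
    def __init__(self, n_in, n_out, rng, theta=2.0, lr=0.03):
        self.W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        self.b = np.zeros(n_out)
        self.theta, self.lr = theta, lr  # assumed hyperparameters

    def forward(self, x):
        return np.maximum(0.0, x @ self.W + self.b)  # ReLU activities y

    def goodness(self, x):
        return np.sum(self.forward(x) ** 2, axis=1)  # sum of squared activities

    def train_step(self, x_pos, x_neg):
        """One local update: raise goodness on x_pos, lower it on x_neg."""
        for x, sign in ((x_pos, 1.0), (x_neg, -1.0)):
            y = self.forward(x)
            g = np.sum(y ** 2, axis=1)
            # Assumed loss: log(1 + exp(-sign * (g - theta))); this is its derivative w.r.t. g.
            dg = -sign * sigmoid(-sign * (g - self.theta))
            # dg/dy = 2y; y is already zero where the ReLU is inactive.
            dz = 2.0 * y * dg[:, None]
            self.W -= self.lr * (x.T @ dz) / len(x)
            self.b -= self.lr * dz.mean(axis=0)

def train(layers, x, y, num_classes, rng, epochs=100):
    for _ in range(epochs):
        # Negative data: the same inputs with a label guaranteed to be wrong.
        wrong = (y + rng.integers(1, num_classes, size=len(y))) % num_classes
        h_pos = embed_label(x, y, num_classes)
        h_neg = embed_label(x, wrong, num_classes)
        for layer in layers:
            layer.train_step(h_pos, h_neg)  # uses only this layer's own objective
            # Normalize away the length before handing activities to the next layer.
            h_pos = normalize(layer.forward(h_pos))
            h_neg = normalize(layer.forward(h_neg))

def predict(layers, x, num_classes):
    """Try every label and pick the one with the highest total goodness."""
    scores = []
    for c in range(num_classes):
        h = embed_label(x, np.full(len(x), c), num_classes)
        total = np.zeros(len(x))
        for layer in layers:
            y_act = layer.forward(h)
            total += np.sum(y_act ** 2, axis=1)
            h = normalize(y_act)
        scores.append(total)
    return np.argmax(np.stack(scores, axis=1), axis=1)
```

To use the sketch, build a list such as `[FFLayer(n_in, 64, rng), FFLayer(64, 64, rng)]` with `rng = np.random.default_rng(0)`, call `train(layers, x, y, num_classes, rng)`, then `predict(layers, x, num_classes)`. Each `train_step` reads only its own layer's input and activities, so nothing is propagated backward through the stack.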
