The XOR Problem - Why Neural Networks Need Hidden Layers

Part 1: Can a Single Line Separate the Classes?

Select a logic gate and try to draw a line that separates the green (output=1) points from the red (output=0) points.

x₁	x₂	Output (OR gate)
0	0	0
0	1	1
1	0	1
1	1	1

Perceptron (Single Neuron)

Activation Function

Step: σ(x) = 1 if x > 0, else 0

Input Space

Perceptron Weights

Adjust weights to find a separating line

w₁: 0.2

w₂: 1.0

b: -0.5

Adjust the weights to separate the classes!
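
As a rough sketch of what the perceptron panel computes, the snippet below evaluates the step-activated neuron on the truth table from Part 1, first with the slider values listed above (w₁ = 0.2, w₂ = 1.0, b = -0.5) and then with one adjusted setting. The names `step` and `perceptron` and the adjusted weights are illustrative assumptions, not taken from the page.

```python
def step(z):
    """Step activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

def perceptron(x1, x2, w1, w2, b):
    """Single neuron: step(w1*x1 + w2*x2 + b)."""
    return step(w1 * x1 + w2 * x2 + b)

# Truth table from Part 1 (outputs 0, 1, 1, 1).
table = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

# Slider values shown above.
w1, w2, b = 0.2, 1.0, -0.5
predictions = [perceptron(x1, x2, w1, w2, b) for (x1, x2), _ in table]
correct = sum(p == t for p, (_, t) in zip(predictions, table))
print(predictions, f"{correct}/4")

# Illustrative adjustment: any w1 above 0.5 puts (1, 0) on the firing side.
fixed = [perceptron(x1, x2, 1.0, 1.0, -0.5) for (x1, x2), _ in table]
print(fixed)
```

With the starting weights, the point (1, 0) gives 0.2 - 0.5 < 0 and lands on the wrong side of the line, so three of the four points are classified correctly. Raising w₁ is one of many ways to move the line so that all four match.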

Part 2: Solving XOR with a Hidden Layer

A hidden layer transforms the input space, making the problem linearly separable. Adjust the weights to see how the transformation works. One approach: h₁ detects "at least one input is on" (OR) and h₂ detects "not both are on" (NAND), then combine with AND. Another approach: h₁ detects "x₁ is on but not x₂" and h₂ detects "x₂ is on but not x₁", then combine with OR.
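
The two approaches above can be sketched in a few lines. The specific weight values below are one possible choice for each approach (an assumption for illustration, not the page's presets), and the helper `two_layer` is a hypothetical name.

```python
def step(z):
    """Step activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

def two_layer(x1, x2, hidden, output):
    """hidden = ((w11, w12, b1), (w21, w22, b2)); output = (v1, v2, c)."""
    (w11, w12, b1), (w21, w22, b2) = hidden
    v1, v2, c = output
    h1 = step(w11 * x1 + w12 * x2 + b1)
    h2 = step(w21 * x1 + w22 * x2 + b2)
    return step(v1 * h1 + v2 * h2 + c)

# Approach 1: h1 = OR, h2 = NAND, output = AND.
or_nand_and = (((1, 1, -0.5), (-1, -1, 1.5)), (1, 1, -1.5))
# Approach 2: h1 = "x1 but not x2", h2 = "x2 but not x1", output = OR.
difference_or = (((1, -1, -0.5), (-1, 1, -0.5)), (1, 1, -0.5))

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
results = {}
for name, (hidden, output) in [("OR/NAND/AND", or_nand_and),
                               ("difference/OR", difference_or)]:
    results[name] = [two_layer(x1, x2, hidden, output) for x1, x2 in inputs]
    print(name, results[name])
```

Both weight sets produce the XOR outputs 0, 1, 1, 0 for the inputs in the order listed, even though their hidden neurons detect different features.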

Network Architecture

Activation Function

Step: σ(x) = 1 if x > 0, else 0

h₁: —

Green region = h₁ fires

h₁ Weights

w₁₁: 0.2

w₁₂: 1.0

b₁: -0.5

h₁ = σ(w₁₁·x₁ + w₁₂·x₂ + b₁)
Boundary: w₁₁·x₁ + w₁₂·x₂ + b₁ = 0

h₂: —

Green region = h₂ fires

h₂ Weights

w₂₁: -0.2

w₂₂: -1.0

b₂: 1.5

h₂ = σ(w₂₁·x₁ + w₂₂·x₂ + b₂)
Boundary: w₂₁·x₁ + w₂₂·x₂ + b₂ = 0

Hidden Space (h₁, h₂)

Output Layer Weights

v₁: 0.3

v₂: 1.0

c: -0.5

Combined: Network Output (XOR)

How XOR Works

With h₁ = OR and h₂ = NAND: XOR = h₁ AND h₂

Adjust the weights to find a valid XOR solution.

Forward Pass Computation

Input (x₁,x₂)	Hidden (h₁,h₂)	Output (ŷ)	Target	Correct?
(0, 0)	-	-	0	-
(0, 1)	-	-	1	-
(1, 0)	-	-	1	-
(1, 1)	-	-	0	-

Network Accuracy: 0/4
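
To show how the rows of this table are filled in, here is a minimal sketch that runs the forward pass using the starting slider values listed in Part 2. The variable and function names are assumptions made for the example.

```python
def step(z):
    """Step activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

# Starting slider values from Part 2.
W_H1 = (0.2, 1.0, -0.5)    # w11, w12, b1
W_H2 = (-0.2, -1.0, 1.5)   # w21, w22, b2
W_OUT = (0.3, 1.0, -0.5)   # v1, v2, c

def forward(x1, x2):
    """Return (h1, h2, y_hat) for one input pair."""
    h1 = step(W_H1[0] * x1 + W_H1[1] * x2 + W_H1[2])
    h2 = step(W_H2[0] * x1 + W_H2[1] * x2 + W_H2[2])
    y_hat = step(W_OUT[0] * h1 + W_OUT[1] * h2 + W_OUT[2])
    return h1, h2, y_hat

xor_targets = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

rows = []
for (x1, x2), target in xor_targets.items():
    h1, h2, y_hat = forward(x1, x2)
    rows.append(((x1, x2), (h1, h2), y_hat, target, y_hat == target))
    print((x1, x2), (h1, h2), y_hat, target, "yes" if y_hat == target else "no")

accuracy = sum(row[4] for row in rows)
print(f"Network Accuracy: {accuracy}/4")
```

With these starting values h₂ fires for every input, so the output is always 1 and only the two target-1 rows are correct. That is why the weights still need adjusting.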

The Key Insight

The XOR function outputs 1 when exactly one input is 1. In the input space, the two classes sit on opposite diagonals of the unit square: (0, 1) and (1, 0) output 1, while (0, 0) and (1, 1) output 0. No single line can separate them.
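
A short derivation makes the impossibility concrete. Assume a single neuron with weights w₁, w₂ and bias b, using the step convention from above (it fires only when its input is strictly positive). Matching all four XOR targets would require:

$$\begin{aligned}
(0,0) \mapsto 0 &: \quad b \le 0 \\
(0,1) \mapsto 1 &: \quad w_2 + b > 0 \\
(1,0) \mapsto 1 &: \quad w_1 + b > 0 \\
(1,1) \mapsto 0 &: \quad w_1 + w_2 + b \le 0
\end{aligned}$$

Adding the two middle inequalities gives $w_1 + w_2 + 2b > 0$, so $w_1 + w_2 + b > -b \ge 0$. This contradicts the last line, so no choice of w₁, w₂, and b satisfies all four conditions.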

The hidden layer acts as a feature transformation. Each hidden neuron computes:

$$h_i = \sigma(w_{i1}x_1 + w_{i2}x_2 + b_i)$$

where σ is the step function. This transforms the 4 input points into a new 2D space where they can be linearly separated.
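
The transformation can be made concrete with a small sketch. It assumes one particular choice of hidden weights (h₁ as OR, h₂ as NAND; the values are illustrative, not the page's defaults), maps each input to its hidden-space point, and then checks that a single line separates the classes there.

```python
def step(z):
    """Step activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

def hidden(x1, x2):
    """Map an input to (h1, h2) with h1 = OR and h2 = NAND (illustrative weights)."""
    h1 = step(1.0 * x1 + 1.0 * x2 - 0.5)
    h2 = step(-1.0 * x1 - 1.0 * x2 + 1.5)
    return h1, h2

xor_targets = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

mapped = {x: hidden(*x) for x in xor_targets}
for x, h in mapped.items():
    print(x, "->", h, "target", xor_targets[x])

# In hidden space, the line h1 + h2 - 1.5 = 0 puts only (1, 1) on the firing side.
separable = all(step(h1 + h2 - 1.5) == xor_targets[x]
                for x, (h1, h2) in mapped.items())
print("separable in hidden space:", separable)
```

Under these weights the inputs (0, 1) and (1, 0) both land on the hidden point (1, 1), while (0, 0) and (1, 1) land on (0, 1) and (1, 0). The four inputs collapse to three hidden points, and the class-1 point now sits alone in a corner that one line can cut off.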

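There are infinitely many sets of weights that correctly solve XOR. The two approaches described in Part 2 (OR and NAND combined with AND, and the two "one input but not the other" detectors combined with OR) are just two examples of fundamentally different solutions.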
