MPHY 6120 — Module 6 Interactive Exercises
You already know logistic regression: multiply inputs by weights, add bias, apply sigmoid. That's a neuron.
This diagram IS the equation on the left. Same math, different picture.
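That equivalence is easy to check in code. Below is a minimal NumPy sketch of one neuron; the input, weight, and bias values are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    # weighted sum + bias, then sigmoid -- exactly logistic regression
    return sigmoid(np.dot(w, x) + b)

x = np.array([0.5, -1.2, 2.0])   # example inputs (illustrative)
w = np.array([0.8, 0.3, -0.5])   # weights
b = 0.1                          # bias
print(neuron(x, w, b))           # a value between 0 and 1
```

With all weights and bias at zero, the weighted sum is 0 and the sigmoid outputs exactly 0.5 -- the "undecided" midpoint.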
A single neuron draws one straight line through the data. If the boundary between classes isn't straight, it fails. Add more neurons and the boundary bends.
Accuracy: --
Blue = Class A (Normal)
Orange = Class B (Pneumonia)
Background color = what the network predicts at each point.
Without an activation function, a neuron is just a straight line. Stack 100 straight lines and you still get... a straight line. Activations add the curves.
The activation function sits right here -- between the weighted sum and the output.
Each curve below is a different neuron's output. Watch what happens when you turn activation off:
f(z) = 0.500
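The "stacked straight lines are still a straight line" claim can be verified directly: two linear layers with no activation in between collapse into one linear layer with combined weights. A quick NumPy check (random matrices, sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # layer 1 weights
W2 = rng.normal(size=(2, 4))   # layer 2 weights
x = rng.normal(size=3)

# Two linear layers with no activation...
two_layers = W2 @ (W1 @ x)
# ...equal ONE linear layer with the merged weight matrix W2 @ W1.
one_layer = (W2 @ W1) @ x

print(np.allclose(two_layers, one_layer))  # True
```

Insert a nonlinearity (sigmoid, ReLU) between the layers and the collapse no longer happens -- that is the entire job of an activation function.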
Each layer detects patterns in the previous layer's output, building from simple to complex.
In medical imaging:
pixels -> edges -> textures -> anatomy -> pathology
A fully-connected layer treats every pixel independently. But images have spatial structure -- nearby pixels matter more than distant ones. Let's see why that's a problem.
A fully-connected network can't tell the difference between these two images. It just sees a list of numbers.
An FC layer treats both the same -- it doesn't know about spatial arrangement.
Highlight a pixel and see its 3x3 neighborhood. Edges, textures, anatomy -- all defined by LOCAL patterns.
A 224x224 image with 128 neurons = 6.4 million weights in ONE layer. Most are wasted.
Pixel [0,0] connects to the same neuron as pixel [223,223]. The network can't know they're far apart.
A tumor in the top-left activates totally different weights than the same tumor in the bottom-right.
Convolution solves all three problems from Step 5. Instead of connecting every pixel to every neuron, use a small sliding window.
Fully Connected: every pixel -> every neuron
Conv 3x3: 9 weights per filter, shared across every position!
Drag the image size slider to see the fully-connected count explode while convolution stays flat.
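The same comparison in code. This sketch counts weights for a single layer (128 neurons/filters, matching the example above; the other image sizes are illustrative):

```python
# Weight counts for ONE layer as image size grows.
def fc_weights(side, neurons=128):
    return side * side * neurons          # every pixel -> every neuron

def conv_weights(kernel=3, filters=128):
    return kernel * kernel * filters      # shared; independent of image size

for side in (32, 64, 224):
    print(f"{side}x{side}: FC = {fc_weights(side):,}  conv = {conv_weights():,}")
```

At 224x224 the fully-connected layer needs 6,422,528 weights -- the 6.4 million from Step 5 -- while the convolutional layer stays at 1,152 no matter how large the image gets.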
A convolution kernel is a small grid of weights that slides across the image. Different weights detect different patterns.
These 9 numbers ARE the "weights" the CNN learns. Different weights = different features detected.
A CNN learns these kernels automatically from data!
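Here is that sliding-window idea by hand: a 3x3 vertical-edge kernel (one set of 9 weights a CNN could learn) applied to a tiny made-up image with a bright left half. Like most deep-learning libraries, this computes cross-correlation rather than flipped convolution:

```python
import numpy as np

# A hand-written vertical-edge kernel: responds where left is brighter than right.
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

# Tiny "image": bright left columns, dark right columns -> one vertical edge.
img = np.zeros((5, 5))
img[:, :2] = 1.0

# Slide the 3x3 window over every valid position and take the weighted sum.
out = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        out[i, j] = np.sum(img[i:i+3, j:j+3] * kernel)

print(out)  # strong response over the edge, zero in the flat dark region
```

The output is large where the window straddles the bright-to-dark boundary and zero where the patch is uniform -- the kernel "detects" vertical edges and ignores everything else.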
After convolution, we downsample with pooling. Keep the strongest activations, discard the rest. The image shrinks but the features get richer.
Each pooling layer halves the spatial dimensions. Features get compressed but become more meaningful.
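"Keep the strongest activations" is exactly max pooling. A minimal 2x2 max-pool sketch on made-up activation values:

```python
import numpy as np

def max_pool_2x2(x):
    # Keep the maximum in each non-overlapping 2x2 block;
    # both spatial dimensions are halved.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 1],
              [0, 1, 5, 6],
              [2, 2, 7, 8]], dtype=float)

print(max_pool_2x2(x))  # 4x4 -> 2x2
```

Each 2x2 block collapses to its single strongest value (here 4, 2, 2, 8); three quarters of the numbers are discarded, but the strongest feature responses survive.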
Now put it all together. Click each stage to see what it does and how the data shape changes.
Click any stage above to learn what it does.
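The shape changes through the whole pipeline can be traced with a few lines of arithmetic. This toy walkthrough assumes a 224x224 grayscale input, 3x3 convolutions with padding (which keep height and width), and 2x2 pooling after each conv; the filter counts are invented for illustration:

```python
# Shape walkthrough for a toy conv -> pool -> conv -> pool -> conv -> pool stack.
shape = (224, 224, 1)  # height, width, channels
for stage, filters in (("conv1+pool", 32), ("conv2+pool", 64), ("conv3+pool", 128)):
    h, w, _ = shape
    # Padded 3x3 conv keeps HxW but sets channels = filters;
    # 2x2 pooling then halves H and W.
    shape = (h // 2, w // 2, filters)
    print(stage, shape)
```

The spatial dimensions shrink (224 to 112 to 56 to 28) while the channel count grows -- fewer locations, richer features, exactly the progression described above.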