Learning Resource - CNN for Beginners
What is Machine Learning?
Machine learning is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence.
What is CNN?
In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of deep neural networks, most commonly applied to analyzing visual imagery.
What can CNN do?
- Image recognition
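Before learning about CNNs, it helps to understand ordinary deep neural networks (DNNs) built from fully-connected layers.
In image recognition, spatial information is very important, but a fully-connected DNN flattens the image into a vector and so loses track of which pixels are neighbors.
A CNN addresses this problem by applying a small kernel at every possible position of the image, so each output value depends only on a local patch of neighboring pixels.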
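- Kernels (or filters) are small grids of learnable weights, and the same kernel is reused at every position of the image.
Using kernels, we turn the input layer into a simpler representation, a layer called the convolutional layer (its output is often called a feature map).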
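To make the sliding-kernel idea concrete, here is a minimal NumPy sketch. It assumes a single-channel image, stride 1, and no padding; the function name `conv2d_valid` and the edge-detecting kernel values are illustrative choices, not taken from any of the resources below. Like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Apply `kernel` at every position of `image` where it fully fits."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            # Element-wise product of the kernel with one local patch, then sum.
            patch = image[r:r + kh, c:c + kw]
            out[r, c] = np.sum(patch * kernel)
    return out

# A 4x4 image whose right half is bright.
image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

# Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = np.array([
    [-1.0, 1.0],
    [-1.0, 1.0],
])

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The resulting 3x3 feature map is non-zero only in its middle column, which is where the edge sits in the image. In a real CNN the kernel values are learned during training rather than written by hand.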
Another important concept in CNNs is pooling.
- We divide the output of the convolutional layer into small, non-overlapping squares.
We then reduce the dimensionality of the problem by keeping only the strongest detail in each square, typically its maximum value (max pooling).
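The pooling step described above can be sketched in a few lines of NumPy. This assumes max pooling over non-overlapping 2x2 squares on a single-channel feature map; the function name `max_pool2d` and the sample values are made up for illustration.

```python
import numpy as np

def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size square."""
    h, w = feature_map.shape
    # Drop trailing rows/columns that do not fill a complete square.
    h, w = h - h % size, w - w % size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    # Axes 1 and 3 index the positions inside each square.
    return blocks.max(axis=(1, 3))

feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
], dtype=float)

pooled = max_pool2d(feature_map)
print(pooled)  # 2x2 result: one maximum per square
```

A 4x4 map becomes a 2x2 map, so the number of values drops by a factor of four while the strongest response in each region is preserved.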
We use three main types of layers to build ConvNet architectures: Convolutional Layer, Pooling Layer, and Fully-Connected (Dense) Layer.
Materials to start
MIT’s Intro to Deep Learning course
Short Introduction of Convolutional Neural Networks
Convolutional Neural Networks — A Beginner’s Guide
How do Convolutional Neural Networks work?
CNN For Visual Recognition Lectures From Stanford
Deep Learning for Computer Vision (Andrej Karpathy, OpenAI)
The 9 Deep Learning Papers You Need To Know About (Understanding CNNs)
Main Types of Neural Networks and its Applications
7 Types of Neural Network Activation Functions: How to Choose?
A simple keras model on my laptop webcam
Basic Concept Videos from deeplizard
Machine Learning & Deep Learning Fundamentals
Visualizing CNN Architecture
Some Good Articles
Some Good Interactive Sites
Convolutional network 2D visualization by Adam Harley
Convolutional network 3D visualization by Adam Harley
This network has 1024 nodes on the bottom layer (corresponding to pixels), six 5x5 (stride 1) convolutional filters in the first hidden layer, followed by sixteen 5x5 (stride 1) convolutional filters in the second hidden layer, then three fully-connected layers, with 120 nodes in the first, 100 nodes in the second, and 10 nodes in the third. The convolutional layers are each followed by a downsampling layer that does 2x2 max pooling (with stride 2).
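As a rough check on how the layer sizes in this description fit together, the sketch below traces the spatial size through the convolution and pooling layers. It assumes the 1024 input nodes form a 32x32 image and that no padding is used; neither assumption is stated in the description above, and the helper `conv_out` is a hypothetical name.

```python
def conv_out(size, kernel, stride=1):
    """Output width/height of a convolution or pooling layer without padding."""
    return (size - kernel) // stride + 1

size = 32  # assumption: 1024 input nodes = 32 x 32 pixels
shapes = [("input", size, 1)]

# (name, kernel size, stride, output channels), following the description above.
layers = [
    ("conv1", 5, 1, 6),
    ("pool1", 2, 2, 6),
    ("conv2", 5, 1, 16),
    ("pool2", 2, 2, 16),
]

for name, kernel, stride, channels in layers:
    size = conv_out(size, kernel, stride)
    shapes.append((name, size, channels))

for name, s, channels in shapes:
    print(f"{name}: {s} x {s} x {channels}")

# Number of values fed into the first fully-connected layer (120 nodes).
flattened = size * size * 16
print("flattened:", flattened)
```

Under these assumptions the spatial size goes 32, 28, 14, 10, 5, so the first fully-connected layer receives 5 x 5 x 16 = 400 values.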
A Nice Video
- 00:00 ~ 00:57 2D convolving with two 2D filters
- 00:57 ~ 01:01 Applying the activation function (ReLU)
- 01:01 ~ 01:05 Pooling
- The 1st layer done
- 01:06 ~ 01:26 2D convolving with two 3D filters
- 01:26 ~ 01:28 Applying the activation function again
- 01:28 ~ 01:31 Pooling again
- The 2nd layer done
- 01:31 ~ Softmax-with-loss layer done
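Two of the steps in this timeline, the ReLU activation and the softmax-with-loss layer, are simple enough to sketch in NumPy. The code below assumes the loss is cross-entropy against a single correct class, which is the usual meaning of "softmax with loss" but is not confirmed by the video; the input values are made up.

```python
import numpy as np

def relu(x):
    """Replace negative values with zero, leave positive values unchanged."""
    return np.maximum(x, 0.0)

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def cross_entropy(probs, target_index):
    """Loss is small when the correct class gets high probability."""
    return -np.log(probs[target_index])

activations = relu(np.array([-1.5, 0.0, 2.0]))

logits = np.array([2.0, 1.0, 0.1])  # hypothetical scores for three classes
probs = softmax(logits)
loss = cross_entropy(probs, target_index=0)

print(activations)
print(probs, loss)
```

The class with the largest score receives the largest probability, and the loss shrinks toward zero as that probability approaches 1 for the correct class.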