Math Concepts

Scalar, Vector, Matrix and Tensor

Matrix

$A_{m\times n}$ denotes a matrix with $m$ rows and $n$ columns, also called an $m$ by $n$ matrix.

A matrix has 2 dimensions.

In geometry, a matrix is like a plane.

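In Python, to declare a matrix:

# get numpy library
import numpy as np

# a matrix is a list of rows, so the rows are wrapped in one outer list
m = np.array([[5,-2,4],[-3,0,14]])

m

# Check the shape of the matrix: (2, 3), i.e. 2 rows and 3 columns
m.shape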

Vector

A vector is a matrix that has only 1 row and many columns, or only 1 column and many rows.

The number of elements in a vector is called its length.

Example:

Row Vector $A_{1\times n}$
Column Vector $A_{m \times 1}$

A vector has 1 dimension.

In geometry, a vector is like a line. It has direction.

3D Vector Plotter

In Python, to declare a vector:

# get numpy library
import numpy as np

v = np.array([5,-2,4])

# v is a 1D array with 3 elements; NumPy does not mark it as a row or a column yet
v

# Check the shape of the vector: (3,)
v.shape

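Note that you can use the reshape method to change the shape of a vector. reshape returns a new array and leaves the original unchanged, so assign the result back to v.

v = v.reshape(3,1)

# v is now a column vector (3 rows, 1 column)
v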
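As a small sketch of the row and column forms described above, the snippet below reshapes the same three values both ways. The names `row` and `col` are illustrative and not part of the original notes.

```python
import numpy as np

v = np.array([5, -2, 4])   # 1D array, shape (3,)
row = v.reshape(1, 3)      # explicit row vector: 1 row, 3 columns
col = v.reshape(3, 1)      # explicit column vector: 3 rows, 1 column

# reshape returns new array objects; v itself keeps its original shape
print(v.shape, row.shape, col.shape)   # (3,) (1, 3) (3, 1)
```

Keeping the result in a separate name makes it clear which shape each later operation receives.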

Scalar

A scalar is a matrix with 1 row and 1 column, also called a 1 by 1 matrix.

In other words, a matrix with only a single element is called a scalar.

A scalar has 0 dimensions.

In geometry, a scalar is like a point. It has neither direction nor size.

In Python, any integer or float is a scalar.

Note that a scalar has no meaningful shape because it has 0 dimensions; NumPy reports the shape of a 0D value as the empty tuple ().
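The following sketch checks the dimension count of a scalar with NumPy's `np.ndim` and `np.shape` helpers, and contrasts it with a 1 by 1 matrix holding the same value. The variable names are illustrative.

```python
import numpy as np

s = 7.5                          # a plain Python float
print(np.ndim(s))                # 0
print(np.shape(s))               # ()

one_by_one = np.array([[7.5]])   # a 1 by 1 matrix holding the same value
print(one_by_one.ndim)           # 2
print(one_by_one.shape)          # (1, 1)
print(one_by_one.item() == s)    # True
```

Mathematically the two are treated as the same thing, but NumPy keeps track of the extra axes, which matters when shapes have to line up in later operations.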

Tensor

0D array is a scalar (Rank 0 Tensor)
1D array is a vector (Rank 1 Tensor)
2D array is a matrix (Rank 2 Tensor)
3D array is a tensor (Rank 3 Tensor)
nD array for n>2 is a tensor (Rank n Tensor)

To create a Rank 3 Tensor in Python, you can pass a list of equally shaped matrices to np.array, as in np.array([matrix1, matrix2]), to make a 3D array.

m1 = np.array([[5,12,6],[-3,0,14]])
m2 = np.array([[9,8,7],[1,3,-5]])

t = np.array([m1,m2])
t

In the above example, the shape of t is (2,2,3): 2 matrices, each with 2 rows and 3 columns.
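To tie the rank list to something that can be inspected, here is a minimal sketch that builds one array of each rank and reads its `ndim` (the rank) and `shape`. The `examples` dictionary is only an illustrative container.

```python
import numpy as np

matrix = np.array([[5, 12, 6], [-3, 0, 14]])

examples = {
    "scalar": np.array(5),               # rank 0
    "vector": np.array([5, -2, 4]),      # rank 1
    "matrix": matrix,                    # rank 2
    "tensor": np.array([matrix, matrix]) # rank 3: two stacked 2x3 matrices
}

for name, arr in examples.items():
    # ndim is the rank; shape lists the size along each axis
    print(name, arr.ndim, arr.shape)
```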

Multiplication

Vector Multiplication

Condition: for the dot product, the two vectors must have the same length.

There are 2 types of output we can get:

Dot Product (inner product), which gives a scalar
Tensor Product (outer product), which gives a matrix

The dot product is heavily used.

Example of Dot Product:

Vector $\times$ Vector = Scalar
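A short sketch of both products using NumPy's `np.dot` and `np.outer`. The vectors `a` and `b` are example values chosen for illustration.

```python
import numpy as np

a = np.array([5, 12, 6])
b = np.array([2, 8, 3])

# Dot product: multiply element by element, then sum. The result is a scalar.
dot = np.dot(a, b)        # 5*2 + 12*8 + 6*3 = 124

# Outer product: every element of a times every element of b. The result is a matrix.
outer = np.outer(a, b)    # shape (3, 3), with outer[i, j] == a[i] * b[j]

print(dot)
print(outer.shape)
```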

Matrix Multiplication

Condition: We can only multiply an $m \times n$ matrix with an $n \times k$ matrix.

$A_{m\times n} \times B_{n \times k} = C_{m \times k}$

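Example: each entry of $C$ is the dot product of a row of $A$ with a column of $B$.

$\begin{bmatrix} 5 & 12 & 6 \\ -3 & 0 & 14 \end{bmatrix} \times \begin{bmatrix} 2 & -1 \\ 8 & 0 \\ 3 & 0 \end{bmatrix} = \begin{bmatrix} 124 & -5 \\ 36 & 3 \end{bmatrix}$

$5 \times 2 + 12 \times 8 + 6 \times 3 = 124$

$5 \times (-1) + 12 \times 0 + 6 \times 0 = -5$

$(-3) \times 2 + 0 \times 8 + 14 \times 3 = 36$

$(-3) \times (-1) + 0 \times 0 + 14 \times 0 = 3$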
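The four products above can be checked in NumPy. In this sketch the operands `A` and `B` are read off from the arithmetic itself (the rows of `A` and the columns of `B` are the numbers being multiplied), so treat them as a reconstruction rather than the original figure.

```python
import numpy as np

A = np.array([[5, 12, 6],
              [-3, 0, 14]])     # 2 x 3
B = np.array([[2, -1],
              [8, 0],
              [3, 0]])          # 3 x 2

# The inner sizes match (3 and 3), so the product is defined and has shape 2 x 2
C = A @ B                       # same as np.matmul(A, B)

# Each entry is the dot product of one row of A with one column of B
top_left = np.dot(A[0, :], B[:, 0])

print(C)          # C == [[124, -5], [36, 3]]
print(top_left)   # 124
```

Swapping the operands gives `B @ A`, a 3 by 3 matrix, which shows that matrix multiplication is not commutative.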

TensorFlow

One of the biggest advantages of TensorFlow is that it uses not only the computer's CPU but also its GPU. More recently, Google furthered this trend by introducing TPUs (tensor processing units), which improve performance even further.

TensorFlow is a good fit for neural networks, but sklearn is a better fit for clustering and regression, and especially for preprocessing.

Note:

In this workflow, TensorFlow doesn't work directly with .csv or .xlsx files. Instead, it works with tensors, stored here as ndarrays in .npz files.
In other words, the data needs to be preprocessed and saved in an .npz file.
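One possible way to do that preprocessing step is sketched below with `np.loadtxt` and `np.savez`. The CSV text, the column layout, and the file name `example_data.npz` are all made up for illustration; a real dataset would be read from disk instead of from a string.

```python
import io
import os
import tempfile

import numpy as np

# Hypothetical CSV content: two input columns and one target column
csv_text = "x,z,target\n1.0,2.0,1.0\n0.0,0.0,5.0\n-1.0,1.0,0.0\n"

# Parse the CSV into a 2D ndarray, skipping the header row
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

inputs = data[:, :2]     # shape (3, 2)
targets = data[:, 2:]    # shape (3, 1); the slice keeps it 2D

# Save both arrays under named keys in a single .npz file
path = os.path.join(tempfile.mkdtemp(), "example_data.npz")
np.savez(path, inputs=inputs, targets=targets)

# Load it back and read the arrays by key
with np.load(path) as loaded:
    loaded_inputs = loaded["inputs"]
    loaded_targets = loaded["targets"]

print(loaded_inputs.shape, loaded_targets.shape)   # (3, 2) (3, 1)
```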

Minimal example with TensorFlow 2.0

Import the relevant libraries

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

Data generation

# First, we should declare a variable containing the size of the training set we want to generate.
observations = 1000

# We will work with two variables as inputs. You can think about them as x1 and x2 in our previous examples.
# We have picked x and z, since it is easier to differentiate them.
# We generate them randomly, drawing from a uniform distribution. This method takes 3 arguments (low, high, size).
# The size of xs and zs is observations x 1. In this case: 1000 x 1.
xs = np.random.uniform(low=-10, high=10, size=(observations,1))
zs = np.random.uniform(-10, 10, (observations,1))

# Combine the two dimensions of the input into one input matrix. 
# This is the X matrix from the linear model y = x*w + b.
# column_stack is a Numpy method, which combines two matrices (vectors) into one.
generated_inputs = np.column_stack((xs,zs))

# We add a random small noise to the function i.e. f(x,z) = 2x - 3z + 5 + <small noise>
noise = np.random.uniform(-1, 1, (observations,1))

# Produce the targets according to our f(x,z) = 2x - 3z + 5 + noise definition.
# In this way, we are basically saying: the weights should be 2 and -3, while the bias is 5.
generated_targets = 2*xs - 3*zs + 5 + noise

# save into an npz file called "TF_intro"
np.savez('TF_intro', inputs=generated_inputs, targets=generated_targets)
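As a sanity check on the data generation above, the sketch below regenerates data the same way and confirms two things: the shapes are what the model will expect, and the targets differ from $2x - 3z + 5$ by no more than the noise bound. The names `true_weights` and `residual` are illustrative, and nothing is written to disk here.

```python
import numpy as np

observations = 1000
xs = np.random.uniform(-10, 10, (observations, 1))
zs = np.random.uniform(-10, 10, (observations, 1))
inputs = np.column_stack((xs, zs))             # shape (1000, 2)
noise = np.random.uniform(-1, 1, (observations, 1))
targets = 2 * xs - 3 * zs + 5 + noise          # shape (1000, 1)

# The same function written in matrix form: y = x*w + b
true_weights = np.array([[2.0], [-3.0]])       # shape (2, 1)
true_bias = 5.0
residual = targets - (inputs @ true_weights + true_bias)

print(inputs.shape, targets.shape)             # (1000, 2) (1000, 1)
# What is left over is just the noise, so it stays within [-1, 1]
print(np.abs(residual).max() <= 1.0 + 1e-9)
```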

Solving with TensorFlow

# Load the training data from the NPZ
training_data = np.load('TF_intro.npz')

# Declare a variable where we will store the input size of our model
# It should be equal to the number of variables you have
input_size = 2
# Declare the output size of the model
# It should be equal to the number of outputs you've got (for regressions that's usually 1)
output_size = 1

# Outline the model
# We lay out the model in 'Sequential'
# Note that there are no calculations involved - we are just describing our network
model = tf.keras.Sequential([
                            # Each 'layer' is listed here
                            # The 'Dense' layer indicates that our mathematical operation is (xw + b)
                            tf.keras.layers.Dense(output_size,
                                                 # there are extra arguments you can include to customize your model
                                                 # in our case we are just trying to create a solution that is 
                                                 # as close as possible to our NumPy model
                                                 kernel_initializer=tf.random_uniform_initializer(minval=-0.1, maxval=0.1),
                                                 bias_initializer=tf.random_uniform_initializer(minval=-0.1, maxval=0.1)
                                                 )
                            ])

# We can also define a custom optimizer, where we can specify the learning rate
custom_optimizer = tf.keras.optimizers.SGD(learning_rate=0.02)
# Note that sometimes you may also need a custom loss function 
# That's much harder to implement and won't be covered in this course though

# 'compile' is the place where you select and indicate the optimizers and the loss
model.compile(optimizer=custom_optimizer, loss='mean_squared_error')

# finally we fit the model, indicating the inputs and targets
# if they are not otherwise specified the number of epochs will be 1 (a single epoch of training), 
# so the number of epochs is 'kind of' mandatory, too
# we can play around with verbose; we prefer verbose=2
model.fit(training_data['inputs'], training_data['targets'], epochs=100, verbose=2)
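To make the training step less of a black box, here is a rough NumPy sketch of what a single Dense layer trained with gradient descent on mean squared error amounts to. It is a simplification under stated assumptions, not Keras internals: it updates on the full dataset each iteration, whereas `model.fit` works on mini-batches by default, and it regenerates its own data with a seeded generator instead of loading `TF_intro.npz`.

```python
import numpy as np

rng = np.random.default_rng(0)
observations = 1000

# Same kind of data as above: targets = 2x - 3z + 5 + noise
inputs = rng.uniform(-10, 10, (observations, 2))
noise = rng.uniform(-1, 1, (observations, 1))
targets = inputs @ np.array([[2.0], [-3.0]]) + 5 + noise

# Small random starting values, mirroring the initializers used in the model
weights = rng.uniform(-0.1, 0.1, (2, 1))
bias = rng.uniform(-0.1, 0.1, 1)
learning_rate = 0.02

for epoch in range(100):
    outputs = inputs @ weights + bias        # the Dense operation: xw + b
    errors = outputs - targets
    loss = np.mean(errors ** 2)              # mean squared error

    # Gradients of the mean squared error with respect to weights and bias
    grad_w = 2 * inputs.T @ errors / observations
    grad_b = 2 * errors.mean()

    # Gradient descent update
    weights -= learning_rate * grad_w
    bias -= learning_rate * grad_b

print(weights.ravel(), bias, loss)
```

The learned weights should land close to 2 and -3 and the bias close to 5, which is the same comparison to make against the values extracted from the TensorFlow model.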

Extract the weights and bias

# Extracting the weights and biases is achieved quite easily
model.layers[0].get_weights()

# We can save the weights and biases in separate variables for easier examination
# Note that there can be hundreds or thousands of them!
weights = model.layers[0].get_weights()[0]
weights

# The bias is the second array returned by get_weights()
bias = model.layers[0].get_weights()[1]
bias

Extract the outputs (make predictions)

# We can predict new values in order to actually make use of the model
# Sometimes it is useful to round the values to be able to read the output
# Usually we use this method on NEW DATA, rather than our original training data
model.predict_on_batch(training_data['inputs']).round(1)

# If we display our targets (actual observed values), we can manually compare the outputs and the targets
training_data['targets'].round(1)

Plotting the data

# The model is optimized, so the outputs are calculated based on the last form of the model

# We have to np.squeeze the arrays in order to fit them to what the plot function expects.
# Doesn't change anything as we cut dimensions of size 1 - just a technicality.
plt.plot(np.squeeze(model.predict_on_batch(training_data['inputs'])), np.squeeze(training_data['targets']))
plt.xlabel('outputs')
plt.ylabel('targets')
plt.show()

# Voila - what you see should be exactly the same as in the previous notebook!
# You probably don't see the point of TensorFlow now - it took us the same number of lines of code
# to achieve this simple result. However, once we go deeper in the next chapter,
# TensorFlow will save us hundreds of lines of code.

Reference

The Data Science Course 2020: Complete Data Science Bootcamp