• A package is a namespace that can contain modules and other packages.
  • A module is a file consisting of Python code.
  • Functions, classes and variables can be defined and implemented in a module.
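As a small illustration of the difference, the sketch below inspects two standard-library imports. Picking `json` and `math` is an assumption made for this example (they are not mentioned above); it relies on the fact that packages expose a `__path__` attribute while plain modules do not.

```python
import json   # a package: a directory that holds several modules
import math   # a plain module with no submodules

# Packages expose __path__ (the location of their submodules); modules do not.
json_is_package = hasattr(json, "__path__")
math_is_package = hasattr(math, "__path__")
print(json_is_package, math_is_package)

# A module that lives inside a package is reached with a dotted name.
import json.decoder
print(json.decoder.JSONDecoder.__name__)
```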

To install a new package:

$ pip install <package name>

Display installed packages:

$ pip list

NumPy

NumPy is the fundamental package for scientific computing with Python.

How to import

import numpy as np

To declare an array

The array (numpy.ndarray) is a data type provided by NumPy that supports 1D, 2D, 3D, or higher-dimensional data.

Some examples:

x = np.array([0,1,2])
y = np.array( ([0,1,2],[3,4,5]) )
a = np.array( ([1,2,3],[4,5,6],[7,8,9]) )

Accessing shape of array

a = np.array( ([1,2,3],[4,5,6],[7,8,9]) )
print(a.shape)

Accessing elements of array

Indexing
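A minimal sketch of common indexing forms, reusing the 3x3 array from the shape example. The variable names are made up for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

first_row = a[0]        # a whole row: [1 2 3]
element = a[1, 2]       # row 1, column 2: 6
last_column = a[:, -1]  # every row, last column: [3 6 9]
block = a[:2, 1:]       # rows 0-1, columns 1-2: [[2 3] [5 6]]

print(first_row, element, last_column)
print(block)
```

Indices start at 0, negative indices count from the end, and a slice `start:stop` excludes `stop`, just as with Python lists.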

Array vs List

Arithmetic operations behave differently on arrays than on lists.

The following example shows the difference:

a = [1,3,4,7]
b = np.array(a)
print(a*2)
# result:
# [1, 3, 4, 7, 1, 3, 4, 7]

print(b*2)
# result:
# [ 2 6 8 14]
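The same contrast shows up with other operators. A short sketch, assuming the same list and array as above:

```python
import numpy as np

a = [1, 3, 4, 7]
b = np.array(a)

list_sum = a + a     # lists concatenate: [1, 3, 4, 7, 1, 3, 4, 7]
array_sum = b + b    # arrays add element by element: [2, 6, 8, 14]
array_sq = b ** 2    # element-wise power: [1, 9, 16, 49]

print(list_sum)
print(array_sum)
print(array_sq)
```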

Different ways to create an array

numpy.zeros & numpy.ones

You can use numpy.zeros or numpy.ones to create an array filled with 0 or 1 with the specified shape.

  • numpy.ones(shape, dtype=None, order='C')
  • numpy.zeros(shape, dtype=None, order='C')

The following examples show how:

a = np.zeros((2,3))
print(a)

[[0. 0. 0.]
 [0. 0. 0.]]

b = np.ones((3,3))
print(b)

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
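Both functions produce floating-point values unless told otherwise, which is why the outputs above show `0.` and `1.`. A small sketch of the `dtype` parameter from the signatures above (the chosen dtypes are just examples):

```python
import numpy as np

a = np.zeros((2, 3), dtype=int)   # integer zeros instead of floats
b = np.ones(4, dtype=bool)        # a 1D array of True values

print(a)
print(b)
print(a.dtype, b.dtype)
```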

numpy.arange

Generate an array with values within a half-open interval [start, stop)

  • numpy.arange([start=0,]stop,[step=1,]dtype=None)
a = np.arange(3)
print(a)
# [0 1 2]

b = np.arange(2,3,0.2)
print(b)
# [2. 2.2 2.4 2.6 2.8]
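Two more cases that may be useful, sketched here with made-up variable names: a custom integer step and a negative step. In both, the stop value is excluded.

```python
import numpy as np

evens = np.arange(0, 10, 2)    # [0 2 4 6 8]
down = np.arange(5, 0, -1)     # [5 4 3 2 1]

print(evens)
print(down)
```

For non-integer steps such as 0.2, the NumPy documentation notes that numpy.linspace is often the better choice, because rounding can make the length of the arange result hard to predict.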

numpy.linspace

Create an array with evenly spaced numbers over a specified interval [start, stop]

  • numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)
a = np.linspace(2.0, 3.0, 5)
print(a)
# [2. 2.25 2.5 2.75 3. ]

b = np.linspace(2.0, 3.0, 5, False)
print(b)
# [2. 2.2 2.4 2.6 2.8]
a = np.linspace((0, 0, 0), (2, 4, 6), 3, axis=0)
print(a)

[[0. 0. 0.]
 [1. 2. 3.]
 [2. 4. 6.]]

b = np.linspace((0, 0, 0), (2, 4, 6), 3, axis=1)
print(b)

[[0. 1. 2.]
 [0. 2. 4.]
 [0. 3. 6.]]
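The `retstep` parameter in the signature is not used in the examples above. A brief sketch of what it returns, assuming an interval of 0 to 1:

```python
import numpy as np

# With retstep=True, linspace returns a (samples, step) tuple.
samples, step = np.linspace(0.0, 1.0, 5, retstep=True)

print(samples)
print(step)   # spacing between neighbouring samples
```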

numpy.logspace

Create an array with numbers spaced evenly on a log scale

  • numpy.logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None, axis=0)
a = np.logspace(2.0, 3.0, 4)
print(a)
# [ 100. 215.443469 464.15888336 1000. ]

b = np.logspace(2.0, 3.0, 4, False)
print(b)
# [100. 177.827941 316.22776602 562.34132519]

c = np.logspace(2.0, 3.0, 4, base=2.0)
print(c)
# [4. 5.0396842 6.34960421 8. ]
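One way to read these results: `start` and `stop` are exponents, not values. The sketch below rebuilds the first example by hand to check that reading.

```python
import numpy as np

exponents = np.linspace(2.0, 3.0, 4)   # evenly spaced exponents
by_hand = 10.0 ** exponents            # raise the base to each exponent
direct = np.logspace(2.0, 3.0, 4)

same = np.allclose(by_hand, direct)
print(same)
```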

numpy.argmax

Returns the indices of the maximum values along an axis.

https://numpy.org/doc/stable/reference/generated/numpy.argmax.html

np.argmax(a, axis=None, out=None)

Example:

a = np.arange(6).reshape(2,3) + 10
#array([[10, 11, 12],
# [13, 14, 15]])

np.argmax(a) # Returns the index of the max value in the flattened array (5 in this case, where the value is 15)
#5

np.argmax(a, axis=0)
#array([1, 1, 1])

np.argmax(a, axis=1)
#array([2, 2])
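Without `axis`, the result is an index into the flattened array. If the row and column are needed instead, one option is `np.unravel_index`, sketched here on the same array:

```python
import numpy as np

a = np.arange(6).reshape(2, 3) + 10

flat_index = np.argmax(a)                          # index into the flattened array
row, col = np.unravel_index(flat_index, a.shape)   # convert to (row, column)

print(flat_index, row, col, a[row, col])
```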

More about Reshape

What does -1 mean in numpy reshape?

z = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])
z.shape
#(3, 4)

z.reshape(-1)
#array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])


z.reshape(-1,1)
#array([[ 1],
# [ 2],
# [ 3],
# [ 4],
# [ 5],
# [ 6],
# [ 7],
# [ 8],
# [ 9],
# [10],
# [11],
# [12]])

z.reshape(-1, 2)
#array([[ 1, 2],
# [ 3, 4],
# [ 5, 6],
# [ 7, 8],
# [ 9, 10],
# [11, 12]])
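In short, -1 asks NumPy to work out that dimension from the total number of elements. A small sketch, assuming the same 12-element array, that also shows what happens when the sizes do not divide evenly:

```python
import numpy as np

z = np.arange(1, 13).reshape(3, 4)   # same values as the array above

inferred = z.reshape(2, -1)          # 12 elements / 2 rows -> 6 columns
print(inferred.shape)

try:
    z.reshape(5, -1)                 # 12 is not divisible by 5
    failed = False
except ValueError:
    failed = True
print(failed)
```

Only one dimension can be -1 in a single reshape call.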



Mathematical functions

https://numpy.org/doc/stable/reference/routines.math.html

numpy.ndarray

https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html#numpy-ndarray

Polynomials

Import numpy.polynomial

from numpy.polynomial import polynomial as P

Basics

  • Polynomial(coef[, domain, window])
    $p(x) = 6 - 7x + x^{3}$
f = P.Polynomial( [6,-7,0,1] ) # ^The above equation
print(f.roots())
# [-3. 1. 2.]
  • polyval(x, c[, tensor]) - Evaluate a polynomial at specific values of x.
  • polyroots(c) - Compute the roots of a polynomial.
  • polyfromroots(roots) - Generate a monic polynomial with the given roots.
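A short sketch of the three functions listed above, using the same polynomial 6 - 7x + x^3. Coefficients are ordered from lowest to highest degree.

```python
import numpy as np
from numpy.polynomial import polynomial as P

c = [6, -7, 0, 1]                  # 6 - 7x + x^3, lowest degree first

value = P.polyval(2, c)            # 6 - 14 + 8 = 0, so x = 2 is a root
roots = P.polyroots(c)             # approximately -3, 1 and 2
rebuilt = P.polyfromroots(roots)   # approximately the original coefficients

print(value)
print(roots)
print(rebuilt)
```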

Fitting

  • polyfit(x,y, deg[, rcond, full, w]) - Least squares fit of a polynomial to data
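A minimal fitting sketch. The data here is invented for the example: an exact quadratic with no noise, so the fit should recover the coefficients almost exactly.

```python
import numpy as np
from numpy.polynomial import polynomial as P

x = np.linspace(-1, 1, 20)
y = 1 + 2 * x + 3 * x**2      # made-up data: exact quadratic, no noise

coef = P.polyfit(x, y, 2)     # degree-2 fit, coefficients lowest degree first
print(coef)
```

With real, noisy data the returned coefficients are the least-squares estimate rather than an exact match.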

Polynomial arithmetic

https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.polynomials.polynomial.html#algebra

  • polyadd(c1, c2) - Add one polynomial to another.
  • polysub(c1, c2) - Subtract one polynomial from another.
  • polymul(c1, c2) - Multiply one polynomial by another.
  • polydiv(c1, c2) - Divide one polynomial by another.
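A quick sketch of the four operations on two small example polynomials (chosen arbitrarily). All of them take and return coefficient arrays ordered from lowest to highest degree.

```python
import numpy as np
from numpy.polynomial import polynomial as P

c1 = [1, 2, 3]   # 1 + 2x + 3x^2
c2 = [3, 2, 1]   # 3 + 2x + x^2

total = P.polyadd(c1, c2)        # 4 + 4x + 4x^2
diff = P.polysub(c1, c2)         # -2 + 0x + 2x^2
prod = P.polymul(c1, c2)         # 3 + 8x + 14x^2 + 8x^3 + 3x^4
quo, rem = P.polydiv(prod, c2)   # dividing the product by c2 gives back c1

print(total, diff, prod, quo, rem)
```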

Matplotlib

Matplotlib can be used to create a variety of 2D and 3D plots with Python.

How to import

import matplotlib.pyplot as plt

Plot an equation

y = [1, 4, 3, 2, 8, 5, 7]
plt.plot(y)
plt.show()

Equation with labels

y = [1, 4, 3, 2, 8, 5, 7]
x = [1, 2, 3, 4, 5, 6, 7]
plt.plot(x,y)
plt.title('title')
plt.ylabel('y label')
plt.xlabel('x label')
plt.show()

Plot multiple equations

r = np.arange(0,2*np.pi,0.1)
z1 = np.sin(r)
plt.plot(r,z1)

z2 = np.sin(r-np.pi/4)
plt.plot(r,z2)

z3 = np.sin(r-np.pi/2)
plt.plot(r,z3)

z4 = np.sin(r-np.pi)
plt.plot(r,z4)

plt.legend(['z1','z2','z3','z4'],loc=1)

LaTeX support

Matplotlib supports special symbols for displaying equations and other mathematical expressions.

t=np.arange(0,1000)
plt.plot(t,0.25*np.exp(-0.005*t))
plt.title(r'$0.25{\it e}^{\alpha{\it t}}$')
plt.xlabel(r'$\mu$ sec')
plt.ylabel('Amplitude')

Subplot

subplot(nrows, ncols, index, **kwargs)

  • nrows - Number of rows
  • ncols - Number of columns
  • index - Position of the subplot in the grid, starting at 1 and counted row by row
x=np.linspace(0,1)
y=np.exp(-x)*np.cos(6*np.pi*x)
z=np.exp(-x)

plt.subplot(2,2,1)
plt.plot(x,y)

plt.subplot(2,2,2)
plt.plot(x,z,'r:',x,-z,'r:')

plt.subplot(2,1,2)
plt.plot(x,y, x,z,'r:',x,-z,'r:')
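To see how `index` maps onto the grid, the sketch below fills a 2 x 3 grid and titles each subplot with its index. The "Agg" backend line is an assumption so the code runs without a display; drop it and call plt.show() when working interactively.

```python
import matplotlib
matplotlib.use("Agg")   # assumption: render off-screen, no window needed
import matplotlib.pyplot as plt

fig = plt.figure()
for index in range(1, 7):            # a 2 x 3 grid has positions 1 to 6
    ax = plt.subplot(2, 3, index)    # positions run left to right, then down
    ax.set_title(f"index={index}")

print(len(fig.axes))                 # number of subplots created
```

In the example above, `plt.subplot(2,1,2)` works the same way: it treats the figure as a 2 x 1 grid and takes the second (bottom) cell, which spans the full width.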

Axis configuration

  • plt.xticks(ticks, labels)
  • plt.yticks(ticks, labels)

Change the tick labels of the x axis

x=np.linspace(0,2*np.pi,100) # cover the full range of the ticks below
plt.plot(x,np.sin(x))

t=np.linspace(0,2*np.pi,5)
lbl=['0','pi/2','pi','3pi/2','2pi']
plt.xticks(ticks=t, labels=lbl)

An example combining subplots with axis configuration:

x=np.linspace(0,2,200)
y=np.exp(-x)*np.cos(10*np.pi*x)
z=np.exp(-x)

plt.subplot(2,1,1)
plt.plot(x,y)
plt.subplot(2,1,2)
plt.plot(x,y)

# hspace - the height space between 2 graphs
plt.subplots_adjust(hspace=0.6)
# show only x from 0.25 to 0.75 (and y from -0.75 to 0.75) in the second graph
plt.axis([0.25, 0.75, -0.75, 0.75])
# Axis configuration
plt.xticks(ticks=[0.25, 0.5, 0.75])

Plot a bar chart

https://matplotlib.org/api/_as_gen/matplotlib.pyplot.bar.html#matplotlib-pyplot-bar

bar(x, height, width=0.8, bottom=None, *, align='center', data=None, **kwargs)

  • x - The x coordinates of the bars.
  • height - The height of the bars.
  • width - The width of the bars.
  • bottom - The y coordinate(s) of the bar bases.
  • align - Alignment of the bars to the x coordinates.
    • 'center' or 'edge'

Normal Example

cata = [20, 35, 30, 35, 27]
catb = [15, 45, 50, 75, 97]

index = np.arange(5)
width=0.4

p1 = plt.bar(index-width/2, cata, width=width)
p2 = plt.bar(index+width/2, catb, width=width)
plt.legend((p1[0], p2[0]), ['Category A', 'Category B'])

Example with bottom

cata = [20, 35, 30, 35, 27]
catb = [15, 45, 50, 75, 97]

index = np.arange(5)

p1 = plt.bar(index, cata)
p2 = plt.bar(index, catb, bottom=cata)
plt.xticks(index, ['G1', 'G2', 'G3', 'G4', 'G5'])
plt.legend((p1[0], p2[0]), ['Category A', 'Category B'])

Plot a pie chart

https://matplotlib.org/api/_as_gen/matplotlib.pyplot.pie.html#matplotlib-pyplot-pie

pie(x, explode=None, labels=None, colors=None, autopct=None)

  • x - The wedge sizes
  • explode - An array which specifies the fraction of the radius with which to offset each wedge.
  • labels - A sequence of strings providing the labels for each wedge.
  • autopct - A string or function used to label the wedges with their numeric value.

Example

count = [4, 5, 6, 3, 4]
labels = ['Comedy','Action','Romance','Drama','SciFi']

plt.pie(count, labels=labels, autopct='%.1f%%')

Example with explode

count = [4, 5, 6, 3, 4]
labels = ['Comedy','Action','Romance','Drama','SciFi']
explode = [1,0,0,0,0]

plt.pie(count, labels=labels, explode=explode, autopct='%.1f%%')

Plot a histogram

https://matplotlib.org/api/_as_gen/matplotlib.pyplot.hist.html#matplotlib-pyplot-hist

hist(x, bins=None, range=None, density=None, weights=None, cumulative=False, bottom=None, histtype='bar', align='mid', orientation='vertical', rwidth=None, log=False, color=None, label=None, stacked=False, *, data=None, **kwargs)

  • x - Data to distribute among bins, specified as a vector.
  • bins - Number of bins of the histogram, or a sequence of bin edges. Default bins = 10.
  • range - The lower and upper range of the bins.
  • cumulative - If True, then a histogram is computed where each bin gives the counts in that bin plus all bins for smaller values.
  • rwidth - The relative width of the bars as a fraction of the bin width.
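Besides drawing, `plt.hist` also returns the computed counts and bin edges. A small sketch on invented data, assuming the off-screen "Agg" backend so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")   # assumption: render off-screen, no window needed
import matplotlib.pyplot as plt
import numpy as np

data = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5])   # made-up sample

# 4 bins over [1, 5]: [1,2), [2,3), [3,4), [4,5] (the last bin includes 5)
counts, edges, patches = plt.hist(data, bins=4, range=(1, 5))

print(counts)   # values per bin
print(edges)    # bin boundaries, one more than the number of bins
```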

Example

x=np.random.randn(1000) # Generate 1000 random numbers from the standard normal distribution

plt.subplot(1,2,1)
plt.hist(x)

plt.subplot(1,2,2)
plt.hist(x,bins=20, color='pink')

More examples

x=np.random.randn(1000) # Generate 1000 random numbers from the standard normal distribution

plt.subplot(2,2,1)
plt.hist(x)

plt.subplot(2,2,2)
plt.hist(x,range=[-3,3], color='green') # Remove outliers


plt.subplot(2,2,3)
plt.hist(x,bins=np.arange(-3,4), color='orange') # Set bin edges

plt.subplot(2,2,4)
plt.hist(x,cumulative=True,bins=100, color='purple') # Cumulative plot

Pandas

  • A fast and efficient DataFrame object for data manipulation;
  • Tools for reading and writing data of different formats: CSV and text files, Microsoft Excel, SQL databases, and the fast HDF5 format;
  • Intelligent data alignment and integrated handling of missing data;
  • Flexible reshaping and pivoting of data sets;
  • Time series-functionality;
  • Python with pandas is in use in a wide variety of academic and commercial domains, including Finance, Neuroscience, Economics, Statistics, Advertising, Web Analytics, and more.

pandas deals with the following data structures:

  • Series - 1D labeled array
  • DataFrame - General 2D labeled, size-mutable tabular structure
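A brief sketch of a Series, to make the "1D labeled array" description concrete. The names and ages are sample data chosen for the example.

```python
import pandas as pd

ages = pd.Series([19, 22, 25], index=["Alex", "Bob", "Calvin"])

bob_age = ages["Bob"]               # access by label
older = ages[ages > 20]             # boolean filtering keeps the labels
frame = ages.to_frame(name="Age")   # a Series becomes a one-column DataFrame

print(bob_age)
print(older)
print(frame.shape)
```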

How to import

import pandas as pd

pandas.Series

Not so important so I skipped it
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.html#pandas-series

pandas.DataFrame

In real-life programming, we often use DataFrames.

Example:

staff = {'Name':['Alex', 'Bob', 'Calvin', 'David'],
         'Age': [19, 22, 25, 32],
         'Qualification':['UG', 'MPhil', 'MPhil', 'PhD']}

# Convert the dictionary into DataFrame
df = pd.DataFrame(staff)

print(df.index) # Get all the indexes
# RangeIndex(start=0, stop=4, step=1)

print(df.columns) #Get all the column names
# Index(['Name', 'Age', 'Qualification'], dtype='object')

print(df.shape) # Get the shape of the DataFrame
# (4, 3)

df.info() # Show column types and non-null counts (info() prints directly and returns None)

print(df.describe()) # Get more info about the dataframe (mean, std)

print(df)

![https://i.imgur.com/xmVed5m.png]

Selection and Indexing
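A minimal sketch of the most common selection forms, assuming the same staff data as in the DataFrame example above. The variable names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alex', 'Bob', 'Calvin', 'David'],
                   'Age': [19, 22, 25, 32],
                   'Qualification': ['UG', 'MPhil', 'MPhil', 'PhD']})

names = df['Name']                           # one column, as a Series
subset = df[['Name', 'Age']]                 # several columns, as a DataFrame
second_row = df.iloc[1]                      # row by integer position
bob_age = df.loc[1, 'Age']                   # row label 1, column 'Age'
mphil = df[df['Qualification'] == 'MPhil']   # rows matching a condition

print(bob_age)
print(mphil)
```

`loc` selects by label and `iloc` by position. They give the same rows here only because the default index is 0, 1, 2, 3.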


Drop element

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html

  • df.drop([labels, axis, index, columns, ...])
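df.drop([0]) # Drop the first row by index label (returns a new DataFrame, df is unchanged)
df.drop([0, 1]) # Drop the first 2 rows by index label

df.drop(columns=['Age', 'Qualification']) # Drop columns method 1
df.drop(['Age', 'Qualification'], axis=1) # Drop columns method 2

df.drop(['label']) # Drop a row by label (raises KeyError if 'label' is not in the index)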

Convert pandas DataFrame to NumPy array

array = df.values # Older attribute, still available

array = df.to_numpy() # Method recommended by the pandas documentation

Reading Data files

  • read_csv(filepath_or_buffer[, sep, ...]) - Read a comma-separated values (CSV) file into a DataFrame.
  • read_excel(io[, sheet_name, header, names, ...]) - Read an Excel file into a DataFrame.
  • read_json([path_or_buf, orient, typ, dtype, ...]) - Convert a JSON string to pandas object.
  • read_sql_table(table_name, con[, schema, ...]) - Read SQL database table into a DataFrame.
  • read_sql_query(sql, con[, index_col, ...]) - Read SQL query into a DataFrame.
  • read_sql(sql, con[, index_col, ...]) - Read SQL query or database table into a DataFrame.
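A small sketch of `read_csv`. To keep it self-contained it reads from an in-memory string instead of a file; the CSV content is made-up sample data, and a file path would normally go in its place.

```python
import io
import pandas as pd

csv_text = "Name,Age\nAlex,19\nBob,22\n"   # hypothetical sample data

# read_csv accepts a path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df.dtypes)
```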

Write to data files

https://pandas.pydata.org/pandas-docs/stable/reference/frame.html#serialization-io-conversion
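As one example from that page, the sketch below uses `to_csv`. Called without a path it returns the CSV text as a string, which keeps the example free of files; passing a file name (any name of your choosing) writes to disk instead.

```python
import io
import pandas as pd

df = pd.DataFrame({'Name': ['Alex', 'Bob'], 'Age': [19, 22]})

csv_text = df.to_csv(index=False)                # index=False leaves out the row labels
restored = pd.read_csv(io.StringIO(csv_text))    # read it back to check the round trip

print(csv_text)
print(restored)
```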