Introduction to Jupyter Notebook, NumPy, & Matplotlib

Outline

Jupyter Notebook

NumPy

Matplotlib

Jupyter Notebook

What is Jupyter Notebook

  • A tool to create and share documents that contain live code, equations, visualizations, and narrative text.
  • Used for data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.
  • Once a notebook is started, variable assignments, function definitions, etc. are kept in memory and updated as you run cells.

Useful features

  • Hit shift+enter to execute the cell
  • Use ! to execute shell commands, try !ls, !pwd, !pip install ...
  • Use ? (e.g., ?numpy, ?numpy.mean) for quick reference on syntax
  • Change themes
    • pip install jupyterthemes to install themes
    • jt -l to check the available themes
    • jt -t <theme> to set the theme
    • jt -r back to the default theme
  • Notebook Extensions
    • install: type pip install jupyter_contrib_nbextensions and then jupyter contrib nbextension install
    • Table of Contents
    • Hinterland
    • Split cells
  • Using LaTeX for formulas
    • $\frac{\partial^2u}{\partial t^2}=c^2\frac{\partial^2 u}{\partial x^2}$
  • Multicursor support

Sharing your notebooks

  • Before you share
    • Click “Cell > All Output > Clear”
    • Click “Kernel > Restart & Run All”
    • Wait for your code cells to finish executing and check that they ran as expected

NumPy

Array programming with NumPy

Nature volume 585, pages 357–362 (2020)

Charles R. Harris, K. Jarrod Millman, and Stéfan J. van der Walt et al.

What is NumPy (Numerical Python)

  • a fundamental package for scientific computing in Python
  • a Python library that includes a multidimensional array object (ndarray), various derived objects, and fast operations on arrays
  • one of the most widely used Python packages, found across science and engineering
  • install numpy: pip install numpy
  • import numpy: import numpy as np

Why NumPy

  • fast
  • the core of the scientific Python and PyData ecosystems
  • the universal standard for working with numerical data in Python
  • works well with Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages
  • all kinds of matrix operations, lots of built-in functions
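
The points above can be sketched in a few lines. This is only an illustration: the variable names are made up for the example, and no timings are claimed. It compares an explicit Python loop with a single vectorized NumPy expression and then uses one built-in matrix operation.

```python
import numpy as np

values = list(range(1000))

# Pure-Python approach: an explicit loop over a list
squares_loop = [v * v for v in values]

# NumPy approach: one vectorized expression over the whole array
arr = np.arange(1000)
squares_vec = arr * arr

# Both approaches give the same numbers
print(np.array_equal(squares_vec, squares_loop))  # True

# Built-in matrix operation: the @ operator is the matrix product
m = np.array([[1, 2], [3, 4]])
product = m @ m
print(product)
```

For large arrays the vectorized form is usually much faster, because the loop runs in compiled code rather than in the Python interpreter.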

ndarray: the core of NumPy

  • a homogeneous n-dimensional array object (unlike Python sequences)
    • The elements in a NumPy array are of the same data type and the same size in memory.
  • NumPy arrays have a fixed size at creation, unlike Python lists (which can grow dynamically)
  • a newly created NumPy array is stored in one contiguous block of memory
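
A minimal sketch of the three properties listed above, using illustrative variable names: mixed inputs are converted to one data type, growing an array creates a new one, and a freshly created array is contiguous in memory.

```python
import numpy as np

# Homogeneous: ints and a float are upcast to a single dtype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)      # float64
print(mixed.itemsize)   # 8 (bytes per element)

# Fixed size: np.append returns a NEW array; the original is unchanged
fixed = np.array([1, 2, 3])
grown = np.append(fixed, 4)
print(fixed.shape, grown.shape)  # (3,) (4,)

# Contiguous memory: a newly created array is C-contiguous
print(fixed.flags['C_CONTIGUOUS'])  # True
```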

A 1D array is a vector

its shape is just the number of components

In [1]:
import numpy as np

data = np.array([1,2,3])
print(data)
[1 2 3]


A 2D array is a matrix; its shape is (number of rows, number of columns)

In [2]:
data = np.array([[1,2],
                 [3,4],
                 [5,6]])
print(data)
[[1 2]
 [3 4]
 [5 6]]
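
The shapes described above can be checked directly with the shape and ndim attributes. A short sketch, with variable names chosen only for this example:

```python
import numpy as np

vector = np.array([1, 2, 3])
matrix = np.array([[1, 2], [3, 4], [5, 6]])

# 1D array: the shape is a one-element tuple holding the number of components
print(vector.shape, vector.ndim)  # (3,) 1

# 2D array: the shape is (number of rows, number of columns)
print(matrix.shape, matrix.ndim)  # (3, 2) 2
```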


Structure of a 3D (3, 4, 2) array

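
One way to explore a (3, 4, 2) array is to build one and index into it. The sketch below fills the array with the integers 0 to 23 purely for illustration; the name cube is an assumption of this example.

```python
import numpy as np

# 3 blocks, each with 4 rows and 2 columns, filled with 0..23
cube = np.arange(24).reshape(3, 4, 2)
print(cube.shape)      # (3, 4, 2)

# Indexing the first axis returns one (4, 2) block
print(cube[0].shape)   # (4, 2)

# The last element along every axis
print(cube[2, 3, 1])   # 23
```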

Memory and strides: what makes NumPy fast

  • a NumPy array is stored in a contiguous block of memory
  • two key concepts relating to memory: dimensions and strides
  • strides are the integer numbers of bytes to step in each dimension to reach the next element

An example of a 2D array

In [3]:
data = np.array([[1,2,3],[4,5,6]], dtype='int16')
data.strides
Out[3]:
(6, 2)
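
The strides (6, 2) follow from the shape and the element size. A small sketch, assuming the default row-major (C-order) layout: moving to the next row skips a full row of elements, and moving to the next column skips one element.

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6]], dtype='int16')
n_rows, n_cols = data.shape

# int16 elements take 2 bytes each
print(data.itemsize)  # 2

# Row stride = bytes in one full row, column stride = bytes in one element
expected = (n_cols * data.itemsize, data.itemsize)
print(data.strides == expected)  # True

# A transpose is a view on the same memory with the strides swapped
print(data.T.strides)  # (2, 6)
```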

[Figure: shape of the 2D array]

[Figure: contiguous memory layout and strides of the 2D array]

The byte offset of an element data[i1,i2], measured from the start of the array's data buffer:

byte_offset = data.strides[0] * i1 + data.strides[1] * i2
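
The formula can be checked against the raw bytes of the array. This is a sketch under one assumption: the array is C-contiguous, so data.tobytes() has the same layout as the underlying buffer. The names raw, offset, and value are illustrative.

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6]], dtype='int16')
raw = data.tobytes()  # 12 bytes: 6 elements of 2 bytes each, in row-major order

# Apply the stride formula to element data[1, 2]
i1, i2 = 1, 2
offset = data.strides[0] * i1 + data.strides[1] * i2
print(offset)  # 10

# Read one int16 starting at that byte offset
value = np.frombuffer(raw, dtype=data.dtype, count=1, offset=offset)[0]
print(value == data[i1, i2])  # True
```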

2D array shape and memory