NumPy is an essential Python library that data scientists use to work efficiently with multi-dimensional arrays.
*The following code was run in a Jupyter notebook.
Here is a motivating example for using NumPy:
Let’s say we have an array or a list and we want to double each element in it. How can we do this in NumPy versus basic Python?
We can always loop through each element in the list and multiply it by 2. Although this basic approach seems fine for a small dataset, it can get very messy if we have to do many more computations with the data in the list.
Example with only Python:
# Let's initialize a list of the numbers from 1 to 10 (inclusive)
numbers = list(range(1, 11))

# Create an empty list, then append each element after multiplying it by 2 in a for-loop
double_num = []
for num in numbers:
    double_num.append(num * 2)

double_num
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
Although this method works, it is long and tedious for a relatively simple task. Luckily, with NumPy we can double every element of an array in one small step by multiplying the whole array by 2.
# We must always import the numpy package first. It is conventional to use `np` as the alias.
import numpy as np
# Let's create an array from 1 to 10 (inclusive)
arr = np.arange(1, 11)

# Multiplies each array cell by 2
double_arr = arr * 2
double_arr
array([ 2, 4, 6, 8, 10, 12, 14, 16, 18, 20])
We have created a new array in which each element of the old array is doubled. The old array itself is left unchanged.
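The same element-wise behavior applies to other arithmetic operators, which is where the approach pays off once more computations are needed. The following is a minimal sketch; the names `doubled`, `shifted`, `squared`, and `combined` are illustrative and not part of the original example, and the formula used for `combined` is arbitrary.

```python
import numpy as np

# Same array as in the example above: the integers 1 through 10
arr = np.arange(1, 11)

# Each arithmetic operator is applied to every element at once
doubled = arr * 2
shifted = arr + 1
squared = arr ** 2

# Several operations can be combined in one expression
# (arbitrary formula, chosen only for illustration)
combined = (arr * 2 + 1) / 3

print(doubled)
print(shifted)
print(squared)
print(combined)

# The original array is not modified by any of the operations above
print(arr)
```

Doing the same with plain lists would need a separate loop, or a longer loop body, for each computation.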
Creating Arrays with NumPy
Creating a one-dimensional array is quick and easy:
oned_arr = np.arange(5)
# We can also create the same array with `np.array([0, 1, 2, 3, 4])`
oned_arr
array([0, 1, 2, 3, 4])
Now let’s create a two-dimensional array. The syntax is very similar to writing a nested list in regular Python.
my_arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(my_arr)
[[1 2 3 4]
 [5 6 7 8]]
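To check that an array has the layout we expect, we can inspect a few of its attributes. This is a small sketch rather than part of the original walkthrough; `same_arr` is an illustrative name, and building it with `reshape` is just one alternative way to get the same values.

```python
import numpy as np

my_arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(my_arr.ndim)   # number of dimensions
print(my_arr.shape)  # (number of rows, number of columns)
print(my_arr.size)   # total number of elements

# Alternative construction (illustrative): generate 1 through 8,
# then arrange the values into 2 rows of 4 columns
same_arr = np.arange(1, 9).reshape(2, 4)
print(np.array_equal(my_arr, same_arr))
```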
Accessing Rows, Columns, and Elements in an Array
Accessing elements of a NumPy array is very similar to accessing elements of a list: in both cases we use indexing.
# Accesses the "first" element(s) in an array. In this case it accesses the first row in the array.
my_arr[0]
array([1, 2, 3, 4])
# This accesses the specific element in the whole two-dimensional array:
# This is how we would also do it using list indexing with Python.
my_arr[0][3]
4
# With Numpy, we can access an element in a 2D array by including both row and column numbers in a single set of brackets
# separated by a comma
my_arr[0, 3]
4
We can also access elements using negative indices (e.g. [-1]), which, just as in regular Python, count backwards from the end.
my_arr[-1, -1]
8
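The indexing forms above can be compared side by side. This is a short sketch using the same `my_arr`; the variable names on the left are illustrative only.

```python
import numpy as np

my_arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Chained indexing (list style) and comma indexing (NumPy style)
# reach the same element: row 0, column 3
chained = my_arr[0][3]
single = my_arr[0, 3]
print(chained == single)

# A single negative index selects a whole row, counted from the end
last_row = my_arr[-1]
print(last_row)

# Negative row and column indices together select one element;
# here that is the same element as my_arr[1, 3]
last_elem = my_arr[-1, -1]
print(last_elem == my_arr[1, 3])
```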
Array Slicing
Slicing is also available for NumPy arrays:
my_arr[0][0:3] # Looks into the first row in the array and accesses the first 3 elements in the row.
my_arr[0:2] # Accesses the first 2 rows of the array
my_arr[:, 0:2] # Accesses all rows but only the first two columns
array([[1, 2],
       [5, 6]])
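Because a notebook cell only displays the value of its last expression, the output above shows just the third slice. The sketch below stores each slice in its own variable so all results can be printed; the variable names are illustrative, and the single-column slice `my_arr[:, 1]` is an extra case that does not appear in the original cell.

```python
import numpy as np

my_arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# First three elements of the first row, written with comma indexing
first_three = my_arr[0, 0:3]

# First two rows (this array only has two rows, so this is every row)
first_two_rows = my_arr[0:2]

# All rows, but only the first two columns
first_two_cols = my_arr[:, 0:2]

# All rows, one column: the result is a one-dimensional array
second_col = my_arr[:, 1]

print(first_three)
print(first_two_rows)
print(first_two_cols)
print(second_col)
```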