Using an array is an efficient way to handle multiple elements. It is a better choice than a list (or a list-of-lists) if you are doing calculations with a set of numbers, but not as good if you are working with strings or doing general data handling and cleaning as opposed to mathematics.

Note that a vector is the name for a one-dimensional array and a matrix is a two-dimensional array. These do not exist as separate data types in Python - there are only arrays - but you can still call an array a vector or a matrix if it fits the definition.

1 Create an Array

You have two options for creating arrays:

1.1 Using the Standard Library

Python has a built-in module called array that can be used as follows:

import array

ar = array.array('i', [1, 2, 3, 4, 5, 6, 7])
print(ar)
## array('i', [1, 2, 3, 4, 5, 6, 7])
print(type(ar))
## <class 'array.array'>

This object ar is more memory-efficient than a list would have been. The i that has been used is the type code which specifies what data type will apply for all elements in the array. In this case they will be signed integers. For all the available type codes, see the documentation.
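
As a small illustration of what the type code does, here is a minimal sketch (the variable names are only for illustration). It assumes nothing beyond the standard library and shows that values of the wrong type are rejected, and that a different type code such as 'd' stores floats instead:

```python
import array

ar = array.array('i', [1, 2, 3])
# The type code is stored on the object
print(ar.typecode)
## i
# Number of bytes used per element (platform-dependent for 'i')
print(ar.itemsize)

# Appending a value of the wrong type is rejected
try:
    ar.append(1.5)
    rejected = False
except TypeError:
    rejected = True
print(rejected)
## True

# The 'd' type code stores double-precision floats instead
ar_float = array.array('d', [1.0, 2.5])
print(ar_float)
## array('d', [1.0, 2.5])
```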

However, this module has a drawback: it only allows you to create one-dimensional arrays. It also does not have a particularly large set of mathematical operations that are compatible with this data type. For greater functionality we need to use the NumPy package.

1.2 Using NumPy

If we have NumPy installed, we can create an array manually by first creating a list (or a list-of-lists) of numbers and then using the array() function to convert it:

import numpy as np

ls = [1, 2, 3, 4, 5, 6, 7]
ar = np.array(ls)
print(ar)
## [1 2 3 4 5 6 7]
print(type(ar))
## <class 'numpy.ndarray'>

Note the data type of this object: it is an ndarray. It is a 1D array (a vector); here’s a 2D array (a matrix):

ls_of_ls = [
    [1, 2, 3, 4, 5, 6, 7],
    [1, 2, 3, 4, 5, 6, 7],
    [1, 2, 3, 4, 5, 6, 7]
]
ar = np.array(ls_of_ls)
print(ar)
## [[1 2 3 4 5 6 7]
##  [1 2 3 4 5 6 7]
##  [1 2 3 4 5 6 7]]

To create an array of all ones for a given size:

ar = np.ones(5)
print(ar)
## [1. 1. 1. 1. 1.]

…and of all zeroes for a given size:

ar = np.zeros(9)
print(ar)
## [0. 0. 0. 0. 0. 0. 0. 0. 0.]
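
The "size" can also have more than one dimension, and the element type can be changed. The following is a short sketch (variable names are illustrative) that passes a tuple as the shape and uses the dtype keyword argument:

```python
import numpy as np

# A tuple gives the size of each dimension: here 2 rows and 3 columns
ones_2d = np.ones((2, 3))
print(ones_2d)
## [[1. 1. 1.]
##  [1. 1. 1.]]

# The dtype argument changes the elements from floats to integers
zeros_int = np.zeros(4, dtype=int)
print(zeros_int)
## [0 0 0 0]
```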

For the rest of this page we will use NumPy arrays.

2 Index an Array

Indexing is done with square brackets. Remember that Python uses zero-indexing, meaning that the first element is at index 0, the second is at index 1 and so on. This means that the number at index 4 of the list/array [10, 20, 30, 40, 50, 60, 70] is “50”:

ar = np.array([10, 20, 30, 40, 50, 60, 70])
print(ar[4])
## 50

Use the enumerate() function to iterate over both the values in an array and their indexes. This can be used, for example, to find the index of the first value in an array to meet a certain condition. In the below example, the first occurrence of the number “40” is searched for using the next() function and it is found at index 3 in the array:

# Index of first value meeting a condition
idx = next(i for i, v in enumerate(ar) if v == 40)
print(idx)
## 3

For the record, the actual value (as opposed to the index) can be returned in a similar fashion:

# First value meeting a condition
value = next(v for i, v in enumerate(ar) if v == 40)
print(value)
## 40
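
One thing to be aware of is that next() raises a StopIteration error if nothing matches. The sketch below shows one possible way to handle that by giving next() a default value, and also a NumPy-only alternative using flatnonzero(); the variable names are just for illustration:

```python
import numpy as np

ar = np.array([10, 20, 30, 40, 50, 60, 70])

# A default value (here None) is returned when no element matches
idx = next((i for i, v in enumerate(ar) if v == 45), None)
print(idx)
## None

# NumPy-only alternative: get the indexes of all matches, then take the first
matches = np.flatnonzero(ar == 40)
first = matches[0] if matches.size > 0 else None
print(first)
## 3
```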

If an array is already sorted into numerical order (as this one is), the searchsorted() function finds the index where, if you were to insert a given number, it would maintain the numerical order:

# Index where a number can be inserted while maintaining order
idx = np.searchsorted(ar, 35)
print(idx)
## 3
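
If the array is not already in order it needs to be sorted first, otherwise the result of searchsorted() is not meaningful. Here is a minimal sketch of the full sort, search and insert sequence (the example values and names are made up for illustration):

```python
import numpy as np

unsorted = np.array([50, 10, 40, 20, 30])

# searchsorted() assumes ascending order, so sort first
ordered = np.sort(unsorted)
print(ordered)
## [10 20 30 40 50]

# Index where 35 would need to go
idx = np.searchsorted(ordered, 35)
print(idx)
## 3

# Insert the number at that index (this returns a new array)
inserted = np.insert(ordered, idx, 35)
print(inserted)
## [10 20 30 35 40 50]
```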

2.1 Filter an Array

When indexing an array with a Boolean mask (a list of trues and falses) it has the effect of filtering the array:

ar = np.array([10, 20, 30, 40, 50, 60, 70])

# Create a Boolean mask
mask = [False, False, False, False, True, True, True]
# Filter the array
filtered = ar[mask]
print(filtered)
## [50 60 70]

This is more usually used when searching for values that meet certain criteria, eg here’s how to get the values that are greater than 40:

# Filter the array
filtered = ar[ar > 40]
print(filtered)
## [50 60 70]

Logical operators can be used to combine Boolean masks and create more complicated filters, eg getting the values that are either above 50 or below 30 using the | (“or”) operator:

# Filter the array
filtered = ar[(ar > 50) | (ar < 30)]
print(filtered)
## [10 20 60 70]
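
The other logical operators work in the same way. As a brief sketch, here are the & (“and”) and ~ (“not”) operators, plus the where() function for when you want the indexes of the matching values rather than the values themselves:

```python
import numpy as np

ar = np.array([10, 20, 30, 40, 50, 60, 70])

# & keeps the values where both conditions hold
between = ar[(ar > 20) & (ar < 60)]
print(between)
## [30 40 50]

# ~ inverts a mask
not_between = ar[~((ar > 20) & (ar < 60))]
print(not_between)
## [10 20 60 70]

# where() returns the indexes of the values that meet the condition
indexes = np.where(ar > 40)[0]
print(indexes)
## [4 5 6]
```

Note that each condition needs its own set of round brackets because & and | bind more tightly than the comparison operators.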

3 Augment an Array

3.1 Append

Add to an array using NumPy’s append() function. You can append a single element, an array of elements or a list of them:

ar = np.array([10, 20, 30, 40, 50, 60, 70])

# Append a single number
ar = np.append(ar, 80)
print(ar)
## [10 20 30 40 50 60 70 80]
# Append an array
ar = np.append(ar, np.array([90, 100, 110]))
print(ar)
## [ 10  20  30  40  50  60  70  80  90 100 110]
# Append a list
ar = np.append(ar, [120, 130, 140])
print(ar)
## [ 10  20  30  40  50  60  70  80  90 100 110 120 130 140]
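
Two behaviours of append() are worth seeing in action: it returns a new array rather than changing the original, and with two-dimensional data it flattens everything unless an axis is given. A short sketch (names are illustrative):

```python
import numpy as np

ar = np.array([[1, 2], [3, 4]])

# Without an axis, the result is flattened to one dimension
flat = np.append(ar, [[5, 6]])
print(flat)
## [1 2 3 4 5 6]

# With axis=0, the new values are added as a new row
rows = np.append(ar, [[5, 6]], axis=0)
print(rows)
## [[1 2]
##  [3 4]
##  [5 6]]

# The original array is unchanged
print(ar.shape)
## (2, 2)
```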

3.2 Concatenate

To append elements as a new row you need to use the concatenate() function, although this can be a bit confusing. If you simply concatenate two one-dimensional arrays it will act identically to the append() function (although note that you need to use an extra set of round brackets when specifying the arrays to concatenate):

ar1 = np.array([1, 2, 3])
ar2 = np.array([4, 5, 6])

# Concatenate two one-dimensional arrays (axis defaults to 0)
ar = np.concatenate((ar1, ar2))  # Note the double round brackets
print(ar)
## [1 2 3 4 5 6]

The reason you need to use two sets of round brackets is that the first argument to the concatenate() function is a single sequence (here a tuple) containing all the arrays to be joined. The function also has a keyword argument called “axis”. If we leave it out (as we did in the previous example) it will take the default value of 0, which joins one-dimensional arrays end-to-end. This is why the functionality of the above example was identical to the append() function. If axis is instead set to None the arrays are flattened before being concatenated which, for one-dimensional arrays, gives the same result. Here is the example again but with axis=None explicitly specified:

# Flatten arrays before concatenating them
ar = np.concatenate((ar1, ar2), axis=None)
print(ar)
## [1 2 3 4 5 6]

As you can see, for one-dimensional arrays the output of the above example is identical to the one before it, and both are identical to using the append() function.

If we want to take control of the behaviour we need to specify which axis to concatenate along, either the “0” axis or the “1” axis. In our previous examples, our arrays have only had one dimension each and with one-dimensional data we only have the option to concatenate along the 0th axis:

ar1 = np.array([1, 2, 3])
ar2 = np.array([4, 5, 6])

# Concatenate arrays horizontally
ar = np.concatenate((ar1, ar2), axis=0)
print(ar)
## [1 2 3 4 5 6]

If we changed the above to axis=1 the script would fail because it would be looking for an extra dimension to the data which doesn’t exist. If we instead used two-dimensional data (ie arrays made from lists-of-lists) we could do the following:

ar1 = np.array([[1, 2, 3]])  # Note the double square brackets creating a list-of-list
ar2 = np.array([[4, 5, 6]])  # Note the double square brackets creating a list-of-list

# Concatenate arrays vertically
ar = np.concatenate((ar1, ar2), axis=0)
print(ar)
## [[1 2 3]
##  [4 5 6]]
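
The difference between the one-dimensional and two-dimensional cases can be checked with the .shape attribute. The sketch below (variable names are illustrative) also confirms that asking for axis=1 on one-dimensional arrays raises an error; catching ValueError is an assumption that holds because NumPy's axis error is a subclass of it:

```python
import numpy as np

flat1 = np.array([1, 2, 3])
flat2 = np.array([4, 5, 6])
# One dimension of length 3
print(flat1.shape)
## (3,)

row1 = np.array([[1, 2, 3]])
# Two dimensions: 1 row and 3 columns
print(row1.shape)
## (1, 3)

# One-dimensional arrays have no axis 1, so this fails
try:
    np.concatenate((flat1, flat2), axis=1)
    failed = False
except ValueError:
    failed = True
print(failed)
## True
```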

Note that axis=0 doesn’t consistently mean “concatenate horizontally” or “concatenate vertically”. It means “concatenate along the first dimension of the arrays”. For one-dimensional arrays that is the only dimension available, whereas for two-dimensional arrays it is the rows, which is why the output above is stacked vertically. For this two-dimensional data, axis=1 will therefore concatenate horizontally and we will yet again produce 1 2 3 4 5 6 as an output. This time the result is still two-dimensional (a single row), as shown by the double square brackets in the output:

# Concatenate arrays horizontally
ar = np.concatenate((ar1, ar2), axis=1)
print(ar)
## [[1 2 3 4 5 6]]

If you recall our earlier concatenate() examples you’ll remember that specifying the axis as None will flatten the arrays before concatenating, resulting in one-dimensional data (note the single square brackets in the output). With two-dimensional arrays this is not the same as leaving the axis out, because the default is axis=0:

# Flatten arrays before concatenating them
ar = np.concatenate((ar1, ar2), axis=None)
print(ar)
## [1 2 3 4 5 6]
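
As a quick check of how leaving the axis out compares with axis=None for two-dimensional data, the following sketch prints the shape of each result (variable names are illustrative):

```python
import numpy as np

ar1 = np.array([[1, 2, 3]])
ar2 = np.array([[4, 5, 6]])

# Leaving the axis out uses the default of axis=0: the rows are stacked
default = np.concatenate((ar1, ar2))
print(default.shape)
## (2, 3)

# axis=None flattens the arrays first, giving one-dimensional data
flattened = np.concatenate((ar1, ar2), axis=None)
print(flattened.shape)
## (6,)
```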

Finally, here is a way to do a ‘line-break’: convert a one-dimensional array into a two-dimensional one:

ar = np.array([1, 2, 3, 4, 5, 6])
# Perform a line-break
ar = np.concatenate(([ar[0:3]], [ar[3:]]), axis=0)
print(ar)
## [[1 2 3]
##  [4 5 6]]
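
An alternative way to get the same ‘line-break’ result, not used above, is the .reshape() method. This sketch assumes the number of elements divides evenly into the requested shape:

```python
import numpy as np

ar = np.array([1, 2, 3, 4, 5, 6])

# Reshape into 2 rows and 3 columns
reshaped = ar.reshape(2, 3)
print(reshaped)
## [[1 2 3]
##  [4 5 6]]

# Using -1 lets NumPy work out that dimension from the others
inferred = ar.reshape(-1, 3)
print(inferred.shape)
## (2, 3)
```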

3.3 Transpose

Flipping an array along its diagonal (swapping its rows and columns) can be done with the .T attribute:

ar = np.array([
    [10, 20, 30],
    [40, 50, 60]
])
# Original array
print(ar)
## [[10 20 30]
##  [40 50 60]]
# Transposed array
print(ar.T)
## [[10 40]
##  [20 50]
##  [30 60]]

The .transpose() method also works:

# Transposed array
print(ar.transpose())
## [[10 40]
##  [20 50]
##  [30 60]]
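
Transposing swaps the dimensions of the array, which can be seen from its .shape attribute. One consequence, sketched below with illustrative names, is that transposing a one-dimensional array has no effect because there is only one dimension to swap; to get a column you can reshape instead:

```python
import numpy as np

ar = np.array([
    [10, 20, 30],
    [40, 50, 60]
])
print(ar.shape)
## (2, 3)
print(ar.T.shape)
## (3, 2)

# A one-dimensional array is unchanged by transposition
vec = np.array([1, 2, 3])
print(vec.T.shape)
## (3,)

# Reshape to get a column (3 rows, 1 column) instead
col = vec.reshape(-1, 1)
print(col.shape)
## (3, 1)
```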

Here’s how to use transposition to concatenate arrays in the exact way you want:

ar1 = np.array([[1, 2], [3, 4]])
ar2 = np.array([[5, 6]])
ar = np.concatenate((ar1, ar2.T), axis=1)
print(ar)
## [[1 2 5]
##  [3 4 6]]

4 Array Arithmetic

Arrays can be added, subtracted, multiplied, etc.

4.1 Array Addition

ar1 = np.array([10, 20, 30])
ar2 = np.array([10, 20, 30])
# Add elements
ar = ar1 + ar2
print(ar)
## [20 40 60]

4.2 Array Subtraction

# Subtract elements
ar = ar - ar2
print(ar)
## [10 20 30]

4.3 ‘Simple’ Multiplication

Here’s the difference between multiplying a list by 2 versus multiplying an array by 2:

# Multiply a list by 2
ls = [1, 2, 3] * 2
print(ls)
## [1, 2, 3, 1, 2, 3]
# Multiply an array by 2
ar = np.array([1, 2, 3]) * 2
print(ar)
## [2 4 6]

As you can see, multiplication works on a list as an object whereas it works on the elements of an array.
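
The same element-by-element behaviour applies when both operands are arrays of the same size, and to other operators such as division and exponentiation. A short sketch with made-up values:

```python
import numpy as np

ar1 = np.array([10, 20, 30])
ar2 = np.array([1, 2, 3])

# Multiply corresponding elements
product = ar1 * ar2
print(product)
## [10 40 90]

# Divide corresponding elements (the result is an array of floats)
quotient = ar1 / ar2
print(quotient)
## [10. 10. 10.]

# Square each element
squares = ar2 ** 2
print(squares)
## [1 4 9]
```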

4.4 Dot Multiplication

ar1 = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])
ar2 = np.array([
    [5, 6, 7],
    [7, 8, 9]
])
# Perform dot multiplication
ar = np.dot(ar1, ar2)
print(ar)
## [[19 22 25]
##  [43 50 57]
##  [67 78 89]]
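
For two-dimensional arrays, dot multiplication is matrix multiplication: the number of columns in the first array must match the number of rows in the second, and each element of the result is a row of the first multiplied by a column of the second. The sketch below shows this using the @ operator, which for two-dimensional arrays gives the same result as np.dot():

```python
import numpy as np

ar1 = np.array([[1, 2], [3, 4], [5, 6]])  # 3 rows, 2 columns
ar2 = np.array([[5, 6, 7], [7, 8, 9]])  # 2 rows, 3 columns

# The @ operator performs matrix multiplication
product = ar1 @ ar2
print(product.shape)
## (3, 3)
print(np.array_equal(product, np.dot(ar1, ar2)))
## True

# Top-left element: first row of ar1 times first column of ar2
top_left = 1 * 5 + 2 * 7
print(top_left)
## 19

# The order matters: reversing it gives a differently-shaped result
reverse = ar2 @ ar1
print(reverse.shape)
## (2, 2)
```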

4.5 Cross Multiplication

ar1 = np.array([
    [1, 2],
    [3, 4]
])
ar2 = np.array([
    [5, 6],
    [7, 8]
])
# Perform cross multiplication
ar = np.cross(ar1, ar2)
print(ar)
## [-4 -4]
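
In the above example each row is treated as a two-element vector, and each number in the output is the value x1*y2 - y1*x2 for the corresponding pair of rows. The sketch below reproduces that calculation by hand and also shows the more usual case of three-element vectors, where the result is itself a vector. It avoids calling np.cross() on two-element vectors because, as far as I am aware, newer versions of NumPy may emit a deprecation warning for that usage:

```python
import numpy as np

# Cross product of two three-element vectors
a = np.array([1, 0, 0])
b = np.array([0, 1, 0])
c = np.cross(a, b)
print(c)
## [0 0 1]

# The two-element case calculated by hand, row by row
ar1 = np.array([[1, 2], [3, 4]])
ar2 = np.array([[5, 6], [7, 8]])
z = ar1[:, 0] * ar2[:, 1] - ar1[:, 1] * ar2[:, 0]
print(z)
## [-4 -4]
```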
