Numerical Analysis in Python

The basic structure of Python is great for a lot of purposes, as it is super simple, and the underlying Python system handles many of the more complicated data management operations more or less transparently: there’s no need to tell Python to allocate or free memory, to declare the data type of variables, or to spell out how to complete many complex operations. Thus we can do things like create a list of values or initialize a variable by simply writing:

[1]:
x = list(range(5000))
y = 0

However, this simplicity comes at a cost: Python handles all these tasks for you, but depending on the workload you create, it may end up being extremely slow. For example, consider the following commands, which sum the squares of the numbers in our list:

[2]:
for i in x:
    y = y + i**2
y
[2]:
41654167500

Under the hood, Python is actually executing low-level commands roughly equivalent to this monologue (a small disassembly sketch follows the list below):

  • Get the first item in the list x and store it in i.
  • Note that I am using this value in one additional place.
  • Identify that the data type for the variable i is an integer.
  • Look up that the double-asterisk operator means ‘power’ when applied to integers.
  • Find the value of i somewhere in the computer’s memory.
  • Take the value of i and raise it to the power of 2.
  • The result is also an integer.
  • Identify that the data type for the variable y is an integer.
  • Look up that the + operator means ‘add’ when applied to integers.
  • Find the value of y somewhere in the computer’s memory.
  • Add the integer in y to the integer result of the power operation above.
  • Store the result of the addition in variable y, and write the value to somewhere in the computer’s memory.
  • Note that I am using this value that I just stored in y in one additional place.
  • Note that I am using the previously stored value of y in one fewer place.
  • If the total number of places I am using the old value of y is now zero, I can flag this memory to be freed up.
  • Get the next item in the list x and store it in i.
  • Note that I am using the old value of i in one fewer place.
  • If the total number of places I am using the old value of i is now zero, I can flag this memory to be freed up.
  • Identify that the data type for the variable i is an integer.
  • Look up that the double-asterisk operator means ‘power’ when applied to integers.
  • Take the value of i and raise it to the power of 2.
  • The result is also an integer.
  • Identify that the data type for the variable y is an integer.
  • Look up that the + operator means ‘add’ when applied to integers.
  • Add the integer in y to the integer result of the power operation above.
  • Store the result of the addition in variable y.
  • Note that I am using this value that I just stored in y in one additional place.
  • Note that I am using the previously stored value of y in one fewer place.
  • If the total number of places I am using the old value of y is now zero, I can flag this memory to be freed up.
  • Get the next item in the list x and store it in i.
  • … and repeat the previous 14 lines 4,997 more times.
  • Done. (Phew!)
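
If you want to see something like this instruction stream for yourself, the standard-library dis module can disassemble the loop into the bytecode the interpreter actually steps through. A minimal sketch, with an arbitrary function name:

import dis

def sum_of_squares(x):
    y = 0
    for i in x:
        y = y + i**2
    return y

# Print the bytecode instructions the interpreter walks through on each pass of the loop.
dis.dis(sum_of_squares)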

Fortunately, our computer can run through these instructions quite fast, so even for the 5,000-item list, this very long set of instructions appears to run instantaneously. Nevertheless, it is obviously extraordinarily repetitive, and when loops like this must run over millions of items, or are nested inside other loops, the sheer number of instructions will slow down the process to the point where executing code becomes painfully slow.

Vectorization

To speed things up, it is necessary to get rid of a lot of the excessively repetitive instructions. This is called “vectorization” of Python code. The two key features of vectorization that help speed things up are:

  1. We store data in an array of same-datatype values.
  2. We figure out operations that will be applied to those values, and then apply those operations straight down the line of values directly, without creating and destroying a lot of intermediate Python variables along the way.

The numpy package includes the tools necessary to do this efficiently. For example, we can vectorize the example code above like this:

[3]:
import numpy
z = numpy.arange(5000)
(z**2).sum()
[3]:
41654167500

In comparison to the instructions above, this code is executed in a manner loosely equivalent to this monologue:

  • I have an array of 5,000 integers.
  • Look up that the double-asterisk operator means ‘power’ when applied to integers.
  • Find the group of 5,000 integer values of z somewhere in a contiguous block of the computer’s memory.
  • Allocate a similarly sized contiguous block of the computer’s memory to store the result of a ‘power’ operation.
  • Take each of the 5,000 values in z, square it, and write that to the memory allocated above.
  • Look up that the sum method means ‘add these up’ when applied to integers.
  • Allocate memory for an integer value that will be the total.
  • Take each of 5,000 squared values and add them to the running total.
  • Return the final value.
  • Done.

It should be clear that the vectorized code is wildly more efficient. For such a small set of operations, the difference in the time it takes to compute the result either way is insignificant from a human perspective: does it really matter to you whether the work is completed in a millisecond or a microsecond?

[4]:
%%timeit
y = 0
for i in x:
    y = y + i**2
2.05 ms ± 287 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
[5]:
%%timeit
(z**2).sum()
8.41 µs ± 578 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

But when scaled up to larger data sets and more complex operations, these differences are extremely important: make each a million times longer, and we’re looking at the difference between a few seconds and more than half an hour.
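
If you want to check this scaling on your own machine, here is a rough sketch using the standard-library timeit module on a larger array; the array size, repeat count, and variable names are arbitrary, and the exact figures will vary by hardware:

import timeit
import numpy

n = 1_000_000  # 200 times more items than the example above

def loop_version(values):
    y = 0
    for i in values:
        y = y + i**2
    return y

x_big = list(range(n))
z_big = numpy.arange(n, dtype=numpy.int64)  # explicit dtype so the sum does not overflow

# Time ten runs of each approach; the vectorized version should be orders of magnitude faster.
print("loop:      ", timeit.timeit(lambda: loop_version(x_big), number=10))
print("vectorized:", timeit.timeit(lambda: (z_big**2).sum(), number=10))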

Many of the tools and libraries described in the rest of these tutorials already take advantage of the speed benefit of vectorization as much as is reasonable, but it is still important for a user to know about these differences. The simplest way to take advantage of these speed-ups when coding your own Python tools is to avoid using “for” loops as much as possible in any section of your code that runs slowly, especially in places where “for” loops are nested inside other “for” loops. For example, writing code that loops over each destination TAZ inside a loop over each origin TAZ is generally a bad idea.
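
As a rough illustration of why, consider a hypothetical zone-to-zone distance calculation; the coordinates, zone counts, and names below are made up for this sketch. The nested loops and the single broadcasted numpy expression produce the same matrix, but the vectorized version avoids the per-item overhead described earlier.

import numpy

# Hypothetical x/y coordinates for 300 origin zones and 300 destination zones.
origins = numpy.random.rand(300, 2)
destinations = numpy.random.rand(300, 2)

# Slow: a "for" loop nested inside another "for" loop.
dist_loop = numpy.zeros((300, 300))
for i in range(300):
    for j in range(300):
        dist_loop[i, j] = numpy.sqrt(((origins[i] - destinations[j]) ** 2).sum())

# Fast: one vectorized expression using broadcasting to pair every origin with every destination.
diff = origins[:, numpy.newaxis, :] - destinations[numpy.newaxis, :, :]
dist_vec = numpy.sqrt((diff ** 2).sum(axis=2))

print(numpy.allclose(dist_loop, dist_vec))  # the two approaches agree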

Conversely, there’s no reason to fear using “for” loops in other places. As we observed above, a single loop that iterates over a few thousand items in a list can complete in just a couple of milliseconds. It’s only necessary to vectorize code when the number of iterations and the complexity of the calculation inside the loop are both high.

Making Arrays

The arange function in numpy creates an array of evenly spaced values, analogous to the range function in Python itself. Arrays can also be created out of lists:

[6]:
a = numpy.asarray([11,22,33,44])
a
[6]:
array([11, 22, 33, 44])
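
As with range, arange also accepts optional start and step arguments, and the stop value itself is not included; a quick sketch:

numpy.arange(10)        # array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
numpy.arange(2, 10)     # array([2, 3, 4, 5, 6, 7, 8, 9])
numpy.arange(2, 10, 2)  # array([2, 4, 6, 8])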

Importantly, all the values in an array share the same data type. You can inspect the data type using the dtype attribute:

[7]:
a.dtype
[7]:
dtype('int64')

You cannot create a simple array of mixed data types; instead, the data will be upcast as necessary to create a common data type. For example, if you mix integers and floating point values, the resulting array will be all floating point values.

[8]:
b = numpy.asarray([11,22,33,4.4])
b.dtype
[8]:
dtype('float64')
[9]:
b
[9]:
array([11. , 22. , 33. ,  4.4])