Some Basics of Python

Variables

Variable values are assigned with the equals sign (“=”), and variable types do not need to be declared in advance. Making an assignment is sufficient. For example, we can write

[1]:
a = 1

and the variable a now holds an integer (int). We can check if any given variable is a particular type:

[2]:
isinstance(a, int)
[2]:
True

The type of a variable can be changed by assigning something new.

[3]:
a = "Hello Florida!"
[4]:
isinstance(a, int)
[4]:
False

Generally, though, reusing a variable name for data of an entirely different type invites confusion and is best avoided.

Since assignments are made with the equals sign, equality testing is done using two equals signs (“==”).

[5]:
a == "Hello Florida!"
[5]:
True
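
As a small sketch of the points above, the built-in type function reports what a variable currently holds, and == compares values without changing anything. The comparison string "Hello Georgia!" and the helper names are just illustrations, not part of the original example.

```python
a = 1
type_before = type(a)        # int

a = "Hello Florida!"         # reassignment changes the type of `a`
type_after = type(a)         # str

# `==` tests equality and leaves `a` untouched; a single `=` would rebind it.
is_same = (a == "Hello Florida!")
is_different = (a == "Hello Georgia!")

print(type_before.__name__, type_after.__name__, is_same, is_different)
```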

Comments

Comments can be inserted in Python code using the hash (#) symbol. On each line of code, everything after the first hash symbol is a comment. Comments can be on a line by themselves, or tucked in after other non-comment code.

[6]:
# this is a comment
x = 1  # this is also a comment
       # and this too is a comment.

The exception to this rule is when the hash symbol appears inside a quoted string. In that case, it’s part of the string, not a comment.

[7]:
text = "# This is not a comment because it's inside quotes."

Another way to effectively create a comment in Python is to write a string and not assign it to a variable name. Although it’s not strictly a comment in that the Python interpreter will read and interpret it, doing this has no effect on the execution of other code.

[8]:
"This is sort of a comment"
[8]:
'This is sort of a comment'

Numbers

Python handles numerical values natively. It is usually not necessary to declare whether a number is an integer or a floating point (decimal) number, because Python infers the type from how the number is written: when writing numbers explicitly in code, if you include the decimal point, it’s a float, otherwise it’s an integer.

[9]:
a = 1  # integer
[10]:
isinstance(a, int)
[10]:
True
[11]:
isinstance(a, float)
[11]:
False
[12]:
a = 1.0  # float
[13]:
isinstance(a, int)
[13]:
False
[14]:
isinstance(a, float)
[14]:
True

Python recognizes comma characters as meaningful, so you cannot put commas into numerical expressions in code:

[15]:
1,000,000  # not one million
[15]:
(1, 0, 0)

Instead of commas, underscores can be inserted within numbers in code. They make the number easier for humans to read, and Python ignores them when determining the value:

[16]:
1_000_000  # one million
[16]:
1000000

If you want to output a nicely formatted large number with commas for the thousands, you need to convert the number to a string. The Python documentation has an extensive section on string formatting that explains how to render numbers (and other Python objects) in virtually any format imaginable.

[17]:
"{:,}".format(1_000_000)
[17]:
'1,000,000'
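
The same format specification also works inside an f-string (available in Python 3.6 and later), which many people find easier to read. A minimal sketch; the two numbers below are made-up values for illustration only.

```python
population = 21_538_187   # hypothetical figure, for illustration only
toll = 1234.5             # hypothetical dollar amount

with_commas = f"{population:,}"        # thousands separators
two_decimals = f"{toll:,.2f}"          # separators plus two decimal places
via_format = "{:,}".format(population) # equivalent to the f-string version

print(with_commas)
print(two_decimals)
```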

Strings

Python can also work with strings, which can be enclosed in single quotes ('...') or double quotes ("..."). If the string needs to contain text which also includes quotes, they can be escaped using a backslash.

[18]:
'Florida'  # single quotes
[18]:
'Florida'
[19]:
'Florida\'s Turnpike'  # use \' to escape the single quote...
[19]:
"Florida's Turnpike"
[20]:
"Florida's Turnpike"  # ...or use double quotes instead
[20]:
"Florida's Turnpike"

In addition to escaping quote marks in a string, the backslash is also used to write other special characters, including '\n' for newlines, '\t' for tabs, and '\\' for literal backslashes. If you don’t want any escape processing, and instead want every backslash treated as a literal backslash, you can use raw strings by adding an r before the first quote:

[21]:
print('C:\Python\transportation-tutorials')  # here \t means *tab*
C:\Python       ransportation-tutorials
[22]:
print(r'C:\Python\transportation-tutorials')  # note the r before the quote
C:\Python\transportation-tutorials

As you might note from the example, this is a subtle but important problem that often catches users by surprise when entering pathnames on Windows. It is tricky because most regular letters following a backslash do not form escape sequences, so code may work fine until problems mysteriously emerge when a file or folder name is changed to begin with a character that does form one, such as a, b, f, n, r, t, u, v, or x (as well as N, U, and the digits 0 through 7). To help out, in addition to permitting raw string input, Python also allows Windows paths to be written using forward slashes instead of the default backslashes.
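
The three safe ways of writing the same Windows path can be compared directly. This is only a sketch: it uses PureWindowsPath from the standard pathlib module, which is not otherwise covered here, because it applies Windows path rules on any operating system.

```python
from pathlib import PureWindowsPath

raw = r'C:\Python\transportation-tutorials'         # raw string
escaped = 'C:\\Python\\transportation-tutorials'    # doubled backslashes
forward = 'C:/Python/transportation-tutorials'      # forward slashes

# The raw and escaped spellings produce exactly the same string.
same_string = (raw == escaped)

# As strings, the forward-slash version differs, but as Windows paths
# the two refer to the same location.
same_path = (PureWindowsPath(raw) == PureWindowsPath(forward))
folder_name = PureWindowsPath(forward).name

print(same_string, same_path, folder_name)
```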

String literals can span multiple lines. One way is using triple-quotes: """...""" or '''...'''. Line endings are automatically included in the string, but it’s possible to prevent this by adding a \ at the end of the line, as in the following example:

[23]:
print("""\
   Usage: florida-man [OPTIONS]
        -h            Display this usage message
        -t degrees    Set temperature in °F
        -a            Include alligators
""")
   Usage: florida-man [OPTIONS]
        -h            Display this usage message
        -t degrees    Set temperature in °F
        -a            Include alligators

Two or more string literals (i.e. the ones enclosed between quotes) next to each other are automatically concatenated:

[24]:
'Flo' 'rida'
[24]:
'Florida'

Strings (literal or otherwise) can be concatenated (glued together) with the + operator:

[25]:
s1 = 'Cape '
s2 = 'Canaveral'
s1 + s2
[25]:
'Cape Canaveral'

Lists

There are a variety of compound data types, used to group together other values. One of these is the list, which can be written as a list of comma-separated values (items) between square brackets.

[26]:
cities = ['Miami', 'Orlando', 'Tampa', 'Jacksonville']

Lists are ordered, and can be indexed starting at zero for the first item.

[27]:
cities[0]
[27]:
'Miami'

They can also be sliced, passing a start:stop syntax instead of a single number to the indexer:

[28]:
cities[0:2]
[28]:
['Miami', 'Orlando']

Note that the stop value indicates the first value past the end of the slice, not the last value in the slice.
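
A few more slices may help make the stop rule concrete. This sketch rebuilds the original four-item list so that it runs on its own, and it also shows that either side of the colon can be left out.

```python
cities = ['Miami', 'Orlando', 'Tampa', 'Jacksonville']

first_two = cities[0:2]    # items 0 and 1; item 2 is excluded
middle = cities[1:3]       # items 1 and 2
from_start = cities[:2]    # an omitted start defaults to 0
to_end = cities[2:]        # an omitted stop runs to the end of the list

# When start and stop are both within range, a slice has stop - start items.
print(first_two, middle, from_start, to_end)
```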

You can append items to a list like this:

[29]:
cities.append('Tallahassee')

The items in a list are not restricted to be all the same data type, so Python won’t prevent you from doing something like this:

[30]:
cities.append(123)
[31]:
cities
[31]:
['Miami', 'Orlando', 'Tampa', 'Jacksonville', 'Tallahassee', 123]

But for your sanity, you may not want to do that. The pop method can help here, by removing and returning the last item in the list.

[32]:
cities.pop()
[32]:
123
[33]:
cities
[33]:
['Miami', 'Orlando', 'Tampa', 'Jacksonville', 'Tallahassee']

You can also change values within a list:

[34]:
cities[2] = 'Tampa–St. Petersburg'
cities
[34]:
['Miami', 'Orlando', 'Tampa–St. Petersburg', 'Jacksonville', 'Tallahassee']

Tuples

A tuple is like a list, except it is immutable (i.e. you cannot change any part of it after it exists). Tuples are created using round parentheses instead of square brackets.

[35]:
counties = ('Miami-Dade', 'Orange', 'Hillsborough', 'Duval')
[36]:
counties[0] = 'Miami'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-36-9e6b786104ce> in <module>
----> 1 counties[0] = 'Miami'

TypeError: 'tuple' object does not support item assignment
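
Because a tuple cannot be changed in place, the usual workaround is to build a new tuple from pieces of the old one. The sketch below also catches the error with try and except, which are not introduced in this section and are used only so that the example runs to completion.

```python
counties = ('Miami-Dade', 'Orange', 'Hillsborough', 'Duval')

try:
    counties[0] = 'Miami'          # not allowed: tuples are immutable
except TypeError as err:
    message = str(err)

# Build a new tuple instead; the original is left unchanged.
# Note the trailing comma, which makes ('Miami',) a one-item tuple.
updated = ('Miami',) + counties[1:]

print(message)
print(updated)
```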

Dictionaries

Tuples and lists are sequences: a bunch of objects in some order.
Dictionaries, on the other hand, are mappings: a bunch of objects with labels. Like a physical dictionary, if you know the label (for a regular dictionary, the word) then you can very quickly find the object (for a regular dictionary, the definition) without having to read through very many other labels on the way.

When writing a dictionary directly in Python code, we use curly brackets, and give the contents of the dictionary as key:value pairs.

[37]:
jurisdictions = {'Miami': 'Dade County', 'Orlando': 'Orange County'}

We can add new items by assignment with index-like syntax like this:

[38]:
jurisdictions['Jacksonville'] = 'Duval'

And access members similarly:

[39]:
print(jurisdictions['Miami'])
Dade County

An attempt to access a key that does not exist in the dictionary raises an error:

[40]:
print(jurisdictions['Tallahassee'])
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-40-34b0fab846be> in <module>
----> 1 print(jurisdictions['Tallahassee'])

KeyError: 'Tallahassee'

Although the values stored in a dictionary can be anything, the keys must be both unique and immutable (more precisely, hashable, as strings, numbers, and tuples of such values are). Assigning a new value to an existing key replaces the old value:

[41]:
jurisdictions['Miami'] = 'Miami-Dade County'
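
To avoid a KeyError when a key might be missing, we can test for membership with in, or use the get method with a fallback value. A minimal sketch that rebuilds the dictionary from the cells above; the fallback text 'unknown' is an arbitrary choice.

```python
jurisdictions = {'Miami': 'Dade County', 'Orlando': 'Orange County'}
jurisdictions['Jacksonville'] = 'Duval'
jurisdictions['Miami'] = 'Miami-Dade County'   # replaces the old value

has_tallahassee = 'Tallahassee' in jurisdictions         # membership test on keys
fallback = jurisdictions.get('Tallahassee', 'unknown')   # no KeyError raised
n_items = len(jurisdictions)                             # still three keys

# Loop over the key and value of each item together.
for city, county in jurisdictions.items():
    print(city, '->', county)
```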

Sets

A set is an unordered bunch of unique objects. Although the implementation is a bit different, you can generally think of a set as a dictionary that only has the keys and doesn’t have any values attached to those keys.

[42]:
national_parks = {'Everglades', 'Biscayne', 'Dry Tortugas'}
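
The uniqueness of set members can be seen by adding an item that is already present, or by converting a list with repeats into a set. A short sketch; the repeated city list is made up for illustration.

```python
national_parks = {'Everglades', 'Biscayne', 'Dry Tortugas'}

national_parks.add('Everglades')     # already present, so nothing changes
park_count = len(national_parks)

has_biscayne = 'Biscayne' in national_parks   # fast membership test

# Converting a list to a set drops the duplicates.
unique_cities = set(['Miami', 'Orlando', 'Miami'])

print(park_count, has_biscayne, len(unique_cities))
```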

Syntax

Python has no mandatory statement termination characters. You can end a statement by ending the line of code. You can also have a semicolon at the end of a statement, which allows you to write two statements on a single line, although this is generally frowned upon for most Python code.

Blocks of Python code are denoted by indentation. Multiple lines of code in a row with identical indentation form a block of code that is executed in sequence. Note that indenting can be done with spaces or with tabs, but mixing the two is very strongly discouraged, and Python 3 rejects code whose indentation mixes them inconsistently. Statements that expect an increase in indentation level on the following line end in a colon (:).
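
These rules can be seen together in a few lines. In this sketch, which uses the if and else statements described in the next section, the variable names are arbitrary and four spaces per level is the common convention rather than a requirement.

```python
x = 1; y = 2   # two statements on one line: legal, but generally discouraged

if x < y:               # the colon announces an indented block
    smaller = x         # these two lines share the same indentation,
    larger = y          # so they form one block and run in sequence
else:
    smaller = y
    larger = x

print(smaller, larger)  # back at the outer indentation level
```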

Flow Control

Common flow control keywords are if, for, and while.

An if statement can be used alone, or along with elif (short for else if) and/or else statements.

[43]:
x = 'Miami Beach'

if x in cities:
    print("found", x, "City")
elif x in counties:
    print("found", x, "County")
else:
    print("didn't find", x)
didn't find Miami Beach

The for statement will iterate over items in a sequence (list, tuple, or similar).

[44]:
for x in counties:
    print(x, 'County')
Miami-Dade County
Orange County
Hillsborough County
Duval County

The while statement loops until the conditional evaluates as False.

[45]:
x = 1
while x < 5:
    print(x)
    x += 1
1
2
3
4

You can exit a for or while block early with a break statement.

[46]:
x = 1
while x < 5:
    print(x)
    x += 1
    if x > 3:
        break
1
2
3
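
The break statement works the same way inside a for loop. A minimal sketch that stops scanning the counties tuple once a particular name is reached; the list named checked exists only to record which items were visited.

```python
counties = ('Miami-Dade', 'Orange', 'Hillsborough', 'Duval')

checked = []
for name in counties:
    checked.append(name)
    if name == 'Orange':
        break            # leave the loop; remaining items are never visited

print(checked)
```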

Importing Packages

Only the most basic functionality is exposed in Python by default. Almost all functions and object classes will need to be imported from a library, either one that is built into Python (but still needs to be imported) or one installed alongside Python (by conda or pip, usually). Libraries are loaded with the import keyword. You can also use from [libname] import [funcname] to access library components, and you can use the as keyword to provide a unique or different name for the thing you are importing. For example:

[47]:
import os
import numpy as np
from transportation_tutorials import data as td

You can then use functions and modules that are found in these packages using “dot” notation. For example, the numpy package includes a zeros function that creates an array full of zeros, which we can call like this:

[48]:
np.zeros(5)
[48]:
array([0., 0., 0., 0., 0.])
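
The same import forms can be tried with the standard library alone, in case numpy is not installed. A minimal sketch using the built-in math module; the alias names m and square_root are arbitrary choices for illustration.

```python
import math                               # plain import
import math as m                          # import under a shorter alias
from math import sqrt                     # import one function directly
from math import sqrt as square_root      # import one function under a new name

# All four names below refer to the same underlying function.
via_module = math.sqrt(16)
via_alias = m.sqrt(16)
direct = sqrt(16)
renamed = square_root(16)

print(via_module, via_alias, direct, renamed)
```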

In a Jupyter notebook, we can also access function documentation by using a special question mark syntax, which prints the built-in documentation for any function, if it is available. Most standard analytical packages, including numpy, scipy, and pandas, have extensive documentation for nearly every function.

[49]:
np.zeros?
Docstring:
zeros(shape, dtype=float, order='C')

Return a new array of given shape and type, filled with zeros.

Parameters
----------
shape : int or tuple of ints
    Shape of the new array, e.g., ``(2, 3)`` or ``2``.
dtype : data-type, optional
    The desired data-type for the array, e.g., `numpy.int8`.  Default is
    `numpy.float64`.
order : {'C', 'F'}, optional, default: 'C'
    Whether to store multi-dimensional data in row-major
    (C-style) or column-major (Fortran-style) order in
    memory.

Returns
-------
out : ndarray
    Array of zeros with the given shape, dtype, and order.

See Also
--------
zeros_like : Return an array of zeros with shape and type of input.
empty : Return a new uninitialized array.
ones : Return a new array setting values to one.
full : Return a new array of given shape filled with value.

Examples
--------
>>> np.zeros(5)
array([ 0.,  0.,  0.,  0.,  0.])

>>> np.zeros((5,), dtype=int)
array([0, 0, 0, 0, 0])

>>> np.zeros((2, 1))
array([[ 0.],
       [ 0.]])

>>> s = (2,2)
>>> np.zeros(s)
array([[ 0.,  0.],
       [ 0.,  0.]])

>>> np.zeros((2,), dtype=[('x', 'i4'), ('y', 'i4')]) # custom dtype
array([(0, 0), (0, 0)],
      dtype=[('x', '<i4'), ('y', '<i4')])
Type:      builtin_function_or_method
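
The question mark syntax is specific to Jupyter and IPython. In a plain Python script or interpreter, the same docstrings can be reached with the built-in help function or the standard inspect module. A small sketch, using the built-in len function so that nothing needs to be installed:

```python
import inspect

# inspect.getdoc returns the cleaned-up docstring as an ordinary string.
doc = inspect.getdoc(len)
first_line = doc.splitlines()[0]

print(first_line)
# Calling help(len) would print a formatted version of the same text.
```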