Using Boxplots

[1]:
import transportation_tutorials as tt
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Questions

  1. Use a boxplot to show the distribution of household income in the Jupiter study area, grouped by the number of automobiles owned. What is the median income of households that own exactly two automobiles? (Hint: the correct answer is $96 thousand.)
  2. Is the median income higher or lower if we consider only two-car households that have at least one person over age 65? Create a set of boxplots similar to those from question (1), but only for households with at least one person over age 65.

Data

To answer the questions, use the following data files:

[2]:
hh = pd.read_csv(tt.data('SERPM8-BASE2015-HOUSEHOLDS'))
hh.head()
[2]:
   Unnamed: 0    hh_id  home_mgra  income  autos  transponder  cdap_pattern  jtf_choice  autotech  tncmemb
0      426629  1690841       7736  512000      2            1         MMMM0           0         0        0
1      426630  1690961       7736   27500      1            0         MNMM0           0         0        0
2      426631  1690866       7736  150000      2            0          HMM0           0         0        0
3      426632  1690895       7736  104000      2            1         MMMM0           0         0        0
4      426633  1690933       7736   95000      2            1          MNM0           0         0        0
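As a starting point for question (1), the sketch below shows one way to draw a boxplot of `income` grouped by `autos` and to read off the group medians. It is not the tutorial's reference solution. `toy_hh` is a hypothetical stand-in with made-up values so that the cell runs on its own; replace it with `hh` to work with the real data.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data with the same column names as `hh`.
# The values are invented for illustration and are not from SERPM8.
toy_hh = pd.DataFrame({
    "hh_id": [1, 2, 3, 4, 5, 6],
    "income": [20000, 40000, 60000, 90000, 100000, 110000],
    "autos": [0, 1, 1, 2, 2, 2],
})

# One box per value of `autos`, drawn on an explicit Axes object.
fig, ax = plt.subplots()
toy_hh.boxplot(column="income", by="autos", ax=ax)
ax.set_xlabel("Number of automobiles")
ax.set_ylabel("Household income")

# The line inside each box is the group median; compute it directly as well.
median_by_autos = toy_hh.groupby("autos")["income"].median()
print(median_by_autos)
```

Computing the medians with `groupby` alongside the plot avoids having to estimate the value by eye from the figure.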
[3]:
person = pd.read_csv(tt.data('SERPM8-BASE2015-PERSONS'))
person.head()
[3]:
     hh_id  person_id  person_num  age  gender                        type  value_of_time  activity_pattern  imf_choice  inmf_choice  fp_choice  reimb_pct  wrkr_type
0  1690841    4502948           1   46       m            Full-time worker       5.072472                 M           1            1         -1        0.0          0
1  1690841    4502949           2   47       f            Part-time worker       5.072472                 M           2           37         -1        0.0          0
2  1690841    4502950           3   11       f  Student of non-driving age       3.381665                 M           3            1         -1        0.0          0
3  1690841    4502951           4    8       m  Student of non-driving age       3.381665                 M           3            1         -1        0.0          0
4  1690961    4503286           1   52       m            Part-time worker       2.447870                 M           1            2         -1        0.0          0
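Question (2) needs person-level ages brought up to the household level. One possible approach, sketched below, is to find the oldest person in each household and keep the households whose oldest member is over 65. `toy_person` and `toy_hh` are hypothetical stand-ins with made-up values; the same steps should apply to `person` and `hh`, which share the `hh_id` key.

```python
import pandas as pd

# Hypothetical toy tables using the same key and column names as the real data.
# The values are invented for illustration.
toy_person = pd.DataFrame({
    "hh_id": [1, 1, 2, 3, 3],
    "age": [70, 40, 30, 66, 68],
})
toy_hh = pd.DataFrame({
    "hh_id": [1, 2, 3],
    "income": [50000, 80000, 30000],
    "autos": [2, 2, 1],
})

# Age of the oldest person in each household, indexed by hh_id.
max_age = toy_person.groupby("hh_id")["age"].max()

# IDs of households with at least one person over age 65.
senior_ids = max_age[max_age > 65].index

# Keep only those households; this table can then be plotted by `autos`.
senior_hh = toy_hh[toy_hh["hh_id"].isin(senior_ids)]
print(senior_hh)
```

Filtering with `isin` keeps one row per household, so each household is counted once no matter how many of its members are over 65.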