Date, Time, & Location
Tuesday, February 18, 3:30pm-4:45pm, PM 251
Overview
Test 1 will cover all material from the beginning of the semester through data transformation (Thursday, February 13). This includes the assigned readings and the topics we discussed in class.
Topics
- Python
- numpy
- pandas
- Data (items, attributes, attribute types, semantics, metadata)
- Data Wrangling
- Data Cleaning
- Data Transformation
Free Response Example Questions
- Given a dataset, (a) identify an item, attribute, and cell; (b) state, for each column, whether it is categorical, ordered, or quantitative.
- In the following Python code, identify all errors.
- Given the following numpy array, write two different ways to index the highlighted sub-array.

- Given the following two data frames, write a sequence of pandas operations to transform the first into the second. Exact syntax is not important: if you do not recall a particular operation or function name, explain specifically what that step should do.

| | name | genres |
|---|---|---|
| 0 | Toy Story (1995) | animation\|children’s\|comedy |
| 1 | Jumanji (1995) | Adventure\|Children’s\|Fantasy |
| 2 | Grumpier Old Men (1995) | COMEDY\|ROMANCE |
| 3 | Waiting to Exhale (1995) | Comedy\|Drama |
| 4 | Father of the Bride Part II (1995) | Comedy |

| | Name | Year | Adventure | Animation | Children’s | Comedy | Drama | Fantasy | Romance |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Toy Story | 1995 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |
| 1 | Jumanji | 1995 | 1 | 0 | 1 | 0 | 0 | 1 | 0 |
| 2 | Grumpier Old Men | 1995 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
| 3 | Waiting to Exhale | 1995 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| 4 | Father of the Bride Part II | 1995 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |

- State three distinct ways in which Wrangler helps users trying to wrangle raw datasets.
- Compare Foofah’s example-based data cleaning with Wrangler’s interactive data cleaning.
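
For the data frame question above (movie names and genres), one possible sequence of pandas operations is sketched below. It is only one of several reasonable answers, not an official solution. It assumes the first data frame has been typed in by hand under the hypothetical name `movies`; the names `parts`, `indicators`, and `result` are likewise made up for illustration.

```python
import pandas as pd

# Assumption: the first data frame from the question, entered by hand.
movies = pd.DataFrame({
    "name": [
        "Toy Story (1995)",
        "Jumanji (1995)",
        "Grumpier Old Men (1995)",
        "Waiting to Exhale (1995)",
        "Father of the Bride Part II (1995)",
    ],
    "genres": [
        "animation|children’s|comedy",
        "Adventure|Children’s|Fantasy",
        "COMEDY|ROMANCE",
        "Comedy|Drama",
        "Comedy",
    ],
})

# 1. Transformation: split "Title (Year)" into a Name column and a Year column.
parts = movies["name"].str.extract(r"^(?P<Name>.*) \((?P<Year>\d{4})\)$")
parts["Year"] = parts["Year"].astype(int)

# 2. Cleaning: lowercase the genres so "COMEDY", "comedy", and "Comedy" match.
# 3. Transformation: expand the "|"-separated genres into 0/1 indicator columns.
indicators = movies["genres"].str.lower().str.get_dummies(sep="|")

# 4. Restore a consistent capitalization for the new column names.
indicators.columns = indicators.columns.str.capitalize()

# 5. Combine the name/year columns with the genre indicator columns.
result = pd.concat([parts, indicators], axis=1)
print(result)
```

The genres are lowercased before expansion because otherwise `COMEDY`, `comedy`, and `Comedy` would each become a separate column. On the test, describing these steps in words (split the name, normalize the case, expand the genres into indicator columns) would convey the same idea.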