Test 1

Date, Time, & Location

Monday, February 27, 9:30-10:45am, PM 253

Overview

Test 1 will cover all material from the beginning of the semester through data wrangling (Monday, February 20), including the assigned readings and the topics we discussed in class.

Format

  • Multiple Choice
  • Free Response
  • CSCI 640 students will have additional questions

Topics

  • Python
  • Relational Algebra
  • SQL
  • numpy
  • pandas
  • Data (items, attributes, attribute types, semantics, metadata)
  • Data Wrangling

Readings

Assigned Readings

Referenced Papers

Free Response Example Questions

  • Given a dataset, (a) identify an item, attribute, and cell; (b) state, for each column, whether it is categorical, ordered, or quantitative.
  • In the following Python code, identify all errors.
    // print the numbers from 1 to 100
    int counter = 1 
    while counter < 100 { 
        print counter 
        counter++
    }
  • Given the following numpy array, write two different ways to index the highlighted sub-array.
  • State three distinct ways in which Wrangler helps users trying to wrangle raw datasets.
  • Given a data frame, identify issues with the data and how these can be resolved. Exact syntax is not important; if you do not recall a particular operation or function name, explain specifically what the operation should do.
  • How is transform data by example (TDE) different from Wrangler? How does transform by pattern (TBP) improve on transform by example, and what use cases does it aid with?
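
As a practice aid for the numpy indexing question above, here is a minimal sketch of two ways to select the same sub-array. The array, its shape, and the "highlighted" region (rows 1-2, columns 2-3) are assumptions made up for illustration; the array on the test will differ.

```python
import numpy as np

# Hypothetical 4x5 array; the values are illustrative only.
arr = np.arange(20).reshape(4, 5)

# Assume the highlighted sub-array is rows 1-2, columns 2-3.

# Way 1: basic slicing (returns a view onto arr).
by_slice = arr[1:3, 2:4]

# Way 2: integer-array indexing with np.ix_ (returns a copy).
by_index = arr[np.ix_([1, 2], [2, 3])]

# Both approaches select the same values.
same = np.array_equal(by_slice, by_index)
print(by_slice)
print(same)
```

Note that slicing returns a view, so writing to `by_slice` would also change `arr`, while the integer-array version is an independent copy.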