Please note that this schedule is subject to change.
Lectures
(01/23) Structured Data
- Slides
- Notebooks:
- Reading: McKinney, Ch. 4
(01/28) Data and Pandas
- Slides
- Reading: McKinney, Ch. 5
- Notebooks:
(01/30) Data Wrangling
- Slides
- Reading: McKinney, Ch. 6
- Notebook:
(02/06) Data Cleaning
- Slides
- Reading: McKinney, Ch. 7
- References:
(02/11) Data Transformation
- Slides
- Notebooks:
- Reading:
- References:
(02/13) Data Transformation
(03/05) Databases and Visualization
- Combined with the Big Ideas lecture
(03/24) Scalable Databases
(03/26) Scalable Databases
(04/07) Time Series Data
- Slides
- Reading: McKinney, Ch. 11
- Notebook:
(04/23) Reproducibility
- Slides
- Reading:
- References:
  - Repeatability and Benefaction in Computer Systems Research, C. Collberg et al., 2015.
  - Reproducible Research in Computational Science, R. D. Peng, 2011.
  - Ten Simple Rules for Reproducible Computational Research, G. K. Sandve et al., 2013.
  - Computational Reproducibility: State-of-the-Art, Challenges, and Database Research Opportunities, J. Freire et al., 2012.
  - A Large-scale Study about Quality and Reproducibility of Jupyter Notebooks, J. F. Pimentel et al., 2019.
(04/28) Databases and Machine Learning
- Slides
- Reading:
- References:
  - Learning and Self-Designing Data Structures (SIGMOD Tutorial), S. Idreos and T. Kraska: Part 1 and Part 2