Please note that this schedule is subject to change.
Lectures
(09/03) Relational Databases
(09/08) Relational Databases
(09/17) Polars and DuckDB
(09/29) Data Transformation
(10/20) Scalable Databases
(10/22) Scalable Databases
- Reading:
- References:
  - NewSQL, A. Pavlo, 2012.
  - The Official Ten-Year Retrospective of NewSQL, A. Pavlo, 2021.
  - Spanner: Google’s Globally-Distributed Database, J. C. Corbett et al., 2012.
  - F1: A Distributed SQL Database That Scales, J. Shute et al., 2013.
  - Spanner, TrueTime & The CAP Theorem, E. Brewer, 2017.
  - A Critique of the CAP Theorem, M. Kleppmann, 2015.
  - Is Scalable OLTP in the Cloud a Solved Problem?, T. Ziegler et al., 2022.
(10/27) Scalable Dataframes
(10/29) Scalable Dataframes
(11/05) Graph Data
- Reading:
- References:
  - Introduction to Neo4j and Graph Databases, M. D. Allen, 2019.
  - Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph Queries (preprint), M. Besta et al., 2022.
  - Survey of Graph Database Models, R. Angles and C. Gutierrez, 2008.
  - An Introduction to Graph Data Management, R. Angles and C. Gutierrez, 2017.
  - Graph Databases, D. Lembo and R. Rosati, 2015.
  - Introduction to Graph Databases, M. De Marzi, 2012.
  - The Future Is Big Graphs: A Community View on Graph Processing Systems, S. Sakr et al., 2021.
  - The (sorry) State of Graph Database Systems, P. Boncz, 2022.
(11/12) Databases and Visualization
(11/17) Spatial Data
- Reading:
- References:
  - Big Spatial Data Management, A. Eldawy, 2020.
  - Data Cubes, J. Han, M. Kamber, and J. Pei, 2011.
  - Nanocubes for Real-Time Exploration of Spatiotemporal Datasets, L. Lins et al., 2013.
  - TopKube: A Rank-Aware Data Cube for Real-Time Exploration of Spatiotemporal Datasets, F. Miranda et al., 2017.
  - Dynamic Prefetching of Data Tiles for Interactive Visualization, L. Battle et al., 2016.
(11/24) Provenance
- Reading:
- References:
  - Provenance for Computational Tasks: A Survey, J. Freire et al., 2008.
  - Provenance in Databases: Why, How, and Where, J. Cheney et al., 2007.
  - Provenance in Databases, A. Amarilli, 2019.
  - Capturing and Querying Fine-Grained Provenance of Preprocessing Pipelines in Data Science, P. Missier, 2023.
(12/01) Reproducibility
- Reading:
- References:
  - Repeatability and Benefaction in Computer Systems Research, C. Collberg et al., 2015.
  - Reproducible Research in Computational Science, R. D. Peng, 2011.
  - Ten Simple Rules for Reproducible Computational Research, G. K. Sandve et al., 2013.
  - Computational Reproducibility: State-of-the-Art, Challenges, and Database Research Opportunities, J. Freire et al., 2012.
  - A Large-scale Study about Quality and Reproducibility of Jupyter Notebooks, J. F. Pimentel, 2019.
(12/03) Databases and Machine Learning