Labs

The course consists of six labs. The labs are short activities
meant to be completed in class, and generally cover some practical
matter, such as getting started with NumPy or Apache Spark. We use
labs as a way to force students to try out some new technology with
instructors and TAs present in the room, so that they can get
immediate help if things do not work. Each lab assignment takes
between thirty and ninety minutes to complete, and is typically given
to students in the form of a Jupyter notebook.

The data for the labs may be found here.

Name
Description
Data
Installation of Jupyter Notebook on student machines
Setup and/or use of Postgres
Introduction to NumPy arrays
Lab to investigate the performance advantage of NumPy arrays and vectorized operations
Introduction to Spark, PySpark, and RDDs
Connecting to AWS and running Spark, interactively and in batch
Introduction to TensorFlow using Google Colab and/or AWS
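
To give a flavor of the comparison the NumPy performance lab investigates, the following is a minimal sketch rather than the lab's actual notebook. It times a pure-Python loop against a vectorized NumPy call on the same data. The function names, the array size, and the choice of sum of squares as the workload are all assumptions made for illustration.

```python
import time

import numpy as np


def sum_of_squares_loop(values):
    """Sum of squares using an explicit Python loop over a list."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_vectorized(arr):
    """Sum of squares using a single vectorized NumPy call."""
    return float(np.dot(arr, arr))


def time_call(fn, arg):
    """Return (result, elapsed seconds) for one call of fn(arg)."""
    start = time.perf_counter()
    result = fn(arg)
    return result, time.perf_counter() - start


# Hypothetical workload: 200,000 float64 values (size chosen arbitrarily).
n = 200_000
data = np.arange(n, dtype=np.float64)

loop_result, loop_seconds = time_call(sum_of_squares_loop, data.tolist())
vec_result, vec_seconds = time_call(sum_of_squares_vectorized, data)

print(f"loop:       {loop_seconds:.6f} s")
print(f"vectorized: {vec_seconds:.6f} s")
print(f"results agree: {np.isclose(loop_result, vec_result)}")
```

The exact timings depend on the machine, so none are shown here. The vectorized version is usually considerably faster because the arithmetic runs in compiled code instead of the Python interpreter.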