The course consists of six labs. The labs are short activities meant
to be completed in class, and they generally cover a practical matter,
such as getting started with NumPy or Apache Spark. We use labs as a
way to force students to try out a new technology with instructors and
TAs present in the room, so that they can get immediate help if things
do not work. Each lab assignment takes between thirty and ninety
minutes to complete and is typically given to students in the form of
a Jupyter notebook.
The data for the labs may be found here.
Name | Description | Data |
---|---|---|
Installation of Jupyter Notebook on student machines | | |
Setup and/or use of Postgres | | |
Introduction to NumPy arrays | | |
Lab to investigate the performance advantage of NumPy arrays and vectorized operations | | |
Introduction to Spark, PySpark, and RDDs | | |
Connecting to AWS and running Spark, interactively and in batch | | |
Introduction to TensorFlow using Google Colab and/or AWS |