Welcome to the course website for MUSA 550, Geospatial Data Science in Python, taught at the University of Pennsylvania in fall 2020.
Course Description
This course will provide students with the knowledge and tools to turn data into meaningful insights, with a focus on real-world case studies in the urban planning and public policy realm. Using the latest Python software tools, the course will outline the “pipeline” approach to data science. Students will learn to gather, visualize, and analyze datasets, building the skills to explore large datasets effectively and to turn results into understandable and compelling narratives. The course is organized into five main sections:
- Exploratory Data Science: Students will be introduced to the main tools needed to get started analyzing and visualizing data using Python.
- Introduction to Geospatial Data Science: Building on the previous set of tools, this module will teach students how to work with geospatial datasets using a range of modern Python toolkits.
- Data Ingestion & Big Data: Students will learn how to collect new data through web scraping and APIs, as well as how to work effectively with the large datasets often encountered in real-world applications.
- Geospatial Data Science in the Wild: Armed with the necessary data science tools, students will be introduced to a range of advanced analytic and machine learning techniques using a number of innovative examples from modern researchers.
- From Exploration to Storytelling: The final module will teach students to present their analysis results using web-based formats to transform their insights into interactive stories.
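The “pipeline” idea described above can be sketched in a few lines of plain Python. This is only an illustrative sketch and not course material: the sample records, the function names (`gather`, `analyze`, `summarize`), and the sale prices are all hypothetical, and the visualization step is left out because it would require a plotting library.

```python
import csv
import io
import statistics

# Hypothetical sample data, invented purely for illustration.
RAW_CSV = """neighborhood,sale_price
Fishtown,310000
Fishtown,295000
University City,420000
University City,455000
"""


def gather(text):
    """Step 1: load raw records (here from a string; in practice a file, API, or scraper)."""
    return list(csv.DictReader(io.StringIO(text)))


def analyze(rows):
    """Step 2: compute the mean sale price per neighborhood."""
    groups = {}
    for row in rows:
        groups.setdefault(row["neighborhood"], []).append(float(row["sale_price"]))
    return {name: statistics.mean(prices) for name, prices in groups.items()}


def summarize(results):
    """Step 3: turn the numbers into short, readable sentences."""
    return [
        f"{name}: average sale price ${value:,.0f}"
        for name, value in sorted(results.items())
    ]


rows = gather(RAW_CSV)
results = analyze(rows)
for line in summarize(results):
    print(line)
```

Each stage takes the previous stage's output as its input, so any one step (for example, swapping the inline string for a real data source) can be replaced without touching the others.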
Schedule & Course Materials
- Schedule is tentative; lectures & assignment dates are subject to change.
- Weekly course materials are stored in individual repositories on GitHub, available via the icons below.
- Lecture slides are distributed as Jupyter notebooks, which are self-contained, executable Python documents.
- Fully interactive and executable versions of the lecture slides are available via the buttons in the “Interactive Slides” column. Clicking a button launches the notebook in a temporary cloud environment. Because the computing platform is temporary (and provided for free!), the environment is deleted after an extended period of inactivity (about 20 minutes), and you will need to launch a fresh one.
- Static, non-interactive versions of the lecture slides are available via the buttons in the “Static Slides” column.
| Week | GitHub | Topic | Date | Interactive Slides | Static Slides | Homework |
|---|---|---|---|---|---|---|
| 1 | | Exploratory Data Science in Python | 09/01 (Tu) | | | |
| | | | 09/03 (Th) | | | Assign HW #1 (required) |
| 2 | | Data Visualization Fundamentals | 09/08 (Tu) | | | |
| | | | 09/10 (Th) | | | Assign HW #2 (required) |
| 3 | | Geospatial Data Analysis and GeoPandas | 09/15 (Tu) | | | |
| | | | 09/17 (Th) | | | |
| 4 | | More Interactive Data Viz, Working with Raster Datasets | 09/22 (Tu) | | | |
| | | | 09/24 (Th) | | | Assign HW #3 (required) |
| 5 | | Getting Data Part 1: Working with APIs | 09/29 (Tu) | | | |
| | | | 10/01 (Th) | | | |
| 6 | | Getting Data Part 2: Web Scraping | 10/06 (Tu) | | | |
| | | | 10/08 (Th) | | | Assign HW #4 (optional) |
| 7 | | Analyzing and Visualizing Large Datasets | 10/13 (Tu) | | | |
| | | | 10/15 (Th) | | | |
| 8 | | Case Study: Advanced Raster Analysis | 10/20 (Tu) | | | |
| | | | 10/22 (Th) | | | Assign HW #5 (optional) |
| 9 | | Case Study: OpenStreetMap, Urban Networks, and Interactive Web Maps | 10/27 (Tu) | | | |
| | | | 10/29 (Th) | | | |
| 10 | | Case Study: Clustering Analysis in Python | 11/03 (Tu) | | | |
| | | | 11/05 (Th) | | | Assign HW #6 (optional) |
| 11 | | Predictive Modeling Part 1: Home Prices in Philadelphia | 11/10 (Tu) | | | |
| | | | 11/12 (Th) | | | |
| 12 | | Predictive Modeling Part 2: Space/time Rideshare Demand | 11/17 (Tu) | | | |
| | | | 11/19 (Th) | | | Assign HW #7 (required) |
| 13 | | From Notebooks to the Web: GitHub Pages, Web Servers, and Dash | 11/24 (Tu) | | | |
| | | | 12/01 (Tu) | | | |
| 14 | | From Notebooks to the Web: Dashboarding with Panel and the HoloViz Ecosystem | 12/03 (Th) | | | Final project proposal due |
| | | | 12/08 (Tu) | | | |