By building a set of modules to teach data science in STEM courses at Dartmouth, we are creating a flexible and reusable set of tools and methods that faculty can use to enrich learning objectives through hands-on exploration of data collection, analysis, and visualization.

DIFUSE Modules

Our team works with faculty in the sciences and social sciences to build data science learning modules for existing courses. These modules could be for a short assignment or a longer-running exercise with skill-building components. Module teams consist of 2-3 students (graduate and undergraduate) and one of the DIFUSE grant PIs. We do the heavy lifting, with input from the faculty member during weekly meetings.

Timeline of DIFUSE Modules

Taylor Hickey 3/9/23

Modeling First Order Systems with Footage of a Small Motorized Cart

This module examines the open-loop response of a small motorized cart with a voltage applied to the motor. The module has two components: individual data collection and analysis, followed by drawing conclusions from the aggregated class data.
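As a rough illustration of what a first-order model of the cart might look like, the sketch below assumes the cart's velocity follows the standard first-order step response v(t) = K V (1 - e^(-t/tau)) and estimates the time constant from the point where the velocity reaches about 63.2% of its final value. The gain, time constant, voltage, and function names are hypothetical and are not taken from the module itself.

```python
import math


def step_response(t, gain, tau, voltage):
    """Velocity of an assumed first-order system after a voltage step at t = 0."""
    return gain * voltage * (1.0 - math.exp(-t / tau))


def estimate_tau(times, velocities):
    """Estimate the time constant as the time to reach 63.2% of the final velocity.

    Uses linear interpolation between the two samples that bracket the target.
    Assumes the last sample is close to steady state.
    """
    target = (1.0 - math.exp(-1.0)) * velocities[-1]
    for i in range(1, len(times)):
        if velocities[i] >= target:
            t0, t1 = times[i - 1], times[i]
            v0, v1 = velocities[i - 1], velocities[i]
            return t0 + (target - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("velocity never reached 63.2% of its final value")


# Synthetic "footage" data: hypothetical gain, time constant, and voltage.
times = [i * 0.01 for i in range(501)]  # 0 to 5 seconds
velocities = [step_response(t, gain=0.2, tau=0.5, voltage=6.0) for t in times]
tau_estimate = estimate_tau(times, velocities)
```

With real footage the velocity samples would be noisy, so a curve fit over the whole trace would likely be more robust than this single-point estimate.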

Taylor Hickey 3/9/23

Exploring the Relationships between Land Use, Deer Population, and Lyme Cases in Four U.S. States

This module allows students to explore data on Lyme disease cases, deer population, land use, and environmental factors for four states (Connecticut, Maryland, New Hampshire, and Massachusetts) using various data analysis techniques. Six Canvas quizzes, made up mainly of short-answer questions with a few multiple-choice questions, guide students through a Google Colab application.

Taylor Hickey 11/30/22

Using the Wind Power Equations to Site a Wind Farm

This module allows students to engage with the wind energy power equations and explore other considerations in the siting of a wind farm. Students work through three block assignments in Google Colab, beginning with the wind power equations and culminating in considerations in siting a wind farm.
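For readers unfamiliar with the wind power equations, a minimal sketch follows. It assumes the standard relation P = 1/2 · rho · A · v^3 · Cp, where rho is air density, A is the rotor swept area, v is wind speed, and Cp is the turbine's power coefficient. The function name and the default values are illustrative assumptions, not the module's own code.

```python
import math

# Theoretical upper bound on the power coefficient (Betz limit).
BETZ_LIMIT = 16.0 / 27.0


def wind_power_watts(air_density, rotor_diameter, wind_speed, power_coefficient=0.4):
    """Power extracted by a turbine, in watts.

    air_density in kg/m^3, rotor_diameter in m, wind_speed in m/s.
    The default power_coefficient of 0.4 is an assumed, illustrative value.
    """
    if not 0.0 <= power_coefficient <= BETZ_LIMIT:
        raise ValueError("power coefficient must lie between 0 and the Betz limit")
    swept_area = math.pi * (rotor_diameter / 2.0) ** 2
    return 0.5 * air_density * swept_area * wind_speed ** 3 * power_coefficient


# Because power scales with the cube of wind speed, doubling the wind speed
# at a candidate site multiplies the available power by eight.
power_at_5 = wind_power_watts(1.225, 100.0, 5.0)
power_at_10 = wind_power_watts(1.225, 100.0, 10.0)
ratio = power_at_10 / power_at_5
```

The cubic dependence on wind speed is one reason average wind speed tends to dominate siting decisions.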

Taylor Hickey 11/30/22

Using Statistics and Supervised Machine Learning to Inform Airline Decision Making

This module reinforces underlying statistical concepts in the process of building a data analysis pipeline. In Part 1, students practice statistical concepts to gain an understanding of the airline data; in Part 2, the data is used to implement machine learning models. The final deliverable is a slide deck in which students act as consultants for the Phoenix Sky Harbor Airport, using insights gained from supervised machine learning analysis of the relationship between airline carrier delays and passengers per flight.

Taylor Hickey 11/29/22

Using Footprint Data to Make Inferences about Historical Societies

In this module, students learn and apply the systematic steps that anthropologists may take to make deductive inferences about historical societies from observations of fossil (footprint) records. Students first collect data on their own footprints using a sandbox built by DIFUSE, then analyze aggregated data from the entire class, and finally use their insights to make inferences about the social behavior of historical populations.

Taylor Hickey 4/4/22

Quantifying Behavior Using Focal Bout and Instantaneous Scan Sampling

This course module is a two-step assignment in which students collect data on shots taken during a provided basketball game video using the two main data collection methods in research on primate behavior: focal bout sampling and instantaneous scan sampling. The class data is then aggregated, and visual representations are created and discussed. The goal of the module is for students to understand the respective strengths and weaknesses of the two data collection methods.

Taylor Hickey 4/4/22

Examining Air Quality Data in Germany

This module consists of six assignments in which students learn and then apply air quality dispersion modeling in an R-based programming module, using the ‘openair’ package and open-source air quality datasets for cities in Germany.

Taylor Hickey 12/28/21

Examining the Effect of Different Factors on Self-Rated Health in Texas Counties

This course module consists of four assignments in which students explore different categories of factors that could affect self-rated health in Texas counties. Students use a linear regression and a heat map to explore these relationships.

Taylor Hickey 12/28/21

Modeling the Glucose Insulin System

This module consists of two assignments. The first guides students through modeling simple ODEs in MATLAB, and the second, longer assignment guides students through modeling the glucose-insulin system in MATLAB with Euler’s method. Students are then expected to explore this model by optimizing one parameter for a given set of data using the least squares method.
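The two techniques named above can be sketched on a toy problem. The example below is written in Python rather than MATLAB and does not use the module's glucose-insulin equations: it assumes a simple decay ODE, dy/dt = -k · y, integrates it with forward Euler, and then picks the value of the single parameter k that minimizes the sum of squared errors against synthetic data. All names and values are hypothetical.

```python
def euler_decay(k, y0, dt, n_steps):
    """Integrate dy/dt = -k * y with the forward Euler method.

    Returns a list of n_steps + 1 values, starting with y0.
    """
    ys = [y0]
    for _ in range(n_steps):
        slope = -k * ys[-1]              # right-hand side of the ODE
        ys.append(ys[-1] + dt * slope)   # Euler update: y_next = y + dt * f(y)
    return ys


def sum_squared_error(k, observed, y0, dt):
    """Least squares objective: squared mismatch between model and data."""
    model = euler_decay(k, y0, dt, len(observed) - 1)
    return sum((m - o) ** 2 for m, o in zip(model, observed))


def fit_k(observed, y0, dt, candidates):
    """Pick the candidate k with the smallest sum of squared errors."""
    return min(candidates, key=lambda k: sum_squared_error(k, observed, y0, dt))


# Synthetic "measurements" generated from a known parameter (assumed value).
true_k = 0.3
observed = euler_decay(true_k, y0=10.0, dt=0.1, n_steps=50)

# Grid search over one parameter; a real fit might use a numerical optimizer.
candidates = [i / 100 for i in range(1, 101)]
best_k = fit_k(observed, y0=10.0, dt=0.1, candidates=candidates)
```

A grid search is used here only because it keeps the example dependency-free; with one parameter and a smooth objective, any standard one-dimensional minimizer would serve the same purpose.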

Taylor Hickey 5/25/21

Examining the Racial, Environmental, and Economic Influences on COVID-19 Mortality in Louisiana

This course module is a web app accompanied by a short-answer assignment that guides students through it. Students use spatial data to visualize human-environment relationships and analyze those relationships through plotting and linear regression analysis.

Taylor Hickey 4/6/21

Statistics in R

This course module consists of Jupyter notebooks designed to introduce students to basic functional R commands/procedures whilst tying in key statistical content. It aims to give novice students competence in R and challenge experienced students.

Taylor Hickey 4/6/21

Stars and the Milky Way

This course module is a series of group exercises and one problem set designed to introduce students to the way astrophysicists manipulate data and perform analyses in Python, with an emphasis on data visualization and plot interpretation.