The objective of this project is to collect publicly available attendance data from professional and collegiate sports organizations such as MLB, NHL, NFL, and NCAA, and to look for patterns over the years in which the performance (win or loss) of one team is negatively associated with attendance at other teams’ games.


Team Members

Drew Wood | M. Bilal Yaseen | Yuan Meng | Connor Deardorff | Ji Yun Min | Marshall Malino

Project Poster




Watch the Project Video



Download the Project Summary




Project Summary

Overview

The objective of our project is to collect publicly available attendance data from professional and collegiate sports organizations such as MLB, NHL, NFL, and NCAA, and to look for patterns over the years in which the performance (win or loss) of one team is negatively associated with attendance at other teams’ games. The main theme of the project is the correlation between attendance factors and performance factors.

Objectives

– Identify reliable, official websites, such as ESPN and NCAA, from which to collect data.

– Extract the most meaningful datasets with the information we need for our project.

– Collect data through data scraping methods.

– Write code to refine the data and generate graphs of the correlation data.

– Examine the correlation data and determine the sets of variables with the strongest correlation.

– Present a concrete conclusion to our research on correlating data.

Approach

– Data Gathering: Find relevant websites;

– Data Scraping: Convert the data into a usable format;

– Data Processing: Process the data with Python;

– Data Analysis: Query the processed data to gather the necessary information and statistics;

– Data Visualization: Represent the data in a visual format.
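The processing and analysis steps can be illustrated with a minimal Python sketch. It is not the project's actual code: the CSV column names, the team labels `A` and `B`, the helper functions, and every number in the sample are made up for illustration. It parses scraped-style rows, derives a win percentage per season, and computes the Pearson correlation between one team's win percentage and another team's average attendance.

```python
import csv
import io
import math

# Made-up sample standing in for scraped data; these are NOT real league figures.
RAW_CSV = """season,team,wins,losses,avg_attendance
2015,A,60,40,31000
2015,B,45,55,20000
2016,A,50,50,30000
2016,B,55,45,22500
2017,A,40,60,29000
2017,B,50,50,24000
2018,A,70,30,32000
2018,B,40,60,18000
"""


def load_rows(text):
    """Parse CSV text and derive a win percentage for each team-season."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        wins, losses = int(rec["wins"]), int(rec["losses"])
        rows.append({
            "season": int(rec["season"]),
            "team": rec["team"],
            "win_pct": wins / (wins + losses),
            "avg_attendance": float(rec["avg_attendance"]),
        })
    return rows


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cross_team_correlation(rows, perf_team, att_team):
    """Correlate perf_team's win percentage with att_team's attendance,
    using only the seasons for which both teams have a row."""
    perf = {r["season"]: r["win_pct"] for r in rows if r["team"] == perf_team}
    att = {r["season"]: r["avg_attendance"] for r in rows if r["team"] == att_team}
    seasons = sorted(perf.keys() & att.keys())
    return pearson([perf[s] for s in seasons], [att[s] for s in seasons])


rows = load_rows(RAW_CSV)
r_ab = cross_team_correlation(rows, "A", "B")
print(f"corr(A win pct, B attendance) = {r_ab:.3f}")
```

Matching the two series on season before correlating keeps a missing year for one team from silently misaligning the data.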

Outcomes

– Developed code capable of determining whether two sports teams are in a zero-sum relationship, in which one team’s success coincides with lower attendance for the other.

– Visualizations showed positive or negative correlations between one team’s attendance and the other teams’ win/loss percentage.
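One way such a zero-sum check could look is sketched below in Python. It is an assumption-laden illustration, not the team's implementation: the team labels `X` and `Y`, the season-aligned sample values, the `THRESHOLD` cutoff of -0.5, and the function names are all hypothetical. The sketch correlates each team's win percentage with every other team's attendance, ranks the pairs from most negative to most positive, and flags the pairs whose coefficient falls at or below the cutoff.

```python
import math
from itertools import permutations

# Made-up, season-aligned series (one value per season); NOT real data.
win_pct = {
    "X": [0.60, 0.50, 0.40, 0.70],
    "Y": [0.45, 0.55, 0.50, 0.40],
}
attendance = {
    "X": [30000, 29500, 31000, 30500],
    "Y": [20000, 22500, 24000, 18000],
}

# Assumed cutoff: pairs with r at or below this value are flagged.
THRESHOLD = -0.5


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_cross_team_pairs(win_pct, attendance):
    """Return (performance_team, attendance_team, r) for every ordered
    pair of distinct teams, sorted from most negative r to most positive."""
    results = []
    for perf_team, att_team in permutations(win_pct, 2):
        r = pearson(win_pct[perf_team], attendance[att_team])
        results.append((perf_team, att_team, r))
    return sorted(results, key=lambda item: item[2])


ranked = rank_cross_team_pairs(win_pct, attendance)
flagged = [(p, a) for p, a, r in ranked if r <= THRESHOLD]

for perf_team, att_team, r in ranked:
    marker = "possible zero-sum" if (perf_team, att_team) in flagged else ""
    print(f"{perf_team} win pct vs {att_team} attendance: r = {r:+.3f} {marker}")
```

A strongly negative coefficient is only suggestive: it shows that the two series moved in opposite directions, not that one team's results caused the other's attendance to change.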