This summer, the NSF project “Big Data Education” held its Second Data Science Educators Workshop at Penn State Altoona. The project aims to develop a data science curriculum for college students, and the annual workshop is the main event that brings together participating educators from across the Penn State University System, and from other collaborating institutions.
This year, the workshop focused on three main topics:
- The educators’ evaluation of the project’s Module 1 (teaching key data science concepts through digital storytelling);
- The educators’ experience so far with the project’s Module 2 (solving big data problems in information security) and Module 3 (hands-on tutorials in advanced big data topics like machine learning);
- The educators’ proposals for future teaching materials and the needs those materials should meet (the main proposals being a portal site that organizes data science education resources for beginners, and an introductory data science textbook for students with little or no computer science background).
Many of the project’s team members and educators presented, including Jungwoo Ryoo, Soo-yong Byun, Dongwon Lee, Larry Garvin, Lloyd Painter, Cynthia Wood, Ariel Salvaro, and others.
The educators reviewed the currently available data science curricula and found a blind spot in the educational market: most teaching materials and courses focus either entirely on statistical methods or entirely on computer science and coding, but not both, and most do not provide real-world applications and case studies, which makes them less accessible to non-CS majors (college students not majoring in Computer Science). It turns out to be fairly difficult to find a great introductory textbook for data science, so the NSF project will treat the textbook as a priority in the coming months.
More good news came from one of the project’s corporate partners: Datric Inc., an NC-based data solutions company, agreed to grant Penn State University a license worth $1M to use its proprietary software ADS (Agile Data Suite) in the coming years as part of the project’s Module 3 hands-on data science tutorials, which are being developed by Ariel Salvaro (Graduate Research Assistant – Head RA) and his team.
Overall, the Second Data Science Educators Workshop proved to be a success, and the project team is already busy at work implementing the educators’ recommendations and developing new data science teaching materials.
There will be a third and final Data Science Educators Workshop next year, where we will present all the data science teaching materials developed during this project, including the introductory data science textbook, the hands-on tutorials in different programs applied to real-world case studies, and a comprehensive list of beginner data science teaching resources. Consider joining our mailing list here (link) to get an invitation to next year’s workshop!