DS310: Machine Learning for Data Analytics

DS310: Machine Learning for Data Analytics, Fall Semester 2020
College of Information Sciences and Technology, Pennsylvania State University

COVID-19

Instructors

  • Fenglong Ma (lead instructor; fenglong[at]psu.edu)
  • Suhas Bettapalli Nagaraj (instructional assistant; suhas[at]psu.edu)
  • Zifei Zheng (learning assistant; zfz5136[at]psu.edu)

Meeting Times and Locations

  • Tuesdays and Thursdays 4:35 PM – 5:50 PM (Eastern Time), Westgate Building E208 + Zoom
  • Fenglong Office Hours: Mondays 2:00 – 4:00 PM (Eastern Time), Zoom
  • Suhas Office Hours: Fridays 10:00 AM – 12:00 PM (Eastern Time), Zoom
  • Zifei Office Hours: Thursdays 9:00 AM – 10:00 AM (Eastern Time), Zoom

Topic Schedule

Week | Date  | Topic | Assigned | Due
1    | 08/25 | Logistics (Online) | |
     | 08/27 | Introduction to Machine Learning (Online) | |
2    | 09/01 | Inputs & Outputs of Machine Learning (Online) | HW1 |
     | 09/03 | Hands-on Lab 1: Python, Pandas, and Numpy | |
3    | 09/08 | Review: Derivatives | |
     | 09/10 | Regression: Linear Regression | | HW1 (09/13)
4    | 09/15 | Regression: Gradient Descent | HW2 |
     | 09/17 | Hands-on Lab 2: Kaggle and Regression | Proj1 |
5    | 09/22 | Classification: Evaluation and K Nearest Neighbors | |
     | 09/24 | Classification: Logistic Regression | | HW2 (09/27)
6    | 09/29 | Classification: Perceptron | |
     | 10/01 | Hands-on Lab 3: Classification 1 | |
7    | 10/06 | Classification: Decision Trees | HW3 |
     | 10/08 | Classification: Naive Bayesian | | Proj1 (10/11)
8    | 10/13 | Classification: Support Vector Machines | |
     | 10/15 | Midterm Review | | HW3 (10/18)
9    | 10/20 | Classification: Ensemble Learning | Proj2 |
     | 10/22 | Midterm Exam (Online) | |
10   | 10/27 | Hands-on Lab 4: Classification 2 | HW4 |
     | 10/29 | Clustering: Basics | |
11   | 11/03 | Clustering: Kmeans Clustering | |
     | 11/05 | Clustering: Hierarchical Clustering | | HW4 (11/08)
12   | 11/10 | Clustering: Density-based Clustering | HW5 |
     | 11/12 | Dimensionality Reduction | | Proj2 (11/15)
13   | 11/17 | Hands-on Lab 5: Clustering | Proj3 |
     | 11/19 | Deep Learning: Introduction | | HW5 (11/22)
14   | 11/24 | No Class (Thanksgiving Holiday) | |
     | 11/26 | No Class (Thanksgiving Holiday) | |
15   | 12/01 | Final Review (Online) | |
     | 12/03 | Deep Learning: Convolutional Neural Networks (Online) | |
16   | 12/08 | Deep Learning: Recurrent Neural Networks (Online) | |
     | 12/10 | Final Exam (Online) | | Proj3 (12/13)

Course Texts

  • Machine Learning, Tom Mitchell, McGraw-Hill, 1997.
  • Pattern Recognition and Machine Learning, Chris Bishop, Springer, 2006. (free online copy)
  • A Course in Machine Learning, Hal Daume III, 2017. (free online copy)
  • Data Mining: Concepts and Techniques, Jiawei Han, Micheline Kamber, Jian Pei, 2011.
  • Introduction to Data Mining (2nd Edition), Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, Vipin Kumar, 2019.

Grading

    • Class Participation – 5%
    • Homework – 35% (7% * 5)
    • Hands-on Labs – 10% (2% * 5)
    • Project – 30% (10% * 3)
    • Mid-term Exam (online) – 10%
    • Final Exam (online) – 10%
    • All components will be individually curved
    • Final grade
      • A [93%, 100%]
      • A- [90%, 93%)
      • B+ [87%, 90%)
      • B [83%, 87%)
      • B- [80%, 83%)
      • C+ [77%, 80%)
      • C [70%, 77%)
      • D [60%, 70%)
      • F [0%, 60%)

The instructor reserves the right to modify the grading scale so as to improve the letter grade if warranted by the circumstances (e.g., unusually high level of difficulty of problem sets).
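
As a rough illustration of how the weights and letter-grade intervals above combine, here is a minimal Python sketch. It assumes each component score is already expressed as a percentage from 0 to 100, and it ignores the per-component curving and any adjustment of the scale by the instructor. The names `WEIGHTS`, `final_percentage`, and `letter_grade` are hypothetical and are not part of any course tooling.

```python
# Component weights in percent, taken from the Grading list (they sum to 100).
WEIGHTS = {
    "participation": 5,
    "homework": 35,   # 5 homework assignments at 7% each
    "labs": 10,       # 5 hands-on labs at 2% each
    "projects": 30,   # 3 projects at 10% each
    "midterm": 10,
    "final": 10,
}

# Lower bound of each letter-grade interval, highest first.
LETTER_CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"),
    (80, "B-"), (77, "C+"), (70, "C"), (60, "D"),
]


def final_percentage(scores):
    """Weighted average of component scores, each given on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage):
    """Map a final percentage to a letter using the half-open intervals above."""
    for cutoff, letter in LETTER_CUTOFFS:
        if percentage >= cutoff:
            return letter
    return "F"


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    example = {
        "participation": 100, "homework": 88, "labs": 95,
        "projects": 90, "midterm": 80, "final": 85,
    }
    total = final_percentage(example)
    print(total, letter_grade(total))  # 88.8 B+
```

Because each interval is closed on the left and open on the right, a final percentage of exactly 93 maps to A, while 92.99 maps to A-.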

Expectations

At the completion of this course, students are expected to obtain the following:

  • Broad understanding of the principles of data mining/machine learning, representative data mining/machine learning algorithms and their applications in data analytics and data sciences
  • Capability to identify, formulate and solve exploratory data analysis and predictive modeling problems that arise in practical applications
  • Understanding of the strengths and weaknesses of alternative algorithms
  • Capability to adapt or combine key elements of existing algorithms to design new algorithms as needed
  • Hands-on experience with the applications of several representative algorithms in a high-level programming language (e.g., Python)
  • Hands-on experience in participating in online data mining/machine learning competitions (e.g., Kaggle)

Assignment Submission Policy

  • Homework and Projects are usually assigned during TUESDAY classes
  • Homework and Projects are due by default on SUNDAYS at 11:59 PM (Eastern Time)
  • Assignments must be TYPED and submitted to the proper CANVAS drop boxes (hand-drawn figures are OK; scan and upload them)
  • Late submissions incur a penalty of 25% for every 12 hours late (up to 2 days)
  • After 2 days, late submissions are no longer accepted
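
The late penalty can be sketched as a simple multiplier on the earned score. The syllabus does not say whether a partial 12-hour block counts as a full block, or whether the deduction applies to the earned score or to the maximum score. The sketch below assumes that every started 12-hour block costs 25 percentage points of the earned score; the function name `late_multiplier` is hypothetical. Please confirm the exact rule with the instructor.

```python
import math


def late_multiplier(hours_late):
    """Fraction of the earned score kept after the late penalty.

    Assumption (not stated in the syllabus): every started 12-hour block
    costs 25 percentage points, applied to the earned score.
    """
    if hours_late <= 0:
        return 1.0  # submitted on time
    if hours_late > 48:
        return 0.0  # more than 2 days late: submission not accepted
    blocks = math.ceil(hours_late / 12)
    return max(0.0, 1.0 - 0.25 * blocks)


if __name__ == "__main__":
    for hours in (0, 1, 12, 13, 30, 40):
        print(hours, late_multiplier(hours))
```

Under this reading, a submission more than 36 hours late would already earn no credit, which is one reason to check the intended interpretation before relying on the late window.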

Academic Integrity

According to the Penn State Principles and University Code of Conduct: Academic integrity is a basic guiding principle for all academic activity at Penn State University, allowing the pursuit of scholarly activity in an open, honest, and responsible manner. In accordance with the University’s Code of Conduct, you must not engage in or tolerate academic dishonesty. This includes, but is not limited to, cheating, plagiarism, fabrication of information or citations, facilitating acts of academic dishonesty by others, unauthorized possession of examinations, submitting the work of another person or work previously used without informing the instructor, or tampering with the academic work of other students. Any violation of academic integrity will be investigated, and where warranted, punitive action will be taken. For every incident in which a penalty of any kind is assessed, a report must be filed.

Plagiarism (Cheating): Talking over your ideas and getting comments on your writing from friends are NOT examples of plagiarism. Taking someone else’s words (published or not) and calling them your own IS plagiarism. Plagiarism has dire consequences, including flunking the paper in question, flunking the course, and university disciplinary action, depending on the circumstances of the offense. The simplest way to avoid plagiarism is to document the sources of your information carefully.

Homework: When discussing problems from assigned homework with other students, you may:

  • Discuss the material presented in class or included in the assigned readings needed for solving the problem(s)
  • Assist another student in understanding the statement of the problem (e.g., you may assist a non-native speaker by translating some English phrases unfamiliar to that student)

It is expected that you have independently arrived at solutions that you turn in for problem sets. The following are examples of activities that are PROHIBITED:

  • Sharing solutions or fragments of solutions (via email, discussion groups, social media, whiteboard, handwritten or printed copies, etc.)
  • Posting solutions or fragments of solutions in a location that is accessible to others
  • Using solutions or fragments of solutions provided by other students (including students who had taken the course in the past)
  • Using solutions or solution fragments obtained on the Internet or from solution manuals for textbooks

Project: When discussing laboratory assignments, you may:

  • Discuss the material presented in class or included in assigned readings, documentation, user manual, etc.
  • Assist another student in understanding the statement of the problem (e.g., you may assist a non-native speaker by translating some English phrases unfamiliar to that student)
  • Discuss high-level ideas about how to complete the lab assignment, including problem specification, general strategies for the solution, strategies for debugging and testing code, etc. without examining code written by other students, or sharing code written by you with other students.

It is expected that you have independently arrived at solutions that you turn in for laboratory assignments. The following are examples of activities that are PROHIBITED:

  • Examining or copying code or code fragments from someone else (including online sources), other than the code that is provided to you by the instructor or included in the reference books.
  • Sharing code or code fragments (via email, discussion groups, social media, whiteboard, handwritten or printed copies, etc.)

If a “friend” asks you to show him/her your code (especially if the request is to receive a copy of your code), you are opening the door wide for a possible charge of academic misconduct for both of you. I have seen friendships crumble when student A innocently supplies a copy of his/her code to student B, who then plagiarizes it, getting both in trouble. Do not be an accessory; truly help a friend by saying no. The best source for help on these assignments is the instructor or the teaching assistant. We are experienced in providing the right kind of information and help.

Exam: It is expected that you have independently arrived at solutions that you turn in for exams. The following are examples of activities that are PROHIBITED:

  • Copying someone else’s solution
  • Using notes, online resources, or other reference materials (unless instructed otherwise)
  • Seeking, obtaining or providing help on an exam via phone, text messaging, email, social media
  • Altering a graded exam for re-grading
  • Getting an advance copy of the examination
  • Facilitating another student to cheat (e.g., by allowing him or her to copy your solution)
  • Having someone else write the exam

You need to exercise special care with take-home exams. You should NEVER:

  • Share solutions or fragments of solutions (via email, whiteboard, handwritten or printed copies, etc.)
  • Post solutions or fragments of solutions in a location that is accessible to others
  • Use solutions or fragments of solutions provided by other students (including students who had taken the course in the past)
  • Use solutions or solution fragments obtained on the Internet or from solution manuals for textbooks
  • Use material from textbooks, reference books, online resources, or research articles without properly acknowledging and citing the source

! Warning

  • A violation of the Academic Integrity policy will result in an automatic F for the submission concerned.
  • Two violations ⇒ failing grade in the course
  • You may discuss homework with other students. Every student must submit their own homework, with the names of the students in the discussion group explicitly listed.

Disability Access Statement

Americans with Disabilities Act: The School of Information Sciences and Technology welcomes persons with disabilities to all of its classes, programs, and events. If you need accommodations or have questions about access to buildings where IST activities are held, please contact us in advance of your participation or visit. If you need assistance during a class, program, or event, please contact the member of our staff or faculty in charge. Access to IST courses should be arranged by contacting the Office of Human Resources, 332 IST Building: (814) 865-8949.

Students with Disabilities: It is Penn State’s policy to not discriminate against qualified students with documented disabilities in its educational programs. (You may refer to the Nondiscrimination Policy in the Student Guide to University Policies and Rules.) If you have a disability-related need for reasonable academic adjustments in this course, contact the Office for Disability Services (ODS) at 814-863-1807 (V/TTY). For further information regarding ODS, please visit the Office for Disability Services Web site at http://equity.psu.edu/ods/.

In order to receive consideration for course accommodations, you must contact ODS and provide documentation (see documentation guidelines at http://equity.psu.edu/ods/guidelines/documentation-guidelines). If the documentation supports the need for academic adjustments, ODS will provide a letter identifying appropriate academic adjustments. Please share this letter and discuss the adjustments with your instructor as early in the course as possible. You must contact ODS and request academic adjustment letters at the beginning of each semester.

Statement on Nondiscrimination & Harassment (Policy AD42)

The Pennsylvania State University is committed to the policy that all persons shall have equal access to programs, facilities, admission and employment without regard to personal characteristics not related to ability, performance, or qualifications as determined by University policy or by state or federal authorities. It is the policy of the University to maintain an academic and work environment free of discrimination, including harassment. The Pennsylvania State University prohibits discrimination and harassment against any person because of age, ancestry, color, disability or handicap, national origin, race, religious creed, sex, sexual orientation, gender identity or veteran status. Discrimination or harassment against faculty, staff or students will not be tolerated at The Pennsylvania State University. You may direct inquiries to the Office of Multicultural Affairs, 332 Information Sciences and Technology Building, University Park, PA 16802; Tel 814-865-0077 or to the Office of Affirmative Action, 328 Boucke Building, University Park, PA 16802-5901; Tel 814-865-4700/V, 814-863-1150/TTY.

For reference to the full policy (Policy AD42: Statement on Nondiscrimination and Harassment): http://guru.psu.edu/policies/AD42.html