A Two-Stage Machine Learning Framework for Nonwear and Sleep Detection in Actigraphy Data

Project Team

Students

Zihan Zhao
Computer Science
Penn State Harrisburg

Faculty Mentors

Md Faisal Kabir
Penn State Harrisburg
Department of Computer Science

Orfeu M. Buxton
Penn State University Park
Department of Biobehavioral Health

Daniel Roberts
Penn State University Park
Department of Biobehavioral Health

Project

https://sites.psu.edu/mcreu/files/formidable/2/2025-07-21/Poster-Updated.pdf

Project Abstract

Name: Zihan Zhao
Campus: Penn State Harrisburg
Major: Computer Science
Anticipated Graduation Date: May 2027
Mentors: Md Faisal Kabir (Penn State Harrisburg), Orfeu Buxton (University Park), Daniel Roberts (University Park)
Project Title: A Two-Stage Machine Learning Framework for Nonwear and Sleep Detection in Actigraphy Data

Wearable devices offer scalable solutions for sleep monitoring, but undetected nonwear periods and misclassified sleep states remain major challenges in actigraphy-based health research. These issues compromise the reliability of downstream analyses and may introduce bias in large-scale studies. To address this, robust, generalizable methods are needed for both nonwear detection and sleep classification.

This project adopts a two-stage framework: we first detect nonwear periods using XGBoost models trained to replicate the MESA dataset’s offwrist labels from light and activity data, and then classify sleep versus wake on the remaining worn segments. We trained and evaluated our models on two real-world datasets: the MESA Sleep Dataset, which includes actigraphy-based sleep/wake labels generated by the device’s internal algorithm, and a second wrist-worn wearable dataset with polysomnography (PSG)-derived sleep annotations. We implemented XGBoost classifiers using sliding-window features derived from activity and light for MESA, with heart rate added for the PSG-labeled dataset. We also reproduced prior experiments and plan to retrain our models on MESA using the same architecture so that improvements can be compared directly.
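
The two-stage flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`window_features`, `two_stage_predict`), the window size, the choice of mean and standard deviation as features, and the threshold rules used as stand-ins for the trained classifiers are all assumptions made for the example. In practice, the two callables would wrap the trained XGBoost models.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence

Features = List[float]


def window_features(activity: Sequence[float],
                    light: Sequence[float],
                    half_width: int = 2) -> List[Features]:
    """Build per-epoch features from a centered sliding window.

    For each epoch, returns [activity mean, activity std, light mean, light std]
    over the epochs within `half_width` of it (windows are clipped at the edges).
    The feature set is a hypothetical choice for illustration.
    """
    if len(activity) != len(light):
        raise ValueError("activity and light must have the same length")
    n = len(activity)
    feats = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        a, l = activity[lo:hi], light[lo:hi]
        feats.append([mean(a), pstdev(a), mean(l), pstdev(l)])
    return feats


def two_stage_predict(features: Sequence[Features],
                      is_nonwear: Callable[[Features], bool],
                      is_sleep: Callable[[Features], bool]) -> List[str]:
    """Stage 1 flags nonwear epochs; stage 2 labels only the worn epochs."""
    labels = []
    for f in features:
        if is_nonwear(f):
            labels.append("nonwear")   # excluded from sleep/wake scoring
        else:
            labels.append("sleep" if is_sleep(f) else "wake")
    return labels


# Toy stand-ins for the trained models (assumed thresholds, not from the study).
def toy_nonwear(f: Features) -> bool:
    return f[0] == 0 and f[2] == 0     # no movement and no light at all


def toy_sleep(f: Features) -> bool:
    return f[0] < 10                   # low average activity


feats = window_features([0, 2, 30], [0, 1, 5], half_width=0)
print(two_stage_predict(feats, toy_nonwear, toy_sleep))
```

Keeping the two stages as separate callables means the sleep classifier never sees epochs that the first stage has flagged as nonwear, which is the filtering step the framework relies on.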

The nonwear detection model, trained on the MESA dataset using light and activity features, achieved a balanced accuracy of 98.5%, effectively identifying nonwear periods. For sleep classification, we trained separate models on two datasets. A binary classifier trained on MESA (using light and activity) achieved 97.0% balanced accuracy, with 96% recall and 98% specificity. Additionally, a six-class sleep staging model was developed using the second wrist-worn dataset with PSG-based labels and additional heart rate data. This model achieved an overall accuracy of 84.0% and a macro-averaged recall of 78.0%, demonstrating strong performance across most sleep stages. Feature importance analysis revealed that activity-related features dominated classification performance, while light and heart rate features played supportive roles.
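
For readers unfamiliar with the metrics reported above, the short sketch below shows how balanced accuracy and macro-averaged recall are conventionally computed. The confusion-matrix counts are invented for illustration and are not the study's data; they are chosen only so that recall is 96% and specificity is 98%, which reproduces the reported 97.0% balanced accuracy for the binary sleep/wake model. The function names are likewise hypothetical.

```python
from typing import Sequence


def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of recall (sensitivity) and specificity for a binary classifier."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (recall + specificity) / 2


def macro_recall(per_class_recall: Sequence[float]) -> float:
    """Unweighted mean of per-class recall, so every class counts equally."""
    return sum(per_class_recall) / len(per_class_recall)


# Illustrative counts only: 96/100 positives and 98/100 negatives correct.
print(round(balanced_accuracy(tp=96, fn=4, tn=98, fp=2), 3))
```

Because both metrics average per-class rates rather than pooling all epochs, they are less easily inflated by a dominant class than plain accuracy, which matters when nonwear epochs or some sleep stages are comparatively rare.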

This two-stage approach improves the quality and reliability of actigraphy-derived sleep assessments by reducing noise and addressing data artifacts. It lays the foundation for scalable, unbiased, and reproducible sleep monitoring tools and supports further improvements through unified modeling across diverse datasets.
