Oura Ring Gen 4 sensor data: not clinical measurements
N=1 case study: not validated for clinical decisions
HEV diagnosed Mar 18; interpret findings cautiously in this Day 20 post-ruxolitinib window

Treatment Response Detection

Module 2: Comparative Changepoint Analysis
Generated 2026-04-05 19:38 · Henrik (post-HSCT) vs Mitchell (post-Stroke)

Executive Summary

SIGNIFICANT CHANGES (Info): 3 / 6. Henrik post-Rux (Bonferroni-corrected)
IMPROVED METRICS (Normal): 5 / 6. Direction of change post-treatment
MITCHELL EVENTS (Info): 12 detected. High-confidence consensus changepoints
DETECTION METHODS (Info): 4 methods. PELT + CUSUM + BOCPD + Rolling Window

Henrik: Treatment Response Analysis

HRV (RMSSD) | Normal | +17.4% change | Pre: 9.0 (n=64) | Post: 10.6 (n=16) | p=0.030 (corrected) | d=-0.61 (medium)
LOWEST HEART RATE | Normal | -5.6% change | Pre: 76.7 (n=64) | Post: 72.4 (n=16) | p=0.035 (corrected) | d=0.80 (large)
AVERAGE HEART RATE | Normal | -5.3% change | Pre: 85.2 (n=64) | Post: 80.7 (n=16) | p=0.039 (corrected) | d=0.72 (medium)
SLEEP EFFICIENCY | Normal | +2.7% change | Pre: 78.6 (n=64) | Post: 80.8 (n=16) | p=0.181 (corrected) | d=-0.64 (medium)
DEEP SLEEP | Abnormal | -8.1% change | Pre: 1.1 (n=64) | Post: 1.0 (n=16) | p=1.000 (corrected) | d=0.29 (small)
DAILY STEPS | Normal | +11.5% change | Pre: 2390.7 (n=67) | Post: 2665.2 (n=19) | p=1.000 (corrected) | d=-0.18 (negligible)

Metric | Pre-Acute (< 2026-02-09) | Post-Acute / Pre-Rux (2026-02-09 - 2026-03-16) | Post-Rux (≥ 2026-03-16)
HRV (RMSSD) | 7.6 (n=30) | 10.4 (n=34) | 10.6 (n=16)
Lowest Heart Rate | 79.6 (n=30) | 74.1 (n=34) | 72.4 (n=16)
Average Heart Rate | 88.7 (n=30) | 82.0 (n=34) | 80.7 (n=16)
Sleep Efficiency | 78.1 (n=30) | 79.1 (n=34) | 80.8 (n=16)
Deep Sleep | 1.2 (n=30) | 1.0 (n=34) | 1.0 (n=16)
Daily Steps | 2473.1 (n=32) | 2315.3 (n=35) | 2665.2 (n=19)

Mitchell: Discovered Changepoints

Date | Score | Confidence | Methods | Metrics
2022-01-23 | 9 | HIGH | cusum, pelt, rolling_window | deep_sleep_hours, efficiency, hr_average, hr_lowest, hrv_average, steps
2021-05-31 | 8 | HIGH | cusum, rolling_window | deep_sleep_hours, efficiency, hr_average, hr_lowest, hrv_average, steps
2021-10-29 | 8 | HIGH | cusum, rolling_window | deep_sleep_hours, efficiency, hr_average, hr_lowest, hrv_average, steps
2022-05-06 | 8 | HIGH | cusum, rolling_window | deep_sleep_hours, efficiency, hr_average, hr_lowest, hrv_average, steps
2023-07-06 | 8 | HIGH | cusum, rolling_window | deep_sleep_hours, efficiency, hr_average, hr_lowest, hrv_average, steps
2023-12-30 | 8 | HIGH | cusum, rolling_window | deep_sleep_hours, efficiency, hr_average, hr_lowest, hrv_average, steps
2022-08-08 | 7 | HIGH | cusum, rolling_window | deep_sleep_hours, efficiency, hr_average, hr_lowest, steps
2024-06-03 | 6 | HIGH | cusum, rolling_window | hr_average, hr_lowest, hrv_average, steps
2025-02-03 | 6 | HIGH | cusum, rolling_window | efficiency, hr_average, hr_lowest, hrv_average
2024-01-22 | 5 | HIGH | cusum, rolling_window | hr_average, hr_lowest, hrv_average
2025-02-11 | 5 | HIGH | cusum, rolling_window | efficiency, hr_average, hrv_average
2025-03-25 | 5 | HIGH | cusum, rolling_window | efficiency, hr_average, hrv_average
2021-02-20 | 2 | low | cusum | steps
2025-12-02 | 2 | low | cusum | hrv_average

Comparative Distributions

Multi-Metric Convergence

Henrik: 3 days with 3+ metrics deviating beyond 1.5 SD (systemic shift events).

Mitchell: 28 days with 3+ metrics deviating beyond 1.5 SD (systemic shift events).
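
The convergence count above can be sketched as follows. This is a minimal illustration, assuming each metric is z-scored against its own full-series mean and sample standard deviation; the report does not state which baseline it uses, and the function and parameter names are hypothetical.

```python
import math
import statistics


def convergence_days(metrics, z_threshold=1.5, min_metrics=3):
    """Return indices of days on which at least `min_metrics` metrics
    deviate more than `z_threshold` SDs from their own mean.

    `metrics` maps a metric name to a list of daily values (all lists the
    same length, NaN for missing days). The full-series baseline is an
    assumption, not something the report specifies.
    """
    n_days = len(next(iter(metrics.values())))
    counts = [0] * n_days
    for values in metrics.values():
        observed = [v for v in values if not math.isnan(v)]
        mean = statistics.fmean(observed)
        sd = statistics.stdev(observed)
        if sd == 0:
            continue  # a constant metric can never deviate
        for day, v in enumerate(values):
            if not math.isnan(v) and abs(v - mean) / sd > z_threshold:
                counts[day] += 1
    return [day for day, c in enumerate(counts) if c >= min_metrics]
```

A day counts as a systemic shift event only when several metrics move together, which is why a single noisy metric does not trigger it.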

Methods Appendix

Changepoint Detection Methods

PELT (Pruned Exact Linear Time): Uses the ruptures library with an RBF kernel cost to find the optimal set of changepoints. Missing values (NaN) are interpolated and each signal is standardized before fitting. The penalty is BIC-derived: 2 * log(n) * variance.
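
The PELT preprocessing and penalty can be sketched roughly as below. This is an illustration under stated assumptions, not the report's code: linear interpolation, z-scoring with the population SD, and the helper names are all hypothetical. The `ruptures` and `numpy` imports are kept inside `detect_pelt` so the helpers run without them.

```python
import math
import statistics


def interpolate_nan(values):
    """Linearly interpolate NaN gaps; edge gaps take the nearest valid value.
    (Linear interpolation is an assumption; the report only says 'interpolated'.)"""
    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not valid:
        raise ValueError("series has no valid observations")
    filled = list(values)
    for i, v in enumerate(values):
        if not math.isnan(v):
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def standardize(values):
    """Z-score a series using the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def bic_penalty(values):
    """Penalty from the report's formula: 2 * log(n) * variance."""
    return 2 * math.log(len(values)) * statistics.pvariance(values)


def detect_pelt(values):
    """Run PELT with an RBF cost on the preprocessed series."""
    import numpy as np
    import ruptures as rpt

    z = standardize(interpolate_nan(values))
    signal = np.asarray(z).reshape(-1, 1)
    breakpoints = rpt.Pelt(model="rbf").fit(signal).predict(pen=bic_penalty(z))
    return breakpoints[:-1]  # ruptures appends len(signal) as the last entry
```

If the variance is taken on the standardized signal (an assumption), it equals 1 and the penalty reduces to 2 * log(n), which is about 8.76 for a series of 80 nights.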

CUSUM (Cumulative Sum): Computes the cumulative sum of deviations from the overall mean. Sign changes in the second derivative of that sum identify inflection points, which are then filtered by a magnitude threshold of 0.5 SD.

BOCPD (Bayesian Online Change Point Detection): Implements Adams & MacKay (2007) with a Normal-Gamma conjugate prior. The hazard rate is set to 1/30 for Henrik (shorter observation window) and 1/50 for Mitchell (longer data span). Changepoints are flagged where the posterior probability exceeds 0.3.
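
A compact sketch of the BOCPD recursion follows, using only the standard library. Several details are assumptions, since the report does not give them: the prior parameters (`mu0`, `kappa0`, `alpha0`, `beta0`), and above all the reading of the 0.3 threshold. With a constant hazard, the posterior probability of run length 0 always equals the hazard itself, so this sketch instead flags a time step when the probability that the current observation started a new run (run length 1) exceeds the threshold.

```python
import math


def student_t_logpdf(x, df, loc, scale):
    """Log density of a location-scale Student-t distribution."""
    z = (x - loc) / scale
    return (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
            - 0.5 * math.log(df * math.pi) - math.log(scale)
            - (df + 1) / 2 * math.log1p(z * z / df))


def bocpd(signal, hazard=1 / 30, threshold=0.3,
          mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Return (flagged indices, final run-length distribution).

    Prior parameters and the run-length-1 flagging rule are assumptions.
    """
    prior = (mu0, kappa0, alpha0, beta0)
    run_probs = [1.0]   # P(run length = r), starting with r = 0
    params = [prior]    # Normal-Gamma parameters for each run length
    flagged = []
    for t, x in enumerate(signal):
        # Posterior predictive of x under each run length (Student-t).
        pred = []
        for mu, kappa, alpha, beta in params:
            scale = math.sqrt(beta * (kappa + 1) / (alpha * kappa))
            pred.append(math.exp(student_t_logpdf(x, 2 * alpha, mu, scale)))
        joint = [p * q for p, q in zip(run_probs, pred)]
        growth = [j * (1 - hazard) for j in joint]  # the run continues
        changepoint = sum(joint) * hazard           # the run resets to 0
        unnormalized = [changepoint] + growth
        total = sum(unnormalized)
        run_probs = [p / total for p in unnormalized]
        # Conjugate update of every run's parameters, plus a fresh prior.
        params = [prior] + [
            ((kappa * mu + x) / (kappa + 1), kappa + 1, alpha + 0.5,
             beta + kappa * (x - mu) ** 2 / (2 * (kappa + 1)))
            for mu, kappa, alpha, beta in params
        ]
        # Skip t = 0: the first observation trivially starts a run.
        if t > 0 and run_probs[1] > threshold:
            flagged.append(t)
    return flagged, run_probs
```

A lower hazard such as 1/50 encodes a prior expectation of longer runs between changes, which matches the report's choice for the longer series.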

Rolling Window Comparison: Adjacent 14-day windows compared via Welch's t-test and Cohen's d. Dates flagged where p < 0.01 AND |d| > 0.5, indicating both statistical significance and practical effect size.

Statistical Tests

Pre/Post Comparison: Mann-Whitney U test (non-parametric, two-sided) with Bonferroni correction for 6 simultaneous comparisons. Effect size: Cohen's d with pooled standard deviation. Confidence intervals: bootstrap with 1,000 iterations.
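
The effect-size and correction steps can be illustrated with the standard library alone. This sketch makes several assumptions: raw p-values come from a Mann-Whitney U test computed elsewhere, d is signed as pre minus post (inferred from the reported signs, where a rise in HRV gives a negative d), the labels use the conventional 0.2 / 0.5 / 0.8 cutoffs (consistent with the labels shown in the report), and the bootstrap interval is a percentile interval on the difference in means. The report does not say which statistic is bootstrapped, and all function names are hypothetical.

```python
import math
import random
import statistics


def cohens_d(pre, post):
    """Cohen's d with pooled standard deviation, signed as pre minus post."""
    n1, n2 = len(pre), len(post)
    pooled_var = ((n1 - 1) * statistics.variance(pre)
                  + (n2 - 1) * statistics.variance(post)) / (n1 + n2 - 2)
    return (statistics.fmean(pre) - statistics.fmean(post)) / math.sqrt(pooled_var)


def effect_label(d):
    """Conventional magnitude labels for |d| (assumed cutoffs)."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def bootstrap_ci(pre, post, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap CI for mean(post) - mean(pre)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        pre_sample = rng.choices(pre, k=len(pre))
        post_sample = rng.choices(post, k=len(post))
        diffs.append(statistics.fmean(post_sample) - statistics.fmean(pre_sample))
    diffs.sort()
    lower = diffs[int((1 - level) / 2 * n_boot)]
    upper = diffs[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper
```

With 6 comparisons, a corrected p of 0.030 corresponds to a raw p of 0.005, and any raw p above about 0.167 is reported as 1.000 after capping.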

Consensus Scoring: For Mitchell, all (method × metric) detections are clustered within a 3-day tolerance window. The consensus score is the number of unique methods plus the number of unique metrics detecting each cluster (for example, 3 methods + 6 metrics = 9 for 2022-01-23). High confidence = score >= 3.
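
One way to sketch the consensus scoring is shown below. The clustering rule is an assumption: a detection joins the current cluster when it falls within 3 days of that cluster's most recent detection, and the cluster is reported under its earliest date. The report does not specify either choice, and the function and field names are hypothetical.

```python
from datetime import date


def consensus_clusters(detections, tolerance_days=3, high_threshold=3):
    """Cluster (date, method, metric) detections and score each cluster.

    Score = unique methods + unique metrics in the cluster.
    The chaining rule and the use of the first date are assumptions.
    """
    clusters = []
    for day, method, metric in sorted(detections):
        if clusters and (day - clusters[-1]["last"]).days <= tolerance_days:
            cluster = clusters[-1]
        else:
            cluster = {"start": day, "last": day,
                       "methods": set(), "metrics": set()}
            clusters.append(cluster)
        cluster["last"] = day
        cluster["methods"].add(method)
        cluster["metrics"].add(metric)

    results = []
    for cluster in clusters:
        score = len(cluster["methods"]) + len(cluster["metrics"])
        results.append({
            "date": cluster["start"],
            "score": score,
            "confidence": "HIGH" if score >= high_threshold else "low",
            "methods": sorted(cluster["methods"]),
            "metrics": sorted(cluster["metrics"]),
        })
    return results
```

Because a lone detection already scores 2 (one method plus one metric), the threshold of 3 means a cluster needs agreement from at least a second method or a second metric to count as high confidence.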