The idea
I kept catching myself forty minutes into a YouTube rabbit hole with no memory of how I got there. Open a tab to look up a function signature, somehow end up reading about the history of the Suez Canal. Classic.
I wanted something that could notice before I did. Most focus tools either rely on you to report your own state — which fails exactly when you need it most — or require a wearable. I'm a college student; I'm not strapping on a chest monitor to do a problem set. But every laptop already has a webcam, and it turns out the webcam can see something your eyes can't.
Remote photoplethysmography
The technique is rPPG. Each heartbeat pushes blood through the capillaries near the surface of your skin, causing tiny color fluctuations — invisible to the naked eye, but recoverable from video if you process the frames carefully. From those fluctuations you can pull heart rate and heart rate variability in real time, no contact required.
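To make "heart rate and heart rate variability" concrete, here is a minimal sketch of how a list of detected beat timestamps could be turned into both numbers. The function names are hypothetical, and RMSSD is just one common HRV statistic that I am assuming for illustration; the project may use a different one.

```typescript
// Sketch only: names are hypothetical, and RMSSD is an assumed HRV metric.

// Convert beat timestamps (milliseconds) into inter-beat intervals (IBIs).
function interBeatIntervals(beatTimesMs: number[]): number[] {
  const ibis: number[] = [];
  for (let i = 1; i < beatTimesMs.length; i++) {
    ibis.push(beatTimesMs[i] - beatTimesMs[i - 1]);
  }
  return ibis;
}

// Mean heart rate in beats per minute: 60,000 ms divided by the mean IBI.
function heartRateBpm(ibisMs: number[]): number {
  if (ibisMs.length === 0) return NaN;
  let sum = 0;
  for (const ibi of ibisMs) sum += ibi;
  return 60000 / (sum / ibisMs.length);
}

// RMSSD: root mean square of successive IBI differences, in milliseconds.
function rmssd(ibisMs: number[]): number {
  if (ibisMs.length < 2) return NaN;
  let sumSq = 0;
  for (let i = 1; i < ibisMs.length; i++) {
    const d = ibisMs[i] - ibisMs[i - 1];
    sumSq += d * d;
  }
  return Math.sqrt(sumSq / (ibisMs.length - 1));
}
```

Perfectly regular beats give an RMSSD of zero; the more the gap between beats varies from one beat to the next, the higher it climbs.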
The pipeline runs at 30 FPS inside a Chrome extension: face detection, ROI extraction (forehead and cheeks give the cleanest signal), color-channel decomposition, a 0.7–4 Hz bandpass filter, peak detection. Out the other end: a continuous stream of instantaneous HR and HRV.
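The last two stages of that pipeline can be sketched roughly as follows. This is not the extension's actual filter: I am assuming a crude bandpass built from a first-order high-pass and a first-order low-pass in series, and a simple local-maximum peak detector with a refractory gap. The function names, the zero threshold, and the input (one averaged ROI color value per frame) are all assumptions.

```typescript
// Sketch only: a first-order high-pass followed by a first-order low-pass,
// standing in for whatever 0.7-4 Hz bandpass the real pipeline uses.
function bandpass(signal: Float64Array, fps: number, lowHz = 0.7, highHz = 4.0): Float64Array {
  const dt = 1 / fps;
  const rcHp = 1 / (2 * Math.PI * lowHz);  // time constant of the high-pass corner
  const rcLp = 1 / (2 * Math.PI * highHz); // time constant of the low-pass corner
  const aHp = rcHp / (rcHp + dt);
  const aLp = dt / (rcLp + dt);
  const out = new Float64Array(signal.length);
  if (signal.length === 0) return out;
  let prevX = signal[0]; // start from the first sample to avoid a DC step transient
  let hp = 0;
  let lp = 0;
  for (let i = 0; i < signal.length; i++) {
    hp = aHp * (hp + signal[i] - prevX); // removes slow drift and the DC level
    prevX = signal[i];
    lp = lp + aLp * (hp - lp);           // removes high-frequency noise
    out[i] = lp;
  }
  return out;
}

// Local maxima above `threshold`, at least `minDistance` samples apart.
// When two candidates are too close, the taller one is kept.
function findPeaks(x: Float64Array, minDistance: number, threshold = 0): number[] {
  const peaks: number[] = [];
  for (let i = 1; i < x.length - 1; i++) {
    const isLocalMax = x[i] > x[i - 1] && x[i] >= x[i + 1];
    if (!isLocalMax || x[i] <= threshold) continue;
    const last = peaks.length > 0 ? peaks[peaks.length - 1] : -1;
    if (last >= 0 && i - last < minDistance) {
      if (x[i] > x[last]) peaks[peaks.length - 1] = i;
      continue;
    }
    peaks.push(i);
  }
  return peaks;
}

// Mean heart rate from peak indices: beats counted over the span they cover.
function estimateBpm(peaks: number[], fps: number): number {
  if (peaks.length < 2) return NaN;
  const spanSeconds = (peaks[peaks.length - 1] - peaks[0]) / fps;
  return (60 * (peaks.length - 1)) / spanSeconds;
}
```

At 30 FPS, a `minDistance` of `Math.floor(30 / 4)` samples matches the 4 Hz upper edge of the passband, since two real beats cannot land closer together than that.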
Where I actually spent my time
Not on the algorithm. On noise.
A cloud passes and the ambient light shifts. Your head drifts two centimeters and the ROI loses the forehead. Webcam auto-exposure kicks in and wipes out a full second of data. The browser's media pipeline injects compression artifacts as high-frequency garbage. Each of these corrupts the estimate, and they all happen constantly.
So I built a confidence-scoring layer that weights each frame by how stable the face detection and lighting are, and down-weights shaky frames in peak detection instead of discarding them outright. It smoothed things out — but the signal still falls apart in dim light. The bottleneck isn't compute. It's the camera.
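A minimal sketch of what per-frame confidence weighting could look like. The quality measures, the exponential falloff, and the scale constants are all my own assumptions for illustration, not the scoring the project actually uses. The point is only that a shaky frame gets a smaller weight rather than being dropped.

```typescript
// Sketch only: the fields, formula, and scale constants are hypothetical.
interface FrameQuality {
  roiShiftPx: number;      // how far the face ROI moved since the previous frame
  brightnessDelta: number; // change in mean ROI brightness since the previous frame
}

// Confidence in (0, 1]: 1 for a perfectly steady frame, decaying smoothly
// as motion or lighting change grows. It never reaches zero, so no frame
// is discarded outright.
function frameConfidence(q: FrameQuality, shiftScalePx = 5, brightnessScale = 10): number {
  const motion = Math.exp(-Math.abs(q.roiShiftPx) / shiftScalePx);
  const lighting = Math.exp(-Math.abs(q.brightnessDelta) / brightnessScale);
  return motion * lighting;
}

// Scale each filtered sample by its frame's confidence before peak detection,
// so peaks that fall on shaky frames are less likely to clear the threshold.
function weightSamples(filtered: Float64Array, quality: FrameQuality[]): Float64Array {
  if (filtered.length !== quality.length) {
    throw new Error("need exactly one quality record per sample");
  }
  const out = new Float64Array(filtered.length);
  for (let i = 0; i < filtered.length; i++) {
    out[i] = filtered[i] * frameConfidence(quality[i]);
  }
  return out;
}
```

With these assumed scales, a frame where the ROI jumps 5 pixels under steady light keeps about 37% of its weight.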
Choosing interventions
Detecting distraction was the easy half. What to do about it was the project.
My first version was blunt: when HRV dropped below a threshold, a banner slid down over the page: "Focus check: take a breath." It once fired while I was mid-chase on a null-pointer bug, which is precisely when a breathing prompt is least welcome. I turned it off within a day.
The problem is contextual. A breathing prompt fits a reading assignment and grates during a coding sprint. "Take a break" is wise at the 90-minute mark and patronizing at fifteen. The right move depends on what you're doing, when, and how you've reacted to past nudges.
So I used a LinUCB contextual bandit to learn that mapping per user. The context vector carries current HRV, time in session, time of day, and a rolling history of recent outcomes; the reward blends physiological recovery (does HRV bounce back after a nudge?) with explicit thumbs up/down. LinUCB balances exploration and exploitation on its own, which matters because you can't A/B test focus interventions on a population of one.
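For readers who haven't met LinUCB, here is a compact sketch of the disjoint variant: one linear model per intervention, scored by predicted reward plus an uncertainty bonus. The class names, the `alpha` value, and the `blendReward` weighting are assumptions of mine, not the project's actual code. The inverse matrix is kept up to date with the Sherman-Morrison formula so no explicit inversion is needed.

```typescript
// Sketch only: disjoint LinUCB with hypothetical names and parameters.
type Vec = number[];

function dot(a: Vec, b: Vec): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += a[i] * b[i];
  return s;
}

function matVec(m: number[][], v: Vec): Vec {
  const out: Vec = [];
  for (let i = 0; i < m.length; i++) out.push(dot(m[i], v));
  return out;
}

class LinUcbArm {
  aInv: number[][] = []; // inverse of A = I + sum of x x^T over past pulls
  b: Vec = [];           // sum of reward * x over past pulls

  constructor(dim: number) {
    for (let i = 0; i < dim; i++) {
      const row: Vec = [];
      for (let j = 0; j < dim; j++) row.push(i === j ? 1 : 0);
      this.aInv.push(row);
      this.b.push(0);
    }
  }

  // Predicted reward plus an exploration bonus that shrinks with experience.
  score(x: Vec, alpha: number): number {
    const theta = matVec(this.aInv, this.b);
    const aInvX = matVec(this.aInv, x);
    return dot(theta, x) + alpha * Math.sqrt(dot(x, aInvX));
  }

  // Sherman-Morrison rank-one update of A^-1, then accumulate reward.
  update(x: Vec, reward: number): void {
    const aInvX = matVec(this.aInv, x);
    const denom = 1 + dot(x, aInvX);
    for (let i = 0; i < x.length; i++) {
      for (let j = 0; j < x.length; j++) {
        this.aInv[i][j] -= (aInvX[i] * aInvX[j]) / denom;
      }
      this.b[i] += reward * x[i];
    }
  }
}

class LinUcb {
  arms: LinUcbArm[] = [];
  alpha: number;

  constructor(numArms: number, dim: number, alpha = 1.0) {
    this.alpha = alpha;
    for (let k = 0; k < numArms; k++) this.arms.push(new LinUcbArm(dim));
  }

  // Pick the arm with the highest upper confidence bound (ties go to the lowest index).
  select(x: Vec): number {
    let best = 0;
    let bestScore = -Infinity;
    for (let k = 0; k < this.arms.length; k++) {
      const s = this.arms[k].score(x, this.alpha);
      if (s > bestScore) { bestScore = s; best = k; }
    }
    return best;
  }

  update(arm: number, x: Vec, reward: number): void {
    this.arms[arm].update(x, reward);
  }
}

// Hypothetical reward blend: HRV recovery clamped to [0, 1], mixed with
// thumbs up (1) or down (-1); with no feedback (0), physiology alone counts.
function blendReward(hrvRecovery: number, thumbs: -1 | 0 | 1, weight = 0.5): number {
  const physio = Math.max(0, Math.min(1, hrvRecovery));
  if (thumbs === 0) return physio;
  return weight * physio + (1 - weight) * (thumbs === 1 ? 1 : 0);
}
```

Each arm here would be one intervention (a breathing prompt, a break suggestion, doing nothing), and `x` the context vector described above. Larger `alpha` means more willingness to try an intervention the model is still unsure about.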
What I took away
Real-time signal processing in the browser is more viable than I expected. Typed arrays and a tight render loop leave enough headroom to monitor a physiological signal continuously without the page so much as stuttering. The ceiling is hardware: a MacBook webcam is clean enough for reliable rPPG; most external webcams are not.
But the part that stuck with me was the other half. Measuring attention is a signal-processing problem, and signal-processing problems yield to effort. Earning the right to interrupt attention — without becoming the exact thing I'd resent and switch off — does not. You can build something that notices you've drifted in an afternoon. Building something you won't disable by Tuesday is the hard part, and no amount of model tuning gets you there.