Confessions of a Sleep Heretic
During my site-visiting years, I spent a considerable amount of time on planes reading accreditation applications. One thing that always puzzled me was that many centers included high and low ranges for sleep stages as a percentage of total sleep time. Even though my site visiting peaked in 2010, I think these numbers usually came from the 1974 opus by Williams, Karacan and Hirsch, which appears to be out of print. Patients with inadequate Stage 3 or excessive REM were branded as abnormal. But abnormal how?
We have long been focused on the details of sleep. For example, we have rules for discriminating N2 from REM. I know. I spent years running the AASM Inter-scorer Reliability program and debating with experts whether a squiggle represented a spindle or muscle artifact. I’m not going to mention Dr. Chokroverty, Dr. Zacheck or Claude Albertario by name, but you know who you are. Good times. I’ve talked to people who have developed automated scoring techniques that score as well as humans. The field has not accepted this as an adequate substitute. But humans are not very good at measuring amplitude and frequency, the key features of the EEG that lead to sleep stage discrimination. Automated systems can do this much more accurately and apply algorithms to determine sleep stages. If we want precision and reliability, we should use automated scoring systems instead of fallible humans.
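To make the amplitude-and-frequency point concrete, here is a minimal sketch of how a program might measure both on a short stretch of signal. Everything in it is an assumption for illustration: the function names, the 256 Hz sampling rate and the synthetic 13 Hz, 25 µV sine wave standing in for a spindle-like burst are all made up, and a real scoring system would do far more (filtering, artifact rejection, spectral analysis).

```python
import math

def peak_to_peak(samples):
    """Peak-to-peak amplitude, in the same units as the samples."""
    return max(samples) - min(samples)

def zero_crossing_frequency(samples, sample_rate_hz):
    """Estimate the dominant frequency (Hz) by counting sign changes around the mean."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    crossings = sum(
        1 for a, b in zip(centered, centered[1:]) if (a < 0) != (b < 0)
    )
    duration_s = (len(samples) - 1) / sample_rate_hz
    # A full cycle of a sine-like wave crosses its mean twice.
    return crossings / (2 * duration_s)

# Hypothetical input: one second of a synthetic 13 Hz, 25 uV sine wave.
SAMPLE_RATE_HZ = 256  # assumed sampling rate, not taken from any standard
burst = [
    25.0 * math.sin(2 * math.pi * 13 * n / SAMPLE_RATE_HZ + 0.5)
    for n in range(SAMPLE_RATE_HZ)
]

amplitude_uv = peak_to_peak(burst)
frequency_hz = zero_crossing_frequency(burst, SAMPLE_RATE_HZ)
print(f"amplitude ~ {amplitude_uv:.1f} uV, frequency ~ {frequency_hz:.1f} Hz")
```

A human scorer eyeballs these two quantities; a program computes them the same way every time, which is the reliability argument in a nutshell.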
I know I’ve buried the lede, but here it is: Over the past few years, I’ve begun to lose faith in our dedication to measuring sleep in such detail. What is the value of endless hours of sleep stage scoring? For the most prevalent sleep disorder, obstructive sleep apnea, we have already abandoned the recording of sleep in favor of home sleep apnea testing without EEG or even, in many cases, without actigraphy as a measure of sleep time. I’ve argued that we need to “put sleep back in sleep apnea” by recording sleep during titration and considering sleep fragmentation in determining optimal sleep treatments. But it’s not really sleep that we are trying to improve.
When you wake up in the morning and say, “That was a great night of sleep,” is it because you can feel that you had some extra N3 or fewer arousals? I don’t think so. I think it’s because you feel awake and refreshed.
Most of us have focused on the sleep half of the sleep/wake dichotomy because we are sleep people. There have been a few exceptions, such as Dr. David Dinges, who has championed the psychomotor vigilance task as a measure of alertness since 1985 [1]. And as I read over my previous blog on the Epworth Sleepiness Scale, I can’t help but note that this is a measure that has remained essentially unchanged since 1991. If, as Drs. Omobomi and Quan suggest [2], we are ready to retire the ESS, what will we use to replace it?
For a sleep disorder that is defined by excessive daytime sleepiness, such as narcolepsy, it is instructive to note that the AASM Quality Measure for EDS reads: “Excessive daytime sleepiness must be measured with a validated scale including, but not limited to, the Epworth Sleepiness Scale, Stanford Sleepiness Scale, Karolinska Sleepiness Scale, Cleveland Adolescent Sleepiness Questionnaire or a Visual Analog Scale.” [3] (p. 337) This is a sad state of affairs. I could draw a horizontal line on a piece of paper and tell a patient, “The left end is very sleepy, and the right end is wide awake; put a vertical line where you are now.” This is a visual analog scale, and it meets the outcome measure requirement in the AASM guideline. And the possibilities are not limited to that. I could use the Rosenberg Cosmic Consciousness Emoji Scale of Excessive Daytime Sleepiness, which starts with the scream emoji and ends with the exploding head. Patent pending.
The AASM also states, “This outcome measure is highly patient centered, given the implications for an individual of excessive daytime sleepiness for functioning and quality of life.” [3] (p. 337) Does this mean that we need a different test for each patient? I think we can do better. I think the construct of alertness can be operationalized, and we have lots of options to test. The PVT and ESS clearly measure aspects of this, and some combination of reaction time, subjective sleepiness, pupillary changes and maybe a biomarker might give us some construct validity that leads to consensus agreement.
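As a thought experiment only, here is one way such a combination could be sketched: convert each measure to a z-score against a reference group, flip the sign so that higher always means more alert, and average. This is not a validated instrument. The `Measure` class, the equal weighting and every number below are hypothetical placeholders chosen to keep the arithmetic easy, not normative values from any study.

```python
from dataclasses import dataclass

@dataclass
class Measure:
    value: float              # the patient's result
    ref_mean: float           # hypothetical reference-group mean
    ref_sd: float             # hypothetical reference-group standard deviation
    higher_is_sleepier: bool  # True if larger values indicate more sleepiness

def z_toward_alert(m: Measure) -> float:
    """Z-score oriented so that positive always means 'more alert than reference'."""
    z = (m.value - m.ref_mean) / m.ref_sd
    return -z if m.higher_is_sleepier else z

def composite_alertness(measures: dict) -> float:
    """Equal-weight average of the oriented z-scores (the weighting is an assumption)."""
    scores = [z_toward_alert(m) for m in measures.values()]
    return sum(scores) / len(scores)

# Made-up patient and made-up reference values, for illustration only.
patient = {
    "pvt_mean_rt_ms": Measure(350.0, 250.0, 50.0, higher_is_sleepier=True),
    "ess_total": Measure(14.0, 6.0, 4.0, higher_is_sleepier=True),
}

score = composite_alertness(patient)
print(f"composite alertness: {score:+.1f} SD relative to reference")
```

Pupillary measures or a biomarker would slot in as additional `Measure` entries. The hard part, of course, is not the arithmetic but agreeing on the reference values and the weights.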
My favorite quote these days is, “Measure what can be measured, and make measurable what cannot be measured.” The quote is attributed to Galileo. After the Roman Inquisition sentenced him to house arrest, he is said to have had severe insomnia, so it might just be that he needed a good night of sleep. On the other hand, I think we’ve focused on measuring sleep in great detail. Now it might be a good time to focus on measuring waking and alertness. If our current measures don’t work, let’s find some new ones. Does this sound heretical? I hope I don’t get subjected to the Spanish Inquisition for this. Because you never expect the Spanish Inquisition.
1. Dinges DF, Powell JW. Microcomputer analyses of performance on a portable, simple visual RT task during sustained operations. Behav Res Methods Instrum Comput. 1985;17(6):652–655.
2. Omobomi O, Quan SF. A requiem for the clinical use of the Epworth Sleepiness Scale. J Clin Sleep Med. 2018;14(5):711–712.
3. Krahn LE, Hershner S, Loeding LD, Maski KP, Rifkin DI, Selim B, Watson NF. Quality measures for the care of patients with narcolepsy. J Clin Sleep Med. 2015;11(3):335–355.