Wearable Heart Rate Accuracy Measures Vary by Product

February 12, 2020

Understanding the accuracy of wearables can help inform better medical decisions, investigators say.

Jessilyn Dunn, PhD

New findings show the accuracy of a patient’s heart rate measured by a wearable device can vary, depending on the wearer's everyday activities.

A new study from Duke University investigators highlighted the need to better understand the accuracy of wearable devices and to determine how measurement errors could affect research conclusions and inform medical decisions.

The team, led by Jessilyn Dunn, PhD, assistant professor of biomedical engineering at Duke, started the study after seeing evidence that wearable devices were not working as well for individuals with darker skin tones.

Because the companies that manufacture such devices don’t publish metrics on how well the technologies work across different skin tones, the team wanted to “collect evidence about how well they work and identify potential circumstances where they may not work well,” Dunn said in a statement.

Dunn and colleagues tested 6 wearables on a group of 53 individuals spanning the 6 Fitzpatrick (FP) skin tone categories (FP1: 7; FP2: 8; FP3: 10; FP4: 9; FP5: 9; FP6: 10 participants). Four were commercial devices (the Apple Watch 4, Fitbit Charge 2, Garmin Vivosmart 3, and Xiaomi Miband 3) and 2 were research devices (the Empatica E4 and the Biovotion). To establish an accurate baseline, each participant wore an electrocardiogram patch to measure their true heart rate during each activity.

The factors recorded for each participant included skin tone, body fat percentage, weight, height, waist circumference, and sun exposure habits.

The study included 3 phases, each consisting of a baseline, deep breathing, activity, washout, and a typing task. For baseline, participants sat in a comfortable position for 4 minutes. They then immediately completed a deep-breathing exercise in which they breathed in sync with a 1-minute video. After the deep breathing, participants did a 5-minute walking activity. Heart rate was monitored to ensure each participant reached 50% of their maximum heart rate and did not exceed their maximum.

A 2-minute washout period preceded the typing task to ensure the heart rate returned to baseline. Participants then typed on a computer keyboard for 1 minute before switching devices to begin the next phase.

During phase 1, the participant wore the Empatica E4 on the right wrist and the Apple Watch 4 on the left wrist. In phase 2, the Fitbit Charge 2 was on the left wrist, while during phase 3, the Garmin Vivosmart 3 was on the right wrist and the Xiaomi Miband 3 was on the left wrist. The Biovotion was worn on the upper right arm for all 3 phases.

At rest, individuals with FP5 skin tone had the largest mean directional error across all devices and those with FP1 had the lowest (−4.25 bpm and −0.53 bpm, respectively). The darkest skin tone, FP6, had the highest mean absolute error at rest and the second darkest tone, FP5, had the lowest (10.6 bpm and 8.6 bpm, respectively). The average directional and absolute errors across all groups at rest were −2.99 bpm and 9.5 bpm, respectively.

During activity, FP5 had the highest mean directional error and FP3 had the lowest (9.21 bpm and 7.21 bpm). FP4 had the highest absolute error and FP3 had the lowest (14.8 bpm and 10.1 bpm).
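The two error measures reported above answer different questions: directional error shows whether a device tends to read high or low, while absolute error shows how far off it is regardless of direction. The following is a minimal sketch of how they could be computed, not the investigators' actual analysis code. It assumes error is defined as the device reading minus the electrocardiogram reference, and the function names and sample readings are hypothetical.

```python
from statistics import mean


def directional_error(device_bpm, ecg_bpm):
    """Mean signed error in bpm (assumed convention: device minus ECG).

    Positive values mean the device reads high on average;
    negative values mean it reads low.
    """
    return mean(d - e for d, e in zip(device_bpm, ecg_bpm))


def absolute_error(device_bpm, ecg_bpm):
    """Mean unsigned error in bpm, ignoring the direction of each miss."""
    return mean(abs(d - e) for d, e in zip(device_bpm, ecg_bpm))


# Hypothetical paired readings, for illustration only (not study data).
ecg = [70, 72, 75, 80]
device = [68, 75, 71, 82]

print(directional_error(device, ecg))  # -0.25
print(absolute_error(device, ecg))     # 2.75
```

In this made-up example the over- and under-readings nearly cancel, so the directional error is close to zero even though the device misses by almost 3 bpm on average. That is why the two figures in the study can rank skin tone groups differently.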

Skin tone was not the driver of directional or absolute error, the investigators reported.

The highest mean directional error occurred in FP5 and/or FP6 during activity in all of the devices except the Xiaomi Miband 3.

Consumer-grade wearables were more accurate than research-grade wearables at rest. During rest, the absolute error of consumer devices averaged 7.2 ± 5.4 bpm, compared with 13.9 ± 7.8 bpm for research-grade wearables (P < .0125). During physical activity, the absolute error was 10.2 ± 7.5 bpm for consumer wearables and 15.9 ± 8.1 bpm for research wearables (P < .0125).

Overall, walking tended to cause reported heart rate to be higher than true heart rate, while typing caused the reported heart rate to be lower than the true heart rate. The accuracy was better when the participants were at rest rather than performing a physical activity.

The study, “Investigating sources of inaccuracy in wearable optical heart rate sensors,” was published online in NPJ Digital Medicine.