Data Science in the Health Care Industry: Unintended Consequences of Online Ratings Informing Health Care Decisions
The Ubiquity of Online Ratings
In a 2016 study, the Pew Research Center found that 84% of US adults use online ratings sites to inform their product or service purchase decisions. The same is true for health care: patients increasingly access online ratings sites to inform their health care decisions, with online ratings emerging as the most influential factor in choosing a physician. In a 2017 study by the National Institutes of Health (NIH), 53% of physicians and 39% of patients reported visiting a health care ratings website at least once. In addition, a recent study of 600 randomly selected physicians in the United States revealed that 66% had at least one rating across the most popular ratings sites: Healthgrades.com, Vitals.com and RateMDs.com. But perhaps the most striking statistic comes from a survey of 1,000 outpatients at the Mayo Clinic in Rochester, MN, where 75% of patients said they would choose a physician, and 88% would avoid one, based on ratings data alone. Payers and health systems are also now including consumer ratings in their patient portals, which provides tacit endorsement of the ratings' validity for comparing doctors.
But is more data always better?
The increasing ubiquity of online ratings for physicians provides patients, insurers and health systems with new data to inform health care decision making. However, data without proper context can negatively affect these decisions and may even have downstream effects on health policy. As a result, this seeming treasure trove of new data may in fact contribute to more poorly informed decisions, for at least three reasons.
What is a “good” rating?
The first issue with physician ratings concerns the interpretation of the numeric values. The most popular online consumer ratings websites use a 5-star Likert-type scale to rate providers, typically reporting the average score. While consumers may assume that higher scores (i.e., 4s and 5s) indicate above-average performance, this is often not the case because the ratings are almost never normally distributed: most online ratings follow a "J" shape, with a small proportion of 1s, very few 2s and 3s, and a substantial proportion of 4s and 5s. One study highlights that the percentile rank associated with a given star rating can differ drastically depending on how scores are distributed, making the average value less meaningful. As a result, metrics such as the ratio of 5-star to 1-star ratings, stratified by specialty and years of practice, are substantively more informative than the commonly reported average. However, this important issue of interpretation is frequently lost on the general public.
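To make the percentile-rank point concrete, here is a minimal sketch in Python. The simulated population below is an illustrative assumption (made-up proportions, not real ratings data), but it captures the "J" shape described above, and it shows how a seemingly strong 4.0 average can sit near the bottom of the distribution.

```python
import bisect

# Simulated average star ratings for 1,000 physicians, drawn from an
# assumed J-shaped mix: a spike of 1s, few 2s and 3s, mostly 4s and 5s.
# These proportions are illustrative, not measured from any real site.
ratings = [1.0] * 50 + [2.0] * 30 + [3.0] * 70 + [4.0] * 350 + [5.0] * 500

def percentile_rank(score, population):
    """Fraction of the population strictly below `score`."""
    pop = sorted(population)
    return bisect.bisect_left(pop, score) / len(pop)

mean = sum(ratings) / len(ratings)
print(f"mean rating: {mean:.2f}")                              # 4.22
print(f"percentile of a 4.0: {percentile_rank(4.0, ratings):.0%}")  # 15%
print(f"percentile of a 4.5: {percentile_rank(4.5, ratings):.0%}")  # 50%
```

In this hypothetical population the overall mean is 4.22, so a physician averaging 4.0 stars, which a consumer might read as "above average," actually outranks only 15% of peers. This is why percentile-based or ratio-based metrics communicate more than the raw average.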
What is really being rated?
The second issue concerns what the patient is actually rating. While the most common physician ratings sites allow patients to rate the physician, the clinical outcome and the staff/front-office experience separately, most studies continue to find no correlation between the quality of the medical outcome and the patient's rating of the physician. For instance, Healthgrades.com asks reviewers to consider eight criteria, including the ease of scheduling urgent appointments, office comfort, staff friendliness, wait time and whether the doctor spent "an appropriate amount of time" with the patient. None of the questions focuses on the quality of care (that is, a physician's diagnostic skill, experience or success rates), nor on value, unnecessary care or overly costly care. One study found that for "office-centered, non-surgical" specialties requiring substantial patient-doctor interaction (e.g., pediatricians, allergists, dermatologists), patients were better equipped to provide "accurate" physician ratings than for "non-office, surgical" specialties with less patient-doctor interaction (e.g., anesthesiologists, cardiologists). This finding raises a deeper question: are patients even capable of rating the quality of the care they receive?
On the Internet No One Knows You’re a Dog
In 1993, in the earliest days of the Internet, the New Yorker published a cartoon that has become the most reproduced in the publication's history. It depicts two dogs at a computer, one explaining to the other, "On the Internet, nobody knows you're a dog." The cartoon has become a light-hearted meme that highlights a more troubling issue: fake reviews. "Astroturfing" is the practice of physicians (or other service providers) planting fake positive reviews on ratings sites, and some state regulators have begun formally penalizing it, particularly among physicians. One curious data scientist spent 95 hours scraping data on approximately 1.2 million doctors from the ratings site ZocDoc. Among his many interesting findings, he asserted that the heavy skew toward positive reviews, which did not even show the typical "J" distribution, was likely a function of bias on the site. ZocDoc removes reviews that contain profanity, pricing specifics or claims about the accuracy of a treatment or diagnosis, a policy inherently biased against negative reviews (i.e., if a review includes a patient swearing about an overpriced doctor who misdiagnosed them, it never makes it onto the site). He also opined that ZocDoc or the doctors themselves were artificially inflating positive reviews with fakes, that is, astroturfing.
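One simple heuristic a data scientist might use to flag the kind of skew described above is to compare a site's observed star-count distribution against the J shape typical of organic reviews, using a chi-square goodness-of-fit statistic. In the sketch below, the "expected" J-shape proportions and both sample distributions are illustrative assumptions, not measurements from ZocDoc or any real site.

```python
# Assumed shape of organic review distributions (illustrative, not measured):
# a small negative tail, few middling scores, mostly 4s and 5s.
J_SHAPE = {1: 0.10, 2: 0.04, 3: 0.06, 4: 0.25, 5: 0.55}

def chi_square_stat(observed_counts):
    """Chi-square statistic comparing observed star counts to the assumed J shape."""
    n = sum(observed_counts.values())
    stat = 0.0
    for star, p in J_SHAPE.items():
        expected = n * p
        stat += (observed_counts.get(star, 0) - expected) ** 2 / expected
    return stat

organic = {1: 95, 2: 45, 3: 60, 4: 260, 5: 540}   # roughly J-shaped
suspicious = {1: 2, 2: 1, 3: 7, 4: 90, 5: 900}    # negative tail nearly absent

print(f"organic fit:    {chi_square_stat(organic):.1f}")     # small statistic
print(f"suspicious fit: {chi_square_stat(suspicious):.1f}")  # large statistic
```

A large statistic does not prove astroturfing; it only signals that the distribution departs sharply from what organic reviewing tends to produce, which could equally reflect moderation policies like the filtering described above. In practice one would pair a test like this with text-based fake-review detection.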
Online ratings sites are a prime example of how data scientists have used crowdsourcing platforms to aggregate massive amounts of numeric and text-based data with the intention of improving decision making. However, as is so often the case, the ability to capture and disseminate data has outpaced the policies governing how, and in what context, it should be disseminated. As a result, substantive concerns remain about the validity and veracity of online physician ratings.
The next phase for data scientists in this space will involve improved detection of fake reviews, better metrics for communicating what constitutes a "good" rating and, ultimately, tools that help consumers of health care data make better choices.
Jennifer Lewis Priestley, Ph.D. is the Associate Dean of The Graduate College at Kennesaw State University. She is director of the Analytics and Data Science Institute and launched one of the first Ph.D. programs in Data Science in the country.