By Christine Kern, contributing writer
Physicians are more than twice as likely as symptom-checker algorithms to make a correct diagnosis, according to a new study.
Human physicians are better at making diagnoses than symptom-checker algorithms, according to the results of a study conducted by researchers from Harvard Medical School, Brigham & Women’s Hospital, and the Human Diagnosis Project. In fact, physicians are more than twice as likely as algorithms to make a correct diagnosis.
These findings may surprise some, as earlier research touted the ability of artificial intelligence to provide accurate diagnoses. An Indiana University study found that applying machine-learning algorithms and simulation modeling to patient data could dramatically improve the quality of healthcare and reduce the associated costs.
The Harvard/Brigham & Women’s/Human Diagnosis Project study examined 45 clinical vignettes to compare the diagnostic accuracy of 23 online or app-based symptom checkers with that of 234 physicians. The results show 72.1 percent of doctors listed the right diagnosis first, versus 34 percent of the algorithms.
The 23 online symptom checkers — some accessed via websites and others available as apps — included those offered by WebMD and the Mayo Clinic in the U.S. and the Isabel Symptom Checker in the U.K. Researchers used a web platform called Human Dx to distribute the vignettes to 234 physicians, who were asked to base their diagnoses on the information provided, without seeing the patient or running additional tests.
“The current symptom checkers, I was not surprised, do not outperform doctors,” said senior author Dr. Ateev Mehrotra of Harvard Medical School in Boston. However, he added, computers and human doctors could both be involved collaboratively in a diagnosis, rather than being pitted against one another.
“In a real-world setting, I could envision MD plus algorithm vs MD alone,” Dr. Andrew M. Fine of Boston Children’s Hospital, who was not part of the new study, told Reuters Health. “The algorithms will rely on a clinician to input physical exam findings in a real-world setting, and so the computer algorithm alone could not go head to head with a clinician.”
Fifteen vignettes described acute conditions, 15 were moderately serious, and 15 required low levels of care. Most described commonly diagnosed conditions, while 19 described uncommon conditions. Doctors submitted their answers as free-text responses, with potential diagnoses ranked in order of likelihood.
“In medical school, we are taught to consider broad differential diagnoses that include rare conditions, and to consider life-threatening diagnoses,” said Fine. “National board exams also assess our abilities to recognize rare and ‘can’t miss’ diagnoses, so perhaps the clinicians have been conditioned to look for these diagnoses.”
“Physicians do get it wrong 10 to 15 percent of the time, so maybe if computers were augmenting them the outcome would be better,” Mehrotra said.