Every week, Eli Gelfand, chief of general cardiology at Beth Israel Deaconess Medical Center in Boston, wastes a lot of time on letters he doesn’t want to write — all of them to insurers disputing his recommendations. A new drug for a heart failure patient. A CAT scan for a patient with chest pain. A new drug for a patient with stiff heart syndrome. “We’re talking about appeal letters for things that are life-saving,” says Gelfand, who is also an assistant professor at Harvard Medical School.
So when OpenAI’s ChatGPT began making headlines for generally coherent artificial intelligence-generated text, Gelfand saw an opportunity to save some time. He fed the bot some basic information about a diagnosis and the medications he’d prescribed (leaving out the patient’s name) and asked it to write an appeal letter with references to scientific papers.
ChatGPT gave him a viable letter — the first of many. And while the references may sometimes be wrong, Gelfand told Forbes the letters require “minimal editing.” Crucially, they have cut his drafting time to about a minute per letter. And they work.
Gelfand has used ChatGPT for some 30 appeal letters, most of which have been approved by insurers, he says. But he’s under no illusion that ChatGPT or the AI that powers it is going to save the U.S. healthcare system anytime soon. “It’s basically making my life a little easier and hopefully getting the patients the medications they need at a higher rate,” Gelfand says. “This is a workaround solution for a problem that shouldn’t really exist.”
That problem: The U.S. spends more money on healthcare administration than any other country. In 2019, around a quarter of the $3.8 trillion spent on healthcare went to administrative issues like the ones bemoaned by Gelfand. An estimated $265 billion of that was “wasteful” — unnecessary spending driven by the antiquated technology that undergirds the U.S. healthcare system.
Gelfand can use a chatbot to electronically generate an appeal letter. But he has to fax it to the insurer. And that encapsulates the challenge facing companies hoping to build time-saving AI back-office tools for a healthcare system stuck in the 1960s.
Cut The “Scut”
The fax machine isn’t going away anytime soon, says Nate Gross, cofounder and chief strategy officer of Doximity, a San Francisco-based social networking platform used by two million doctors and other healthcare professionals in the U.S. That’s why Doximity’s new workflow tool, DocsGPT, a chatbot that helps doctors write a wide range of letters and certificates, is connected to its online faxing tool.
“Our design thesis is to make it as easy as possible for doctors to interface with the novel digital standards, but also be backwards compatible with all the old stuff that healthcare actually runs on,” says Gross.
Often referred to as a “LinkedIn For Doctors,” Doximity has a $6.3 billion market cap and generates most of its revenue ($344 million in its fiscal year 2022) from pharma companies looking to advertise and health systems looking to hire. But it also offers a range of tools for doctors to help “cut through the scut” – medical slang for reducing administrative burden. The basic versions are generally free with upsells for enterprise integrations, says Gross.
DocsGPT is built on ChatGPT but is trained on healthcare data, such as anonymized insurance appeals letters. Doctors can use the tool to draft letters, including patient referrals, insurance appeals, thank you notes to colleagues, post-surgery instructions and even death certificates. It provides a library of curated prompts based on what other doctors have searched for in the past, and is designed to remind doctors who use it that it is not a medical professional.
Before each response DocsGPT generates, a disclaimer runs across the top, asking the user to “PLEASE EDIT FOR ACCURACY BEFORE SENDING.” In an earnings call earlier this month, cofounder and CEO Jeff Tangney was asked how Doximity planned to monetize DocsGPT. “I’ll make a joke here,” he replied. “We probably spent more time worrying about the liability of that product than the monetization of it so far.”
While DocsGPT might save some time for the doctor, the subsequent back and forth with insurance companies over fax and phone means it can still take days to verify a patient’s insurance benefits or get a prior authorization for a surgery approved. Currently, a person in a doctor’s office or hospital staring at a screen needs to call a person at an insurance company who is also staring at a screen to manually sort through the specific details of each patient’s insurance benefits.
That eats up a lot of time for both insurers and doctors, and a shortage of workers isn’t helping. “It’s not just about it being slow, it’s stuff is not getting done,” says Ankit Jain, cofounder and CEO of conversational AI startup Infinitus Systems. “There was [an insurer] we were talking to who had 32 trailers of faxes that they’re backlogged on.”
With Infinitus, which has raised more than $50 million since he cofounded it in 2019, Jain is trying to build a future where instead of people endlessly discussing benefits and approvals, bots do the talking for them.
Jain, a former Googler and cofounder of the tech giant’s AI-focused fund Gradient Ventures, says the problem is that every doctor, insurer and health system records information in different formats. Unlike long-suffering health industry employees, AI can very quickly make sense of it. Infinitus has built its own models and doesn’t rely on OpenAI’s technology, but Jain says the underlying premise is the same: “What large language models do is, they say, ‘Throw all that data at us.’ And the large language models can extract the right connections between phrases and concepts.”
So far, the conversation is one-sided: Infinitus has used large language models to create Eva Lightyear, a robot that has made more than 1 million calls to insurance companies on behalf of doctors to verify insurance benefits and prior authorization requirements. Jain hopes that one day Eva won’t be calling a human on the other end of the phone, but another robot — though not literally.
“It’s not robots talking to robots in English or exchanging faxes with each other,” says Jain. “That becomes an API. The future needs to be digital highways, where you just submit information, it’s judged, it’s adjudicated, and you get a response instantaneously.”
While Jain may be optimistic about end-to-end automation, when it comes to adoption, chatbots and other kinds of AI-powered technology are facing a serious hurdle: Models like ChatGPT spout falsities as if they were true and have to be constantly retrained with the most up-to-date information out there.
“When a doctor makes stuff up, that’s called lying. When a model makes stuff up, we use this odd phrase called hallucination,” says Nigam Shah, chief data scientist at Stanford Healthcare.
ChatGPT was trained only on data available through 2021, and isn’t regularly updated. The field of medicine is constantly changing, with new guidelines, drugs and devices coming onto the market, which means outdated training data poses a real problem. Shah says he doesn’t see broad adoption of generative AI in healthcare until there are systems in place to regularly retrain the models on new information and to detect when their answers are wrong.
“We have to figure out how to verify the veracity and truthfulness of the output,” he says. There is also the risk that a doctor, no matter how well-intentioned, enters protected health information into ChatGPT. While anonymization and encryption are two ways to protect patient data, these measures alone may not be enough, says Linda Malek, a partner at the law firm Moses Singer.
“Even if you try to de-identify the data that is stored in ChatGPT, the AI capabilities can re-identify information,” she says. “ChatGPT is a particular target for cyber criminals as well, because it can be utilized for ransomware and different types of cyber attacks.”
Potential dangers aside, generative AI’s achievements continue to wow users. In January, researchers found that ChatGPT could pass the U.S. medical licensing exam with “moderate accuracy” without any special training. (It’s not alone in this – at least two other AI programs, Google’s Flan-PaLM and Chinese AI-powered bot Xiaoyi, have also passed national medical licensing exams.)
The motivation was to get ChatGPT to perform standardized tasks without being specifically trained on any healthcare datasets, says Morgan Cheatham, a vice president at Bessemer Venture Partners and medical student at Brown University, who co-authored the study, which was published in PLOS Digital Health. While Cheatham says the results suggest ChatGPT’s large language models “have inherent value in healthcare applications,” he says any path forward is going to require a “crawl, walk, run approach.”
For now, the hope is generative AI could help doctors bring their attention and time to the most important part of their jobs: their patients. “What got me excited about becoming a doctor was the face-to-face interaction in the exam room with another human being,” says David Canes, a urologist at Beth Israel Lahey Health and cofounder of patient education startup Wellprept. “What’s intruded now is thousands of mouse clicks and keyboard entries.”
Canes says he plans to use ChatGPT for “low-stakes communications,” and he looks forward to the day when he can spend less time dealing with never-ending bureaucracy.
“My days would be perfect if they were just filled with patient care. I love that as much now as I ever did,” he says. “I look at these improvements and it makes me really hopeful that we’re on the verge of maybe a new era where the worst aspects of medicine can be minimized.”
Correction: The research involving ChatGPT taking the U.S. medical licensing exam was published in PLOS Digital Health in February.