It also obscures the origin of some datasets. That can mean researchers miss important features that skew the training of their models. Many unwittingly used a dataset containing chest scans of children who did not have covid as their examples of what non-covid cases looked like. But as a result, the AIs learned to identify children, not covid.
Driggs’s group trained its own model using a dataset that contained a mix of scans taken while patients were lying down and while they were standing up. Because patients scanned while lying down were more likely to be seriously ill, the AI learned to wrongly predict serious covid risk from a person’s position.
In other cases, some AIs were found to be picking up on the text font that certain hospitals used to label the scans. As a result, fonts from hospitals with more serious caseloads became predictors of covid risk.
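To make this failure mode concrete, here is a minimal, purely illustrative sketch (not the researchers’ code) of how a confounded training set lets a model score well by latching onto a nuisance signal such as patient position rather than the disease itself. The synthetic data, feature names, and scikit-learn setup are all assumptions chosen for demonstration.

```python
# Illustrative sketch only: a nuisance feature ("scanned lying down") that
# tracks the label in the training data becomes the model's shortcut, and
# the apparent accuracy drops sharply once that correlation is broken.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 2000

# True label: 1 = seriously ill with covid, 0 = not.
y_train = rng.integers(0, 2, size=n)

# A genuine but weak disease signal (think of it as an opacity score).
disease_train = y_train + rng.normal(0, 2.0, size=n)

# Confound: in the training data, seriously ill patients were almost
# always scanned lying down, so "supine" nearly mirrors the label.
supine_train = ((y_train == 1) ^ (rng.random(n) < 0.05)).astype(float)

model = LogisticRegression().fit(
    np.column_stack([disease_train, supine_train]), y_train)

# New patients all scanned standing up: the positional shortcut is gone,
# and only the weak disease signal remains.
y_test = rng.integers(0, 2, size=n)
disease_test = y_test + rng.normal(0, 2.0, size=n)
supine_test = np.zeros(n)

print("accuracy with the confound present:",
      accuracy_score(y_train, model.predict(
          np.column_stack([disease_train, supine_train]))))
print("accuracy once position no longer tracks severity:",
      accuracy_score(y_test, model.predict(
          np.column_stack([disease_test, supine_test]))))
```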
In retrospect, mistakes like these seem obvious. They can also be fixed by adjusting the models, if researchers are aware of them. It is possible to acknowledge the flaws and release a less accurate but less misleading model. But many of the tools were developed either by AI researchers who lacked the medical expertise to spot gaps in the data, or by medical researchers who lacked the mathematical skills to compensate for those gaps.
A more subtle problem Driggs highlights is incorporation bias, or bias introduced at the point a dataset is labeled. Many medical images, for example, were labeled according to whether the radiologists who examined them said they showed covid. But that embeds any biases of that particular doctor in the ground truth of the dataset. It would be much better to label a medical scan with the result of a PCR test rather than one doctor’s opinion, Driggs says. But busy hospitals don’t always have time for statistical niceties.
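As a rough sketch of the kind of check this implies, assuming both a radiologist-assigned label and a PCR result are recorded for each scan, one could quantify how often the two disagree before training on the radiologist labels. The column names and toy values below are hypothetical.

```python
# Illustrative sketch with hypothetical column names and toy data: measure
# how far radiologist-assigned labels drift from the PCR ground truth,
# since a model trained on the former inherits that drift.
import pandas as pd

records = pd.DataFrame({
    "scan_id":           [1, 2, 3, 4, 5, 6],
    "radiologist_label": [1, 1, 0, 0, 1, 0],  # 1 = covid per the reading doctor
    "pcr_result":        [1, 0, 0, 0, 1, 1],  # 1 = PCR-confirmed covid
})

agreement = (records["radiologist_label"] == records["pcr_result"]).mean()
print(f"label/PCR agreement: {agreement:.0%}")

# The cross-tabulation shows where the label noise sits, e.g. scans called
# covid by the doctor but negative on PCR.
print(pd.crosstab(records["radiologist_label"], records["pcr_result"],
                  rownames=["radiologist"], colnames=["pcr"]))
```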
This has not stopped some of these tools from being pushed into clinical practice. Wynants says it is unclear which ones are being used, or how. Hospitals will sometimes say they are using a tool only for research purposes, which makes it hard to tell how much doctors are relying on it. “There is a lot of secrecy,” she says.
Wynants asked one company that was marketing deep-learning algorithms to share information about its approach, but did not hear back. She later found several published models from researchers tied to this company, all of them with a high risk of bias. “We really don’t know what the company implemented,” she says.
Some hospitals are even signing nondisclosure agreements with medical AI vendors, says Wynants. When she asked doctors what algorithms or software they were using, they sometimes told her they weren’t allowed to say.
How to fix it
What’s the fix? Better data would help, but in times of crisis that is a big ask. It is more important to make the most of the datasets we have. The simplest move, says Driggs, would be for AI teams to collaborate more with clinicians. Researchers also need to share their models and disclose how they were trained so that others can test and build on them. “Those are two things we could do today,” he says. “And they would solve maybe 50% of the issues we have identified.”
Getting hold of data would also be easier if formats were standardized, says Bilal Mateen, a physician who leads clinical technology research at the Wellcome Trust, a global health research charity based in London.
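As an illustration only, a standardized record might carry the provenance details discussed above alongside each scan, so that issues like patient position and label source are visible to anyone reusing the data. The field names in this sketch are assumptions, not an existing standard.

```python
# Hypothetical sketch of a standardized per-scan record; the schema is an
# assumption for illustration, not a real interchange format.
from dataclasses import dataclass, asdict
from typing import Optional
import json

@dataclass
class ChestScanRecord:
    scan_id: str
    hospital_id: str
    patient_position: str           # e.g. "supine" or "standing"
    label: int                      # 1 = covid, 0 = non-covid
    label_source: str               # e.g. "pcr" or "radiologist_report"
    pcr_result: Optional[int] = None

record = ChestScanRecord(
    scan_id="scan-0001",
    hospital_id="hospital-A",
    patient_position="supine",
    label=1,
    label_source="pcr",
    pcr_result=1,
)
print(json.dumps(asdict(record), indent=2))
```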