That’s sort of what I was thinking. What if an AI was trained purely on ultrasound or MRI images? Would the bias from the techs who took the images be significant enough to harm the AI? Or what about whose images make it into the dataset? Are there ways to mitigate that type of bias?
Great questions. Yes, bias introduced by the technicians who acquire the images can hurt the model’s ability to generalize. However, there are methods to account for that bias in the model itself, often used alongside review by the technicians and human raters.
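To make that a bit more concrete, here’s a minimal toy sketch of one common mitigation: reweighting training samples so that images from an over-represented technician or site don’t dominate the loss. The feature matrix, labels, and site identifiers here are all made up for illustration; real pipelines would use actual image features and metadata.

```python
# Toy sketch: inverse-frequency reweighting to reduce site/technician imbalance.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 1000 images reduced to 5 features, binary diagnosis
# labels, and a site ID where site 0 contributed 90% of the data.
X = rng.normal(size=(1000, 5))
y = rng.integers(0, 2, size=1000)
site = rng.choice([0, 1], size=1000, p=[0.9, 0.1])

# Inverse-frequency weights: rare sites get proportionally larger weights, so
# the model can't minimize its loss by fitting only the dominant site's quirks.
counts = np.bincount(site)
weights = 1.0 / counts[site]
weights *= len(weights) / weights.sum()  # normalize to mean weight of 1

model = LogisticRegression()
model.fit(X, y, sample_weight=weights)
```

Reweighting only addresses representation imbalance, of course; it doesn’t fix labels that are themselves biased, which is a separate problem.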
What if the bias is in the vignette? Given that the vignette presents an already simplified set of data based on a specific diagnosis, the LLM has an easier task finding a pattern than a trained physician. I wonder what would happen if the LLM had to conduct a patient interview...
They said in the article that these case studies had never been published before but were used for years in research. That makes it likely that they are somewhere on the internet and may have been scraped with the answers at some point. These LLMs don’t “know” anything.
I personally agree that LLMs do not “know” anything, but they are extremely useful tools, particularly in research. They will guide you toward an area of possible solutions, not to absolute solutions.
Even those are trained on samples plus descriptive labels for the samples. If some samples are labeled incorrectly due to prejudice, the model may learn to replicate that prejudice, because during training it is *literally being told to believe the prejudice* in those labels.
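A quick toy example of that point, with entirely made-up numbers: if raters systematically under-label positive cases for one group, a supervised model reproduces the gap, because the biased labels are the only notion of “correct” it ever sees.

```python
# Toy sketch: a model trained on prejudiced labels mirrors the prejudice.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000

group = rng.integers(0, 2, size=n)        # e.g. a demographic attribute
severity = rng.normal(size=n)             # the genuinely relevant signal
true_label = (severity > 0).astype(int)   # what an unbiased rater would say

# Biased raters: for group 1, flip 30% of truly positive cases to negative.
biased_label = true_label.copy()
flip = (group == 1) & (true_label == 1) & (rng.random(n) < 0.3)
biased_label[flip] = 0

# The model sees the signal and the group attribute, but only the biased labels.
X = np.column_stack([severity, group])
model = LogisticRegression().fit(X, biased_label)

pred = model.predict(X)
for g in (0, 1):
    rate = pred[(group == g) & (true_label == 1)].mean()
    print(f"group {g}: truly-positive cases predicted positive = {rate:.2f}")
# Output shows the model under-diagnoses group 1, echoing the label prejudice.
```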