tkorem.bsky.social
Microbiome, metagenomics, ML, and reproductive health. All views are mine. So are all your base
71 posts
466 followers
413 following
Regular Contributor
Active Commenter
comment in response to
post
Congrats!
comment in response to
post
A hopefully paywall-free link:
www.nature.com/articles/s41...
comment in response to
post
DEBIAS-M is available as a Python package (korem-lab.github.io/DEBIAS-M/ or just pip install debias-m). It works with any microbiome read count or relative abundance matrix and any paired metadata. 7/7
comment in response to
post
Its multi-task version allows DEBIAS-M to learn models for multiple tasks at the same time, further increasing its performance. This is particularly useful for tasks such as metabolite level predictions, where we want to predict multiple metabolite levels using the same microbiome data. 6/7
comment in response to
post
Finally, DEBIAS-M is designed for machine learning pipelines: it not only supports holding out labels for a test set, it also has an online learning mode that can handle completely new data on the fly (to our knowledge, the only method that allows this for microbiome data). 5/7
comment in response to
post
Next, the changes DEBIAS-M makes to the data are interpretable and explained by differences in experimental protocols. Analyzing the biases inferred for these 17 gut microbiome studies in HIV, we found that 84% of the variance can be explained by just three experimental factors. 4/7
comment in response to
post
This results in several benefits. First, in diverse benchmarks - using metagenomics and 16S sequencing, vaginal and gut microbiomes, and phenotypic and metabolite predictions - DEBIAS-M outperforms alternative methods. Here is an example for a gut 16S-based HIV classification across 17 studies. 3/7
comment in response to
post
DEBIAS-M is based on the multiplicative bias model of McLaren et al. (elifesciences.org/articles/46923). Under this model, every experimental protocol has different biases for each microbe. We infer the biases that maximize cross-batch association with phenotypes and minimize batch effects. 2/7
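To make the model concrete, here is a minimal numpy sketch of that multiplicative bias model (the efficiency values are invented purely for illustration): each protocol measures each taxon with its own efficiency, so the same specimen is reported differently by different protocols.

```python
import numpy as np

# True relative abundances of 4 taxa in one specimen.
true_abund = np.array([0.40, 0.30, 0.20, 0.10])

# Per-taxon measurement efficiencies of two hypothetical protocols
# (illustrative numbers only).
protocol_a = np.array([1.0, 0.5, 2.0, 1.0])
protocol_b = np.array([0.2, 1.0, 1.0, 3.0])

def observe(true_rel_abund, efficiency):
    """Multiplicative bias model: scale each taxon by its efficiency, renormalize."""
    biased = true_rel_abund * efficiency
    return biased / biased.sum()

print(observe(true_abund, protocol_a))  # what protocol A would report
print(observe(true_abund, protocol_b))  # what protocol B would report
# The same specimen looks different under each protocol; inferring such
# per-protocol efficiencies is what makes measurements comparable across batches.
```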
comment in response to
post
Congratulations!
comment in response to
post
My next study section was canceled well beyond Feb 2. (20-21)
comment in response to
post
It's really well done
comment in response to
post
A thread by Megan explaining the work
bsky.app/profile/mega...
comment in response to
post
Maturity is necessary but not sufficient imo. Take the most successful TT faculty, give them less infrastructure and slash their funding by >50% - they'll be less successful. It's a handicap.
comment in response to
post
Another small point: keep in mind that not having a postdoc advisor is one less person advocating for you, and one less recognizable name on your CV/pedigree (which really has an outsized impact in certain settings). Not that I am a big proponent of postdocs, but that’s for another thread.
comment in response to
post
Even if you did amazing considering your resources, and better than you would've as a postdoc, in many settings (search committees, study sections, etc.) you'd be compared to assistant professors who got bigger start-ups and better access to infrastructure and students.
comment in response to
post
There aren’t a lot of chances for starting your independent group. You want to go as far as you can, as fast as you can, with your very best ideas. In these positions, you often don’t have enough resources to do this, and you're often also not eligible to apply for funding.
comment in response to
post
These positions hire early (often out of PhD), are not tenure track (there's an end date), and provide limited funding (usually enough to hire 2-3 folks). They're appealing: they're competitive and prestigious, and you get more money and less supervision. But I find they're often a(n unintentional) trap.
comment in response to
post
Just told my partner yesterday that even if I had another two weeks between Saturday and Sunday I would still be late on a few deadlines come Monday
comment in response to
post
Can you elaborate?
comment in response to
post
Importantly - we'd love to hear your comments, feedback, and GitHub issues! In particular, let us know if there's additional prior work on this topic that we should note.
comment in response to
post
But CV is used not just for evaluation but also for hyperparameter tuning, and distributional bias impacts HPs that affect regression to the mean. For example, we show that it biases toward weaker model regularization, which might affect generalization and downstream deployment.
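For illustration, here is a generic sklearn-style setup of the kind described here: hyperparameters chosen by pooling LOOCV predictions and scoring them with auROC. This is a sketch of the setting, not the analysis from the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# Hyperparameter selection from pooled LOOCV predictions: the pooled auROC scores
# carry distributional bias, which can tilt the selected regularization strength C.
X, y = make_classification(n_samples=40, n_features=20, weights=[0.7], random_state=0)

scores = {}
for C in [0.01, 0.1, 1.0, 10.0]:
    preds = cross_val_predict(
        LogisticRegression(C=C, max_iter=1000),
        X, y, cv=LeaveOneOut(), method="predict_proba",
    )[:, 1]
    scores[C] = roc_auc_score(y, preds)  # pooled across all LOO folds

best_C = max(scores, key=scores.get)
print(scores, "-> selected C:", best_C)
```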
comment in response to
post
With RebalancedCV we could see the "real-life" impact of distributional bias. We reproduced 3 recently published analyses that used LOOCV and showed that it under-evaluated performance in all of them. While the effect isn't major, it is consistent.
comment in response to
post
With this in mind, we developed RebalancedCV, an sklearn-compatible package that drops the minimal number of samples from the training set to maintain the same class balance across the training sets of all folds, thus resolving distributional bias. github.com/korem-lab/Re...
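A usage sketch, assuming the splitter is exposed as rebalancedcv.RebalancedLeaveOneOut with sklearn's standard splitter interface (the import path and class name are my assumptions; see the repo for the actual API):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict
# Assumed import; verify the actual name in github.com/korem-lab/RebalancedCV
from rebalancedcv import RebalancedLeaveOneOut

X, y = make_classification(n_samples=40, n_features=20, weights=[0.7], random_state=0)

# Used like sklearn's LeaveOneOut, but each training fold has a few samples dropped
# so that the training-set class balance is identical across folds.
cv = RebalancedLeaveOneOut()
preds = cross_val_predict(
    LogisticRegression(max_iter=1000), X, y, cv=cv, method="predict_proba"
)[:, 1]
print(roc_auc_score(y, preds))
```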
comment in response to
post
As the issue is caused by a shift in the class balance of the training set, distributional bias can be addressed with stratified CV - but only if your dataset allows the stratification to be exact. The less exact the stratification, the more bias you have (in this plot, closer to 0).
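A quick numpy/sklearn illustration of that point: the training-set label mean shifts across LOOCV folds, while stratified folds keep it nearly fixed, and exactly fixed only when the class counts divide evenly across folds.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, StratifiedKFold

# 17 negatives and 13 positives: class counts that do not split evenly into 5 folds.
y = np.array([0] * 17 + [1] * 13)
X = np.zeros((len(y), 1))  # features are irrelevant for this illustration

loo_means = [y[tr].mean() for tr, _ in LeaveOneOut().split(X, y)]
skf_means = [y[tr].mean() for tr, _ in StratifiedKFold(n_splits=5).split(X, y)]

print("LOOCV training means:   ", np.round(np.unique(loo_means), 3))
print("Stratified 5-fold means:", np.round(np.unique(skf_means), 3))
```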
comment in response to
post
Does this mean that past work with LOOCV is overinflated? Not quite. Most machine learning algorithms regress to the mean - not to its negative - and so they are actually _under_evaluated. That's the negative bias we started with!
comment in response to
post
Distributional bias is a severe information leakage - so severe that we designed a dummy model that can achieve perfect auROC/auPR in ANY binary classification task evaluated via LOOCV (even without features). How? It just outputs the negative mean of the training set labels!
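A minimal sketch of that dummy model in plain numpy/sklearn (not code from the paper): it never looks at the features, yet the pooled LOOCV auROC comes out perfect.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=50)   # random labels, no signal at all
X = rng.normal(size=(50, 10))     # features are never used

scores = np.empty(len(y))
for train_idx, test_idx in LeaveOneOut().split(X, y):
    scores[test_idx] = -y[train_idx].mean()  # the only "information" used

print(roc_auc_score(y, scores))  # 1.0 -- purely an artifact of distributional bias
```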
comment in response to
post
The issue is that every time one holds out a sample as a test set in LOOCV, the mean of the training set labels shifts slightly, creating a perfect negative correlation across the folds between that mean and the test labels. We call this phenomenon distributional bias:
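A few lines of numpy/sklearn confirm the perfect negative correlation described here:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=50)
X = np.zeros((50, 1))  # features don't matter here

# For each LOO fold, record the training-set label mean and the held-out label.
train_means, test_labels = [], []
for train_idx, test_idx in LeaveOneOut().split(X, y):
    train_means.append(y[train_idx].mean())
    test_labels.append(y[test_idx][0])

print(np.corrcoef(train_means, test_labels)[0, 1])  # -1.0
```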
comment in response to
post
This story begins with benchmarking we did for some of our machine learning pipelines. We used random data, so we expected to see random classification accuracy (auROC=0.5). Instead, we found a clear negative bias that got worse with more imbalanced datasets:
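A small simulation in the same spirit (not the paper's benchmark): with purely random features and labels, pooled LOOCV auROC for a default logistic regression lands below the expected 0.5, and the gap tends to grow as the class balance becomes more skewed.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)
for pos_frac in [0.5, 0.3, 0.1]:
    aurocs = []
    for _ in range(20):
        X = rng.normal(size=(40, 10))
        y = (rng.random(40) < pos_frac).astype(int)
        # Need at least 2 samples per class so every LOO training fold has both classes.
        if y.sum() < 2 or y.sum() > len(y) - 2:
            continue
        preds = cross_val_predict(
            LogisticRegression(max_iter=1000), X, y,
            cv=LeaveOneOut(), method="predict_proba",
        )[:, 1]
        aurocs.append(roc_auc_score(y, preds))
    print(f"positive fraction {pos_frac}: mean pooled auROC = {np.mean(aurocs):.2f}")
```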
comment in response to
post
A bit of background: when training models on small datasets it's common to use LOOCV, as it maximizes the number of samples available for training. It also leaves a single sample for testing, meaning that many performance metrics (e.g., area under the ROC curve) require aggregation across folds/iterations.
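For readers less familiar with the pattern, here is a generic sklearn sketch of LOOCV with predictions pooled across folds before computing auROC (a single-sample test fold cannot yield an ROC curve on its own):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut

X, y = make_classification(n_samples=30, n_features=10, random_state=0)

# Train on all samples but one, score the held-out sample, and pool the scores.
pooled_scores = np.empty(len(y))
for train_idx, test_idx in LeaveOneOut().split(X, y):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    pooled_scores[test_idx] = model.predict_proba(X[test_idx])[:, 1]

print(roc_auc_score(y, pooled_scores))  # auROC over the pooled LOO predictions
```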