gaetanbenoit.bsky.social
Postdoc researcher in bioinformatics at the Pasteur Institute. Scalable methods and software for metagenomics. https://github.com/GaetanBenoitDev

It's the K-mer Days in Toulouse, a meeting between computational bioinformaticians and clinical bioinformaticians, discussing how k-mer based methods can be used in research/diagnostic... Here Yohan presents REINDEER2.

Really enjoying my first attendance at @mdc-bimsb.bsky.social in Berlin. Thanks for the invitation!

Preprint alert! 🦌 Our new abundance index, REINDEER2, is out! It's cheap to build and update, offers tunable abundance precision at kmer level, and delivers very high query throughput. Short thread! www.biorxiv.org/content/10.1... github.com/Yohan-Hernan...

Preprint on "Improving spliced alignment by modeling splice sites with deep learning". It describes minisplice for modeling splice signals. Minimap2 and miniprot now optionally use the predicted scores to improve spliced alignment. arxiv.org/abs/2506.12986

Verkko2 paper is finally published! genome.cshlp.org/content/earl...

Hey yeast lovers. Do you like pangenomes? O'Donnell et al. 2023 produced T2T assemblies of different strains, including phased haplotypes for yeast. Here I selected 10 phased haplotypes and the S288C reference, and looked for the MST28 / YAR033W gene reported to contain SVs such as indels. 👇🏻👇🏻

NEW PREPRINT OUT arxiv.org/abs/2506.04926 The Burrows-Wheeler transform of a string is, in general, more compressible than the original string, especially when it contains lots of repeated substrings. The extended BWT makes it possible to deal with multisets of strings. 1/7

Another circRNA in the human MAN1A2 gene, first described in Mercer et al. 2015. I retrieved Illumina data, built the graph, searched for the gene in Vizitig and here it is!

Slides from my talk (with @kamilsjaron.bsky.social) on a history of k-mers in bioinformatics: rayan.chikhi.name/pdf/2025-kme...

Announcing myloasm, a new long-read (ONT R10/PacBio) metagenome assembler that I've been working on during my postdoc in the Heng Li lab (@lh3lh3.bsky.social). myloasm-docs.github.io

📜 Excited to share insights from our recent paper: "Kaminari: a resource-frugal index for approximate colored k-mer queries". The study aims to efficiently identify documents containing a query string, focusing on DNA strings. www.biorxiv.org/content/10.1... 🧬 🖥️ 1/8

New blog post! In it, I benchmark the new version of Dorado from @nanoporetech.com, which comes with new DNA basecalling models. Short version: big accuracy gains for hac, small improvements for sup. Check it out for the full results: rrwick.github.io/2025/05/27/d...

Short-read metagenomic sequencing cannot recover genomes from many abundant marine prokaryotes due to high strain heterogeneity and platform-inherent GC bias (likely viruses, too), but Nanopore long reads can address this. A results thread on our recent preprint 🧵.

www.biorxiv.org/content/10.1... Autocycler, the automated successor to Trycycler from @rrwick.bsky.social has a pre-print out - overall, pretty awesome performance (and is very easy to use) github.com/rrwick/Autoc...

In this (rebutted) grant proposal I wrote that I thought we could see circular RNAs with Vizitig, despite their being hard to detect and lowly expressed. Guess what? Looks very much like a circular RNA to me 😎 (in the TET2 gene, human co-assembled cancer RNA-seq samples).

I am very happy (and anxious) to share with you our most recent work in which we evaluated four of the most popular long-read assemblers, www.biorxiv.org/content/10.1... and tell you just a little bit about it in the following 🧵

Preprint on hifiasm Nanopore-only assembly. Led by Haoyu Cheng: www.biorxiv.org/content/10.1...

High-quality metagenome assembly from nanopore reads with nanoMDBG https://www.biorxiv.org/content/10.1101/2025.04.22.649928v1

Starting #RECOMBseq with @rayanchikhi.bsky.social 's keynote. Here stressing our responsibility as scientists to enable access to a common good: genomic data

You should try A*PA2! It's Edlib, but faster! It's waiting for its first citation 😅

Finally, the preprint on Cuttlefish 3 is available! This is the most recent in a long line of work led by Jamshed Khan, a recent PhD graduate from my lab. Cuttlefish 3 further improves the efficiency of Cuttlefish 2, while adding support for *colored* compacted de Bruijn graphs. 1/x

🧬 Excited to share our latest work, MUSET 🌭, a new tool for creating abundance unitig matrices from sequencing data. It was published yesterday in Oxford Bioinformatics if you want to have a look👀 : academic.oup.com/bioinformati... Let's break it down:

We have updated all Logan contigs (now at version 1.1)! Contiguity has been much improved (2x) and a duplicated k-mers bug has been fixed. More information and changelog here: github.com/IndexThePlan...

After 24 years of work, I’m thrilled to announce the TYMEFLIES dataset, which comprises metagenomes from Lake Mendota (Madison, WI), collected roughly every 10 days (471 samples) for 20 years! @quendi.bsky.social @robinrohwer.bsky.social rdcu.be/d5put A thread…

How helpful is a de Bruijn graph for visualizing alternative RNA variants? Here I requested our graph visualization tool Vizitig to show me the CIC human gene in my data (3 RNA-seq samples). I connected sequences to known genes using an annotation and colored CIC's exons in different tints.

AAAAAAAaaaaaaaaaaaaaa on time

We start a new day of the Workshop on Genomics 2025 with the lecture on Genome Assembly by @camillemrcht.bsky.social and @npmalfoy.bsky.social! They always have a way to make the theory behind genome assembly entertaining and exciting! 🤩 #evomics2025 #bioinformatics #genomics

We may have cut our last Canu release in 2024, but we are starting off 2025 with Verkko2! 🎉 Not only is it 4x faster than Verkko1, this version integrates Hi-C data for both phasing and scaffolding, enabling the automated assembly of acrocentric chromosomes! Preprint: www.biorxiv.org/content/10.1...

Glad to see metaMDBG in the line-up of assemblers 💚

Merry lexicographic minimiz... Christmas arxiv.org/abs/2412.17492

We built a new tool for disentangling local haplotypes from long-read sequencing: check out devider! github.com/bluenote-157... www.biorxiv.org/content/10.1... 1/5

🚨🚨🚨 We are hiring 🚨🚨🚨 After the creation of logan-search (see: bsky.app/profile/pier...) we are offering a 2-year engineer position to continue its development and optimization. With @rayanchikhi.bsky.social and @tlemane.bsky.social Details + applications: recrutement.inria.fr/public/class...

Glad to share our preprint 🏃 **LongTrack**: long read metagenomics-based precise tracking of bacterial strains (and their genomic changes!) after 💩 fecal microbiota transplantation. A 4+ years journey and tour de force by ** @yufan01.bsky.social ** & team. A long 🧵 www.biorxiv.org/content/10.1...

🚨You've got to read this one. 🚨

We are also blessed with cool metagenomic talks, @gaetanbenoit.bsky.social who presented a preview of his new nanopore metagenomic assembler,

Hi Bluesky! Live from the Seqbim workshop, our yearly national workshop on sequence bioinformatics. We started with a strobecool keynote by @ksahlin.bsky.social ! He overviewed strobemers, and related developments.

Excited to join the #Nanopore Community on BlueSky. Follow us for the latest updates and discussions about Oxford Nanopore and Nanopore Community. #WYMM

Excited to share “Bin Chicken”, substantially improving genome recovery through rational metagenomic assembly. Applied to public 🌍 metagenomes, it recovered 24,000 novel species 🦠, including 6 novel phyla. doi.org/10.1101/2024... @wwood @rhysnewell @CMR_QUT 🧵1/6

🧬🔍There are 50 petabases of freely-available DNA sequencing data. We are introducing Logan Search, which allows you to search for any DNA sequence in minutes, bringing Earth’s largest genomic resource to your fingertips. 🏔️ logan-search.org 🏔️ #Genomics #Bioinformatics #OpenScience

Long-read special issue of GR is out, including our paper on T2T assembly using only Nanopore. Just in time for ONT to discontinue duplex cells on Nov 27 😅😂 Good thing Verkko also works great with HERRO-corrected simplex data! 📄 genome.cshlp.org/content/34/1... 📖 genome.cshlp.org/content/34/1...

I wrote about the backstory of our recently published metagenomic profiler called sylph (www.nature.com/articles/s41...), partially to celebrate the new migration to Bluesky. Check out the blog here: jim-shaw-bluenote.github.io/blog/2024/de... -- and I apologize in advance for the lack of brevity :)

Love this idea for helping the community make the transition to Bluesky, so I’ve compiled a woefully incomplete list of microbial ecologists and -omics peeps I know are here. Please make suggestions and share. Let me know if you’d like to be added. go.bsky.app/TQpaTqF

We finished yesterday’s #GenomeInformatics24 with @gaetanbenoit.bsky.social describing the metagenome assembler MetaMDBG, now for nanopore data as NanoMDBG. It uses chained minimisers, piled up to detect errors.

One more opening in my group at Oxford Nanopore. This time in our New York office! ejnh.fa.em2.oraclecloud.com/hcmUI/Candid...

Very excited to see a new metagenome assembly algorithm available for PacBio HiFi data! metaMDBG performs extremely well based on my benchmarks and we plan on using it for several deep-sequencing projects. doi.org/10.1038/s415...