pashadag.bsky.social - Profile | ThreadSky | a Reddit-style client for Bluesky

Slides from my talk (with @kamilsjaron.bsky.social) on a history of k-mers in bioinformatics: rayan.chikhi.name/pdf/2025-kme...

submitted 2 days ago • 1 comment

Happy to share that the CloseRead paper is out in Genome Biology! CloseRead reports intuitive metrics for assessing genome assembly quality based on read alignments. We benchmarked it on IG loci, which are known to be hotspots for SVs and assembly errors. genomebiology.biomedcentral.com/articles/10....

submitted 16 days ago • 1 comment

A new preprint on indexing pangenome graphs using an FM-index of the haplotypes and a tag array. Joint work with Parsa Eskandar and @benedictpaten.bsky.social.

submitted 21 days ago • 1 comment

@alexanderjpetri.bsky.social's isONclust3 algorithm is now published: doi.org/10.1093/bioi.... isONclust3 performs de novo clustering of long-read cDNA sequencing data, a key step in reference-free transcriptome analysis.

submitted 28 days ago • 1 comment

The deadline for WABI 2025 has been extended (but is still rapidly approaching): wabiconf.github.io/2025/
* abstract deadline: May 12 (AoE)
* paper deadline: May 15 (AoE)
Consider submitting your exciting algorithmic bioinformatics work to the WABI conference!

submitted 29 days ago • 0 comments

IG loci of widely used lymphoblastoid cell lines contain somatic VDJ recombinations. Our novel toolkit, IGLoo, detects somatic events and removes them from a library, thus enabling accurate reassembly of these regions. @maojanlin.bsky.social @benlangmead.bsky.social www.cell.com/cell-reports...

submitted 31 days ago • 0 comments

Nice article on the (frustrating) life of a performance engineer. purplesyringa.moe/blog/why-per...

submitted 37 days ago • 0 comments

The list of proceedings papers for #ISMB2025 is up on the website www.iscb.org/ismbeccb2025... ! It's an exciting collection of papers, as always :).

submitted 37 days ago • 0 comments

We finally concluded the meeting. Thanks to all attendees for their scientific contributions and for traveling (near or far) to the meeting! Thanks to the local organizers for the infrastructure and catering, and thanks to the co-organizers @yaronorenstein.bsky.social @camillemrcht.bsky.social!

submitted 41 days ago • 1 comment

The primate T2T paper is out at Nature! Our team led a comparative analysis of adaptive immune loci across great apes and revealed that these rapidly evolving regions harbor various SVs and species-specific genes. Check out all exciting stories in the peer-reviewed version.

submitted 57 days ago • 0 comments

We are hiring PhD students in Computational Mathematics and Mathematics at Stockholm University in various subjects: su.varbi.com/en/what:job/... Application deadline: April 22. (1/3)

submitted 71 days ago • 1 comment

PSA: if you are applying to a CS grad program & a faculty member is asking for a verbal commitment before an official offer letter, this is a HUGE 🚩! There is an April 15th resolution to avoid this behavior (cgsnet.org/resources/fo...). I'd urge you to avoid those departments!

submitted 104 days ago • 0 comments

Passionate about open science and FAIR data principles for microbiome data? Consider becoming an NMDC Ambassador next year! I was an Ambassador in 2023 and happy to answer any questions about the experience!

submitted 112 days ago • 0 comments

🎶 Last Christmas, I gave you m̶y̶ ̶h̶e̶a̶r̶t̶ 40 pages of delicious combinatorics 🎶 Choose any word W of size m. How many words of size k>m admit W as their lexicographically smallest subword of size m? Find out in my latest preprint!

submitted 115 days ago • 1 comment
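
The counting question in the post above can be checked by brute force for small cases. This is a minimal sketch, assuming "subword" means a contiguous substring. The function names `smallest_subword` and `count_words_with_min` and the default DNA alphabet are illustrative choices, not taken from the post or the preprint. It enumerates every word, so it is only practical for small k.

```python
from itertools import product


def smallest_subword(word: str, m: int) -> str:
    """Return the lexicographically smallest contiguous subword of length m."""
    return min(word[i:i + m] for i in range(len(word) - m + 1))


def count_words_with_min(w: str, k: int, alphabet: str = "ACGT") -> int:
    """Count words of length k whose smallest length-m subword equals w.

    Brute force over all len(alphabet)**k words (assumption: this is only
    meant as a sanity check for small k, not an efficient method).
    """
    m = len(w)
    if k <= m:
        raise ValueError("k must be larger than the length of w")
    return sum(
        1
        for letters in product(alphabet, repeat=k)
        if smallest_subword("".join(letters), m) == w
    )
```

For example, `count_words_with_min("AA", 3, alphabet="AB")` enumerates the eight binary words of length 3 and counts those whose smallest 2-letter subword is `AA`.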

Over and over again, I come to the conclusion that the process of writing comes down to finding the most intuitive topological order through a high-dimensional space of results. Can be: sort by time, sort by paper, sort by topic, sort by previous work vs new stuff, first simple, then in-depth...

submitted 114 days ago • 2 comments

Since indirects are in the news again, and everybody and their dog has an opinion on how much "research overhead" should cost, here's an excellent book that explains where exactly the money goes. escholarship.org/uc/item/59p1...

submitted 117 days ago • 0 comments

Finally, the preprint on Cuttlefish 3 is available! This is the most recent in a long line of work led by Jamshed Khan, a recent PhD graduate from my lab. Cuttlefish 3 further improves the efficiency of Cuttlefish 2, while adding support for colored compacted de Bruijn graphs. 1/x

submitted 119 days ago • 1 comment

0/ Essential reading for anyone training or using sequence-function models trained on genomic sequences! 🚨 In our new preprint, we explore the ways homology within genomes can cause leakage when training sequence-based models and ways to prevent it

submitted 129 days ago • 1 comment

I'm glad to announce that the simd-minimizers library is out! 🧬🖥️ @curiouscoding.nl and I have been optimizing the computation of minimizers down to the smallest detail. The result is an order of magnitude faster than existing methods; processing an entire human genome takes only 4s on my laptop! 🧵

submitted 127 days ago • 1 comment
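
For readers unfamiliar with the term, here is a naive sketch of what computing minimizers means. It is not the simd-minimizers API or algorithm. The function name `minimizer_positions`, the plain lexicographic ordering, and the leftmost tie-break are all assumptions made for illustration. For each window of `w` consecutive k-mers it reports the position of the smallest k-mer.

```python
def minimizer_positions(seq: str, k: int, w: int) -> list[int]:
    """Return start positions of window minimizers, without repeats.

    Each window covers w consecutive k-mers; the minimizer is the
    lexicographically smallest k-mer, ties broken by leftmost position
    (an assumption; real tools often order k-mers by a hash instead).
    """
    if k <= 0 or w <= 0:
        raise ValueError("k and w must be positive")
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    positions: list[int] = []
    for start in range(len(kmers) - w + 1):
        # Pick the smallest k-mer in this window, leftmost on ties.
        best = min(range(start, start + w), key=lambda i: (kmers[i], i))
        # Consecutive windows often share a minimizer; record it once.
        if not positions or positions[-1] != best:
            positions.append(best)
    return positions
```

This version does O(w) work per window in pure Python, so it shows the definition rather than an efficient implementation.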

🚨 Deadline Extended: Call for Papers - RECOMB-seq 2025 🚨 Great news! You now have an extra week to submit your work. 🎉 Updated Deadlines:
🔹 Abstract Registration: Jan 31, 2025
🔹 Submission: Feb 7, 2025
recomb-seq.github.io/papers/

submitted 133 days ago • 0 comments

How helpful is a de Bruijn graph for visualizing alternative RNA variants? Here I requested our graph visualization tool Vizitig to show me the CIC human gene in my data (3 RNA-seq samples). I connected sequences to known genes using an annotation and colored CIC's exons in different tints.

submitted 140 days ago • 1 comment

My PhD adviser Liliana Florea has developed a Coursera course "Bioinformatics Methods for Transcriptomics". A great resource to learn cutting-edge short- and long-read RNA-seq data analysis techniques: www.coursera.org/learn/bioinf...

submitted 141 days ago • 1 comment

Yes, they can hallucinate papers that don't exist, discuss results that seem to be imaginary, and can be confusing and inconsistent. But talking to tenured professors may still be helpful

submitted 142 days ago • 14 comments

I understand the reasons that people are codifying responsibilities and expectations for academic positions, but I can't help but feel like all this formalization risks extinguishing the things that make academia special to begin with.

submitted 154 days ago • 6 comments

I wonder if it is better to measure productivity techniques less by how much time they save but instead by how much more time you spend doing things you want. I suspect many common productivity frameworks like inbox zero et al would not fare well by this metric.

submitted 156 days ago • 6 comments

62 years later, the book that changed everything is still a must read. Kuhn distinguished between 'normal science' and 'revolutionary science', where in the former we work within the paradigm but if anomalies add up, a new paradigm emerges in a period of revolutionary science.

submitted 160 days ago • 6 comments

Hi all, here's my passion project for December - a Web site for ~realtime DNA sample screening/composition analysis. Let me know what you think!

submitted 160 days ago • 3 comments

Check out the thread by @elisarosix.bsky.social describing the latest efforts of the TR-IG Nomenclature Review Committee. TR-IG was a huge part of my work in 2024, and I am proud to be on the team that develops robust and transparent policies for annotation and naming adaptive immune genes.

submitted 163 days ago • 0 comments

Merry lexicographic minimiz... Christmas arxiv.org/abs/2412.17492

submitted 163 days ago • 0 comments

Do you want to learn systematic ways in which you can revise your research papers? I've posted a short collection of 4 lectures youtube.com/playlist?lis... 1/n

submitted 167 days ago • 1 comment

We're thrilled to introduce LexicMap v0.5.0🎉 It's more accurate and slightly faster! LexicMap has helped some scientists align genes and plasmids in AllTheBacteria and GenBank, each of which has > 2 million prokaryotic genomes! We'll provide an index for ATB on AWS later. github.com/shenwei356/L...

submitted 168 days ago • 1 comment

I wrote about the backstory of our recently published metagenomic profiler called sylph (www.nature.com/articles/s41...), partially to celebrate the new migration to Bluesky. Check out the blog here: jim-shaw-bluenote.github.io/blog/2024/de... -- and I apologize in advance for the lack of brevity :)

submitted 198 days ago • 1 comment

Jim notes in this blog that the first bioinformatics paper he ever read was Mash. I definitely had an agenda when writing that paper: I thought min-hashing was awesome and really wanted to teach other people about it. So, it's super gratifying to learn stars like Jim read it and were inspired!

submitted 169 days ago • 1 comment

Very cool work from Yang Lu et al. demonstrating miscalibration of BLASTP’s E-values and generating well-calibrated values via a knockoff-based approach (cc @mikelove.bsky.social) - academic.oup.com/bioinformati...! More analyses could benefit from knockoff-based approaches.

submitted 183 days ago • 2 comments

PSA: when listing the CPU you used to run experiments for your paper, include ALL of the below:
- model name
- base clock speed
- L1, L2 cache size PER CORE
- L3 cache size (total)
- Number of sockets (if >1), cores, and threads
- Whether hyperthreading was on?
- Whether turboboost was disabled?

submitted 184 days ago • 2 comments
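
Part of the checklist above can be collected automatically on Linux. This is a rough sketch, assuming the `lscpu` utility is installed and prints `Key: value` lines with the field names listed in `FIELDS`. The helper names `parse_lscpu` and `cpu_report` are made up for this example. It does not cover base clock speed or turbo boost status, which have to come from other sources. Some `lscpu` versions also report cache sizes as totals across cores, so per-core values may need to be derived by hand.

```python
import subprocess

# lscpu field names this sketch looks for (assumption: the exact labels
# can differ between lscpu versions and locales).
FIELDS = (
    "Model name",
    "Socket(s)",
    "Core(s) per socket",
    "Thread(s) per core",
    "L1d cache",
    "L1i cache",
    "L2 cache",
    "L3 cache",
)


def parse_lscpu(text: str) -> dict[str, str]:
    """Extract the fields of interest from lscpu-style 'Key: value' text."""
    info: dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in FIELDS:
            info[key.strip()] = value.strip()
    return info


def cpu_report() -> dict[str, str]:
    """Run lscpu and return the parsed fields (Linux only)."""
    result = subprocess.run(
        ["lscpu"], capture_output=True, text=True, check=True
    )
    return parse_lscpu(result.stdout)
```

More than one thread per core in the output suggests hyperthreading (SMT) is enabled, but it is worth confirming this in the firmware settings before reporting it.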