pvdimens.bsky.social
A PhD in population genetics of marine fishes, Julia package dev, linked-read aficionado, guitar player, rock climber, and once described as endearingly mediocre.
54 posts
132 followers
118 following
comment in response to post
I realize now that your question might be more specific to pipeline and not individual software. Unfortunately, I'm unfamiliar with the existing pipeline landscape 😔
comment in response to post
It's been a minute since I've done a PacBio assembly, but Flye was a great assembler to use (hopefully that's still the case). Regarding QC, fastp has a long-read version.
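To make that concrete, a rough sketch of what that QC + assembly step could look like on the command line. The `fastplong` name (fastp's long-read companion tool) and its `-i`/`-o` flags, plus the Flye input mode, are my assumptions; check each tool's docs for your data type:

```sh
# QC the raw long reads (fastplong is the long-read version of fastp; flags are illustrative)
fastplong -i raw_reads.fq.gz -o clean_reads.fq.gz

# Assemble with Flye; use --pacbio-hifi for HiFi data, --pacbio-raw for CLR
flye --pacbio-hifi clean_reads.fq.gz --out-dir flye_asm --threads 16
```

Swap the input-mode flag to match your chemistry; Flye also has `--nano-raw`/`--nano-hq` modes for Nanopore data.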
comment in response to post
Pope vs Predator
Alien vs Pope
comment in response to post
Spice Girl Scout Cookies
comment in response to post
I think so too! You should be able to simulate a junky metagenome somewhere (no ideas off the top of my head), then use Harpy to make linked reads out of it and assemble.
comment in response to post
For those inversions, it benefits us to have linked sequences far apart. In the assembly context it might still work, since linked reads are used for metagenome assembly. You can use the data simulator and assembler in Harpy to test it out!
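A hypothetical sketch of that test loop. Harpy is a real toolkit, but the exact subcommand names and arguments below are my assumptions, not verified usage; consult the Harpy documentation before running anything:

```sh
# 1. Simulate linked reads from a (simulated, fragmented) metagenome  -- subcommand/flags assumed
harpy simulate linkedreads metagenome.fasta

# 2. Feed the simulated linked reads to Harpy's assembler            -- subcommand/flags assumed
harpy metassembly sim_R1.fq.gz sim_R2.fq.gz
```

The point is only the shape of the experiment: simulate degraded input, tag it into linked reads, assemble, and compare against the known source genome.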
comment in response to post
The utility of the chemistry depends on the original DNA molecules being long, so each molecule yields multiple tagged fragments. DNA that is already heavily degraded and fragmented will likely yield less linkage information. However, our lab uses linked reads for large-inversion detection, not assembly.
comment in response to post
I don't hoard knowledge. Here's a template repo I made to quickly set this up for yourself for any project:
github.com/therkildsen-...
Here's a guide I wrote to do it:
therkildsen-lab.github.io/user-guide/d...
We can take steps towards being sincere when we write "all code available at xxx" 🤓
comment in response to post
And, you can use Jupyter from VS Code (like I do, see screenshot) if you don't want to bother with the native browser-based UIs. It's amazing. My mental overhead for organization is so much lower. It looks NICE. It's TRANSPARENT. It's REPRODUCIBLE.
comment in response to post
And that was automatically built from github.com/pdimens/hapl... whenever I make a push. The *totality* of this project is that repo/site. And it'll be in that state when I go on to publish it. You won't have to rummage around the repo if you don't want to; the site is the public face of that work.
comment in response to post
But here's where it gets awesomer. You can set up a GH repo to automatically build your notebooks into a *nice* website (via JupyterBook). Your repository, your work, code, results, and narrative *as a website*. Automatically. Automagically. The sims project? pdimens.github.io/haplotagging...
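For anyone wanting to wire this up themselves, one common pattern is a GitHub Actions workflow that builds the book on every push and publishes the HTML to GitHub Pages. This is a minimal sketch, not the workflow from the project above; the Python version, action versions, and deploy action are illustrative choices:

```yaml
# .github/workflows/deploy-book.yml  (sketch; adjust paths/versions to your repo)
name: deploy-book
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install jupyter-book
      # Render notebooks + narrative into a static site
      - run: jupyter-book build .
      # Publish the built HTML to the gh-pages branch
      - uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./_build/html
```

Then point GitHub Pages at the `gh-pages` branch and every push rebuilds the site automatically.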
comment in response to post
Embedded is great b/c it saves output in the doc (text, plots, etc). Jupyter = JUlia PYThon R, the languages it was initially conceived for. It also works with bash, which rocks. So you write a narrative, with notes, and code that gets executed, with results that get embedded. Awesome.
comment in response to post
So I committed to something I've been meaning to explore for a while: Jupyter + JupyterBook. If you aren't familiar, Jupyter is a "notebook" system where you write narrative and executable code in one document. Think Rmarkdown/Quarto but more what-you-see-is-what-you-get, and results get embedded.
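Getting started with that stack is only a few commands. A sketch, assuming you're installing from PyPI (JupyterLab is optional but is the usual editing UI):

```sh
pip install jupyter-book jupyterlab
jupyter-book create mybook/   # scaffold a book: _config.yml, _toc.yml, sample pages
# ...write notebooks with narrative + code; executed results get embedded...
jupyter-book build mybook/    # render everything to mybook/_build/html
```

Open `_build/html/index.html` locally to preview before publishing anywhere.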
comment in response to post
Preface: I'm doing a simulation-based benchmark of linked-reads and thought I was doing SUCH a good job of project organization: naming files, directory structure, etc. Two weeks in, I had to redo something from a very early step and had NO IDEA what I was looking at 😱. How is that possible?
comment in response to post
bsky.app/profile/did:...