paulfharrison.bsky.social
Bioinformatician at Monash University, Melbourne, Australia. I also use mastodon: @pfh@mastodon.online https://mastodon.online/@pfh My homepage is: https://logarithmic.net/pfh/ On Twitter I was: @paulfharrison
51 posts 374 followers 114 following

Comparison of optimization and sampling from a distribution defined by an energy function. I use a continuous version of the Ising model spin lattice energy. First, optimization from a random initial state using gradient descent with momentum, using the SGD optimizer in PyTorch.
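
A minimal sketch of the optimization half of this comparison, in plain Python rather than PyTorch so it runs without dependencies. The energy function is an assumption (nearest-neighbour coupling plus a double well at each site), since the post does not spell out the exact form. The update rule follows the usual "momentum buffer" form of SGD with momentum. The names `energy`, `gradient` and `descend` are made up for this illustration.

```python
import random


def energy(s, n, coupling=1.0):
    """Assumed energy: nearest-neighbour coupling plus a double well per site."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            v = s[i][j]
            # Count each neighbour pair once (right and down, periodic wrap).
            total -= coupling * v * (s[i][(j + 1) % n] + s[(i + 1) % n][j])
            # The double well favours continuous spins near -1 or +1.
            total += (v * v - 1.0) ** 2
    return total


def gradient(s, n, coupling=1.0):
    """Analytic gradient of energy() with respect to every spin."""
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            v = s[i][j]
            neighbours = (s[i][(j + 1) % n] + s[i][(j - 1) % n]
                          + s[(i + 1) % n][j] + s[(i - 1) % n][j])
            g[i][j] = -coupling * neighbours + 4.0 * v * (v * v - 1.0)
    return g


def descend(n=8, steps=500, lr=0.01, momentum=0.9, seed=0):
    """Gradient descent with momentum from a random initial state.

    Returns the energy before and after optimization.
    """
    rng = random.Random(seed)
    s = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    buf = [[0.0] * n for _ in range(n)]  # momentum buffer, one entry per spin
    start = energy(s, n)
    for _ in range(steps):
        g = gradient(s, n)
        for i in range(n):
            for j in range(n):
                buf[i][j] = momentum * buf[i][j] + g[i][j]
                s[i][j] -= lr * buf[i][j]
    return start, energy(s, n)
```

Calling `start, end = descend()` gives the energy of the random initial lattice and of the lattice after descent. Optimization settles into one low-energy configuration, which is the contrast with sampling.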

I think I'm a little bit in love with the withr package in R. Similar to "with" in Python, you can guarantee to properly clean up when using a resource such as a connection to a file, a temporary directory, or a temporary global setting change. withr.r-lib.org
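
For readers who know Python better than R, here is a small sketch of the Python side of that analogy. It uses the standard library's `tempfile.TemporaryDirectory` for a temporary directory and a hand-written context manager for a temporary global setting change, loosely in the spirit of `withr::with_envvar()`. The helper `temporary_env` and the variable name `DEMO_SETTING` are hypothetical, invented for this example.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_env(name, value):
    """Set an environment variable inside the block, then restore the old state."""
    old = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        # Runs even if the block raises, so the change never leaks out.
        if old is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = old


# A temporary directory that is deleted when the block exits.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")
    with open(path, "w") as handle:  # the file handle is closed on exit too
        handle.write("hello")
    existed_inside = os.path.exists(path)

exists_after = os.path.exists(tmp)
```

The `try`/`finally` is what provides the guarantee: cleanup happens whether the block finishes normally or fails part way through.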

Bash Strict Mode: Stop Silent Failures in Your Scripts 🧵 1/ Ever had a Bash script fail silently, leading to hours of debugging? Bash doesn't fail by default—it keeps running even after errors. Here's how to fix it with strict mode. 👇

I've been playing with a Langevin Dynamics sampler as an optimizer in PyTorch -- SGD with momentum and noise. It's a remarkably simple algorithm for sampling from a distribution (approximately). Should eliminate over-fitting, and can also be used for generative tasks.
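
A rough sketch of the idea in plain Python, not the PyTorch optimizer itself. It assumes a one-dimensional energy U(x) = x^2 / 2, so the target distribution is a standard normal, and a temperature of 1. The step size, friction value and function names are illustrative choices. The discretization makes the sampling approximate.

```python
import math
import random


def grad_energy(x):
    """Gradient of the assumed energy U(x) = x**2 / 2 (standard normal target)."""
    return x


def langevin_samples(steps=200_000, dt=0.05, friction=1.0, seed=0):
    """Momentum update plus noise: a simple discretized Langevin sampler."""
    rng = random.Random(seed)
    x, v = 0.0, 0.0
    noise_scale = math.sqrt(2.0 * friction * dt)  # temperature fixed at 1
    samples = []
    for _ in range(steps):
        # Momentum update: decay, gradient step, then injected Gaussian noise.
        v = ((1.0 - friction * dt) * v
             - dt * grad_energy(x)
             + noise_scale * rng.gauss(0.0, 1.0))
        x += dt * v
        samples.append(x)
    return samples
```

Without the noise term this is gradient descent with momentum and would converge to x = 0. With the noise, successive values of x wander around the minimum with a mean near 0 and a variance near 1.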

Trying to bring some old family photos back to life with g.i.m.p and its truly amazing resynthesizer add-on created by @paulfharrison.bsky.social getting some pretty good results. Takes me a few hours per photo, but is super satisfying…

This is long overdue, but over the winter break we were finally able to write up our sgdGMF paper: arxiv.org/abs/2412.20509 We present a stochastic gradient descent method that efficiently and very quickly estimates latent factors for, e.g., dimensionality reduction of single-cell data

If I were to do some virtual half-day workshops this year, what topics and depths might be useful?

I've been watching the 2023 Statistical Rethinking lecture series by Richard McElreath. These cover a complete approach to statistics based on causal reasoning and Bayesian analysis. They are excellent, highly recommend. www.youtube.com/playlist?lis...

☎️ calling #SingleCell researchers in 🇦🇺! Join our last community meeting for the year Hear from @lgmartelotto.bsky.social and @swbioinf.bsky.social and chat all things single cell or #SpatialOmics! www.biocommons.org.au/events/sc-sp... #bioinformatics #lifescience

Some years ago I wrote some code that produced a bad plot. I then had to endure multiple people having to explain my bad plot in seminars.

Starting today, The Carpentries will be active on BlueSky! 🥳 We'll continue to maintain our Mastodon (hachyderm.io/@thecarpentr...) and our LinkedIn (www.linkedin.com/company/the-...) accounts, and the content we post on here will be the same as the content we post on Mastodon. Excited to be here!

I just ran my 1.5 hour HTML+SVG+JavaScript+D3 workshop at ResBaz Victoria 2024. Touched on hopefully most of the important ideas needed for an interactive web-based data visualization. #resbaz #resbazvic #resbazvic2024 pfh.github.io/jsfoot/

Our paper "A Plot is Worth a Thousand Tests: Assessing Residual Diagnostics with the Lineup Protocol" has appeared in issue 3 of JCGS, www.tandfonline.com/doi/full/10.... . It is open access for now, but if you ever have problems accessing you'll find the preprint at doi.org/10.48550/arX...

I've begun a statistical graphics starter pack at go.bsky.app/EMb8xWf . Please help me find more people or feeds to add to this pack. #rstats #visualisation #statistics 📈

Pondering FDR control. Procedures like BH represent a certain policy, which can be confirmed to work by simulation for any given number of true discoveries. The actually achieved FDR guarantee is the FDR obtained in the worst case.
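
A sketch of that simulation idea in plain Python. It assumes independent one-sided z-tests, a fixed effect size for the true discoveries, and a small grid of values for the number of true discoveries. All of these are illustrative assumptions, as are the function names.

```python
import math
import random


def benjamini_hochberg(pvalues, q):
    """Return the set of indices rejected by the BH step-up procedure at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its BH threshold.
        if pvalues[idx] <= rank * q / m:
            cutoff = rank
    return set(order[:cutoff])


def simulated_fdr(m, m_true, q=0.1, effect=3.0, reps=400, seed=0):
    """Average false discovery proportion over simulated experiments."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # The first m_true tests have a real effect; p = P(Z > z) under the null.
        p = [0.5 * math.erfc(rng.gauss(effect, 1.0) / math.sqrt(2.0))
             for _ in range(m_true)]
        # The remaining tests are true nulls with uniform p-values.
        p += [rng.random() for _ in range(m - m_true)]
        rejected = benjamini_hochberg(p, q)
        false = sum(1 for i in rejected if i >= m_true)
        total += false / max(len(rejected), 1)
    return total / reps


# Worst case over a few assumed numbers of true discoveries.
worst = max(simulated_fdr(100, k) for k in (0, 10, 50, 90))
```

Here `worst` plays the role of the achieved guarantee: the largest simulated FDR over the scenarios tried. A real check would use a finer grid and the dependence structure of the actual data.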

Our collaborative work with Narry Kim on the kinetics of mRNA poly(A) tail shortening. Led by Young-suk Lee, who established his own lab while the work was in progress, and Yevgen Levdansky in my lab. #RNASky #RNABiology www.nature.com/articles/s41...

Something that should exist: geom_principal_curve() If x and y have a symmetric relationship, eg both x and y are noisy measurements of an underlying hidden variable, geom_smooth will under-estimate the slope, just like linear regression.
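
A small simulation of the slope under-estimation, in plain Python. It assumes x and y are both a hidden variable plus independent noise of equal variance, so the true slope is 1. As a symmetric alternative it uses the first principal axis, which is only a straight-line stand-in for a principal curve, not an implementation of the proposed `geom_principal_curve()`. The function name `slopes` is made up for this example.

```python
import math
import random


def slopes(n=5000, noise_sd=0.5, seed=0):
    """Return (OLS slope of y on x, slope of the first principal axis)."""
    rng = random.Random(seed)
    hidden = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Both variables are noisy measurements of the same hidden value.
    x = [t + rng.gauss(0.0, noise_sd) for t in hidden]
    y = [t + rng.gauss(0.0, noise_sd) for t in hidden]

    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

    # Ordinary least squares treats x as noise-free, which shrinks the slope.
    ols = sxy / sxx
    # Direction of largest variance of the 2x2 covariance matrix
    # (orthogonal regression), which treats x and y symmetrically.
    principal = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return ols, principal
```

With these settings the expected OLS slope is 1 / (1 + 0.5^2) = 0.8, while the principal axis slope stays close to the true value of 1.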

sgdGMF is a general purpose matrix factorization package. Like PCA, but better, e.g. works well with count data. Should be fast, e.g. suitable for scRNA-Seq. Presented by @davi1893.bsky.social at #abacbs2024. github.com/CristianCast...

The Knockoff Framework looks very interesting as a broadly applicable method of maintaining an FDR. (Guannan Yang talked about this at #abacbs2024) web.stanford.edu/group/candes...

Jingyi Jessica Li demonstrates how to use permutations to test the accuracy of tSNE and UMAP #ABACBS2024

A nice video on power calculation by simulation. www.youtube.com/watch?v=vE8b...
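
For anyone who has not seen the technique, a minimal sketch in plain Python. This is not taken from the video. It assumes a two-group comparison with known unit variance, tested with a two-sided z-test at the 5% level. The function name and default values are illustrative.

```python
import math
import random


def simulated_power(n, effect, reps=2000, critical_z=1.96, seed=0):
    """Fraction of simulated experiments in which the z-test rejects the null."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Simulate one experiment: n observations per group.
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(effect, 1.0) for _ in range(n)]
        # z statistic for a difference in means, assuming unit variance.
        z = (sum(b) / n - sum(a) / n) / math.sqrt(2.0 / n)
        if abs(z) > critical_z:
            hits += 1
    return hits / reps
```

For example, `simulated_power(64, 0.5)` estimates the power to detect half a standard deviation with 64 samples per group. The appeal of the approach is that the simulated data and the test can be swapped for whatever the real design and analysis will be.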

Registrations for ResBaz Victoria 2024 at Monash University are now open. The program is looking quite varied; it's going to be an adventure! #resbaz #resbazvic2024 resbaz.github.io/resbazvic202...

New CRAN Task View: Dynamic Visualizations and Interactive Graphics #rstats #dataviz By Sherry Zhang, Dianne Cook @visnut.bsky.social URL: cran.r-project.org/view=Dynamic...

High FDRs from edgeR and DESeq2 for large datasets seem a little concerning. I have noticed edgeR has a tendency to call genes differentially expressed on the basis of a single sample with high expression (have not tried with latest version). genomebiology.biomedcentral.com/articles/10....

This week the Monash Genomics and Bioinformatics Platform did a bulk RNA-Seq workshop, covering end-to-end from experimental design, through library preparations, running an analysis pipeline, and digging into differential expression.