yes, you're right, I should have picked 42, what was I thinking?
(sorry, Hitchhiker's Guide reference).
yep our cluster installed these 128-core beast machines, they are fantastic for big Stan models. I'm cooking a 7m obs model with ~50k parameters in 5 hours.
I’m honestly a little surprised bc my impression was that speed improvements from within-chain parallelization were nonmonotonic in the number of cores due to the fixed costs of each additional core, and 64 is way beyond the point where I thought performance declined. But maybe that’s outdated?
There is a fixed cost per core, but with very big data the gradient computation each core has to do increases as well, so the fixed cost matters less. This model still takes a while, but it goes from forever to a few days.
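To make that trade-off concrete, here is a rough toy sketch, not a measurement of Stan. It assumes the gradient work splits evenly across cores and that every extra core adds the same fixed cost. The function names and all the numbers are made up for illustration.

```python
def wall_time(work, cores, overhead_per_core):
    """Toy cost model (an assumption, not Stan's real behavior):
    the gradient work is split evenly, and each core adds a fixed cost."""
    return work / cores + overhead_per_core * cores


def best_core_count(work, overhead_per_core, max_cores=128):
    """Return the core count in 1..max_cores with the lowest modeled wall time."""
    return min(
        range(1, max_cores + 1),
        key=lambda p: wall_time(work, p, overhead_per_core),
    )


# Hypothetical workloads in arbitrary time units: same fixed cost, 100x more data.
small = best_core_count(work=100.0, overhead_per_core=1.0)
large = best_core_count(work=10_000.0, overhead_per_core=1.0)

print(f"small data: best core count = {small}")
print(f"large data: best core count = {large}")
```

In this model the curve is still nonmonotonic, but the best core count grows roughly like the square root of work divided by overhead, so the point where adding cores starts to hurt moves out as the data gets bigger.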