fclc.bsky.social
HPC, BLAS, I make things FAST
Standing on the shoulders of giants
TLDR; 🇨🇦🐧🧑🏼💻🚴🏎️🧗🏼 💩posting.
Haver of opinions that are all my own.
I mainly do #HPC #BLAS #AI #RVV and #clusters
Proud French Canadian, you’ll hear about it
(I help with HPC.social)
721 posts
916 followers
552 following
comment in response to
post
Ease of integration with various compilers, actual usage/tradeoffs in various scenarios, e.g. the "long tail latency" typically observed with some Ethernet implementations.
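For anyone unfamiliar with the "long tail latency" point, here is a minimal sketch of why it matters. The latency samples are hypothetical, made up purely for illustration (not measurements of any real fabric), and `percentile` is a helper written for this example using the nearest-rank method.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile of samples, with p in (0, 100]."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]

# Hypothetical message latencies in microseconds:
# 98 fast messages and 2 slow outliers (the "long tail").
latencies_us = [2.0] * 98 + [250.0] * 2

median_us = percentile(latencies_us, 50)
p99_us = percentile(latencies_us, 99)
mean_us = sum(latencies_us) / len(latencies_us)

print(f"median={median_us} us, mean={mean_us:.2f} us, p99={p99_us} us")
```

The median looks great while the 99th percentile is two orders of magnitude worse, which is what hurts tightly synchronized jobs where every rank waits on the slowest message.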
comment in response to
post
its*
🙃
comment in response to
post
More so from the sourcing side and ease of acquisition.
In a purely technical “give my users the best tool to enable the best science” sense, how do the various networking technologies stack up?
comment in response to
post
hpc.social does have an events calendar; the issue is primarily that people have to add events to it.
hpc.social/events/
comment in response to
post
Meanwhile I just landed in Taipei 😂
comment in response to
post
Tooling for this isn’t super pleasant, nor portable.
For Intel specifically, VTune has some nice tooling that helps here; look for the CPU branch metrics.
Off the cuff, there were two good ones on the specifics of branch-taken retirement rates.
comment in response to
post
BTW ..
@fclc.bsky.social
Looking this up on PPLX.ai brought me to this Substack -> fprox.substack.com/p/taxonomy-o.... Got a few nice @instlatx64.bsky.social style diagrams for RISC-V plus some other nice stuff too :)
comment in response to
post
VFMADD132SH
comment in response to
post
(Yes it's a wyvern, but styling Dragon as "Dwagon" is significantly more fun)
comment in response to
post
That’s the annoying part; they’re the least open modern Si player, trying to compete in the most open/data-driven field in Si, yet share *no* data
comment in response to
post
And the people who use -Ofast will have earned that behaviour
comment in response to
post
shorter version:
comment in response to
post
Need to go scorched earth on -Ofast.
It shouldn’t be available, and should ALWAYS be something that doesn’t get included in default optimization passes
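A minimal sketch of the kind of surprise at stake, assuming GCC's behaviour where -Ofast turns on -ffast-math and so lets the compiler reassociate floating-point arithmetic. Python never reorders anything itself, so the two orderings are written out by hand here, and the variable names are just for illustration.

```python
# Floating-point addition is not associative, so an algebraically
# "equal" reordering can change the result.
big, small = 1e16, 1.0

# Evaluated as written: small is lost when added to big
# (1e16 + 1 rounds back to 1e16), so the difference is 0.
as_written = (big + small) - big

# A reassociated form that a fast-math style optimizer is allowed
# to pick: the big terms cancel first and small survives.
reassociated = (big - big) + small

print(as_written, reassociated)
```

Same source expression on paper, two different answers, which is why silently opting users into this is a problem.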
comment in response to
post
This looks a lot like the proposed Matrix in Vector/integrated matrix extensions in RISC-V.
Effectively a way to overlay matrix capabilities using the existing register file, instead of changing/adding state the way AMX operates.