ThreadSky
instlatx64.bsky.social
•
43 days ago
In the #AVX512 world, that's just 8 uops on the critical path:
Comments
lemire.bsky.social
•
39 days ago
Nice, but does it have any practical benefit? I suspect that the fast scalar approach is faster...
instlatx64.bsky.social
•
38 days ago
I see only a small chance, and only on #AMD #Zen5 in terms of throughput, where the 2 PCMPs can run in parallel and KUNPCK* takes only 1 clock. Perhaps on future cores as well.