The most underutilized #AVX512 instruction, VPSHUFBITQMB, is actually very useful for compactly generating a mask with arbitrary criteria based on b[5:0], e.g. to select special characters for parsing: make the second operand a broadcast 64-bit bitmap in which the set bits mark the searched values.
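A minimal scalar sketch of that idea in Python, to make the bit-level behaviour concrete. It is a model, not the author's code and not SIMD: the names `build_bitmap` and `shufbit_mask_model` are hypothetical, and it assumes the same 64-bit bitmap is broadcast to every quadword, so each input byte simply looks up bit `b & 0x3F` of that bitmap. On hardware the corresponding intrinsic would be `_mm512_bitshuffle_epi64_mask` (AVX512_BITALG).

```python
def build_bitmap(searched: bytes) -> int:
    """Build the 64-bit table: bit (c & 0x3F) is set for each searched byte c."""
    bitmap = 0
    for c in searched:
        bitmap |= 1 << (c & 0x3F)
    return bitmap


def shufbit_mask_model(bitmap: int, data: bytes) -> int:
    """Scalar model of the mask produced for up to 64 input bytes.

    Assumes the bitmap is broadcast to all quadwords, so mask bit i is
    simply bit (data[i] & 0x3F) of the bitmap.
    """
    mask = 0
    for i, b in enumerate(data[:64]):
        mask |= ((bitmap >> (b & 0x3F)) & 1) << i
    return mask


# Illustrative choice of special characters: comma, semicolon, newline.
bitmap = build_bitmap(b",;\n")
mask = shufbit_mask_model(bitmap, b"a,b;c\n")
print(bin(mask))  # bits 1, 3 and 5 are set
```

Because only b[5:0] is consulted, bytes that share their low six bits with a searched value also match: in this sketch `l` (0x6C) hits the same bit as `,` (0x2C). A real parser would need to rule such bytes out separately or pick a character set where the aliasing is harmless.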
Comments
Tradeoffs are different on Zen 4, too.
@fclc.bsky.social
@geofflangdale.bsky.social
@haroldaptroot.bsky.social
@lemire.bsky.social
@nietras.bsky.social
@perforatedblob.bsky.social
@st01014.bsky.social
Sorry if I forgot anyone
For the specific case of Sep, I doubt this will be faster, given that the mask for each special character is used too. E.g. the mask for the column separator is used to check whether only separators are present and to read those out quickly, which is why Sep is fast for this common case.