Vector units on consoles have basically the same throughput as consumer-grade CPUs, I think. There have been several attempts at making SIMD code look like scalar code in Rust, but none that I know of are ubiquitous. Autovectorization is hard for animation because you have to go AoSoA (array of structures of arrays).
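For anyone who hasn't seen AoSoA, here's a rough sketch of the layout change. The names (`JointAos`, `JointBlock`, `to_aosoa`, `scale_all`) and the lane count of 4 are made up for illustration, not taken from any real animation library. The idea is that each block holds one field for several joints contiguously, so the inner loop over lanes is the kind of thing an autovectorizer has a decent chance of turning into vector instructions.

```rust
/// Hypothetical lane count; real code would pick this per target.
pub const LANES: usize = 4;

/// AoS: all fields of one joint sit next to each other.
#[derive(Clone, Copy)]
pub struct JointAos {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

/// AoSoA: a block of LANES joints, each field stored as a small array.
#[derive(Clone, Copy, Default)]
pub struct JointBlock {
    pub x: [f32; LANES],
    pub y: [f32; LANES],
    pub z: [f32; LANES],
}

/// Repack AoS data into AoSoA blocks; unused lanes in the last block stay zero.
pub fn to_aosoa(joints: &[JointAos]) -> Vec<JointBlock> {
    joints
        .chunks(LANES)
        .map(|chunk| {
            let mut block = JointBlock::default();
            for (lane, joint) in chunk.iter().enumerate() {
                block.x[lane] = joint.x;
                block.y[lane] = joint.y;
                block.z[lane] = joint.z;
            }
            block
        })
        .collect()
}

/// Scale every joint. Each inner loop has a fixed trip count over
/// contiguous f32s, with no branches and no side effects.
pub fn scale_all(blocks: &mut [JointBlock], s: f32) {
    for block in blocks.iter_mut() {
        for lane in 0..LANES {
            block.x[lane] *= s;
            block.y[lane] *= s;
            block.z[lane] *= s;
        }
    }
}
```

Whether the compiler actually vectorizes `scale_all` depends on the target and optimization level, so treat this as a layout illustration rather than a performance claim.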
it's kind of a shame that rust is pervasively impure, because that makes it very hard to force programs into the exact shape that makes it easy to fully vectorize a function.
I do hope Rust implements some kind of effect system (as has been proposed). I'd love to write allocation-free, panic-free, etc. software and actually have the tools to enforce it.
the compiler already does these analyses. proving functions nounwind already happens in llvm. the only thing this gets you is enforcement in the frontend, which can actually lead to API fragility
I'm actually pretty anti stuff like frontend-level no-alloc because that's not a thing that people should be promising their callers. it's none of my callers' business if I allocate, only that I not be wasteful.
side effects introduce the need to branch, which is an impediment to converting a computation into a circuit that can then be replicated a bunch of times in parallel across the lanes of vector registers
you can eliminate branches if neither outgoing edge has side effects by just executing both branches and selecting the output you want. similarly you can (partly) eliminate loops by unroll-and-jamming them and masking off remainders and stuff like that
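a small sketch of both tricks, written as plain scalar rust so it runs anywhere. `select_lane`, `map_lanes`, and the lane count of 4 are hypothetical names and values picked for the example. the first function computes both arms of a branch and picks one with a mask, which is only safe because neither arm has side effects. the second handles the leftover elements by padding to a full vector and storing only the live lanes, standing in for what a masked vector store would do.

```rust
/// Hypothetical lane count for the illustration.
pub const LANES: usize = 4;

/// If-conversion by hand: evaluate both arms, then select with a mask.
/// Equivalent to `if x < 0 { -x } else { 2 * x }` (with wrapping arithmetic).
pub fn select_lane(x: i32) -> i32 {
    let negated = x.wrapping_neg(); // arm for x < 0
    let doubled = x.wrapping_mul(2); // arm for x >= 0
    let mask = -((x < 0) as i32); // all ones if x < 0, otherwise zero
    (negated & mask) | (doubled & !mask)
}

/// Apply `select_lane` in groups of LANES, with a masked remainder
/// instead of a separate scalar tail loop.
pub fn map_lanes(input: &[i32], out: &mut [i32]) {
    assert_eq!(input.len(), out.len());

    let mut in_chunks = input.chunks_exact(LANES);
    let mut out_chunks = out.chunks_exact_mut(LANES);

    // Full groups: fixed trip count, no branches in the body.
    for (src, dst) in (&mut in_chunks).zip(&mut out_chunks) {
        for lane in 0..LANES {
            dst[lane] = select_lane(src[lane]);
        }
    }

    // Remainder: pad to a full group, compute every lane,
    // then write back only the lanes that are actually live.
    let rem_in = in_chunks.remainder();
    let rem_out = out_chunks.into_remainder();
    let mut padded = [0i32; LANES];
    padded[..rem_in.len()].copy_from_slice(rem_in);
    let mut result = [0i32; LANES];
    for lane in 0..LANES {
        result[lane] = select_lane(padded[lane]);
    }
    rem_out.copy_from_slice(&result[..rem_out.len()]);
}
```

computing the dead lanes in the remainder is wasted work, but it is harmless precisely because `select_lane` is pure. if it wrote to memory or could panic, the padded lanes would no longer be safe to evaluate.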
it can always be better, of course.
https://hpac.cs.umu.se/teaching/sem-accg-14/llvm-vectorization.pdf