//! Let the compiler auto-vectorize for portable SIMD.
Regular (scalar) processing: a teacher grades one exam at a time. Read question 1, check answer, move to the next. SIMD processing: the teacher has a transparent overlay with all correct answers.
Is low-level programming a sin or a virtue? It depends. When programming for using vector processing on a modern processor, ideally I’d write some code in my favorite language and it would run as fast ...