pub fn transpose_bitmatrix(input: &[u8], output: &mut [u8], rows: usize)
Transpose a bit matrix using AVX2.
This implementation is specifically tuned for transposing 128 x l matrices,
as done in OT protocols. Performance may be better if input is 16-byte
aligned and, on systems with 64-byte cache lines, the number of columns is
divisible by 512.
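To make the operation concrete, here is a minimal scalar sketch of a bit-matrix transpose. It is not the AVX2 implementation. The function name transpose_bitmatrix_scalar is hypothetical, and the memory layout it uses (row-major, with the least significant bit of each byte first) is an assumption; this page does not state the layout or bit order that transpose_bitmatrix uses.

```rust
/// Hypothetical scalar reference, NOT the AVX2 routine documented here.
/// Assumed layout: row-major, bit (r, c) stored at flat bit index r * cols + c,
/// with the least significant bit of each byte first.
fn transpose_bitmatrix_scalar(input: &[u8], output: &mut [u8], rows: usize) {
    assert_eq!(input.len(), output.len());
    assert!(rows > 0 && input.len() % rows == 0);
    // Each row occupies input.len() / rows bytes, i.e. 8 times as many bits.
    let cols = input.len() / rows * 8;

    output.fill(0);
    for r in 0..rows {
        for c in 0..cols {
            // Read bit (r, c) of the rows x cols input matrix.
            let src = r * cols + c;
            let bit = (input[src / 8] >> (src % 8)) & 1;
            // Write it to position (c, r) of the cols x rows output matrix.
            let dst = c * rows + r;
            output[dst / 8] |= bit << (dst % 8);
        }
    }
}
```

A sketch like this is mainly useful as a slow oracle in tests, provided its layout assumption is first checked against the real function.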
§Panics
If input.len() != output.len().
If the number of rows is less than 128.
If input.len() is not divisible by rows.
If the number of rows is not divisible by 128.
If the number of columns (= input.len() * 8 / rows) is not divisible by 8.
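The conditions above can be checked up front so that a caller gets an error instead of a panic. The following is a sketch only: check_transpose_args is a hypothetical helper, not part of this crate, and its error strings are illustrative.

```rust
/// Hypothetical helper mirroring the documented panic conditions.
/// Returns the number of columns on success, or a message naming the
/// first violated condition.
fn check_transpose_args(
    input_len: usize,
    output_len: usize,
    rows: usize,
) -> Result<usize, &'static str> {
    if input_len != output_len {
        return Err("input.len() != output.len()");
    }
    if rows < 128 {
        return Err("fewer than 128 rows");
    }
    if input_len % rows != 0 {
        return Err("input.len() is not divisible by rows");
    }
    if rows % 128 != 0 {
        return Err("rows is not divisible by 128");
    }
    // Equal to input_len * 8 / rows because rows divides input_len;
    // dividing first avoids overflow for very large inputs.
    let cols = (input_len / rows) * 8;
    // Always true once rows divides input_len; kept only to mirror the list.
    if cols % 8 != 0 {
        return Err("number of columns is not divisible by 8");
    }
    Ok(cols)
}
```

For example, check_transpose_args(2048, 2048, 128) yields Ok(128), which corresponds to a 128 x 128 bit matrix.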
§Safety
AVX2 instruction set must be available.
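One way to meet this requirement is to detect AVX2 at runtime before calling the function. The sketch below uses the standard library macro is_x86_feature_detected!; the helper names avx2_available and run_if_avx2 are hypothetical, and the closure stands in for the actual call to transpose_bitmatrix.

```rust
/// Returns true when the running CPU reports AVX2 support.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn avx2_available() -> bool {
    std::arch::is_x86_feature_detected!("avx2")
}

/// AVX2 is an x86 extension, so other architectures never have it.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn avx2_available() -> bool {
    false
}

/// Hypothetical guard: runs `f` only if AVX2 is available and returns its
/// result, otherwise returns None so the caller can pick a fallback.
fn run_if_avx2<T>(f: impl FnOnce() -> T) -> Option<T> {
    if avx2_available() {
        Some(f())
    } else {
        None
    }
}
```

A caller would place the transpose call inside the closure passed to run_if_avx2 and use a portable implementation when None is returned.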