Also, when will they start using the M1's 16 Neural Engine cores for matrix multiplication? So far Parallels throws away 11 TOPS of M1 performance.