For me, WSL1 always worked fine on Windows for ARM. But Windows . This thread is about nested hypervisors. Windows without its own hypervisor features is crippled; this is not just about a separate Ubuntu VM in Parallels (which is easy). It's about the hypervisor-based security features in Windows, the VS Code WSL2 integration, ... This is, for example, why I still have a separate physical Windows machine (nowadays a Surface Laptop 7). I want to get rid of it and replace my current two machines with a single M4 Pro MacBook Pro, upgrading from my M2 MacBook Air.
Also, the performance of WSL1 is bad. Here is a comparison of llama.cpp running llama-2 CPU-only on the M2 (4 P-cores) vs. the Snapdragon X Elite (12 cores). The M2 runs macOS natively, Ubuntu and Windows each as a Parallels VM, and Ubuntu in WSL1 inside the Windows VM. The Snapdragon runs Windows natively and Ubuntu in WSL2.
TL;DR: WSL2/Ubuntu runs as fast as native Windows on e.g. the Snapdragon, whereas Parallels has a massive performance impact, with Windows+WSL1 being the worst case.
The llama-bench numbers were run as in llama.cpp GitHub issue #4167, but with the current version of llama.cpp:
| system | model | size | params | backend | ngl/CPU | test | t/s |
| --- | --- | ---: | ---: | --- | ---: | ---: | ---: |
| M2, macOS 15.1 native | llama 7B Q4_0 | 3.56 GiB | 6.74 B | Metal,BLAS | 0 / 4 | pp512 | 58.12 ± 2.41 |
| M2, macOS 15.1 native | llama 7B Q4_0 | 3.56 GiB | 6.74 B | Metal,BLAS | 0 / 4 | tg128 | 14.99 ± 0.14 |
| M2, macOS 15.1, Parallels 20.1.1, Ubuntu 24.04.1 | llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 4 | pp512 | 22.60 ± 0.58 |
| M2, macOS 15.1, Parallels 20.1.1, Ubuntu 24.04.1 | llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 4 | tg128 | 11.20 ± 0.92 |
| M2, macOS 15.1, Parallels 20.1.1, Windows 11 24H2 | llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 4 | pp512 | 22.18 ± 0.50 |
| M2, macOS 15.1, Parallels 20.1.1, Windows 11 24H2 | llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 4 | tg128 | 12.40 ± 0.32 |
| M2, macOS 15.1, Parallels 20.1.1, Windows 11 24H2, WSL1, Ubuntu 24.04 | llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 4 | pp512 | 9.39 ± 0.36 |
| M2, macOS 15.1, Parallels 20.1.1, Windows 11 24H2, WSL1, Ubuntu 24.04 | llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 4 | tg128 | 6.33 ± 0.68 |
| Snapdragon X Elite, Windows 11 24H2 | llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 12 | pp512 | 63.53 ± 6.79 |
| Snapdragon X Elite, Windows 11 24H2 | llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 12 | tg128 | 20.60 ± 2.40 |
| Snapdragon X Elite, Windows 11 24H2, WSL2, Ubuntu 24.01 | llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 12 | pp512 | 61.75 ± 8.94 |
| Snapdragon X Elite, Windows 11 24H2, WSL2, Ubuntu 24.01 | llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 12 | tg128 | 21.48 ± 3.99 |
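
To put rough numbers on the TL;DR, here is a minimal Python sketch that divides each virtualized result by the native result on the same hardware. The t/s means are copied from the table above; the dictionary keys and the `relative_throughput` helper are just my own labels for this illustration. One caveat: the M2 native run used the Metal,BLAS build (with ngl 0) while the VMs used the plain CPU build, so that baseline is probably not perfectly like-for-like.

```python
# Throughput in tokens/s (means only), copied from the llama-bench table above.
# The short configuration keys are hypothetical labels chosen for this sketch.
RESULTS = {
    "m2_native":            {"pp512": 58.12, "tg128": 14.99},
    "m2_parallels_ubuntu":  {"pp512": 22.60, "tg128": 11.20},
    "m2_parallels_windows": {"pp512": 22.18, "tg128": 12.40},
    "m2_parallels_wsl1":    {"pp512": 9.39,  "tg128": 6.33},
    "x_elite_native":       {"pp512": 63.53, "tg128": 20.60},
    "x_elite_wsl2":         {"pp512": 61.75, "tg128": 21.48},
}

# Which native run each virtualized configuration is compared against.
BASELINES = {
    "m2_parallels_ubuntu":  "m2_native",
    "m2_parallels_windows": "m2_native",
    "m2_parallels_wsl1":    "m2_native",
    "x_elite_wsl2":         "x_elite_native",
}


def relative_throughput(config: str, test: str) -> float:
    """Return the throughput of `config` as a fraction of its native baseline."""
    baseline = BASELINES[config]
    return RESULTS[config][test] / RESULTS[baseline][test]


def report() -> list[str]:
    """Build one line per (configuration, test) pair with the relative throughput."""
    lines = []
    for config in BASELINES:
        for test in ("pp512", "tg128"):
            ratio = relative_throughput(config, test)
            lines.append(f"{config:<22} {test}: {ratio:6.1%} of native")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

On these means, WSL1 under Parallels lands at roughly 16% of native prompt-processing throughput on the M2, while WSL2 on the Snapdragon stays within a few percent of native Windows, which is well inside the reported error bars.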