Parallels Desktop for Mac computers with Apple silicon M4 chips

Discussion in 'Parallels Desktop on a Mac with Apple silicon' started by Mikhail Ushakov, Oct 30, 2024.

  1. cmarinas (Bit poster)

    TL;DR warning: some thinking out loud, don't read unless you're interested in the Linux kernel fork() mechanism and the Arm architecture.

    fork() in Linux duplicates the parent page tables into the child process while marking the PTEs read-only. In the parent process, there's a single TLBI ASIDE1IS at the end of the page table copy (and before the child is started). There is no need for a TLBI in the child process since it starts with its own ASID and presumably no stale TLB entries for the new ASID (when the ASIDs run out, there's a full local TLBI on each CPU - TLBI VMALLE1; we call this a roll-over event).

    The stack smashing check failure suggests that copy-on-write (CoW) does not always happen for the stack page when both the parent and the child process access it shortly after fork(). The stack is likely the first page accessed after the fork() and, when the bug triggers, either the parent or the child succeeds in writing it without triggering a permission fault into the kernel (for CoW). This typically happens if there are stale TLB entries.
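
    To make the expected behaviour concrete, here is a minimal user-space sketch, assuming a POSIX system. The function names (cow_stack_check, cow_stress) and the canary values are hypothetical and chosen only for illustration. It is not a reproducer for the M4 problem; it only shows the invariant that CoW is supposed to guarantee: after fork(), a write to a stack variable in one process must never be visible in the other.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork once, then let parent and child each write a different value to the
 * same stack variable. With working CoW each process gets a private copy of
 * the stack page on its first write.
 * Returns 0 if both processes saw only their own value, non-zero otherwise.
 */
int cow_stack_check(void)
{
    /* volatile so the compiler really stores to and reloads from the stack */
    volatile unsigned long canary = 0x1111UL;

    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork() itself failed */

    if (pid == 0) {
        canary = 0x2222UL;              /* child's write: should fault and CoW */
        _exit(canary == 0x2222UL ? 0 : 1);
    }

    canary = 0x3333UL;                  /* parent's write: should fault and CoW */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return 1;                       /* child saw an unexpected value */

    return canary == 0x3333UL ? 0 : 2;  /* parent must still see its own write */
}

/* Repeat the check many times, loosely mimicking a fork-heavy shell script. */
int cow_stress(int iterations)
{
    int failures = 0;
    for (int i = 0; i < iterations; i++)
        if (cow_stack_check() != 0)
            failures++;
    return failures;
}
```

    On a correctly behaving system cow_stress() should report zero failures for any iteration count; a non-zero result would point at the kind of missed CoW described above.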

    I think we have two main scenarios after fork():
    1. The parent writes the stack without CoW. Since we had a TLBI ASIDE1IS already, that's very unlikely, especially if the parent is not migrated to another vCPU (which may run on another CPU). There's a small chance that the parent migrated to another vCPU (on a different physical CPU) and the TLBI ASIDE1IS did not get propagated there for some hardware reason, but I find this unlikely.
    2. The child writes the stack without CoW. This would not be possible if the TLB is empty for the new ASID. However, we can have an ASID roll-over given that Apple Silicon only exposes 256 ASIDs, at least to the VM (a shell script with lots of forking would quickly run through them). Sub-scenario (a): if M4 does some TLB sharing between CPUs but the local TLBI (non-inner-shareable) that Linux does on ASID roll-over doesn't invalidate all such shared TLBs, things can go wrong with stale TLB entries. A more likely possibility is (b): the hypervisor framework does not properly invalidate the TLB when multiplexing multiple vCPUs on the same CPU.
    2.b is important as Linux assumes that the TLBs are private to a (v)CPU and can do a local TLBI on ASID roll-over, deferred until the next context switch on a (v)CPU. If M4 has much larger TLBs and the hypervisor framework is missing proper maintenance when multiplexing vCPUs on a CPU, things can go wrong, especially when fork() and CoW are heavily involved. Actually, this could even trigger scenario 1 if the parent is scheduled out and back in again, requiring an ASID roll-over.
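
    The roll-over mechanism can be sketched as a toy model. This is a simplified illustration, not the kernel's actual allocator: the struct and function names, the 4-vCPU count and the decision to reserve ASID 0 are all assumptions made for the example. It shows only the two properties that matter here: with 8-bit ASIDs the space runs out after a few hundred new address spaces, and the resulting flush is local to each (v)CPU and deferred until that (v)CPU next switches context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ASID_BITS 8u
#define NUM_ASIDS (1u << ASID_BITS)  /* 256, as reported for the VM above */
#define NUM_CPUS  4u                 /* hypothetical vCPU count */

struct asid_state {
    uint64_t generation;              /* bumped on every roll-over */
    unsigned next;                    /* next free ASID in this generation */
    bool     flush_pending[NUM_CPUS]; /* local TLB flush owed by each CPU */
};

struct mm_ctx {
    uint64_t generation;  /* generation the ASID belongs to; 0 = none yet */
    unsigned asid;
};

void asid_init(struct asid_state *s)
{
    memset(s, 0, sizeof(*s));
    s->generation = 1;
    s->next = 1;          /* ASID 0 is treated as reserved in this model */
}

/* Give mm a valid ASID. Returns true if this call caused a roll-over. */
bool asid_assign(struct asid_state *s, struct mm_ctx *mm)
{
    bool rolled = false;

    if (mm->generation == s->generation)
        return false;     /* ASID from the current generation is still valid */

    if (s->next == NUM_ASIDS) {
        /* Out of ASIDs: start a new generation and make every CPU owe a
         * local flush, to be performed at its next context switch. */
        s->generation++;
        s->next = 1;
        for (unsigned cpu = 0; cpu < NUM_CPUS; cpu++)
            s->flush_pending[cpu] = true;
        rolled = true;
    }

    mm->asid = s->next++;
    mm->generation = s->generation;
    return rolled;
}

/* Context switch on one CPU. Returns true if the deferred flush ran. */
bool cpu_switch_in(struct asid_state *s, unsigned cpu)
{
    if (!s->flush_pending[cpu])
        return false;
    s->flush_pending[cpu] = false;  /* stands in for a local TLBI VMALLE1 */
    return true;
}
```

    In this model the flush only covers the TLB that the (v)CPU believes is its own. If the physical TLB is shared, or holds entries left behind by another vCPU multiplexed on the same CPU, a reused ASID can hit stale entries, which is the concern in scenario 2.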

    FWIW, Linux/KVM had a bug in this area, fixed about 8 years ago - https://lore.kernel.org/all/20161103222706.24129-1-marc.zyngier@arm.com/.
     
  2. LiquidV (Junior Member)

    I can confirm 1 core works with Ubuntu Desktop ARM64 24.04.1 on MBP M4 Pro.
     
  3. Freddy2 (Junior Member)

    @Mikhail Ushakov,
    That is nice to know, but are you going to patch version 19 or not?
    Again, I'm not asking to get version 20 for free, just minimum user support for a product that's not even a year old.
    If you cannot fix the issue on previous versions, could you please offer a special discount for people like us (many in this chat)? It would be highly appreciated.

    Thanks,
    Fred
     
  4. Avinash Bundhoo (Staff Member)

    Hello,
    Thank you for reaching out.
    We have released a new update, Parallels Desktop 20.1.2, which fixes this issue.
    Please install the latest update as soon as possible.
    Thanks
     
  5. charlesr11 (Bit poster)

    I agree. I bought v19 outright in Nov 2023 and now I find that the solution is to "future proof" by paying more money! Australian consumer laws ask a simple question: "Would you have bought the product if you had known of this issue?" (that it wouldn't work a year later).
     
  6. AlexeyS8 (Bit poster)

    Lovely write-up!
    I hope Parallels can fix Parallels Desktop, or Apple can fix the hypervisor.
    In any case, let's hope it is not an M4 hardware issue; otherwise it would be quite silly to upgrade and run on 1 CPU.
     
  7. Peter V.

    Just stop being cheap. People like you are annoying.

    macOS Sequoia was released on September 16th, 2024.
    Parallels 20 was released in September 2024.
    The Apple M4 MacBook Pro line was released on October 30th, 2024.
    The Apple M4 MacBook Pro line was released with macOS Sequoia.
    So it's easy... if you have an M4 MacBook Pro, then get Parallels 20.
    Parallels did not force you to buy an M4 MacBook Pro/M4 Mac mini, did they?

    Expecting legacy support for free (quote from you: "patch version 19") or expecting a (quote from you) "special discount", as you do, is just cheap.

    Parallels customers already get a discount if they choose to upgrade.
    Quote from a Parallels website: "For all one-time purchases and customers with a previous version of Parallels Desktop, upgrade to the latest version at a discounted rate."
    Link: https://www.parallels.com/products/desktop/buy/ -> "Upgrade"

    Otherwise...
    Again the quote from you: "fix the issue on previous versions"
    That would be called legacy support.
    Would you be willing to pay extra for legacy support? If you appreciated the time of the Parallels software developers, then you would pay extra. Why should the developers work for free?
    Have you ever thought about the possibility that the current pricing model of Parallels might be built on the idea of keeping the software price as low as possible, and therefore not offering legacy support?
     
  8. cmarinas (Bit poster)

    A quick update on the Linux front - I changed the TLB flushing code in the kernel to ignore the application ASID and use the all-ASID variants instead:
    TLBI VAE1IS -> VAAE1IS
    TLBI VALE1IS -> VAALE1IS
    TLBI ASIDE1IS -> VMALLE1IS
    (and similarly for the range operations)
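
    For reference, the substitution above can be written down as a small lookup table. This is only an illustrative sketch: the struct and function names are hypothetical, this is not the actual kernel patch, and only the three operations listed above are included.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row per substitution: per-ASID TLBI operation -> all-ASID variant. */
struct tlbi_map {
    const char *per_asid;
    const char *all_asid;
};

static const struct tlbi_map tlbi_widen[] = {
    { "VAE1IS",   "VAAE1IS"   },  /* by VA, one ASID       -> by VA, all ASIDs */
    { "VALE1IS",  "VAALE1IS"  },  /* by VA, last level     -> same, all ASIDs  */
    { "ASIDE1IS", "VMALLE1IS" },  /* everything for 1 ASID -> everything       */
};

/*
 * Return the all-ASID variant of a per-ASID operation name.
 * Names that are not in the table are returned unchanged.
 */
const char *tlbi_all_asid_variant(const char *op)
{
    for (size_t i = 0; i < sizeof(tlbi_widen) / sizeof(tlbi_widen[0]); i++)
        if (strcmp(op, tlbi_widen[i].per_asid) == 0)
            return tlbi_widen[i].all_asid;
    return op;
}
```

    The trade-off is over-invalidation: every flush now also drops entries belonging to unrelated ASIDs, which costs performance but removes any dependence on ASID-tagged entries being handled correctly.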
    It seems to be working fine with 10 vCPUs, no failures for my simple tests (typically /sbin/mkinitramfs -o /dev/null). I have not tried a full distro install as that comes with its own kernel.

    What does this mean? Probably stale TLB entries from incorrect handling by the Hypervisor Framework. The all-ASID TLB invalidation ensures that the forked process starts with clean TLBs, at least for those ranges invalidated in the parent space. Another possibility is that the hardware does not propagate the ASID+VMID invalidation correctly to other CPUs, though I'm sure this would have been seen by Apple engineers already. Not sure Parallels engineers can fix either of these but one thing to try would be to pin each vCPU to a physical CPU (no idea how to do this on macOS and the Hypervisor Framework). This would avoid problems with multiplexing two or more vCPUs on a single CPU and, if it works, we can rule out hardware bugs.
     
  9. Nul_l (Bit poster)

    Can confirm - setting any distro to 1 core removes all issues and allows all installs. Hopefully this can be addressed soon.
     
  10. Krystic (Bit poster)

    Yes, setting the CPU to single-core can resolve the crashing issue. I have successfully installed and used Ubuntu Desktop. Thank you!
    Hopefully, the official team will fix the multi-core bug soon.
     
  11. AlexeyS8 (Bit poster)

    It is not being cheap; it is expecting reasonable service and quality of support for something that is marketed as a premium product.
    By the way, Parallels 20 has the same issues and there is still no fix.
     
  12. AlexeyS8 (Bit poster)

    I do not think it is a hardware issue, but specifically a Parallels software issue.
    I can run the standard Docker.app on a Mac M4 and it builds my images with 8 vCPUs. Underneath, it is a Linux VM.
    I can also run VMware Fusion 13 and it runs Ubuntu 22/arm64 perfectly fine on 8 vCPUs.
    What does NOT work is any flavour of Ubuntu/arm64 on Parallels 20.
    These are all tests on the same MBP M4 Max.
     
