Parallels needs to implement a workaround for disk corruption with Linux VMs

Discussion in 'Linux Virtual Machine' started by Will Dormann, Nov 24, 2023.

  1. Will Dormann

    There has been a lengthy discussion about disk corruption that can occur in Linux virtual machines running on an Apple Silicon platform, particularly when the guest uses a filesystem modern enough to detect silent corruption (e.g. BTRFS). I'm not 100% certain whether this is a Linux-only bug, an Apple Hypervisor Framework bug, or some unlucky combination of the two. The important part is that there is a workaround that avoids the corruption. Specifically, creating the VZDiskImageStorageDeviceAttachment with
    cachingMode: .cached
    avoided corruption in my testing, and in the testing of the others in the thread.

    Nearly all of my tests were using Parallels, so Parallels is definitely affected by this problem. Whether or not a user sees disk corruption seems to depend on various attributes of the underlying disk backing for the VM (e.g. timing, latency, speed, or some combination of these).
    There are complete details on how to reproduce the bug (essentially, run stress-ng --iomix 4 in a Linux VM), but more importantly there is a workaround that has been proven to work. So rather than relying on Linux, Apple, or both to sort out the actual cause of the problem, it is possible to avoid the problem until that happens. Specifically, the workaround details are here:
    https://github.com/utmapp/UTM/issues/4840#issuecomment-1824340975

    If you wish to test this workaround, you can get Apple's Linux VM demo, and modify GUILinuxVirtualMachineSampleApp/AppDelegate.swift to look like this:
    private func createBlockDeviceConfiguration() -> VZVirtioBlockDeviceConfiguration {
        // Workaround: request .cached explicitly instead of relying on the default caching mode.
        guard let mainDiskAttachment = try? VZDiskImageStorageDeviceAttachment(url: URL(fileURLWithPath: mainDiskImagePath), readOnly: false, cachingMode: .cached, synchronizationMode: .full) else {
            fatalError("Failed to create main disk attachment.")
        }

        return VZVirtioBlockDeviceConfiguration(attachment: mainDiskAttachment)
    }


    With this change, stress-ng --iomix 4 can be run indefinitely without experiencing disk corruption. Without this workaround in place, the disk will be corrupted in less than a minute of the test running.
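    For anyone who wants to compare caching modes side by side without rebuilding the demo each time, the mode could be picked at launch. The following is only a minimal sketch: the helper names cachingMode(named:) and makeBlockDevice(diskImagePath:cachingMode:) and the VM_DISK_CACHING_MODE environment variable are hypothetical and are not part of Apple's sample or of Parallels. It also assumes that .automatic matches the behavior you get when no caching mode is specified.

```swift
import Foundation
import Virtualization

/// Hypothetical helper: maps a string such as "cached" to a caching mode.
/// Missing or unrecognized values fall back to .automatic (assumed here
/// to be equivalent to not specifying a caching mode at all).
@available(macOS 12.0, *)
func cachingMode(named name: String?) -> VZDiskImageCachingMode {
    switch name?.lowercased() {
    case "cached":   return .cached
    case "uncached": return .uncached
    default:         return .automatic
    }
}

/// Hypothetical helper: builds the main disk device with the requested
/// caching mode. Throws if the disk image cannot be attached.
@available(macOS 12.0, *)
func makeBlockDevice(diskImagePath: String,
                     cachingMode: VZDiskImageCachingMode) throws -> VZVirtioBlockDeviceConfiguration {
    let attachment = try VZDiskImageStorageDeviceAttachment(
        url: URL(fileURLWithPath: diskImagePath),
        readOnly: false,
        cachingMode: cachingMode,
        synchronizationMode: .full
    )
    return VZVirtioBlockDeviceConfiguration(attachment: attachment)
}
```

    With something like this in place, createBlockDeviceConfiguration() could call makeBlockDevice(diskImagePath: mainDiskImagePath, cachingMode: cachingMode(named: ProcessInfo.processInfo.environment["VM_DISK_CACHING_MODE"])), and the same build could be launched once per mode while stress-ng --iomix 4 runs in the guest.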
    As far as Parallels is concerned, end users cannot implement this workaround. The Parallels developers must implement this change in order to avoid disk corruption.
     
