Severe performance degradation with jdk10 / java10

Discussion in 'Windows Virtual Machine' started by MaSc, Apr 21, 2018.

  1. MaSc

    MaSc Member

    Messages:
    40
    * Guest: Windows 10 (latest)
    * Host: macOS 10.13.4

    Here's a CDM benchmark on a native system with a Samsung 840 Pro (4K sequential and 4K random):
    [IMG]

    And here's the CDM benchmark on Parallels with NVMe storage, which is 4 times faster than the 840 Pro:
    [IMG]

    As you can see, while sequential access is still 3-4 times faster than the 840 Pro on a native system (which is expected), the performance of small random writes is severely degraded: it's 4 times slower despite running on flash that is 4 times faster.

    Software development is basically impossible on this VM, as it slows down to a crawl.
    What do you recommend to mitigate this issue?

    If anyone could share their CDM benchmark running Windows 10 on Parallels with NVMe flash, I'd appreciate it.
     
  2. MaSc

    MaSc Member

    Messages:
    40
    It may also be worth noting that the host is running on APFS.
     
  3. Arno1

    Arno1 Parallels Developers

    Messages:
    23
    The particular configuration you are trying to benchmark is not quite clear from your description. Could you please generate a problem report (as per https://kb.parallels.com/en/9058) for the VM right after completion of the CDM test and post its ID in this thread?

    Could you please describe a particular scenario in which you suffer from the insufficient performance?
     
  4. MaSc

    MaSc Member

    Messages:
    40
    252952981

    It's a Java build. I'm running the exact same project on the host, where a clean build takes around one minute.
    Code:
    ./gradlew --no-build-cache clean classes
    BUILD SUCCESSFUL in 1m 1s
    
    On the Windows 10 VM, which runs on the same host with the same underlying storage, the same command takes an excruciating 15 minutes to complete:
    Code:
    ./gradlew --no-build-cache clean classes
    BUILD SUCCESSFUL in 15m 29s
    
    When observing the performance metrics, it's conspicuous that the guest is mostly idle on CPU and disk I/O.
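
    To put a number on the gap, the two Gradle durations above can be converted to seconds and compared. This is only a minimal sketch: the helper name `to_seconds` is made up for illustration, and it handles only the "Nm Ns" form shown in the outputs above.

```shell
#!/usr/bin/env bash
# Convert a Gradle-style duration such as "15m 29s" into seconds.
# Hypothetical helper; only the "Nm Ns" form is handled.
to_seconds() {
    local total=0 part
    for part in $1; do
        case "$part" in
            *m) total=$(( total + ${part%m} * 60 )) ;;
            *s) total=$(( total + ${part%s} )) ;;
        esac
    done
    echo "$total"
}

host=$(to_seconds "1m 1s")      # clean build on the macOS host
guest=$(to_seconds "15m 29s")   # same build in the Windows 10 VM

# Integer arithmetic only: scale by 10 to keep one decimal place.
ratio_x10=$(( guest * 10 / host ))
echo "host=${host}s guest=${guest}s slowdown=$(( ratio_x10 / 10 )).$(( ratio_x10 % 10 ))x"
```

    With the numbers from this post, that works out to a slowdown of roughly 15x in the guest.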
     
  5. MaSc

    MaSc Member

    Messages:
    40
    Also, this is not a new project, and it has been performing well for years on this Parallels VM, so this must be some kind of regression related to updates from recent months.

    Since I haven't run it on Windows 10 for a while, I cannot tell you precisely which update (Java, Windows, Parallels or macOS) could be related.
    Disabling Windows Defender or expanding the disk does not make a significant difference.
     
  6. Arno1

    Arno1 Parallels Developers

    Messages:
    23
    Thank you. According to the problem report, there are no indications of the conventional causes of performance degradation. However, I agree that the build performance compared to the host is far lower than expected. Please note that there is a suspicious assertion in the kernel log in the report, "AppleNVMe Assert failed: deviceID == kNVMeDeviceIDANS2", but its cause and impact are not clear.

    Let's try the following steps in order to identify the culprit:
    1. Could you please collect a system-wide spindump (or a couple of them) while the build is running? For example, run "sudo spindump" and attach the generated files in this thread. According to the report, there are plenty of third-party kernel extensions loaded by OS X. If any of them hinders OS X disk I/O, it may show up in the samples.
    2. Could you please temporarily uninstall Parallels Tools, reboot, run a project build and generate a new problem report? Parallels Tools may be installed again afterwards. To uninstall, run C:\Program Files (x86)\Parallels\Parallels Tools\uninstall.exe and reboot. If it makes no difference, that rules out all the guest tools.
    3. Is it an issue with one particular project, or does any open-source project exhibit the same level of degradation in your environment? This is essential for any attempt to replicate the issue.
     
  7. MaSc

    MaSc Member

    Messages:
    40
    I don't experience any degradation (at all) on macOS, and the mentioned project as well as generic benchmarks perform as expected.
    Are you sure it makes sense to troubleshoot the host when there's no indication of an issue on that level?

    I just tested without Parallels Tools; it doesn't make a difference.

    Gradle initialization time is already massively increased, which suggests the issue is rather generic (which would also be confirmed by the degraded generic benchmarks I posted initially).
     
  8. MaSc

    MaSc Member

    Messages:
    40
    Update: Java 10 seems to be very much involved here. I downgraded to JDK8, which yields:
    Code:
    ./gradlew --no-build-cache clean classes
    BUILD SUCCESSFUL in 3m 29s
    for the same project.
    I don't see this degradation with JDK10 on macOS and Linux, though.
     
  9. MaSc

    MaSc Member

    Messages:
    40
    Meanwhile, I verified Java 10 on native Win10, which shows no degradation.
    So this seems to boil down specifically to Java 10 + Parallels.
     
  10. Arno1

    Arno1 Parallels Developers

    Messages:
    23
    I have tried to replicate the difference of the build time with jdk8 vs jdk10 in Parallels Desktop. I have tried to use the groovy project to benchmark the build speed (https://git-wip-us.apache.org/repos/asf/groovy.git). What I have got is as follows:

    network enabled in VM, JDK10, 2 VCPUs assigned: 13m 34s
    network enabled in VM, JDK10, 4 VCPUs assigned: 11m 53s
    network disabled in VM, JDK10, 4 VCPUs assigned: 4m 3s
    network disabled in VM, JDK10, 2 VCPUs assigned: 6m 28s

    network enabled in VM, JDK8, 2 VCPUs assigned: 12m 11s
    network enabled in VM, JDK8, 4 VCPUs assigned: 11m 35s
    network disabled in VM, JDK8, 4 VCPUs assigned: 3m 16s
    network disabled in VM, JDK8, 2 VCPUs assigned: 3m 55s

    There is no significant difference in build times between JDK8 and JDK10 in my setup, though JDK10 is consistently slower. It seems like JDK10 builds would benefit more from additional VCPUs assigned to the VM. But unexpectedly, that impact is marginal compared to that of the network being enabled or disabled (no downloads occurred during any of these builds). I'll look into why it makes that much difference.
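
    For a quick view of how much the enabled network costs per configuration, the timings listed above can be converted to seconds and subtracted. This is just a rough sketch over the eight numbers from this post; the function name `network_overhead` and the table layout are chosen for illustration only.

```shell
#!/usr/bin/env bash
# Build times from the list above, converted to seconds:
# jdk, vcpus, network-enabled seconds, network-disabled seconds.
timings="
JDK10 2 814 388
JDK10 4 713 243
JDK8 2 731 235
JDK8 4 695 196
"

# Print how many seconds the enabled network adds in each configuration.
network_overhead() {
    local jdk vcpus on off
    printf '%s\n' "$timings" | while read -r jdk vcpus on off; do
        [ -z "$jdk" ] && continue   # skip the blank lines around the table
        echo "$jdk ${vcpus}vcpu: +$(( on - off ))s with network enabled"
    done
}

network_overhead
```

    In every row the network accounts for around 7 to 8 minutes, which is more than the whole build takes with the network disabled.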

    Unless there is some difference in the network configuration of the environments that you have benchmarked yourself, I am inclined to believe that the issue does not reproduce universally.

    I still believe that it makes sense to briefly troubleshoot the host, just in case, as I have described above.
     
  11. MaSc

    MaSc Member

    Messages:
    40
    Interesting, thanks.

    Here are my results from testing with the Groovy repository.
    I actually chose a different subset for comparison, as the runtimes of the complete build were really lengthy.
    Could you please also post the command line you used for testing?

    Here's my test run with JDK8 and performance metrics.
    I actually ran
    Code:
    ./gradlew clean dist
    once to download all deps, so I can use --offline in subsequent test runs.

    JDK8
    Code:
    export JAVA_HOME="C:\Program Files\Java\jdk1.8.0_172"
    ./gradlew --no-build-cache --no-daemon --offline clean :groovy-console:classes
    
    [IMG]
    The process took around 5 minutes to complete.

    JDK10
    Code:
    export JAVA_HOME="C:\Program Files\Java\jdk-10.0.1"
    ./gradlew --no-build-cache --no-daemon --offline clean :groovy-console:classes
    
    [IMG]
    When running with JDK10, the process was still running after 15 minutes. The CPU is idle most of the time.

    PS: I failed to stop the second performance graph in time before it rolled over, so its first seconds are missing / overwritten.

    So it seems this issue is not I/O related at all.
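
    To compare both JDKs under identical conditions, the two runs above could be timed back to back with a small wrapper. This is only a sketch: the function name `time_build` is hypothetical, the JAVA_HOME paths are the ones from this post, and it assumes it is started from the Groovy checkout in a bash shell.

```shell
#!/usr/bin/env bash
# Time one offline build under the given JDK and print the elapsed seconds.
# Hypothetical helper; the Gradle flags are the ones used in this post.
time_build() {
    local java_home="$1" start
    start=$(date +%s)
    JAVA_HOME="$java_home" ./gradlew --no-build-cache --no-daemon --offline \
        clean :groovy-console:classes > /dev/null
    printf '%s: %ss\n' "$java_home" "$(( $(date +%s) - start ))"
}

# Only run when started from a checkout that contains the Gradle wrapper.
if [ -x ./gradlew ]; then
    time_build "C:\Program Files\Java\jdk1.8.0_172"
    time_build "C:\Program Files\Java\jdk-10.0.1"
fi
```

    Using --no-daemon keeps each run in a fresh JVM, so one JDK's warm daemon cannot skew the other's result.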
     
  12. MaSc

    MaSc Member

    Messages:
    40
    I don't see a difference with or without network (switching to disconnected while the VM is running).
     
  13. MaSc

    MaSc Member

    Messages:
    40
    Also, the performance metrics displayed here are reflected quite accurately on the host as well.
    When using JDK10 and observing the host metrics, the host looks as idle as the guest. It seems like there are lots of wait cycles / locking / synchronisation.
     
  14. Arno1

    Arno1 Parallels Developers

    Messages:
    23
    I have just used your command to build groovy
    Code:
     ./gradlew --no-build-cache --no-daemon --offline clean :groovy-console:classes 
    and got 3m 32s for JDK8 and 4m 18s for JDK10 with 4 CPUs in the VM. Network availability stopped contributing to the build duration, apparently due to the --offline flag.

    I believe that we are using the same JDK versions for the tests.
     
  15. Arno1

    Arno1 Parallels Developers

    Messages:
    23
    On a side note, could you please try checking "Start up and shut down manually" under the "Startup and Shutdown" section of the Options tab in the configuration of the leoz-dev Ubuntu VM? Then quit Parallels Desktop, start it again, and check whether compilation speed in your Windows VM improves with JDK10. If it does not, could you please generate a problem report once again and provide its ID?
     
  16. MaSc

    MaSc Member

    Messages:
    40
    Changing the startup option of the Ubuntu VM doesn't make a difference.
    I asked a colleague with the same MBP and a similar setup for a benchmark and will post the results shortly.
     
  17. MaSc

    MaSc Member

    Messages:
    40
    Update: I couldn't replicate the issue with the same project on a second Win10 guest VM and macOS host (with the same hardware, including APFS), which confirms your results. I will check the alternative Win10 VM on my host and report the results in a few days.
     
  18. MaSc

    MaSc Member

    Messages:
    40
    I just verified with another Win10 VM that has the same project setup on my system.

    The regression does not occur there, which excludes the host and narrows it down to my own Win10 guest VM.
    Still odd though, as I reset the entire OS a few months ago, so it's pretty clean and also a rather minimal setup without any system software or specific drivers.

    Which troubleshooting steps do you suggest from here?
     
  19. Arno1

    Arno1 Parallels Developers

    Messages:
    23
    Since you have two VMs, one with and one without the reproduction, it is possible to rule out almost all VM settings as the culprit (although I have already reviewed the most prominent possibilities in the problem report that you sent). To do so:
    1. Copy the hard disk bundle from the development VM bundle (I believe that it is the Windows 10 Dev.pvm/Windows 8 EN-0.hdd folder). This is a precautionary step that avoids any impact on the development VM.
    2. Connect this copy as the hard drive source of the new VM, where the issue does not reproduce (Configuration->Hardware->Hard Drive).
    3. Try to reproduce the issue with the new VM and the old hard drive. Afterwards, the hard drive source of the new VM may be restored.

    There are three possible outcomes:
    1. The VM does not boot: this way of troubleshooting should then be abandoned.
    2. The issue does not reproduce: perhaps that solves the issue (although the copied hard drive bundle had better be moved into the new VM bundle for the sake of consistency).
    3. The issue reproduces: the culprit is the Windows patch level, Windows settings, or third-party software (which is unlikely, as per your claim).

    Still, I'd suggest creating a VM snapshot (in order to preserve the VM state, so it can be rolled back after the experiment completes) and removing the third-party software one by one, in order to identify whether it has any impact on the reproduction.

    Other than that, I can only suggest collecting xperf samples with stack traces (for example, as per https://blogs.msdn.microsoft.com/ntdebugging/2008/04/03/windows-performance-toolkit-xperf/) for both JDKs, so that they may be compared.
     
  20. MaSc

    MaSc Member

    Messages:
    40
    Thanks, that's an excellent idea. I could actually simply switch the HD image without copying, but I had to set `vm.bios.efi=0` to be able to boot. Doing so actually resolves the issue, so it must be something in the VM's setup / settings.

    As this is (presumably) a Parallels issue, would you like to investigate further?

    PS: Is there any way to get EFI working again after swapping the HD image?
     
