Observations of Parallels Beta 5

Discussion in 'Parallels Desktop for Mac' started by bbraun, Apr 27, 2006.

  1. bbraun

    bbraun Member

    Messages:
    47
    I've been trying to learn more about Parallels, how it works, how it does things, why it does certain things, etc. This is a summary of what I've discovered so far. I've broken up my observations into 2 groups: observations while Parallels is running in normal operation, and observations while Parallels is suspending/resuming. My observations began while investigating some threads mentioned here, so some of them have been pulled from my posts to other threads. I have added more and condensed my observations into a single post.

    I hope others find these observations useful, and I'd be happy if anyone would like to correct or clarify my observations. I've included information about my observation environment at the end of this post, and I'd be happy to help anyone reproduce my observations.

    Parallels normal operation:
    During normal guest OS operation, parallels does 4k+ reads and writes, asynchronously. The async writes are potentially done one per thread, as a pthread_kill is issued and a SIGUSR2 is caught for every write. This async write method would explain why every write() is preceded by an lseek() even for linearly sequential writes.
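
    To illustrate the lseek()/write() pattern: threads that share one descriptor also share its file offset, so each writer has to reposition before writing. A positioned write (pwrite) carries the offset in the call itself. This is only a sketch of the two patterns in Python, not Parallels' code; the function names are made up for illustration.

```python
import os
import tempfile

def write_with_seek(fd, offset, data):
    """Two syscalls per write: lseek() to the offset, then write()."""
    os.lseek(fd, offset, os.SEEK_SET)
    return os.write(fd, data)

def write_positioned(fd, offset, data):
    """One syscall per write: pwrite() takes the offset as an argument
    and leaves the descriptor's shared file offset untouched."""
    return os.pwrite(fd, data, offset)

def demo():
    """Write two sequential 4k blocks, one with each method, and read them back."""
    fd, path = tempfile.mkstemp()
    try:
        write_with_seek(fd, 0, b"A" * 4096)
        write_positioned(fd, 4096, b"B" * 4096)
        return os.pread(fd, 8192, 0)
    finally:
        os.close(fd)
        os.unlink(path)
```

    Both calls leave the same bytes in the file; the difference is the number of system calls and whether concurrent writers can disturb each other's offset.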

    By far, the most common syscall performed by Parallels is an ioctl on /dev/vm-main. At least with a Fedora Core 5 guest running gnome-terminal, there appear to be many ioctls per cursor flash. I'm guessing at least some of these ioctls are for video operations.

    The second most common syscall performed by Parallels is gettimeofday. These appear to happen almost once a second. Presumably for guest/host time synchronization.

    Another odd behavior is Parallels appears to statfs the user's home directory every few seconds (between 4 and 6 seconds). Presumably to see if the user's homedir is network mounted or some such. It's probably safe to say the user's home directory won't be changing filesystems once the app starts. All sorts of things will break if the user swaps homedirs out from underneath apps. Related to this, Parallels appears to use ~/.vmm-XXXXXX file(s) as temporary storage, which is immediately unlinked. Perhaps the constant statfs'ing of ~/ is part of an abstracted layer built around these temporary files. This .vmm-XXXXXX file size appears to correspond to the size of the virtual machine's memory. Perhaps it is the file that backs a vm's "physical" memory. The file gets mmapped into the Parallels process that runs the virtual machine. MAP_ANON would probably be a much better approach, if this is indeed the case.
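
    The two ways of backing guest memory can be sketched side by side. This is a minimal Python illustration under the assumption that the .vmm-XXXXXX file really is the backing store; the function names and the ".vmm-" prefix passed to mkstemp are illustrative, not taken from Parallels.

```python
import mmap
import os
import tempfile

def map_unlinked_file(size, directory=None):
    """File-backed mapping: create a temp file, unlink it, size it, map it."""
    fd, path = tempfile.mkstemp(prefix=".vmm-", dir=directory)
    try:
        os.unlink(path)         # the name goes away; the open descriptor keeps the data alive
        os.ftruncate(fd, size)  # give the file its full size before mapping
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)            # Python's mmap object keeps its own duplicate of the descriptor

def map_anonymous(size):
    """Anonymous mapping (MAP_ANON): no file, so no dependence on any filesystem."""
    return mmap.mmap(-1, size)
```

    The anonymous variant never touches the home directory, so whatever filesystem $HOME lives on stops mattering for this allocation.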

    Whenever a parallels window (not a guest OS' window) is selected or moved, parallels synchronizes its preferences, presumably to preserve window location for next launch, in the event of a crash/panic/etc. The synchronization process appears to be rather expensive, touches many files, and does many system calls. The expense is incurred by CFPreferencesSynchronize, not by parallels, but minimizing calls to sync prefs would probably help.

    Another fun little tidbit is Parallels seems to like to check for the file /var/db/.AccessibilityAPIEnabled. Presumably, this is a side effect of some high level API Parallels is using, but it seems to happen fairly frequently, particularly when suspending/resuming.

    Parallels appears to use Carbon, or a UI toolkit that uses Carbon for at least some of its file operations. This can be seen by its use of volfs (/.vol) to access files. Additionally, they appear to use C++ somewhat extensively. You can see that they link against libstdc++, but also nm shows several C++ symbols. My guess is they are using Qt to do the UI, probably on Linux and Windows as well, which contributed to the relatively fast Mac OS X port. The early betas for Mac OS X still had Qt symbols in them.

    File persistence: To successfully sync data out of kernel and drive caches on Mac OS X, one needs to do an fcntl(F_FULLFSYNC). This is what we need to look for to see if Parallels is telling the host to sync data to disk, when a guest tells Parallels to do the same. When using linux, and doing an fsync(), fdatasync(), or doing hdparm and flushing caches, no fsync or fcntl call is generated in the host environment. It appears "safe" writes in the vm do not correspond to "safe" writes in the host.
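
    For reference, this is roughly what a host-side "safe" write looks like. A minimal Python sketch, assuming the fcntl module exposes F_FULLFSYNC (it does only where the platform defines it, i.e. Mac OS X); elsewhere the sketch falls back to a plain fsync(). The function names are illustrative.

```python
import fcntl
import os
import tempfile

def full_sync(fd):
    """Flush fd to stable storage; returns the name of the method that was used."""
    full_fsync = getattr(fcntl, "F_FULLFSYNC", None)  # only defined on Mac OS X
    if full_fsync is not None:
        try:
            fcntl.fcntl(fd, full_fsync)
            return "F_FULLFSYNC"
        except OSError:
            pass  # not every filesystem supports it; fall back to fsync()
    os.fsync(fd)
    return "fsync"

def demo():
    """Write some data to a temp file and force it out to disk."""
    with tempfile.TemporaryFile() as f:
        f.write(b"guest data")
        f.flush()  # push Python's userspace buffer to the kernel first
        return full_sync(f.fileno())
```

    In an fs_usage or ktrace capture, the F_FULLFSYNC path is the fcntl call to look for in the host when the guest asks for a flush.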

    Fun tidbit: it appears Parallels has used some of their Linux port to get to market quickly on Mac OS X. Some things are still left over from the linux port, such as trying to open /proc/parallels/mem.

    When Parallels launches a virtual machine, it has 2 processes that run. One is where most of the VM activity takes place. The second process doesn't appear to actually do anything. While the first process is running, the second appears to just sit in select() waiting on a pipe for, presumably, a notification from the first process saying it has finished.

    Parallels has a ~/.parallels_settings file, which lists your recent virtual machines, for use in the dialog box of creating a new vm, opening an existing one, or opening a recent vm. It also stores the location of the main parallels window on the screen. It also has a /Library/Parallels/.parallels_common_options file, which appears to just specify a maximum memory for all currently running VMs.
    The license file appears to be stored in /Library/Parallels/.parallels_license_2.1 and just stores the license key, and who parallels is registered to.
    The Parallels kexts are /System/Library/Extensions/hypervisor.kext and /System/Library/Extensions/vmmain.kext.
    When a virtual machine is in use, a .<vmname>.pvs.lock file is created in the same directory as the .pvs file, which describes the virtual machine. This lock file can become stale if Parallels crashes.
    Parallels also listens on an IPv4 socket on localhost port 5679 although I have not been able to capture any traffic over this port using tcpdump on lo0. Piping random data to it seemed to have no effect.

    For memory, parallels appears to use a combination of malloc and vm_allocate. Presumably, the vm_allocated regions are for shuttling data between the kexts and the parallels app itself.

    CD/DVD access appears to be exclusive at the IOKit level. Data is read/written to and from the dvd without read/write system calls, but rather through ioctls to /dev/vm-main. This also appears to be asynchronous and done in a separate thread. The rest of the host OS does not have "normal" access to the drive. There is no device node for the device when Parallels has captured exclusive access to it.

    When you click on a parallels window to select it or drag it, the guest vm's display is not updated, although the VM appears to continue executing just fine. It appears host window updates are just disabled while the mouse is down on the window.


    Suspend/Resume behavior:
    Parallels suspends and resumes by doing 16k synchronous reads/writes from/to its .sav file. On suspend, each write appears to be preceded by ~3 lseeks to the exact same offset. Presumably this is because they are using a high level API and each layer of abstraction needs to ensure it is writing to the right location. 128k reads/writes could provide a performance increase.
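
    The effect of the chunk size on the number of calls is easy to count. A small Python sketch using in-memory buffers as stand-ins for the .sav file; the function names are made up, and whether fewer, larger calls actually speed up suspend/resume would have to be measured on real hardware.

```python
import io

def copy_in_chunks(src, dst, chunk_size):
    """Copy src to dst with fixed-size reads; returns how many reads returned data."""
    calls = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        calls += 1
    return calls

def demo(image_size=1 << 20):
    """Count the reads needed for the same image at 16k and at 128k per read."""
    image = bytes(image_size)
    small = copy_in_chunks(io.BytesIO(image), io.BytesIO(), 16 * 1024)
    large = copy_in_chunks(io.BytesIO(image), io.BytesIO(), 128 * 1024)
    return small, large
```

    For a 1 MB image this is 64 reads at 16k versus 8 reads at 128k, and each read avoided also avoids the lseeks that accompany it.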

    Between reads of the .sav file when resuming, parallels appears to do an fstat, 2 ioctls, and an lseek. I'm not sure what the fstat is for; the ioctls are presumably passing the data it just read, or information about that data, to the kext via the device node. The lseek just baffles me, since resumes appear to be sequential linear reads of the .sav file, so why lseek to the position you're already at? Presumably, this is part of some API abstraction. Sometimes, there can be up to 4 lseeks in a row, seeking to the exact same location in the file. Sometimes it seeks backwards, and then back to where it was with no file access in between.


    Observation environment:
    All observations were made with beta5 on a MacBook Pro 2.0GHz w/2GB memory.
    The primary guest OS was Fedora Core 5 with VT-x disabled. Observations were made using fs_usage, ktrace, gdb, and nm. My home directory was on the local disk.
     
    Last edited: Apr 27, 2006
  2. bigman

    bigman Member

    Messages:
    35
    Good to bring this up again.
    Especially the sync behaviour needs clarification.

    Please, parallels. Give some remarks about this.
     
  3. Andrew @ Parallels

    Andrew @ Parallels Parallels Team

    Messages:
    1,507
    Thanks for your observations! The sync problem was reproduced and does indeed occur for Linux guests. For Windows guests everything works fine.

    We have now fixed it for Linux guests as well. The fix will be included in the next update.
     
  4. daveschroeder

    daveschroeder Member

    Messages:
    64
    bbraun,

    Thanks for these observations. It's this kind of discussion that is needed to make Parallels a polished product.
     
  5. perdurabo

    perdurabo Junior Member

    Messages:
    16
    So this topic seems to be a wealth of information that keeps getting buried under numerous topics (whose subject is often answered in the documentation or previous forums posts ;).

    To spur a little more recognition and discussion on this topic, do bbraun's observations help provide for any more significant optimization of Parallels? As the software continues to be ported (de-Linuxized?), are any significant speed boosts likely?

    While I am perfectly satisfied with Parallels' performance, it does seem like disk i/o is the bottleneck. Although I'm lazy and haven't explored the issue to even 10% of bbraun's efforts, I would like to see this improved.
     
  6. bbraun

    bbraun Member

    Messages:
    47
    I don't really know much about the internals of parallels, just what I can observe. I'm not privy to exactly what they do and why they do it, but if I were an engineer working on parallels, my TODO would be something like:
    - Investigate using mmap(MAP_ANON) instead of mmapping the unlinked ~/.vmm-XXXXXX file. While this probably wouldn't gain significant performance increases, it would probably reduce some complexity in the software in having to worry about what filesystem the user's home directory is on. Hopefully this would reduce compatibility problems of people having homedirs on various (smb, afp, nfs, disk image) filesystems. Additionally, my hope would be that it would reduce the constant need to statfs($HOME). Again, I wouldn't anticipate much of an observable increase in performance, but fixing problems with less code is always gratifying.

    - Experiment with 128k reads/writes on suspend/resume.

    - Do thread pooling. In the post above, I speculate that reads and writes are done pseudo-asynchronously via synchronous calls within their own threads. If this is the case, rather than constantly creating and destroying threads, create a pool of them. This would ideally reduce latency and overhead of disk image reads/writes.

    - Implement a write-caching mechanism. The goal is to try to do writes with coalesced, larger buffers. A simple approach that can be used to test with is to just try to coalesce multiple small linearly sequential writes into larger buffers to reduce the number of writes. Not only would reducing the number of writes help, but so would doing the writes in larger chunks. OSX has incredibly high system call latency compared to linux, and HFS is incredibly slow when doing small reads/writes. It seems to be optimized more for throughput (final cut pro/imovie) than for latency. For anyone that has tried to run a cvs/cvsup server on OSX, it becomes painfully obvious that lots of 4k reads/getdirentries bring the system to its knees very quickly.
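
    The write-coalescing item could be prototyped along these lines. This is a minimal Python sketch of the general idea, not a description of anything Parallels does; the class name, the 128k flush limit and the flush counter are all assumptions chosen for the illustration.

```python
import io

class CoalescingWriter:
    """Buffers contiguous writes and emits them as one larger write."""

    def __init__(self, raw, limit=128 * 1024):
        self.raw = raw              # any seekable binary file object
        self.limit = limit          # flush once this many bytes are pending
        self.start = None           # file offset of the first pending byte
        self.pending = bytearray()  # bytes waiting to be written
        self.flushes = 0            # number of writes actually issued

    def write(self, offset, data):
        contiguous = (self.start is not None
                      and offset == self.start + len(self.pending))
        if self.pending and not contiguous:
            self.flush()            # a gap or overlap ends the current run
        if not self.pending:
            self.start = offset
        self.pending += data
        if len(self.pending) >= self.limit:
            self.flush()

    def flush(self):
        if self.pending:
            self.raw.seek(self.start)
            self.raw.write(bytes(self.pending))
            self.flushes += 1
            self.pending.clear()
        self.start = None
```

    Sixty-four sequential 4k writes go to the underlying file as two 128k writes. The catch is that pending data lives only in memory until flush() runs, so any flush request from the guest has to drain the buffer first.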

    And of course there would be UI improvements by going with a native interface rather than using Qt. I have no idea how hefty an undertaking this would be, it entirely depends on how the current codebase is designed. Also, it is a maintenance decision, if both the windows and linux apps are Qt based, forking OSX's seems rather gratuitous.

    Anyway, like I said, there are a lot of things I don't know and am not privy to, so any and/or all of the above may be completely irrelevant. But from where I'm at now, that would be my TODO.
     
  7. Andrew @ Parallels

    Andrew @ Parallels Parallels Team

    Messages:
    1,507
    bbraun,

    Thanks a lot! We will take your considerations into account.
     
  8. rich_w

    rich_w Member

    Messages:
    20
    why don't you just give this guy a job?
     
  9. mcg

    mcg Hunter

    Messages:
    168
    Ha ha! That's a good idea! :)
     
  10. bbraun

    bbraun Member

    Messages:
    47
    Perhaps I'm missing something and someone can enlighten me. While doing a ktrace of a fedora core 5 virtual machine created under beta4, but running under beta6, I noticed no fcntls being called when I called fsync(), fdatasync(), or hdparm in linux. I would have expected to see an fcntl(F_FULLFSYNC) being called in the host with any one of those commands, unless I'm not understanding the semantics on the linux side. Perhaps this bug fix missed the beta6 update?

    I've also noticed that you cannot disable write caching on the emulated IDE drive with "hdparm -W 0 /dev/hda"
     
  11. bbraun

    bbraun Member

    Messages:
    47
    I've been playing with the ioctl on the device nodes. Here is what Parallels (with a linux guest VM) sends to /dev/vm-main:
    857 Parallels CALL ioctl(0xc,0xc0185405 ,0xb011ad54)
    Address 0xb011ad54 is on the stack, according to vmmap:
    Stack b011a000-b011b000 [ 4K] rw-/rwx SM=COW thread 1

    This appears to be almost an idle loop call to ioctl, since it happens all the time with the exact same arguments.

    The second argument to ioctl contains the in/out status of the parameter, the command, the group, the parameter, and the parameter length. Decoded, 0xc0185405 means:
    In/out status: both in and out (110)
    Length: 24 (0000000011000)
    Group: 84 ('T') (01010100)
    Parameter: 5 (00000101)

    It seems all ioctls made to the device nodes are in/out commands. Commands I have cataloged during very basic suspend/resume and whatnot are:
    in/out, length 24, group 84 ('T'), parameter 5 (0xc0185405)
    in/out, length 24, group 84 ('T'), parameter 1 (0xc0185401)
    in/out, length 24, group 84 ('T'), parameter 2 (0xc0185402)
    in/out, length 4, group 84 ('T'), parameter 2 (0xc0045402)

    To verify my decodings, check out /usr/include/sys/ioccom.h. The hex numbers above are what I observed from ktrace.
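
    The decoding can also be done mechanically. A short Python sketch using the bit layout from the BSD sys/ioccom.h header (3 direction bits, a 13-bit length, an 8-bit group and an 8-bit command number); the function name and the returned dictionary shape are just for illustration.

```python
IOCPARM_MASK = 0x1FFF   # parameter length occupies 13 bits
IOC_VOID = 0x20000000   # no parameters
IOC_OUT = 0x40000000    # parameters are copied out of the kernel
IOC_IN = 0x80000000     # parameters are copied into the kernel

def decode_ioctl(request):
    """Split a BSD-style ioctl request code into its fields."""
    direction = []
    if request & IOC_IN:
        direction.append("in")
    if request & IOC_OUT:
        direction.append("out")
    if request & IOC_VOID:
        direction.append("void")
    return {
        "direction": "/".join(direction) or "none",
        "length": (request >> 16) & IOCPARM_MASK,
        "group": chr((request >> 8) & 0xFF),
        "number": request & 0xFF,
    }
```

    Running decode_ioctl(0xc0185405) gives direction "in/out", length 24, group "T" and number 5, matching the hand decoding above.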

    While trying to manipulate the device from my own code, I noticed a couple things:
    1) it is incredibly easy to make open(/dev/vm-main) return EBUSY. Sometimes waiting a bit while Parallels is running will fix it, as will quitting parallels. If you try to launch Parallels and start a virtual machine while /dev/vm-main returns EBUSY, it will pop up a dialog box telling you there is a problem and you should reinstall parallels.
    2) When passing random bogus data to ioctl on /dev/vm-main and ioctl returns an error, errno is being set to -22, which is an unknown errno value. However, if you take the absolute value of errno, you get EINVAL, which I believe is what it should be returning. My guess is the ioctl handler is setting errno -EINVAL instead of EINVAL.
    3) reading from /dev/vm-main and /dev/hypervisor is unsupported (sets errno to -78, again it looks like a negative errno for ENOSYS). It appears the only way to interact with the device node is via ioctl.
    4) Input checking seems to be fairly good, since I ran it through a test of invoking ioctl with arguments from /dev/random in a tight loop, and it would always do -EINVAL, and didn't consume too much CPU. However, the returning EBUSY even after my stuff had closed the descriptor and exited, shows more is happening in the kext than just an open/invalid ioctl/close.
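
    A tiny helper makes the negated errno values easier to read in logs. A Python sketch; the function name is made up, and note that errno numbering is platform specific: 22 is EINVAL on both Mac OS X and Linux, but 78 is ENOSYS only in the Mac OS X/BSD table.

```python
import errno
import os

def describe_errno(value):
    """Name an errno value, tolerating the negated form (e.g. -22 for EINVAL)."""
    code = abs(value)
    name = errno.errorcode.get(code, "unknown")  # symbolic name on this platform
    return code, name, os.strerror(code)
```

    describe_errno(-22) and describe_errno(22) both resolve to EINVAL, which is the check described in item 2.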

    I'm still working on capturing the data pointed at by the third arg and figuring out what it is and how I can use it. I've been using ktrace and gdb, although I suspect a logging version of the ioctl syscall stub would be easier and faster. If anyone captures and interprets the data being passed to ioctl, I'd be interested.
     
  12. bbraun

    bbraun Member

    Messages:
    47
    As a follow up, I've managed to catch some of the data being passed to ioctl via gdb. Sadly, when I watch for extended periods of time, Parallels seems to crash. I'm suspecting either a race condition in Parallels (things move veeery slowly under gdb) or gdb is doing something funny. In any case, the data appears to remain the same through each call to ioctl. For example:
    ioctl(0xc, 0xc0185405, 0xb011ad54) <- the most common ioctl on my fedora guest. This is on /dev/vm-main, the second arg is decoded in the previous post, and the buffer contains (in hex):
    8458d824 70c0b478 a4fc38a4 e4b450bc 00000001 b011ed64

    ioctl(0xc, 0xc0185401, 0x8c9700) <- second most common ioctl on my fedora guest. Again on /dev/vm-main:
    8f01202c 00000000 00000000 00000000 00000000 00000000

    I observed each of these two ioctls many times (dozens to hundreds of calls), and each time the buffer passed had the same values.

    Just for kicks, not expecting anything useful, I tried passing these values to my own call to ioctl and got the usual -22 (-EINVAL). Oh well, keep going I suppose.

    I was also curious about the EBUSY I was getting, so decided to try and open(/dev/vm-main, O_RDONLY); close(); in a tight loop. When parallels was not running, all was well. It would open and close to my heart's content. But, launch Parallels (and start a VM!) and I suddenly started getting EBUSY. I tried keeping the descriptor open while launching Parallels, and Parallels started the VM just fine. I had expected the dialog box saying there was a problem and I should reinstall parallels, similar to what I had received before. However, Parallels was able to open /dev/vm-main while I had it open, but I wasn't able to while it had the device open.
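
    The open/close test can be reproduced with a few lines. A Python sketch that tallies the outcome of each attempt by errno name; the function name and the attempt count are arbitrary, and the path is a parameter so the loop can be pointed at /dev/vm-main or at any other file.

```python
import errno
import os

def probe_open(path, attempts=100):
    """Open and close path repeatedly; return a count per outcome."""
    results = {}
    for _ in range(attempts):
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError as exc:
            # record the symbolic errno name, e.g. "EBUSY" or "ENOENT"
            outcome = errno.errorcode.get(exc.errno, str(exc.errno))
        else:
            os.close(fd)
            outcome = "ok"
        results[outcome] = results.get(outcome, 0) + 1
    return results
```

    Running it once before and once after starting a VM shows whether the results shift from "ok" to "EBUSY".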
     
  13. bbraun

    bbraun Member

    Messages:
    47
    When doing suspend/resume, you can probably use fcntl(F_NOCACHE), since I doubt you need the kernel to be caching the suspend/resume file. This might decrease lag/system jerkiness while the suspend/resume is in progress. This might also help if parallels does its own caching of disk access.
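
    Turning off host caching for one descriptor is a single call. A minimal Python sketch, assuming the fcntl module exposes F_NOCACHE (it does only where the platform defines it, i.e. Mac OS X); on other systems the helper simply reports that nothing was changed. The function name is illustrative.

```python
import fcntl
import os

def disable_host_caching(fd):
    """Ask the kernel not to cache data for fd; returns True if the flag was applied."""
    nocache = getattr(fcntl, "F_NOCACHE", None)  # only defined on Mac OS X
    if nocache is None:
        return False
    try:
        fcntl.fcntl(fd, nocache, 1)  # a non-zero argument turns caching off
    except OSError:
        return False
    return True
```

    The call would go right after opening the .sav file and before the first read or write.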
     
  14. Andrew @ Parallels

    Andrew @ Parallels Parallels Team

    Messages:
    1,507
    Hello bbraun,

    1. There was a bug in IDE emulation that prevented all guest Linuxes from issuing a flush command to the IDE controller after sync(). We fixed it in Beta6.

    2. Our investigation shows that some Linux distros don't issue a flush command to the IDE controller at all after sync(). There are no problems with modern Linuxes (SUSE 10 for example). And no problems with guest Windows as well.

    We are still working on this issue.
     
    Last edited: May 13, 2006
