Garbage Collection + Threading Issues.

Discussion in 'Installation and Configuration' started by ScottB9, Sep 6, 2016.

  1. ScottB9

    ScottB9 Bit Poster

    Messages:
    3
    I'm running Windows 10 with the latest versions of both Visual Studio 2015 and Unity3D.

    Whenever I run my projects within Unity3D, I constantly get "Fatal error in gc: get ThreadContext fail". However, when I run the same code within VMware or on native Windows/Mac, the errors don't occur. This leads me to believe it's an environmental issue within Parallels, related to how garbage collection behaves under virtualisation. Furthermore, when I run a basic unit test analysis in Visual Studio on native Windows (same hardware, a MacBook Pro), the garbage collection iterations show smooth peaks and troughs in the graph. When I run the same code within Parallels, I see constant spikes, which again leads me to believe that hardware virtualisation issues are occurring.

    I've tried increasing the CPU/RAM allocations in the VM configuration, and I've also made sure that the Parallels VM is housed on the same primary SSD, so that I/O issues are an unlikely cause.

    I'm not sure how to troubleshoot this further, and it dramatically impacts my daily productivity. Any advice or ideas on how to narrow down the search for a solution would be welcome.

    NOTE: When you search for information on the GC "get ThreadContext" error, the results say it's linked to virus scanners etc. This is not always the case.

    GetThreadContext will basically only fail if you call it on a thread that isn't suspended, so the most likely scenario is that there's a case in Unity where a running thread can be added to this internal list after all threads in the list have been suspended but before the GC has done a sweep to call GetThreadContext on them. The GC then tries to get the context of a thread that should be suspended but isn't and kaboom.
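
    To make that window concrete, here is a rough, deterministic sketch of the scenario in Python. It is only a toy model of my theory, not Unity's or Mono's actual code: `ToyThread`, `get_thread_context`, `stop_the_world` and the `late_arrival` parameter are all names I made up for illustration, and the stand-in call simply fails on any thread that isn't suspended.

```python
class ToyThread:
    """Hypothetical stand-in for an OS thread tracked by the GC."""

    def __init__(self, name):
        self.name = name
        self.suspended = False


def get_thread_context(thread):
    """Toy stand-in for GetThreadContext: fails if the thread is still running."""
    if not thread.suspended:
        raise RuntimeError(f"get ThreadContext fail: {thread.name}")
    return {"thread": thread.name}


def stop_the_world(registry, late_arrival=None):
    """Suspend every registered thread, then read each thread's context.

    `late_arrival` models a running thread that gets registered after the
    suspend pass but before the context sweep (the suspected race window).
    """
    try:
        # Pass 1: suspend every thread known at this moment.
        for thread in list(registry):
            thread.suspended = True

        # Race window: a new, still-running thread joins the list.
        if late_arrival is not None:
            registry.append(late_arrival)

        # Pass 2: the sweep assumes everything in the list is suspended.
        return [get_thread_context(thread) for thread in registry]
    finally:
        # Resume whatever was suspended, even if the sweep blew up.
        for thread in registry:
            thread.suspended = False


registry = [ToyThread("main"), ToyThread("worker")]
ok = stop_the_world(registry)  # no late arrival: sweep succeeds

try:
    stop_the_world(registry, late_arrival=ToyThread("late"))
    crashed = False
except RuntimeError:
    crashed = True  # the sweep hit a thread that was never suspended
```

    In the real thing the window would only be hit occasionally, depending on timing, which is why I hard-coded the late arrival here instead of using real threads.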

    GC scans are typically triggered when you allocate memory. The memory manager sees that it doesn't have enough to give you what you've asked for, so it collects, checks again, and if there still isn't enough then it gets more from the OS.
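
    As a rough sketch of that "collect, check again, then ask the OS" policy (again a toy model with made-up names like `ToyHeap`, not the allocator Unity or Mono actually uses):

```python
class ToyHeap:
    """Toy model of an allocator that collects before it grows."""

    def __init__(self, capacity):
        self.capacity = capacity  # bytes currently obtained from the "OS"
        self.live = 0             # bytes still reachable
        self.garbage = 0          # bytes allocated but no longer reachable
        self.collections = 0
        self.grows = 0

    def free_space(self):
        return self.capacity - self.live - self.garbage

    def release(self, size):
        """Mark live bytes as unreachable; they occupy the heap until collected."""
        size = min(size, self.live)
        self.live -= size
        self.garbage += size

    def collect(self):
        """Reclaim all unreachable bytes."""
        self.garbage = 0
        self.collections += 1

    def allocate(self, size):
        # Not enough room: collect first...
        if self.free_space() < size:
            self.collect()
        # ...check again, and only then get more from the "OS".
        if self.free_space() < size:
            self.capacity += size - self.free_space()
            self.grows += 1
        self.live += size


heap = ToyHeap(capacity=100)
heap.allocate(60)  # fits, no collection
heap.release(60)   # those 60 bytes become garbage
heap.allocate(60)  # triggers a collection, which frees enough room
heap.allocate(60)  # triggers a collection, then the heap has to grow
```

    The point is just that the moment a collection runs depends on when allocations happen, so anything that shifts allocation timing also shifts when the GC does its thread sweep.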

    So in this case, since we're talking about a race condition, any change to when an allocation triggers a collection vs. when threads are starting will affect the likelihood of this error occurring. This is why people have reported this being related to Kaspersky. Virus detectors sometimes basically connect to apps with a debugger, which changes the timing of when things occur, so disabling a virus detector when this crash is happening may "fix" the problem.
    As stated, this DOES NOT occur in other environments outside Parallels and is repeatable ONLY within Parallels.
     

  2. DustinHullett

    DustinHullett Product Expert

    Messages:
    181
    I am having a similar problem with the Oracle suite in Parallels. It works fine within VMware, VirtualBox, and all native Windows installs, but fails in Parallels. It's weird, though: I tried to install it again the other day and by chance it worked perfectly, but as soon as I restarted the Windows VM it crashed again on startup. So the way Parallels handles Windows as a whole is definitely different from the other hypervisors. I have been emailing and sending data to the dev team all week, since I now have a working snapshot and a failed snapshot so they can see the issue. I'm a logic programmer, but this can't be that hard to track down and fix, since the answer has to be in the snapshots and technical data of the two copies of my system I sent. Maybe you should do the same for your system.
     
  3. ScottB9

    ScottB9 Bit Poster

    Messages:
    3
    I've been able to isolate it to a race condition in .NET when two threads collide while attempting to access locked threads. Usually they simply retry or suspend until the locks are next available; however, when a hardware interrupt occurs, it can cause .NET to abandon or lose the context, per se, which is why Unity3D simply aborts.

    Basically, what I am finding is that Parallels is quite aggressive on Windows when it comes to load shifting, which in turn causes garbage collection in Parallels to be far more aggressive than in a native environment.
     
  4. DustinHullett

    DustinHullett Product Expert

    Messages:
    181
    What I'm finding for my issue is that when the Oracle Forms runtime is started, an access violation is raised instantly and the program exits. I have searched and searched for ways to debug this further, but the best I have managed is looking in the Windows logs and system events to see that tkw32.dll is what's causing the violation. This is the Oracle toolkit for Windows. I also found that the program will run on a 32-bit Windows install, but it's somewhat slower than normal. It runs on native 64-bit Windows with no problem, and also in a 64-bit Fusion VM with no issue. There's something weird about how Parallels handles 32-bit emulation in a 64-bit Windows install. How else could I trace this issue as you have? I have been trying OllyDbg, but I can't see what's causing it to call the termination routine.
     
  5. ScottB9

    ScottB9 Bit Poster

    Messages:
    3
    I think you're spot on with the access violation being the issue. Like I stated, the problem *only* occurs inside a Windows environment running in Parallels. With the same code/application running on Mac, nothing of the sort occurs. The same goes for native development outside Parallels. The point of failure occurs when I'm writing a lot of "records" into a file database (LightingBB).

    I think the GC.GetContextThreadError is more of a symptom than the root cause.
     
  6. DustinHullett

    DustinHullett Product Expert

    Messages:
    181
    I finally fixed my issue a while back by patching tkw32.dll for Oracle. It turns out that when run inside a Parallels VM, it was overflowing a register. I just changed a few lines around, and it has been working fine for a while now.
     