openEMS is Currently Unsafe if Multiple Simulations, Instances or Threads are Created Due to Stale State #187

@biergaizi

While developing my own simulation project, I found three problems when running openEMS as a shared library, for example through the Python binding. All three are caused by stale state when one attempts to run multiple simulations within a single execution.

Historically, openEMS was written as a standalone program; using it as a shared library was an afterthought for the Python binding, so these problems are not surprising. As the main contributor of the project, I believe I can isolate and fix these problems, but I'm not sure my time allows it, as I have too many simultaneous projects at hand. Here is a short note about these problems, in case anyone else encounters them.

  1. Erroneous Simulation Output if the Simulation is Changed via SetCSX().

    If the same openEMS() instance is reused in Python to run multiple simulations, the result is unpredictable due to stale state. Sometimes it works correctly but segfaults on exit; sometimes the simulation result doesn't make any sense because of stale setup state from the previous simulation. The problem can be subtle: the simulation may finish but produce misleading results.

    I haven't fully isolated the problem yet.

    Workaround: Always create a new openEMS() instance and a new CSXCAD.ContinuousStructure() for each simulation run, as in:

     import CSXCAD
     import openEMS
     # Fresh solver and fresh geometry container for every simulation run
     newfdtd = openEMS.openEMS()
     newcsx = CSXCAD.ContinuousStructure()
     newfdtd.SetCSX(newcsx)
    

    But this workaround leads to the following problem.

  2. RuntimeError: option is ambiguous and matches different versions

    If different openEMS() instances are created in Python to run multiple simulations with different command-line options, the following error occurs:

     Traceback (most recent call last):
       File "/home/fdtd-production/code/vacuum-coax/vacuum-coax.py", line 730, in <module>
         main(sys.argv[1])
       File "/home/fdtd-production/code/vacuum-coax/vacuum-coax.py", line 714, in main
         simulate_tasks(task_list)
       File "/home/fdtd-production/code/vacuum-coax/vacuum-coax.py", line 456, in simulate_tasks
         fdtd.Run(subsimdir, numThreads="0")
       File "openEMS/openEMS.pyx", line 534, in openEMS.openEMS.openEMS.Run
       File "openEMS/openEMS.pyx", line 495, in openEMS.openEMS.openEMS._SetLibraryArguments
     RuntimeError: option '--numThreads=0' is ambiguous and matches different versions of '--numThreads'
    

    The reason is simple. The command-line arguments are tracked by the global object in global.h for the lifetime of the entire openEMS shared library, regardless of how many openEMS() instances are created. Each new openEMS() instance adds the available global options again, so every option ends up with multiple candidates and becomes ambiguous.

    Solution: Change the code of openEMS::collectCommandLineArguments and Global::appendOptionDesc to either stop re-adding options, or to clear all options before adding them. I plan to fix this problem when I have time. (But this leads to the next problem, see below.)

    Workaround: Isolate different simulations in their own Python processes via Python's multiprocessing and sub-interpreter features.

    Update: Fixed in commit 5ebe71d.

  3. global.h is fundamentally thread-unsafe

    In theory, multiple openEMS() instances can be created for use in different threads (each with its own CSXCAD.ContinuousStructure()). Since each instance has its own internal state, they should be completely isolated. However, openEMS expects all options to be saved in a central global location, which is the global object in global.h. This means that in a multi-threaded environment, the global variable values can suddenly change in the middle of a simulation, potentially creating race conditions and data corruption.

    Workaround: Multi-threading in Python is rarely used because of the GIL; use sub-interpreters for now. In the future, perhaps we should make the global object a thread-local variable instead of a process-global variable.
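
The process-isolation workaround can be sketched as follows. This is a minimal sketch under stated assumptions, not a tested openEMS recipe: it uses the standard-library `subprocess` module rather than `multiprocessing`, so that every task starts in a brand-new interpreter with a freshly loaded shared library. `run_isolated` and `DEMO_TASK` are hypothetical names; in a real script the child code would create its own openEMS() and CSXCAD.ContinuousStructure() and call Run().

```python
import os
import subprocess
import sys


def run_isolated(code, *args):
    """Run `code` in a brand-new Python interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code, *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits with an error
    )
    return result.stdout


# Hypothetical stand-in for one simulation run: it only echoes its task name
# and process ID. A real child would set up and run the simulation here.
DEMO_TASK = "import os, sys; print(sys.argv[1], os.getpid())"

if __name__ == "__main__":
    for task in ("task-a", "task-b"):
        name, pid = run_isolated(DEMO_TASK, task).split()
        print(f"{name} ran in process {pid}, parent is {os.getpid()}")
```

Because each child is a separate process, process-global state such as the registered options cannot leak from one simulation into the next, and a crash in one child does not take down the parent.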
