This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: RFC: [PATCH 0/6] When fork fails, retry with hardlinks.


On 12/08/2016 04:51 PM, Corinna Vinschen wrote:
> 
> Would you mind to create a brief tl;dr overview with bullet points to
> describe what happens at which point?
> 

No problem, but that needs some iterations to filter the really interesting
information. Here's a first draft (oh, a brain dump, sorry) that is still
too long (whoa, did I really examine that many corners?):


== Current situation ==

Cygwin's fork() implementation uses CreateProcess and LoadLibrary to load
binary files (the main-executable and the dlls) into the child process.

For a reliable POSIX-like fork implementation, any binary file loaded into the
child process needs to be the very same file as the one loaded into the parent
process.

Since Cygwin lacks a runtime loader, the dll search path for linked dlls boils
down to something like: <current dir>, <dir of executable>, <PATH env-var>.

Unfortunately, CreateProcess and LoadLibrary may encounter different (or missing)
binary files (compared to what is loaded into the parent process) in cases like:
 * The parent process may have changed the <current dir>.
 * The parent process may have changed the <PATH env-var>.
 * A dll with the same basename may appear earlier in the dll search path.
 * A binary file might have been removed (moved to trash actually).
 * Once a binary file was moved to trash, its original file name may refer to
   a different file.
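
To make the search-order cases concrete, here is a minimal sketch in Python.
It is a toy model only, not how Windows or Cygwin resolve dlls: the helper
names resolve_dll and demo_search_order are made up for illustration, and the
search path is just a list of directories.

```python
import os
import tempfile

def resolve_dll(basename, search_path):
    """Return the first existing file named basename along search_path, or None."""
    for directory in search_path:
        candidate = os.path.join(directory, basename)
        if os.path.isfile(candidate):
            return candidate
    return None

def demo_search_order():
    """Resolve the same basename before and after the search path changed."""
    with tempfile.TemporaryDirectory() as root:
        app_dir = os.path.join(root, "app")
        other_dir = os.path.join(root, "other")
        for directory in (app_dir, other_dir):
            os.mkdir(directory)
            with open(os.path.join(directory, "foo.dll"), "w") as f:
                f.write(directory)
        # What the parent process got when it was started.
        in_parent = resolve_dll("foo.dll", [app_dir, other_dir])
        # What the child would get after the parent changed PATH, or after
        # a dll with the same basename appeared earlier in the search path.
        in_child = resolve_dll("foo.dll", [other_dir, app_dir])
        return (os.path.dirname(in_parent) == app_dir,
                os.path.samefile(in_parent, in_child))
```

The second element of the returned pair tells whether parent and child ended
up with the very same file. With the reordered search path they do not, which
is exactly the situation fork has to detect.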


== The "forkables" topic ==

As these really are corner cases, fork is first attempted using both the
original executable location and the original dll search path, and this stays
so until a different (or missing) binary file is detected while creating the
child process or loading its dlls.

Instead of failing in these corner cases, the "forkables" topic then tries to
perform these steps (in the parent process) before retrying:
 * Once for any loaded binary file, query the NTFS-IndexNumber using its _current_
   location.
 * Create a temporary application directory, containing:
   + (hardlinks to) the "main.exe" and all the linked dlls,
   + an empty "main.exe.local" file to enable "DotLocal Dll Redirection".
 * Create a temporary subdirectory for each directory a dll was dynamically
   loaded from, containing (hardlinks to) the dlls dynamically loaded from
   that original directory.
 * Retry fork with binary file names from those temporary directories (while
   retaining the original binary file basenames).
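
The hardlink idea can be tried out with plain POSIX-style calls. The sketch
below is an illustration under assumptions, not the patch itself: the
(st_dev, st_ino) pair stands in for the NTFS-IndexNumber, the "trash" is an
ordinary directory, and file_identity, make_forkable and demo_forkable are
hypothetical names. How the parent learns the current location of a trashed
file is out of scope here.

```python
import os
import tempfile

def file_identity(path):
    """(device, inode) pair, used here as a stand-in for the NTFS-IndexNumber."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

def make_forkable(current_path, basename, forkable_dir):
    """Hardlink the file at its current location under its original basename."""
    os.makedirs(forkable_dir, exist_ok=True)
    link = os.path.join(forkable_dir, basename)
    os.link(current_path, link)
    return link

def demo_forkable():
    with tempfile.TemporaryDirectory() as root:
        bin_dir = os.path.join(root, "bin")
        trash_dir = os.path.join(root, "trash")
        os.mkdir(bin_dir)
        os.mkdir(trash_dir)
        original = os.path.join(bin_dir, "foo.dll")
        with open(original, "w") as f:
            f.write("old build")
        # Identity of the file the parent process has loaded.
        loaded = file_identity(original)
        # The file gets moved to trash and a new build takes over its name.
        trashed = os.path.join(trash_dir, "renamed-by-trash")
        os.rename(original, trashed)
        with open(original, "w") as f:
            f.write("new build")
        link = make_forkable(trashed, "foo.dll", os.path.join(root, "forkables"))
        return (file_identity(link) == loaded,
                file_identity(original) == loaded,
                os.path.basename(link))
```

The hardlink keeps the identity of the file that was loaded, while the
original name now refers to a different file, and the basename seen by the
loader stays "foo.dll".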


== Implementation details ==

With multiprocessing there are a number of combined challenges to take care of:
 * Processes fork concurrently.
 * Processes use the same main-executable file under different names
   (e.g. ash->dash).
 * Processes dynamically load additional dlls between fork calls.
 * Processes unload dynamically loaded dlls between fork calls.
 * Processes use the same binary file names, but for files of different age.
 * All of them apply to (forked!) child processes as well.
 * Ensure temporary directories are cleaned up.

To solve these challenges, the forkables implementation is based on these ideas:
 * Always try to create the directories, hardlinks and the .local file, but
   neither fail on nor overwrite an existing item - it is already there for
   the requested purpose, just created (and in use right now) by another
   (similar) process.
 * Multiple main-executable (and .local) file names are fine within one
   temporary application directory - one executable ignores the other.
 * The temporary application directories are created per user SID.
 * The temporary application directory name is formed using the main-
   executable's NTFS-IndexNumber and the most recent time stamp (LastWriteTime)
   of binary files currently loaded in the forking process.
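
As a rough sketch of the "create but never fail or overwrite" idea together
with the directory naming, consider the Python below. The name format (two
hex fields) is purely an assumption for illustration, st_ino again stands in
for the NTFS-IndexNumber, and the per-user SID part and the .local file are
left out. All function names are hypothetical.

```python
import os
import tempfile

def forkable_dir_name(index_number, last_write_times):
    """Hypothetical format: file index of the exe plus the newest time stamp."""
    return "%016x-%016x" % (index_number, max(last_write_times))

def ensure_dir(path):
    """Create the directory unless it exists; tell whether this call created it."""
    try:
        os.mkdir(path)
        return True
    except FileExistsError:
        return False

def ensure_hardlink(source, link):
    """Create the hardlink unless the name exists; never overwrite."""
    try:
        os.link(source, link)
        return True
    except FileExistsError:
        return False

def demo_two_processes():
    """Two similar processes prepare the same directory one after the other."""
    with tempfile.TemporaryDirectory() as root:
        exe = os.path.join(root, "main.exe")
        with open(exe, "w") as f:
            f.write("binary")
        name = forkable_dir_name(os.stat(exe).st_ino, [100, 300, 200])
        target = os.path.join(root, name)
        results = []
        for _ in range(2):  # first pass: process A, second pass: process B
            results.append((ensure_dir(target),
                            ensure_hardlink(exe, os.path.join(target, "main.exe"))))
        return results
```

The second pass finds everything in place and simply goes on. A process
that has loaded a more recently written binary file gets a different
directory name, so files of different age do not meet in one directory.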

<damn-wrong reason="original directory may not exist any more">
 * The temporary subdirectory name for a dynamically loaded dll is formed
   using the original directory's NTFS-IndexNumber.
</damn-wrong>


-- Synchronization --

To synchronize cleanup, the states of a mutex named after the temporary
application directory are used to indicate the directory's state:
 (Locked): do not use for forking.
 (Unlocked): can be used (to create items) for forking.
 (Absent): not in use, ready to clean up.

The available mutex state transitions are used as follows:


- At process exit

(Locked -> Absent)
 * Before closing the current process' mutex handle, the mutex is locked with
   almost no timeout. It turned out that closing the handle of a _locked_ mutex
   is propagated to other processes more synchronously than closing the handle
   of an unlocked mutex, when it is the mutex' last handle and closing it
   destroys the mutex.

(Absent -> Locked)
 * For each temporary application directory found on the file system, a
   "Create-With-Lock" of the corresponding mutex name is attempted.
 * If successful, this one directory is cleaned up while holding the lock.
 * Finally the mutex handle is closed, resulting in either Unlocked or Absent.


- When preparing to fork

(Unlocked -> Locked)
 * Upon first forkables creation in a process, the mutex handle is opened and
   locked with infinite timeout, to wait for any process that might be cleaning
   up the temporary application directory right now.
 * The mutex is unlocked immediately once locking has succeeded.
 * Both parent and child process keep the mutex handle (inherited by child)
   open until they exit - see above.
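
The state transitions above can be played through with a small in-memory
model. This is a simplified sketch in Python, not Win32 code: the class
NamedMutexModel and the three helper functions are hypothetical, there is no
real blocking or timeout, and a failed "Create-With-Lock" simply leaves no
handle behind.

```python
class NamedMutexModel:
    """Model of one named mutex with the states Absent, Unlocked and Locked."""

    def __init__(self):
        self.handles = set()  # processes holding an open handle
        self.owner = None     # process currently holding the lock

    def state(self):
        if not self.handles:
            return "Absent"
        return "Locked" if self.owner is not None else "Unlocked"

    def open(self, proc):
        """Open (or create) the mutex without locking it."""
        self.handles.add(proc)

    def create_with_lock(self, proc):
        """Succeed only from Absent: create the mutex in the locked state."""
        if self.handles:
            return False
        self.handles.add(proc)
        self.owner = proc
        return True

    def try_lock(self, proc):
        if self.owner is None and proc in self.handles:
            self.owner = proc
        return self.owner == proc

    def unlock(self, proc):
        if self.owner == proc:
            self.owner = None

    def close(self, proc):
        """Close the handle; the mutex vanishes together with its last handle."""
        self.unlock(proc)
        self.handles.discard(proc)

def prepare_fork(mutex, proc):
    """Unlocked -> Locked -> Unlocked: wait for a cleanup in progress, if any."""
    mutex.open(proc)
    if not mutex.try_lock(proc):
        raise RuntimeError("a real process would block here until cleanup is done")
    mutex.unlock(proc)

def exit_process(mutex, proc):
    """Lock with (almost) no timeout, then close the handle."""
    mutex.try_lock(proc)
    mutex.close(proc)

def cleanup(mutex, proc, remove_directory):
    """Absent -> Locked: clean up only when nobody uses the directory."""
    if not mutex.create_with_lock(proc):
        return False
    remove_directory()
    mutex.close(proc)
    return True
```

For example: after prepare_fork(m, "parent") and m.open("child") the state is
Unlocked and a cleanup attempt is refused. Once both have gone through
exit_process the state is Absent, and only then does cleanup remove the
directory.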


/haubi/

