Previously, when running in parallel mode and a worker process crashed
while processing a task, it was not possible for ngcc to continue
without risking ending up with a corrupted entry-point and therefore it
exited with an error. This, for example, could happen when a worker
process received a `SIGKILL` signal, which was frequently observed in CI
environments. This was probably the result of Docker killing processes
due to increased memory pressure.
One factor that amplifies the problem under Docker (which is often used
in CI) is that it is not possible to distinguish between the available
CPU cores on the host machine and the ones made available to Docker
containers, thus resulting in ngcc spawning too many worker processes.
This commit addresses these issues in the following ways:
1. We take advantage of the fact that files are written to disk only
after an entry-point has been fully analyzed/compiled. The master
process can now determine whether a worker process has not yet
started writing files to disk (even if it was in the middle of
processing a task) and just put the task back into the tasks queue if
the worker process crashes.
2. The master process keeps track of the transformed files that a worker
process will attempt to write to disk. If the worker process crashes
while writing files, the master process can revert any changes and
put the task back into the tasks queue (without risking corruption).
3. When a worker process crashes while processing a task (which can be a
result of increased memory pressure or too many worker processes),
the master process will not try to re-spawn it. This way the number
or worker processes is gradually adjusted to a level that can be
accomodated by the system's resources.
Examples of ngcc being able to recover after a worker process crashed:
- While idling: https://circleci.com/gh/angular/angular/682197
- While compiling: https://circleci.com/gh/angular/angular/682209
- While writing files: https://circleci.com/gh/angular/angular/682267
Jira issue: [FW-2008](https://angular-team.atlassian.net/browse/FW-2008)
Fixes#36278
PR Close#36626