Now that `ngcc/src/ngcc_options` imports `FileWriter` type, there is a
circular dependency detected by the `ts-circular-deps:check` lint check:
```
ngcc/src/ngcc_options.ts
→ ngcc/src/writing/file_writer.ts
→ ngcc/src/packages/entry_point_bundle.ts
→ ngcc/src/ngcc_options.ts
```
This commit moves the `PathMappings` type (and related helpers) to a
separate file to avoid the circular dependency.
NOTE:
The circular dependency was only with taking types into account. There
was no circular dependency for the actual (JS) code.
PR Close#36626
When running in parallel mode, worker processes forward errors thrown
during task processing to the master process, which in turn exits with
an error.
However, there are cases where the error is not directly related to
processing the entry-point. One such case is when there is not enough
memory (for example, due to all the other tasks being processed
simultaneously).
Previously, an `ENOMEM` error thrown on a worker process would propagate
to the master process, eventually causing ngcc to exit with an error.
Example failure: https://circleci.com/gh/angular/angular/682198
This commit improves handling of these low-memory situations by
detecting `ENOMEM` errors and killing the worker process, thus allowing
the master process to decide how to handle that. The master process will
put the task back into the tasks queue and continue processing tasks
with the rest of the worker processes (and thus with lower memory
pressure).
PR Close#36626
Previously, when the last worker process crashed, the master process
would try to re-spawn it indefinitely. This could lead to an infinite
loop (if for some reason the worker process kept crashing).
This commit avoids this by limiting the number of re-spawn attempts to
3, after which ngcc will exit with an error.
PR Close#36626
Previously, when running in parallel mode and a worker process crashed
while processing a task, it was not possible for ngcc to continue
without risking ending up with a corrupted entry-point and therefore it
exited with an error. This, for example, could happen when a worker
process received a `SIGKILL` signal, which was frequently observed in CI
environments. This was probably the result of Docker killing processes
due to increased memory pressure.
One factor that amplifies the problem under Docker (which is often used
in CI) is that it is not possible to distinguish between the available
CPU cores on the host machine and the ones made available to Docker
containers, thus resulting in ngcc spawning too many worker processes.
This commit addresses these issues in the following ways:
1. We take advantage of the fact that files are written to disk only
after an entry-point has been fully analyzed/compiled. The master
process can now determine whether a worker process has not yet
started writing files to disk (even if it was in the middle of
processing a task) and just put the task back into the tasks queue if
the worker process crashes.
2. The master process keeps track of the transformed files that a worker
process will attempt to write to disk. If the worker process crashes
while writing files, the master process can revert any changes and
put the task back into the tasks queue (without risking corruption).
3. When a worker process crashes while processing a task (which can be a
result of increased memory pressure or too many worker processes),
the master process will not try to re-spawn it. This way the number
or worker processes is gradually adjusted to a level that can be
accomodated by the system's resources.
Examples of ngcc being able to recover after a worker process crashed:
- While idling: https://circleci.com/gh/angular/angular/682197
- While compiling: https://circleci.com/gh/angular/angular/682209
- While writing files: https://circleci.com/gh/angular/angular/682267
Jira issue: [FW-2008](https://angular-team.atlassian.net/browse/FW-2008)
Fixes#36278
PR Close#36626
With this commit, the master process will keep track of the transformed
files that each worker process is intending to write to disk.
In a subsequent commit, this info will be used to allow ngcc to recover
when a worker process crashes in the middle of processing a task.
PR Close#36626
With this commit, worker processes will notify the master process about
the transformed files they are about to write to disk before starting
writing them.
In a subsequent commit, this will be used to allow ngcc to recover when
a worker process crashes in the middle of processing a task.
PR Close#36626
This commit enhances the `CompileFn`, which is used to process each
entry-point, to support running a passed-in callback (and wait for it to
complete) before proceeding with writing the transformed files to disk.
This functionality is currently not used. In a subsequent commit, it
will be used for passing info from worker processes to the master
process that will allow ngcc to recover when a worker process crashes in
the middle of processing a task.
PR Close#36626
Rename the `markTaskCompleted()` method to be consistent with the other
similar methods of `TaskQueue` (`markAsFailed()` and
`markAsUnprocessed()`).
PR Close#36626
This commit adds support for stopping processing an in-progress task
and moving it back to the list of pending tasks.
In a subsequent commit, this will be used to allow ngcc to recover when
a worker process crashes in the middle of processing a task.
PR Close#36626
Previously, the "Compiling <entryPoint>" log message was printed before
starting to analyze and transform files, but after creating the
`EntryPointBundle` (which includes creating the TS program).
Since creating the `EntryPointBundle` involves some work, it is more
accurate to move the log message before creating the bundle.
PR Close#36626
The cached file-system was implemented to speed up ngcc
processing, but in reality most files are not accessed many times
and there is no noticeable degradation in speed by removing it.
Benchmarking `ngcc -l debug` for AIO on a local machine
gave a range of 196-236 seconds with the cache and 197-224
seconds without the cache.
Moreover, when running in parallel mode, ngcc has a separate
file cache for each process. This results in excess memory usage.
Notably the master process, which only does analysis of entry-points
holds on to up to 500Mb for AIO when using the cache compared to
only around 30Mb when not using the cache.
Finally, the file-system cache being incorrectly primed with file
contents before being processed has been the cause of a number
of bugs. For example https://github.com/angular/angular-cli/issues/16860#issuecomment-614694269.
PR Close#36687
The current ngcc lock-file strategy spawns a new process in order to
capture a potential `SIGINT` and remove the lock-file. For more
information see #35861.
Previously, this unlocker process was spawned as soon as the `LockFile`
was instantiated in order to have it available as soon as possible
(given that spawning a process is an asynchronous operation). Since the
`LockFile` was instantiated and passed to the `Executor`, this meant
that an unlocker process was spawned for each cluster worker, when
running ngcc in parallel mode. These processes were not needed, since
the `LockFile` was not used in cluster workers, but we still had to pay
the overhead of each process' own memory and V8 instance.
(NOTE: This overhead was small compared to the memory consumed by ngcc's
normal operations, but still unnecessary.)
This commit avoids the extra processes by only spawning an unlocker
process when running on the cluster master process and not on worker
processes.
PR Close#36569
Previously, when an entry-point contained code that caused its compilation
to fail, ngcc would exit in the middle of processing, possibly leaving other
entry-points in a corrupt state.
This change adds a new `errorOnFailedEntryPoint` option to `mainNgcc` that
specifies whether ngcc should exit immediately or log an error and continue
processing other entry-points.
The default is `false` so that ngcc will not error but continue processing
as much as possible. This is useful in post-install hooks, and async CLI
integration, where we do not have as much control over which entry-points
should be processed.
The option is forced to true if the `targetEntryPointPath` is provided,
such as the sync integration with the CLI, since in that case it is targeting
an entry-point that will actually be used in the current project so we do want
ngcc to exit with an error at that point.
PR Close#36083
Later when we implement the ability to continue processing when tasks have
failed to compile, we will also need to avoid processing tasks that depend
upon the failed task.
This refactoring exposes this list of dependent tasks in a way that can be
used to skip processing of tasks that depend upon a failed task.
It also changes the blocking model of the parallel mode of operation so
that non-typings tasks are now blocked on their corresponding typings task.
Previously the non-typings tasks could be triggered to run in parallel to
the typings task, since they do not have a hard dependency on each other,
but this made it difficult to skip task correctly if the typings task failed,
since it was possible that a non-typings task was already in flight when
the typings task failed. The result of this is a small potential degradation
of performance in async parallel processing mode, in the rare cases that
there were not enough unblocked tasks to make use of all the available
workers.
PR Close#36083
Moving the definition of the `onTaskCompleted` callback into `mainNgcc()`
allows it to be configured based on options passed in there more easily.
This will be the case when we want to configure whether to log or throw
an error for tasks that failed to be processed successfully.
This commit also creates two new folders and moves the code around a bit
to make it easier to navigate the code§:
* `execution/tasks`: specific helpers such as task completion handlers
* `execution/tasks/queues`: the `TaskQueue` implementations and helpers
PR Close#36083
The previous implementation mixed up the management
of locking a piece of code (both sync and async) with the
management of writing and removing the lockFile that is
used as the flag for which process has locked the code.
This change splits these two concepts up. Apart from
avoiding the awkward base class it allows the `LockFile`
implementation to be replaced cleanly.
PR Close#35861
With this change we spawn workers lazily based on the amount of work that needs to be done.
Before this change we spawned the maximum of workers possible. However, in some cases there are less tasks than the max number of workers which resulted in created unnecessary workers
Reference: #35717
PR Close#35719
ngcc uses a lockfile to prevent two ngcc instances from executing at the
same time. Previously, if a lockfile was found the current process would
error and exit.
Now, when in async mode, the current process is able to wait for the previous
process to release the lockfile before continuing itself.
PR Close#35131
The message now gives concrete advice to developers who
experience the error due to running multiple simultaneous builds
via webpack.
Fixes#35000
PR Close#35001
Ngcc adds properties to the `package.json` files of the entry-points it
processes to mark them as processed for a format and point to the
created Ivy entry-points (in case of `--create-ivy-entry-points`). When
running ngcc in parallel mode (which is the default for the standalone
ngcc command), multiple formats can be processed simultaneously for the
same entry-point and the order of completion is not deterministic.
Previously, ngcc would append new properties at the end of the target
object in `package.json` as soon as the format processing was completed.
As a result, the order of properties in the resulting `package.json`
(when processing multiple formats for an entry-point in parallel) was
not deterministic. For tools that use file hashes for caching purposes
(such as Bazel), this lead to a high probability of cache misses.
This commit fixes the problem by ensuring that the position of
properties added to `package.json` files is deterministic and
independent of the order in which each format is processed.
Jira issue: [FW-1801](https://angular-team.atlassian.net/browse/FW-1801)
Fixes#34635
PR Close#34870
Previously, it was possible for multiple instance of ngcc to be running
at the same time, but this is not supported and can cause confusing and
flakey errors at build time.
Now, only one instance of ngcc can run at a time. If a second instance
tries to execute it fails with an appropriate error message.
See https://github.com/angular/angular/issues/32431#issuecomment-571825781
PR Close#34722
This gives an overview of how much time is spent in each operation/phase
and makes it easy to do rough comparisons of how different
configurations or changes affect performance.
PR Close#32427
`ngcc` supports both synchronous and asynchronous execution. The default
mode when using `ngcc` programmatically (which is how `@angular/cli` is
using it) is synchronous. When running `ngcc` from the command line
(i.e. via the `ivy-ngcc` script), it runs in async mode.
Previously, the work would be executed in the same way in both modes.
This commit improves the performance of `ngcc` in async mode by
processing tasks in parallel on multiple processes. It uses the Node.js
built-in [`cluster` module](https://nodejs.org/api/cluster.html) to
launch a cluster of Node.js processes and take advantage of multi-core
systems.
Preliminary comparisons indicate a 1.8x to 2.6x speed improvement when
processing the angular.io app (apparently depending on the OS, number of
available cores, system load, etc.). Further investigation is needed to
better understand these numbers and identify potential areas of
improvement.
Inspired by/Based on @alxhub's prototype: alxhub/angular@cb631bdb1
Original design doc: https://hackmd.io/uYG9CJrFQZ-6FtKqpnYJAA?view
Jira issue: [FW-1460](https://angular-team.atlassian.net/browse/FW-1460)
PR Close#32427
This commit adds a new `TaskQueue` implementation that supports
executing multiple tasks in parallel (while respecting interdependencies
between them).
This new implementation is currently not used, thus the behavior of
`ngcc` is not affected by this change. The parallel `TaskQueue` will be
used in a subsequent commit that will introduce parallel task execution.
PR Close#32427
This change does not alter the current behavior, but makes it easier to
introduce `TaskQueue`s implementing different task selection algorithms,
for example to support executing multiple tasks in parallel (while
respecting interdependencies between them).
Inspired by/Based on @alxhub's prototype: alxhub/angular@cb631bdb1
PR Close#32427
Previously, `ngcc` needed to store some metadata related to the
processing of each entry-point. This metadata was stored in a `Map`, in
the form of `EntryPointProcessingMetadata` and passed around as needed.
After some recent refactorings, it turns out that this metadata (with
its only remaining property, `hasProcessedTypings`) was no longer used,
because the relevant information was extracted from other sources (such
as the `processDts` property on `Task`s).
This commit cleans up the code by removing the unused code and types.
PR Close#32427
In the past, a task's processability didn't use to be known in advance.
It was possible that a task would be created and added to the queue
during the analysis phase and then later (during the compilation phase)
it would be found out that the task (i.e. the associated format
property) was not processable.
As a result, certain checks had to be delayed, until a task's processing
had started or even until all tasks had been processed. Examples of
checks that had to be delayed are:
- Whether a task can be skipped due to `compileAllFormats: false`.
- Whether there were entry-points for which no format at all was
successfully processed.
It turns out that (as made clear by the refactoring in 9537b2ff8), once
a task starts being processed it is expected to either complete
successfully (with the associated format being processed) or throw an
error (in which case the process will exit). In other words, a task's
processability is known in advance.
This commit takes advantage of this fact by moving certain checks
earlier in the process (e.g. in the analysis phase instead of the
compilation phase), which in turn allows avoiding some unnecessary work.
More specifically:
- When `compileAllFormats` is `false`, tasks are created _only_ for the
first suitable format property for each entry-point, since the rest of
the tasks would have been skipped during the compilation phase anyway.
This has the following advantages:
1. It avoids the slight overhead of generating extraneous tasks and
then starting to process them (before realizing they should be
skipped).
2. In a potential future parallel execution mode, unnecessary tasks
might start being processed at the same time as the first (useful)
task, even if their output would be later discarded, wasting
resources. Alternatively, extra logic would have to be added to
prevent this from happening. The change in this commit avoids these
issues.
- When an entry-point is not processable, an error will be thrown
upfront without having to wait for other tasks to be processed before
failing.
PR Close#32427
Previously, `ngcc`'s programmatic API would run and complete
synchronously. This was necessary for specific usecases (such as how the
`@angular/cli` invokes `ngcc` as part of the TypeScript module
resolution process), but not for others (e.g. running `ivy-ngcc` as a
`postinstall` script).
This commit adds a new option (`async`) that enables turning on
asynchronous execution. I.e. it signals that the caller is OK with the
function call to complete asynchronously, which allows `ngcc` to
potentially run in a more efficient mode.
Currently, there is no difference in the way tasks are executed in sync
vs async mode, but this change sets the ground for adding new execution
options (that require asynchronous operation), such as processing tasks
in parallel on multiple processes.
NOTE:
When using the programmatic API, the default value for `async` is
`false`, thus retaining backwards compatibility.
When running `ngcc` from the command line (i.e. via the `ivy-ngcc`
script), it runs in async mode (to be able to take advantage of future
optimizations), but that is transparent to the caller.
PR Close#32427
This change does not alter the current behavior, but makes it easier to
introduce new types of `Executors` , for example to do the required work
in parallel (on multiple processes).
Inspired by/Based on @alxhub's prototype: alxhub/angular@cb631bdb1
PR Close#32427
Previously, when run with `createNewEntryPointFormats: true`, `ngcc`
would only update `package.json` with the new entry-point for the first
format property that mapped to a format-path. Subsequent properties
mapping to the same format-path would be detected as processed and not
have their new entry-point format recorded in `package.json`.
This commit fixes this by ensuring `package.json` is updated for all
matching format properties, when writing an `EntryPointBundle`.
PR Close#32052
This refactoring more clearly separates the different phases of the work
performed by `ngcc`, setting the ground for being able to run each phase
independently in the future and improve performance via parallelization.
Inspired by/Based on @alxhub's prototype: alxhub/angular@cb631bdb1
PR Close#32052