Message ID computation makes extensive use of big integer
multiplications in order to translate the message's fingerprint into
a numerical representation. In large compilations with heavy use of i18n
this was showing up high in profiler sessions.
There are two factors contributing to the bottleneck:
1. a suboptimal big integer representation using strings, which requires
repeated allocation and conversion from a character to numeric digits
and back.
2. repeated computation of the necessary base-256 exponents and their
multiplication factors.
The first bottleneck is addressed using a representation that uses an
array of individual digits. This avoids repeated conversion and
allocation overhead is also greatly reduced, as adding two big integers
can now be done in-place with virtually no memory allocations.
The second point is addressed by a memoized exponentiation pool to
optimize the multiplication of a base-256 exponent.
As an additional optimization are the two 32-bit words now converted to
decimal per word, instead of going through an intermediate byte buffer
and doing the decimal conversion per byte.
The results of these optimizations depend a lot on the number of i18n
messages for which a message should be computed. Benchmarks have shown
that computing message IDs is now ~6x faster for 1,000 messages, ~14x
faster for 10,000 messages, and ~24x faster for 100,000 messages.
PR Close#39694