$h = f * g$
The output $h$ may overlap $f$ or $g$.
Preconditions:
- $|f|$ bounded by $1.65 \cdot 2^{26}, 1.65 \cdot 2^{25}, 1.65 \cdot 2^{26}, 1.65 \cdot 2^{25}$, etc.
- $|g|$ bounded by $1.65 \cdot 2^{26}, 1.65 \cdot 2^{25}, 1.65 \cdot 2^{26}, 1.65 \cdot 2^{25}$, etc.
Postconditions:
- $|h|$ bounded by $1.01 \cdot 2^{25}, 1.01 \cdot 2^{24}, 1.01 \cdot 2^{25}, 1.01 \cdot 2^{24}$, etc.
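The alternating bounds can be checked mechanically. The sketch below assumes a ten-limb representation stored as `int32_t` (the limb count is not stated above), and the helper name `fe_within` and its scaling of the factor by 100 are hypothetical conveniences for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if every limb satisfies
   |f[i]| <= (num/100) * 2^even_bits       for even i,
   |f[i]| <= (num/100) * 2^(even_bits - 1) for odd i.
   The ten-limb layout is an assumption of this sketch. */
static int fe_within(const int32_t f[10], int64_t num, int even_bits)
{
    assert(even_bits >= 1 && even_bits <= 31);
    for (int i = 0; i < 10; i++) {
        int bits = (i % 2 == 0) ? even_bits : even_bits - 1;
        int64_t bound = (num << bits) / 100;   /* floor of the real-valued bound */
        int64_t v = f[i];
        if (v < -bound || v > bound) return 0;
    }
    return 1;
}
```

Under these assumptions, `fe_within(f, 165, 26)` tests the precondition and `fe_within(h, 101, 25)` tests the postcondition.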
Notes on implementation strategy:
Using schoolbook multiplication. Karatsuba would save a little in some
cost models.
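A loop-based sketch of the schoolbook product, before any carrying, is given here for orientation. It assumes ten limbs where limb $i$ has weight $2^{\lceil 25.5 i \rceil}$ and a modulus of $2^{255} - 19$; the function name `fe_mul_schoolbook` is hypothetical, and an optimized version would unroll the loops rather than branch.

```c
#include <assert.h>
#include <stdint.h>

/* Schoolbook product of two ten-limb elements, reduced with 2^255 = 19,
   left uncarried in 64-bit limbs. Layout and modulus are assumptions. */
static void fe_mul_schoolbook(int64_t h[10], const int32_t f[10], const int32_t g[10])
{
    assert(h != 0 && f != 0 && g != 0);
    int64_t t[10] = {0};
    for (int i = 0; i < 10; i++) {
        for (int j = 0; j < 10; j++) {
            int64_t p = (int64_t) f[i] * g[j];
            if ((i & 1) && (j & 1)) p *= 2;      /* odd*odd: limb weights overshoot by one bit */
            int k = i + j;
            if (k >= 10) { k -= 10; p *= 19; }   /* wrap past 2^255 with a factor of 19 */
            t[k] += p;
        }
    }
    for (int k = 0; k < 10; k++) h[k] = t[k];
}
```

With inputs inside the stated preconditions, each accumulated limb stays well within the range of `int64_t`.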
Most multiplications by 2 and 19 are 32-bit precomputations; cheaper than
64-bit postcomputations.
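To illustrate the trade-off, the sketch below computes one cross term both ways. It assumes the term is $f_1 g_9$, which in a ten-limb layout would need a factor of $2 \cdot 19 = 38$; the function names are hypothetical. With $|f_1|, |g_9| \le 1.65 \cdot 2^{25}$, both $2 f_1$ and $19 g_9$ still fit in `int32_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Precomputation: scale the 32-bit inputs, then do a single 32x32->64 multiply. */
static int64_t term_pre(int32_t f1, int32_t g9)
{
    int32_t f1_2  = 2 * f1;    /* fits in int32 under the stated bounds */
    int32_t g9_19 = 19 * g9;   /* |19 * g9| < 2^31 under the stated bounds */
    return f1_2 * (int64_t) g9_19;
}

/* Postcomputation: multiply first, then scale the 64-bit product. */
static int64_t term_post(int32_t f1, int32_t g9)
{
    return 38 * (f1 * (int64_t) g9);
}

/* Sanity check that both orders of operation agree on one input pair. */
static void check_terms_agree(int32_t f1, int32_t g9)
{
    assert(term_pre(f1, g9) == term_post(f1, g9));
}
```

Both functions return the same value; the first keeps the scaling in 32-bit arithmetic, which is the cheaper route the note describes.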
There is one remaining multiplication by 19 in the carry chain. One of the
multiply-by-19 precomputations can be merged into it, but the resulting data
flow is considerably less clean.
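A minimal sketch of where that multiplication by 19 would sit, assuming it belongs to the carry out of the top limb (limb 9, 25 bits) back into limb 0, which follows from $2^{255} \equiv 19$. The function name `carry_9_to_0` is hypothetical, and the code assumes `>>` on a negative `int64_t` is an arithmetic shift, which C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Carry from limb 9 around to limb 0. Layout and shift behavior are assumptions. */
static void carry_9_to_0(int64_t h[10])
{
    int64_t carry9 = (h[9] + ((int64_t) 1 << 24)) >> 25;   /* round to nearest multiple of 2^25 */
    h[0] += carry9 * 19;                                   /* the one 64-bit multiply by 19 */
    h[9] -= carry9 * ((int64_t) 1 << 25);
    assert(h[9] >= -((int64_t) 1 << 24) && h[9] < ((int64_t) 1 << 24));
}
```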
There are 12 carries below. 10 of them are 2-way parallelizable and
vectorizable. 11 carries would suffice, but then the data flow is much
deeper.
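One schedule consistent with these counts is sketched below. The specific order is an assumption, not taken from the text above: five independent pairs (0,4), (1,5), (2,6), (3,7), (4,8), followed by the wrap-around carry from limb 9 and a final carry from limb 0. The names `carry_step` and `carry_chain` are hypothetical, and `>>` on negative values is assumed to be an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* One carry out of limb i; even limbs hold 26 bits, odd limbs 25 (assumed layout). */
static void carry_step(int64_t h[10], int i)
{
    assert(i >= 0 && i < 10);
    int bits = (i % 2 == 0) ? 26 : 25;
    int64_t c = (h[i] + ((int64_t) 1 << (bits - 1))) >> bits;
    h[i] -= c * ((int64_t) 1 << bits);
    if (i == 9) h[0] += c * 19;   /* wrap past 2^255 */
    else        h[i + 1] += c;
}

/* 12 carries: the first 10 form pairs that touch disjoint limbs. */
static void carry_chain(int64_t h[10])
{
    static const int order[12] = {0, 4, 1, 5, 2, 6, 3, 7, 4, 8, 9, 0};
    for (int n = 0; n < 12; n++) carry_step(h, order[n]);
}
```

Each pair reads and writes disjoint limbs, so the two carries in a pair can run side by side; the last two steps are sequential because both involve limb 0.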
With tighter constraints on the inputs, the carries can be squeezed into int32.