I did manage to tweak this method to work with 3 multiplications, though.
ETA: I just realized you wanted to use 32x32 -> 64 products, while my approach assumes the existence of 64x64 -> 64 products; basically it's just a scaled-up version of the original question and likely not what you're looking for. Hopefully it's still useful though.
First, remove the bottom 8 bits of the two inputs and compute the 44x44->88 product. This can be done with the approach in the post. Then apply the algorithm again, combining that product with the product of the bottom halves of the inputs to get the full 52x52->104 output. The bounds are a bit tight, but it should work. Here's a numeric example:
a = 98a67ee86f8cf
b = da19d2c9dfe71
(a >> 20) * (b >> 20) = 820d2e04637bf428
(a >> 8) * (b >> 8) % 2**64 = 0547f8cdb2100210
->
(a >> 8) * (b >> 8) = 820d2e0547f8cdb2100210
(a * b) % 2**64 = 080978075f64355f
->
a * b = 820d2e0548080978075f64355f
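The gluing step in the example above can be sketched as follows. This is my own reconstruction of the overlap trick, not the linked playground code: `glue` and its `shift` parameter are names I invented, and `u128` is used only to hold the wide results — the multiplications themselves are all truncating 64-bit ones (plus one 32x32 product that happens to fit exactly).

```rust
/// Glue a truncated high product `hi` (approximating `p >> shift`) with
/// the low 64 bits `lo` of `p`. The two share `64 - shift` overlapping
/// bits, and the shortfall `(p >> shift) - hi` must be < 2^(64 - shift).
fn glue(hi: u128, lo: u64, shift: u32) -> u128 {
    let overlap = 64 - shift;
    let mask = (1u64 << overlap) - 1;
    // Recover the shortfall from the overlapping window, mod 2^overlap.
    let e = (lo >> shift).wrapping_sub(hi as u64) & mask;
    let top = hi + e as u128; // exact value of p >> shift
    ((top >> overlap) << 64) | lo as u128
}

fn main() {
    let a: u64 = 0x98a67ee86f8cf; // 52-bit inputs from the example
    let b: u64 = 0xda19d2c9dfe71;

    // Stage 1: 44x44 -> 88, from one exact 32x32 product and one
    // truncated 64-bit product (the overlap window is 40 bits wide).
    let h = (a >> 20).wrapping_mul(b >> 20); // 32x32, fits in 64 bits
    let l = (a >> 8).wrapping_mul(b >> 8);   // low 64 bits only
    let mid = glue(h as u128, l, 24);        // (a>>8)*(b>>8), 88 bits
    assert_eq!(mid, 0x820d2e0547f8cdb2100210);

    // Stage 2: 52x52 -> 104, gluing the 88-bit product from stage 1
    // with a*b mod 2^64 (the overlap window is 48 bits wide).
    let m = a.wrapping_mul(b);
    let full = glue(mid, m, 16);
    assert_eq!(full, 0x820d2e0548080978075f64355f);
    assert_eq!(full, (a as u128) * (b as u128)); // sanity check
    println!("{:x}", full);
}
```

The error bound is what makes the correction recoverable: in stage 1 the shortfall is below roughly 2^33, comfortably inside the 40-bit overlap, and in stage 2 it is below roughly 2^45, inside the 48-bit overlap — which is also why the bounds get tight as the input width grows.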
And my attempt at an implementation: https://play.rust-lang.org/?version=stable&mode=release&edit... I tried to go even higher, but the bounds seem to break at 55 bits.
https://www.quinapalus.com/qfplib.html
Nice write-up here, too; I like the idea of a firm float.
I wisely hit the back button :)
Then the full horror of it hit me.
The IEEE single-precision format with a hidden bit replaced earlier single-precision formats, most of which had a 24-bit significand and a 7-bit exponent. Those widths were chosen for alignment on byte boundaries, which simplified the software emulation of hardware floating-point units at a time when many computers lacked hardware FPUs.
When the idea of using a hidden bit freed one bit in the memory format, it had to be decided whether to use that bit for the exponent or for the significand.
The experience of using the older single-precision formats before 1980 indicated that the frequent overflows and underflows caused by a 7-bit exponent were more annoying than insufficient precision in the significand, so it was decided to use the hidden bit for increasing the exponent from 7 bits to 8 bits.
By 1980, efficient implementation in hardware had become the main consideration in the design of number formats, rather than whether some tricks could be useful for software emulation. The use of a hidden bit is inconvenient for software, which is why it had not been used earlier.
Some earlier formats had also used an 8-bit exponent and a hidden bit, but they differed from the IEEE format in that the exponent was byte-aligned and the sign bit was stored in the hidden bit's position. This simplified software emulation, but it had the disadvantage of making floating-point comparisons more complicated, as one could not compare the bytes in the same order as for integers.