Fast bignum square computation
Problem:
Given a bigint x represented as a dynamic array of unsigned DWORDs (32-bit words), compute y = x^2 as fast as possible without precision loss.
Context:
The problem arises in the context of speeding up bignum divisions, where square operations are crucial.
Question:
How to compute y = x^2 in the most efficient manner?
Answer:
Initial Approach:
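The initial approach computes y = x*x with a plain multiplication, then exploits the symmetry of the partial products (x_i*x_j = x_j*x_i) so that each cross product is computed only once. This reduces the N*N limb multiplications to (N+1)*(N/2) multiplications.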
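The saving can be illustrated with a minimal Python sketch. It assumes little-endian lists of 32-bit limbs; the helper names `to_limbs`, `from_limbs` and `sqr_limbs` are hypothetical and not taken from the original implementation. Python integers never overflow, so the column sums are simply left unnormalized until the end; a real DWORD implementation would need a wider accumulator.

```python
BASE_BITS = 32
MASK = (1 << BASE_BITS) - 1

def to_limbs(n):
    """Split a non-negative int into little-endian 32-bit limbs."""
    limbs = []
    while n:
        limbs.append(n & MASK)
        n >>= BASE_BITS
    return limbs or [0]

def from_limbs(limbs):
    """Rebuild an int from little-endian 32-bit limbs."""
    n = 0
    for limb in reversed(limbs):
        n = (n << BASE_BITS) | limb
    return n

def sqr_limbs(x):
    """Square a limb array; returns (result limbs, limb multiplications used)."""
    n = len(x)
    acc = [0] * (2 * n)          # unnormalized column sums
    mults = 0
    for i in range(n):
        acc[2 * i] += x[i] * x[i]            # diagonal term, used once
        mults += 1
        for j in range(i + 1, n):
            acc[i + j] += 2 * x[i] * x[j]    # cross term, computed once, counted twice
            mults += 1
    carry = 0
    for k in range(2 * n):                   # carry propagation
        total = acc[k] + carry
        acc[k] = total & MASK
        carry = total >> BASE_BITS
    return acc, mults
```

For N limbs the counter ends at N diagonal products plus N*(N-1)/2 cross products, which is the (N+1)*(N/2) figure quoted above.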
Karatsuba Multiplication:
The Karatsuba multiplication algorithm is employed to further optimize multiplication operations. It uses divide-and-conquer to speed up multiplication by breaking down large numbers into smaller chunks.
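As a rough illustration of the divide-and-conquer step, here is a sketch on Python integers rather than DWORD arrays. The function name `karatsuba` and the `threshold_bits` parameter are assumptions made for this example. Each level replaces four half-size products with three.

```python
def karatsuba(a, b, threshold_bits=64):
    """Multiply two non-negative ints with Karatsuba's three-product split."""
    # Small operands: fall back to the ordinary product.
    if a.bit_length() <= threshold_bits or b.bit_length() <= threshold_bits:
        return a * b
    half = max(a.bit_length(), b.bit_length()) // 2
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask
    z2 = karatsuba(a_hi, b_hi, threshold_bits)            # high * high
    z0 = karatsuba(a_lo, b_lo, threshold_bits)            # low * low
    # (a_hi + a_lo) * (b_hi + b_lo) - z2 - z0 = a_hi*b_lo + a_lo*b_hi
    z1 = karatsuba(a_hi + a_lo, b_hi + b_lo, threshold_bits) - z2 - z0
    return (z2 << (2 * half)) + (z1 << half) + z0
```

The fallback threshold here is arbitrary; in practice it is tuned by measurement.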
Performance Measurements:
Testing showed that the optimized Karatsuba multiplication outperforms the initial O(N^2) multiplication algorithm once the numbers reach roughly 32*98 bits (98 DWORDs).
Modified Schönhage-Strassen Multiplication for SQR implementation:
The modified Schönhage-Strassen multiplication, which is based on the FFT (Fast Fourier Transform), was implemented to speed up SQR operations. However, floating-point rounding makes the result lose accuracy, so it is deemed unusable.
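To show where the accuracy loss comes from, the following sketch squares a number through a floating-point FFT. It is not the author's code: the recursive `fft`, the 8-bit digit size and the name `sqr_fft` are assumptions chosen for brevity. The digit convolution comes back as floating-point values that must be rounded to integers, and that rounding is the weak point.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def sqr_fft(x, digit_bits=8):
    """Square a non-negative int via FFT convolution of its digits."""
    mask = (1 << digit_bits) - 1
    digits = []
    while x:
        digits.append(x & mask)
        x >>= digit_bits
    if not digits:
        return 0
    size = 1
    while size < 2 * len(digits):        # room for the full convolution
        size *= 2
    spectrum = fft([complex(d) for d in digits] + [0j] * (size - len(digits)))
    squared = fft([v * v for v in spectrum], invert=True)
    result = 0
    for k, v in enumerate(squared):
        coeff = int(round(v.real / size))   # rounding step: source of errors
        result += coeff << (k * digit_bits)
    return result
```

With short inputs and small digits the rounding recovers the exact coefficients. As the input grows or the digits get wider, the accumulated floating-point error can push a coefficient past the rounding boundary, and the result silently becomes wrong.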
NTT Optimization:
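The NTT (Number Theoretic Transform) is used to optimize multiplication and SQR operations. It works in a finite field with integer arithmetic, so it avoids the rounding errors of a floating-point FFT. It is faster than FFT but requires modular arithmetic, and the size of the numbers it can process is limited because the convolution sums must not overflow the modulus.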
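A minimal NTT-based squaring might look like the sketch below. The modulus 998244353 with primitive root 3 is a common choice assumed here for illustration, not the one from the original implementation, and the names `ntt` and `sqr_ntt` are hypothetical. The explicit length check shows the size limit: every convolution coefficient has to stay below the modulus.

```python
MOD = 998244353   # prime, 119 * 2**23 + 1 (assumed for this sketch)
ROOT = 3          # primitive root modulo MOD

def ntt(a, invert=False):
    """Iterative radix-2 NTT modulo MOD; len(a) must be a power of two."""
    n = len(a)
    a = a[:]
    j = 0
    for i in range(1, n):                # bit-reversal permutation
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, MOD - 2, MOD)   # modular inverse of the root
        for start in range(0, n, length):
            w = 1
            for k in range(length // 2):
                u = a[start + k]
                v = a[start + k + length // 2] * w % MOD
                a[start + k] = (u + v) % MOD
                a[start + k + length // 2] = (u - v) % MOD
                w = w * w_len % MOD
        length <<= 1
    if invert:
        n_inv = pow(n, MOD - 2, MOD)
        a = [v * n_inv % MOD for v in a]
    return a

def sqr_ntt(x, digit_bits=8):
    """Square a non-negative int exactly via NTT convolution of its digits."""
    mask = (1 << digit_bits) - 1
    digits = []
    while x:
        digits.append(x & mask)
        x >>= digit_bits
    if not digits:
        return 0
    # Each coefficient is a sum of at most len(digits) products of two digits.
    if len(digits) * mask * mask >= MOD:
        raise ValueError("input too long for this modulus and digit size")
    size = 1
    while size < 2 * len(digits):
        size *= 2
    spectrum = ntt(digits + [0] * (size - len(digits)))
    coeffs = ntt([v * v % MOD for v in spectrum], invert=True)
    return sum(c << (k * digit_bits) for k, c in enumerate(coeffs))
```

With 8-bit digits and this modulus the input is capped at about 15,000 digits. Smaller digits raise the cap but lengthen the transform.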
Current State:
The current implementation uses the optimized Karatsuba algorithm for SQR operations when the number size exceeds a certain threshold, and the initial fast SQR approach for smaller numbers.
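The size-based dispatch can be sketched as follows, again on Python integers. `sqr_simple` is a hypothetical stand-in for the simple O(N^2) squaring routine, and the threshold constant reuses the 32*98 bit figure from the measurements above as an assumption; it should be tuned per machine. Because both operands are equal, the Karatsuba step needs only three recursive squarings.

```python
KARATSUBA_THRESHOLD_BITS = 32 * 98   # assumed from the measurement above

def sqr_simple(x):
    """Stand-in for the simple O(N^2) squaring routine used on small inputs."""
    return x * x

def sqr(x):
    """Square a non-negative int, choosing the algorithm by operand size."""
    bits = x.bit_length()
    if bits < KARATSUBA_THRESHOLD_BITS:
        return sqr_simple(x)
    half = bits // 2
    hi, lo = x >> half, x & ((1 << half) - 1)
    hi2 = sqr(hi)
    lo2 = sqr(lo)
    cross = sqr(hi + lo) - hi2 - lo2     # equals 2 * hi * lo
    return (hi2 << (2 * half)) + (cross << half) + lo2
```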
Outstanding Questions:
The author acknowledges that there might be a more trivial or efficient solution that has been overlooked. The search for a better algorithm continues.