The last time I queued up, I found that cutting down on remainder (modulo) operations reduced the running time,
and then I found an article:
Make sure frequently called functions are optimized: use division and remainder sparingly
The original post now returns a 404, so I could only find copies reposted by others.
The gist: division and remainder instructions can cost up to 80 times as many CPU cycles as addition and subtraction (the exact cost varies and can be higher). So in frequently called functions, and in loops with a large number of iterations, you can optimize by reducing the number of division and remainder operations. The article introduces some methods for this, such as using multiplication or subtraction instead.
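To make the "subtraction instead of remainder" idea concrete, here is a minimal sketch in C. It is not taken from the article: the function names `wrap_with_modulo` and `wrap_with_compare` are hypothetical, and the example assumes an index that advances by one step at a time and `size > 0`. Both functions return the same value; the second does one remainder up front and then wraps with a comparison inside the loop.

```c
#include <stddef.h>

/* Baseline: one remainder operation per iteration to wrap the index.
 * Assumes size > 0. */
size_t wrap_with_modulo(size_t start, size_t steps, size_t size)
{
    size_t idx = start % size;
    for (size_t i = 0; i < steps; i++) {
        idx = (idx + 1) % size;
    }
    return idx;
}

/* Same result without a remainder inside the loop: because the index
 * grows by exactly 1 each step, it can only ever reach `size`, so a
 * compare-and-reset is enough to wrap it. Assumes size > 0. */
size_t wrap_with_compare(size_t start, size_t steps, size_t size)
{
    size_t idx = start % size;   /* single remainder, outside the loop */
    for (size_t i = 0; i < steps; i++) {
        idx++;
        if (idx == size) {
            idx = 0;
        }
    }
    return idx;
}
```

Whether this is actually faster depends on the compiler and the CPU (an optimizer may already transform the first version), so it is worth measuring before committing to it.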
Then I saw another article:
Efficiency of modulo, multiplication, and division operations on the CPU and GPU
It says that the modulo operation is not as slow as it might seem.
On CPUs, modulo is the fastest of the three, while integer division and single-precision multiplication are almost equally efficient.
On the GPU (whatever that is), floating-point operations are the fastest, followed by modulo, and integer division is the slowest.
Operation time of division and remainder