Module day15

Module day15 

Source
Expand description

§Dueling Generators

Multithreaded approach using worker threads to generate batches of numbers for judging. Part one can be checked in parallel, but part two must be sent to a single thread as the indices must be checked in order.

The sequence of numbers are modular exponentiation so we can jump to any location in the sequence, without needing to know the previous numbers.

The generator is in the hot path, so anything we can do to make it run faster is worthwhile; start by observing that our divisor 0x7fffffff is of the form 2ᵏ - 1, which lends itself well to computing a remainder with less work than a hardware division (the analysis here works for any number adjacent to a power of two, not just Mersenne primes). At a high level, computing X % Y is the same as repeatedly subtracting Y from a starting point of X until reaching a value less than Y. How many times does that subtraction occur? That’s easy, X / Y. But when dividing by Y is expensive (a hardware division by an odd number takes multiple clock cycles), what if we divide by Y + 1 instead (dividing by 2ᵏ is just performing a bit mask). Conceptually, the remainder after each subtraction of Y + 1 grows by an error of one until we reach a remainder of X % (Y + 1) - but we know the total error: it was the number of times we subtracted the denominator, or X / (Y + 1); and that value is also available, with just a bit shift. Adding the 31-bit adjusted remainder with the 31-bit error can overflow to 32 bits; then a final comparison against MOD gets the correct answer in fast_mod faster than hardware division. See also this post.

Structs§

Block 🔒
Generated numbers from start to start + BLOCK.
Shared
State shared between all threads.

Constants§

BLOCK 🔒
MOD 🔒
PART_ONE 🔒
PART_TWO 🔒

Functions§

fast_mod 🔒
Fast computation of n % 0x7fffffff.
parse
part1
part2
receiver 🔒
sender 🔒

Type Aliases§

Input 🔒