aoc/year2019/
day16.rs

1//! # Flawed Frequency Transmission
2//!
3//! ## Part One
4//!
5//! For each phase we first compute the prefix sum of the digits. This allows us to compute
6//! the sum of any contiguous range of digits with only 2 lookups. For example the sum of the
7//! 5 to 8 is `36 - 10 = 26`.
8//!
9//! ```none
10//!                               -----------
11//!     Digits:     1  2  3   4   5  6  7   8   9
12//!     Prefix sum: 1  3  6 [10] 15 21 28 [36] 45
13//! ```
14//!
15//! The complexity of each phase is the [harmonic series](https://en.wikipedia.org/wiki/Harmonic_series_(mathematics))
16//! so the total complexity is `n` for the prefix sum calculation and `log(n)` for the next digits
17//! for a total of `nlog(n)`.
18//!
19//! As a minor optimization once the phase is greater than ⅓ of the digits, then the pattern
20//! simplifies to a sum of a single range. For example with 11 digits on phase 4 the pattern is:
21//!
22//! ```none
23//!   0 0 0 1 1 1 1 0 0 0 0
24//! ```
25//!
26//! ## Part Two
27//!
28//! If the index from the first 7 digits lies in the second half of the input then we only need to
29//! consider coefficients that form an [upper triangular matrix], for example:
30//!
31//! ```none
32//!   1  1  1  1
33//!   0  1  1  1
34//!   0  0  1  1
35//!   0  0  0  1
36//! ```
37//!
38//! After the first phase:
39//!
40//! ```none
41//!   1  2  3  4
42//!   0  1  2  3
43//!   0  0  1  2
44//!   0  0  0  1
45//! ```
46//!
47//! After the second phase:
48//!
49//! ```none
50//!   1  3  6 10
51//!   0  1  3  6
52//!   0  0  1  3
53//!   0  0  0  1
54//! ```
55//!
56//! After the third phase:
57//!
58//! ```none
59//!   1  4 10 20
60//!   0  1  4 10
61//!   0  0  1  6
62//!   0  0  0  1
63//! ```
64//!
65//! We can see that the third phase is the [triangular number] sequence and that the fourth phase
66//! is the [tetrahedral number] sequence. More generally the `i`th coefficient of the 100th phase
67//! is the [binomial coefficient] `(i + 99, i)`.
68//!
69//! We could compute the coefficient using the formula `nᵏ/k!` however this [grows rather large]
70//! and quickly will overflow even a `u128`.
71//!
72//! However we only need the coefficient modulo 10. [Lucas's theorem] allows us to compute binomial
73//! coefficients modulo some prime number. If we compute the coefficients modulo 2 and modulo 5
74//! then we can use the [Chinese remainder theorem] to find the result modulo 10.
75//!
76//! Two further empirical insights from [Askalski](https://www.reddit.com/user/askalski/)
77//! speed up part two even more. The first is that the coefficients modulo 2 form a cycle of
78//! length 128 and the coefficients of modulo 5 form a cycle of length 125. Since the digits also
79//! form a cycle of length 650 then we only need to process the
80//! [least common multiple](https://en.wikipedia.org/wiki/Least_common_multiple) of each cycle.
81//! This is 41600 for coefficients modulo 2 and 3250 for coefficients modulo 5.
82//!
83//! The second insight is that both of the cycles are very sparse. Only 8 out of 128 modulo 2 values
84//! and 2 out of 125 modulo 5 values respectively are non-zero. By storing the values as a
85//! compressed list of `(coefficient, skip value)` we only need to process of small fraction of
86//! the total digits. In total we need to compute `41600 * (8 / 128) + 3250 * (2 /125) = 2652`
87//! values per digit. This is much much less than the approximately 500,000 coefficients in the
88//! complete range.
89//!
90//! [prefix sum]: https://en.wikipedia.org/wiki/Prefix_sum
91//! [upper triangular matrix]: https://en.wikipedia.org/wiki/Triangular_matrix
92//! [triangular number]: https://en.wikipedia.org/wiki/Triangular_number
93//! [tetrahedral number]: https://en.wikipedia.org/wiki/Tetrahedral_number
94//! [binomial coefficient]: https://en.wikipedia.org/wiki/Binomial_coefficient
95//! [grows rather large]: https://oeis.org/A017763/b017763.txt
96//! [Lucas's theorem]: https://en.wikipedia.org/wiki/Lucas%27s_theorem
97//! [Chinese remainder theorem]: https://en.wikipedia.org/wiki/Chinese_remainder_theorem
98use crate::util::math::*;
99use crate::util::parse::*;
100use crate::util::slice::*;
101use std::array::from_fn;
102
103/// `C(n, k) % 2` This collapses to a special case of a product of only 4 possible values
104/// which are cyclic with a length of 128.
105///
106/// * `C(0, 0) = 1`
107/// * `C(1, 0) = 1`
108/// * `C(1, 1) = 1`
109/// * `C(0, 1) = 0`
110const BINOMIAL_MOD_2: [(i32, usize); 8] =
111    [(1, 4), (1, 4), (1, 4), (1, 4), (1, 4), (1, 4), (1, 4), (1, 100)];
112/// `C(n, k) % 5` Cyclic with a length of 125.
113const BINOMIAL_MOD_5: [(i32, usize); 2] = [(1, 25), (4, 100)];
114
115pub fn parse(input: &str) -> Vec<i32> {
116    input.trim().bytes().map(|b| b.to_decimal() as i32).collect()
117}
118
119pub fn part1(input: &[i32]) -> i32 {
120    let size = input.len();
121    let limit = size.div_ceil(3);
122
123    let mut digits = input.to_vec();
124    let mut prefix_sum = vec![0; size + 1];
125
126    for _ in 0..100 {
127        // Prefix sum for fast computation of arbitrary contiguous ranges.
128        let mut sum = 0;
129
130        for (i, digit) in digits.iter().enumerate() {
131            sum += digit;
132            prefix_sum[i + 1] = sum;
133        }
134
135        // The first third of the phases can contain the complete alternating pattern.
136        for (i, digit) in digits.iter_mut().enumerate().take(limit) {
137            let phase = i + 1;
138            let mut total = 0;
139            let mut sign = 1;
140
141            for start in (phase - 1..size).step_by(2 * phase) {
142                let end = (start + phase).min(size);
143                total += sign * (prefix_sum[end] - prefix_sum[start]);
144                sign *= -1;
145            }
146
147            *digit = total.abs() % 10;
148        }
149
150        // The remaining phases simplify to the sum of a single range.
151        for (i, digit) in digits.iter_mut().enumerate().skip(limit) {
152            let phase = i + 1;
153            let start = phase - 1;
154            let end = (start + phase).min(size);
155            *digit = (prefix_sum[end] - prefix_sum[start]).abs() % 10;
156        }
157    }
158
159    digits[..8].fold_decimal()
160}
161
162pub fn part2(input: &[i32]) -> i32 {
163    let size = input.len();
164    let lower = size * 5_000;
165    let upper = size * 10_000;
166
167    // This approach will only work if the index is in the second half of the input.
168    let start = input[..7].fold_decimal() as usize;
169    assert!(lower <= start && start < upper);
170
171    let first = compute(input, start, upper, BINOMIAL_MOD_2.iter().copied().cycle(), 128);
172    let second = compute(input, start, upper, BINOMIAL_MOD_5.iter().copied().cycle(), 125);
173
174    // Computes C(n, k) % 10
175    // Solving the Chinese remainder theorem for the special case of two congruences:
176    //
177    //     x ​≡ a₁ (mod n₁) ​≡ a₁ (mod 2)
178    //     x ​≡ a₂ (mod n₂) ≡ a₂ (mod 5)
179    //     N = n₁n₂ = 10
180    //     y₁ = N / n₁ = 5
181    //     y₂ = N / n₂ = 2
182    //     z₁ = y₁⁻¹ mod n₁ = 5⁻¹ mod 2 = 1
183    //     z₂ = y₂⁻¹ mod n₂ = 2⁻¹ mod 5 = 3
184    //     x ≡ a₁y₁z₁ + a₂y₂z₂ (mod 10) ≡ 5a₁ + 6a₂ (mod 10)
185    //
186    let result: Vec<_> = first.into_iter().zip(second).map(|(f, s)| (5 * f + 6 * s) % 10).collect();
187    result.fold_decimal()
188}
189
190/// Quickly computes a digit taking advantage of the fact
191/// that the LCM is much smaller than the whole range.
192fn compute<I>(input: &[i32], start: usize, upper: usize, mut nck: I, size: usize) -> [i32; 8]
193where
194    I: Iterator<Item = (i32, usize)>,
195{
196    from_fn(|offset| {
197        let start = start + offset;
198        let total = upper - start;
199
200        // Compute LCM, number of complete ranges and the remaining partial range.
201        let lcm = input.len().lcm(size);
202        let quotient = (total / lcm) as i32;
203        let remainder = total % lcm;
204
205        // Sum partial range first.
206        let mut index = start;
207        let mut partial = 0;
208
209        while index < start + remainder {
210            let (coefficient, skip) = nck.next().unwrap();
211            partial += input[index % input.len()] * coefficient;
212            index += skip;
213        }
214
215        // Then the full range.
216        let mut full = partial;
217
218        while index < start + lcm {
219            let (coefficient, skip) = nck.next().unwrap();
220            full += input[index % input.len()] * coefficient;
221            index += skip;
222        }
223
224        // Calculate sum for the entire range.
225        quotient * full + partial
226    })
227}