Posted: 2023-12-06
Last modified: 2023-12-06 @ 7aa92a7

Advent of Code 2023: Day 05

This post is part of a series: Advent of Code 2023

Code for solutions available on GitHub.

If You Give A Seed A Fertilizer

Time for some gardening! Oh boy I spent some time optimizing part 2 of this one.

The input looks like this:

seeds: 79 14 55 13

seed-to-soil map:
50 98 2
52 50 48

soil-to-fertilizer map:
0 15 37
37 52 2
39 0 15

There’s 5 more x-to-y map: chunks, but I left them out for brevity. The seeds: line is just a list of numbers that need to be threaded through this list of maps. The format of each map chunk is <destination> <source> <range length>. Meaning that the seed-to-soil map maps a contiguous range of 2 numbers starting at 98, to a contiguous range of 2 numbers starting at 50. So seed numbers 98 and 99 map to soil numbers 50 and 51 respectively. And seed numbers 50–97 map to soil numbers 52–99. Any number not covered in the map just maps to itself.

After mapping seed numbers to soil numbers, the soil numbers can be mapped to fertilizer numbers. The fertilizer numbers are mapped to water numbers. Then the water numbers to light numbers. Light numbers to temperature numbers. Temperature to humidity. And finally humidity to location. The goal is to find the seed that results in the lowest location number and output that. Phew. That’s quite a long description. Let’s get started.

Each map consists of a list of ranges. so let’s start with a Range struct:

struct Range {
    start: u64,
    length: u64
}

A mapping consists of a source and a destination range.

struct MapRange {
    source_range: Range,
    destination_range: Range,
}

The input was called an almanac, so a struct for the almanac containing a list of mappings is in order.

struct AlmanacMap {
    ranges: Vec<MapRange>
}

Again, parsing is uninteresting. It’s here. Let’s get on with it.

The AlmanacMap needs a method to perform the mapping to the next AlmanacMap. This is pretty simple to write (with fold!):

impl AlmanacMap {
  fn get(&self, input: u64) -> u64 {
    self
      .ranges
      .iter()
      .fold(input, |acc, range| {
        if range.source_range.start <= acc
          && acc < range.source_range.start + range.source_range.length
        {
          range.destination_range.start
            + (acc - range.source_range.start)
        } else {
          acc
        }
      })
  }
}

Now once we after parsing have an iterable of seed numbers, and an iterable of AlmanacMaps, threading the solution through all the AlmanacMaps is pretty straighforward (again, with fold!):

let mut lowest_location = u64::MAX;
for seed in seeds {
    let location = maps
        .iter()
        .fold(seed, |acc, map| {
            map.get(acc)
        });
    lowest_location = lowest_location.min(location);
}
lowest_location

And part 1 is done! That was almost too easy, what’s the problem?

Part 2

Uh oh. Seems like the seeds: line of the input wasn’t just a simple list. Each non-overlapping pair of two numbers actually represents a range, just like the ranges of the maps. Oh well, let’s just expand the ranges to a full list of numbers and reuse the solution for part 1.

…Yeah, that’s no good. It’s going to take a long time, as the numbers and range lengths in the actual input are on the order of billions. Though funnily enough, my benchmarking results were decent-ish. Reusing the solution for part 1 took 70 seconds on average. Over a minute, which is disappointing in the context of other solutions taking microseconds, but it’s still pretty fast compared to some other solutions I saw on the Advent of Code subreddit afterwards.

Instead, the smarter way to do it is to add functionality to the Range type, and map entire ranges of numbers at once. What can happen when trying to map a range? I considered these cases:

  1. The range is not mapped at all by the map. This is the simplest case.
  2. The entire range is covered by the map. Also pretty simple.
  3. The range partially overlaps, with the start not being covered by the map.
  4. The range partially overlaps, with the end not being covered by the map.
  5. The range completely overlaps, extending outside the range of the map.

Some diagrams are probably in order here.

Case 1:

  seed range |-------|
|=========================================================|
  mapped ranges        |------|   |-----|   |----------|
Case 2:

  seed range                                  |-------|
|=========================================================|
  mapped ranges        |------|   |-----|   |----------|
Case 3:

  seed range        |-------|
|=========================================================|
  mapped ranges        |------|   |-----|   |----------|
Case 4:

  seed range            |-------|
|=========================================================|
  mapped ranges        |------|   |-----|   |----------|
Case 5:

  seed range                     |-------|
|=========================================================|
  mapped ranges        |------|   |-----|   |----------|

There’s technically also the case where the start of the seed range matches one range, the middle matches maybe none, and the end matches another range, but these will be handled in a pretty similar way to case 5.

The key is that when parsing the AlmanacMaps, the ranges are sorted in order of increasing start. This then allows us to iterate through them wihout looking back. But to get there, the Range type needs some more functionality.

First, a Range needs to know if it contains a point.

impl Range {
    fn end(&self) -> u64 {
        self.start + self.length
    }

    fn contains(&self, location: &u64) -> bool {
        *location => self.start && *location < self.end()
    }
}

Then, a Range needs to know if it overlaps with another range.

impl Range {
    fn overlaps(&self, other: &Range) -> bool {
        self.contains(&other.start)
            || self.contains(&other.end() - 1)
    }
}

A way to get a Range that represents the intersection of two ranges would be useful.

impl Range {
    fn intersection(&self, other: &Range) -> Option<Range> {
        if self.overlaps(other) {
            let start = self.start.max(other.start);
            let len = self.end().min(other.end()) - start;
            Some(Range::new((start, len)))
        } else {
            None
        }
    }
}

And finally, a way to get a pair of ranges representing the parts of a range that are to the left or right of another (such as in cases 3, 4, and 5 above). In set theory, this is called the relative complement. Basically the whole set minus the intersection with another.

impl Range {
  fn relative_complement(&self, other: &Range) -> (Option<Self>, Option<Self>) {
    let left = if other.start >= self.start {
      None
    } else {
      let end = self.start.min(other.end());
      let length = end - other.start;
      Some(Range::new((other.start, length)))
    };

    let right = if other.end() <= self.end() {
      None
    } else {
      let start = self.end().max(other.start);
      let length = other.length - (self.end().saturating_sub(other.start));
      Some(Range::new((start, length)))
    };

    (left, right)
  }
}

Now what needs to happen when an AlmanacMap gets the mapping for a Range is that it returns a list of Ranges, not just a single range. This is because different sections of the range may or may not be mapped by the ranges in the AlmanacMap, effectively splitting the input range into multiple output ranges. This mapping can still be done in one sweep of the AlmanacMap’s ranges by keeping track of which range sections have and haven’t been mapped, and thanks to the ranges being sorted in order of increasing start, we can iterate through them in order without looking back and just do some bookkeeping. For each range in the AlmanacMap, we check every sub-range of the input range that hasn’t yet matched anything to see if it has an intersection with the current map range.

fn get(&self, source_range: &Range) -> Vec<Range> {
        let mut ranges = Vec::new();
        let mut ranges_to_check = vec![*source_range];
        for range in &self.ranges {
            let left_to_check = ranges_to_check.len();
            for _ in 0..left_to_check {
                let checked_range = ranges_to_check.pop().unwrap();

                if let Some(mut intersection) = range.source_range.intersection(&checked_range) {
                    let offset = intersection.start - range.source_range.start;
                    intersection.start = range.destination_range.start + offset;
                    ranges.push(intersection); // the intersection is completely mapped
                }

                let (left, right) = range.source_range.relative_complement(&checked_range);
                if let Some(left) = left {
                    ranges_to_check.push(left); // the sub-range to the left is not mapped. Check it next iteration of outer loop
                }

                if let Some(right) = right {
                    ranges_to_check.push(right); // the sub-range to the right is not mapped. Check it next iteration of outer loop
                }
            }
        }
        ranges.extend(ranges_to_check); // add the remaining ranges with identity mapping
        ranges
    }

And now we’re basically done. The solving part just needs a bit of a tweak, because now we have an iterable of seed ranges, not numbers.

let mut lowest_location = u64::MAX;
for range in seed_ranges {
    let location = maps
        .iter()
        .fold(vec![range], |ranges, map| {
            let mut new_ranges = Vec::new();
            for range in ranges {
                new_ranges.extend(map.get(&range));
            }
            new_ranges
        })
        .iter()
        .fold(u64::MAX, |acc, range| acc.min(range.start));

    lowest_location = lowest_location.min(location);
}

lowest_location

A double fold is needed here, because the get method returns a list of ranges, not just a single range. At the end, we get a list of output ranges that we simply fold over to get the minimum start value of the range and that’s the answer. I spent well over 2 hours trying to figure this out. But! The execution time went down from 70 seconds to about 100 microseconds. That’s almost a million times faster! That’s very nice :)

I also went back to my solution for part 1 and converted it to use ranges of length 1 for the seed numbers so that I could use the same solution code for both parts.

Posts in this series

  1. Advent of Code 2023: Day 01
  2. Advent of Code 2023: Day 02
  3. Advent of Code 2023: Day 03
  4. Advent of Code 2023: Day 04
  5. Advent of Code 2023: Day 05
  6. Advent of Code 2023: Day 06
  7. Advent of Code 2023: Day 07
  8. Advent of Code 2023: Day 08
  9. Advent of Code 2023: Day 09
  10. Advent of Code 2023: Day 10
  11. Advent of Code 2023: Day 11