Making games play themselves with Rust Part 2: Parsing the Game State

Introduction

Last time I laid the foundation for game state parsing, namely reading pixels from the screen and normalizing the coordinate system so we don’t have to worry about the rest of the screen outside the Dungeons and Diagrams window. Today I’d like to go from a screenshot with half a million pixels to a Rust object that I can write a solver around. With that said, my objectives for today are:

  1. Detect tile contents (Empty, Treasure, or Monster)
  2. Parse the numbers on the top and left side of the board
  3. Parse the room seed
  4. Read and write the board state to/from a file
  5. (Bonus) Testing

Quick note about error handling

In the previous post I made some custom error types to handle all the failure modes of my program. I ended up rewriting these to use anyhow, which reduces boilerplate in cases where you don’t need to recover from errors. If I hit an error at runtime, it is enough to print a message with some information about what went wrong and exit.

The Game State

A puzzle in Dungeons and Diagrams has a pretty simple state representation. The puzzle requires only 16 numbers between 0 and 7 (inclusive) and 64 tiles which can each be in one of 5 states. For the randomly generated puzzles there is additionally a seed between 0 and 99999999, but the original 63 curated puzzles do not have one.

Examples of each state
From left to right: Path, Treasure, Wall, Empty, Monster

I chose to represent the tiles as an enum with 5 variants. The numbers on each side of the board, which give the number of walls required in each row and column, are represented with byte arrays. Lastly the seed, which may or may not be present depending on whether this is a curated puzzle, is wrapped in an Option. It can range from 0 to 99,999,999 so it fits nicely in a u32.

enum Tile {
    Empty,
    Wall,
    Path,
    Treasure,
    Monster,
}

struct Puzzle {
    tiles: [[Tile; 8]; 8],
    top_nums: [u8; 8],
    left_nums: [u8; 8],
    pub seed: Option<u32>,
}

For now I used u8s to represent the numbers because, given the rules, they can never be larger than 8. In the future I’d like to try generating extremely large puzzles of my own, so I may come back and revise this struct to use a larger datatype for these fields.

Structure from Pixels

I had to teach my computer how to see - it needed some way to take a collection of ~500 thousand pixels and intelligently fill out the Puzzle struct. It was tempting to reach into the big bag of AI tricks that have become so prevalent in recent years, but I wanted complete understanding of the model being used to analyze my screenshots. The parsing had to be accurate without relying on flimsy statistical assurances.

Since there are only a small number of digits and a handful of monsters to identify, I figured I could use the game’s retro style to my advantage. The low resolution and limited color palette vastly cut down the amount of patterns I had to consider. Furthermore, I assumed the player hasn’t made any moves to solve the puzzle yet so there will be no Wall or Path tiles placed, which reduced the possibilities to the 3 remaining options (Monster, Treasure, and Empty).

Discriminants

Let me quickly introduce the concept of discriminants, which I’ll be making heavy use of in this post. I’m not talking about the discriminants you might be familiar with from linear algebra, but rather the more general concept of a feature that enables things to be distinguished from one another. For example, crocodiles and alligators are often mistaken for one another, but you can easily tell them apart by the shape of the snout; a crocodile’s is long and pointy whereas an alligator’s is shorter and rounder.

Translating a concept like “snout shape” to a computer model is something that is well-suited to machine learning algorithms, but concepts like “pixel color” or “number of pixels in a region matching a predicate” are much simpler to implement, deterministic, and more performant to run. You might even remember that I already used one in the last post to find the offset of the game window using a 3-pixel pattern. In the next few sections I’ll cover the choice and application of discriminants to parse tile states, wall counts, and the seed.

Parsing the Tiles

Recall that I only need to determine which of three states a tile is in. It was pretty straightforward to tell if a tile is Empty by comparing the tile to the static background image. From there, I suspected I could tell whether it was Treasure or Monster by comparing the pixel colors from each of the sprites. I theorized that if I overlaid images of all the monster sprites with the treasure sprite, I could find at least one pixel which satisfies the following constraints:

  1. It is opaque in every sprite
  2. Its color in each monster sprite is not equal to the treasure chest’s color
  3. Its color in each monster sprite is not equal to the background color

More formally, I hoped to find a position p within the 32x32 pixel sprite such that the colors for all the monsters at p and the color for the treasure sprite at p form a disjoint set. If such a case could be found, the tile parsing discriminant would be a single pixel and the parsing code would be but a morsel.

match pixel {
    background_color => Tile::Empty,
    treasure_color => Tile::Treasure,
    _ => Tile::Monster,
}
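The three constraints can be checked mechanically. Here’s a minimal sketch (my own helper, not from the actual tooling) that tests whether the colors at a candidate offset form the required disjoint set:

```rust
/// Returns true if a pixel at a candidate offset can discriminate the three
/// tile states: no monster color may collide with the treasure color or any
/// background color, and the treasure color must differ from every background.
fn is_discriminant(
    monster_colors: &[[u8; 4]],
    treasure_color: [u8; 4],
    background_colors: &[[u8; 4]],
) -> bool {
    monster_colors
        .iter()
        .all(|m| *m != treasure_color && !background_colors.contains(m))
        && !background_colors.contains(&treasure_color)
}
```

Note that two monsters sharing a color is fine, since they both map to Tile::Monster anyway.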

Computing the tile discriminant turned out to be quite involved. I have left the details to a separate bonus post covering texture decoding, sprite registration (alignment), and use of imagemagick, a command-line image editing tool. It’s not required reading but it contains a proof for the following statement which is important for the rest of the post:

Sampling a single pixel in a tile at offset (16, 12) yields enough information to tell whether it contains a monster, treasure, or nothing.

There was one small catch, which is that the background color at that offset varies for each tile. I fixed this by building a lookup table containing the color at (16, 12) for each tile.

Background pixels
Left: The pixels with (16, 12) offset in each tile are highlighted. Right: The 64 pixels extracted into a single image.

I extracted these 64 sample pixels and stored them in a const array, alongside the color of the treasure sprite at the same offset. Each pixel is 4 bytes, represented as [u8; 4].

const TREASURE_COLOR: [u8; 4] = [220, 170, 109, 255];
const BACKGROUND_PIXELS: [[[u8; 4]; 8]; 8] = [
    [
        [176, 128, 93, 255], [55, 58, 59, 255],
        [125, 113, 90, 255], [54, 58, 55, 255],
        [176, 128, 93, 255], [55, 58, 59, 255],
        [125, 113, 90, 255], [54, 58, 55, 255],
    ],
    // for brevity, only 8 pixel values are shown here
];

Now I could loop over each tile, compute the offset of the sample points, and then check against the background/treasure colors to determine the contents of the cell.

// Input image is a cropped view of only the tiles
fn parse_tiles(img: SubImage<&RgbaImage>) -> [[Tile; 8]; 8] {
    let mut tiles = [[Tile::Empty; 8]; 8];
    for tile_y in 0..8usize {
        for tile_x in 0..8usize {
            // Lookup the background color for the current tile
            let bg_color = BACKGROUND_PIXELS[tile_y][tile_x];
            let px = tile_x as u32 * TILE_STRIDE + TILE_SAMPLE_POINT.0;
            let py = tile_y as u32 * TILE_STRIDE + TILE_SAMPLE_POINT.1;
            
            // Fetch the color at the tile's sample point
            let sample = img.get_pixel(px, py).0;

            // #computervision
            if sample == bg_color {
                tiles[tile_y][tile_x] = Tile::Empty;
            } else if sample == TREASURE_COLOR {
                tiles[tile_y][tile_x] = Tile::Treasure;
            } else {
                tiles[tile_y][tile_x] = Tile::Monster;
            }
        }
    }

    tiles
}

This logic turned out to be surprisingly simple and reminded me of the hotdog, not hotdog (warning: language) sketch from the show Silicon Valley.

Now I was ready to finally test it out. I wrote a function to overlay the detected state on the screenshot, with red being a monster and green being a treasure, and generated a bunch of random puzzles to test it. The overlay was nice because I didn’t have to look at a bunch of debug text on the command line.

100 random puzzles with the parsed state overlaid.

Reading Numbers

With tile parsing out of the way, I could get started on the numbers along the edge of the board. These numbers give crucial information to the player about how many walls should be placed in each row and column. Unlike the tiles where I only had 3 states to discern, there are 9 digits in two colors each that might show up in a puzzle, which is 18 potential states. Having more states and less color variability means my discriminant will likely need to be more than one pixel.

Digits
The digit sprites. The red color indicates that row or column doesn't have the correct number of walls placed.

I could whittle this down, however, by carefully considering the mechanics of the game. The brown numbers indicate a satisfied constraint (the correct number of walls in that row/column) and red represents an unsatisfied constraint. Considering any new puzzle will have no walls placed, the following must be true: every 0 will be brown (a constraint of zero walls starts out satisfied), and every digit from 1 through 7 will be red.

Additionally, after exploring a few thousand random puzzles, I never once came across an 8 on the board, so I assume the developer explicitly forbade it in the level generator, as it effectively just creates smaller, less interesting puzzles. To summarize, the only digits I will see on a new board are brown 0 and red 1 through 7: a total of 8 possible states to look out for. To begin parsing, I started by isolating the pixels of the digits from the rest of the game with a threshold filter.

Game image, thresholded.
Center: RGB values of the digits. Right: Game masked to only allow pixels with green = 91 and 69 ≤ blue ≤ 78.

The brown and red digits coincidentally had equivalent green values, and pretty similar blue values which let me produce a tight threshold that accurately excised the numbers from the background. With the isolated pixels I decided to try counting the number of pixels for each digit to see if they each had a unique count I could key off. I only used the top half of the bounding box so as to avoid large monster sprites covering up the bottom half of the numbers and throwing the results off. My function produced the following output.

[102, 54, 82, 68, 84, 94, 64, 72]

These are the pixel counts for each digit from 0 on the left to 7 on the right. Thankfully they were all unique which meant I could map these counts directly to the digits they represent. The only caveat is that it required checking 22x16=352 pixels per digit. With 16 digits it came to over 5600 pixels to check and I wanted to minimize that. I wrote another function, this time to count the pixels with an iteratively shrinking region to try and identify the smallest possible region that could still uniquely identify all 8 digits.

pub fn get_large_digit_discriminant() {
    let digits_img = image::open(LARGE_FONT_PATH).unwrap();

    // Extract individual digits (brown 0, red 1-7)
    let digits_imgs = [0, 1, 2, 3, 4, 5, 6, 7].map(|i| {
        digits_img.view(
            DIGIT_OFFSETS[i],
            if i == 0 { 32 } else { 0 },
            MAX_PATTERN_WIDTH,
            MAX_PATTERN_HEIGHT,
        )
    });

    for (i, digit) in digits_imgs.iter().enumerate() {
        digit
            .to_image()
            .save(format!("script_output/digit_discrim/{i}.png"))
            .unwrap();
    }
    // Create a window with size 'len' and move it across each digit
    let mut best_size = u32::MAX;
    // Iterate over windows of varying sizes
    for width in (1..=MAX_PATTERN_WIDTH).rev() {
        for height in (1..=MAX_PATTERN_HEIGHT).rev() {
            // I need a minimum of 8 pixels in the region
            if width * height < 8 {
                continue;
            }
            // Iterate over all positions
            for y in 0..=MAX_PATTERN_HEIGHT - height {
                for x in 0..=MAX_PATTERN_WIDTH - width {
                    // Count digits in the region that pass threshold filter
                    let counts = [0, 1, 2, 3, 4, 5, 6, 7].map(|i| {
                        digits_imgs[i]
                            .view(x, y, width, height)
                            .pixels()
                            .map(|(_, _, p)| p.0) // destructure the pixel
                            .filter(|p| p[1] == 91 && p[2] >= 69 && p[2] <= 78)
                            .count()
                    });
                    // Check that the counts are unique
                    let unique_count = HashSet::from(counts).len();
                    if unique_count >= 8 {
                        if width * height < best_size {
                            best_size = width * height;
                            println!("{width}x{height}+{x}+{y}");
                            println!("  {counts:?}");
                        }
                    }
                }
            }
        }
    }
} // wheeee!

I added a bit of extra code not shown here to output frames to an animation showing the refinement of the discriminant region.

Refining the discriminant region for the digits after thresholding.

Looking at just the last frame of the animation, you can see quite easily that each blue region contains a unique number of white pixels.

Digit discrimination
Each digit has a unique number of white pixels in the search region.

The output of the function told me that the region has a size of 4x2 pixels with an offset of (3, 11). The counts for each digit are:

| Digit | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|-------|---|---|---|---|---|---|---|---|
| Count | 4 | 0 | 5 | 2 | 3 | 8 | 6 | 1 |

I slapped this into a match statement and boom! Optical character recognition!

// Map the discriminant values to the digits they represent
fn count_to_digit(count: usize) -> Result<u8> {
    Ok(match count {
        4 => 0u8,
        0 => 1,
        5 => 2,
        2 => 3,
        3 => 4,
        8 => 5,
        6 => 6,
        1 => 7,
        _ => return Err(anyhow!("Error parsing digit")),
    })
}

const SAMPLE_OFFSET: (u32, u32) = (3, 11);  // From above computation
const SAMPLE_SIZE: (u32, u32) = (4, 2);     // From above computation
const TILE_STRIDE: u32 = 33;                // Spacing between digits
const TOP_NUMS_OFFSETS: [u32; 8] = [1, 0, 0, 0, 0, 0, 0, 0];
const LEFT_NUMS_OFFSETS: [u32; 8] = [0, 2, 2, 1, 1, 2, 2, 1];

// Pass in two subimages cropped to the numbers on the top and left sides
fn parse_wall_counts(
    top_img: SubImage<&RgbaImage>,
    left_img: SubImage<&RgbaImage>,
) -> Result<([u8; 8], [u8; 8])> {
    let mut top_nums = [0; 8];
    let mut left_nums = [0; 8];
    for i in 0..8 {
        // Crop the image to a single digit, threshold, and count pixels
        let top_x = i * TILE_STRIDE + SAMPLE_OFFSET.0 + TOP_NUMS_OFFSETS[i as usize];
        let top_y = SAMPLE_OFFSET.1;
        let top_count = top_img
            .view(top_x, top_y, SAMPLE_SIZE.0, SAMPLE_SIZE.1)
            .pixels()
            .map(|(_, _, p)| p.0)
            .filter(|&[_r, g, b, _a]| g == 91 && b >= 69 && b <= 78)
            .count();

        // Repeat for the numbers on the left side
        let left_x = SAMPLE_OFFSET.0;
        let left_y = i * TILE_STRIDE + SAMPLE_OFFSET.1 + LEFT_NUMS_OFFSETS[i as usize];
        let left_count = left_img
            .view(left_x, left_y, SAMPLE_SIZE.0, SAMPLE_SIZE.1)
            .pixels()
            .map(|(_, _, p)| p.0)
            .filter(|&[_r, g, b, _a]| g == 91 && b >= 69 && b <= 78)
            .count();

        // Map the counts to digits using the discriminant values
        top_nums[i as usize] = count_to_digit(top_count)?;
        left_nums[i as usize] = count_to_digit(left_count)?;
    }

    Ok((top_nums, left_nums))
}

You may have also noticed the definition of TOP_NUMS_OFFSETS and LEFT_NUMS_OFFSETS. When I first ran this it wasn’t giving proper results until I realized the digit sprites have an extra offset when drawn to screen and don’t follow a perfect grid. Each one is jittered by 0-2 pixels. I computed these offsets manually and added them back into the coordinate computation to fix it. I modified my parsing overlay function to also show the numbers my program thought it was seeing.

100 more random puzzles with the parsed state overlaid, including wall counts.

Harvesting the Seed

The last piece of the puzzle is the room seed. It’s an 8 digit number above the puzzle that can be used to revisit puzzles you’ve played in the past. I wanted to build a simple database of puzzles to reuse for testing without relying on image parsing or even having the game running all the time, and the seed will let me remove duplicates. It can also serve as a small, easy-to-copy identifier which I can pass around without dragging the rest of the puzzle data with it. Let’s take a look at the font and separate it from the background like we did with the wall counts.

Seed thresholding
A bunch of random seeds. On the right the image is thresholded with red = 122.

I’ll zoom in on the font a bit more so you can see some of the observations I made.

Small font
Enhance!

What I noticed:

  1. The font is non-monospaced (polyspaced?)
  2. The text appears to be center-justified
  3. There is exactly 1 pixel of spacing between characters
  4. The digits are all 7 pixels tall and between 4 and 7 pixels wide
  5. The leftmost column of pixels forms a unique pattern for each digit

The last observation turned out to be the key to success. To elaborate on the idea, consider the first digit, 0. The column reads from top to bottom 0, 0, 1, 1, 1, 0, 0. If we combine these bits into a binary number 0b0011100, it corresponds to a value of 28 which I’ll call its hash. Moving to the next digit, we get 0b0100001 which is 33. 2 has a hash of 17, and so on.
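The bit-combining step is just a fold over the column, most significant bit first. A minimal sketch:

```rust
// Fold a thresholded 7-pixel column (1 = lit, 0 = background) into its
// 7-bit hash, reading top to bottom with the top pixel as the high bit.
fn column_hash(column: [u8; 7]) -> u8 {
    column.iter().fold(0, |acc, &bit| (acc << 1) | bit)
}
```

For example, `column_hash([0, 0, 1, 1, 1, 0, 0])` gives 28, matching the 0 glyph described above.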

Small font2
The same font, but with the first column of each digit highlighted in blue.

For completeness, here’s the full table:

| Digit | 0  | 1  | 2  | 3  | 4 | 5   | 6  | 7  | 8  | 9  |
|-------|----|----|----|----|---|-----|----|----|----|----|
| Hash  | 28 | 33 | 17 | 18 | 8 | 122 | 62 | 16 | 54 | 50 |

Fortunately for me, each of these hashes is unique which meant I could use them as the discriminant to parse the seed. I developed a simple algorithm to complete this task.

  1. Starting from the left, consider a column of 7 pixels, thresholded to be either black or white
  2. Read the pixels from top to bottom as a 7-bit binary number n, and call this the hash
  3. Look up the hash in a table to identify which digit we’re looking at and how many columns to advance
  4. Accumulate each digit into a single variable

For step 3, I needed a table which had each digit’s width so I could skip ahead each time I identified a digit. I just counted the pixels manually and added 1 for the extra space between characters, tabulating them in an array.

// Width of each digit + 1 (to include the kerning)
const SEED_OFFSETS: [u32; 10] = [8, 5, 8, 8, 7, 8, 8, 8, 8, 8];

Parsing the seed is then pretty easy. See for yourself!

const SEED_BASE: (u32, u32) = (109, 103);   // Offset to top-left pixel of seed
const SEED_SIZE: (u32, u32) = (63, 7);      // Size of full seed in pixels
fn parse_seed(img: SubImage<&RgbaImage>) -> Result<Option<u32>> {
    // Parse seed
    let mut seed = 0u32;
    let mut seed_present = false;
    let mut x = 0;

    // Scan from the left
    while x < SEED_SIZE.0 {
        // Extract a column and compute the hash
        let hash = img
            .view(x, 0, 1, SEED_SIZE.1)
            .pixels()
            .map(|(_x, _y, Rgba([r, _g, _b, _a]))| if r == 52 { 1 } else { 0 })
            .fold(0, |acc, x| (acc << 1) + x);

        // Column was empty, move to the next column
        if hash == 0 {
            x += 1;
            continue;
        }

        // I track this to differentiate between the cases of no seed present and a seed of 00000000
        seed_present = true;

        // Map the hash back to a digit
        let digit = match hash {
            28 => 0,
            33 => 1,
            17 => 2,
            18 => 3,
            8 => 4,
            122 => 5,
            62 => 6,
            16 => 7,
            54 => 8,
            50 => 9,
            _ => return Err(anyhow!("Seed parse error")),
        };

        // Keep running tally of the seed as we find each digit
        seed = seed * 10 + digit;
        x += SEED_OFFSETS[digit as usize];
    }

    // If there was no seed, return None (for curated puzzles with no seed)
    Ok(seed_present.then_some(seed))
}

Bringing it All Together

By this point I had individually figured out how to parse all the parts of the game state. I just had to wrap it all up in a bow with a function that would handle segmenting the image into smaller regions and delegating to the appropriate parsing functions. The result is a populated Puzzle struct.

// Cropping coordinates
const BOARD_BASE: (u32, u32) = (49, 175);
const BOARD_SIZE: (u32, u32) = (264, 265);
const TOP_NUMS_BASE: (u32, u32) = (55, 138);
const TOP_NUMS_SIZE: (u32, u32) = (263, 32);
const LEFT_NUMS_BASE: (u32, u32) = (19, 174);
const LEFT_NUMS_SIZE: (u32, u32) = (32, 263);

impl Puzzle {
    pub fn from_image(img: SubImage<&RgbaImage>) -> Result<Self> {
        // Crop regions for the board, seed, and wall counts
        let tiles = img.view(BOARD_BASE.0, BOARD_BASE.1, BOARD_SIZE.0, BOARD_SIZE.1);
        let top_nums = img.view(
            TOP_NUMS_BASE.0,
            TOP_NUMS_BASE.1,
            TOP_NUMS_SIZE.0,
            TOP_NUMS_SIZE.1,
        );
        let left_nums = img.view(
            LEFT_NUMS_BASE.0,
            LEFT_NUMS_BASE.1,
            LEFT_NUMS_SIZE.0,
            LEFT_NUMS_SIZE.1,
        );
        let seed = img.view(SEED_BASE.0, SEED_BASE.1, SEED_SIZE.0, SEED_SIZE.1);

        // Parse each region
        let tiles = parse_tiles(tiles);
        let (top_nums, left_nums) = parse_wall_counts(top_nums, left_nums)?;
        let seed = parse_seed(seed)?;

        Ok(Self {
            tiles,
            top_nums,
            left_nums,
            seed,
        })
    }
}

I don’t have much more to say about this function, but I whipped up another overlay image to highlight the parsing regions this time. Red for tiles, blue for the two digit regions, and green for the seed.

Parsing overlay
The 4 regions being passed to the parsers.

Serialization/Deserialization

I mentioned in the introduction that I wanted to store parsed puzzles in a file. This would let me run a parsing loop overnight to collect a couple hundred thousand puzzles and use the pre-parsed puzzles in automated testing in the future. To start, I needed to come up with a file format that is relatively compact and simple to read and write.

Recall the original definition of the Puzzle struct:

struct Puzzle {
    tiles: [[Tile; 8]; 8],
    top_nums: [u8; 8],
    left_nums: [u8; 8],
    seed: Option<u32>,
}

We have tiles, which holds 64 values of the Tile enum. Tile has 5 variants, which would require 3 bits per tile to encode. Remember though that I only care about the Monster, Treasure, and Empty states for now. I chose to use two bitmaps to encode the treasure and monster locations, and assume any tile not flagged in either is empty. I build each bitmap by looping over the tiles and cumulatively bit-shifting the state into two u64s. In the future when I get to generating my own larger puzzles, I will explore run-length encoding of the tile data, which should be very efficient.

For the top and left wall count numbers, there are a combined 16 values which are between 0-7 inclusive, requiring 3 bits each for a total of 48 bits. I compute the value in a u64 and then write out only the lowest 6 bytes.
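A standalone sketch of the packing and its matching extraction (the same shift arithmetic the deserializer needs: the first of the 16 values ends up at bit 45):

```rust
// Pack 16 values in 0..=7 into the low 48 bits of a u64, first value highest.
fn pack_counts(nums: &[u8; 16]) -> u64 {
    nums.iter().fold(0u64, |acc, &x| (acc << 3) | u64::from(x))
}

// Recover value i by shifting it down and masking off its 3 bits.
fn unpack_count(packed: u64, i: u32) -> u8 {
    ((packed >> (45 - 3 * i)) & 0b111) as u8
}
```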

The seed is an optional value which ranges from 0-99999999 (100 million values). I could add an extra flag bit (which would cost a full byte) indicating whether or not the seed is present, but it is even easier to add 1 to the seed and store a value of 0 when the seed is not present. On deserialization, I do the opposite and subtract 1 from any non-zero value. The encoded maximum of 100,000,000 fits comfortably in a u32, which I write out in full (4 bytes).
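The off-by-one trick in isolation (a small sketch of the scheme, worth spelling out because the None case is easy to get wrong):

```rust
// None becomes 0; Some(s) becomes s + 1, so every stored value is unambiguous.
fn encode_seed(seed: Option<u32>) -> u32 {
    seed.map_or(0, |s| s + 1)
}

// Invert the shift, mapping a stored 0 back to None.
fn decode_seed(raw: u32) -> Option<u32> {
    (raw != 0).then(|| raw - 1)
}
```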

This is what the data will look like when written to a 26 byte buffer. It is written in little endian format, chosen arbitrarily.

| Offset | Description        |
|--------|--------------------|
| 0-4    | Seed               |
| 4-10   | Wall Counts        |
| 10-18  | Monster Locations  |
| 18-26  | Treasure Locations |

The serialization code looks like this:

// The generic parameter T lets me serialize to a buffer or directly to a file
fn serialize<T>(&self, cursor: &mut T) -> Result<()>
where
    T: std::io::Write + byteorder::WriteBytesExt,
{
    // Concatenate top and left numbers, fold them 3 bits at a time into a u64
    let wall_counts: u64 = [self.top_nums, self.left_nums]
        .concat()
        .iter()
        .fold(0, |acc, &x| (acc << 3) + x.min(7) as u64);

    // Iterate over each tile, mapping each monster to 1, else 0. Fold into a u64
    let monster_locations = self
        .tiles
        .iter()
        .flatten()
        .map(|tile| if let Tile::Monster = tile { 1 } else { 0 })
        .fold(0, |acc, x| (acc << 1) + x as u64);

    // Repeat for treasure locations
    let treasure_locations = self
        .tiles
        .iter()
        .flatten()
        .map(|tile| if let Tile::Treasure = tile { 1 } else { 0 })
        .fold(0, |acc, x| (acc << 1) + x as u64);

    // Seed is 0 for None or (seed + 1) for Some
    if let Some(seed) = self.seed {
        cursor.write_u32::<LE>(seed + 1)?;
    } else {
        cursor.write_u32::<LE>(0)?;
    }
    cursor.write_all(&wall_counts.to_le_bytes()[0..6])?;
    cursor.write_u64::<LE>(monster_locations)?;
    cursor.write_u64::<LE>(treasure_locations)?;

    Ok(())
}

Deserialization is basically the same thing but in reverse.

fn deserialize(bytes: &[u8; 26]) -> Result<Puzzle> {
    let mut cursor = Cursor::new(bytes);
    // Read the seed first. Subtract 1 if it was non-zero.
    let seed = cursor.read_u32::<LE>()?;
    let seed = if seed == 0 { None } else { Some(seed - 1) };

    // Read the packed wall counts into an 8 byte buffer and convert to a u64
    let mut wall_counts: [u8; 8] = Default::default();
    cursor.read_exact(&mut wall_counts[0..6])?;
    let wall_counts = u64::from_le_bytes(wall_counts);

    // Decode the top and left counts with reverse shift and mask operations
    let top_nums  = [0, 1, 2, 3, 4, 5, 6, 7].map(|i| (wall_counts >> (45 - (i * 3))) as u8 & 0b111);
    let left_nums = [0, 1, 2, 3, 4, 5, 6, 7].map(|i| (wall_counts >> (21 - (i * 3))) as u8 & 0b111);

    // Read and unpack the monster/treasure locations
    let monster_locations = cursor.read_u64::<LE>()?;
    let treasure_locations = cursor.read_u64::<LE>()?;
    let mut tiles = [[Tile::Empty; 8]; 8];
    for y in 0..8 {
        for x in 0..8 {
            let i = y * 8 + x;
            let monster = monster_locations & (1 << (63 - i)) > 0;
            let treasure = treasure_locations & (1 << (63 - i)) > 0;
            if monster {
                tiles[y][x] = Tile::Monster;
            } else if treasure {
                tiles[y][x] = Tile::Treasure;
            }
        }
    }

    Ok(Puzzle {
        tiles,
        top_nums,
        left_nums,
        seed,
    })
}

I’ll be honest, this didn’t work immediately on my first attempt. The problem had to do with the bitshift offsets, which I fixed quickly, and then I wrote a loop to cycle through puzzles and write them to a file.

const DB_PATH: &str = "data";
const DB_FILE: &str = "puzzles.db";
const PUZZLES_PER_BATCH: usize = 100;
fn collect_puzzles() -> Result<()> {
    // Create the data directory if it doesn't exist
    // (the file itself is created by OpenOptions below)
    let db_path = Path::new(DB_PATH).join(DB_FILE);
    if !db_path.exists() {
        fs::create_dir_all(DB_PATH)?;
    }

    // Instantiate dungeon crawler and open db file in append mode
    let mut dc = dungeon_crawler::DungeonCrawler::new()?;
    let mut file = OpenOptions::new()
        .append(true) // append implies write access
        .create(true)
        .open(db_path)
        .expect("Database file should have been created");

    // Create buffer to serialize puzzles into
    let buffer = Vec::new();
    let mut cursor = Cursor::new(buffer);

    let t0 = Instant::now();

    for _ in 0..PUZZLES_PER_BATCH {
        dc.random_board();
        let puzzle = dc.parse()?;
        puzzle.serialize(&mut cursor)?;
    }

    // Append buffer to file
    file.write_all(&cursor.into_inner())?;

    let elapsed = t0.elapsed();
    println!(
        "{PUZZLES_PER_BATCH} parsed in {:.02}s. ({:.02}/s)",
        elapsed.as_secs_f32(),
        PUZZLES_PER_BATCH as f32 / elapsed.as_secs_f32()
    );

    Ok(())
}

I collected 100 puzzles at a time and appended them to the database file. When I ran it, it cycled through all 100 puzzles and parsed them in about 3.6 seconds, or 28 puzzles/second. I suspect the theoretical maximum would be 165/sec (my monitor’s refresh rate, with 1 puzzle per frame). I may try to push my speed here in the future but for now I am happy with ~30/sec.

I wrote some companion functions to read and parse all the puzzles in the database, and to report the total count and how many are unique.

/// Read and parse all the puzzles in the database.
fn read_puzzles() -> Result<Vec<Puzzle>> {
    let db_path = Path::new(DB_PATH).join(DB_FILE);
    let buffer = fs::read(db_path)?;

    let puzzles = buffer
        .array_chunks::<26>()
        .map(|chunk| puzzle::Puzzle::deserialize(chunk))
        .collect::<Result<Vec<puzzle::Puzzle>>>()?;

    Ok(puzzles)
}

/// Count the number of puzzles and see how many are unique.
fn print_db_info() -> Result<()> {
    let puzzles = read_puzzles()?;

    let seeds: HashSet<u32> = HashSet::from_iter(
        puzzles
            .iter()
            // 100,000,000 is outside the valid seed range (0-99,999,999),
            // so it can safely stand in for puzzles with no seed
            .map(|puzzle| puzzle.seed.unwrap_or(100_000_000u32)),
    );

    println!("Puzzle database");
    println!("  count: {}", puzzles.len());
    println!("  unique: {}", seeds.len());

    Ok(())
}
 Puzzle database
   count: 5000
   unique: 5000

After 5000 puzzles there were no duplicates, which I guess makes sense given a pool of 100 million seeds. I’ll use this database in the next post when I begin solving them. This concludes the bulk of the content I wanted to cover in this post, but if you haven’t had enough yet, read on for some bonus content on testing.
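As a sanity check (my own addition, assuming seeds are drawn uniformly from the ~10^8 possibilities), a birthday-problem estimate puts the chance of seeing any duplicate among 5000 draws at only about 12%, so a clean run is unsurprising:

```rust
// Probability that at least two of `n` uniform draws from `d` values collide:
// 1 - product over i of (1 - i/d), the standard birthday-problem formula.
fn collision_probability(n: u64, d: f64) -> f64 {
    1.0 - (0..n).map(|i| 1.0 - i as f64 / d).product::<f64>()
}
```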

(Bonus) Testing

I had now written parsing, serialization, and deserialization routines for puzzles, and I wanted to add a bit of automated testing to make sure that A) these functions are doing what I intended, and B) they don’t break in the future if I modify the serialization format or parsing routines. I started with a parsing test that took an image as input and compared the parsed Puzzle with a manually verified struct literal. I added one test case for each of the 18 different enemy types, some with and some without seeds. I learned about and made use of the test-case crate, which let me write a single test case and instantiate it multiple times for different input data.

// This function lets me look up the verified puzzles by monster name. I've omitted 17 other cases for brevity.
fn get_reference_puzzle(monster: &str) -> Puzzle {
    use Tile::*;
    match monster {
        "slime" => Puzzle {
            tiles: [
                [Empty, Empty, Monster, Empty, Empty, Empty, Monster, Empty],
                [Empty, Empty, Empty, Empty, Monster, Empty, Empty, Monster],
                [Empty, Empty, Empty, Empty, Empty, Empty, Empty, Empty],
                [Empty, Empty, Empty, Empty, Empty, Empty, Empty, Empty],
                [Empty, Empty, Empty, Empty, Empty, Empty, Empty, Empty],
                [Empty, Empty, Empty, Monster, Empty, Monster, Empty, Empty],
                [Monster, Empty, Empty, Empty, Empty, Empty, Empty, Monster],
                [Empty, Empty, Empty, Empty, Empty, Monster, Empty, Empty],
            ],
            top_nums: [4, 2, 4, 4, 3, 4, 3, 4],
            left_nums: [4, 4, 1, 5, 2, 4, 4, 4],
            seed: Some(6606032),
        },
        // < 17 other cases >
        _ => Puzzle::default(),
    }
}

#[test_case("bear")]
#[test_case("chest")]
// < 14 more monsters >
#[test_case("slime")]
#[test_case("squid")]
/// Tests that images can be successfully parsed by comparing to reference results.
fn parse(monster: &str) {
    // Open the image (note: expect() doesn't interpolate, so use panic! instead)
    let path = std::path::Path::new("monster_refs").join(format!("{monster}.png"));
    let img = open(&path)
        .unwrap_or_else(|_| panic!("{monster}.png not found"))
        .to_rgba8();

    // Parse it
    let p = Puzzle::from_image(img.view(0, 0, img.width(), img.height())).unwrap();

    // Print the incorrect result on failure for debugging
    assert!(p == get_reference_puzzle(monster), "{monster} => {p:?},")
}

This test case actually helped me to find a bug in my seed parsing, due to not properly handling the case for curated puzzles where there is no seed.

The second test I added was a round-trip serialization/deserialization operation. It opens the image and parses it, then serializes it to a buffer and attempts to deserialize the buffer back into a struct. Comparing the pre- and post-serialization structs lets me know if I am clobbering any data in the process.

#[test_case("bear")]
#[test_case("chest")]
// < 14 more monsters >
#[test_case("slime")]
#[test_case("squid")]
/// Test that a screenshot can be parsed, and the serialization/deserialization
/// round trip doesn't clobber any data
fn serialization(monster: &str) {
    // Open the image and do initial parsing.
    let path = Path::new("monster_refs").join(format!("{monster}.png"));
    let img = open(&path)
        .unwrap_or_else(|_| panic!("failed to open {}", path.display()))
        .to_rgba8();
    let original = Puzzle::from_image(img.view(0, 0, img.width(), img.height())).unwrap();

    // Create a buffer and serialize the puzzle into it.
    let write_buffer = Vec::with_capacity(1000);
    let mut cursor = Cursor::new(write_buffer);
    original.serialize(&mut cursor).unwrap();

    // Fetch the buffer back from the Cursor wrapper, and pass it to the deserializer
    let read_buffer = cursor.into_inner();
    let deserialized = Puzzle::deserialize(read_buffer.first_chunk::<26>().unwrap()).unwrap();

    assert!(deserialized == original, "Original:\n{original}\nDeserialized:\n{deserialized}");
}

Likewise, this test also caught a bug in how I handled the optional seed during deserialization, and thankfully it was a quick fix. This was my first real foray into Rust’s testing framework and I am pretty happy with being able to simply run cargo test and see my results summarized right away.

Conclusion

This post ended up being significantly longer than I was expecting, especially with the detour post. We covered converting pixels to structs, serialization, and automated testing. If you made it this far, thanks for tuning in and get ready for the next post where I’ll be attempting to solve some puzzles for the first time using backtracking!