Making games play themselves with Rust Part 3: Solving Puzzles with Backtracking
Table of contents
- Introduction
- Primer on Backtracking
- Napkin Math
- First Attempt
- Correctness and Performance Improvements
- Results
- Conclusion
- Additional Thoughts
Introduction
In the last post I covered the process of parsing the game state from a screenshot. While the end result is fairly simple and requires a small amount of code, it took a pretty large amount of effort to get there. Today it’s time to actually solve puzzles, and I’ll be doing so using an algorithm called backtracking in lieu of proper logical deduction. Today I’ll cover:
- Some background on backtracking and its feasibility
- A naive solver implementation
- Optimizing the solver
- Solving millions of puzzles
- A reflection on this whole project
Primer on Backtracking
(If you’re familiar with backtracking, feel free to skip this section)
Backtracking is a simple algorithm often used to find solutions to constraint problems. Let’s take everyone’s favourite logic puzzle, Sudoku, as an example. Sudoku has 3 constraints.
- Every row must contain one of each digit 1-9
- Every column must contain one of each digit 1-9
- Every 3x3 subregion must contain one of each digit 1-9.
A Sudoku puzzle is presented as an incomplete 9x9 grid which can be uniquely filled in by successive application of only the 3 rules above. For example, in the puzzle below I can place a 4 in the highlighted square because the other 4s in the puzzle exclude the use of 4 in all but one square in that 3x3 subregion.
I won’t go into detail here, but there are many techniques which can be employed to deduce unsolved clues in a Sudoku. However, as you try to solve harder and harder puzzles (such as the minimal 17-clue Sudoku), the techniques become complicated enough that sometimes it’s easier to just guess and hope for the best. For example, if you know a square can only be a 3 or a 6, you can just assume it is 3 and continue with the puzzle until you either solve the puzzle or reach a contradiction. Reaching a contradiction means you can ‘undo’ the moves to the point where you made the guess, and instead go with 6 because you’ve now proven 3 can’t work.
I’ve heard this strategy called “guess and check” or “trial and error”, but in computer science it is called backtracking. The number of empty squares indicates the depth of the solution space, where each attempted solution requires N=depth guesses to fill out the entire puzzle. This is a pretty frustrating and unfun way to solve puzzles by hand, but for computers it is often one of the easiest ways to implement a solver.
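To make that idea concrete, here is a minimal backtracking Sudoku solver. This is an illustrative sketch I'm adding for this primer, not code from the project: it finds the first empty square, tries each legal digit, recurses, and undoes the guess when it hits a contradiction.
fn is_legal(grid: &[[u8; 9]; 9], r: usize, c: usize, v: u8) -> bool {
    for i in 0..9 {
        // Row and column constraints
        if grid[r][i] == v || grid[i][c] == v {
            return false;
        }
        // 3x3 subregion constraint
        if grid[r / 3 * 3 + i / 3][c / 3 * 3 + i % 3] == v {
            return false;
        }
    }
    true
}

fn solve(grid: &mut [[u8; 9]; 9]) -> bool {
    for r in 0..9 {
        for c in 0..9 {
            if grid[r][c] != 0 {
                continue;
            }
            // Guess each candidate digit for the first empty square and recurse
            for v in 1..=9 {
                if is_legal(grid, r, c, v) {
                    grid[r][c] = v;
                    if solve(grid) {
                        return true;
                    }
                    grid[r][c] = 0; // contradiction downstream: undo the guess
                }
            }
            // No digit fits here, so an earlier guess must be wrong: backtrack
            return false;
        }
    }
    true // no empty squares left, the puzzle is solved
}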
Turning our attention back to Dungeons and Diagrams, puzzles are instead presented on an 8x8 grid with somewhere between 3 and 13 clues given (values gathered empirically). The player has to deduce the state of the remaining 50-60 empty tiles (which must be either Wall or Path). Before I get into the implementation, here’s a quick refresher on the constraints a solution must obey (I’ve numbered them as they’ll be referred to later):
1. Each row or column must contain a certain number of walls, indicated by the adjacent number
2. All empty space must be part of a hallway or treasure room
3. Treasure rooms are always 3x3 with a single entrance
4. Hallways are always one space wide. Outside of a treasure room there will never be a 2x2 block of empty spaces.
5. Every monster is in a dead end. Every dead end must contain a monster
6. All empty spaces connect into a single contiguous shape. Diagonal spaces are not considered connected
Napkin Math
Backtracking is simple, but it’s not a silver bullet. Many games such as Chess or Go are famously not solvable (yet?) due to the astronomical number of possible games, which I have seen estimated to be higher than the number of atoms in the universe. Because the puzzle grid is smaller than Sudoku’s and because I’ve written a backtracking Sudoku solver before (this is a common first-year programming assignment), I assumed DnD would also be feasible, but I came up with some heuristics to reinforce that assumption.
In this puzzle, the rows have varying numbers of walls required and monsters present. Focusing on the bottom row, we can see it has 5 empty tiles, and requires 3 of those to be walls. We can use combinatorics to compute the number of ways to place the walls with the “n choose k” function. For the bottom row, 5 choose 3 (I’ll use 5c3 notation going forward) evaluates to 10.
If we repeat this process for all 8 rows, we get 7c1 = 7, 7c6 = 7, 7c6 = 7, 7c2 = 21, 8c7 = 8, 7c3 = 35, 7c3 = 35, and lastly 5c3 = 10. Multiplying all of these numbers together results in a solution space with ~705 million possibilities. This sounds high but is within the realm of possibility for modern computers. Repeating this calculation for a bunch more puzzles, the upper bound hovered around ~4.2 billion. A little scarier, but let’s press on!
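As a quick sanity check of that arithmetic (this snippet is mine, not from the original post), multiplying the per-row binomials reproduces the ~705 million figure:
fn n_choose_k(n: u64, k: u64) -> u64 {
    if k > n {
        return 0;
    }
    // Multiplicative formula, which stays exact at every intermediate step
    (1..=k).fold(1, |acc, i| acc * (n - k + i) / i)
}

fn main() {
    // (empty tiles, walls required) for each row of the example puzzle
    let rows = [(7, 1), (7, 6), (7, 6), (7, 2), (8, 7), (7, 3), (7, 3), (5, 3)];
    let total: u64 = rows.iter().map(|&(n, k)| n_choose_k(n, k)).product();
    println!("raw upper bound: {total}"); // 705894000
}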
The actual solution space should end up being quite a bit smaller, as we don’t have to evaluate each chain of guesses all the way through the whole puzzle. Like the sudoku puzzle above, we will inevitably run into contradictions if we make a wrong guess early on. When this happens, we don’t have to continue the search from that point and can cut off entire branches of the solution space. The earlier a contradiction is found, the larger the branch we get to prune.
As a bit of premature optimization, I also realized that as I made guesses for each row I could keep track of the current wall count in each column, and mask off entire columns as they are filled. This will reduce the number of combinations for subsequent rows being checked.
First Attempt
My implementation needed to iterate over each empty tile, place a wall if legal, and then recurse to the next tile. This would result in a stack depth of 50-60 depending on the puzzle - an amount which would incur considerable function call overhead throughout the several hundred million calls needed to traverse the solution space. I figured I could instead traverse the puzzle a whole row at a time, considering groups of 8 tiles at once. This meant the stack would have a maximum depth of 8, and I could replace the lost functionality with a tight for loop that would go through each permutation of wall placements for the entire row.
I was also inspired by modern chess engines which use bitboards to represent the game state in a compact manner that is both fast to copy (useful for recursion) and fast to process (I could use bitwise operators to operate on all 64 tiles in parallel rather than looping over arrays of bytes). I created a new puzzle representation that could be passed into the solver to take advantage of these ideas. With an 8x8 grid and 4 states to track, I chose to use 4 u64s to make bitmaps of each state. I will also keep the top_nums and left_nums fields from before, as they are already pretty compact.
pub struct Backtracker {
empty_mask: u64,
wall_mask: u64,
treasure_mask: u64,
monster_mask: u64,
left_nums: [u8; 8],
top_nums: [u8; 8],
}
impl Backtracker {
// Convert our existing 'Puzzle' struct to the new bitboard format
pub fn new(puzzle: &Puzzle) -> Self {
let mut empty_mask = 0u64;
let mut wall_mask = 0u64;
let mut treasure_mask = 0u64;
let mut monster_mask = 0u64;
// Iterate over each tile and set the bits in the corresponding masks
puzzle
.tiles
.as_flattened()
.iter()
.enumerate()
.for_each(|(i, tile)| {
let bit = 1 << (63 - i);
match tile {
Tile::Empty => empty_mask |= bit,
Tile::Wall => wall_mask |= bit,
Tile::Treasure => treasure_mask |= bit,
Tile::Monster => monster_mask |= bit,
}
});
Self {
empty_mask,
wall_mask,
treasure_mask,
monster_mask,
left_nums: puzzle.left_nums,
top_nums: puzzle.top_nums,
}
}
}
Let’s check out an example of parsing a puzzle into this format.
For the puzzle above, the data will look like this.
empty_mask: 1101111011111111111111101111111111110111111011111111111111111101
monster_mask: 0010000100000000000000010000000000001000000000000000000000000010
treasure_mask: 0000000000000000000000000000000000000000000100000000000000000000
wall_mask: 0000000000000000000000000000000000000000000000000000000000000000
If I wrap the bits into 8x8 grids, you can see the bits map exactly to the tile locations.
empty monster treasure wall
11011110 00100001 00000000 00000000
11111111 00000000 00000000 00000000
11111110 00000001 00000000 00000000
11111111 00000000 00000000 00000000
11110111 00001000 00000000 00000000
11101111 00000000 00010000 00000000
11111111 00000000 00000000 00000000
11111101 00000010 00000000 00000000
The task now is to figure out: for what value of wall_mask are all the puzzle constraints satisfied? This puzzle has 58 empty tiles, so there are 2^58 or ~288 quadrillion ways to fill out the walls if I started with a wall_mask of 0 and counted up. For fun I calculated that it would take my computer ~5 years to simply count that high without doing any other work. I already knew from my combinatorial calculations earlier that I could do much better, so I focused my attention on the first row of the puzzle to figure out my approach for the recursive solution.
We can see there are two monsters, leaving 6 empty cells to place 3 walls. This gives 6c3 = 20 possible wall layouts. However this ignores the 0 constraint in the 6th column indicating that there cannot be any walls there. This reduces the number of valid tiles to place walls to 5, and the number of wall permutations to 5c3 = 10. Considering the tree example from earlier, we just cut down half the tree! This demonstrates the importance of looking for early pruning opportunities in our search.
We can pick any of the 10 wall permutations (although it makes sense to traverse them in an order which is fast to compute) and assume it to be the correct solution for the first row of the puzzle. For each wall filled in, I subtract 1 from the vertical constraints as the number of walls remaining for that column will have now decreased. From there we move on to the next row and rinse and repeat. If we make it to the last row and haven’t violated any constraints, we will have discovered a solution to the puzzle! I say a solution instead of the solution because I can’t guarantee the puzzle generator always produces layouts with a unique solution. I therefore designed my algorithm to take this into account and push any solution it finds into a list which I can review after the search is complete. A solution only needs to encode the wall placement, so it turns out that storing the wall_mask, which is a u64, is adequate. The algorithm can then be described in just a couple steps:
- Iterate over all Wall permutations for the current row
- Place walls according to the current permutation
- If we are at max depth (last row), verify that the wall placements satisfy the puzzle constraints
- If the solution is valid, push it into the solution vector
- Otherwise descend to the next row, recursively
Iterating through the Wall permutations for a row can be described as finding all the 8-bit numbers with 3 1-bits, and then excluding ones where the 1s overlap with the positions of the monsters. Let me illustrate, again with the first row of the puzzle:
monster positions: 00100001
satisfied columns: 00000100 // the 6th column requires 0 walls, so it is already complete
From here, we can OR these numbers together and invert the result to get a mask that shows our wall candidates:
NOT (00100001 OR 00000100) = 11011010
(the 5 1’s correspond to potential wall placements)
Finally we can iterate a number N from 0 to 255, ANDing each number with this mask and keeping only those that result in exactly 3 bits being set. Rust has a built-in function called count_ones() that makes this pretty easy. Here’s what the first handful of matches look like:
N (decimal) | N (binary) | M (N AND Mask) |
---|---|---|
26 | 00011010 | 00011010 |
27 | 00011011 | 00011010 |
30 | 00011110 | 00011010 |
31 | 00011111 | 00011010 |
58 | 00111010 | 00011010 |
59 | 00111011 | 00011010 |
62 | 00111110 | 00011010 |
63 | 00111111 | 00011010 |
74 | 01001010 | 01001010 |
75 | 01001011 | 01001010 |
78 | 01001110 | 01001010 |
79 | 01001111 | 01001010 |
82 | 01010010 | 01010010 |
83 | 01010011 | 01010010 |
86 | 01010110 | 01010010 |
87 | 01010111 | 01010010 |
88 | 01011000 | 01011000 |
89 | 01011001 | 01011000 |
92 | 01011100 | 01011000 |
93 | 01011101 | 01011000 |
There’s a small problem here, and that is that our masking function results in multiple Ns that map to the same wall placement configuration (for example 26, 27, 30, 31, 58, 59, 62, and 63 all map to 00011010). This will result in performing the same guess multiple times and duplicating work. As a simple workaround I kept track of the last value of M, and if the current value was the same, skipped the guess and incremented N. This was a slight improvement, but eventually I just created a lookup table called BIT_PERMUTATIONS that could be indexed by the current empty cell mask and the required wall count. It returned a list with the valid masks for the current row. It looked something like this:
let candidate_mask = 0b11011010; // mask indicating potential wall positions
let walls_required = 3; // number of walls required
let guesses = BIT_PERMUTATIONS[candidate_mask][walls_required];
// 'guesses' is now the list of numbers with 3 of the 5 bits set from candidate mask
// -> [26, 74, 82, 88, 138, 146, 152, 194, 200, 208]
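The post doesn’t show how the table itself gets built, so here is a sketch of one way it could be precomputed (using nested Vecs for clarity; a static array generated once at startup would work just as well):
fn build_bit_permutations() -> Vec<Vec<Vec<u8>>> {
    // table[candidate_mask][wall_count] = every 8-bit value that places exactly
    // `wall_count` walls, all of them inside `candidate_mask`
    let mut table = vec![vec![Vec::new(); 9]; 256];
    for mask in 0usize..256 {
        for n in 0usize..256 {
            // Keep n only if all of its set bits fall inside the candidate mask
            if n & !mask == 0 {
                table[mask][n.count_ones() as usize].push(n as u8);
            }
        }
    }
    table
}
For the example row, table[0b11011010][3] comes out to [26, 74, 82, 88, 138, 146, 152, 194, 200, 208], matching the list above.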
Implementing the rest of the recursive solver goes basically as you might expect. I’ll leave some annotated code here that hopefully explains itself well enough. Feel free to skip over it and just look at the results :)
We start with depth=0, representing the top row of the puzzle. Guesses are subsequently made for each row until a solution is found and pushed to the solutions vector. There are a few conditions that are tested to handle invalid states, like a row needing more walls than it has space for.
fn solve_recursive(
&self,
// The numbers along the top of the puzzle. These will be modified as walls are placed.
column_counts: [u8; 8],
// 64 bit mask where every 1 represents an Empty tile
empty_mask: u64,
// 64 bit mask where every 1 represents a Wall tile
wall_mask: u64,
// Recursion depth (represents which row we are currently guessing)
depth: usize,
// Output vector to hold any/all solutions we (hopefully) find
solutions: &mut Vec<u64>,
) {
// The number of walls required by the row at the current depth
let row_wall_count = self.left_nums[depth];
// An 8 bit bitmask of empty cells for the current row. The call to 'to_be_bytes()'
// casts the u64 as a [u8; 8], letting me pluck out the byte I need
let row_empty_cells = empty_mask.to_be_bytes()[depth];
// 8 bit bitmask marking columns that are not full yet (remaining count > 0)
let unsatisfied_columns = column_counts
.into_iter()
.map(|i| if i > 0 { 1u8 } else { 0 })
.fold(0, |acc, x| (acc << 1) + x);
// Final mask for the row is 'empty cells that have not yet met their vertical constraint'
let valid_positions_mask = row_empty_cells & unsatisfied_columns;
if valid_positions_mask.count_ones() < row_wall_count as u32 {
// The row requires more walls than there are valid positions, return early
return;
} else if depth == 7 && valid_positions_mask.count_ones() != row_wall_count as u32 {
// We're on the last row and we don't have exactly as many valid positions as walls
// to place, return early (final row must result in ALL constraints satisfied)
return;
}
// Iterate through all possible 8 bit masks, starting with 0b00000001 until it wraps back to 0
let mut candidate_mask = 1u8;
while candidate_mask != 0 {
// Skip this mask if it would place a wall on an invalid tile
if !valid_positions_mask & candidate_mask != 0 {
candidate_mask = candidate_mask.wrapping_add(1);
continue;
}
// Only consider masks that will place the number of walls required
if candidate_mask.count_ones() == row_wall_count as u32 {
// Place the walls by OR'ing the candidate mask into the wall mask
let wall_mask = wall_mask | (candidate_mask as u64) << (56 - 8 * depth);
let empty_mask = empty_mask & !wall_mask;
// Update the vertical counts (subtract 1 from each column where a wall was placed)
let mut counts = column_counts.clone();
for i in 0..8 {
if (candidate_mask & (1 << (7 - i))) >= 1 {
counts[i] -= 1;
}
}
// If we've reached the last row and haven't returned early, the current wall mask
// represents a candidate solution. Check its validity and push it to the solution vector
if depth == 7 {
if is_valid_solution(empty_mask, self.monster_mask, self.treasure_mask) {
let sol = wall_mask;
solutions.push(sol);
}
} else {
self.solve_recursive(
counts,
empty_mask,
wall_mask,
depth + 1,
solutions,
);
}
}
candidate_mask = candidate_mask.wrapping_add(1);
}
}
I snuck in a call to is_valid_solution() there, which I haven’t yet implemented. For now I’ll just return true to see if the recursion/mask iteration is functional. I will also need a way to simulate entering the solution back into the game, so I added these functions to DungeonCrawler:
// Maps tile indices (0-7) to pixel coordinates for clicking
pub fn place_wall(&mut self, x: u8, y: u8) -> Result<()> {
self.click(
x as u32 * TILE_STRIDE + TILE_BASE.0,
y as u32 * TILE_STRIDE + TILE_BASE.1,
)
}
// Loop over each bit in the solution, click the tiles where walls are needed
pub fn enter_solution(&mut self, solution: u64) -> Result<()> {
println!("{solution}");
for i in 0..64 {
let y = i / 8;
let x = i % 8;
if solution & (1 << (63 - i)) != 0 {
self.place_wall(x, y)?;
}
}
// Sleep a brief moment in case we are solving multiple puzzles quickly, so we don't miss inputs
thread::sleep(Duration::from_millis(10));
Ok(())
}
And lastly I could put together the main solve loop, which will continuously parse puzzles, solve them, and enter the solution back into the game. I added the solve time to the output as well as the puzzle seed and number of solutions found so I could revisit puzzles if I needed to debug anything. I added an escape if multiple solutions were found to just generate a new puzzle.
fn solve_loop() -> Result<()> {
let mut dc = dungeon_crawler::DungeonCrawler::new()?;
loop {
let t0 = Instant::now(); // For measuring solve time
let puzzle = dc.parse()?;
let bt = solve::Backtracker::new(&puzzle);
let solutions = bt.solve();
let time = format!("{:.2}s", t0.elapsed().as_secs_f32());
print!("seed: {:0>8} - ", puzzle.seed.unwrap());
match solutions.len() {
0 => println!(" no solution found in {time}"),
1 => {
println!("unique solution found in {time}");
dc.enter_solution(*solutions.last().unwrap()).unwrap();
// Give the game time to process the solution being entered
thread::sleep(Duration::from_millis(25));
}
_ => {
println!(" {: >3} solutions found in {time}", solutions.len());
}
}
// Go to the next puzzle
dc.random_board()
}
Ok(())
}
Surprisingly, this kind of worked on the first attempt. Letting it run for a while, the output looked like this:
seed: 33095486 - 4720 solutions found in 0.09s
seed: 94075043 - 48 solutions found in 0.10s
seed: 87234007 - 11 solutions found in 0.01s
seed: 45151188 - 7936 solutions found in 0.46s
seed: 49329791 - 672 solutions found in 1.11s
seed: 49074641 - 2 solutions found in 0.00s
seed: 91524114 - 8 solutions found in 0.23s
seed: 16324322 - 10816 solutions found in 0.90s
seed: 69331359 - 21040 solutions found in 2.77s
seed: 18835572 - 32 solutions found in 0.75s
seed: 22819008 - 912 solutions found in 0.26s
seed: 93365886 - 4 solutions found in 0.08s
seed: 60424626 - 27600 solutions found in 3.45s
seed: 74146725 - 184 solutions found in 0.04s
seed: 10221255 - 64 solutions found in 1.02s
seed: 17959478 - 1216 solutions found in 0.39s
seed: 63973204 - 1376 solutions found in 0.04s
seed: 46593034 - 608 solutions found in 0.67s
seed: 17089073 - 128 solutions found in 0.01s
seed: 14192672 - 1488 solutions found in 0.74s
This looked promising performance-wise, with most solves being under 1 second, but I was unable to find any unique solutions. Running it on 10000 puzzles from my puzzle database gave some more insight into how much work is left to do:
- The vast majority of puzzles are solved in under 1 second, but some take over 30 seconds and 2 of them were over 5 minutes
- No unique solutions were found at all
To be honest this is not really surprising considering I’ve only set out to satisfy constraint #1 so far (horizontal and vertical wall counts) while ignoring the other 5. This led to solutions like these:
If you count the number of walls in each row and column they will be correct, but all 5 other constraints are violated. I’ll reiterate them quickly here - We have treasure rooms that are not 3x3, treasure rooms with more than one exit, hallways that are 2x2, monsters that are not in a dead end, and lastly a path that is not a single contiguous block. Let’s see if we can do better.
Correctness and Performance Improvements
One thing that stood out in the “solutions” above was just how bad it was at placing walls around the treasure chests. In retrospect this should have been an obvious failure case for my approach because rooms have a well-defined structure over a 5x5 area (3x3 room surrounded by mostly walls) and I was haphazardly rolling dice to guess where the walls go. This has a worst-case probability of (1/2)^25, or about 1 in 33 million. I knew I could do better if I was clever about it. Since the rooms dominate a large area of the board, I thought perhaps I could precompute the valid treasure room positions ahead of time, and use those layouts as starting points for the solver.
Given the 4 room layouts above I can invoke the recursive solver 4 times, each time using the blue regions as a mask to exclude wall placement. If the rooms are not in the true location, it will likely cause the solver to terminate much faster.
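The post doesn’t show how those candidate layouts are found, but a sketch could be as simple as sliding a 3x3 window over every position that contains the treasure tile and stays on the board (this is my own illustration; filtering against already-known walls or monsters would prune the list further):
fn candidate_rooms(treasure_x: u8, treasure_y: u8) -> Vec<(u8, u8)> {
    // (rx, ry) is the top-left corner of a candidate 3x3 room containing the treasure
    let mut rooms = Vec::new();
    for ry in treasure_y.saturating_sub(2)..=treasure_y {
        for rx in treasure_x.saturating_sub(2)..=treasure_x {
            // The corner can be at most column/row 5 for the room to fit on the board
            if rx <= 5 && ry <= 5 {
                rooms.push((rx, ry));
            }
        }
    }
    rooms
}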
Revisiting the 20 puzzles above, our results now look like this:
seed: 33095486 - 12 solutions found in 0.00s
seed: 94075043 - 4 solutions found in 0.00s
seed: 87234007 - 4 solutions found in 0.00s
seed: 45151188 - 18 solutions found in 0.00s
seed: 49329791 - 9 solutions found in 0.01s
seed: 49074641 - unique solution found in 0.00s
seed: 91524114 - unique solution found in 0.01s
seed: 16324322 - 44 solutions found in 0.01s
seed: 69331359 - 693 solutions found in 0.06s
seed: 18835572 - 3 solutions found in 0.01s
seed: 22819008 - 4 solutions found in 0.00s
seed: 93365886 - unique solution found in 0.00s
seed: 60424626 - 823 solutions found in 0.04s
seed: 74146725 - 8 solutions found in 0.00s
seed: 10221255 - unique solution found in 0.02s
seed: 17959478 - 3 solutions found in 0.01s
seed: 63973204 - 31 solutions found in 0.00s
seed: 46593034 - 10 solutions found in 0.01s
seed: 17089073 - 3 solutions found in 0.00s
seed: 14192672 - 38 solutions found in 0.01s
Huge success! I was now finding a handful of unique solutions, and the solve speed was maybe 2 orders of magnitude faster. Running it on 10000 puzzles completed in just under 4 seconds. In order to cut down on the cases with multiple solutions, I had to start intelligently checking whether my solutions actually met all the constraints. I planned to do this with a series of predicates, one per constraint, relying mostly on bitwise math to remain performant. The checker would look something like this:
// Chaining these calls together with && allows short-circuit evaluation, meaning the
// slowest predicates will only be tested if necessary
fn is_valid_solution(empty_mask: u64, features: &PuzzleFeatures) -> bool {
no_wide_hallways(empty_mask, features)
&& check_treasure_rooms(empty_mask, features)
&& dead_ends_have_monsters(empty_mask, features)
&& contiguous_path(empty_mask, features)
}
I added a parameter of type PuzzleFeatures, which is just a collection of unchanging features of the puzzle that are useful in constraint evaluation. empty_mask is a bitmask identifying empty cells, which constantly changes as guesses are made. The PuzzleFeatures struct has 5 members:
struct PuzzleFeatures {
/// Constraints on the left side of the puzzle
left_nums: [u8; 8],
/// Bitmask of monster locations
monster_mask: u64,
/// Bitmask of treasure locations
treasure_mask: u64,
/// Bitmask of 3x3 treasure rooms
rooms_mask: u64,
/// List of (x, y) coordinates for treasure chests (blue regions from the gif above)
room_positions: Vec<(u8, u8)>,
}
Let’s see what we can do with these values. I’ll walk you through the implementation for each of the predicates.
—– 1. No Wide Hallways —–
Detecting wide hallways (a 2x2 or larger section of Empty tiles) is pretty easy with bitwise math. We just look at each empty tile and check if the tiles to the east, south, and southeast are also empty. Checking the neighbors can be done quickly by bitshifting the empty_mask. A shift by 1 bit will access neighbors left/right, whereas a shift by 8 bits will access neighbors above/below. There is one small edge case (literally) to handle, and that is that bits on the left and right edges will assume they are adjacent to one another, as I will try to demonstrate visually (this was hard to visualize):
To compensate for this faux-adjacency, we can further mask the leftmost and rightmost columns when doing left and right shifts. The least significant bit corresponds to the bottom right tile, increasing first to the left and then vertically. The most significant bit represents the top left tile. In red I’ve highlighted the bits that either wrap around the board, or get shifted in as 0s. Knowing this, I created a few more masks which I can use to ignore those bits.
// masks to hide bits that wrapped around the edge of the board
const LEFTSHIFT_MASK: u64 = 0xfefefefefefefefe;
const RIGHTSHIFT_MASK: u64 = 0x7f7f7f7f7f7f7f7f;
fn no_wide_hallways(empty_mask: u64, features: &PuzzleFeatures) -> bool {
// Create a bitmask indicating cells that are part of a 2x2 Empty region
// Check cell to the East, ignoring bits that wrap around the board
let mut wide_hall_mask = empty_mask & ((empty_mask << 1) & LEFTSHIFT_MASK);
// Check cell to the South and Southeast
wide_hall_mask &= wide_hall_mask << 8;
// Check cell to the West
wide_hall_mask |= (wide_hall_mask >> 1) & RIGHTSHIFT_MASK;
// Check cell to the North and Northwest
wide_hall_mask |= wide_hall_mask >> 8;
// At this point, all the 1s in the mask are part of 2x2 or larger Empty regions
// Need to ignore bits that are part of a treasure room (2x2 Empty is legal there)
wide_hall_mask & !features.rooms_mask == 0
}
This solution is wicked fast. The bitwise math allows us to operate on all 64 tiles at once with no need for looping or array indexing. I later learned this is referred to as SIMD Within a Register (SWAR).
—– 2. Monsters in Dead Ends —–
The condition we need to evaluate is “Monsters have exactly 1 Empty tile adjacent to them”. I put together a boolean function to do this, similar to the wide hallway detection, but it is slightly more involved. My implementation is fast, but I suspect a more efficient method exists.
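That function isn’t shown in the post, so here is a sketch of one way it could look, reusing the bit layout and shift masks from above (the per-cell loop is my own simplification; the actual implementation is presumably more bit-twiddly):
fn dead_ends_have_monsters(empty_mask: u64, features: &PuzzleFeatures) -> bool {
    // Walkable cells include monsters and treasure, since they sit on the path
    let open = empty_mask | features.monster_mask | features.treasure_mask;
    // Per-direction masks: bit i is set if cell i's neighbor in that direction is walkable
    let east = (open << 1) & LEFTSHIFT_MASK;
    let west = (open >> 1) & RIGHTSHIFT_MASK;
    let north = open >> 8;
    let south = open << 8;
    for i in 0..64 {
        let bit = 1u64 << i;
        if open & bit == 0 {
            continue;
        }
        let neighbors =
            ((east >> i) & 1) + ((west >> i) & 1) + ((north >> i) & 1) + ((south >> i) & 1);
        if features.monster_mask & bit != 0 {
            // Every monster must sit in a dead end (exactly one walkable neighbor)
            if neighbors != 1 {
                return false;
            }
        } else if features.rooms_mask & bit == 0 && neighbors == 1 {
            // A dead end outside a treasure room with no monster in it is illegal
            return false;
        }
    }
    true
}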
—– 3. Treasure Rooms have 1 Entrance —–
I thought about this one for a while - and initially planned to loop over the 16 tiles surrounding each 3x3 treasure room and count the Empty tiles. This worked but I was itching for another cute SWAR solution using bitmath. I figured I could create a mask of the treasure room boundary and multiply it with the Empty tile mask. I can assert that the result has a single 1 bit to prove that the room has exactly one entrance.
In the image above, I have chosen a potential spot for the treasure room in blue. The 12 red boundary tiles are what I want to create a mask of. Reading the bits out in a grid looks like this:
00001110
00010001
00010001
00010001
00001110
00000000
00000000
00000000
And converting that to a u64: 0b0000111000010001000100010001000100001110000000000000000000000000 (or 0x0E1111110E000000 in hexadecimal).
I built up a table of these masks, one for each of the 36 potential treasure room locations (3x3 room can only fit in an 8x8 grid 36 ways), and I can look it up based on the room location.
fn check_treasure_rooms(empty_mask: u64, features: &PuzzleFeatures) -> bool {
// Loop over each treasure room
for &(x, y) in features.room_positions.iter() {
// Multiply the boundary mask with the empty mask and count the 1s
if (empty_mask & ROOM_BORDER_MASKS[y as usize][x as usize]).count_ones() != 1 {
return false;
}
}
true
}
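For completeness, here is a sketch of how those border masks could be precomputed (the post doesn’t show this step; the bit layout matches the rest of the solver, and the 12-tile ring skips the diagonal corners as in the grid above):
fn room_border_mask(x: isize, y: isize) -> u64 {
    // (x, y) is the top-left tile of the 3x3 room, with 0 <= x, y <= 5
    let mut mask = 0u64;
    for by in y - 1..=y + 3 {
        for bx in x - 1..=x + 3 {
            // Skip tiles that fall off the board
            if bx < 0 || bx > 7 || by < 0 || by > 7 {
                continue;
            }
            // Skip the 3x3 interior of the room itself
            let interior = bx >= x && bx <= x + 2 && by >= y && by <= y + 2;
            // Skip the four diagonal corners of the surrounding 5x5 ring
            let corner = (bx == x - 1 || bx == x + 3) && (by == y - 1 || by == y + 3);
            if interior || corner {
                continue;
            }
            // Bit 63 is the top-left tile, matching the rest of the post
            mask |= 1u64 << (63 - (by * 8 + bx)) as u32;
        }
    }
    mask
}
Evaluating this for all 36 (x, y) pairs would populate the ROOM_BORDER_MASKS[y][x] table used above; for the example room it produces 0x0E1111110E000000.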
—– 4. Contiguous Path —–
I saved the worst for last. Checking if the entire path is contiguous boils down to a floodfill algorithm, which is hard to do without iterating. I can strategically do this check last so it only happens if all the other conditions are satisfied, meaning it should be called as infrequently as possible. I implemented it by performing iterative dilation starting from a single Empty tile and repeating until no change is detected. If the number of 1s in the flooded region is equal to the number of Empty + Treasure + Monster tiles (treasure and monsters can be considered part of the path), then the path is contiguous.
fn contiguous_path(empty_mask: u64, features: &PuzzleFeatures) -> bool {
// Create the initial flood mask by only setting the first 1 bit from the Empty mask
let tz = empty_mask.trailing_zeros();
let mut flood_mask = 1u64 << tz;
// Bitmask of region to flood includes treasure/monsters
let e_mask = empty_mask | features.treasure_mask | features.monster_mask;
loop {
// Create copy of flood mask so we can detect if anything changed
let mut flood_mask2 = flood_mask;
// Check if N,E,S,W neighbors are empty and dilate the mask appropriately
flood_mask2 |= ((flood_mask << 1) & LEFTSHIFT_MASK) & e_mask;
flood_mask2 |= ((flood_mask2 >> 1) & RIGHTSHIFT_MASK) & e_mask;
flood_mask2 |= (flood_mask2 << 8) & e_mask;
flood_mask2 |= (flood_mask2 >> 8) & e_mask;
// If no change occurred, we're done!
if flood_mask2 == flood_mask {
break;
}
flood_mask = flood_mask2;
}
// Check that the flooded region covers every Empty/Treasure/Monster tile
if flood_mask.count_ones() != e_mask.count_ones() {
return false;
}
true
}
This method expands once in each direction every step of the loop. It could probably be made faster (empirically, not algorithmically) by starting the search near the center of the board rather than just “the first Empty cell we can find”, roughly halving the number of steps. I found this optimization was not needed.
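If it ever did become worthwhile, the tweak could be as simple as something like this (a hypothetical sketch, not code from the project): probe the middle rows first and seed the flood from the first Empty cell found there.
fn center_biased_seed(empty_mask: u64) -> u64 {
    // Rows ordered centre-out; row 0 is the top of the board
    for row in [3u32, 4, 2, 5, 1, 6, 0, 7] {
        let row_bits = empty_mask & (0xffu64 << (56 - 8 * row));
        if row_bits != 0 {
            // Lowest set bit in that row becomes the starting flood mask
            return 1u64 << row_bits.trailing_zeros();
        }
    }
    0 // no Empty cells at all
}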
In the animation above, you can see the search starts in the first Empty cell on the bottom row. Due to the use of trailing_zeros to find this cell, it will by nature be in the bottom-most row and as far to the right as possible.
This animation was fun to make and I hope it’s fun to watch. Now after entirely too much ado, I could finally taste the fruits of my labour.
I generated a gif of the solver showing each of the guesses it makes as it progresses through the puzzle. It was immensely satisfying to see this for the first time, and it also caught my eye with another optimization opportunity. You can see that the first 2-3 seconds of this gif show the top-right skeleton fully surrounded by walls, which is illegal. The fix was simple: apply the “monsters in dead ends” predicate after each guess instead of only upon reaching the bottom row. This (along with the “no wide hallways” predicate) massively pruned the search tree and resulted in an overall speedup of over an order of magnitude.
Results
I tested the solver on my database of nearly 6 million puzzles that I parsed from the game. I was able to solve the entire corpus in under 4 seconds (parallelized across 16 threads) which implies an average single-threaded solve time of about 10 microseconds per puzzle. This was MUCH faster than I had hoped to achieve when I started this project and I was very excited to see that. Lastly, please enjoy a recording of the solver chugging through puzzles at lightspeed.
I like seeing the mouse cursor jump around from tile to tile, hammering home the somewhat melancholy reality that I have just spent dozens of hours writing a computer program to automate the process of having fun and documenting it for the handful of people who may one day stumble upon this.
Conclusion
I’m really pleased with the outcome of this solver and had a blast writing it. I would love to come back and take a stab at generating my own puzzles in the future, but for now I would like even more to hang this project up as finished. I finished the solver well over a year ago in what was probably a single-digit number of hours. I have spent easily 5 times that on writing this series of blog posts, and it’s been weighing on me to wrap it up and publish it. I vastly underestimated the amount of time I would spend writing scripts to generate animations and just editing and rewriting large parts of the posts. Nonetheless I really enjoyed putting it all together and I’m eager to get working on the next project!
Additional Thoughts
While putting this post together, there were several things I wanted to write about but they generally didn’t add value to the plot. I decided to briefly cover them here for extra curious readers.
I have thought a lot about how one might generate these puzzles. I did briefly take a stab at it with some decent results, but the puzzles a) didn’t support treasure rooms, and b) were of generally extreme difficulty, often requiring backtracking (which is a hallmark of poor puzzle design). If I decide to come back to this project, it would certainly be to tackle this problem and try to generate arbitrary-sized puzzles of prescribed difficulty.
My solver doesn’t use any logical deduction (for example “this row needs 4 walls and has 4 spaces, therefore they all must be walls”, or, “the space between a monster and a treasure room must be a wall”). I originally wanted to incorporate these into the solver, but with the speed I achieved they simply weren’t necessary. If I were to attempt solving larger puzzles (like 16x16), they perhaps would be mandatory to have acceptable solve times. There exists an online puzzle generator which can produce larger puzzles, but occasionally it places invalid treasure rooms.
My solver is very finely tuned for 8x8 puzzles. It would not be impossible to adapt it to larger puzzles, but it would certainly require a couple hours of work. Of course, generating puzzles beyond 8x8 would require a solver that can prove they have a unique solution, so that work would need to be done if I take another stab at puzzle generation. I had the idea to use integer arrays for the bitboards for larger sizes, so instead of a u64 with 8 bits per row in an 8x8 puzzle, I could use a [u16; 16] array to store 16 16-bit rows (or larger sizes with [u32; 32] or even [u128; 128] for the masochists). Doing bitwise operations between two bitmasks becomes a bit more involved but not difficult.
The speed I achieved is promising for the notion that much larger puzzles could be generated/solved. However, the solution space grows factorially due to the combinatorial nature of enumerating potential guesses. Incorporating some of the more trivial logical deduction techniques could improve this I imagine, but by how much remains to be seen.
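As a rough illustration of what those array-based operations might look like (this is my own sketch of an assumed layout with one u16 per row, row 0 at the top, not code from the project):
type Board16 = [u16; 16];

// Cell-wise AND of two boards
fn and(a: &Board16, b: &Board16) -> Board16 {
    std::array::from_fn(|row| a[row] & b[row])
}

// Shift every row one column sideways; bits simply fall off the edge of their
// own row, so the wrap-around masks from the 8x8 version are no longer needed
fn shift_columns(a: &Board16) -> Board16 {
    std::array::from_fn(|row| a[row] << 1)
}

// Shift the whole board by one row (row i takes the value of the row below it)
fn shift_rows(a: &Board16) -> Board16 {
    std::array::from_fn(|row| if row + 1 < 16 { a[row + 1] } else { 0 })
}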
I hope you enjoyed this post. If you did, feel free to share it or drop a comment on Bluesky or Reddit.