Making games play themselves with Rust Part 3: Solving Puzzles with Backtracking
Table of contents
- Introduction
- Primer on Backtracking
- Napkin Math
- First Attempt
- Correctness and Performance Improvements
- Results
- Conclusion
- Additional Thoughts
Introduction
In the last post I covered the process of parsing the game state from a screenshot. While the end result is fairly simple and requires a small amount of code, it took a pretty large amount of effort to get there. Today it’s time to actually solve puzzles, and I’ll be doing so using an algorithm called backtracking in lieu of proper logical deduction. Today I’ll cover:
- Some background on backtracking and its feasibility
- A naive solver implementation
- Optimizing the solver
- Solving millions of puzzles
- A reflection on this whole project
Primer on Backtracking
(If you’re familiar with backtracking, feel free to skip this section)
Backtracking is a simple algorithm often used to find solutions to constraint problems. Let’s take everyone’s favourite logic puzzle, Sudoku, as an example. Sudoku has 3 constraints.
- Every row must contain one of each digit 1-9
- Every column must contain one of each digit 1-9
- Every 3x3 subregion must contain one of each digit 1-9.
A Sudoku puzzle is presented as an incomplete 9x9 grid which can be uniquely filled in by successive application of only the 3 rules above. For example, in the puzzle below I can place a 4 in the highlighted square because the other 4s in the puzzle exclude the use of 4 in all but one square in that 3x3 subregion.
I won’t go into detail here, but there are many techniques which can be employed to deduce unsolved clues in a Sudoku. However, as you try to solve harder and harder puzzles (such as the minimal 17-clue Sudoku), the techniques become complicated enough that sometimes it’s easier to just guess and hope for the best. For example, if you know a square can only be a 3 or a 6, you can just assume it is 3 and continue with the puzzle until you either solve the puzzle or reach a contradiction. Reaching a contradiction means you can ‘undo’ the moves to the point where you made the guess, and instead go with 6 because you’ve now proven 3 can’t work.
I’ve heard this strategy called “guess and check” or “trial and error”, but in computer science it is called backtracking. The number of empty squares indicates the depth of the solution space, where each attempted solution requires N=depth guesses to fill out the entire puzzle. This is a pretty frustrating and unfun way to solve puzzles by hand, but for computers it is often one of the easiest ways to implement a solver.
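To make that idea concrete, here is a minimal backtracking Sudoku solver. This is an illustrative sketch I'm adding for this primer, not code from the project: it finds the first empty square, tries each legal digit, recurses, and undoes the guess when it hits a contradiction.
fn is_legal(grid: &[[u8; 9]; 9], r: usize, c: usize, v: u8) -> bool {
    for i in 0..9 {
        // Row and column constraints
        if grid[r][i] == v || grid[i][c] == v {
            return false;
        }
        // 3x3 subregion constraint
        if grid[r / 3 * 3 + i / 3][c / 3 * 3 + i % 3] == v {
            return false;
        }
    }
    true
}

fn solve(grid: &mut [[u8; 9]; 9]) -> bool {
    for r in 0..9 {
        for c in 0..9 {
            if grid[r][c] != 0 {
                continue;
            }
            // Guess each candidate digit for the first empty square and recurse
            for v in 1..=9 {
                if is_legal(grid, r, c, v) {
                    grid[r][c] = v;
                    if solve(grid) {
                        return true;
                    }
                    grid[r][c] = 0; // contradiction downstream: undo the guess
                }
            }
            // No digit fits here, so an earlier guess must be wrong: backtrack
            return false;
        }
    }
    true // no empty squares left, the puzzle is solved
}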
Turning our attention back to Dungeons and Diagrams, puzzles are instead presented on an 8x8 grid with somewhere between 3 and 13 clues given (values gathered empirically). The player has to deduce the state of the remaining 50-60 empty tiles (which must be either Wall or Path). Before I get into the implementation, here’s a quick refresher on the constraints a solution must obey (I’ve numbered them as they’ll be referred to later):
1. Each row or column must contain a certain number of walls, indicated by the adjacent number
2. All empty space must be part of a hallway or treasure room
3. Treasure rooms are always 3x3 with a single entrance
4. Hallways are always one space wide. Outside of a treasure room there will never be a 2x2 block of empty spaces.
5. Every monster is in a dead end. Every dead end must contain a monster
6. All empty spaces connect into a single contiguous shape. Diagonal spaces are not considered connected
Napkin Math
Backtracking is simple, but it’s not a silver bullet. Many games such as Chess or Go are famously not solvable (yet?) due to the astronomical number of possible games, which I have seen estimated to be higher than the number of atoms in the universe. Because the puzzle grid is smaller than Sudoku’s and because I’ve written a backtracking Sudoku solver before (this is a common first-year programming assignment), I assumed DnD would also be feasible, but I came up with some heuristics to reinforce that assumption.
In this puzzle, the rows have varying numbers of walls required and monsters present. Focusing on the bottom row, we can see it has 5 empty tiles, and requires 3 of those to be walls. We can use combinatorics to compute the number of ways to place the walls with the “n choose k” function. For the bottom row, 5 choose 3 (I’ll use 5c3 notation going forward) evaluates to 10.
If we repeat this process for all 8 rows, we get 7c1 = 7, 7c6 = 7, 7c6 = 7, 7c2 = 21, 8c7 = 8, 7c3 = 35, 7c3 = 35, and lastly 5c3 = 10. Multiplying all of these numbers together results in a solution space with ~705 million possibilities. This sounds high but is within the realm of possibility for modern computers. Repeating this calculation for a bunch more puzzles, the upper bound hovered around ~4.2 billion. A little scarier, but let’s press on!
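As a quick sanity check of that arithmetic (this snippet is mine, not from the original post), multiplying the per-row binomials reproduces the ~705 million figure:
fn n_choose_k(n: u64, k: u64) -> u64 {
    if k > n {
        return 0;
    }
    // Multiplicative formula, which stays exact at every intermediate step
    (1..=k).fold(1, |acc, i| acc * (n - k + i) / i)
}

fn main() {
    // (empty tiles, walls required) for each row of the example puzzle
    let rows = [(7, 1), (7, 6), (7, 6), (7, 2), (8, 7), (7, 3), (7, 3), (5, 3)];
    let total: u64 = rows.iter().map(|&(n, k)| n_choose_k(n, k)).product();
    println!("raw upper bound: {total}"); // 705894000
}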
The actual solution space should end up being quite a bit smaller, as we don’t have to evaluate each chain of guesses all the way through the whole puzzle. Like the sudoku puzzle above, we will inevitably run into contradictions if we make a wrong guess early on. When this happens, we don’t have to continue the search from that point and can cut off entire branches of the solution space. The earlier a contradiction is found, the larger the branch we get to prune.
As a bit of premature optimization, I also realized that as I made guesses for each row I could keep track of the current wall count in each column, and mask off entire columns as they are filled. This will reduce the number of combinations for subsequent rows being checked.
First Attempt
My implementation needed to iterate over each empty tile, place a wall if legal, and then recurse to the next tile. This would result in a stack depth of 50-60 depending on the puzzle - an amount which would incur considerable function call overhead throughout the several hundred million calls needed to traverse the solution space. I figured I could instead traverse the puzzle a whole row at a time, considering groups of 8 tiles at once. This meant the stack would have a maximum depth of 8, and I could replace the lost functionality with a tight for loop that would go through each permutation of wall placements for the entire row.
I was also inspired by modern chess engines which use bitboards to represent the game state in a compact manner that is both fast to copy (useful for recursion) and fast to process (I could use bitwise operators to operate on all 64 tiles in parallel rather than looping over arrays of bytes). I created a new puzzle representation that could be passed into the solver to take advantage of these ideas. With an 8x8 grid and 4 states to track, I chose to use 4 u64s to make bitmaps of each state. I will also keep the top_nums and left_nums fields from before, as they are already pretty compact.
pub struct Backtracker {
empty_mask: u64,
wall_mask: u64,
treasure_mask: u64,
monster_mask: u64,
left_nums: [u8; 8],
top_nums: [u8; 8],
}
impl Backtracker {
// Convert our existing 'Puzzle' struct to the new bitboard format
pub fn new(puzzle: &Puzzle) -> Self {
let mut empty_mask = 0u64;
let mut wall_mask = 0u64;
let mut treasure_mask = 0u64;
let mut monster_mask = 0u64;
// Iterate over each tile and set the bits in the corresponding masks
puzzle
.tiles
.as_flattened()
.iter()
.enumerate()
.for_each(|(i, tile)| {
let bit = 1 << (63 - i);
match tile {
Tile::Empty => empty_mask |= bit,
Tile::Wall => wall_mask |= bit,
Tile::Treasure => treasure_mask |= bit,
Tile::Monster => monster_mask |= bit,
}
});
Self {
empty_mask,
wall_mask,
treasure_mask,
monster_mask,
left_nums: puzzle.left_nums,
top_nums: puzzle.top_nums,
}
}
}
Let’s check out an example of parsing a puzzle into this format.
For the puzzle above, the data will look like this.
empty_mask: 1101111011111111111111101111111111110111111011111111111111111101
monster_mask: 0010000100000000000000010000000000001000000000000000000000000010
treasure_mask: 0000000000000000000000000000000000000000000100000000000000000000
wall_mask: 0000000000000000000000000000000000000000000000000000000000000000
If I wrap the bits into 8x8 grids, you can see the bits map exactly to the tile locations.
empty monster treasure wall
11011110 00100001 00000000 00000000
11111111 00000000 00000000 00000000
11111110 00000001 00000000 00000000
11111111 00000000 00000000 00000000
11110111 00001000 00000000 00000000
11101111 00000000 00010000 00000000
11111111 00000000 00000000 00000000
11111101 00000010 00000000 00000000
The task now is to figure out: for what value of wall_mask are all the puzzle constraints satisfied? This puzzle has 58 empty tiles, so there are 2^58 or ~288 quadrillion ways to fill out the walls if I started with a wall_mask of 0 and counted up. For fun I calculated that it would take my computer ~5 years to simply count that high without doing any other work. I already knew from my combinatorial calculations earlier that I could do much better, so I focused my attention on the first row of the puzzle to figure out my approach for the recursive solution.
We can see there are two monsters, leaving 6 empty cells to place 3 walls. This gives 6c3 = 20 possible wall layouts. However this ignores the 0 constraint in the 6th column indicating that there cannot be any walls there. This reduces the number of valid tiles to place walls to 5, and the number of wall permutations to 5c3 = 10. Considering the tree example from earlier, we just cut down half the tree! This demonstrates the importance of looking for early pruning opportunities in our search.
We can pick any of the 10 wall permutations (although it makes sense to traverse them in an order which is fast to compute) and assume it to be the correct solution for the first row of the puzzle. For each wall filled in, I subtract 1 from the vertical constraints as the number of walls remaining for that column will have now decreased. From there we move on to the next row and rinse and repeat. If we make it to the last row and haven’t violated any constraints, we will have discovered a solution to the puzzle! I say a solution instead of the solution because I can’t guarantee the puzzle generator always produces layouts with a unique solution. I therefore designed my algorithm to take this into account and push any solution it finds into a list which I can review after the search is complete. A solution only needs to encode the wall placement, so it turns out that storing the wall_mask, which is a u64, is adequate. The algorithm can then be described in just a couple steps:
- Iterate over all Wall permutations for the current row
- Place walls according to the current permutation
- If we are at max depth (last row), verify that the wall placements satisfy the puzzle constraints
- If the solution is valid, push it into the solution vector
- Otherwise descend to the next row, recursively
Iterating through the Wall permutations for a row can be described as finding all the 8-bit numbers with 3 1-bits, and then excluding ones where the 1s overlap with the positions of the monsters. Let me illustrate, again with the first row of the puzzle:
monster positions: 00100001
satisfied columns: 00000100 // the 6th column requires 0 walls, so it is already complete
From here, we can OR these numbers together and invert the result to get a mask that shows our wall candidates:
NOT (00100001 OR 00000100) = 11011010
(the 5 1’s correspond to potential wall placements)
Finally we can iterate a number N from 0 to 255, ANDing each number with this mask and keeping only those that result in exactly 3 bits being set. Rust has a built-in function called count_ones() that makes this pretty easy. Here’s what the first handful of matches look like:
N (decimal) | N (binary) | M (N AND Mask) |
---|---|---|
26 | 00011010 | 00011010 |
27 | 00011011 | 00011010 |
30 | 00011110 | 00011010 |
31 | 00011111 | 00011010 |
58 | 00111010 | 00011010 |
59 | 00111011 | 00011010 |
62 | 00111110 | 00011010 |
63 | 00111111 | 00011010 |
74 | 01001010 | 01001010 |
75 | 01001011 | 01001010 |
78 | 01001110 | 01001010 |
79 | 01001111 | 01001010 |
82 | 01010010 | 01010010 |
83 | 01010011 | 01010010 |
86 | 01010110 | 01010010 |
87 | 01010111 | 01010010 |
88 | 01011000 | 01011000 |
89 | 01011001 | 01011000 |
92 | 01011100 | 01011000 |
93 | 01011101 | 01011000 |
There’s a small problem here, and that is that our masking function results in multiple Ns that map to the same wall placement configuration (for example 26, 27, 30, 31, 58, 59, 62, and 63 all map to 00011010). This will result in performing the same guess multiple times and duplicating work. As a simple workaround I kept track of the last value of M, and if the current value was the same, skipped the guess and incremented N. This was a slight improvement, but eventually I just created a lookup table called BIT_PERMUTATIONS that could be indexed by the current empty cell mask and the required wall count. It returned a list with the valid masks for the current row. It looked something like this:
let candidate_mask = 0b11011010; // mask indicating potential wall positions
let walls_required = 3; // number of walls required
let guesses = BIT_PERMUTATIONS[candidate_mask][walls_required];
// 'guesses' is now the list of numbers with 3 of the 5 bits set from candidate mask
// -> [26, 74, 82, 88, 138, 146, 152, 194, 200, 208]
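The post doesn’t show how the table itself gets built, so here is a sketch of one way it could be precomputed (using nested Vecs for clarity; a static array generated once at startup would work just as well):
fn build_bit_permutations() -> Vec<Vec<Vec<u8>>> {
    // table[candidate_mask][wall_count] = every 8-bit value that places exactly
    // `wall_count` walls, all of them inside `candidate_mask`
    let mut table = vec![vec![Vec::new(); 9]; 256];
    for mask in 0usize..256 {
        for n in 0usize..256 {
            // Keep n only if all of its set bits fall inside the candidate mask
            if n & !mask == 0 {
                table[mask][n.count_ones() as usize].push(n as u8);
            }
        }
    }
    table
}
For the example row, table[0b11011010][3] comes out to [26, 74, 82, 88, 138, 146, 152, 194, 200, 208], matching the list above.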
Implementing the rest of the recursive solver goes basically as you might expect. I’ll leave some annotated code here that hopefully explains itself well enough. Feel free to skip over it and just look at the results :)
We start with depth=0, representing the top row of the puzzle. Guesses are subsequently made for each row until a solution is found and pushed to the solutions vector. There are a few conditions that are tested to handle invalid states, like a row needing more walls than it has space for.
fn solve_recursive(
&self,
// The numbers along the top of the puzzle. These will be modified as walls are placed.
column_counts: [u8; 8],
// 64 bit mask where every 1 represents an Empty tile
empty_mask: u64,
// 64 bit mask where every 1 represents a Wall tile
wall_mask: u64,
// Recursion depth (represents which row we are currently guessing)
depth: usize,
// Output vector to hold any/all solutions we (hopefully) find
solutions: &mut Vec<u64>,
) {
// The number of walls required by the row at the current depth
let row_wall_count = self.left_nums[depth];
// An 8 bit bitmask of empty cells for the current row. The call to 'to_be_bytes()'
// casts the u64 as a [u8; 8], letting me pluck out the byte I need
let row_empty_cells = empty_mask.to_be_bytes()[depth];
// 8 bit bitmask marking columns that are not full yet (remaining count > 0)
let unsatisfied_columns = column_counts
.into_iter()
.map(|i| if i > 0 { 1u8 } else { 0 })
.fold(0, |acc, x| (acc << 1) + x);
// Final mask for the row is 'empty cells that have not yet met their vertical constraint'
let valid_positions_mask = row_empty_cells & unsatisfied_columns;
if valid_positions_mask.count_ones() < row_wall_count as u32 {
// The row requires more walls than there are valid positions, return early
return;
} else if depth == 7 && valid_positions_mask.count_ones() != row_wall_count as u32 {
// We're on the last row and we don't have exactly as many valid positions as walls
// to place, return early (final row must result in ALL constraints satisfied)
return;
}
// Iterate through all possible 8 bit masks, starting with 0b00000001 until it wraps back to 0
let mut candidate_mask = 1u8;
while candidate_mask != 0 {
// Skip this mask if it would place a wall on an invalid tile
if !valid_positions_mask & candidate_mask != 0 {
candidate_mask = candidate_mask.wrapping_add(1);
continue;
}
// Only consider masks that will place the number of walls required
if candidate_mask.count_ones() == row_wall_count as u32 {
// Place the walls by OR'ing the candidate mask into the wall mask
let wall_mask = wall_mask | (candidate_mask as u64) << (56 - 8 * depth);
let empty_mask = empty_mask & !wall_mask;
// Update the vertical counts (subtract 1 from each column where a wall was placed)
let mut counts = column_counts.clone();
for i in 0..8 {
if (candidate_mask & (1 << (7 - i))) >= 1 {
counts[i] -= 1;
}
}
// If we've reached the last row and haven't returned early, the current wall mask
// represents a candidate solution. Check its validity and push it to the solution vector
if depth == 7 {
if is_valid_solution(empty_mask, self.monster_mask, self.treasure_mask) {
let sol = wall_mask;
solutions.push(sol);
}
} else {
self.solve_recursive(
counts,
empty_mask,
wall_mask,
depth + 1,
solutions,
);
}
}
candidate_mask = candidate_mask.wrapping_add(1);
}
}
I snuck in a call to is_valid_solution() there, which I haven’t yet implemented. For now I’ll just return true to see if the recursion/mask iteration is functional. I will also need a way to simulate entering the solution back into the game, so I added these functions to DungeonCrawler:
// Maps tile indices (0-7) to pixel coordinates for clicking
pub fn place_wall(&mut self, x: u8, y: u8) -> Result<()> {
self.click(
x as u32 * TILE_STRIDE + TILE_BASE.0,
y as u32 * TILE_STRIDE + TILE_BASE.1,
)
}
// Loop over each bit in the solution, click the tiles where walls are needed
pub fn enter_solution(&mut self, solution: u64) -> Result<()> {
println!("{solution}");
for i in 0..64 {
let y = i / 8;
let x = i % 8;
if solution & (1 << (63 - i)) != 0 {
self.place_wall(x, y)?;
}
}
// Sleep a brief moment in case we are solving multiple puzzles quickly, so we don't miss inputs
thread::sleep(Duration::from_millis(10));
Ok(())
}
And lastly I could put together the main solve loop, which will continuously parse puzzles, solve them, and enter the solution back into the game. I added the solve time to the output as well as the puzzle seed and number of solutions found so I could revisit puzzles if I needed to debug anything. I added an escape if multiple solutions were found to just generate a new puzzle.
fn solve_loop() -> Result<()> {
let mut dc = dungeon_crawler::DungeonCrawler::new()?;
loop {
let t0 = Instant::now(); // For measuring solve time
let puzzle = dc.parse()?;
let bt = solve::Backtracker::new(&puzzle);
let solutions = bt.solve();
let time = format!("{:.2}s", t0.elapsed().as_secs_f32());
print!("seed: {:0>8} - ", puzzle.seed.unwrap());
match solutions.len() {
0 => println!(" no solution found in {time}"),
1 => {
println!("unique solution found in {time}");
dc.enter_solution(*solutions.last().unwrap()).unwrap();
// Give the game time to process the solution being entered
thread::sleep(Duration::from_millis(25));
}
_ => {
println!(" {: >3} solutions found in {time}", solutions.len());
}
}
// Go to the next puzzle
dc.random_board()
}
Ok(())
}
Surprisingly, this kind of worked on the first attempt. Letting it run for a while, the output looked like this:
seed: 33095486 - 4720 solutions found in 0.09s
seed: 94075043 - 48 solutions found in 0.10s
seed: 87234007 - 11 solutions found in 0.01s
seed: 45151188 - 7936 solutions found in 0.46s
seed: 49329791 - 672 solutions found in 1.11s
seed: 49074641 - 2 solutions found in 0.00s
seed: 91524114 - 8 solutions found in 0.23s
seed: 16324322 - 10816 solutions found in 0.90s
seed: 69331359 - 21040 solutions found in 2.77s
seed: 18835572 - 32 solutions found in 0.75s
seed: 22819008 - 912 solutions found in 0.26s
seed: 93365886 - 4 solutions found in 0.08s
seed: 60424626 - 27600 solutions found in 3.45s
seed: 74146725 - 184 solutions found in 0.04s
seed: 10221255 - 64 solutions found in 1.02s
seed: 17959478 - 1216 solutions found in 0.39s
seed: 63973204 - 1376 solutions found in 0.04s
seed: 46593034 - 608 solutions found in 0.67s
seed: 17089073 - 128 solutions found in 0.01s
seed: 14192672 - 1488 solutions found in 0.74s
This looked promising performance-wise, with most solves being under 1 second, but I was unable to find any unique solutions. Running it on 10000 puzzles from my puzzle database gave some more insight into how much work is left to do:
- The vast majority of puzzles are solved in under 1 second, but some take over 30 seconds and 2 of them were over 5 minutes
- No unique solutions were found at all
To be honest this is not really surprising considering I’ve only set out to satisfy constraint #1 so far (horizontal and vertical wall counts) while ignoring the other 5. This led to solutions like these:
If you count the number of walls in each row and column they will be correct, but all 5 other constraints are violated. I’ll reiterate them quickly here - We have treasure rooms that are not 3x3, treasure rooms with more than one exit, hallways that are 2x2, monsters that are not in a dead end, and lastly a path that is not a single contiguous block. Let’s see if we can do better.
Correctness and Performance Improvements
One thing that stood out in the “solutions” above was just how bad it was at placing walls around the treasure chests. In retrospect this should have been an obvious failure case for my approach because rooms have a well-defined structure over a 5x5 area (3x3 room surrounded by mostly walls) and I was haphazardly rolling dice to guess where the walls go. This has a worst-case probability of (1/2)^25, or about 1 in 33 million. I knew I could do better if I was clever about it. Since the rooms dominate a large area of the board, I thought perhaps I could precompute the valid treasure room positions ahead of time, and use those layouts as starting points for the solver.
Given the 4 room layouts above I can invoke the recursive solver 4 times, each time using the blue regions as a mask to exclude wall placement. If the rooms are not in the true location, it will likely cause the solver to terminate much faster.
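The post doesn’t show how those candidate layouts are found, but a sketch could be as simple as sliding a 3x3 window over every position that contains the treasure tile and stays on the board (this is my own illustration; filtering against already-known walls or monsters would prune the list further):
fn candidate_rooms(treasure_x: u8, treasure_y: u8) -> Vec<(u8, u8)> {
    // (rx, ry) is the top-left corner of a candidate 3x3 room containing the treasure
    let mut rooms = Vec::new();
    for ry in treasure_y.saturating_sub(2)..=treasure_y {
        for rx in treasure_x.saturating_sub(2)..=treasure_x {
            // The corner can be at most column/row 5 for the room to fit on the board
            if rx <= 5 && ry <= 5 {
                rooms.push((rx, ry));
            }
        }
    }
    rooms
}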
Revisiting the 20 puzzles above, our results now look like this:
seed: 33095486 - 12 solutions found in 0.00s
seed: 94075043 - 4 solutions found in 0.00s
seed: 87234007 - 4 solutions found in 0.00s
seed: 45151188 - 18 solutions found in 0.00s
seed: 49329791 - 9 solutions found in 0.01s
seed: 49074641 - unique solution found in 0.00s
seed: 91524114 - unique solution found in 0.01s
seed: 16324322 - 44 solutions found in 0.01s
seed: 69331359 - 693 solutions found in 0.06s
seed: 18835572 - 3 solutions found in 0.01s
seed: 22819008 - 4 solutions found in 0.00s
seed: 93365886 - unique solution found in 0.00s
seed: 60424626 - 823 solutions found in 0.04s
seed: 74146725 - 8 solutions found in 0.00s
seed: 10221255 - unique solution found in 0.02s
seed: 17959478 - 3 solutions found in 0.01s
seed: 63973204 - 31 solutions found in 0.00s
seed: 46593034 - 10 solutions found in 0.01s
seed: 17089073 - 3 solutions found in 0.00s
seed: 14192672 - 38 solutions found in 0.01s
Huge success! I was now finding a handful of unique solutions, and the solve speed was maybe 2 orders of magnitude faster. Running it on 10000 puzzles completed in just under 4 seconds. In order to cut down on the cases with multiple solutions, I had to start intelligently checking whether my solutions actually met all the constraints. I planned to do this with a series of predicates, one per constraint, relying mostly on bitwise math to remain performant. The checker would look something like this:
// Chaining these calls together with && allows short-circuit evaluation, meaning the
// slowest predicates will only be tested if necessary
fn is_valid_solution(empty_mask: u64, features: &PuzzleFeatures) -> bool {
no_wide_hallways(empty_mask, features)
&& check_treasure_rooms(empty_mask, features)
&& dead_ends_have_monsters(empty_mask, features)
&& contiguous_path(empty_mask, features)
}
I added a parameter of type PuzzleFeatures, which is just a collection of unchanging features of the puzzle that are useful in constraint evaluation. empty_mask is a bitmask identifying empty cells, which constantly changes as guesses are made. The PuzzleFeatures struct has 5 members:
struct PuzzleFeatures {
/// Constraints on the left side of the puzzle
left_nums: [u8; 8],
/// Bitmask of monster locations
monster_mask: u64,
/// Bitmask of treasure locations
treasure_mask: u64,
/// Bitmask of 3x3 treasure rooms
rooms_mask: u64,
/// List of (x, y) coordinates for treasure chests (blue regions from the gif above)
room_positions: Vec<(u8, u8)>,
}
Let’s see what we can do with these values. I’ll walk you through the implementation for each of the predicates.
—– 1. No Wide Hallways —–
Detecting wide hallways (a 2x2 or larger section of Empty tiles) is pretty easy with bitwise math. We just look at each empty tile and check if the tiles to the east, south, and southeast are also empty. Checking the neighbors can be done quickly by bitshifting the empty_mask. A shift by 1 bit will access neighbors left/right, whereas a shift by 8 bits will access neighbors above/below. There is one small edge case (literally) to handle, and that is that bits on the left and right edges will assume they are adjacent to one another, as I will try to demonstrate visually (this was hard to visualize):
To compensate for this faux-adjacency, we can further mask the leftmost and rightmost columns when doing left and right shifts. The least significant bit corresponds to the bottom right tile, increasing first to the left and then vertically. The most significant bit represents the top left tile. In red I’ve highlighted the bits that either wrap around the board, or get shifted in as 0s. Knowing this, I created a few more masks which I can use to ignore those bits.
// masks to hide bits that wrapped around the edge of the board
const LEFTSHIFT_MASK: u64 = 0xfefefefefefefefe;
const RIGHTSHIFT_MASK: u64 = 0x7f7f7f7f7f7f7f7f;
fn no_wide_hallways(empty_mask: u64, features: &PuzzleFeatures) -> bool {
// Create a bitmask indicating cells that are part of a 2x2 Empty region
// Check cell to the East, ignoring bits that wrap around the board
let mut wide_hall_mask = empty_mask & ((empty_mask << 1) & LEFTSHIFT_MASK);
// Check cell to the South and Southeast
wide_hall_mask &= wide_hall_mask << 8;
// Check cell to the West
wide_hall_mask |= (wide_hall_mask >> 1) & RIGHTSHIFT_MASK;
// Check cell to the North and Northwest
wide_hall_mask |= wide_hall_mask >> 8;
// At this point, all the 1s in the mask are part of 2x2 or larger Empty regions
// Need to ignore bits that are part of a treasure room (2x2 Empty is legal there)
wide_hall_mask & !features.rooms_mask == 0
}
This solution is wicked fast. The bitwise math allows us to operate on all 64 tiles at once with no need for looping or array indexing. I later learned this is referred to as SIMD Within a Register (SWAR).
—– 2. Monsters in Dead Ends —–
The condition we need to evaluate is “Monsters have exactly 1 Empty tile adjacent to them”. I put together a boolean function to do this, similar to the wide hallway detection, but it is slightly more involved. My implementation is fast, but I suspect a more efficient method exists.
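That function isn’t shown in the post, so here is a sketch of one way it could look, reusing the bit layout and shift masks from above (the per-cell loop is my own simplification; the actual implementation is presumably more bit-twiddly):
fn dead_ends_have_monsters(empty_mask: u64, features: &PuzzleFeatures) -> bool {
    // Walkable cells include monsters and treasure, since they sit on the path
    let open = empty_mask | features.monster_mask | features.treasure_mask;
    // Per-direction masks: bit i is set if cell i's neighbor in that direction is walkable
    let east = (open << 1) & LEFTSHIFT_MASK;
    let west = (open >> 1) & RIGHTSHIFT_MASK;
    let north = open >> 8;
    let south = open << 8;
    for i in 0..64 {
        let bit = 1u64 << i;
        if open & bit == 0 {
            continue;
        }
        let neighbors =
            ((east >> i) & 1) + ((west >> i) & 1) + ((north >> i) & 1) + ((south >> i) & 1);
        if features.monster_mask & bit != 0 {
            // Every monster must sit in a dead end (exactly one walkable neighbor)
            if neighbors != 1 {
                return false;
            }
        } else if features.rooms_mask & bit == 0 && neighbors == 1 {
            // A dead end outside a treasure room with no monster in it is illegal
            return false;
        }
    }
    true
}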
—– 3. Treasure Rooms have 1 Entrance —–
I thought about this one for a while - and initially planned to loop over the 16 tiles surrounding each 3x3 treasure room and count the Empty tiles. This worked but I was itching for another cute SWAR solution using bitmath. I figured I could create a mask of the treasure room boundary and multiply it with the Empty tile mask. I can assert that the result has a single 1 bit to prove that the room has exactly one entrance.
In the image above, I have chosen a potential spot for the treasure room in blue. The 12 red boundary tiles are what I want to create a mask of. Reading the bits out in a grid looks like this:
00001110
00010001
00010001
00010001
00001110
00000000
00000000
00000000
And converting that to a u64: 0b0000111000010001000100010001000100001110000000000000000000000000 (or 0x0E1111110E000000 in hexadecimal).
I built up a table of these masks, one for each of the 36 potential treasure room locations (3x3 room can only fit in an 8x8 grid 36 ways), and I can look it up based on the room location.
fn check_treasure_rooms(empty_mask: u64, features: &PuzzleFeatures) -> bool {
// Loop over each treasure room
for &(x, y) in features.room_positions.iter() {
// Multiply the boundary mask with the empty mask and count the 1s
if (empty_mask & ROOM_BORDER_MASKS[y as usize][x as usize]).count_ones() != 1 {
return false;
}
}
true
}
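For completeness, here is a sketch of how those border masks could be precomputed (the post doesn’t show this step; the bit layout matches the rest of the solver, and the 12-tile ring skips the diagonal corners as in the grid above):
fn room_border_mask(x: isize, y: isize) -> u64 {
    // (x, y) is the top-left tile of the 3x3 room, with 0 <= x, y <= 5
    let mut mask = 0u64;
    for by in y - 1..=y + 3 {
        for bx in x - 1..=x + 3 {
            // Skip tiles that fall off the board
            if bx < 0 || bx > 7 || by < 0 || by > 7 {
                continue;
            }
            // Skip the 3x3 interior of the room itself
            let interior = bx >= x && bx <= x + 2 && by >= y && by <= y + 2;
            // Skip the four diagonal corners of the surrounding 5x5 ring
            let corner = (bx == x - 1 || bx == x + 3) && (by == y - 1 || by == y + 3);
            if interior || corner {
                continue;
            }
            // Bit 63 is the top-left tile, matching the rest of the post
            mask |= 1u64 << (63 - (by * 8 + bx)) as u32;
        }
    }
    mask
}
Evaluating this for all 36 (x, y) pairs would populate the ROOM_BORDER_MASKS[y][x] table used above; for the example room it produces 0x0E1111110E000000.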
—– 4. Contiguous Path —–
I saved the worst for last. Checking if the entire path is contiguous boils down to a floodfill algorithm, which is hard to do without iterating. I can strategically do this check last so it only happens if all the other conditions are satisfied, meaning it should be called as infrequently as possible. I implemented it by performing iterative dilation starting from a single Empty tile and repeating until no change is detected. If the number of 1s in the flooded region is equal to the number of Empty + Treasure + Monster tiles (treasure and monsters can be considered part of the path), then the path is contiguous.
fn contiguous_path(empty_mask: u64, features: &PuzzleFeatures) -> bool {
// Create the initial flood mask by only setting the first 1 bit from the Empty mask
let tz = empty_mask.trailing_zeros();
let mut flood_mask = 1u64 << tz;
// Bitmask of region to flood includes treasure/monsters
let e_mask = empty_mask | features.treasure_mask | features.monster_mask;
loop {
// Create copy of flood mask so we can detect if anything changed
let mut flood_mask2 = flood_mask;
// Check if N,E,S,W neighbors are empty and dilate the mask appropriately
flood_mask2 |= ((flood_mask << 1) & LEFTSHIFT_MASK) & e_mask;
flood_mask2 |= ((flood_mask2 >> 1) & RIGHTSHIFT_MASK) & e_mask;
flood_mask2 |= (flood_mask2 << 8) & e_mask;
flood_mask2 |= (flood_mask2 >> 8) & e_mask;
// If no change occurred, we're done!
if flood_mask2 == flood_mask {
break;
}
flood_mask = flood_mask2;
}
// Check that the flooded region covers every Empty/Treasure/Monster tile
if flood_mask.count_ones() != e_mask.count_ones() {
return false;
}
true
}
This method expands once in each direction every step of the loop. It could probably be made faster (empirically, not algorithmically) by starting the search near the center of the board rather than just “the first Empty cell we can find”, roughly halving the number of steps. I found this optimization was not needed.
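If it ever did become worthwhile, the tweak could be as simple as something like this (a hypothetical sketch, not code from the project): probe the middle rows first and seed the flood from the first Empty cell found there.
fn center_biased_seed(empty_mask: u64) -> u64 {
    // Rows ordered centre-out; row 0 is the top of the board
    for row in [3u32, 4, 2, 5, 1, 6, 0, 7] {
        let row_bits = empty_mask & (0xffu64 << (56 - 8 * row));
        if row_bits != 0 {
            // Lowest set bit in that row becomes the starting flood mask
            return 1u64 << row_bits.trailing_zeros();
        }
    }
    0 // no Empty cells at all
}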
In the animation above, you can see the search starts in the first Empty cell on the bottom row. Due to the use of trailing_zeros to find this cell, it will by nature be in the bottom-most row and as far to the right as possible.
This animation was fun to make and I hope it’s fun to watch. Now after entirely too much ado, I could finally taste the fruits of my labour.
I generated a gif of the solver showing each of the guesses it makes as it progresses through the puzzle. It was immensely satisfying to see this for the first time, and it also caught my eye with another optimization opportunity. You can see that the first 2-3 seconds of this gif show the top-right skeleton fully surrounded by walls, which is illegal. The fix was simple: apply the “monsters in dead ends” predicate after each guess instead of only upon reaching the bottom row. This (along with the “no wide hallways” predicate) massively pruned the search tree and resulted in an overall speedup of over an order of magnitude.
Results
I tested the solver on my database of nearly 6 million puzzles that I parsed from the game. I was able to solve the entire corpus in under 4 seconds (parallelized across 16 threads) which implies an average single-threaded solve time of about 10 microseconds per puzzle. This was MUCH faster than I had hoped to achieve when I started this project and I was very excited to see that. Lastly, please enjoy a recording of the solver chugging through puzzles at lightspeed.
I like seeing the mouse cursor jump around from tile to tile, hammering home the somewhat melancholy reality that I have just spent dozens of hours writing a computer program to automate the process of having fun and documenting it for the handful of people who may one day stumble upon this.
Conclusion
I’m really pleased with the outcome of this solver and had a blast writing it. I would love to come back and take a stab at generating my own puzzles in the future, but for now I would like even more to hang this project up as finished. I finished the solver well over a year ago in what was probably a single-digit number of hours. I have spent easily 5 times that on writing this series of blog posts, and it’s been weighing on me to wrap it up and publish it. I vastly underestimated the amount of time I would spend writing scripts to generate animations and just editing and rewriting large parts of the posts. Nonetheless I really enjoyed putting it all together and I’m eager to get working on the next project!
Additional Thoughts
While putting this post together, there were several things I wanted to write about but they generally didn’t add value to the plot. I decided to briefly cover them here for extra curious readers.
I have thought a lot about how one might generate these puzzles. I did briefly take a stab at it with some decent results, but the puzzles a) didn’t support treasure rooms, and b) were of generally extreme difficulty, often requiring backtracking (which is a hallmark of poor puzzle design). If I decide to come back to this project, it would certainly be to tackle this problem and try to generate arbitrary-sized puzzles of prescribed difficulty.
My solver doesn’t use any logical deduction (for example “this row needs 4 walls and has 4 spaces, therefore they all must be walls”, or, “the space between a monster and a treasure room must be a wall”). I originally wanted to incorporate these into the solver, but with the speed I achieved they simply weren’t necessary. If I were to attempt solving larger puzzles (like 16x16), they perhaps would be mandatory to have acceptable solve times. There exists an online puzzle generator which can produce larger puzzles, but occasionally it places invalid treasure rooms.
My solver is very finely tuned for 8x8 puzzles. It would not be impossible to adapt it to larger puzzles, but it would certainly require a couple hours of work. Of course, generating puzzles beyond 8x8 would require a solver that can prove they have a unique solution, so that work would need to be done if I take another stab at puzzle generation. I had the idea to use integer arrays for the bitboards for larger sizes, so instead of a u64 with 8 bits per row in an 8x8 puzzle, I could use a [u16; 16] array to store 16 16-bit rows (or larger sizes with [u32; 32] or even [u128; 128] for the masochists). Doing bitwise operations between two bitmasks becomes a bit more involved but not difficult.
The speed I achieved is promising for the notion that much larger puzzles could be generated/solved. However, the solution space grows factorially due to the combinatorial nature of enumerating potential guesses. Incorporating some of the more trivial logical deduction techniques could improve this I imagine, but by how much remains to be seen.
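As a rough illustration of what those array-based operations might look like (this is my own sketch of an assumed layout with one u16 per row, row 0 at the top, not code from the project):
type Board16 = [u16; 16];

// Cell-wise AND of two boards
fn and(a: &Board16, b: &Board16) -> Board16 {
    std::array::from_fn(|row| a[row] & b[row])
}

// Shift every row one column sideways; bits simply fall off the edge of their
// own row, so the wrap-around masks from the 8x8 version are no longer needed
fn shift_columns(a: &Board16) -> Board16 {
    std::array::from_fn(|row| a[row] << 1)
}

// Shift the whole board by one row (row i takes the value of the row below it)
fn shift_rows(a: &Board16) -> Board16 {
    std::array::from_fn(|row| if row + 1 < 16 { a[row + 1] } else { 0 })
}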
I hope you enjoyed this post. If you did, feel free to share it or drop a comment on Bluesky or Reddit.