8. Early Design

We've got the algorithm details and now are ready to talk a little about more exact data structure details.

Data design

Recall the original grid? Let's imagine that we use one matrix element for each room in our maze and then initialize it so as to set up four walls for each room. Every room does have four walls, after all. We might do this by assigning a binary bit for each wall in any particular way we want. For example, the 1's bit could represent the west wall, the 2's bit the east wall, the 4's bit the north wall, and the 8's bit the south wall. It doesn't matter which way they get assigned, just so long as we are consistent in our use of them. Here's an example:

0000	0001	0010	0011	0100	0101	0110	0111	1000	1001	1010	1011	1100	1101	1110	1111
`+ + + +`	`+ + \| + +`	`+ + \| + +`	`+ + \| \| + +`	`+--+ + +`	`+--+ \| + +`	`+--+ \| + +`	`+--+ \| \| + +`	`+ + +--+`	`+ + \| +--+`	`+ + \| +--+`	`+ + \| \| +--+`	`+--+ +--+`	`+--+ \| +--+`	`+--+ \| +--+`	`+--+ \| \| +--+`

These are numbers going from 0 to 15 and they provide us with every possible combination of walls for each cell.

But if we did this, we'd have duplicate walls everywhere because the north wall of the south room and the south wall of the north room would both be for the same intervening wall actually between them. The effect would be something like this:

+--+ +--+ +--+
|  | |  | |  |
+--+ +--+ +--+
+--+ +--+ +--+
|  | |  | |  |
+--+ +--+ +--+
+--+ +--+ +--+
|  | |  | |  |
+--+ +--+ +--+

I think you can easily see that the interior walls are duplicated twice, in each cell and its adjacent cell. So setting up our matrix so that each cell of it held the condition of the four surrounding walls would cause a lot of unnecessary replication. Duplicate walls also means we'd need to update both duplicates when we broke down an intervening barrier. That's an extra and unneeded pain.

A solution is to set up each room with, say, a west wall and a south wall (or a left wall and a bottom wall, if you prefer.) That's it. Something more like:

+    +    +
|    |    |
+--+ +--+ +--+

+    +    +
|    |    |
+--+ +--+ +--+

+    +    +
|    |    |
+--+ +--+ +--+

Or, when merged, conceptually more like:

+  +  +
|  |  |
+--+--+--+
|  |  |
+--+--+--+
|  |  |
+--+--+--+

Now the duplicate walls are indeed gone.

Some problems do remain, but they are easy to cope with. We don't have any exterior walls on the north side or the east side of the maze. But that is easily addressed by realizing that we never need to keep any status about them because we cannot break them down, anyway. So they always exist and when it comes time to print out the maze, we just add them arbitrarily. Or you might choose to create an extra row and column for the maze, along those sides, so that you do have the wall status values to play with. Now another problem is that, in some cases, we update a wall status in the room where the marker is currently at and in other cases we update a wall status in the room we will advance the marker into. So we cannot just assume which room has the status that needs updating, but must decide that based on which of the four directions we are traveling. But those issues are probably easier to deal with than having to perform multiple updates on a design that doesn't really reflect what we are doing very well.

However we do it, whether an array of west and south walls, or of east and north, or of east and south, or of west and north, or some other scheme of your own devising, we will also need an array of markers indicating whether or not we've been to the room. (Kind of like leaving bread crumbs along out path, eh?) We need that so we can decide whether or not a given wall can be broken down or not. And finally, we may also need to keep track of rooms that are available for restarting from and updating that list when we find rooms that have no exits. That list would help us quickly discover where we can restart a new path branch.

We also only need one bit per wall or marker since we only need to know YES/NO information about each. It would be somewhat easier, if we just used an entire INTEGER for the purpose. But after some calculations, you'll see why we should go to the trouble. Wasting space like that can really limit the size of your mazes.

If you've never tried it before, generating a maze can seem easy enough in concept yet quite hard to grapple with, once you try to work out the exact details of automatically generating one with a computer. But it turns out that it really isn't all the difficult, once you've done it and made it work.

Let's get started.

Last updated: Sunday, July 31, 2005 02:04