Apologies .... I thought I'd posted this entry last week but it turns out it was saved as a draft.
Anyway, this is the last thing I'll say about sudoku, at least for a while. But I did promise something a little more useful out of all this discussion, so lets talk a bit about generating puzzles. Be warned, though, this is all pretty hackish.
There are all sorts of issues here, particularly the issue that to have a reliable idea of how difficult these things are for a person to solve, we probably want a heuristic solver, no a backtracking solver like the one discussed here. So as far as `difficulty' goes, we'll just have to rely on the number of clues as a rough proxy.
We want to generate puzzles, and one way to think about this is to first figure out how to make a valid solution (i.e. a fully solved puzzle) and then keep taking away positions until we have the right number of clues. We may not be able to take away as many as we want, though, and still keep the original position the
First off, realize that because of the way the solver works, we already know how to generate a solved puzzle, we just start of with an empty (all zeros) one:
BLOG> (find-first-solution (make-array 81 :initial-element 0))(notice that if we had used find-all-solutions instead of find-first-solution, we'd have to wait a long time --- the code would happily go along generating every possible solved sudoku board).
#(2 7 5 1 4 3 8 6 9 1 3 6 7 9 8 2 4 5 8 4 9 5 6 2 7 1 3 7 1 2 8 3 5 4 9 6 4 6 3
2 1 9 5 7 8 5 9 8 4 7 6 1 3 2 6 5 4 3 2 1 9 8 7 3 2 1 9 8 7 6 5 4 9 8 7 6 5 4
3 2 1)
So that's great --- except it will always generate the same board. We'll have to introduce some randomness. It's important to note that I have done no analysis of this, and while it will generate a really large number of different boards, I'm not going to claim that it is uniformly distributed or anything like that. What we do is `seed' a board with a certain number of random positions:
(defun seed-sudoku (num-sites)It is possible that for larger num-sites this will not work, and for smaller num-sites there will be a large number of solutions. So we will just repeat until we find a valid board:
(let ((soln (make-array 81 :initial-element 0))
(sites (shuffle! (loop for n below 81 collecting n))))
(labels ((digit-at (n)
(aref soln n))
(possible-digits (n)
(set-difference
`(1 2 3 4 5 6 7 8 9)
(mapcar #'digit-at (neighbors-of n))))
(nth-value-set-p (n)
(/= 0 (digit-at n)))
(random-element (list)
(nth (random (length list)) list))
(update (n)
(when (or (null sites) (zerop num-sites))
(return-from seed-sudoku soln))
(let ((possible-digits (possible-digits n)))
(when possible-digits
(setf (aref soln n) (random-element possible-digits))
(decf num-sites)))
(update (pop sites))))
(update (pop sites)))))
(defun generate-filled-soduku ()We allow for the possibility that a solution isn't possible (it usually is) and just repeat until we get one. I picked 20 for the random seeds as an empirical balance. Too low (e.g. 10) and occasionally the solver takes a very long time to find a solution (on the order of minutes). Too high (e.g. 30) and often seed-sudoku will fail to find enough places to place a clue (i.e., they all become constrained)
(loop for x = (seed-sudoku 20)
unless (null x) do
(let ((res (find-first-solution x)))
(when res (return-from generate-filled-soduku res)))))
So now we can reliably generate solved boards, we'll add a little bit of code to try and remove clues until a desired number is achieved. We start of with 81 (everything known) and randomly select ones to remove so long as doing that doesn't make the solution non-unique.
(defun generate-sudoku (target-clues)Now for one last thing, just to give more control. I might want to know that I actually have a 25 clue puzzle, not that I tried for 25 and only made it down to 30. We now do pure rejection sampling on the number of clues. In other words, we keep running generate-sudoku with an argument of n until it is successful:
"returns a valid sudoku puzzle, tries to get it down to target-clues number of non-zero entries"
(loop with res = (generate-filled-soduku)
with sites = (shuffle! (loop for n below 81 collect n))
with clues = 81
until (= clues target-clues)
while sites
for n = (pop sites)
for val = (aref res n) do
(setf (aref res n) 0)
(if (multiple-solutions-p res)
(setf (aref res n) val)
(decf clues))
finally
(return (values res clues))))
(defun sudoku-of-n-clues (num-clues)This becomes slow (apparently exponentially growing in the time to find a solution) at about 20 clues. For my purposes this is fine. I don't do the puzzles myself, but have been told the 20 solution ones that I was generating were pretty hard to do. If we stay above 20, things are really not very expensive:
(loop for (res clues) = (multiple-value-list (generate-sudoku num-clues))
until (= clues num-clues)
finally (return res)))
BLOG> (time (loop repeat 10000 collecting (sudoku-of-n-clues (+ 22 (random 20)))))Combining this with the pdf-output I had before would generate a book of 10,000 puzzles, which should take a while to work through.
Evaluation took: 195.26 seconds of real time
Of course, this really isn't the end of the road, there are a large number of improvements that could be made, and this algorithm would need some rethinking if we wanted to seriously explore the space for whatever reasons, particularly for low clue numbers. Anyway, it was just an example to show a few CL idioms etc., but I added this as a starting-off point, perhaps, for a decent puzzle generator.
