

物理维度:PG 在处理 81 位以上状态压缩或高频字符串变异时,底层开销更小。
2. 步骤分析
以官方测试数据中id = 996 的数据为例 。将其展开成 9宫格如图所示

2.1 外层框架
SELECT g.id,g.puzzle,regexp_replace(solution.result, '(.{9})(?=.)', '\1' || E'\n', 'g') AS resultFROM (SELECT * FROM game3.sudoku9_9) gCROSS JOIN LATERAL (... 这里面是核心解题逻辑 ...) solutionORDER BY g.id;
FROM (...) g:把 sudoku9_9 表里的每一道题目(每一行)拿出来,标记为g
CROSS JOIN LATERAL (...):使用 LATERAL 对于每一道题目 g,都单独启动一次括号里的计算程序,算出结果叫 solution,然后把结果贴回题目后面。
2.2 核心逻辑
在 WITH RECURSIVE 里面,定义了一堆“临时表”(CTE),相当于解题前的准备工作。
2.2.1 常量初始化
digits AS (SELECT n,(n - 1) AS idx,(1::bigint << (n - 1)) AS bit, -- 重点!n::text AS digit_textFROM generate_series(1,9) n),
结果如下:

这里预计算了 1-9 的二进制位特征。
例如数字 1 对应二进制 000000001,数字 9 对应 100000000。
收益:后续校验时,只需做一次“按位与(&)”操作即可判断数字是否冲突,比数值比较或字符串匹配快得多。
2.2.2 坐标映射
cells AS (SELECT pos,pos/9 AS r,pos%9 AS c,((pos/9)/3)*3 + ((pos%9)/3) AS b, -- 算宫号substring(clean.s FROM pos + 1 FOR 1) AS chFROM cleanCROSS JOIN generate_series(0,80) AS gs(pos)),
结果:

将 81 个字符的字符串展开成 81 行数据,并计算好每个格子所属的 r(行)、c(列)、b(宫)坐标。这是为了初始化掩码做准备。
2.2.3 状态初始化:位掩码生成
-- 行/列/宫掩码:每 9bit 表示一行/列/宫已用数字row_mask AS (SELECT coalesce(bit_or(set_bit(B'0'::bit(81), (pos/9)*9 + (8 - (ch::int - 1)), 1)),B'0'::bit(81)) AS maskFROM cellsWHERE ch <> '?'),col_mask AS (SELECT coalesce(bit_or(set_bit(B'0'::bit(81), (pos%9)*9 + (8 - (ch::int - 1)), 1)),B'0'::bit(81)) AS maskFROM cellsWHERE ch <> '?'),box_mask AS (SELECT coalesce(bit_or(set_bit(B'0'::bit(81), ((((pos/9)/3)*3 + ((pos%9)/3))*9) + (8 - (ch::int - 1)), 1)),B'0'::bit(81)) AS maskFROM cellsWHERE ch <> '?'),-- todo:记录所有待填位置todo AS (SELECT coalesce(array_agg(pos ORDER BY pos), ARRAY[]::int[]) AS arrFROM cellsWHERE ch = '?'),seed AS (SELECT clean.s,row_mask.mask AS row_mask,col_mask.mask AS col_mask,box_mask.mask AS box_mask,todo.arr AS todoFROM clean, row_mask, col_mask, box_mask, todo),
结果:

没有用 3 个数组存状态,而是用了 3 个 81位的二进制串。
row_mask 的结构:前 9 位表示第 0 行使用了哪些数字,第 10-18 位表示第 1 行...
set_bit:将已存在的数字对应的位置为 1。
bit_or:聚合操作,将所有格子的状态合并成一个最终的掩码。
通过这 3 个二进制串,全盘的约束状态被压缩在极小的内存中,读取效率极高。
2.2.4 递归
solver AS (-- 初始层SELECT s, row_mask, col_mask, box_mask, todoFROM seedUNION ALL-- 递归扩展:挑选一个空格尝试填数SELECT overlay(c.s placing d.digit_text FROM pick.pos + 1 FOR 1) AS s,set_bit(c.row_mask, pick.r*9 + (8 - d.idx), 1) AS row_mask,set_bit(c.col_mask, pick.cidx*9 + (8 - d.idx), 1) AS col_mask,set_bit(c.box_mask, pick.b*9 + (8 - d.idx), 1) AS box_mask,array_remove(c.todo, pick.pos) AS todoFROM solver cJOIN LATERAL (-- MRV:挑候选数最少的格子SELECT pos,pos / 9 AS r,pos % 9 AS cidx,((pos / 9)/3)*3 + ((pos % 9)/3) AS b,mask_int,bit_count(((~mask_int) & 511)::bit(9)) AS free,ordFROM (SELECT pos,ord,pos / 9 AS r,pos % 9 AS cidx,((pos / 9)/3)*3 + ((pos % 9)/3) AS b,((substring(c.row_mask from (pos/9)*9 + 1 for 9)::bit(9)::int) |(substring(c.col_mask from (pos%9)*9 + 1 for 9)::bit(9)::int) |(substring(c.box_mask from ((((pos/9)/3)*3 + ((pos%9)/3))*9) + 1 for 9)::bit(9)::int)) AS mask_intFROM unnest(c.todo) WITH ORDINALITY AS todo(pos, ord)) baseORDER BY free, ordLIMIT 1) pick ON trueJOIN digits d ON (pick.mask_int & d.bit) = 0)
join lateral 内部结果

取出约束:对于每一个空格,从 row_mask、col_mask、box_mask 中截取对应的 9 位。
合并约束:使用 位或(|) 运算。比如行里有1,列里有2,宫里有3,运算结果就是 ...00111,表示 1,2,3 都不能填。(图中mask_int = 509 ,他对应的二进制为 111111101 )
计算自由度:bit_count 计算有多少个位是 0(合法的候选数)。
排序 (ORDER BY free):这是关键点, 不按顺序填,而是永远挑那个“自由度最小”的格子填。
如果一个格子只剩 1 个能填的数(自由度=1),立刻填它。
如果自由度不是1,比如一个格子,它有两个合法的候选数:1 和 4,此时 JOIN digits 会匹配成功两次,然后两个数据同时进入到下一轮递归。直到后面某一轮 有一个分支,他的自由度为0,JOIN digits 找不到匹配的数字,这条数据没能生成下一代,直接“断子绝孙”,从内存中消失。

2.2.5 收尾
SELECT s AS resultFROM solverWHERE coalesce(array_length(todo, 1), 0) = 0LIMIT 1
递归会产生很多中间状态,我只要那个 todo 数组长度为 0 的(也就是全填完了的)那一行。
LIMIT 1:如果有多解,我只要第一个。
3. 总结
快在“不用眼睛看”,而用“开关查”:没有用程序去一个个对比数字(1等于1吗?),而是把数字变成了二进制开关。检查冲突只需要一次CPU的“且运算”,这比比对字符快了几十倍。SELECT g.id,g.puzzle,-- 将 81 位结果每 9 位插入换行,便于人工核对regexp_replace(solution.result, '(.{9})(?=.)', '\1' || E'\n', 'g') AS resultFROM (SELECT *FROM game3.sudoku9_9) gCROSS JOIN LATERAL (WITH RECURSIVE-- digits:缓存 1~9 及其位掩码digits AS (SELECT n,(n - 1) AS idx,(1::bigint << (n - 1)) AS bit,n::text AS digit_textFROM generate_series(1,9) n),-- clean:去掉 puzzle 中的空白/换行clean AS (SELECT translate(g.puzzle, E'\r\n\t ', '') AS s),-- cells:展开 81 个格子并标注行/列/宫cells AS (SELECT pos,pos/9 AS r,pos%9 AS c,((pos/9)/3)*3 + ((pos%9)/3) AS b,substring(clean.s FROM pos + 1 FOR 1) AS chFROM cleanCROSS JOIN generate_series(0,80) AS gs(pos)),-- 行/列/宫掩码:每 9bit 表示一行/列/宫已用数字row_mask AS (SELECT coalesce(bit_or(set_bit(B'0'::bit(81), (pos/9)*9 + (8 - (ch::int - 1)), 1)),B'0'::bit(81)) AS maskFROM cellsWHERE ch <> '?'),col_mask AS (SELECT coalesce(bit_or(set_bit(B'0'::bit(81), (pos%9)*9 + (8 - (ch::int - 1)), 1)),B'0'::bit(81)) AS maskFROM cellsWHERE ch <> '?'),box_mask AS (SELECT coalesce(bit_or(set_bit(B'0'::bit(81), ((((pos/9)/3)*3 + ((pos%9)/3))*9) + (8 - (ch::int - 1)), 1)),B'0'::bit(81)) AS maskFROM cellsWHERE ch <> '?'),-- todo:记录所有待填位置todo AS (SELECT coalesce(array_agg(pos ORDER BY pos), ARRAY[]::int[]) AS arrFROM cellsWHERE ch = '?'),seed AS (SELECT clean.s,row_mask.mask AS row_mask,col_mask.mask AS col_mask,box_mask.mask AS box_mask,todo.arr AS todoFROM clean, row_mask, col_mask, box_mask, todo),solver AS (-- 初始层SELECT s, row_mask, col_mask, box_mask, todoFROM seedUNION ALL-- 递归扩展:挑选一个空格尝试填数SELECT overlay(c.s placing d.digit_text FROM pick.pos + 1 FOR 1) AS s,set_bit(c.row_mask, pick.r*9 + (8 - d.idx), 1) AS row_mask,set_bit(c.col_mask, pick.cidx*9 + (8 - d.idx), 1) AS col_mask,set_bit(c.box_mask, pick.b*9 + (8 - d.idx), 1) AS box_mask,array_remove(c.todo, pick.pos) AS todoFROM solver cJOIN LATERAL (-- MRV:挑候选数最少的格子SELECT pos,pos / 9 AS r,pos % 9 AS cidx,((pos / 9)/3)*3 + ((pos % 9)/3) AS b,mask_int,bit_count(((~mask_int) & 511)::bit(9)) AS free,ordFROM (SELECT pos,ord,pos / 9 AS r,pos % 9 AS cidx,((pos / 9)/3)*3 + ((pos % 9)/3) AS b,((substring(c.row_mask from (pos/9)*9 + 1 for 9)::bit(9)::int) |(substring(c.col_mask from (pos%9)*9 + 1 for 9)::bit(9)::int) |(substring(c.box_mask from ((((pos/9)/3)*3 + ((pos%9)/3))*9) + 1 for 9)::bit(9)::int)) AS mask_intFROM unnest(c.todo) WITH ORDINALITY AS todo(pos, ord)) baseORDER BY free, ordLIMIT 1) pick ON trueJOIN digits d ON (pick.mask_int & d.bit) = 0)SELECT s AS resultFROM solverWHERE coalesce(array_length(todo, 1), 0) = 0LIMIT 1) solutionORDER BY g.id;
感谢大家对本次《数据库编程大赛》的关注和支持,欢迎加入技术交流群,更多精彩活动不断,欢迎各路数据库爱好者来挑战!
