Mathematics of Sudoku

The class of Sudoku puzzles consists of a partially completed row-column grid of cells partitioned into N regions each of size N cells, to be filled in using a prescribed set of N distinct symbols (typically the numbers {1, …, N}), so that each row, column and region contains exactly one of each element of the set. The puzzle can be investigated using mathematics.

The mathematical analysis of Sudoku falls into two main areas: analyzing the properties of a) completed grids and b) puzzles. Grid analysis has largely focused on counting (enumerating) possible solutions for different variants. Puzzle analysis centers on the initial given values.

The techniques used in either are largely the same: combinatoric and permutation group theory, augmented by the dexterous application of programming tools. There are many Sudoku variants, (partially) characterized by the size (N) and shape of their regions.

For classic Sudoku, N=9 and the regions are 3×3 squares (blocks). A rectangular Sudoku uses rectangular regions of row-column dimension R×C. For R×1 (and 1×C), i.e. where the region is a row or column, Sudoku becomes a Latin square. Other Sudoku variants also exist, such as those with irregularly-shaped regions or with additional constraints or different constraint types. See Sudoku – Variants for a discussion of variants and Sudoku terms and jargon for an expanded listing.

The mathematics of Sudoku is a relatively new area of exploration, mirroring the popularity of Sudoku itself. NP-completeness was documented late 2002 , enumeration results began appearing in May 2005 . In contrast with the two main mathematical approaches of Sudoku mentioned above, an approach resting on mathematical logic and dealing with the resolution of the puzzles from the viewpoint of a player has recently been proposed in Denis Berthier’s book “The Hidden Logic of Sudoku”. This formalizes certain mathematical symmetries.

The general problem of solving Sudoku puzzles on n2 × n2 boards of n × n blocks is known to be NP-complete . For n=3 (classical Sudoku), however, this result is of little relevance: algorithms such as Dancing Links can solve puzzles in fractions of a second. Solving Sudoku puzzles can be expressed as a graph coloring problem. Consider the 9 × 9 = 32 × 32 case. The aim of the puzzle in its standard form is to construct a proper 9-coloring of a particular graph, given a partial 9-coloring.

The graph in question has 81 vertices, one vertex for each cell of the grid. The vertices can be labeled with the ordered pairs (x, y), where x and y are integers between 1 and 9. In this case, two distinct vertices labeled by (x, y) and (x′, y′) are joined by an edge if and only if: • x = x′ (same column) or, • y = y′ (same row) or, • ⌈ x/3 ⌉ = ⌈ x′/3 ⌉ and ⌈ y/3 ⌉ = ⌈ y′/3 ⌉ (same 3 × 3 cell) The puzzle is then completed by assigning an integer between 1 and 9 to each vertex, in such a way that vertices that are joined by an edge do not have the same integer assigned to them. A valid Sudoku solution grid is also a Latin square.

There are significantly fewer valid Sudoku solution grids than Latin squares because Sudoku imposes the additional regional constraint. Nonetheless, the number of valid Sudoku solution grids for the standard 9×9 grid was calculated by Bertram Felgenhauer and Frazer Jarvis in 2005 to be 6,670,903,752,021,072,936,960 (sequence A107739 [12] in OEIS). This number is equal to 9! × 722 × 27 × 27,704,267,971, the last factor of which is prime. The result was derived through logic and brute force computation. Russell and Jarvis also showed that when symmetries were taken into account, there were 5,472,730,538 solutions (sequence A109741 in OEIS).

The number of valid Sudoku solution grids for the 16×16 derivation is not known. The maximum number of givens that can be provided while still not rendering the solution unique, regardless of variation, is four short of a full grid; if two instances of two numbers each are missing and the cells they are to occupy are the corners of an orthogonal rectangle, and exactly two of these cells are within one region, there are two ways the numbers can be added.

The inverse of this—the fewest givens that render a solution unique—is an unsolved problem, although the lowest number yet found for the standard variation without a symmetry constraint is 17, a number of which have been found by Japanese puzzle enthusiasts , and 18 with the givens in rotation symmetric cells. Sudoku from group tables As in the case of Latin squares the (addition- or) multiplication tables (Cayley tables) of finite groups can be used to construct Sudoku and related tables of numbers. Namely, one has to take subgroups and quotient groups into account: Take for example the group of pairs, adding each component separately modulo some . By omitting one of the components, we suddenly find ourselves in (and this mapping is obviously compatible with the respective additions, i.e. it is a group). One also says that the latter is a quotient group of the former, because some once different elements become equal in the new group. However, it is also a subgroup, because we can simply fill the missing component with to get back to .