Como encontrar uma lista de possíveis palavras a partir de uma matriz de letras

Ultimamente tenho jogado um jogo no meu iPhone chamado Scramble. Alguns de vocês podem conhecer esse jogo como Boggle. Essencialmente, quando o jogo começa, você recebe uma matriz de letras assim:

FXIE AMLO EWBX ASTU 

O objective do jogo é encontrar o máximo de palavras que você puder formar, unindo letras. Você pode começar com qualquer letra, e todas as letras que a cercam são justas, e depois que você passar para a próxima letra, todas as letras que cercam essa letra são válidas, exceto por quaisquer letras usadas anteriormente . Então, na grade acima, por exemplo, eu poderia usar as palavras LOB , TUX , SEA , FAME , etc. As palavras devem ter pelo menos 3 caracteres, e não mais que NxN, que seriam 16 neste jogo, mas podem variam em algumas implementações. Embora este jogo seja divertido e viciante, aparentemente eu não sou muito bom nisso e queria trapacear um pouco fazendo um programa que me desse as melhores palavras possíveis (quanto maior a palavra, mais pontos você ganha).

Amostra de Boggle http://sofpt.miximages.com/puzzle/sample.gif

Eu não sou, infelizmente, muito bom com algoritmos ou suas eficiências e assim por diante. Minha primeira tentativa usa um dictionary como este (~ 2,3MB) e faz uma pesquisa linear tentando combinar combinações com inputs de dictionary. Isso leva muito tempo para encontrar as palavras possíveis, e como você só recebe 2 minutos por rodada, isso simplesmente não é adequado.

Estou interessado em ver se algum Stackoverflowers pode apresentar soluções mais eficientes. Eu estou procurando principalmente soluções usando o Big 3 Ps: Python, PHP e Perl, embora qualquer coisa com Java ou C ++ seja legal também, já que a velocidade é essencial.

SOLUÇÕES ATUAIS :

  • Adam Rosenfield, Python, ~ 20s
  • John Fouhy, Python, ~ 3s
  • Kent Fredric, Perl, ~ 1s
  • Darius Bacon, Python, ~ 1s
  • rvarcher, vb.net (live link) , ~ 1s
  • Paolo Bergantino, PHP (link ao vivo) , ~ 5s (~ 2s localmente)

BOUNTY :

Eu estou adicionando uma recompensa a esta pergunta como a minha maneira de agradecer a todas as pessoas que participaram de seus programas. Infelizmente eu só posso dar a resposta aceita para um de vocês, então eu vou medir quem tem o mais rápido solucionador de boggle daqui a 7 dias e premiar o vencedor com a recompensa.

Recompensa concedida. Obrigado a todos que participaram.

Minha resposta funciona como as outras aqui, mas vou postar porque parece um pouco mais rápido que as outras soluções Python, a partir da configuração do dictionary mais rápido. (Verifiquei isso com a solução de John Fouhy.) Depois da configuração, o tempo para resolver é baixo no ruído.

 grid = "fxie amlo ewbx astu".split() nrows, ncols = len(grid), len(grid[0]) # A dictionary word that could be a solution must use only the grid's # letters and have length >= 3. (With a case-insensitive match.) import re alphabet = ''.join(set(''.join(grid))) bogglable = re.compile('[' + alphabet + ']{3,}$', re.I).match words = set(word.rstrip('\n') for word in open('words') if bogglable(word)) prefixes = set(word[:i] for word in words for i in range(2, len(word)+1)) def solve(): for y, row in enumerate(grid): for x, letter in enumerate(row): for result in extending(letter, ((x, y),)): yield result def extending(prefix, path): if prefix in words: yield (prefix, path) for (nx, ny) in neighbors(path[-1]): if (nx, ny) not in path: prefix1 = prefix + grid[ny][nx] if prefix1 in prefixes: for result in extending(prefix1, path + ((nx, ny),)): yield result def neighbors((x, y)): for nx in range(max(0, x-1), min(x+2, ncols)): for ny in range(max(0, y-1), min(y+2, nrows)): yield (nx, ny) 

Uso da amostra:

 # Print a maximal-length word and its path: print max(solve(), key=lambda (word, path): len(word)) 

Editar: filtre as palavras com menos de três letras.

Edit 2: Fiquei curioso porque a solução Perl de Kent Fredric foi mais rápida; Acontece que usa correspondência de expressão regular em vez de um conjunto de caracteres. Fazendo o mesmo em Python sobre dobra a velocidade.

A solução mais rápida que você irá obter provavelmente envolverá armazenar seu dictionary em um trio . Em seguida, crie uma fila de trios ( x , y , s ), em que cada elemento da fila corresponde a um prefixo s de uma palavra que pode ser grafada na grade, terminando na localização ( x , y ). Inicialize a fila com N x N elementos (onde N é o tamanho da sua grade), um elemento para cada quadrado na grade. Então, o algoritmo procede da seguinte forma:

 Enquanto a fila não está vazia:
   Dequeue um triplo (x, y, s)
   Para cada quadrado (x ', y') com a letra c adjacente a (x, y):
     Se s + c é uma palavra, saída s + c
     Se s + c é um prefixo de uma palavra, insira (x ', y', s + c) na fila

Se você armazenar seu dictionary em um trio, testar se s + c é uma palavra ou um prefixo de uma palavra pode ser feito em tempo constante (desde que você também mantenha alguns metadados extras em cada datum de fila, como um ponteiro para o nó atual no trie), então o tempo de execução deste algoritmo é O (número de palavras que podem ser escritas).

[Edit] Aqui está uma implementação em Python que acabei de codificar:

 #!/usr/bin/python class TrieNode: def __init__(self, parent, value): self.parent = parent self.children = [None] * 26 self.isWord = False if parent is not None: parent.children[ord(value) - 97] = self def MakeTrie(dictfile): dict = open(dictfile) root = TrieNode(None, '') for word in dict: curNode = root for letter in word.lower(): if 97 < = ord(letter) < 123: nextNode = curNode.children[ord(letter) - 97] if nextNode is None: nextNode = TrieNode(curNode, letter) curNode = nextNode curNode.isWord = True return root def BoggleWords(grid, dict): rows = len(grid) cols = len(grid[0]) queue = [] words = [] for y in range(cols): for x in range(rows): c = grid[y][x] node = dict.children[ord(c) - 97] if node is not None: queue.append((x, y, c, node)) while queue: x, y, s, node = queue[0] del queue[0] for dx, dy in ((1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)): x2, y2 = x + dx, y + dy if 0 <= x2 < cols and 0 <= y2 < rows: s2 = s + grid[y2][x2] node2 = node.children[ord(grid[y2][x2]) - 97] if node2 is not None: if node2.isWord: words.append(s2) queue.append((x2, y2, s2, node2)) return words 

Exemplo de uso:

 d = MakeTrie('/usr/share/dict/words') print(BoggleWords(['fxie','amlo','ewbx','astu'], d)) 

Saída:

['fa', 'xi', 'ie', 'io', 'el', 'sou', 'ax', 'ae', 'aw', 'mi', 'ma', 'eu', ' lo ',' li ',' oe ',' boi ',' em ',' ea ',' ea ',' es ',' wa ',' nós ',' wa ',' bo ',' bu ' , 'as', 'aw', 'ae', 'st', 'se', 'sa', 'tu', 'ut', 'fam', 'fae', 'imi', 'eli', ' elm ',' elb ',' ami ',' ama ',' ame ',' aes ',' awl ',' awa ',' admiração ',' awa ',' mix ',' mim ',' mil ' , 'mam', 'max', 'mae', 'maw', 'mew', 'mem', 'mes', 'lob', 'lox', 'lei', 'leo', 'mentira', ' lim ',' óleo ',' olm ',' ewe ',' eme ',' cera ',' waf ',' wae ',' waw ',' wem ',' wea ',' wea ',' was ' , 'waw', 'wae', 'bob', 'blo', 'bub', 'mas', 'ast', 'ase', 'asa', 'awl', 'awa', 'admiração', ' awa ',' aes ',' swa ',' swa ',' costurar ',' mar ',' mar ',' serra ',' tux ',' banheira ',' tut ',' twa ',' twa ' , 'tst', 'utu', 'fama', 'fama', 'ixil', 'imã', 'amli', 'amil', 'ambo', 'axil', 'axle', 'mimi', ' mima, mímica, milo, milha, mewl, mese, mesa, lolo, lobo, lima, lima, limb, lile , 'oime', 'oleo', 'olio', 'oboé', 'obol', 'emim', 'emil', 'east', 'ease', 'wame', 'wawa', 'wawa', ' weam ',' west ',' wese ',' wast ',' wase ' , 'wawa', 'wawa', 'fervura', 'bolo', 'bole', 'bobo', 'blob', 'bleo', 'bubão', 'asem', 'stub', 'stut', ' swam ',' semi ',' seme ',' costura ',' seax ',' sasa ',' sawt ',' tutu ',' tuts ',' twae ',' twas ',' twae ',' ilima ' , 'amble', 'axile', 'awest', 'mamie', 'mambo', 'maxim', 'mease', 'mesem', 'limax', 'limes', 'limbo', 'limbu', ' obole ',' emesa ',' embox ',' awest ',' swami ',' famble ',' mimble ',' maxima ',' embolo ',' embole ',' wamble ',' semese ',' semble ' 'sawbwa', 'sawbwa']

Notas: Este programa não produz palavras de 1 letra, nem filtra por comprimento de palavra. Isso é fácil de adicionar, mas não é realmente relevante para o problema. Também produz algumas palavras várias vezes, se elas puderem ser escritas de várias maneiras. Se uma palavra pode ser escrita de muitas maneiras diferentes (pior caso: cada letra na grade é a mesma (eg 'A') e uma palavra como 'aaaaaaaaaa' está no seu dictionary), então o tempo de execução será horrivelmente exponencial . A filtragem de duplicatas e a sorting são triviais para o prazo final após o término do algoritmo.

Para uma aceleração de dictionary, há uma transformação / processo geral que você pode fazer para reduzir bastante as comparações de dictionary antes do tempo.

Dado que a grade acima contém apenas 16 caracteres, alguns deles duplicados, você pode reduzir bastante o número total de chaves em seu dictionary simplesmente filtrando as inputs que possuem caracteres inatingíveis.

Eu pensei que esta era a otimização óbvia, mas vendo ninguém fez isso eu estou mencionando isso.

Isso me reduziu de um dictionary de 200.000 chaves para apenas 2.000 chaves simplesmente durante a passagem de input. Isso, no mínimo, reduz a sobrecarga de memory, e isso com certeza irá mapear para um aumento de velocidade em algum lugar, já que a memory não é infinitamente rápida.

Implementação de Perl

Minha implementação é um pouco pesada, porque eu coloquei importância em poder saber o caminho exato de cada string extraída, não apenas a validade nela.

Eu também tenho algumas adaptações lá que teoricamente permitiriam que uma grade com buracos funcionasse, e grades com linhas de tamanhos diferentes (assumindo que você obtém a input correta e alinha de alguma forma).

O filtro antecipado é de longe o gargalo mais significativo em meu aplicativo, como suspeito anteriormente, comentando que a linha incha de 1.5s para 7.5s.

Após a execução, parece que todos os dígitos individuais estão em suas próprias palavras válidas, mas tenho certeza que isso ocorre devido ao funcionamento do arquivo de dictionary.

É um pouco inchado, mas pelo menos eu reutilizo Tree :: Trie de cpan

Parte disso foi inspirada parcialmente pelas implementações existentes, algumas das quais eu já tinha em mente.

Críticas construtivas e formas de melhorar o processo de boas-vindas (/ me observa que ele nunca procurou CPAN por um solucionador de boggle , mas isso foi mais divertido de resolver)

atualizado para novos critérios

 #!/usr/bin/perl use strict; use warnings; { # this package manages a given path through the grid. # Its an array of matrix-nodes in-order with # Convenience functions for pretty-printing the paths # and for extending paths as new paths. # Usage: # my $p = Prefix->new(path=>[ $startnode ]); # my $c = $p->child( $extensionNode ); # print $c->current_word ; package Prefix; use Moose; has path => ( isa => 'ArrayRef[MatrixNode]', is => 'rw', default => sub { [] }, ); has current_word => ( isa => 'Str', is => 'rw', lazy_build => 1, ); # Create a clone of this object # with a longer path # $o->child( $successive-node-on-graph ); sub child { my $self = shift; my $newNode = shift; my $f = Prefix->new(); # Have to do this manually or other recorded paths get modified push @{ $f->{path} }, @{ $self->{path} }, $newNode; return $f; } # Traverses $o->path left-to-right to get the string it represents. sub _build_current_word { my $self = shift; return join q{}, map { $_->{value} } @{ $self->{path} }; } # Returns the rightmost node on this path sub tail { my $self = shift; return $self->{path}->[-1]; } # pretty-format $o->path sub pp_path { my $self = shift; my @path = map { '[' . $_->{x_position} . ',' . $_->{y_position} . ']' } @{ $self->{path} }; return "[" . join( ",", @path ) . "]"; } # pretty-format $o sub pp { my $self = shift; return $self->current_word . ' => ' . $self->pp_path; } __PACKAGE__->meta->make_immutable; } { # Basic package for tracking node data # without having to look on the grid. # I could have just used an array or a hash, but that got ugly. # Once the matrix is up and running it doesn't really care so much about rows/columns, # Its just a sea of points and each point has adjacent points. # Relative positioning is only really useful to map it back to userspace package MatrixNode; use Moose; has x_position => ( isa => 'Int', is => 'rw', required => 1 ); has y_position => ( isa => 'Int', is => 'rw', required => 1 ); has value => ( isa => 'Str', is => 'rw', required => 1 ); has siblings => ( isa => 'ArrayRef[MatrixNode]', is => 'rw', default => sub { [] } ); # Its not implicitly uni-directional joins. It would be more effient in therory # to make the link go both ways at the same time, but thats too hard to program around. # and besides, this isn't slow enough to bother caring about. sub add_sibling { my $self = shift; my $sibling = shift; push @{ $self->siblings }, $sibling; } # Convenience method to derive a path starting at this node sub to_path { my $self = shift; return Prefix->new( path => [$self] ); } __PACKAGE__->meta->make_immutable; } { package Matrix; use Moose; has rows => ( isa => 'ArrayRef', is => 'rw', default => sub { [] }, ); has regex => ( isa => 'Regexp', is => 'rw', lazy_build => 1, ); has cells => ( isa => 'ArrayRef', is => 'rw', lazy_build => 1, ); sub add_row { my $self = shift; push @{ $self->rows }, [@_]; } # Most of these functions from here down are just builder functions, # or utilities to help build things. # Some just broken out to make it easier for me to process. # All thats really useful is add_row # The rest will generally be computed, stored, and ready to go # from ->cells by the time either ->cells or ->regex are called. # traverse all cells and make a regex that covers them. sub _build_regex { my $self = shift; my $chars = q{}; for my $cell ( @{ $self->cells } ) { $chars .= $cell->value(); } $chars = "[^$chars]"; return qr/$chars/i; } # convert a plain cell ( ie: [x][y] = 0 ) # to an intelligent cell ie: [x][y] = object( x, y ) # we only really keep them in this format temporarily # so we can go through and tie in neighbouring information. # after the neigbouring is done, the grid should be considered inoperative. sub _convert { my $self = shift; my $x = shift; my $y = shift; my $v = $self->_read( $x, $y ); my $n = MatrixNode->new( x_position => $x, y_position => $y, value => $v, ); $self->_write( $x, $y, $n ); return $n; } # go through the rows/collums presently available and freeze them into objects. sub _build_cells { my $self = shift; my @out = (); my @rows = @{ $self->{rows} }; for my $x ( 0 .. $#rows ) { next unless defined $self->{rows}->[$x]; my @col = @{ $self->{rows}->[$x] }; for my $y ( 0 .. $#col ) { next unless defined $self->{rows}->[$x]->[$y]; push @out, $self->_convert( $x, $y ); } } for my $c (@out) { for my $n ( $self->_neighbours( $c->x_position, $c->y_position ) ) { $c->add_sibling( $self->{rows}->[ $n->[0] ]->[ $n->[1] ] ); } } return \@out; } # given x,y , return array of points that refer to valid neighbours. sub _neighbours { my $self = shift; my $x = shift; my $y = shift; my @out = (); for my $sx ( -1, 0, 1 ) { next if $sx + $x < 0; next if not defined $self->{rows}->[ $sx + $x ]; for my $sy ( -1, 0, 1 ) { next if $sx == 0 && $sy == 0; next if $sy + $y < 0; next if not defined $self->{rows}->[ $sx + $x ]->[ $sy + $y ]; push @out, [ $sx + $x, $sy + $y ]; } } return @out; } sub _has_row { my $self = shift; my $x = shift; return defined $self->{rows}->[$x]; } sub _has_cell { my $self = shift; my $x = shift; my $y = shift; return defined $self->{rows}->[$x]->[$y]; } sub _read { my $self = shift; my $x = shift; my $y = shift; return $self->{rows}->[$x]->[$y]; } sub _write { my $self = shift; my $x = shift; my $y = shift; my $v = shift; $self->{rows}->[$x]->[$y] = $v; return $v; } __PACKAGE__->meta->make_immutable; } use Tree::Trie; sub readDict { my $fn = shift; my $re = shift; my $d = Tree::Trie->new(); # Dictionary Loading open my $fh, '< ', $fn; while ( my $line = <$fh> ) { chomp($line); # Commenting the next line makes it go from 1.5 seconds to 7.5 seconds. EPIC. next if $line =~ $re; # Early Filter $d->add( uc($line) ); } return $d; } sub traverseGraph { my $d = shift; my $m = shift; my $min = shift; my $max = shift; my @words = (); # Inject all grid nodes into the processing queue. my @queue = grep { $d->lookup( $_->current_word ) } map { $_->to_path } @{ $m->cells }; while (@queue) { my $item = shift @queue; # put the dictionary into "exact match" mode. $d->deepsearch('exact'); my $cword = $item->current_word; my $l = length($cword); if ( $l >= $min && $d->lookup($cword) ) { push @words, $item; # push current path into "words" if it exactly matches. } next if $l > $max; # put the dictionary into "is-a-prefix" mode. $d->deepsearch('boolean'); siblingloop: foreach my $sibling ( @{ $item->tail->siblings } ) { foreach my $visited ( @{ $item->{path} } ) { next siblingloop if $sibling == $visited; } # given path y , iterate for all its end points my $subpath = $item->child($sibling); # create a new path for each end-point if ( $d->lookup( $subpath->current_word ) ) { # if the new path is a prefix, add it to the bottom of the queue. push @queue, $subpath; } } } return \@words; } sub setup_predetermined { my $m = shift; my $gameNo = shift; if( $gameNo == 0 ){ $m->add_row(qw( FXIE )); $m->add_row(qw( AMLO )); $m->add_row(qw( EWBX )); $m->add_row(qw( ASTU )); return $m; } if( $gameNo == 1 ){ $m->add_row(qw( DGHI )); $m->add_row(qw( KLPS )); $m->add_row(qw( YEUT )); $m->add_row(qw( EORN )); return $m; } } sub setup_random { my $m = shift; my $seed = shift; srand $seed; my @letters = 'A' .. 'Z' ; for( 1 .. 4 ){ my @r = (); for( 1 .. 4 ){ push @r , $letters[int(rand(25))]; } $m->add_row( @r ); } } # Here is where the real work starts. my $m = Matrix->new(); setup_predetermined( $m, 0 ); #setup_random( $m, 5 ); my $d = readDict( 'dict.txt', $m->regex ); my $c = scalar @{ $m->cells }; # get the max, as per spec print join ",\n", map { $_->pp } @{ traverseGraph( $d, $m, 3, $c ) ; }; 

Arch / informações de execução para comparação:

 model name : Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz cache size : 6144 KB Memory usage summary: heap total: 77057577, heap peak: 11446200, stack peak: 26448 total calls total memory failed calls malloc| 947212 68763684 0 realloc| 11191 1045641 0 (nomove:9063, dec:4731, free:0) calloc| 121001 7248252 0 free| 973159 65854762 Histogram for block sizes: 0-15 392633 36% ================================================== 16-31 43530 4% ===== 32-47 50048 4% ====== 48-63 70701 6% ========= 64-79 18831 1% == 80-95 19271 1% == 96-111 238398 22% ============================== 112-127 3007 <1% 128-143 236727 21% ============================== 

Mais resmungos sobre essa otimização Regex

A otimização de regex que eu uso é inútil para dictionarys de multi-resolução, e para multi-resolução você vai querer um dictionary completo, não um pré-aparado.

No entanto, dito isto, para soluções únicas, é realmente rápido. (Perl regex estão em C! :))

Aqui estão algumas adições de código variadas:

 sub readDict_nofilter { my $fn = shift; my $re = shift; my $d = Tree::Trie->new(); # Dictionary Loading open my $fh, '< ', $fn; while ( my $line = <$fh> ) { chomp($line); $d->add( uc($line) ); } return $d; } sub benchmark_io { use Benchmark qw( cmpthese :hireswallclock ); # generate a random 16 character string # to simulate there being an input grid. my $regexen = sub { my @letters = 'A' .. 'Z' ; my @lo = (); for( 1..16 ){ push @lo , $_ ; } my $c = join '', @lo; $c = "[^$c]"; return qr/$c/i; }; cmpthese( 200 , { filtered => sub { readDict('dict.txt', $regexen->() ); }, unfiltered => sub { readDict_nofilter('dict.txt'); } }); } 
            s / iter filtrado não filtrado
 não filtrada 8,16 - -94%
 filtrada 0,464 1658% -

ps: 8,16 * 200 = 27 minutos.

Você pode dividir o problema em duas partes:

  1. Algum tipo de algoritmo de busca que irá enumerar possíveis strings na grade.
  2. Uma maneira de testar se uma string é uma palavra válida.

Idealmente, (2) também deve include uma maneira de testar se uma string é um prefixo de uma palavra válida – isso permitirá que você apague sua pesquisa e economize um monte de tempo.

Trie de Adam Rosenfield é uma solução para (2). É elegante e provavelmente o que seu especialista em algoritmos preferiria, mas com linguagens modernas e computadores modernos, podemos ser um pouco mais preguiçosos. Além disso, como Kent sugere, podemos reduzir o tamanho do nosso dictionary descartando palavras que não possuem letras na grade. Aqui está um pouco de python:

 def make_lookups(grid, fn='dict.txt'): # Make set of valid characters. chars = set() for word in grid: chars.update(word) words = set(x.strip() for x in open(fn) if set(x.strip()) < = chars) prefixes = set() for w in words: for i in range(len(w)+1): prefixes.add(w[:i]) return words, prefixes 

Uau; teste de prefixo em tempo constante. Demora alguns segundos para carregar o dictionary que você ligou, mas apenas um par 🙂 (note que as words < = prefixes )

Agora, para a parte (1), estou inclinado a pensar em termos de charts. Então eu vou construir um dictionary que se pareça com isso:

 graph = { (x, y):set([(x0,y0), (x1,y1), (x2,y2)]), } 

ie graph[(x, y)] é o conjunto de coordenadas que você pode alcançar a partir da posição (x, y) . Também adicionarei um nó fictício None que se conectará a tudo.

Construir é um pouco desajeitado, porque há 8 posições possíveis e você tem que fazer a verificação de limites. Aqui está um código python correspondentemente desajeitado:

 def make_graph(grid): root = None graph = { root:set() } chardict = { root:'' } for i, row in enumerate(grid): for j, char in enumerate(row): chardict[(i, j)] = char node = (i, j) children = set() graph[node] = children graph[root].add(node) add_children(node, children, grid) return graph, chardict def add_children(node, children, grid): x0, y0 = node for i in [-1,0,1]: x = x0 + i if not (0 < = x < len(grid)): continue for j in [-1,0,1]: y = y0 + j if not (0 <= y < len(grid[0])) or (i == j == 0): continue children.add((x,y)) 

Esse código também cria um mapeamento de dictionary (x,y) para o caractere correspondente. Isso me permite transformar uma lista de posições em uma palavra:

 def to_word(chardict, pos_list): return ''.join(chardict[x] for x in pos_list) 

Finalmente, fazemos uma pesquisa em profundidade. O procedimento básico é:

  1. A pesquisa chega em um determinado nó.
  2. Verifique se o caminho até agora pode fazer parte de uma palavra. Se não, não explore mais esse ramo.
  3. Verifique se o caminho até agora é uma palavra. Em caso afirmativo, adicione à lista de resultados.
  4. Explore todas as crianças que não fazem parte do caminho até agora.

Python:

 def find_words(graph, chardict, position, prefix, results, words, prefixes): """ Arguments: graph :: mapping (x,y) to set of reachable positions chardict :: mapping (x,y) to character position :: current position (x,y) -- equals prefix[-1] prefix :: list of positions in current string results :: set of words found words :: set of valid words in the dictionary prefixes :: set of valid words or prefixes thereof """ word = to_word(chardict, prefix) if word not in prefixes: return if word in words: results.add(word) for child in graph[position]: if child not in prefix: find_words(graph, chardict, child, prefix+[child], results, words, prefixes) 

Execute o código como:

 grid = ['fxie', 'amlo', 'ewbx', 'astu'] g, c = make_graph(grid) w, p = make_lookups(grid) res = set() find_words(g, c, None, [], res, w, p) 

e inspecione res para ver as respostas. Aqui está uma lista de palavras encontradas para o seu exemplo, classificadas por tamanho:

  ['a', 'b', 'e', 'f', 'i', 'l', 'm', 'o', 's', 't', 'u', 'w', 'x', 'ae', 'am', 'as', 'aw', 'ax', 'bo', 'bu', 'ea', 'el', 'em', 'es', 'fa', 'ie', 'io', 'li', 'lo', 'ma', 'me', 'mi', 'oe', 'ox', 'sa', 'se', 'st', 'tu', 'ut', 'wa', 'we', 'xi', 'aes', 'ame', 'ami', 'ase', 'ast', 'awa', 'awe', 'awl', 'blo', 'but', 'elb', 'elm', 'fae', 'fam', 'lei', 'lie', 'lim', 'lob', 'lox', 'mae', 'maw', 'mew', 'mil', 'mix', 'oil', 'olm', 'saw', 'sea', 'sew', 'swa', 'tub', 'tux', 'twa', 'wae', 'was', 'wax', 'wem', 'ambo', 'amil', 'amli', 'asem', 'axil', 'axle', 'bleo', 'boil', 'bole', 'east', 'fame', 'limb', 'lime', 'mesa', 'mewl', 'mile', 'milo', 'oime', 'sawt', 'seam', 'seax', 'semi', 'stub', 'swam', 'twae', 'twas', 'wame', 'wase', 'wast', 'weam', 'west', 'amble', 'awest', 'axile', 'embox', 'limbo', 'limes', 'swami', 'embole', 'famble', 'semble', 'wamble'] 

O código leva (literalmente) alguns segundos para carregar o dictionary, mas o resto é instantâneo na minha máquina.

Minha tentativa em Java. Demora cerca de 2 s para ler o arquivo e construir trie, e cerca de 50 ms para resolver o enigma. Eu usei o dictionary ligado na pergunta (tem algumas palavras que eu não sabia existir em inglês como fae, ima)

 0 [main] INFO gineer.bogglesolver.util.Util - Reading the dictionary 2234 [main] INFO gineer.bogglesolver.util.Util - Finish reading the dictionary 2234 [main] INFO gineer.bogglesolver.Solver - Found: FAM 2234 [main] INFO gineer.bogglesolver.Solver - Found: FAME 2234 [main] INFO gineer.bogglesolver.Solver - Found: FAMBLE 2234 [main] INFO gineer.bogglesolver.Solver - Found: FAE 2234 [main] INFO gineer.bogglesolver.Solver - Found: IMA 2234 [main] INFO gineer.bogglesolver.Solver - Found: ELI 2234 [main] INFO gineer.bogglesolver.Solver - Found: ELM 2234 [main] INFO gineer.bogglesolver.Solver - Found: ELB 2234 [main] INFO gineer.bogglesolver.Solver - Found: AXIL 2234 [main] INFO gineer.bogglesolver.Solver - Found: AXILE 2234 [main] INFO gineer.bogglesolver.Solver - Found: AXLE 2234 [main] INFO gineer.bogglesolver.Solver - Found: AMI 2234 [main] INFO gineer.bogglesolver.Solver - Found: AMIL 2234 [main] INFO gineer.bogglesolver.Solver - Found: AMLI 2234 [main] INFO gineer.bogglesolver.Solver - Found: AME 2234 [main] INFO gineer.bogglesolver.Solver - Found: AMBLE 2234 [main] INFO gineer.bogglesolver.Solver - Found: AMBO 2250 [main] INFO gineer.bogglesolver.Solver - Found: AES 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWL 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWE 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWEST 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: MIX 2250 [main] INFO gineer.bogglesolver.Solver - Found: MIL 2250 [main] INFO gineer.bogglesolver.Solver - Found: MILE 2250 [main] INFO gineer.bogglesolver.Solver - Found: MILO 2250 [main] INFO gineer.bogglesolver.Solver - Found: MAX 2250 [main] INFO gineer.bogglesolver.Solver - Found: MAE 2250 [main] INFO gineer.bogglesolver.Solver - Found: MAW 2250 [main] INFO gineer.bogglesolver.Solver - Found: MEW 2250 [main] INFO gineer.bogglesolver.Solver - Found: MEWL 2250 [main] INFO gineer.bogglesolver.Solver - Found: MES 2250 [main] INFO gineer.bogglesolver.Solver - Found: MESA 2250 [main] INFO gineer.bogglesolver.Solver - Found: MWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: MWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIE 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIM 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIMA 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIMAX 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIME 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIMES 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIMB 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIMBO 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIMBU 2250 [main] INFO gineer.bogglesolver.Solver - Found: LEI 2250 [main] INFO gineer.bogglesolver.Solver - Found: LEO 2250 [main] INFO gineer.bogglesolver.Solver - Found: LOB 2250 [main] INFO gineer.bogglesolver.Solver - Found: LOX 2250 [main] INFO gineer.bogglesolver.Solver - Found: OIME 2250 [main] INFO gineer.bogglesolver.Solver - Found: OIL 2250 [main] INFO gineer.bogglesolver.Solver - Found: OLE 2250 [main] INFO gineer.bogglesolver.Solver - Found: OLM 2250 [main] INFO gineer.bogglesolver.Solver - Found: EMIL 2250 [main] INFO gineer.bogglesolver.Solver - Found: EMBOLE 2250 [main] INFO gineer.bogglesolver.Solver - Found: EMBOX 2250 [main] INFO gineer.bogglesolver.Solver - Found: EAST 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAF 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAX 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAME 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAMBLE 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAE 2250 [main] INFO gineer.bogglesolver.Solver - Found: WEA 2250 [main] INFO gineer.bogglesolver.Solver - Found: WEAM 2250 [main] INFO gineer.bogglesolver.Solver - Found: WEM 2250 [main] INFO gineer.bogglesolver.Solver - Found: WEA 2250 [main] INFO gineer.bogglesolver.Solver - Found: WES 2250 [main] INFO gineer.bogglesolver.Solver - Found: WEST 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAE 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAS 2250 [main] INFO gineer.bogglesolver.Solver - Found: WASE 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAST 2250 [main] INFO gineer.bogglesolver.Solver - Found: BLEO 2250 [main] INFO gineer.bogglesolver.Solver - Found: BLO 2250 [main] INFO gineer.bogglesolver.Solver - Found: BOIL 2250 [main] INFO gineer.bogglesolver.Solver - Found: BOLE 2250 [main] INFO gineer.bogglesolver.Solver - Found: BUT 2250 [main] INFO gineer.bogglesolver.Solver - Found: AES 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWL 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWE 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWEST 2250 [main] INFO gineer.bogglesolver.Solver - Found: ASE 2250 [main] INFO gineer.bogglesolver.Solver - Found: ASEM 2250 [main] INFO gineer.bogglesolver.Solver - Found: AST 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEA 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEAX 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEAM 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEMI 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEMBLE 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEW 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEA 2250 [main] INFO gineer.bogglesolver.Solver - Found: SWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: SWAM 2250 [main] INFO gineer.bogglesolver.Solver - Found: SWAMI 2250 [main] INFO gineer.bogglesolver.Solver - Found: SWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: SAW 2250 [main] INFO gineer.bogglesolver.Solver - Found: SAWT 2250 [main] INFO gineer.bogglesolver.Solver - Found: STU 2250 [main] INFO gineer.bogglesolver.Solver - Found: STUB 2250 [main] INFO gineer.bogglesolver.Solver - Found: TWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: TWAE 2250 [main] INFO gineer.bogglesolver.Solver - Found: TWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: TWAE 2250 [main] INFO gineer.bogglesolver.Solver - Found: TWAS 2250 [main] INFO gineer.bogglesolver.Solver - Found: TUB 2250 [main] INFO gineer.bogglesolver.Solver - Found: TUX 

Source code consists of 6 classs. I’ll post them below (if this is not the right practice on StackOverflow, please tell me).

gineer.bogglesolver.Main

 package gineer.bogglesolver; import org.apache.log4j.BasicConfigurator; import org.apache.log4j.Logger; public class Main { private final static Logger logger = Logger.getLogger(Main.class); public static void main(String[] args) { BasicConfigurator.configure(); Solver solver = new Solver(4, "FXIE" + "AMLO" + "EWBX" + "ASTU"); solver.solve(); } } 

gineer.bogglesolver.Solver

 package gineer.bogglesolver; import gineer.bogglesolver.trie.Trie; import gineer.bogglesolver.util.Constants; import gineer.bogglesolver.util.Util; import org.apache.log4j.Logger; public class Solver { private char[] puzzle; private int maxSize; private boolean[] used; private StringBuilder stringSoFar; private boolean[][] matrix; private Trie trie; private final static Logger logger = Logger.getLogger(Solver.class); public Solver(int size, String puzzle) { trie = Util.getTrie(size); matrix = Util.connectivityMatrix(size); maxSize = size * size; stringSoFar = new StringBuilder(maxSize); used = new boolean[maxSize]; if (puzzle.length() == maxSize) { this.puzzle = puzzle.toCharArray(); } else { logger.error("The puzzle size does not match the size specified: " + puzzle.length()); this.puzzle = puzzle.substring(0, maxSize).toCharArray(); } } public void solve() { for (int i = 0; i < maxSize; i++) { traverseAt(i); } } private void traverseAt(int origin) { stringSoFar.append(puzzle[origin]); used[origin] = true; //Check if we have a valid word if ((stringSoFar.length() >= Constants.MINIMUM_WORD_LENGTH) && (trie.containKey(stringSoFar.toString()))) { logger.info("Found: " + stringSoFar.toString()); } //Find where to go next for (int destination = 0; destination < maxSize; destination++) { if (matrix[origin][destination] && !used[destination] && trie.containPrefix(stringSoFar.toString() + puzzle[destination])) { traverseAt(destination); } } used[origin] = false; stringSoFar.deleteCharAt(stringSoFar.length() - 1); } } 

gineer.bogglesolver.trie.Node

 package gineer.bogglesolver.trie; import gineer.bogglesolver.util.Constants; class Node { Node[] children; boolean isKey; public Node() { isKey = false; children = new Node[Constants.NUMBER_LETTERS_IN_ALPHABET]; } public Node(boolean key) { isKey = key; children = new Node[Constants.NUMBER_LETTERS_IN_ALPHABET]; } /** Method to insert a string to Node and its children @param key the string to insert (the string is assumed to be uppercase) @return true if the node or one of its children is changed, false otherwise */ public boolean insert(String key) { //If the key is empty, this node is a key if (key.length() == 0) { if (isKey) return false; else { isKey = true; return true; } } else {//otherwise, insert in one of its child int childNodePosition = key.charAt(0) - Constants.LETTER_A; if (children[childNodePosition] == null) { children[childNodePosition] = new Node(); children[childNodePosition].insert(key.substring(1)); return true; } else { return children[childNodePosition].insert(key.substring(1)); } } } /** Returns whether key is a valid prefix for certain key in this trie. For example: if key "hello" is in this trie, tests with all prefixes "hel", "hell", "hello" return true @param prefix the prefix to check @return true if the prefix is valid, false otherwise */ public boolean containPrefix(String prefix) { //If the prefix is empty, return true if (prefix.length() == 0) { return true; } else {//otherwise, check in one of its child int childNodePosition = prefix.charAt(0) - Constants.LETTER_A; return children[childNodePosition] != null && children[childNodePosition].containPrefix(prefix.substring(1)); } } /** Returns whether key is a valid key in this trie. For example: if key "hello" is in this trie, tests with all prefixes "hel", "hell" return false @param key the key to check @return true if the key is valid, false otherwise */ public boolean containKey(String key) { //If the prefix is empty, return true if (key.length() == 0) { return isKey; } else {//otherwise, check in one of its child int childNodePosition = key.charAt(0) - Constants.LETTER_A; return children[childNodePosition] != null && children[childNodePosition].containKey(key.substring(1)); } } public boolean isKey() { return isKey; } public void setKey(boolean key) { isKey = key; } } 

gineer.bogglesolver.trie.Trie

 package gineer.bogglesolver.trie; public class Trie { Node root; public Trie() { this.root = new Node(); } /** Method to insert a string to Node and its children @param key the string to insert (the string is assumed to be uppercase) @return true if the node or one of its children is changed, false otherwise */ public boolean insert(String key) { return root.insert(key.toUpperCase()); } /** Returns whether key is a valid prefix for certain key in this trie. For example: if key "hello" is in this trie, tests with all prefixes "hel", "hell", "hello" return true @param prefix the prefix to check @return true if the prefix is valid, false otherwise */ public boolean containPrefix(String prefix) { return root.containPrefix(prefix.toUpperCase()); } /** Returns whether key is a valid key in this trie. For example: if key "hello" is in this trie, tests with all prefixes "hel", "hell" return false @param key the key to check @return true if the key is valid, false otherwise */ public boolean containKey(String key) { return root.containKey(key.toUpperCase()); } } 

gineer.bogglesolver.util.Constants

 package gineer.bogglesolver.util; public class Constants { public static final int NUMBER_LETTERS_IN_ALPHABET = 26; public static final char LETTER_A = 'A'; public static final int MINIMUM_WORD_LENGTH = 3; public static final int DEFAULT_PUZZLE_SIZE = 4; } 

gineer.bogglesolver.util.Util

 package gineer.bogglesolver.util; import gineer.bogglesolver.trie.Trie; import org.apache.log4j.Logger; import java.io.File; import java.io.FileNotFoundException; import java.util.Scanner; public class Util { private final static Logger logger = Logger.getLogger(Util.class); private static Trie trie; private static int size = Constants.DEFAULT_PUZZLE_SIZE; /** Returns the trie built from the dictionary. The size is used to eliminate words that are too long. @param size the size of puzzle. The maximum lenght of words in the returned trie is (size * size) @return the trie that can be used for puzzle of that size */ public static Trie getTrie(int size) { if ((trie != null) && size == Util.size) return trie; trie = new Trie(); Util.size = size; logger.info("Reading the dictionary"); final File file = new File("dictionary.txt"); try { Scanner scanner = new Scanner(file); final int maxSize = size * size; while (scanner.hasNext()) { String line = scanner.nextLine().replaceAll("[^\\p{Alpha}]", ""); if (line.length() < = maxSize) trie.insert(line); } } catch (FileNotFoundException e) { logger.error("Cannot open file", e); } logger.info("Finish reading the dictionary"); return trie; } static boolean[] connectivityRow(int x, int y, int size) { boolean[] squares = new boolean[size * size]; for (int offsetX = -1; offsetX <= 1; offsetX++) { for (int offsetY = -1; offsetY <= 1; offsetY++) { final int calX = x + offsetX; final int calY = y + offsetY; if ((calX >= 0) && (calX < size) && (calY >= 0) && (calY < size)) squares[calY * size + calX] = true; } } squares[y * size + x] = false;//the current x, y is false return squares; } /** Returns the matrix of connectivity between two points. Point i can go to point j iff matrix[i][j] is true Square (x, y) is equivalent to point (size * y + x). For example, square (1,1) is point 5 in a puzzle of size 4 @param size the size of the puzzle @return the connectivity matrix */ public static boolean[][] connectivityMatrix(int size) { boolean[][] matrix = new boolean[size * size][]; for (int x = 0; x < size; x++) { for (int y = 0; y < size; y++) { matrix[y * size + x] = connectivityRow(x, y, size); } } return matrix; } } 

I think you will probably spend most of your time trying to match words that can’t possibly be built by your letter grid. So, the first thing I would do is try to speed up that step and that should get you most of the way there.

For this, I would re-express the grid as a table of possible “moves” that you index by the letter-transition you are looking at.

Start by assigning each letter a number from your entire alphabet (A=0, B=1, C=2, … and so forth).

Vamos pegar este exemplo:

 hbcd eegh llkl mofp 

And for now, lets use the alphabet of the letters we have (usually you’d probably want to use the same whole alphabet every time):

  b | c | d | e | f | g | h | k | l | m | o | p ---+---+---+---+---+---+---+---+---+---+----+---- 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 

Then you make a 2D boolean array that tells whether you have a certain letter transition available:

  | 0 1 2 3 4 5 6 7 8 9 10 11 < - from letter | bcdefghklmop -----+-------------------------------------- 0 b | TTTT 1 c | TTTTT 2 d | TTT 3 e | TTTTTTT 4 f | TTTT 5 g | TTTTTTT 6 h | TTTTTTT 7 k | TTTTTTT 8 l | TTTTTTTTT 9 m | TT 10 o | TTTT 11 p | TTT ^ to letter 

Now go through your word list and convert the words to transitions:

 hello (6, 3, 8, 8, 10): 6 -> 3, 3 -> 8, 8 -> 8, 8 -> 10 

Then check if these transitions are allowed by looking them up in your table:

 [6][ 3] : T [3][ 8] : T [8][ 8] : T [8][10] : T 

If they are all allowed, there's a chance that this word might be found.

For example the word "helmet" can be ruled out on the 4th transition (m to e: helMEt), since that entry in your table is false.

And the word hamster can be ruled out, since the first (h to a) transition is not allowed (doesn't even exist in your table).

Now, for the probably very few remaining words that you didn't eliminate, try to actually find them in the grid the way you're doing it now or as suggested in some of the other answers here. This is to avoid false positives that result from jumps between identical letters in your grid. For example the word "help" is allowed by the table, but not by the grid.

Some further performance improvement tips on this idea:

  1. Instead of using a 2D array, use a 1D array and simply compute the index of the second letter yourself. So, instead of a 12x12 array like above, make a 1D array of length 144. If you then always use the same alphabet (ie a 26x26 = 676x1 array for the standard english alphabet), even if not all letters show up in your grid, you can pre-compute the indices into this 1D array that you need to test to match your dictionary words. For example, the indices for 'hello' in the example above would be

     hello (6, 3, 8, 8, 10): 42 (from 6 + 3x12), 99, 104, 128 -> "hello" will be stored as 42, 99, 104, 128 in the dictionary 
  2. Extend the idea to a 3D table (expressed as a 1D array), ie all allowed 3-letter combinations. That way you can eliminate even more words immediately and you reduce the number of array lookups for each word by 1: For 'hello', you only need 3 array lookups: hel, ell, llo. It will be very quick to build this table, by the way, as there are only 400 possible 3-letter-moves in your grid.

  3. Pre-compute the indices of the moves in your grid that you need to include in your table. For the example above, you need to set the following entries to 'True':

     (0,0) (0,1) -> here: h, b : [6][0] (0,0) (1,0) -> here: h, e : [6][3] (0,0) (1,1) -> here: h, e : [6][3] (0,1) (0,0) -> here: b, h : [0][6] (0,1) (0,2) -> here: b, c : [0][1] . : 
  4. Also represent your game grid in a 1-D array with 16 entries and have the table pre-computed in 3. contain the indices into this array.

I'm sure if you use this approach you can get your code to run insanely fast, if you have the dictionary pre-computed and already loaded into memory.

BTW: Another nice thing to do, if you are building a game, is to run these sort of things immediately in the background. Start generating and solving the first game while the user is still looking at the title screen on your app and getting his finger into position to press "Play". Then generate and solve the next game as the user plays the previous one. That should give you a lot of time to run your code.

(I like this problem, so I'll probably be tempted to implement my proposal in Java sometime in the next days to see how it would actually perform... I'll post the code here once I do.)

ATUALIZAR:

Ok, I had some time today and implemented this idea in Java:

 class DictionaryEntry { public int[] letters; public int[] triplets; } class BoggleSolver { // Constants final int ALPHABET_SIZE = 5; // up to 2^5 = 32 letters final int BOARD_SIZE = 4; // 4x4 board final int[] moves = {-BOARD_SIZE-1, -BOARD_SIZE, -BOARD_SIZE+1, -1, +1, +BOARD_SIZE-1, +BOARD_SIZE, +BOARD_SIZE+1}; // Technically constant (calculated here for flexibility, but should be fixed) DictionaryEntry[] dictionary; // Processed word list int maxWordLength = 0; int[] boardTripletIndices; // List of all 3-letter moves in board coordinates DictionaryEntry[] buildDictionary(String fileName) throws IOException { BufferedReader fileReader = new BufferedReader(new FileReader(fileName)); String word = fileReader.readLine(); ArrayList result = new ArrayList(); while (word!=null) { if (word.length()>=3) { word = word.toUpperCase(); if (word.length()>maxWordLength) maxWordLength = word.length(); DictionaryEntry entry = new DictionaryEntry(); entry.letters = new int[word.length() ]; entry.triplets = new int[word.length()-2]; int i=0; for (char letter: word.toCharArray()) { entry.letters[i] = (byte) letter - 65; // Convert ASCII to 0..25 if (i>=2) entry.triplets[i-2] = (((entry.letters[i-2] < < ALPHABET_SIZE) + entry.letters[i-1]) << ALPHABET_SIZE) + entry.letters[i]; i++; } result.add(entry); } word = fileReader.readLine(); } return result.toArray(new DictionaryEntry[result.size()]); } boolean isWrap(int a, int b) { // Checks if move a->b wraps board edge (like 3->4) return Math.abs(a%BOARD_SIZE-b%BOARD_SIZE)>1; } int[] buildTripletIndices() { ArrayList result = new ArrayList(); for (int a=0; a=0) && (b=0) && (c=0) && (next 

Here are some results:

For the grid from the picture posted in the original question (DGHI...):

 Precomputation finished in 239.59ms: Words in the dictionary: 178590 Longest word: 15 letters Number of triplet-moves: 408 Initialization finished in 0.22ms Board solved in 3.70ms: Number of candidates: 230 Number of actual words: 163 Words found: eek, eel, eely, eld, elhi, elk, ern, erupt, erupts, euro eye, eyer, ghi, ghis, glee, gley, glue, gluer, gluey, glut gluts, hip, hiply, hips, his, hist, kelp, kelps, kep, kepi kepis, keps, kept, kern, key, kye, lee, lek, lept, leu ley, lunt, lunts, lure, lush, lust, lustre, lye, nus, nut nuts, ore, ort, orts, ouph, ouphs, our, oust, out, outre outs, oyer, pee, per, pert, phi, phis, pis, pish, plus plush, ply, plyer, psi, pst, pul, pule, puler, pun, punt punts, pur, pure, puree, purely, pus, push, put, puts, ree rely, rep, reply, reps, roe, roue, roup, roups, roust, rout routs, rue, rule, ruly, run, runt, runts, rupee, rush, rust rut, ruts, ship, shlep, sip, sipe, spue, spun, spur, spurn spurt, strep, stroy, stun, stupe, sue, suer, sulk, sulker, sulky sun, sup, supe, super, sure, surely, tree, trek, trey, troupe troy, true, truly, tule, tun, tup, tups, turn, tush, ups urn, uts, yeld, yelk, yelp, yelps, yep, yeps, yore, you your, yourn, yous 

For the letters posted as the example in the original question (FXIE...)

 Precomputation finished in 239.68ms: Words in the dictionary: 178590 Longest word: 15 letters Number of triplet-moves: 408 Initialization finished in 0.21ms Board solved in 3.69ms: Number of candidates: 87 Number of actual words: 76 Words found: amble, ambo, ami, amie, asea, awa, awe, awes, awl, axil axile, axle, boil, bole, box, but, buts, east, elm, emboli fame, fames, fax, lei, lie, lima, limb, limbo, limbs, lime limes, lob, lobs, lox, mae, maes, maw, maws, max, maxi mesa, mew, mewl, mews, mil, mile, milo, mix, oil, ole sae, saw, sea, seam, semi, sew, stub, swam, swami, tub tubs, tux, twa, twae, twaes, twas, uts, wae, waes, wamble wame, wames, was, wast, wax, west 

For the following 5x5-grid:

 RPRIT AHHLN IETEP ZRYSG OGWEY 

it gives this:

 Precomputation finished in 240.39ms: Words in the dictionary: 178590 Longest word: 15 letters Number of triplet-moves: 768 Initialization finished in 0.23ms Board solved in 3.85ms: Number of candidates: 331 Number of actual words: 240 Words found: aero, aery, ahi, air, airt, airth, airts, airy, ear, egest elhi, elint, erg, ergo, ester, eth, ether, eye, eyen, eyer eyes, eyre, eyrie, gel, gelt, gelts, gen, gent, gentil, gest geste, get, gets, gey, gor, gore, gory, grey, greyest, greys gyre, gyri, gyro, hae, haet, haets, hair, hairy, hap, harp heap, hear, heh, heir, help, helps, hen, hent, hep, her hero, hes, hest, het, hetero, heth, hets, hey, hie, hilt hilts, hin, hint, hire, hit, inlet, inlets, ire, leg, leges legs, lehr, lent, les, lest, let, lethe, lets, ley, leys lin, line, lines, liney, lint, lit, neg, negs, nest, nester net, nether, nets, nil, nit, ogre, ore, orgy, ort, orts pah, pair, par, peg, pegs, peh, pelt, pelter, peltry, pelts pen, pent, pes, pest, pester, pesty, pet, peter, pets, phi philter, philtre, phiz, pht, print, pst, rah, rai, rap, raphe raphes, reap, rear, rei, ret, rete, rets, rhaphe, rhaphes, rhea ria, rile, riles, riley, rin, rye, ryes, seg, sel, sen sent, senti, set, sew, spelt, spelter, spent, splent, spline, splint split, stent, step, stey, stria, striae, sty, stye, tea, tear teg, tegs, tel, ten, tent, thae, the, their, then, these thesp, they, thin, thine, thir, thirl, til, tile, tiles, tilt tilter, tilth, tilts, tin, tine, tines, tirl, trey, treys, trog try, tye, tyer, tyes, tyre, tyro, west, wester, wry, wryest wye, wyes, wyte, wytes, yea, yeah, year, yeh, yelp, yelps yen, yep, yeps, yes, yester, yet, yew, yews, zero, zori 

For this I used the TWL06 Tournament Scrabble Word List , since the link in the original question no longer works. This file is 1.85MB, so it's a little bit shorter. And the buildDictionary function throws out all words with less than 3 letters.

Here are a couple of observations about the performance of this:

  • It's about 10 times slower than the reported performance of Victor Nicollet's OCaml implementation. Whether this is caused by the different algorithm, the shorter dictionary he used, the fact that his code is compiled and mine runs in a Java virtual machine, or the performance of our computers (mine is an Intel Q6600 @ 2.4MHz running WinXP), I don't know. But it's much faster than the results for the other implementations quoted at the end of the original question. So, whether this algorithm is superior to the trie dictionary or not, I don't know at this point.

  • The table method used in checkWordTriplets() yields a very good approximation to the actual answers. Only 1 in 3-5 words passed by it will fail the checkWords() test (See number of candidates vs. number of actual words above).

  • Something you can't see above: The checkWordTriplets() function takes about 3.65ms and is therefore fully dominant in the search process. The checkWords() function takes up pretty much the remaining 0.05-0.20 ms.

  • The execution time of the checkWordTriplets() function depends linearly on the dictionary size and is virtually independent of board size!

  • The execution time of checkWords() depends on the board size and the number of words not ruled out by checkWordTriplets() .

  • The checkWords() implementation above is the dumbest first version I came up with. It is basically not optimized at all. But compared to checkWordTriplets() it is irrelevant for the total performance of the application, so I didn't worry about it. But , if the board size gets bigger, this function will get slower and slower and will eventually start to matter. Then, it would need to be optimized as well.

  • One nice thing about this code is its flexibility:

    • You can easily change the board size: Update line 10 and the String array passed to initializeBoard() .
    • It can support larger/different alphabets and can handle things like treating 'Qu' as one letter without any performance overhead. To do this, one would need to update line 9 and the couple of places where characters are converted to numbers (currently simply by subtracting 65 from the ASCII value)

Ok, but I think by now this post is waaaay long enough. I can definitely answer any questions you might have, but let's move that to the comments.

Surprisingly, no one attempted a PHP version of this.

This is a working PHP version of John Fouhy’s Python solution.

Although I took some pointers from everyone else’s answers, this is mostly copied from John.

 $boggle = "fxie amlo ewbx astu"; $alphabet = str_split(str_replace(array("\n", " ", "\r"), "", strtolower($boggle))); $rows = array_map('trim', explode("\n", $boggle)); $dictionary = file("C:/dict.txt"); $prefixes = array(''=>''); $words = array(); $regex = '/[' . implode('', $alphabet) . ']{3,}$/S'; foreach($dictionary as $k=>$value) { $value = trim(strtolower($value)); $length = strlen($value); if(preg_match($regex, $value)) { for($x = 0; $x < $length; $x++) { $letter = substr($value, 0, $x+1); if($letter == $value) { $words[$value] = 1; } else { $prefixes[$letter] = 1; } } } } $graph = array(); $chardict = array(); $positions = array(); $c = count($rows); for($i = 0; $i < $c; $i++) { $l = strlen($rows[$i]); for($j = 0; $j < $l; $j++) { $chardict[$i.','.$j] = $rows[$i][$j]; $children = array(); $pos = array(-1,0,1); foreach($pos as $z) { $xCoord = $z + $i; if($xCoord < 0 || $xCoord >= count($rows)) { continue; } $len = strlen($rows[0]); foreach($pos as $w) { $yCoord = $j + $w; if(($yCoord < 0 || $yCoord >= $len) || ($z == 0 && $w == 0)) { continue; } $children[] = array($xCoord, $yCoord); } } $graph['None'][] = array($i, $j); $graph[$i.','.$j] = $children; } } function to_word($chardict, $prefix) { $word = array(); foreach($prefix as $v) { $word[] = $chardict[$v[0].','.$v[1]]; } return implode("", $word); } function find_words($graph, $chardict, $position, $prefix, $prefixes, &$results, $words) { $word = to_word($chardict, $prefix); if(!isset($prefixes[$word])) return false; if(isset($words[$word])) { $results[] = $word; } foreach($graph[$position] as $child) { if(!in_array($child, $prefix)) { $newprefix = $prefix; $newprefix[] = $child; find_words($graph, $chardict, $child[0].','.$child[1], $newprefix, $prefixes, $results, $words); } } } $solution = array(); find_words($graph, $chardict, 'None', array(), $prefixes, $solution); print_r($solution); 

Here is a live link if you want to try it out. Although it takes ~2s in my local machine, it takes ~5s on my webserver. In either case, it is not very fast. Still, though, it is quite hideous so I can imagine the time can be reduced significantly. Any pointers on how to accomplish that would be appreciated. PHP’s lack of tuples made the coordinates weird to work with and my inability to comprehend just what the hell is going on didn’t help at all.

EDIT : A few fixes make it take less than 1s locally.

Not interested in VB? 🙂 I couldn’t resist. I’ve solved this differently than many of the solutions presented here.

My times are:

  • Loading the dictionary and word prefixes into a hashtable: .5 to 1 seconds.
  • Finding the words: averaging under 10 milliseconds.

EDIT: Dictionary load times on the web host server are running about 1 to 1.5 seconds longer than my home computer.

I don’t know how badly the times will deteriorate with a load on the server.

I wrote my solution as a web page in .Net. myvrad.com/boggle

I’m using the dictionary referenced in the original question.

Letters are not reused in a word. Only words 3 characters or longer are found.

I’m using a hashtable of all unique word prefixes and words instead of a trie. I didn’t know about trie’s so I learned something there. The idea of creating a list of prefixes of words in addition to the complete words is what finally got my times down to a respectable number.

Read the code comments for additional details.

Aqui está o código:

 Imports System.Collections.Generic Imports System.IO Partial Class boggle_Default 'Bob Archer, 4/15/2009 'To avoid using a 2 dimensional array in VB I'm not using typical X,Y 'coordinate iteration to find paths. ' 'I have locked the code into a 4 by 4 grid laid out like so: ' abcd ' efgh ' ijkl ' mnop ' 'To find paths the code starts with a letter from a to p then 'explores the paths available around it. If a neighboring letter 'already exists in the path then we don't go there. ' 'Neighboring letters (grid points) are hard coded into 'a Generic.Dictionary below. 'Paths is a list of only valid Paths found. 'If a word prefix or word is not found the path is not 'added and extending that path is terminated. Dim Paths As New Generic.List(Of String) 'NeighborsOf. The keys are the letters a to p. 'The value is a string of letters representing neighboring letters. 'The string of neighboring letters is split and iterated later. Dim NeigborsOf As New Generic.Dictionary(Of String, String) 'BoggleLetters. The keys are mapped to the lettered grid of a to p. 'The values are what the user inputs on the page. Dim BoggleLetters As New Generic.Dictionary(Of String, String) 'Used to store last postition of path. This will be a letter 'from a to p. Dim LastPositionOfPath As String = "" 'I found a HashTable was by far faster than a Generic.Dictionary ' - about 10 times faster. This stores prefixes of words and words. 'I determined 792773 was the number of words and unique prefixes that 'will be generated from the dictionary file. This is a max number and 'the final hashtable will not have that many. Dim HashTableOfPrefixesAndWords As New Hashtable(792773) 'Stores words that are found. Dim FoundWords As New Generic.List(Of String) 'Just to validate what the user enters in the grid. Dim ErrorFoundWithSubmittedLetters As Boolean = False Public Sub BuildAndTestPathsAndFindWords(ByVal ThisPath As String) 'Word is the word correlating to the ThisPath parameter. 'This path would be a series of letters from a to p. Dim Word As String = "" 'The path is iterated through and a word based on the actual 'letters in the Boggle grid is assembled. For i As Integer = 0 To ThisPath.Length - 1 Word += Me.BoggleLetters(ThisPath.Substring(i, 1)) Next 'If my hashtable of word prefixes and words doesn't contain this Word 'Then this isn't a word and any further extension of ThisPath will not 'yield any words either. So exit sub to terminate exploring this path. If Not HashTableOfPrefixesAndWords.ContainsKey(Word) Then Exit Sub 'The value of my hashtable is a boolean representing if the key if a word (true) or 'just a prefix (false). If true and at least 3 letters long then yay! word found. If HashTableOfPrefixesAndWords(Word) AndAlso Word.Length > 2 Then Me.FoundWords.Add(Word) 'If my List of Paths doesn't contain ThisPath then add it. 'Remember only valid paths will make it this far. Paths not found 'in the HashTableOfPrefixesAndWords cause this sub to exit above. If Not Paths.Contains(ThisPath) Then Paths.Add(ThisPath) 'Examine the last letter of ThisPath. We are looking to extend the path 'to our neighboring letters if any are still available. LastPositionOfPath = ThisPath.Substring(ThisPath.Length - 1, 1) 'Loop through my list of neighboring letters (representing grid points). For Each Neighbor As String In Me.NeigborsOf(LastPositionOfPath).ToCharArray() 'If I find a neighboring grid point that I haven't already used 'in ThisPath then extend ThisPath and feed the new path into 'this recursive function. (see recursive.) If Not ThisPath.Contains(Neighbor) Then Me.BuildAndTestPathsAndFindWords(ThisPath & Neighbor) Next End Sub Protected Sub ButtonBoggle_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles ButtonBoggle.Click 'User has entered the 16 letters and clicked the go button. 'Set up my Generic.Dictionary of grid points, I'm using letters a to p - 'not an x,y grid system. The values are neighboring points. NeigborsOf.Add("a", "bfe") NeigborsOf.Add("b", "cgfea") NeigborsOf.Add("c", "dhgfb") NeigborsOf.Add("d", "hgc") NeigborsOf.Add("e", "abfji") NeigborsOf.Add("f", "abcgkjie") NeigborsOf.Add("g", "bcdhlkjf") NeigborsOf.Add("h", "cdlkg") NeigborsOf.Add("i", "efjnm") NeigborsOf.Add("j", "efgkonmi") NeigborsOf.Add("k", "fghlponj") NeigborsOf.Add("l", "ghpok") NeigborsOf.Add("m", "ijn") NeigborsOf.Add("n", "ijkom") NeigborsOf.Add("o", "jklpn") NeigborsOf.Add("p", "klo") 'Retrieve letters the user entered. BoggleLetters.Add("a", Me.TextBox1.Text.ToLower.Trim()) BoggleLetters.Add("b", Me.TextBox2.Text.ToLower.Trim()) BoggleLetters.Add("c", Me.TextBox3.Text.ToLower.Trim()) BoggleLetters.Add("d", Me.TextBox4.Text.ToLower.Trim()) BoggleLetters.Add("e", Me.TextBox5.Text.ToLower.Trim()) BoggleLetters.Add("f", Me.TextBox6.Text.ToLower.Trim()) BoggleLetters.Add("g", Me.TextBox7.Text.ToLower.Trim()) BoggleLetters.Add("h", Me.TextBox8.Text.ToLower.Trim()) BoggleLetters.Add("i", Me.TextBox9.Text.ToLower.Trim()) BoggleLetters.Add("j", Me.TextBox10.Text.ToLower.Trim()) BoggleLetters.Add("k", Me.TextBox11.Text.ToLower.Trim()) BoggleLetters.Add("l", Me.TextBox12.Text.ToLower.Trim()) BoggleLetters.Add("m", Me.TextBox13.Text.ToLower.Trim()) BoggleLetters.Add("n", Me.TextBox14.Text.ToLower.Trim()) BoggleLetters.Add("o", Me.TextBox15.Text.ToLower.Trim()) BoggleLetters.Add("p", Me.TextBox16.Text.ToLower.Trim()) 'Validate user entered something with a length of 1 for all 16 textboxes. For Each S As String In BoggleLetters.Keys If BoggleLetters(S).Length <> 1 Then ErrorFoundWithSubmittedLetters = True Exit For End If Next 'If input is not valid then... If ErrorFoundWithSubmittedLetters Then 'Present error message. Else 'Else assume we have 16 letters to work with and start finding words. Dim SB As New StringBuilder Dim Time As String = String.Format("{0}:{1}:{2}:{3}", Date.Now.Hour.ToString(), Date.Now.Minute.ToString(), Date.Now.Second.ToString(), Date.Now.Millisecond.ToString()) Dim NumOfLetters As Integer = 0 Dim Word As String = "" Dim TempWord As String = "" Dim Letter As String = "" Dim fr As StreamReader = Nothing fr = New System.IO.StreamReader(HttpContext.Current.Request.MapPath("~/boggle/dic.txt")) 'First fill my hashtable with word prefixes and words. 'HashTable(PrefixOrWordString, BooleanTrueIfWordFalseIfPrefix) While fr.Peek <> -1 Word = fr.ReadLine.Trim() TempWord = "" For i As Integer = 0 To Word.Length - 1 Letter = Word.Substring(i, 1) 'This optimization helped quite a bit. Words in the dictionary that begin 'with letters that the user did not enter in the grid shouldn't go in my hashtable. ' 'I realize most of the solutions went with a Trie. I'd never heard of that before, 'which is one of the neat things about SO, seeing how others approach challenges 'and learning some best practices. ' 'However, I didn't code a Trie in my solution. I just have a hashtable with 'all words in the dicitonary file and all possible prefixes for those words. 'A Trie might be faster but I'm not coding it now. I'm getting good times with this. If i = 0 AndAlso Not BoggleLetters.ContainsValue(Letter) Then Continue While TempWord += Letter If Not HashTableOfPrefixesAndWords.ContainsKey(TempWord) Then HashTableOfPrefixesAndWords.Add(TempWord, TempWord = Word) End If Next End While SB.Append("Number of Word Prefixes and Words in Hashtable: " & HashTableOfPrefixesAndWords.Count.ToString()) SB.Append("
") SB.Append("Loading Dictionary: " & Time & " - " & String.Format("{0}:{1}:{2}:{3}", Date.Now.Hour.ToString(), Date.Now.Minute.ToString(), Date.Now.Second.ToString(), Date.Now.Millisecond.ToString())) SB.Append("
") Time = String.Format("{0}:{1}:{2}:{3}", Date.Now.Hour.ToString(), Date.Now.Minute.ToString(), Date.Now.Second.ToString(), Date.Now.Millisecond.ToString()) 'This starts a path at each point on the grid an builds a path until 'the string of letters correlating to the path is not found in the hashtable 'of word prefixes and words. Me.BuildAndTestPathsAndFindWords("a") Me.BuildAndTestPathsAndFindWords("b") Me.BuildAndTestPathsAndFindWords("c") Me.BuildAndTestPathsAndFindWords("d") Me.BuildAndTestPathsAndFindWords("e") Me.BuildAndTestPathsAndFindWords("f") Me.BuildAndTestPathsAndFindWords("g") Me.BuildAndTestPathsAndFindWords("h") Me.BuildAndTestPathsAndFindWords("i") Me.BuildAndTestPathsAndFindWords("j") Me.BuildAndTestPathsAndFindWords("k") Me.BuildAndTestPathsAndFindWords("l") Me.BuildAndTestPathsAndFindWords("m") Me.BuildAndTestPathsAndFindWords("n") Me.BuildAndTestPathsAndFindWords("o") Me.BuildAndTestPathsAndFindWords("p") SB.Append("Finding Words: " & Time & " - " & String.Format("{0}:{1}:{2}:{3}", Date.Now.Hour.ToString(), Date.Now.Minute.ToString(), Date.Now.Second.ToString(), Date.Now.Millisecond.ToString())) SB.Append("
") SB.Append("Num of words found: " & FoundWords.Count.ToString()) SB.Append("
") SB.Append("
") FoundWords.Sort() SB.Append(String.Join("
", FoundWords.ToArray())) 'Output results. Me.LiteralBoggleResults.Text = SB.ToString() Me.PanelBoggleResults.Visible = True End If End Sub End Class

As soon as I saw the problem statement, I thought “Trie”. But seeing as several other posters made use of that approach, I looked for another approach just to be different. Alas, the Trie approach performs better. I ran Kent’s Perl solution on my machine and it took 0.31 seconds to run, after adapting it to use my dictionary file. My own perl implementation required 0.54 seconds to run.

Essa foi a minha abordagem:

  1. Create a transition hash to model the legal transitions.

  2. Iterate through all 16^3 possible three letter combinations.

    • In the loop, exclude illegal transitions and repeat visits to the same square. Form all the legal 3-letter sequences and store them in a hash.
  3. Then loop through all words in the dictionary.

    • Exclude words that are too long or short
    • Slide a 3-letter window across each word and see if it is among the 3-letter combos from step 2. Exclude words that fail. This eliminates most non-matches.
    • If still not eliminated, use a recursive algorithm to see if the word can be formed by making paths through the puzzle. (This part is slow, but called infrequently.)
  4. Print out the words I found.

    I tried 3-letter and 4-letter sequences, but 4-letter sequences slowed the program down.

In my code, I use /usr/share/dict/words for my dictionary. It comes standard on MAC OS X and many Unix systems. You can use another file if you want. To crack a different puzzle, just change the variable @puzzle. This would be easy to adapt for larger matrices. You would just need to change the %transitions hash and %legalTransitions hash.

The strength of this solution is that the code is short, and the data structures simple.

Here is the Perl code (which uses too many global variables, I know):

 #!/usr/bin/perl use Time::HiRes qw{ time }; sub readFile($); sub findAllPrefixes($); sub isWordTraceable($); sub findWordsInPuzzle(@); my $startTime = time; # Puzzle to solve my @puzzle = ( F, X, I, E, A, M, L, O, E, W, B, X, A, S, T, U ); my $minimumWordLength = 3; my $maximumPrefixLength = 3; # I tried four and it slowed down. # Slurp the word list. my $wordlistFile = "/usr/share/dict/words"; my @words = split(/\n/, uc(readFile($wordlistFile))); print "Words loaded from word list: " . scalar @words . "\n"; print "Word file load time: " . (time - $startTime) . "\n"; my $postLoad = time; # Define the legal transitions from one letter position to another. # Positions are numbered 0-15. # 0 1 2 3 # 4 5 6 7 # 8 9 10 11 # 12 13 14 15 my %transitions = ( -1 => [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15], 0 => [1,4,5], 1 => [0,2,4,5,6], 2 => [1,3,5,6,7], 3 => [2,6,7], 4 => [0,1,5,8,9], 5 => [0,1,2,4,6,8,9,10], 6 => [1,2,3,5,7,9,10,11], 7 => [2,3,6,10,11], 8 => [4,5,9,12,13], 9 => [4,5,6,8,10,12,13,14], 10 => [5,6,7,9,11,13,14,15], 11 => [6,7,10,14,15], 12 => [8,9,13], 13 => [8,9,10,12,14], 14 => [9,10,11,13,15], 15 => [10,11,14] ); # Convert the transition matrix into a hash for easy access. my %legalTransitions = (); foreach my $start (keys %transitions) { my $legalRef = $transitions{$start}; foreach my $stop (@$legalRef) { my $index = ($start + 1) * (scalar @puzzle) + ($stop + 1); $legalTransitions{$index} = 1; } } my %prefixesInPuzzle = findAllPrefixes($maximumPrefixLength); print "Find prefixes time: " . (time - $postLoad) . "\n"; my $postPrefix = time; my @wordsFoundInPuzzle = findWordsInPuzzle(@words); print "Find words in puzzle time: " . (time - $postPrefix) . "\n"; print "Unique prefixes found: " . (scalar keys %prefixesInPuzzle) . "\n"; print "Words found (" . (scalar @wordsFoundInPuzzle) . ") :\n " . join("\n ", @wordsFoundInPuzzle) . "\n"; print "Total Elapsed time: " . (time - $startTime) . "\n"; ########################################### sub readFile($) { my ($filename) = @_; my $contents; if (-e $filename) { # This is magic: it opens and reads a file into a scalar in one line of code. # See http://www.perl.com/pub/a/2003/11/21/slurp.html $contents = do { local( @ARGV, $/ ) = $filename ; <> } ; } else { $contents = ''; } return $contents; } # Is it legal to move from the first position to the second? They must be adjacent. sub isLegalTransition($$) { my ($pos1,$pos2) = @_; my $index = ($pos1 + 1) * (scalar @puzzle) + ($pos2 + 1); return $legalTransitions{$index}; } # Find all prefixes where $minimumWordLength < = length <= $maxPrefixLength # # $maxPrefixLength ... Maximum length of prefix we will store. Three gives best performance. sub findAllPrefixes($) { my ($maxPrefixLength) = @_; my %prefixes = (); my $puzzleSize = scalar @puzzle; # Every possible N-letter combination of the letters in the puzzle # can be represented as an integer, though many of those combinations # involve illegal transitions, duplicated letters, etc. # Iterate through all those possibilities and eliminate the illegal ones. my $maxIndex = $puzzleSize ** $maxPrefixLength; for (my $i = 0; $i < $maxIndex; $i++) { my @path; my $remainder = $i; my $prevPosition = -1; my $prefix = ''; my %usedPositions = (); for (my $prefixLength = 1; $prefixLength <= $maxPrefixLength; $prefixLength++) { my $position = $remainder % $puzzleSize; # Is this a valid step? # a. Is the transition legal (to an adjacent square)? if (! isLegalTransition($prevPosition, $position)) { last; } # b. Have we repeated a square? if ($usedPositions{$position}) { last; } else { $usedPositions{$position} = 1; } # Record this prefix if length >= $minimumWordLength. $prefix .= $puzzle[$position]; if ($prefixLength >= $minimumWordLength) { $prefixes{$prefix} = 1; } push @path, $position; $remainder -= $position; $remainder /= $puzzleSize; $prevPosition = $position; } # end inner for } # end outer for return %prefixes; } # Loop through all words in dictionary, looking for ones that are in the puzzle. sub findWordsInPuzzle(@) { my @allWords = @_; my @wordsFound = (); my $puzzleSize = scalar @puzzle; WORD: foreach my $word (@allWords) { my $wordLength = length($word); if ($wordLength > $puzzleSize || $wordLength < $minimumWordLength) { # Reject word as too short or too long. } elsif ($wordLength <= $maximumPrefixLength ) { # Word should be in the prefix hash. if ($prefixesInPuzzle{$word}) { push @wordsFound, $word; } } else { # Scan through the word using a window of length $maximumPrefixLength, looking for any strings not in our prefix list. # If any are found that are not in the list, this word is not possible. # If no non-matches are found, we have more work to do. my $limit = $wordLength - $maximumPrefixLength + 1; for (my $startIndex = 0; $startIndex < $limit; $startIndex ++) { if (! $prefixesInPuzzle{substr($word, $startIndex, $maximumPrefixLength)}) { next WORD; } } if (isWordTraceable($word)) { # Additional test necessary: see if we can form this word by following legal transitions push @wordsFound, $word; } } } return @wordsFound; } # Is it possible to trace out the word using only legal transitions? sub isWordTraceable($) { my $word = shift; return traverse([split(//, $word)], [-1]); # Start at special square -1, which may transition to any square in the puzzle. } # Recursively look for a path through the puzzle that matches the word. sub traverse($$) { my ($lettersRef, $pathRef) = @_; my $index = scalar @$pathRef - 1; my $position = $pathRef->[$index]; my $letter = $lettersRef->[$index]; my $branchesRef = $transitions{$position}; BRANCH: foreach my $branch (@$branchesRef) { if ($puzzle[$branch] eq $letter) { # Have we used this position yet? foreach my $usedBranch (@$pathRef) { if ($usedBranch == $branch) { next BRANCH; } } if (scalar @$lettersRef == $index + 1) { return 1; # End of word and success. } push @$pathRef, $branch; if (traverse($lettersRef, $pathRef)) { return 1; # Recursive success. } else { pop @$pathRef; } } } return 0; # No path found. Failed. } 

I know I’m super late but I made one of these a while ago in PHP – just for fun too…

http://www.lostsockdesign.com.au/sandbox/boggle/index.php?letters=fxieamloewbxastu Found 75 words (133 pts) in 0.90108 seconds

F.........X..I..............E............... A......................................M..............................L............................O............................... E....................W............................B..........................X A..................S..................................................T.................U....

Gives some indication of what the program is actually doing – each letter is where it starts looking through the patterns while each ‘.’ shows a path that it has tried to take. The more ‘.’ there are the further it has searched.

Let me know if you want the code… it is a horrible mix of PHP and HTML that was never meant to see the light of day so I dare not post it here 😛

I spent 3 months working on a solution to the 10 best point dense 5×5 Boggle boards problem.

The problem is now solved and laid out with full disclosure on 5 web pages. Please contact me with questions.

The board analysis algorithm uses an explicit stack to pseudo-recursively traverse the board squares through a Directed Acyclic Word Graph with direct child information, and a time stamp tracking mechanism. This may very well be the world’s most advanced lexicon data structure.

The scheme evaluates some 10,000 very good boards per second on a quad core. (9500+ points)

Parent Web Page:

DeepSearch.c – http://www.pathcom.com/~vadco/deep.html

Component Web Pages:

Optimal Scoreboard – http://www.pathcom.com/~vadco/binary.html

Advanced Lexicon Structure – http://www.pathcom.com/~vadco/adtdawg.html

Board Analysis Algorithm – http://www.pathcom.com/~vadco/guns.html

Parallel Batch Processing – http://www.pathcom.com/~vadco/parallel.html

– This comprehensive body of work will only interest a person who demands the very best.

Does your search algorithm continually decrease the word list as your search continues?

For instance, in the search above there are only 13 letters that your words can start with (effectively reducing to half as many starting letters).

As you add more letter permutations it would further decrease the available word sets decreasing the searching necessary.

I’d start there.

I’d have to give more thought to a complete solution, but as a handy optimisation, I wonder whether it might be worth pre-computing a table of frequencies of digrams and trigrams (2- and 3-letter combinations) based on all the words from your dictionary, and use this to prioritise your search. I’d go with the starting letters of words. So if your dictionary contained the words “India”, “Water”, “Extreme”, and “Extraordinary”, then your pre-computed table might be:

 'IN': 1 'WA': 1 'EX': 2 

Then search for these digrams in the order of commonality (first EX, then WA/IN)

First, read how one of the C# language designers solved a related problem: http://blogs.msdn.com/ericlippert/archive/2009/02/04/a-nasality-talisman-for-the-sultana-analyst.aspx .

Like him, you can start with a dictionary and the canonacalize words by creating a dictionary from an array of letters sorted alphabetically to a list of words that can be spelled from those letters.

Next, start creating the possible words from the board and looking them up. I suspect that will get you pretty far, but there are certainly more tricks that might speed things up.

I suggest making a tree of letters based on words. The tree would be composed of a letter structs, like this:

 letter: char isWord: boolean 

Then you build up the tree, with each depth adding a new letter. In other words, on the first level there’d be the alphabet; then from each of those trees, there’d be another another 26 entries, and so on, until you’ve spelled out all the words. Hang onto this parsed tree, and it’ll make all possible answers faster to look up.

With this parsed tree, you can very quickly find solutions. Here’s the pseudo-code:

 BEGIN: For each letter: if the struct representing it on the current depth has isWord == true, enter it as an answer. Cycle through all its neighbors; if there is a child of the current node corresponding to the letter, recursively call BEGIN on it. 

This could be sped up with a bit of dynamic programming. For example, in your sample, the two ‘A’s are both next to an ‘E’ and a ‘W’, which (from the point they hit them on) would be identical. I don’t have enough time to really spell out the code for this, but I think you can gather the idea.

Also, I’m sure you’ll find other solutions if you Google for “Boggle solver”.

Just for fun, I implemented one in bash. It is not super fast, but reasonable.

http://dev.xkyle.com/bashboggle/

Hilarious. I nearly posted the same question a few days ago due to the same damn game! I did not however because just searched google for boggle solver python and got all the answers I could want.

I realize this question’s time has come and gone, but since I was working on a solver myself, and stumbled onto this while googling about, I thought I should post a reference to mine as it seems a bit different from some of the others.

I chose to go with a flat array for the game board, and to do recursive hunts from each letter on the board, traversing from valid neighbor to valid neighbor, extending the hunt if the current list of letters if a valid prefix in an index. While traversing the notion of the current word is list of indexes into board, not letters that make up a word. When checking the index, the indexes are translated to letters and the check done.

The index is a brute force dictionary that’s a bit like a trie, but allows for Pythonic queries of the index. If the words ‘cat’ and ‘cater’ are in the list, you’ll get this in the dictionary:

  d = { 'c': ['cat','cater'], 'ca': ['cat','cater'], 'cat': ['cat','cater'], 'cate': ['cater'], 'cater': ['cater'], } 

So if the current_word is ‘ca’ you know that it is a valid prefix because 'ca' in d returns True (so continue the board traversal). And if the current_word is ‘cat’ then you know that it is a valid word because it is a valid prefix and 'cat' in d['cat'] returns True too.

If felt like this allowed for some readable code that doesn’t seem too slow. Like everyone else the expense in this system is reading/building the index. Solving the board is pretty much noise.

The code is at http://gist.github.com/268079 . It is intentionally vertical and naive with lots of explicit validity checking because I wanted to understand the problem without crufting it up with a bunch of magic or obscurity.

I wrote my solver in C++. I implemented a custom tree structure. I’m not sure it can be considered a trie but it’s similar. Each node has 26 branches, 1 for each letter of the alphabet. I traverse the branches of the boggle board in parallel with the branches of my dictionary. If the branch does not exist in the dictionary, I stop searching it on the Boggle board. I convert all the letters on the board to ints. So ‘A’ = 0. Since it’s just arrays, lookup is always O(1). Each node stores if it completes a word and how many words exist in its children. The tree is pruned as words are found to reduce repeatedly searching for the same words. I believe pruning is also O(1).

CPU: Pentium SU2700 1.3GHz
RAM: 3gb

Loads dictionary of 178,590 words in < 1 second.
Solves 100×100 Boggle (boggle.txt) in 4 seconds. ~44,000 words found.
Solving a 4×4 Boggle is too fast to provide a meaningful benchmark. 🙂

Fast Boggle Solver GitHub Repo

Given a Boggle board with N rows and M columns, let’s assume the following:

  • N*M is substantially greater than the number of possible words
  • N*M is substantially greater than the longest possible word

Under these assumptions, the complexity of this solution is O(N*M).

I think comparing running times for this one example board in many ways misses the point but, for the sake of completeness, this solution completes in <0.2s on my modern MacBook Pro.

This solution will find all possible paths for each word in the corpus.

 #!/usr/bin/env ruby # Example usage: ./boggle-solver --board "fxie amlo ewbx astu" autoload :Matrix, 'matrix' autoload :OptionParser, 'optparse' DEFAULT_CORPUS_PATH = '/usr/share/dict/words'.freeze # Functions def filter_corpus(matrix, corpus, min_word_length) board_char_counts = Hash.new(0) matrix.each { |c| board_char_counts[c] += 1 } max_word_length = matrix.row_count * matrix.column_count boggleable_regex = /^[#{board_char_counts.keys.reduce(:+)}]{#{min_word_length},#{max_word_length}}$/ corpus.select{ |w| w.match boggleable_regex }.select do |w| word_char_counts = Hash.new(0) w.each_char { |c| word_char_counts[c] += 1 } word_char_counts.all? { |c, count| board_char_counts[c] >= count } end end def neighbors(point, matrix) i, j = point ([i-1, 0].max .. [i+1, matrix.row_count-1].min).inject([]) do |r, new_i| ([j-1, 0].max .. [j+1, matrix.column_count-1].min).inject(r) do |r, new_j| neighbor = [new_i, new_j] neighbor.eql?(point) ? r : r < < neighbor end end end def expand_path(path, word, matrix) return [path] if path.length == word.length next_char = word[path.length] viable_neighbors = neighbors(path[-1], matrix).select do |point| !path.include?(point) && matrix.element(*point).eql?(next_char) end viable_neighbors.inject([]) do |result, point| result + expand_path(path.dup << point, word, matrix) end end def find_paths(word, matrix) result = [] matrix.each_with_index do |c, i, j| result += expand_path([[i, j]], word, matrix) if c.eql?(word[0]) end result end def solve(matrix, corpus, min_word_length: 3) boggleable_corpus = filter_corpus(matrix, corpus, min_word_length) boggleable_corpus.inject({}) do |result, w| paths = find_paths(w, matrix) result[w] = paths unless paths.empty? result end end # Script options = { corpus_path: DEFAULT_CORPUS_PATH } option_parser = OptionParser.new do |opts| opts.banner = 'Usage: boggle-solver --board  [--corpus ]' opts.on('--board BOARD', String, 'The board (eg "fxi aml ewb ast")') do |b| options[:board] = b end opts.on('--corpus CORPUS_PATH', String, 'Corpus file path') do |c| options[:corpus_path] = c end opts.on_tail('-h', '--help', 'Shows usage') do STDOUT.puts opts exit end end option_parser.parse! unless options[:board] STDERR.puts option_parser exit false end unless File.file? options[:corpus_path] STDERR.puts "No corpus exists - #{options[:corpus_path]}" exit false end rows = options[:board].downcase.scan(/\S+/).map{ |row| row.scan(/./) } raw_corpus = File.readlines(options[:corpus_path]) corpus = raw_corpus.map{ |w| w.downcase.rstrip }.uniq.sort solution = solve(Matrix.rows(rows), corpus) solution.each_pair do |w, paths| STDOUT.puts w paths.each do |path| STDOUT.puts "\t" + path.map{ |point| point.inspect }.join(', ') end end STDOUT.puts "TOTAL: #{solution.count}" 

I have implemented a solution in OCaml . It pre-compiles a dictionary as a trie, and uses two-letter sequence frequencies to eliminate edges that could never appear in a word to further speed up processing.

It solves your example board in 0.35ms (with an additional 6ms start-up time which is mostly related to loading the trie into memory).

The solutions found:

 ["swami"; "emile"; "limbs"; "limbo"; "limes"; "amble"; "tubs"; "stub"; "swam"; "semi"; "seam"; "awes"; "buts"; "bole"; "boil"; "west"; "east"; "emil"; "lobs"; "limb"; "lime"; "lima"; "mesa"; "mews"; "mewl"; "maws"; "milo"; "mile"; "awes"; "amie"; "axle"; "elma"; "fame"; "ubs"; "tux"; "tub"; "twa"; "twa"; "stu"; "saw"; "sea"; "sew"; "sea"; "awe"; "awl"; "but"; "btu"; "box"; "bmw"; "was"; "wax"; "oil"; "lox"; "lob"; "leo"; "lei"; "lie"; "mes"; "mew"; "mae"; "maw"; "max"; "mil"; "mix"; "awe"; "awl"; "elm"; "eli"; "fax"] 

A Node.JS JavaScript solution. Computes all 100 unique words in less than a second which includes reading dictionary file (MBA 2012).

Saída:
[“FAM”,”TUX”,”TUB”,”FAE”,”ELI”,”ELM”,”ELB”,”TWA”,”TWA”,”SAW”,”AMI”,”SWA”,”SWA”,”AME”,”SEA”,”SEW”,”AES”,”AWL”,”AWE”,”SEA”,”AWA”,”MIX”,”MIL”,”AST”,”ASE”,”MAX”,”MAE”,”MAW”,”MEW”,”AWE”,”MES”,”AWL”,”LIE”,”LIM”,”AWA”,”AES”,”BUT”,”BLO”,”WAS”,”WAE”,”WEA”,”LEI”,”LEO”,”LOB”,”LOX”,”WEM”,”OIL”,”OLM”,”WEA”,”WAE”,”WAX”,”WAF”,”MILO”,”EAST”,”WAME”,”TWAS”,”TWAE”,”EMIL”,”WEAM”,”OIME”,”AXIL”,”WEST”,”TWAE”,”LIMB”,”WASE”,”WAST”,”BLEO”,”STUB”,”BOIL”,”BOLE”,”LIME”,”SAWT”,”LIMA”,”MESA”,”MEWL”,”AXLE”,”FAME”,”ASEM”,”MILE”,”AMIL”,”SEAX”,”SEAM”,”SEMI”,”SWAM”,”AMBO”,”AMLI”,”AXILE”,”AMBLE”,”SWAMI”,”AWEST”,”AWEST”,”LIMAX”,”LIMES”,”LIMBU”,”LIMBO”,”EMBOX”,”SEMBLE”,”EMBOLE”,”WAMBLE”,”FAMBLE”]

Código:

 var fs = require('fs') var Node = function(value, row, col) { this.value = value this.row = row this.col = col } var Path = function() { this.nodes = [] } Path.prototype.push = function(node) { this.nodes.push(node) return this } Path.prototype.contains = function(node) { for (var i = 0, ii = this.nodes.length; i < ii; i++) { if (this.nodes[i] === node) { return true } } return false } Path.prototype.clone = function() { var path = new Path() path.nodes = this.nodes.slice(0) return path } Path.prototype.to_word = function() { var word = '' for (var i = 0, ii = this.nodes.length; i < ii; ++i) { word += this.nodes[i].value } return word } var Board = function(nodes, dict) { // Expects nxm array. this.nodes = nodes this.words = [] this.row_count = nodes.length this.col_count = nodes[0].length this.dict = dict } Board.from_raw = function(board, dict) { var ROW_COUNT = board.length , COL_COUNT = board[0].length var nodes = [] // Replace board with Nodes for (var i = 0, ii = ROW_COUNT; i < ii; ++i) { nodes.push([]) for (var j = 0, jj = COL_COUNT; j < jj; ++j) { nodes[i].push(new Node(board[i][j], i, j)) } } return new Board(nodes, dict) } Board.prototype.toString = function() { return JSON.stringify(this.nodes) } Board.prototype.update_potential_words = function(dict) { for (var i = 0, ii = this.row_count; i < ii; ++i) { for (var j = 0, jj = this.col_count; j < jj; ++j) { var node = this.nodes[i][j] , path = new Path() path.push(node) this.dfs_search(path) } } } Board.prototype.on_board = function(row, col) { return 0 <= row && row < this.row_count && 0 <= col && col < this.col_count } Board.prototype.get_unsearched_neighbours = function(path) { var last_node = path.nodes[path.nodes.length - 1] var offsets = [ [-1, -1], [-1, 0], [-1, +1] , [ 0, -1], [ 0, +1] , [+1, -1], [+1, 0], [+1, +1] ] var neighbours = [] for (var i = 0, ii = offsets.length; i < ii; ++i) { var offset = offsets[i] if (this.on_board(last_node.row + offset[0], last_node.col + offset[1])) { var potential_node = this.nodes[last_node.row + offset[0]][last_node.col + offset[1]] if (!path.contains(potential_node)) { // Create a new path if on board and we haven't visited this node yet. neighbours.push(potential_node) } } } return neighbours } Board.prototype.dfs_search = function(path) { var path_word = path.to_word() if (this.dict.contains_exact(path_word) && path_word.length >= 3) { this.words.push(path_word) } var neighbours = this.get_unsearched_neighbours(path) for (var i = 0, ii = neighbours.length; i < ii; ++i) { var neighbour = neighbours[i] var new_path = path.clone() new_path.push(neighbour) if (this.dict.contains_prefix(new_path.to_word())) { this.dfs_search(new_path) } } } var Dict = function() { this.dict_array = [] var dict_data = fs.readFileSync('./web2', 'utf8') var dict_array = dict_data.split('\n') for (var i = 0, ii = dict_array.length; i < ii; ++i) { dict_array[i] = dict_array[i].toUpperCase() } this.dict_array = dict_array.sort() } Dict.prototype.contains_prefix = function(prefix) { // Binary search return this.search_prefix(prefix, 0, this.dict_array.length) } Dict.prototype.contains_exact = function(exact) { // Binary search return this.search_exact(exact, 0, this.dict_array.length) } Dict.prototype.search_prefix = function(prefix, start, end) { if (start >= end) { // If no more place to search, return no matter what. return this.dict_array[start].indexOf(prefix) > -1 } var middle = Math.floor((start + end)/2) if (this.dict_array[middle].indexOf(prefix) > -1) { // If we prefix exists, return true. return true } else { // Recurse if (prefix < = this.dict_array[middle]) { return this.search_prefix(prefix, start, middle - 1) } else { return this.search_prefix(prefix, middle + 1, end) } } } Dict.prototype.search_exact = function(exact, start, end) { if (start >= end) { // If no more place to search, return no matter what. return this.dict_array[start] === exact } var middle = Math.floor((start + end)/2) if (this.dict_array[middle] === exact) { // If we prefix exists, return true. return true } else { // Recurse if (exact < = this.dict_array[middle]) { return this.search_exact(exact, start, middle - 1) } else { return this.search_exact(exact, middle + 1, end) } } } var board = [ ['F', 'X', 'I', 'E'] , ['A', 'M', 'L', 'O'] , ['E', 'W', 'B', 'X'] , ['A', 'S', 'T', 'U'] ] var dict = new Dict() var b = Board.from_raw(board, dict) b.update_potential_words() console.log(JSON.stringify(b.words.sort(function(a, b) { return a.length - b.length }))) 

So I wanted to add another PHP way of solving this, since everyone loves PHP. There’s a little bit of refactoring I would like to do, like using a regexpression match against the dictionary file, but right now I’m just loading the entire dictionary file into a wordList.

I did this using a linked list idea. Each Node has a character value, a location value, and a next pointer.

The location value is how I found out if two nodes are connected.

 1 2 3 4 11 12 13 14 21 22 23 24 31 32 33 34 

So using that grid, I know two nodes are connected if the first node’s location equals the second nodes location +/- 1 for the same row, +/- 9, 10, 11 for the row above and below.

I use recursion for the main search. It takes a word off the wordList, finds all the possible starting points, and then recursively finds the next possible connection, keeping in mind that it can’t go to a location it’s already using (which is why I add $notInLoc).

Anyway, I know it needs some refactoring, and would love to hear thoughts on how to make it cleaner, but it produces the correct results based on the dictionary file I’m using. Depending on the number of vowels and combinations on the board, it takes about 3 to 6 seconds. I know that once I preg_match the dictionary results, that will reduce significantly.

 < ?php ini_set('xdebug.var_display_max_depth', 20); ini_set('xdebug.var_display_max_children', 1024); ini_set('xdebug.var_display_max_data', 1024); class Node { var $loc; function __construct($value) { $this->value = $value; $next = null; } } class Boggle { var $root; var $locList = array (1, 2, 3, 4, 11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34); var $wordList = []; var $foundWords = []; function __construct($board) { // Takes in a board string and creates all the nodes $node = new Node($board[0]); $node->loc = $this->locList[0]; $this->root = $node; for ($i = 1; $i < strlen($board); $i++) { $node->next = new Node($board[$i]); $node->next->loc = $this->locList[$i]; $node = $node->next; } // Load in a dictionary file // Use regexp to elimate all the words that could never appear and load the // rest of the words into wordList $handle = fopen("dict.txt", "r"); if ($handle) { while (($line = fgets($handle)) !== false) { // process the line read. $line = trim($line); if (strlen($line) > 2) { $this->wordList[] = trim($line); } } fclose($handle); } else { // error opening the file. echo "Problem with the file."; } } function isConnected($node1, $node2) { // Determines if 2 nodes are connected on the boggle board return (($node1->loc == $node2->loc + 1) || ($node1->loc == $node2->loc - 1) || ($node1->loc == $node2->loc - 9) || ($node1->loc == $node2->loc - 10) || ($node1->loc == $node2->loc - 11) || ($node1->loc == $node2->loc + 9) || ($node1->loc == $node2->loc + 10) || ($node1->loc == $node2->loc + 11)) ? true : false; } function find($value, $notInLoc = []) { // Returns a node with the value that isn't in a location $current = $this->root; while($current) { if ($current->value == $value && !in_array($current->loc, $notInLoc)) { return $current; } if (isset($current->next)) { $current = $current->next; } else { break; } } return false; } function findAll($value) { // Returns an array of nodes with a specific value $current = $this->root; $foundNodes = []; while ($current) { if ($current->value == $value) { $foundNodes[] = $current; } if (isset($current->next)) { $current = $current->next; } else { break; } } return (empty($foundNodes)) ? false : $foundNodes; } function findAllConnectedTo($node, $value, $notInLoc = []) { // Returns an array of nodes that are connected to a specific node and // contain a specific value and are not in a certain location $nodeList = $this->findAll($value); $newList = []; if ($nodeList) { foreach ($nodeList as $node2) { if (!in_array($node2->loc, $notInLoc) && $this->isConnected($node, $node2)) { $newList[] = $node2; } } } return (empty($newList)) ? false : $newList; } function inner($word, $list, $i = 0, $notInLoc = []) { $i++; foreach($list as $node) { $notInLoc[] = $node->loc; if ($list2 = $this->findAllConnectedTo($node, $word[$i], $notInLoc)) { if ($i == (strlen($word) - 1)) { return true; } else { return $this->inner($word, $list2, $i, $notInLoc); } } } return false; } function findWord($word) { if ($list = $this->findAll($word[0])) { return $this->inner($word, $list); } return false; } function findAllWords() { foreach($this->wordList as $word) { if ($this->findWord($word)) { $this->foundWords[] = $word; } } } function displayBoard() { $current = $this->root; for ($i=0; $i < 4; $i++) { echo $current->value . " " . $current->next->value . " " . $current->next->next->value . " " . $current->next->next->next->value . "
"; if ($i < 3) { $current = $current->next->next->next->next; } } } } function randomBoardString() { return substr(str_shuffle(str_repeat("abcdefghijklmnopqrstuvwxyz", 16)), 0, 16); } $myBoggle = new Boggle(randomBoardString()); $myBoggle->displayBoard(); $x = microtime(true); $myBoggle->findAllWords(); $y = microtime(true); echo ($y-$x); var_dump($myBoggle->foundWords); ?>

Here is the solution Using Predefined words in NLTK toolkit NLTK has nltk.corpus package in that we have package called words and it contains more than 2Lakhs English words you can simply use all into your program.

Once creating your matrix convert it into a character array and perform this code

 import nltk from nltk.corpus import words from collections import Counter def possibleWords(input, charSet): for word in input: dict = Counter(word) flag = 1 for key in dict.keys(): if key not in charSet: flag = 0 if flag == 1 and len(word)>5: #its depends if you want only length more than 5 use this otherwise remove that one. print(word) nltk.download('words') word_list = words.words() # prints 236736 print(len(word_list)) charSet = ['h', 'e', 'l', 'o', 'n', 'v', 't'] possibleWords(word_list, charSet) 

Saída:

 eleven eleventh elevon entente entone ethene ethenol evolve evolvent hellhole helvell hooven letten looten nettle nonene nonent nonlevel notelet novelet novelette novene teenet teethe teevee telethon tellee tenent tentlet theelol toetoe tonlet toothlet tootle tottle vellon velvet velveteen venene vennel venthole voeten volent volvelle volvent voteen 

I hope you get it.

Here is my java implementation: https://github.com/zouzhile/interview/blob/master/src/com/interview/algorithms/tree/BoggleSolver.java

Trie build took 0 hours, 0 minutes, 1 seconds, 532 milliseconds
Word searching took 0 hours, 0 minutes, 0 seconds, 92 milliseconds

 eel eeler eely eer eke eker eld eleut elk ell elle epee epihippus ere erept err error erupt eurus eye eyer eyey hip hipe hiper hippish hipple hippus his hish hiss hist hler hsi ihi iphis isis issue issuer ist isurus kee keek keeker keel keeler keep keeper keld kele kelek kelep kelk kell kelly kelp kelper kep kepi kept ker kerel kern keup keuper key kyl kyle lee leek leeky leep leer lek leo leper leptus lepus ler leu ley lleu lue lull luller lulu lunn lunt lunule luo lupe lupis lupulus lupus lur lure lurer lush lushly lust lustrous lut lye nul null nun nupe nurture nurturer nut oer ore ort ouphish our oust out outpeep outpeer outpipe outpull outpush output outre outrun outrush outspell outspue outspurn outspurt outstrut outstunt outsulk outturn outusure oyer pee peek peel peele peeler peeoy peep peeper peepeye peer pele peleus pell peller pelu pep peplus pepper pepperer pepsis per pern pert pertussis peru perule perun peul phi pip pipe piper pipi pipistrel pipistrelle pipistrellus pipper pish piss pist plup plus plush ply plyer psi pst puerer pul pule puler pulk pull puller pulley pullus pulp pulper pulu puly pun punt pup puppis pur pure puree purely purer purr purre purree purrel purrer puru purupuru pus push puss pustule put putt puture ree reek reeker reeky reel reeler reeper rel rely reoutput rep repel repeller repipe reply repp reps reree rereel rerun reuel roe roer roey roue rouelle roun roup rouper roust rout roy rue ruelle ruer rule ruler rull ruller run runt rupee rupert rupture ruru rus rush russ rust rustre rut shi shih ship shipper shish shlu sip sipe siper sipper sis sish sisi siss sissu sist sistrurus speel speer spelk spell speller splurt spun spur spurn spurrer spurt sput ssi ssu stre stree streek streel streeler streep streke streperous strepsis strey stroup stroy stroyer strue strunt strut stu stue stull stuller stun stunt stupe stupeous stupp sturnus sturt stuss stut sue suer suerre suld sulk sulker sulky sull sully sulu sun sunn sunt sunup sup supe super superoutput supper supple supplely supply sur sure surely surrey sus susi susu susurr susurrous susurrus sutu suture suu tree treey trek trekker trey troupe trouper trout troy true truer trull truller truly trun trush truss trust tshi tst tsun tsutsutsi tue tule tulle tulu tun tunu tup tupek tupi tur turn turnup turr turus tush tussis tussur tut tuts tutu tutulus ule ull uller ulu ululu unreel unrule unruly unrun unrust untrue untruly untruss untrust unturn unurn upper upperer uppish uppishly uppull uppush upspurt upsun upsup uptree uptruss upturn ure urn uro uru urus urushi ush ust usun usure usurer utu yee yeel yeld yelk yell yeller yelp yelper yeo yep yer yere yern yoe yor yore you youl youp your yourn yoy 

Note: I used the dictionary and character matrix at the beginning of this thread. The code was run on my MacBookPro, below is some information about the machine.

Model Name: MacBook Pro
Model Identifier: MacBookPro8,1
Processor Name: Intel Core i5
Processor Speed: 2.3 GHz
Number Of Processors: 1
Total Number Of Cores: 2
L2 Cache (per core): 256 KB
L3 Cache: 3 MB
Memory: 4 GB
Boot ROM Version: MBP81.0047.B0E
SMC Version (system): 1.68f96

I solved this too, with Java. My implementation is 269 lines long and pretty easy to use. First you need to create a new instance of the Boggler class and then call the solve function with the grid as a parameter. It takes about 100 ms to load the dictionary of 50 000 words on my computer and it finds the words in about 10-20 ms. The found words are stored in an ArrayList, foundWords .

 import java.io.BufferedReader; import java.io.File; import java.io.FileInputStream; import java.io.FileNotFoundException; import java.io.IOException; import java.io.InputStreamReader; import java.net.URISyntaxException; import java.net.URL; import java.util.ArrayList; import java.util.Arrays; import java.util.Comparator; public class Boggler { private ArrayList words = new ArrayList(); private ArrayList roundWords = new ArrayList(); private ArrayList foundWords = new ArrayList(); private char[][] letterGrid = new char[4][4]; private String letters; public Boggler() throws FileNotFoundException, IOException, URISyntaxException { long startTime = System.currentTimeMillis(); URL path = GUI.class.getResource("words.txt"); BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(new File(path.toURI()).getAbsolutePath()), "iso-8859-1")); String line; while((line = br.readLine()) != null) { if(line.length() < 3 || line.length() > 10) { continue; } this.words.add(line); } } public ArrayList getWords() { return this.foundWords; } public void solve(String letters) { this.letters = ""; this.foundWords = new ArrayList(); for(int i = 0; i < letters.length(); i++) { if(!this.letters.contains(letters.substring(i, i + 1))) { this.letters += letters.substring(i, i + 1); } } for(int i = 0; i < 4; i++) { for(int j = 0; j < 4; j++) { this.letterGrid[i][j] = letters.charAt(i * 4 + j); } } System.out.println(Arrays.deepToString(this.letterGrid)); this.roundWords = new ArrayList(); String pattern = "[" + this.letters + "]+"; for(int i = 0; i < this.words.size(); i++) { if(this.words.get(i).matches(pattern)) { this.roundWords.add(this.words.get(i)); } } for(int i = 0; i < this.roundWords.size(); i++) { Word word = checkForWord(this.roundWords.get(i)); if(word != null) { System.out.println(word); this.foundWords.add(word); } } } private Word checkForWord(String word) { char initial = word.charAt(0); ArrayList startPoints = new ArrayList(); int x = 0; int y = 0; for(char[] row: this.letterGrid) { x = 0; for(char letter: row) { if(initial == letter) { startPoints.add(new LetterCoord(x, y)); } x++; } y++; } ArrayList letterCoords = null; for(int initialTry = 0; initialTry < startPoints.size(); initialTry++) { letterCoords = new ArrayList(); x = startPoints.get(initialTry).getX(); y = startPoints.get(initialTry).getY(); LetterCoord initialCoord = new LetterCoord(x, y); letterCoords.add(initialCoord); letterLoop: for(int letterIndex = 1; letterIndex < word.length(); letterIndex++) { LetterCoord lastCoord = letterCoords.get(letterCoords.size() - 1); char currentChar = word.charAt(letterIndex); ArrayList letterLocations = getNeighbours(currentChar, lastCoord.getX(), lastCoord.getY()); if(letterLocations == null) { return null; } for(int foundIndex = 0; foundIndex < letterLocations.size(); foundIndex++) { if(letterIndex != word.length() - 1 && true == false) { char nextChar = word.charAt(letterIndex + 1); int lastX = letterCoords.get(letterCoords.size() - 1).getX(); int lastY = letterCoords.get(letterCoords.size() - 1).getY(); ArrayList possibleIndex = getNeighbours(nextChar, lastX, lastY); if(possibleIndex != null) { if(!letterCoords.contains(letterLocations.get(foundIndex))) { letterCoords.add(letterLocations.get(foundIndex)); } continue letterLoop; } else { return null; } } else { if(!letterCoords.contains(letterLocations.get(foundIndex))) { letterCoords.add(letterLocations.get(foundIndex)); continue letterLoop; } } } } if(letterCoords != null) { if(letterCoords.size() == word.length()) { Word w = new Word(word); w.addList(letterCoords); return w; } else { return null; } } } if(letterCoords != null) { Word foundWord = new Word(word); foundWord.addList(letterCoords); return foundWord; } return null; } public ArrayList getNeighbours(char letterToSearch, int x, int y) { ArrayList neighbours = new ArrayList(); for(int _y = y - 1; _y < = y + 1; _y++) { for(int _x = x - 1; _x <= x + 1; _x++) { if(_x < 0 || _y < 0 || (_x == x && _y == y) || _y > 3 || _x > 3) { continue; } if(this.letterGrid[_y][_x] == letterToSearch && !neighbours.contains(new LetterCoord(_x, _y))) { neighbours.add(new LetterCoord(_x, _y)); } } } if(neighbours.isEmpty()) { return null; } else { return neighbours; } } } class Word { private String word; private ArrayList letterCoords = new ArrayList(); public Word(String word) { this.word = word; } public boolean addCoords(int x, int y) { LetterCoord lc = new LetterCoord(x, y); if(!this.letterCoords.contains(lc)) { this.letterCoords.add(lc); return true; } return false; } public void addList(ArrayList letterCoords) { this.letterCoords = letterCoords; } @Override public String toString() { String outputString = this.word + " "; for(int i = 0; i < letterCoords.size(); i++) { outputString += "(" + letterCoords.get(i).getX() + ", " + letterCoords.get(i).getY() + ") "; } return outputString; } public String getWord() { return this.word; } public ArrayList getList() { return this.letterCoords; } } class LetterCoord extends ArrayList { private int x; private int y; public LetterCoord(int x, int y) { this.x = x; this.y = y; } public int getX() { return this.x; } public int getY() { return this.y; } @Override public boolean equals(Object o) { if(!(o instanceof LetterCoord)) { return false; } LetterCoord lc = (LetterCoord) o; if(this.x == lc.getX() && this.y == lc.getY()) { return true; } return false; } @Override public int hashCode() { int hash = 7; hash = 29 * hash + this.x; hash = 24 * hash + this.y; return hash; } } 

I solved this in c. It takes around 48 ms to run on my machine (with around 98% of the time spent loading the dictionary from disk and creating the trie). The dictionary is /usr/share/dict/american-english which has 62886 words.

Código fonte

I solved this perfectly and very fast. I put it into an android app. View the video at the play store link to see it in action.

Word Cheats is an app that “cracks” any matrix style word game. This app was built to to help me cheat at word scrambler. It can be used for word searches, ruzzle, words, word finder, word crack, boggle, and more!

It can be seen here https://play.google.com/store/apps/details?id=com.harris.wordcracker

View the app in action in the video https://www.youtube.com/watch?v=DL2974WmNAI

I have solved this in C#, using a DFA algorithm. You can check out my code at

https://github.com/attilabicsko/wordshuffler/

In addition to finding words in a matrix, my algorithm saves the actual paths for the words, so for designing a word finder game, you can check wether there is a word on an actual path.

How about simple sorting and using the binary search in the dictionary?

Returns whole list in 0.35 sec and can be further optimized (by for instance removing words with unused letters etc.).

 from bisect import bisect_left f = open("dict.txt") D.extend([line.strip() for line in f.readlines()]) D = sorted(D) def neibs(M,x,y): n = len(M) for i in xrange(-1,2): for j in xrange(-1,2): if (i == 0 and j == 0) or (x + i < 0 or x + i >= n or y + j < 0 or y + j >= n): continue yield (x + i, y + j) def findWords(M,D,x,y,prefix): prefix = prefix + M[x][y] # find word in dict by binary search found = bisect_left(D,prefix) # if found then yield if D[found] == prefix: yield prefix # if what we found is not even a prefix then return # (there is no point in going further) if len(D[found]) < len(prefix) or D[found][:len(prefix)] != prefix: return # recourse for neib in neibs(M,x,y): for word in findWords(M,D,neib[0], neib[1], prefix): yield word def solve(M,D): # check each starting point for x in xrange(0,len(M)): for y in xrange(0,len(M)): for word in findWords(M,D,x,y,""): yield word grid = "fxie amlo ewbx astu".split() print [x for x in solve(grid,D)]