Word Match
Dr. Phillip M. Feldman
I've created a word match solver that one can either run online or download to one's computer to run locally. Such a solver is useful for the game of hangman, and can also be handy when one is stumped while playing or creating a crossword puzzle. (For crossword puzzles, one would probably want an unabridged spelling dictionary that includes a wide variety of proper nouns. If anyone can supply such a dictionary, I will be happy to post it). I also have an online game, based loosely on hangman, called Word Match.
With both the solver and the game, information about allowed letters and their locations is specified via a template and a list of excluded letters. The template is typically a sequence of letters and underscores, but may also contain certain other symbols that are not used in hangman. The following non-letter symbols are permitted:
underscore (_): As in hangman, the underscore represents an unknown letter.
plus sign (+): This represents one or more unknown letters. For example, the template +apt matches the words rapt and adapt, but not apt.
asterisk (*): This represents zero or more unknown letters, i.e., the asterisk will match anything or nothing. For example, the template *apt matches the words apt, adapt, and rapt. Note that two consecutive asterisks are equivalent to a single asterisk.
question mark (?): This represents zero or one unknown letters, i.e., either a single letter may be substituted for the question mark, or the question mark may be deleted, closing up the space. For example, the template ?apt matches the words apt and rapt, but not adapt. Note that two consecutive question marks represent zero, one, or two unknown letters.
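The four symbols above map naturally onto regular-expression fragments. The following is a minimal sketch of such a translation, not the code used by word_match.py; the names SYMBOL_TO_REGEX, template_to_regex, and matches are hypothetical, and the sketch assumes lowercase words containing only the letters a-z.

```python
import re

# Hypothetical mapping from template symbols to regex fragments
# (an assumption for illustration; word_match.py may differ).
SYMBOL_TO_REGEX = {
    "_": "[a-z]",   # exactly one unknown letter
    "+": "[a-z]+",  # one or more unknown letters
    "*": "[a-z]*",  # zero or more unknown letters
    "?": "[a-z]?",  # zero or one unknown letter
}


def template_to_regex(template):
    """Translate a template into a compiled regex (illustrative helper)."""
    parts = []
    for ch in template.lower():
        if ch in SYMBOL_TO_REGEX:
            parts.append(SYMBOL_TO_REGEX[ch])
        elif ch.isalpha():
            parts.append(re.escape(ch))  # known letters match themselves
        else:
            raise ValueError(f"unsupported template character: {ch!r}")
    return re.compile("".join(parts))


def matches(template, word):
    """Return True if the whole word fits the template."""
    return template_to_regex(template).fullmatch(word.lower()) is not None


for template in ("+apt", "*apt", "?apt"):
    fitting = [w for w in ("apt", "rapt", "adapt") if matches(template, w)]
    print(template, fitting)
```

Using fullmatch rather than search ensures that the template must account for every letter of the word, which is what the examples above require.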
Excluded letters may appear in the template. When this happens, the interpretation is that excluded letters may appear in a matching word only in positions corresponding to those where they appear in the template.
When using the solver, it is sometimes more convenient to specify allowed letters rather than excluded letters. This can be done by typing a caret (^) at the front of the list of excluded letters. If, for example, one specifies _e_ as the template and ^gnt for the list of excluded letters, the code will produce the following matching words: get, net, and ten.
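One way to think about the caret is that it turns the list of allowed letters into its complement: every letter not listed becomes excluded. The sketch below illustrates this under stated assumptions. The helper names parse_letter_spec and fits are hypothetical, the word list is a made-up sample rather than one of the spelling dictionaries, and fits handles only templates made of letters and underscores.

```python
import string

ALPHABET = set(string.ascii_lowercase)


def parse_letter_spec(spec):
    """Return the set of excluded letters for a spec such as 'xyz' or '^gnt'."""
    spec = spec.lower()
    if spec.startswith("^"):
        allowed = set(spec[1:])
        return ALPHABET - allowed  # everything not listed is excluded
    return set(spec) & ALPHABET


def fits(template, excluded, word):
    """Check a word against a template of letters and underscores only."""
    if len(word) != len(template):
        return False
    for t, w in zip(template, word):
        if t == "_":
            if w in excluded:  # unknown positions may not hold excluded letters
                return False
        elif t != w:           # known positions must match exactly
            return False
    return True


excluded = parse_letter_spec("^gnt")
sample_words = ["get", "net", "ten", "bed", "pet", "gem"]  # made-up sample
found = [w for w in sample_words if fits("_e_", excluded, w)]
print(found)  # ['get', 'net', 'ten']
```

Note that e is itself an excluded letter here, yet it is accepted in the middle position because it appears there in the template.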
To run word_match.py online:
Enter a template into the first box. If you provide a template that allows for too much flexibility, e.g., one that uses only the asterisk and underscore characters, without any letters, you will get a large number of matches.
Press the tab key to advance to the second box and specify any excluded letters (letters that must not appear in matching words). If there are no excluded letters, press the Delete key to clear this field.
Select one of the two spelling dictionaries.
Click on the yellow button.
In this variation on the game of hangman, the goal is to find all words that match the template provided by the computer. Note:
Enter the words into the box to the right of the yellow "check my answer" button and then click on the button to find out whether you are right.
To obtain a new randomly-selected puzzle, click the "Next Puzzle" button.
When attempting to find solutions, don't fall into the trap of assuming that a group of adjacent letters is a phonetic unit (syllable). If, for example, one is working on the template _hot_, thinking of hot as a phonetic unit might prevent one from seeing one of the solutions (photo).
Select one or more levels of play via the check boxes. Over half of the puzzles (most of those at levels 3-5) make use of SAT- and GRE-level words in the hint text, the answer itself, or both.
To keep the puzzles reasonable, the game uses only the small (15,050-word) spelling dictionary.
One can check whether a given word is in the small dictionary by using the above solver. (This is not considered cheating.) Enter the entire word in the template field.
As in the original game of hangman, an underscore character denotes a single unknown letter.
The special symbols ?, +, and * denote an optional unknown letter, one or more unknown letters, and an arbitrary number of unknown letters (possibly none), respectively. The + and * symbols appear more frequently on the higher levels of play. If you are unclear about the meanings of these symbols, refer to the material near the top of this page.
English words having similar spellings often have completely different meanings. This 'feature' of the language can easily confound and confuse, and I have deliberately exploited it in creating many of the puzzles. The template th???ugh?, for example, matches four unrelated words having very similar spellings.
The definitions of words in the hints sometimes differ—for the most part in minor ways—from the definitions that you may find in other sources. The main reason for this is that limited space on the screen forces me to condense and abridge. For some words with multiple meanings, I've disagreed with the online dictionaries concerning the relative importance of the different meanings. Lastly, I have no patience with definitions that resort to euphemism or that are constrained by considerations of political correctness. Those who are learning the language, whether native or non-native, are encouraged to look up the meanings of words using an online dictionary such as dictionary.com, www.thefreedictionary.com, or merriam-webster.com, and to compare with the definitions that I've given.
To run the Python script on your computer:
Make sure that you have a working installation of Python on your computer. See the section in my Python material entitled "How to Get Started".
Verify that a spelling dictionary file containing one word per line exists either in the folder containing word_match.py or one level higher in the folder tree.
Verify that the file console_input.py exists either in the same folder as word_match.py or in a folder specified via the PYTHONPATH environment variable.
Depending on what operating system you are using, open a Windows or Linux command prompt and make the folder containing word_match.py the current folder.
To run word_match.py using the default spelling dictionary 'large.txt', and to be prompted for other inputs, issue the following command:
python word_match.py
To run word_match.py with a non-default spelling dictionary, issue a command of the following form:
python word_match.py template excluded dictionary
where 'template' is the hangman template, 'excluded' is the list of excluded letters (enclosing quotes are optional unless the list is empty, in which case the quotes are required), and 'dictionary' is the name of the dictionary to be used, e.g., 'small.txt'.
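For readers curious how a script might handle these inputs, the following is a rough sketch and not the actual contents of word_match.py. The function names parse_args, find_dictionary, and load_words are hypothetical. The sketch assumes only what the steps above state: three optional positional arguments, a default dictionary named large.txt, and a dictionary file located either beside the script or one folder higher.

```python
from pathlib import Path

DEFAULT_DICTIONARY = "large.txt"  # default dictionary named in the text


def parse_args(argv):
    """Split argv (without the script name) into template, excluded, dictionary.

    Missing values come back as None so that the caller can prompt for them.
    """
    template = argv[0] if len(argv) > 0 else None
    excluded = argv[1] if len(argv) > 1 else None
    dictionary = argv[2] if len(argv) > 2 else DEFAULT_DICTIONARY
    return template, excluded, dictionary


def find_dictionary(name, script_dir):
    """Look for the dictionary in script_dir, then one level higher."""
    script_dir = Path(script_dir)
    for folder in (script_dir, script_dir.parent):
        candidate = folder / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{name} not found in {script_dir} or its parent")


def load_words(path):
    """Read a dictionary file containing one word per line."""
    with open(path, encoding="utf-8") as f:
        return [line.strip().lower() for line in f if line.strip()]


print(parse_args(["_e_", "", "small.txt"]))  # ('_e_', '', 'small.txt')
```

The empty string in the last line corresponds to an empty list of excluded letters, which is why the quotes are required on the command line in that case.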
The Python code for this solver is surprisingly simple. For those interested in programming, the script operates as follows:
The word template is converted into a simple regular expression (regex), and the excluded characters are stored as a Python set.
The following operations are performed for each word in the designated spelling dictionary:
The current word is converted into a Python set.
If the intersection of the two sets (current word and excluded characters) is non-empty, the current word is rejected.
If the intersection of the two sets is empty, the current word is tested against the regex. If it matches, the word is added to a list of matching words.
The list of matching words is displayed.
(An alternative approach would have been to formulate a regex that incorporates the information about the excluded characters, rather than performing two separate tests.) A wide range of word puzzles can be solved using regexes; for more on this subject, see my page Solving Word Puzzles via Regular Expression Matching.
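A minimal sketch of the single-regex alternative follows. The idea is to have each unknown position match a character class containing only the non-excluded letters, so that one test suffices. The function name build_regex is hypothetical, and the word list is a made-up sample. A side effect of this formulation is that an excluded letter appearing in the template is still accepted at its own position, because known letters are matched literally.

```python
import re
import string


def build_regex(template, excluded):
    """Build one regex that encodes both the template and the excluded letters."""
    allowed = sorted(set(string.ascii_lowercase) - set(excluded.lower()))
    if not allowed:
        raise ValueError("every letter is excluded; nothing can match")
    unknown = "[" + "".join(allowed) + "]"  # one non-excluded letter
    quantifier = {"_": "", "+": "+", "*": "*", "?": "?"}
    parts = []
    for ch in template.lower():
        if ch in quantifier:
            parts.append(unknown + quantifier[ch])
        else:
            parts.append(re.escape(ch))  # known letter, matched literally
    return re.compile("".join(parts))


# Excluding every letter except g, n, and t is equivalent to the ^gnt example.
excluded = "abcdefhijklmopqrsuvwxyz"
pattern = build_regex("_e_", excluded)
sample_words = ["get", "net", "ten", "gee", "pet"]  # made-up sample
result = [w for w in sample_words if pattern.fullmatch(w)]
print(result)  # ['get', 'net', 'ten']
```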
29 Dec., 2018: There are now 1,000 Word Match puzzles.
24 Oct., 2015: There are now 900 Word Match puzzles.
1 Jan., 2014: I've completed a scrub of the database to resolve issues created by recent changes to the small spelling dictionary, and also fixed an issue with scoring.
1. http://en.wikipedia.org/wiki/Hangman_(game): Wikipedia article on hangman
2. "25 Best Hangman Words" (an analysis of the game of hangman)
Last update: 7 July 2024