Written by Michele Pratusevich. When learning how to program, I always recommend having a project or application in mind. For me, it makes it easier to concentrate and learn specific skills and flex the critical thinking muscle. What better project to attempt than the latest game craze, Wordle? In this post-essay, I’ll take you through the steps required to write your own version of Wordle on the command-line without a GUI in Python.
If you’d rather, feel free to take this as an exercise / challenge yourself (definitely something like 7 chilis), then come back here to see how I did it. Alternatively, use this as a guide to code along, as you see my thought process. Note that the way I’ve chosen to structure the code is just one way this could have been implemented. If you go a different route, let’s discuss in the comments. Feel free to skip around!
Before starting any programming project, it is important to make a list of all the features you want it to have. This accomplishes two things: (1) gives you a “stopping point” to know when you’re done, and (2) lets you organize your thoughts to make sure your choices for how to implement cover all your use cases.
In our case, for a Python command-line clone of Wordle, here is our requirements and features list:
Gameplay
Game randomly selects a single 5-letter word from a word list to use as the “game word”.
There is a command-line display asking the player to enter guesses.
A guess is only valid if it is the same length as the game word.
All guesses should come from a word list. If the guess is not from the list, tell the player, and don’t accept the word as part of the number of guesses.
The player can enter words either in upper or lower case.
To quit the game early, the player should press CTRL-C. The reasoning behind this choice (rather than, for example, having the player type “quit” or “Q”) is that CTRL-C can never be a guessed word, whereas “QUIT” can be. This means that modifying the game to work with 4-letter words does not break the core game play.
The game will keep track of how many guesses the player has made and display that at the end, either when the word is guessed or when the game is quit.
Rules
A guessed letter that is correct and in the correct place displays a *.
A guessed letter that is correct but in the incorrect place displays a -.
All other guessed letters are returned with a blank _.
When the game word has two of the same letter, but the guessed word has one of that letter, the letter in the guessed word follows the rules for correct and incorrect placement. For example, if the game word is “STEER” and the guess word is “PLACE” then the “E” in “PLACE” is shown as -. If the game word is “STEER” and the guess word is “MONEY” then the “E” in “MONEY” is shown as *.
Letters guessed in the correct place are marked first, before any letters in the incorrect place are considered.
If the game word has one of a letter, but the guessed word has two, both in the wrong place, only the first letter in the guessed word is displayed as -.
If the game word has one of a letter, but the guessed word has two, one in the correct place, the correct letter is displayed as * and the second copy of the letter is unmarked.
Basic Design
For any command-line game, the basic program structure is a large infinite while loop that keeps “playing the game” until an end condition is met. According to our requirements list, the end conditions are:
The player guesses the word correctly.
The player exits by pressing CTRL-C.
So the first step is to structure our program with this basic scaffold (in a __main__ block). First we tackle the first condition. We don’t deal with selecting the word or processing guesses, so we create two dummy variables (WORD and GUESS), set them to specific strings, and continue adding on later.
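Here is a minimal sketch of that scaffold. The strings assigned to WORD and GUESS and the printed message are placeholders chosen for illustration, not the final game values.

```python
def main() -> None:
    # Placeholder values: word selection and guess handling come later.
    WORD = "TESTS"
    GUESS = "TESTS"

    while True:
        # End condition 1: the player has guessed the word.
        # (If the two strings differed, this loop would never end,
        # which is why we add a second end condition next.)
        if WORD == GUESS:
            print("You won!")
            break


if __name__ == "__main__":
    main()
```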
Now we can add in the second end condition (exit on CTRL-C). This is slightly more complicated, since in Python a CTRL-C press is delivered as a KeyboardInterrupt exception, so we wrap the existing loop in a try/except block to catch it.
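A sketch of the same scaffold with the CTRL-C handling added might look like this. As before, the values of WORD and GUESS and both printed messages are illustrative assumptions.

```python
def main() -> None:
    # Placeholder values, still hard-coded at this stage.
    WORD = "TESTS"
    GUESS = "TESTS"

    try:
        while True:
            # End condition 1: the player has guessed the word.
            if WORD == GUESS:
                print("You won!")
                break
    except KeyboardInterrupt:
        # End condition 2: pressing CTRL-C raises KeyboardInterrupt,
        # which lands here instead of printing a traceback.
        print("You exited the game.")


if __name__ == "__main__":
    main()
```

Wrapping the whole loop, rather than a single statement inside it, means the player can quit at any point in a round.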
Now that we have the structure of the main loop, we can add in the mechanics for player interaction, namely, taking player guesses.
Player Guesses
One feature of the game play is that we need some error checking / handling on a player’s guesses. Because this is a functionality that can be independently tested and verified, processing and getting a user guess is a great candidate for putting inside of a function. The heading of this function will look like this:
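One plausible form of that heading, sketched with the typing library, is below. The parameter names wordlen and wordlist are assumptions, and the body is left as a stub for now.

```python
import typing


def get_user_guess(wordlen: int, wordlist: typing.Set[str]) -> str:
    """Keep asking the player for a guess until a valid one is entered."""
    ...  # body to be filled in once validation exists
```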
The type annotation is natively supported in Python 3, as seen in the official documentation for the typing library. What our function signature tells us is that the function is going to get the user guess. It takes in two inputs: (1) the length of the expected game word (which is an int), and (2) the set of available guess words as strings. The function will return a single string, which is the player’s guess. Because we return the string of the guess from this function, all the logic around validating a guess, displaying to the player why the guess is invalid, and asking the user to input a new guess, is contained in this function. Because we may need to ask the user multiple times to enter a guess (if, for example, they keep entering a 4-letter word when the game word is 5 letters), we use another while loop here for the structure.
The choice I’ve made here is to abstract the “validation” logic into yet another function. This is because validating a guess is a repeatable action we take on a word that can be independently tested. Let’s write that validation function now:
Our validation function checks a guess against whatever validation rules we want. It takes in a guess as a string (any string), the expected length of the game word, and the set of valid guess words. What the validation function returns is a tuple containing the error and the guess word. If there is no error, None is returned for the error. This is again a design choice - some programmers will not return the guess word back, and rely on the fact that if None is returned for the error, the initial guess word is fine. However, the choice here of returning the guess word is a form of filtering & validation. Because we listed in the requirements that guesses can be entered in uppercase or lower case, we want to standardize all the guesses into uppercase. This means that somewhere we will have to convert the guess into all uppercase letters. What better place to do this than the same function where we are validating our guesses?
The choice for why our wordlist variable is a set comes out of this function as well. Inside the validation function, we are checking whether the guess is a member of the valid wordlist - this operation is cheap to do from a set, so we choose to take in our wordlist as a set of strings.
The implementation of the function then looks like this:
Note that all the return statements return a tuple: the first element is an error string (or None if we get all the way to the end), and the second is the guess converted into uppercase.
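A minimal sketch of such a validation function follows. The length message matches the one quoted in the aside on f-strings; the wording of the second error message is an assumption, as is the convention that the word set already holds uppercase words.

```python
import typing


def validate(
    guess: str, wordlen: int, wordlist: typing.Set[str]
) -> typing.Tuple[typing.Optional[str], str]:
    """Return (error, guess in uppercase); error is None for a valid guess."""
    guess_upper = guess.upper()
    if len(guess_upper) != wordlen:
        return f"Guess must be of length {wordlen}", guess_upper
    # Assumes every word in wordlist is already uppercase.
    if guess_upper not in wordlist:
        # Hypothetical message text, chosen for this sketch.
        return "Guess must be a valid word", guess_upper
    return None, guess_upper


words = {"TESTS", "STEER"}
print(validate("steer", 5, words))  # (None, 'STEER')
print(validate("ste", 5, words))    # ('Guess must be of length 5', 'STE')
```

Every branch returns the same two-element shape, so the caller can always unpack the result the same way.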
Aside: The syntax in the return statement with the f"Guess must be of length {wordlen}" uses Python 3’s built-in f-strings. They are called f-strings because it stands for “formatted string literals.” It is a special syntax to turn variables easily into strings using the curly braces around the variable name, with lots of formatting options depending on what is desired. The Python documentation has a few examples.
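As a quick illustration of the syntax, here is a small standalone example. The ratio variable is made up for the demonstration.

```python
wordlen = 5
message = f"Guess must be of length {wordlen}"
print(message)  # Guess must be of length 5

# Format specifiers go after a colon inside the braces.
ratio = 2 / 3
print(f"{ratio:.2f}")  # 0.67
```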
Now that we have a validation function, a user input function, and a main loop, our code so far looks like:
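The three pieces can be assembled roughly as follows. This is a sketch under a few assumptions: the prompt text, the messages, and the choice to wrap the main loop in a hypothetical play() function (instead of putting it directly in a __main__ block) are made here for illustration.

```python
import typing


def validate(
    guess: str, wordlen: int, wordlist: typing.Set[str]
) -> typing.Tuple[typing.Optional[str], str]:
    """Return (error, guess in uppercase); error is None for a valid guess."""
    guess_upper = guess.upper()
    if len(guess_upper) != wordlen:
        return f"Guess must be of length {wordlen}", guess_upper
    if guess_upper not in wordlist:
        return "Guess must be a valid word", guess_upper
    return None, guess_upper


def get_user_guess(wordlen: int, wordlist: typing.Set[str]) -> str:
    """Keep asking until the player enters a valid guess, then return it."""
    while True:
        guess = input("Guess: ")
        error, guess = validate(guess=guess, wordlen=wordlen, wordlist=wordlist)
        if error is None:
            return guess
        # Tell the player what was wrong and ask again.
        print(error)


def play(word: str, wordlist: typing.Set[str]) -> None:
    """Main loop: ends on a correct guess or on CTRL-C."""
    try:
        while True:
            guess = get_user_guess(len(word), wordlist)
            if guess == word:
                print("You won!")
                break
    except KeyboardInterrupt:
        print("You exited the game.")
```

Calling play("TESTS", {"TESTS", "STEER"}) starts a round with a hard-coded game word. Invalid guesses never leave get_user_guess, so the main loop only ever sees validated, uppercase words.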
At this point, we have a game where player input is continually read and validated! Next up, displaying information back to the player so the game is actually fun!
Displaying Guesses
Now we get to an interesting coding step: we need to take the guess that we have taken from the user, parse it, and display the information back to the user in a way that makes the game fun. There are two types of information that can be displayed back to the user: guessed letters in the correct place, and guessed letters in the incorrect place. We implement these in that order, to make sure we got it right!
Like always, we start with the function signature:
What we’ll do is take in the game word (in the expected variable) as a string, and the (parsed and validated) guess word (in the guess variable) as a string. This function will do all the character comparison, and output a list of strings denoting the state of each letter in the guess. Per our requirements set up in the previous section, the characters in the output mean the following: (a) _ is a blank - no information is known about this letter; (b) * is a guessed letter that is in the correct place in the game word; (c) - is a guessed letter that is correct but in the incorrect place in the game word.
We return this parsed list with symbols rather than do the printing directly in this function to separate concerns. If we ever want to change how the display is done in the command-line (for example, add emojis, or add colors, etc.) then we can use the returned parsed list to generate an output.
The first step in writing the function is to set up the output, and we will tackle the easier part of answer parsing: characters in the guessed word that are in the correct place in the game word.
Since this is a tricky function in the Wordle implementation, and the one that most directly affects gameplay, let’s start with a set of test cases that we can call on our function to make sure we have all the cases right!
Function call                  Expected Output
compare("steer", "stirs")      * * _ - _
compare("steer", "floss")      _ _ _ - _
compare("pains", "stirs")      _ _ * _ *
compare("creep", "enter")      - _ _ * -
compare("crape", "enter")      - _ _ _ -
compare("ennui", "enter")      * * _ _ _
These test cases cover a range of outputs and cases that we expect this function to accomplish, both “normal” behavior and a few edge cases that are tricky. What we’re doing here is a miniature version of a software engineering concept called Test Driven Development (TDD). In TDD, the idea is that before you actually write your code, you think about how you are going to test your code, and oftentimes, write the tests first. That way, the verification for “does my code do what it needs to” is already done.
So, let’s begin! First we implement the logic for checking whether guessed characters are in the correct place. This is done (per the rules requirements above) first before the other characters, and those correctly-guessed letters cannot be double-counted. Here is our implementation:
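A sketch of this first pass is below. It assumes the two words have the same length (guaranteed by validation) and that compare returns one symbol per letter; the loop variable names are chosen for this example.

```python
import typing


def compare(expected: str, guess: str) -> typing.List[str]:
    """First pass only: mark letters that are in the correct place."""
    # Start with every position blank.
    output = ["_"] * len(expected)

    # Walk both words in lockstep and mark exact matches.
    for index, (expected_char, guess_char) in enumerate(zip(expected, guess)):
        if expected_char == guess_char:
            output[index] = "*"

    return output


print(" ".join(compare("steer", "stirs")))  # * * _ _ _
```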
At this point if we call this function with our test cases, we will see that we correctly mark the correctly-guessed characters!
Function call                  Output so far
compare("steer", "stirs")      * * _ _ _
compare("steer", "floss")      _ _ _ _ _
compare("pains", "stirs")      _ _ * _ *
compare("creep", "enter")      _ _ _ * _
compare("crape", "enter")      _ _ _ _ _
compare("ennui", "enter")      * * _ _ _
Now we move on to the more challenging part: checking for guessed letters that are in the game word but in the incorrect place. The way we are going to do this is as follows:
First account for all the letters guessed in the correct positions (done above)
Keep track of all the character indices in the game word that have already been “accounted for” by letters from the guessed word. Add all the positions of correctly-guessed letters into this set.
For each guessed character (excluding characters that have already been marked as correctly-positioned guesses), find all the corresponding indices in the game word for that character.
If that position is already accounted for (i.e. is present in the set of accounted-for letters), keep checking for other positions of that character. Once you find a position that has not yet been accounted for, mark the guess with a -, add that position to the accounted-for set, and move on to the next guessed character.
Note that we have to introduce a new structure here to capture the game state: we need to keep track of which positions in the game word have already been matched by a guessed letter (whether that letter was in the correct or incorrect place). The reason for this is to take into account the potential for the game word and guess word to both contain repeated letters. If a letter is repeated in the guess word, each copy can only be matched against a letter in the game word that has not already been accounted for. That is, if the game word has one “R” but the guess word has two “R”s, then we at most display information back to the user for a single “R” in the guess word (whether that is in the correct or incorrect place). Our test cases listed above cover these gameplay cases.
In this design, we need to write one helper function. Namely, we need to find all the positions of a given letter in the game word. There isn’t any pre-built Python function to do this, so we need to build our own. We base our helper function on the built-in string method .find(), which returns the first index of the desired character in the string (or -1 if the character is not present). What we want is ALL the positions, so we wrap this call in a loop, and take advantage of the fact that we can specify where in the target word the search should start.
We make sure to return a sorted list of positions in the game word as a list of integers, since this is how we expect to use the output of this function for gameplay. The neat thing here is that we do not need to call sorted() on our output list positions because the way we call .find() is guaranteed to return the positions in increasing order.
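One way to write this helper is sketched below. The parameter names word and char are assumptions; the function name is the one used in this post.

```python
import typing


def find_all_char_positions(word: str, char: str) -> typing.List[int]:
    """Return every index at which char occurs in word, in increasing order."""
    positions = []
    pos = word.find(char)
    # str.find returns -1 once there are no more matches.
    while pos != -1:
        positions.append(pos)
        # Resume the search just past the match we found.
        pos = word.find(char, pos + 1)
    return positions


print(find_all_char_positions("steer", "e"))  # [2, 3]
print(find_all_char_positions("steer", "z"))  # []
```

Because each search starts one position past the previous match, the indices are appended in increasing order.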
So with the find_all_char_positions function built, our code now looks like this:
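A sketch of the complete comparison, following the steps listed above, could look like this. The counted_pos name comes from the discussion of the data structure; the other local names are assumptions made for this example.

```python
import typing


def find_all_char_positions(word: str, char: str) -> typing.List[int]:
    """Return every index at which char occurs in word, in increasing order."""
    positions = []
    pos = word.find(char)
    while pos != -1:
        positions.append(pos)
        pos = word.find(char, pos + 1)
    return positions


def compare(expected: str, guess: str) -> typing.List[str]:
    """Return one symbol per guessed letter: '*', '-' or '_'."""
    output = ["_"] * len(expected)
    # Positions in the game word already matched by some guessed letter.
    counted_pos: typing.Set[int] = set()

    # Pass 1: letters in the correct place.
    for index, (expected_char, guess_char) in enumerate(zip(expected, guess)):
        if expected_char == guess_char:
            output[index] = "*"
            counted_pos.add(index)

    # Pass 2: correct letters in the incorrect place.
    for index, guess_char in enumerate(guess):
        if output[index] != "_":
            continue  # already marked as correctly placed
        for pos in find_all_char_positions(expected, guess_char):
            if pos not in counted_pos:
                output[index] = "-"
                counted_pos.add(pos)
                break  # each guessed letter claims at most one position

    return output


print(" ".join(compare("creep", "enter")))  # - _ _ * -
```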
We choose a set as the data structure (in the counted_pos variable) for keeping track of the counted positions, since the ordering of the positions doesn’t matter, and the most common operation we do on this structure is checking whether the variable pos is contained in it, which makes a set an appropriate data structure.
Now we can check against our test cases to confirm we got it right:
Function call                  Output
compare("steer", "stirs")      * * _ - _
compare("steer", "floss")      _ _ _ - _
compare("pains", "stirs")      _ _ * _ *
compare("creep", "enter")      - _ _ * -
compare("crape", "enter")      - _ _ _ -
compare("ennui", "enter")      * * _ _ _
And there we go! A comparison function for parsing user guesses against a game word. Now we plug this into our main loop for some gameplay!
And now we have a playable game!
Generating the Wordlist
Of course, one “trick” we did with our implementation so far is hard-code the game word. This makes for a pretty boring game, so the next step is to generate some word lists that are more fun!
There is no single “right way” to generate a word list. Because of a previous exercise on this blog, I had a local copy of SOWPODS, the official Scrabble word list. However, as you may know, some Scrabble words are a bit obscure and not very fun for gameplay. Besides, SOWPODS has words of many lengths, whereas in our version of Wordle we want to limit ourselves to 5-letter words.
The additional detail about word lists in Wordle is that there are actually two word lists. One word list for game words (which has fewer words, and contains more “common” words), and a separate word list for validating guesses, which contains the game word list and a bunch of other more “obscure” words. I don’t know for sure how the original implementor of Wordle generated these lists, but I have to assume some amount of hand-curation was done.
We on the other hand don’t need to hand-curate our list (unless you want to for your own implementation). Instead, we look at the page source for Wordle and extract the lists from that code. I won’t go into exactly how I did that here, since I didn’t use Python for it! We save these as two separate word lists (gamewords.txt and guesswords.txt) that we can load into our game. One helper function we will write (since it will be used twice, once for each list of words), is a create_wordlist() helper function that loads the word lists into a list and makes sure all the words are upper case.
We actually implement this as two functions: one for filtering/modifying words, and one for ingesting the word list in the first place. This is because if we ever want to expand the word lists to contain words of multiple lengths, we don’t need to maintain separate word lists for all the lengths; we can simply put them in one list, and filter the random game/guess words in real time during game play.
First we implement a filter_word() function that takes in a word and any filtering criteria. It returns back the word modified (in our case converted to upper case). If we decide to add new functionality into our Wordle implementation (for example, limiting words to only start with “T” or something else), then that functionality would be added into this function.
Now when we implement our create_wordlist() function, we can make use of a Python built-in function called map. I have not yet written an exercise specifically about this function, so I’ll briefly touch on it here. The purpose of map is to apply the same function to every element of an iterable (list, dictionary, set, and so on). In Python 3 it returns a lazy iterator over the results rather than a new collection, so we wrap the call in list() (or set()) when we want a concrete collection back. In our case this works out perfectly: what we want to do is take in a list of words that we loaded from a text file, and we want to make them all uppercase. More specifically, we want to apply our filter_word function to each element of the word list.
The end result is that when we call the create_wordlist() function on our word list, it converts all the words properly to upper case and filters out any undesirable words.
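The two functions might be sketched as follows. The length parameter, the choice to return None for rejected words, and the one-word-per-line file format are assumptions made for this example; only the function names come from the text.

```python
import typing


def filter_word(word: str, length: int) -> typing.Optional[str]:
    """Return the cleaned, uppercased word, or None if it has the wrong length."""
    cleaned = word.strip().upper()
    return cleaned if len(cleaned) == length else None


def create_wordlist(fname: str, length: int) -> typing.List[str]:
    """Load one word per line from fname, keeping only words of the given length."""
    with open(fname, "r") as f:
        # map is lazy, so the results are consumed while the file is still open.
        candidates = map(lambda w: filter_word(w, length), f)
        return [word for word in candidates if word is not None]
```

With the files described earlier, create_wordlist("gamewords.txt", 5) would return the game words as an uppercase list, and wrapping the guess list in set(...) gives the cheap membership check used during validation.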
At this point we are ready to put it all together!
Putting it All Together
Now we just add a few bells and whistles to the text of what the user sees during gameplay (the number of guesses taken, for example), and voila! We have a full Wordle implementation in Python!
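Assembled into one place, the game logic might look like the sketch below. It is an illustration under assumptions: the message wording, the prompt, and the hypothetical play() function that takes the two word collections as arguments are choices made here. Loading the word files is left to the caller.

```python
import random
import typing


def validate(
    guess: str, wordlen: int, wordlist: typing.Set[str]
) -> typing.Tuple[typing.Optional[str], str]:
    """Return (error, guess in uppercase); error is None for a valid guess."""
    guess_upper = guess.upper()
    if len(guess_upper) != wordlen:
        return f"Guess must be of length {wordlen}", guess_upper
    if guess_upper not in wordlist:
        return "Guess must be a valid word", guess_upper
    return None, guess_upper


def get_user_guess(wordlen: int, wordlist: typing.Set[str]) -> str:
    """Keep asking until the player enters a valid guess, then return it."""
    while True:
        error, guess = validate(input("Guess: "), wordlen, wordlist)
        if error is None:
            return guess
        print(error)


def find_all_char_positions(word: str, char: str) -> typing.List[int]:
    """Return every index at which char occurs in word, in increasing order."""
    positions = []
    pos = word.find(char)
    while pos != -1:
        positions.append(pos)
        pos = word.find(char, pos + 1)
    return positions


def compare(expected: str, guess: str) -> typing.List[str]:
    """Return one symbol per guessed letter: '*', '-' or '_'."""
    output = ["_"] * len(expected)
    counted_pos: typing.Set[int] = set()
    for index, (expected_char, guess_char) in enumerate(zip(expected, guess)):
        if expected_char == guess_char:
            output[index] = "*"
            counted_pos.add(index)
    for index, guess_char in enumerate(guess):
        if output[index] != "_":
            continue
        for pos in find_all_char_positions(expected, guess_char):
            if pos not in counted_pos:
                output[index] = "-"
                counted_pos.add(pos)
                break
    return output


def play(gamewords: typing.List[str], guesswords: typing.Set[str]) -> None:
    """Pick a random game word and loop until it is guessed or CTRL-C is pressed."""
    word = random.choice(gamewords)
    num_guesses = 0
    try:
        while True:
            guess = get_user_guess(len(word), guesswords)
            num_guesses += 1  # only validated guesses are counted
            print(" ".join(compare(word, guess)))
            if guess == word:
                print(f"You won! It took you {num_guesses} guesses.")
                break
    except KeyboardInterrupt:
        print(f"You exited the game after {num_guesses} guesses.")
```

Both word collections are expected to hold uppercase words, and the guess set must include every game word so that the winning guess passes validation.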
When you run this code through the terminal, a sample gameplay looks like this:
Related Exercises
This website contains a few exercises that are relevant to the skills required to code up Wordle. Here are a few, but feel free to get started anywhere!