[In this reprinted article from Game Developer magazine's 2012 Game Career Guide issue, Ira Fay explains how game designers can use Microsoft Excel for data analysis.]
When people ask me what tools I use as a game designer, they're usually surprised when I tell them how much I use Microsoft Excel (or more recently, Google Docs). Excel can be a powerful data analysis tool for any level of game designer, and it takes some practice to master. Feed playtest results into your spreadsheet and suddenly you have all of the information you could ever want about your game. But it's important to remember that Excel can't work on its own: We need to analyze data thoughtfully. It's also worth remembering that Excel can help with text-based games. Here are some Excel tips and tricks for aspiring game designers with real-life examples taken from games that I've designed.
Microsoft vs. Google
First, let's talk about Microsoft Excel vs. Google Docs spreadsheets. While these two options are generally interchangeable, there are some key differences. Google Docs is free and cross-platform (it runs in your web browser), and it's great if you want multiple people to edit documents simultaneously. Unfortunately, you need to be connected to the Internet to create and save your documents, and it's not easy to track changes. Excel documents can be easily integrated into your version-control system, can be saved and edited without an internet connection, and include advanced features such as countifs() and sumifs(). But Excel isn't free, and multiple people can't work simultaneously on the same document.
In most cases, you can use either one, though it's useful to be aware of the differences so you can choose the tool that's best suited to your needs.
I won't spend time covering the basics of using spreadsheets, which you can learn in a few minutes on your own. I'll let you know how to use these functions: sum(), average(), max(), min(), and len(). Also, you should know how to sort your data ascending and descending and employ freeze rows and columns so you can always see row and column headers while scrolling your document. If any of that feels unfamiliar to you, spend a few minutes playing around or searching the Internet for answers.
Setting up your data
When starting a game project, I make sure to consider what data I'll need to gather. Plan to gather the right data early on and you'll ensure you have the right tools to balance the game later. When I was designing a casual bingo/lottery scratcher card game called Lottso Express (http://pogo.com/games/lottsoexpress), I worked with the game's engineer to output certain metrics into an Excel spreadsheet that was generated every day, filled with data from our beta testers. If you want to play along, you can find the raw Excel files at http://gdmag.com/resources/code.php.
For Lottso Express, I started by looking at the rank points (experience points) a player earned during a game, the game tokens (in-game currency) the player earned, and the jackpot tokens the player earned from a slot machine-like bonus at the end of some games. All these metrics were measured for a single game, which was recorded in the game length column-make sure to note the units! Are we measuring in seconds, minutes, or something else? I asked the engineer, and game length was being measured in seconds, so the entry in D8 (see Figure 1) is actually a single game that took more than five hours (18,965 seconds / 60 seconds in a minute / 60 minutes in an hour =~5.27 hours). Since this is a casual game designed to last 1 minute to 15 minutes, I knew something was wrong. I asked the engineer again, and we were actually measuring time from game start to game over. If players pause the game for a long time, we get very long game lengths, which throw off the metrics we're trying to balance.
Figure 1: Game metrics from a few sessions of Lottso Express
This brings up our first important Excel lesson: Don't blindly trust the data. Data is important, but you must use your powers of analysis and reason to determine the correct course of action. Also, you'll want to make sure you have good communication with the engineering team, since they're the people feeding you the data you need to do your job.
Analyzing your data
Now that we have some statistics that show how people are playing our game, we can start to figure out what we should fine-tune. The tuning goal for this game was 3,000 tokens per hour, with 2,500 coming from the game itself and 500 coming from the jackpot slot machine. The following formulas get us started:
Total game tokens earned (in B9): =sum(B2:B8)
Total jackpot tokens earned (in C9): =sum(C2:C8)
Total game length (in D9): =sum(D2:D8)
Instead of typing each formula by hand, only create the formula for game tokens (in B9, the cell below the current data). Then, select B9, C9, and D9, and use the Fill Right command (Control + R). You'll save yourself countless hours of typing if you learn and properly employ the Fill Right and Fill Down commands.
Now we can use the initial sums to calculate tokens per hour:
Game tokens per hour: =B9 / ( D9 / 3600) Result: 290
Jackpot tokens per hour: =C9 / ( D9 / 3600) Result: 82
It's important to remember the units of your data, which is why we must divide D9 by 3,600, since there are 3,600 seconds in an hour. Given those results, the game is paying out less than 400 tokens per hour, which is way off from our goal of 3,000 per hour. Of course, that analysis includes some really long games, which are skewing the results. Therefore, we can look more closely and exclude abnormally long games. To do so, I recommend the function sumif(), as follows:
Game tokens earned in games <30 minutes:=sumif(D2:D8, "<1800", B2:B8)
Jackpot tokens earned in games <30 minutes:=sumif(D2:D8, "<1800", C2:C8)
Total game length in games <30 minutes: =sumif(D2:D8, "<1800")
In this case, if you tried the Fill Right shortcut, Excel would properly change "B2:B8" to "C2:C8", but "D2:D8" would also change to "E2:E8", which would mess up the whole formula. Therefore, to tell Excel to keep the column "D" fixed even when using Fill Right, we add a dollar sign ($) in front of the D, like this:
Game tokens earned in games <30 minutes:=sumif($D2:$D8, "<1800", B2:B8)
With that formula, we could Fill Right and the formula would remain correct. Since this example only has a few columns, Fill Right doesn't matter too much since we could type three formulas by hand. When you're working with many columns of data, Fill Right becomes vital to avoid typing out tens or hundreds of slightly different formulas, when Excel could do it all for you with Control + R.
Now that we've eliminated the very long games, let's look at the results of the formulas to get our modified rate-of-tokens-per-hour: Game tokens come out to 1,543 per hour, and jackpot tokens come out to 580. We're still not up to 3,000 per hour, but we got a lot closer by intelligently updating our model of player behavior. To properly balance a game, you need accurate data and data analysis. One other thing to note with this example is that the sample size is very small and only contains seven entries. In a normal beta-testing situation, we would have hundreds or thousands of data points per day.
Although we tend to think of Excel as a tool for numeric data, I've also used Excel to manage text in many games. Recently, I was working on a vocabulary game where I had a dictionary of only singular words, and I needed plural words, too, so I created a table like the one in Figure 2.
Figure 2: You can use Excel to generate a list of plural words, which is useful for word games.
The data in Column A is taken directly from the dictionary file used in the game. The function in B2 is simply =right(A2, 4). Then I selected B2 through B9 and used the Fill Down command (Control + D). C2 contains =left(A2, len(A2)-4). The "-4" is in the formula because I want to split the word apart, leaving the last two letters (and quote and comma) separate.
Unfortunately, the rules for plural words in English are tricky. First, I started with something simple, like =LEFT(B2, 2)&"s", which simply appends an "s" to the end of the word. Then I realized I should detect words that end in "y," so I used an IF() statement, like this:
=IF(MID(B2,2,1)="y", LEFT(B2,1)&"ies", LEFT(B2,2)&"s")
After that, I realized I should actually append an "s" instead of "ies" if the letter preceding the "y" is a vowel. Therefore, I updated the formula to look like:
IF( OR(LEFT(B2,1)="o", LEFT(B2,1)="a", LEFT(B2,1)="u", LEFT(B2,1)="e", LEFT(B2,1)="i"),
As you can see, the actual algorithm for finding the plural spelling of an English word gets complicated quite quickly. Nonetheless, if you build your formula one step at a time, you'll be able to create powerful results. Start small, and work your way up in complexity. At some point, you may reach diminishing returns for rare corner cases, and it's okay to handle those by hand. You can see the word "child" turns into "childs," even though the correct plural form is "children." I simply dealt with that case by hand.
These are only a few of the ways you can use Excel as a game design and data mining tool. You can also create probability and game simulations, advanced text processing, complex asset tracking combined with easy integration into game scripts, and much more. For anyone who wants to practice their Excel skills, I created several Excel puzzles that are available here: http://irafay.com/Excel. Enjoy!