As you may know, the newest Foldit update includes support for Lua Scripting. In very basic terms, Lua is a simple programming language that we can use to manipulate proteins and automate repetitive tasks -- much like we already do with the Cookbook.
Scope and Audience
This tutorial is designed to walk you through the process of writing and running a Lua script. Due to some feedback I received, I decided to gear the tutorial toward players who may have never written a program or script before. However, there will also be plenty of material that will be helpful for people with more advanced skills.
It is assumed that you have no prior knowledge of Lua or programming or scripting in general. All you need is some experience playing Foldit and using the in-game tools. You should be familiar with basic game vocabulary like “residue”, “wiggle”, “shake”, etc. Experience running cookbook recipes – or better yet, building your own recipes – is a plus, but not absolutely necessary.
Revisions and Attributions
Crashguard303 and Mat747 helped provide ideas for this tutorial. Crashguard303 has also already made some brilliant modifications to the "simple wiggle by ones" script I focus on in this tutorial. Mat747 has suggested (and I concur) that we split this tutorial into beginner and intermediate level tutorials -- see details on the talk page. I welcome any and all other suggestions, revisions, and edits to this tutorial and my scripts.
Working with Your First Lua Script
Loading the Script
To begin, load my sample Lua script called "simple wiggle by ones" into your game using the same method you use to load shared cookbook recipes (with your game running, click the Add to Cookbook button on the recipe's webpage). The script is a simple backbone walk. It doesn't do anything amazing or revolutionary, but it is a great first script to look at and modify. I've included lots of comments to explain what is going on in the code, as well as some verbose output that you can follow along with in the output window as the script runs.
Opening an Output Window
Go ahead and open that output window by clicking the "show recipe output" button on the recipe bar on the left side of the screen. Position the output window wherever you like so it doesn't obstruct your view of the protein.
Running the Script
Locate the Lua script you downloaded in your list of recipes. Hover over it and click the green arrow to run it. Watch what happens. It works pretty much as you would expect: it wiggles the first segment of the protein, then moves on to the next one, wiggles that, and so on until it reaches the end of the protein. While it runs, some text appears in the output window. That text tells you what is happening in the script moment by moment. It is a great way to see what is going on and (when you start writing your own scripts) to check that no errors are occurring. If an error does occur, you will get some text explaining what the problem is.
Looking at the Code
Next, you will want to start looking at the code in the script. Do this by hovering over the name of the script and clicking on the notepad icon ("Edit recipe"). You will see a plain text document.
A Lua Script in Depth: Comments, Loops, Commands, and Variables
There are four basic components you will see in the script: comments, loops, commands, and variables. Let's work through each one in turn.
Comments are pretty easy to pick out -- they begin with a "--". You can write a comment anywhere in a script (they can take up a whole line or part of a line), and although they're not required, they're a great way to explain what is happening in the code to others who might look at it. Other than that, they do absolutely nothing! Go ahead and take a quick glance at those comments now.
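To make the two comment styles concrete, here is a tiny sketch. It is not taken from the sample script; the variable name turns is made up for this illustration, and the only command used is print.

```lua
-- This whole line is a comment; Lua ignores it completely.
turns = 5  -- A comment can also follow code on the same line.

-- The line below still runs, because only the text after "--" is skipped.
print("wiggle turns:", turns)
```

If you delete every comment from this snippet, it behaves exactly the same.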
A second component is called a loop. Although there are many types of loops, in this script, we use a "for" loop that looks like this:
for seg=1, numSegs do ... end
A loop is a programming technique in which we tell the computer to loop (repeat) through a set of actions over and over again. In our case, we tell the computer to start at segment 1, do something to it, then move onto the second segment, do something to it, and so on, until it ends at the last segment in the protein. You can immediately see how this is going to be very useful in almost any type of protein folding script you can imagine writing! So a good way to get started when you finish this tutorial would be to take my script and put new commands inside the loop.
What is our for loop doing? The basic structure of the for loop is:
for [counter] = [something], [something else] do [some things] end
In Lua, the different parts of a for loop could be all on one line or separated onto multiple lines -- it looks nicer on several lines, so that's what I've done, but you can jam them all together if you want. Basically, we are telling the computer to do some things over and over again for a specified interval from something to something else, and then end. We'll worry about the details of how this works later, because we need to learn about commands and variables first. For now, it is enough to understand the basic idea of what this loop is doing.
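If you'd like to see that shape in action before we get to the details, here is a minimal sketch in plain Lua that has nothing to do with proteins yet. The names i and loopsRun are just placeholders I picked for this example.

```lua
-- Count from 1 to 5, printing the counter on each pass.
loopsRun = 0
for i = 1, 5 do
  print("this is pass number", i)
  loopsRun = loopsRun + 1
end

-- The same kind of loop jammed onto one line works identically.
for i = 1, 5 do loopsRun = loopsRun + 1 end

print("total passes:", loopsRun)
```

Both loops run five times each; the only difference is how they are laid out on the page.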
Now that you understand how that main loop is letting us do things to each segment in the protein in turn, let's figure out how we actually "do things"! If you take a look at the code inside of the for loop, what you will see is a list of commands:
deselect_all()
select_index(seg)
print(seg)
print(get_segment_score(seg))
do_global_wiggle_all(5)
A command is pretty much what it sounds like -- it is a statement that tells Foldit to do something in particular to the protein. For example, there are commands for shaking, wiggling, mutating, and banding the protein. There are also commands that don't really do anything except give us information, such as the score of your protein or the structure at a given segment. The developers of Foldit have already gone ahead and set up a bunch of commands for us to use (the complete list is here ) and what's really nice is that we don't have to worry about how they work. We just need to understand how to use them.
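Here is a rough sketch of the two flavors of command: one that does something and one that gives back information. The three function definitions at the top are stand-ins I wrote so the snippet can run outside the game; they are assumptions, not the real Foldit code, and the score they return is made up. Inside Foldit you would leave them out and the real commands would be used.

```lua
-- Stand-ins (assumptions) so this sketch runs outside Foldit.
function deselect_all() selectedSegment = nil end
function select_index(i) selectedSegment = i end
function get_segment_score(i) return i * 1.5 end  -- made-up score

-- Commands that DO something to the selection:
deselect_all()
select_index(3)

-- A command that only GIVES us information, which we can print:
print(get_segment_score(3))
```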
The easiest commands to use are the ones that can do one and only one thing. deselect_all() is a great example of this. If any segments have been selected, this command unselects them. That's all there is to it. Similarly, we can select_all() if we want. If you look at the list of commands, you'll find several examples just like this where there are empty parentheses. For example:
void deselect_all()
void select_all()
The only thing you need to do if you want to use any of this type of command is to remove the first word ("number", "string", "integer", or "void"). These words refer to the type of the command, which we don't need to worry about right now. So if you want to use get band count, it becomes:
get_band_count()
So we now know how to use those "easy" commands. Here is one that is slightly harder:
do_global_wiggle_all(5)
As you can no doubt figure out, this command wiggles the entire protein (similarly, we could have used do_global_wiggle_backbone(5) or do_global_wiggle_sidechain(5) instead). The number inside the parentheses is called a parameter. A parameter is any kind of information we want to give to a command to tell it how to run. In this case, it refers to the number of turns we want to wiggle the protein -- any whole number will work. If you don't put a number inside the parentheses, the result would be that the protein is wiggled forever ... or until you push the space bar! As before, note that when looking at the list, we have removed the first word, "void". We have also removed the word "integer" that is in front of the parameter.
Before using a command, make sure you figure out how many and what type of parameters you will need. For example,
void band_add_segment_segment(integer segment_index_1, integer segment_index_2)
places a band between two segments. We can see if we study this command for a moment that it needs two parameters -- segment_index_1 and segment_index_2. Basically, we are going to provide the command with two bits of information it needs to do its job -- where to start a band and where to end it. The word "integer" in front of each parameter (which we will remove before using the command) refers to the type of information that we provide. In this case, we need integers (whole numbers), but other times, we need numbers (a whole or decimal number) or strings (any set of alphanumeric data surrounded by quotation marks "like this"). In the end, when we fill in the parameters and remove the words that refer to type, we end up with something like this to place a band between segments 3 and 7:
band_add_segment_segment(3, 7)
A more versatile and interesting way of filling in those parameters we talked about is to use variables instead:
band_add_segment_segment(num1, num2)
What does this line of code do? Well, it places bands depending on what num1 and num2 are. Just like in algebra, num1 and num2 are variables -- they could be just about anything (and we could have named them just about anything, as well.) Somewhere in our program, before we use the line above, we need to assign these variables a value -- otherwise the command won't know what to do! How do we do that? We can say:
num1 = 3
num2 = 7
Easy enough, but it seems kind of pointless if we could just say band_add_segment_segment(3,7) instead.
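Even in this small case, variables let one line of code do different jobs. The sketch below assumes a stand-in version of band_add_segment_segment that only reports what it was asked to do, so it can run outside the game; the counter bandsPlaced is something I added for the illustration.

```lua
-- Stand-in (assumption) for the real Foldit command: it just counts
-- and reports the bands it is asked to place.
bandsPlaced = 0
function band_add_segment_segment(a, b)
  bandsPlaced = bandsPlaced + 1
  print("(stand-in) band from", a, "to", b)
end

num1 = 3
num2 = 7
band_add_segment_segment(num1, num2)   -- same as (3, 7)

-- Change the variables and the very same line does something new.
num1 = 10
num2 = num1 + 4
band_add_segment_segment(num1, num2)   -- now asks for (10, 14)
```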
Variables are actually quite powerful -- let's use a better example to see how. Let's return to the loop we set up earlier -- now we're in a better position to figure out how it works:
for seg=1, numSegs do
Here, we decided to make a variable called "seg" that will refer to what segment we are on in the protein (we could have called it "jabberwocky" or just about anything else we wanted to). In the for loop, we initialize it, which means we tell the computer what value it should represent initially. So when the program is started, seg equals 1. Each time the loop executes, seg is automatically incremented, meaning that it increases by 1. So the second time the loop goes around, seg is 2. The third time, it is 3, and so on. Like any variable, its value can change throughout the execution of the program. numSegs is another variable -- this time, we're using it to tell the for loop when to stop. When seg, which is increasing each round, gets big enough to be larger than whatever numSegs is, the for loop stops. We've already initialized numSegs in the previous line:
numSegs = get_segment_count()
Recognize that? Yes, we have used a command, get_segment_count(), that takes zero parameters. The command returns a value, which is the total number of segments in the protein, and that number gets assigned to the variable numSegs. How convenient: every time we run this script on a different protein, it will be able to do the exact same thing, no matter how large or small the protein is -- grab the total number of segments and then loop through each one starting with segment 1, segment 2, segment 3 ... and so on until the last segment.
seg is called our loop control variable or loop counter. Oftentimes, we might want to actually use it while we're inside the loop. For example, we have:
select_index(seg)
This selects whatever our current segment is. Or, we could do more elaborate things:
band_add_segment_segment(seg, numSegs/2)
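Putting the pieces together, the heart of the script looks roughly like the sketch below. The function definitions at the top are stand-ins so it can run outside Foldit: the pretend protein has 4 segments, the scores are made up, and the wiggle stand-in only records how many turns were requested. Inside the game you would drop those definitions.

```lua
-- Stand-ins (assumptions) for Foldit commands.
wiggled = {}                                     -- remembers each wiggle request
function get_segment_count() return 4 end        -- pretend protein: 4 segments
function get_segment_score(i) return 10 - i end  -- made-up scores
function deselect_all() end
function select_index(i) end
function do_global_wiggle_all(turns) wiggled[#wiggled + 1] = turns end

-- The actual pattern from the tutorial:
numSegs = get_segment_count()
for seg = 1, numSegs do
  deselect_all()
  select_index(seg)
  print(seg)
  print(get_segment_score(seg))
  do_global_wiggle_all(5)
end
```

Change the number returned by the get_segment_count stand-in and the loop adapts on its own, which is exactly the point of storing the count in numSegs.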
What does this do? It adds a band between each segment and the segment at the very middle of the protein. This might be a good way to try compressing the protein to maybe increase your points! You can do any kind of mathematical calculation with variables as part of a parameter or a statement. You can also assign one variable to another, compare variables, and do all sorts of other neat things:
bestScore = 0
for seg=1, numSegs do
  currScore = get_segment_score(seg)
  if (currScore > bestScore) then
    bestScore = currScore
  end
end
This snippet is beyond the scope of this tutorial, so don't panic if you can't totally follow it, but it might give you some ideas about all the things you can do with variables. In brief, we loop through the protein, get the score of each segment in turn, and then compare it to whatever is in the variable bestScore using an if statement. You can find out more about if statements in the Lua manual, but the basic form is if [some comparison] then [some stuff] end. The first loop through, bestScore is zero and currScore is the score of the first segment. Say the first segment's score is 5. Since 5 is greater than 0, bestScore is now 5. If the first segment score was, say, -5, then we would do nothing. The next time through, we get the second segment's score. Again, if it's bigger than our best score (5), we make IT our best score. If not, we do nothing. What is the result of looping through an entire protein? What would bestScore be? How could you use some of these techniques working with variables in your own script?
What next?
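If you want to check your answer, here is one way to try the idea outside the game. Everything at the top is an assumption for the sake of the sketch: the scores table holds made-up segment scores, and the two functions are stand-ins for the Foldit commands. I also added a second variable, bestSeg, to remember where the best score was found.

```lua
-- Stand-ins (assumptions) so the sketch runs outside Foldit.
scores = {5, -5, 12, 3}                      -- made-up segment scores
function get_segment_count() return #scores end
function get_segment_score(i) return scores[i] end

numSegs = get_segment_count()
bestScore = 0
bestSeg = 0          -- extra variable: WHICH segment had the best score
for seg = 1, numSegs do
  currScore = get_segment_score(seg)
  if currScore > bestScore then
    bestScore = currScore
    bestSeg = seg
  end
end
print("best segment score:", bestScore, "at segment", bestSeg)
```

One thing to watch for: because bestScore starts at 0, a protein whose segment scores are all negative would leave bestScore at 0 and bestSeg at 0.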
You should now be able to download a shared script, run it, look at its output, and also view its code. Hopefully, while you may not understand every line in the code, you have a pretty good general idea of how it works and how it uses loops, variables, and commands. Great, but what next?
Modify a Script
A great way to get started writing scripts is to modify the ones that other people have written. Of course, my simple wiggle by ones script should be pretty easy to play with and modify -- you can try making it wiggle by twos or threes, add a shake after each wiggle, or add a combination of shakes/wiggles after the loop is finished. I'm sure you can think up some other possibilities, as well. Other people's scripts online are also ripe for modification, so take a look at what scripts are shared online. Note that you will need to save these scripts under a new name. Feel free to share a modified script online, but it's always a good idea to attribute the original author(s) in the comments!
Write a Script
If you want to write a script from scratch, simply select the "new (Script)" button at the bottom of the scripting/cookbook window. You'll get a boring blank screen that you can fill with code to your heart's content. Lua is a simple language, so you should find that you can get a new script up and running right away. Even a single line with a command on it will run properly, so don't feel like you need to put together something elaborate with loops and other structures. It is possible to cut and paste from a plain text editor if you'd prefer to write your scripts there, but sometimes strange characters will get carried over into the Foldit script window, so be careful.
Additional notes
Debugging is a fancy term for "trying to figure out what the heck went wrong" and you're going to have to do it a lot if you start writing or modding scripts. Mistakes are easy to make, and sometimes a program doesn't behave exactly as you expect. Keep an output window open whenever you run a script so that you can see when and where an error occurs and get some information about the error. The most common situation is that you've missed a parenthesis or misspelled a variable name or are missing an "end" statement in your loop. The output window will yell at you, but computers are "stupid", so what it says is wrong may or may not make sense. Make note of the line number where the error is occurring and study that line carefully to see if you can find your mistake. Keep in mind that the real error may have happened somewhere else -- for example, sometimes a variable ends up not getting a value. In that case, the output window will tell you that an error is on the line where the variable is being used. But the point at which it was given the wrong value (or no value) in the first place is earlier in the code. Debugging can be a fine art, so don't give up if your code doesn't run right away. Relax and work through it line by line to find the problem or share your code with someone else -- perhaps they can find the error.
Print Commands
In my script you will notice several print commands, e.g., print(numSegs) or print("hi there!"). Print commands are a great way of giving your user (and yourself) helpful output as the script runs. They are also vital to debugging, because many times, you need to give yourself information about what is happening in the script to help you find that annoying error. For example, if you're debugging a loop, a print("loop number:", seg) inside the loop is helpful because it prints the current value of the loop counter each time the loop executes. Then you can find out at what exact point during the loop the script errors out. If there may be something wrong with a variable getting no value or getting the wrong value, put in something like print("num1", num1) right after the point where your variable num1 should be getting a value. Then you'll see what that variable is storing at that exact moment.
Library Functions
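As a rough sketch of this kind of print debugging, the snippet below labels every value it prints so the output window is easy to read. The two functions at the top are stand-ins with made-up values so the example runs outside Foldit, and the variable total is something I invented for the illustration.

```lua
-- Stand-ins (assumptions) so the sketch runs outside Foldit.
function get_segment_count() return 3 end
function get_segment_score(i) return i * 2 end  -- made-up scores

numSegs = get_segment_count()
print("numSegs", numSegs)              -- confirm the variable got a value

total = 0
for seg = 1, numSegs do
  print("loop number:", seg)           -- shows how far the loop got
  total = total + get_segment_score(seg)
  print("running total", total)        -- watch the value change each pass
end
```

If the script stopped with an error partway through, the last "loop number:" line in the output would tell you which pass it died on.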
Unfortunately, the developers have decided not to allow us access to standard library functions in Lua, although this may change in the future. So keep this in mind when you're looking at Lua scripts online or the Lua reference manual -- much of the code and commands will be unusable for us.
If statements and other conditionals should be added to this tutorial at some point, because they're very useful in any basic script. Check out the Lua manual or information online if you want to find out more for now.
Band Limits and Infinite Loops
There seems to be a limit on the number of bands you can create in a game. So be careful that your program does not create tons of bands -- you may crash Foldit. Likewise, it is possible to create something called an "infinite loop" that could lead to a crash. This happens when you create a loop that can never, ever end. The classic cause is messing around with your loop counter inside the body of the loop. For example, what if you had this line:
seg = seg - 1
in the body of your loop? In a while loop that counts seg upward by hand, the loop would never move onto the next segment of the protein and the loop would be infinite. A Lua for loop is more forgiving, because the Lua reference manual describes it as advancing its own internal copy of the counter, but after that line seg would no longer match the segment the loop is really on, so any commands that follow would act on the wrong segment. Either way, I'd recommend not making any changes to a loop counter inside the body of the loop. Using it is fine, but don't change its value. If you want to do some math with the loop counter, assign it to another variable and mess around with THAT variable instead.
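Here is a minimal sketch of that advice. The segment count is hard-coded as an assumption so the snippet runs outside Foldit, and the names prev and passes are made up for the example. Notice that seg is only ever read, never assigned.

```lua
numSegs = 5          -- pretend segment count (assumption for this sketch)
passes = 0
for seg = 1, numSegs do
  -- Do the math on a copy; the loop counter itself is left alone.
  prev = seg - 1
  if prev >= 1 then
    print("segment", seg, "follows segment", prev)
  end
  passes = passes + 1
end
print("passes:", passes)
```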