Research Chat Room

The Research Chat Room is a globally-accessible room/channel on the Fold.it chat servers that is open to all fold.it members interested in learning more about biochemistry and exploiting what we learn to develop new folding strategies. Drop in any time!

What's been going on

So, for those of you wondering what we've been up to, the following is a summary of everything that happened before May 28, 2011:

Puzzle 418

We started off collaborating on Puzzle 418 “Back Me Up”, whose description noted that “We designed a helix to contact the ligand, but the helix needs some back up to stay in place. You can mutate many of the residues to create a more stable configuration and make a more active catalyst (ie create a Helix-Loop-Helix)”. "While it's possible that the best one will be a helix-loop-helix like the starting structure, we're also very interested in finding out if the Foldit community can come up with an even better geometry. So if you find something that gives you a better score, go for it." The ensuing discussions focused on trying to work out the structural characteristics required to facilitate a Diels-Alder reaction. The discussions drew upon the following links:

https://secure.wikimedia.org/wikipedia/en/wiki/Basic-helix-loop-helix

http://www.rsc.org/chemistryworld/News/2010/July/15071002.asp

http://www.youtube.com/watch?v=i-gUtPwgi3E&feature=related

http://structure.med.miami.edu/bmb615_rna_notes.html

http://www.mskcc.org/mskcc/html/63118.cfm

http://www.ncbi.nlm.nih.gov/pubmed/17846637

http://www.youtube.com/watch?v=M8UiLDK5Qzk

http://www.rcsb.org/pdb/images/1x0o_asym_r_500.jpg

http://www.rcsb.org/pdb/explore/explore.do?structureId=3F1P

http://jchemed.chem.wisc.edu/JCESoft/CCA/CCA5/MAIN/1ORGANIC/ORG06/DIELS/MENU.HTM

http://en.wikipedia.org/wiki/Diels%E2%80%93Alder_reaction

http://fold.it/portal/files/chatimg/irc_192195_1305158578.png

“The diene component in the Diels–Alder reaction can be open-chain or cyclic and it can have many different kinds of substituents. There is only one limitation: it must be able to exist in the s-cis conformation.” However, the cis angle usually leads to clashing in foldit, except for glycine (‘g’) segments, which are much more flexible. "The reaction occurs via a single transition state, which has a smaller volume than either the starting materials or the product." By increasing the levels of catalytic activity around the ligand, the fold.it scientists are hoping to make this enzyme react very "specifically" to this ligand only, sort of like a better "lock" for the ligand. The idea is that the ligand is the key that unlocks the "operation" of the enzyme, so the lock must be very specific. Some ligands act like batteries: they give the enzyme or protein energy. It looks like it might be important to align areas along the molecule or backbone that are electron deficient (e.g. red oxygen atoms) with areas along its complementary backbone or molecule that are electron rich (e.g. grey/white carbon atoms). There has also been discussion of "pi orbitals", or conjugation, which is the overlap of one p-orbital with another across an intervening sigma bond (in larger atoms d-orbitals can also be involved), the same as what occurs (I believe) in a sigma-cis angle.

PFAAT

http://www.proteinscience.org/details/journalArticle/117017/A_role_for_surface_hydrophobicity_in_proteinprotein_recognition_.html

http://pfaat.sourceforge.net/

http://www.rosettacommons.org/

"Pfaat is a Java-based tool for multiple sequence alignment visualization, editing, annotation and interaction with phylograms and 3-D structure. We invite all to download the application and source code, which is an open-source project under the GNU General Public License. A manuscript describing Pfaat was published in the March 2003 issue of Bioinformatics. In October 2004 we completed a significant update to Pfaat. The current version (and all future updates) are now made available using Java Web Start technology. The updated functionality in PFAAT was covered in BMC Bioinformatics in 2007. We use Pfaat extensively in our research to analyze functional regions of proteins, such as protein-protein interfaces and residues involved in ligand recognition."

We should start examining the flow of electrons across the channel and see what's happening from an electronic standpoint. The negative voltage potential across a channel facilitates opening of the gate. Dude, this stuff is EE101. fold.it has superior visualization of molecular modeling, but its physics still needs minor accuracy tweaks and maybe more tools for EEs.

Iterations

What does the “1” represent in do_shake( 1 )? The documentation says it runs shake for 'iterations'. Is that equivalent to running the shake icon for one "tick" of the in-game clock? No, it's not time-based: that clock is just an animation, and I seriously doubt it's calibrated to anything. Is it time-based? Maybe. It goes faster with smaller proteins. Testing on 421: 17 ticks (maybe 16) for do_shake(1), 27 for do_shake(2), so it doesn't seem to be linear. I'm thinking the animation is time-based: clicking cancel pends until that iteration finishes.
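The two tick counts quoted above can be checked with a little arithmetic. The sketch below is plain Python rather than a Foldit recipe, and it assumes the single observations from the chat (17 and 27 ticks) are representative. It compares a "ticks proportional to iterations" guess against a straight line with a fixed overhead.

```python
# Tick counts reported in the chat for puzzle 421 (one observation each).
ticks = {1: 17, 2: 27}  # iterations -> clock ticks

# If ticks were proportional to iterations, do_shake(2) would take 2 * 17 ticks.
proportional_guess = 2 * ticks[1]

# Two points always fit a line: ticks = overhead + per_iteration * n.
per_iteration = ticks[2] - ticks[1]
overhead = ticks[1] - per_iteration

print(proportional_guess, per_iteration, overhead)
```

With only two data points this cannot show whether the relationship is truly a straight line; a third measurement, such as do_shake(3), would be needed to tell a fixed overhead apart from genuine nonlinearity.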

Pivot point

Pressing q while the mouse is over a segment makes that segment the center of rotation.

Ubiquitin and Puzzle 420

I've seen some pdb figures that have the ligand only interact with the lysine tail of ubiquitin. It juts right out in the air. So what does ubiquitin mean? Ubiquitin is the entire thing that marks proteins for deletion. It forms in chains.

http://en.wikipedia.org/wiki/Ubiquitin

http://web.mit.edu/tokmakofflab/pp.htm

One item: they say they understand *how* ubiquitin unfolds (step-by-step).

Note author http://www.ncbi.nlm.nih.gov/pubmed/12142448

Check out the ubiquitylation system pic. It employs different intermediates to form protein chains, then shepherds them to the proteasome for proteolysis (deletion). So on 420 we were trying to do the same kind of thing? The ligand represented the C-terminus of Ebola, so Ebola would be the chain.

Was the ligand meant to be fixed, or was it meant to be changed in shape? They will lock it if they require it to be fixed, but I don't exactly think they wanted us rebuilding it. In one solution I changed it to a sheet and it made nice bonds.

Does someone here have the time to check the AA sequence I have for 420? "FLPKGWEVRHAPNGRPFFIDHNTKTTTWEDPRLKITAPPEYMEA". If that is right, here is the predicted secondary structure (3 coils): http://bioinf.cs.ucl.ac.uk/psipred/result/248810 but I need someone to double-check the letters.
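Part of that double-check can be automated. The sketch below is plain Python (not a Foldit recipe) and only confirms that every letter is one of the 20 standard one-letter amino acid codes and reports the length. It cannot confirm that the letters match the puzzle itself, so the function name and approach are purely illustrative.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard one-letter codes

def check_sequence(seq):
    """Return (length, [(position, letter), ...]) listing non-standard letters."""
    seq = seq.strip().upper()
    bad = [(i + 1, ch) for i, ch in enumerate(seq) if ch not in STANDARD_AA]
    return len(seq), bad

seq_420 = "FLPKGWEVRHAPNGRPFFIDHNTKTTTWEDPRLKITAPPEYMEA"
length, bad = check_sequence(seq_420)
print(length, bad)
```

An empty list of bad letters means the string is at least a well-formed sequence; comparing the reported length with the segment count shown in the puzzle is a quick second check.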

http://www.rcsb.org/pdb/results/results.do?gotopage=1&qrid=3759E41F&tabtoshow=Current

http://www.rcsb.org/pdb/explore/explore.do?structureId=3OLM

http://www.rcsb.org/pdb/explore/explore.do?structureId=3L4H

http://bioinf.cs.ucl.ac.uk/psipred/result/248811

http://bioinf.cs.ucl.ac.uk/psipred/result/248812

http://bioinf.cs.ucl.ac.uk/psipred/result/248815

http://bioinf.cs.ucl.ac.uk/psipred/result/248816

http://bioinf.cs.ucl.ac.uk/psipred/result/248817

http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=2jmf

Standalone version of fold.it for academics/nonprofit research

I just got a standalone version of foldit: I can make my own peptides or download PDBs in a different program and then work on them in that program! How did you do that??? Through the University of Washington!

http://depts.washington.edu/uwc4c/express-licenses/assets/foldit/

But most of us are not academic, gov or non-profit organizations? I know, register as other. Go to the part that says academic license and click the register button. Institution/Organization -> put *insert last name here* Folding. YOU MAY NOT DISTRIBUTE THESE FILES or use them for commercial purposes. They're a little protective... And I recommend downloading BALLView to make your PDBs. Should I get v1.4 or 2.0 of BALLView? Either one; 1.4 is stable but 2.0 has more features. So 2.0 then :P Whoa... you can set ci to more than 1. It goes up to two! And there's a Lua terminal! Can we load our puzzle in that, export it as a PDB, play a bit outside and load it back? Apparently not, it can't load from saves. Does it work faster than foldit? I mean the modelling? It's decent, sort of laggy if you have other apps open. I only use it to make the peptides, but there's an energy minimization feature in it. One problem with the standalone version: it doesn't support macros, it only has a terminal. Hmm, no need if I can output to the normal foldit client. Is there no copy-paste to the terminal? I can see that going badly... This client uses actual energy scoring: the lower the score, the better. That might give scripts trouble, but that’s easily fixed. OK, waiting for BALLView, will put a de novo sequence into it. A few problems I've noticed with it: if you're working with a PDB file that has a set AA sequence, you can still mutate it, and the option to lock the sidechains and the number of residues doesn't work. Still not sure what would happen if you pasted in, say, the code for vk or acid tweeker. I intend to try that soon.

You know the standalone foldit version allows folders to import any PDB protein. My question is: why isn't the ability to do that given to the public? It seems like the devs don't really care about the science, throwing us one measly protein at a time. Wouldn't the science be better served if we could import any protein and fold it to the best of our abilities, like in the academic version? No, I disagree. I think they get far more concentrated information from doing it this way. Also, they keep interest up among the players... more players gives a LOT more data. I see. Only a limited number to go around. Well, maybe they could keep the current form of foldit but also release the standalone version. If we had whole libraries to play with at any time we wanted, then we would fiddle around slowly and the overall knowledge would grow, but the time they would have to wait for a useful amount of data per individual protein would be forever. Maybe they could provide a library of, say, 50 PDB files, give us 6 months to work on them, and we pick and choose which ones we do. A compromise solution like that? Maybe.

Kinky Proteins

"Analysis of the code relating sequence to conformation in globular proteins. The distribution of residue pairs in turns and kinks in the backbone chain" - a "kink" along the protein backbone can result in a loss of proper protein function

http://www.ncbi.nlm.nih.gov/pubmed/4463968

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1168194/pdf/biochemj00577-0298.pdf

http://www.bc.edu/schools/cas/biology/facadmin/oconnor.html

http://www.hhmi.org/research/ecs/thornton.html

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2903753/

These references may provide insight into protein folding. There may be an "a priori" or "ab initio" backbone angle which inherently reduces the propensity of this. Ah, Rosetta structure prediction models already rely on this approach.

Clash Importance

Does anyone really know what CI does? I just thought that CI let the AAs crash into each other more. But how would that help? It allows them to change position more easily. Also, if you are mutating, it allows a larger AA to occupy that spot. The documentation says "avoid clashes at all costs" for score. Yes, true, but I think the thing is that if an AA is blocked from swinging by other AAs, then the CI allows them to clash harder and therefore bump past the other AA and find a new position. I wonder if there is a simple experiment to test that out. The icon of a sidechain is misleading: the cloud extends further out than the picture would suggest.

http://phillips-lab.biochem.wisc.edu/vdsmf/subsectionstar3_3_2.html

http://baoilleach.blogspot.com/2007/12/using-jmol-for-drug-design-depict.html

http://wiki.jmol.org/index.php/File_formats/Surfaces

http://paulbourke.net/papers/conscript/

Null Point

So when I fold up a de novo puzzle, I start at one end and get to the middle. At that point I get into an area where the sheets resist moving. How do you deal with that to make nice flat rows of sheets? The q key changes the center of rotation but doesn’t seem to affect this null point. If it's the anchor, then it's deep in the software and can't be reassigned. Is it the (0, 0, 0) point on the graph that plots the protein in space? It always seems to be in the middle. What graph? Well, the computer must plot the 3D shape of the protein in order to display it on our monitors. I believe the middle segment of the protein is always used as a reference, and the turning momentum on each segment varies by its distance from the center of the plot, so the end segments swing around more easily and the center segment can't be moved. The Q key re-centers the protein. But what I'm talking about is somewhat different than the Q position. Well, if you think about it, Q is ALWAYS at the origin. It never moves. So the Q position can be changed, but the center segment of the protein is always the same segment. Q, initially, may not be assigned to a segment. Q is a point of view; null, however, is always along the protein. But the center segment is always the same and is not necessarily the center of the protein, at least when the protein is folded. It can be at the hairpin turn on the outside of the protein. Thanks, that was good, I think we may be getting closer to it. So Q gives us the illusion that we have moved the null segment, but trying to move "frozen" backbone segments exposes this assumption as an error. So in a way it seems that instead of folding from one end, it might be best to fold in from both ends. Something to try.

In summation, frozen backbone segments introduce an unresolved translational ambiguity, so foldit chooses a default. It's part of the math. Hence, bug.

Scripting Work

Glycine Hinge script

The discussion turned to employing a glycine hinge and whether the process could be scripted. The discussions drew upon the following links:

http://foldit.wikia.com/wiki/Glycine_Hinge

http://www.ncbi.nlm.nih.gov/pubmed/19796639

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2807821/

http://www.rlmueller.net/Numbers.htm

http://www.rlmueller.net/Programs/MWC32.txt

http://fold.it/portal/node/267249

http://fold.it/portal/node/986241

http://www.ncbi.nlm.nih.gov/pubmed/16103276

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2266578/

http://mibpaste.com/OvfnlM

http://www.gnu.org/licenses/gpl-howto.html

http://www.mushclient.com/forum/bbshowpost.php?bbsubject_id=8028

https://github.com/

https://github.com/Seagat2011

https://github.com/Seagat2011/fold.it.git

https://github.com/Seagat2011/fold.it/wiki/fold.it

http://www.neuro.fsu.edu/~dfadool/David3.pdf

http://www.springerimages.com/Images/RSS/1-10.1007_s00249-007-0206-7-0

http://www.springerlink.com/content/0946-2716/89/3/?sa_campaign=email/PROM/BIO13888_V2

http://chemistry.umeche.maine.edu/MAT500/Proteins4.html

http://fold.it/portal/recipe/28374

The discussion noted "We mutationally analyzed conserved glycine residues within each β-strand that might provide flexibility for tRNA translocation." For the current puzzle, the entire thing might just be a single helix, split near its center with a single "glycine" that acts as a hinge! This would allow the “cis” situation. Or perhaps there could be up to 3 hinges? Starting somewhere near the middle would help gather all the acidic and basic segs near their correct locations. Maybe we could try to bring all acids together and find the center of gravity of the lot of them, and that center will keep moving as the acids coalesce. As they find equilibrium, we introduce new terms.

"αGly147 is an “activation” hinge where backbone flexibility maintains high values for intrinsic gating, the affinity of the resting conformation for agonists and net ligand binding energy. αGly153 is a “deactivation” hinge that maintains low values for these parameters. αTrp149 (between these two glycines) serves mainly to provide ligand binding energy for gating. We propose that a concerted motion of the two glycine hinges (plus other structural elements at the binding site) positions αTrp149 so that it provides physiologically optimal binding and gating function at the nerve-muscle synapse. "

"A predictive rule for protein folding is presented that involves two recurrent glycine-based motifs that cap the carboxyl termini of alpha helices. In proteins, helices that terminated in glycine residues were found predominantly in one of these two motifs. These glycine structures had a characteristic pattern of polar and apolar residues. Visual inspection of known …. two motifs from each other and from internal glycines that fail to terminate helices. These glycine motifs--in which the local sequence selects between available structures--represent an example of a stereochemical rule for protein folding.

Recipe shared: it finds a glycine and creates a glycine hinge. ci is set to 0.30, then wa or lws, whichever the user prefers; then bands are disabled, wa at 100%, and restore best. Next step: just modify glycine hinge to try all glycines, not just the first one.
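The "try all glycines, not just the first one" change boils down to collecting every glycine position before looping over them. A minimal sketch in plain Python follows, assuming the sequence is available as a string of one-letter codes; the function name and the example sequence are hypothetical, and the real recipe would read residues through the Foldit Lua API instead.

```python
def glycine_positions(sequence):
    """Return the 1-based segment index of every glycine ('g') in the sequence."""
    return [i + 1 for i, aa in enumerate(sequence.lower()) if aa == "g"]

example = "mkvlgaaplgtr"  # hypothetical sequence, for illustration only
for seg in glycine_positions(example):
    # A recipe would build and test a hinge at this segment here.
    print("candidate hinge at segment", seg)
```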

Quick Stabilizer and Mutable Rebuilder script

It was also proposed that we script a routine that found mutable segments in a protein, predicted their secondary structures, then ran through the usual mutate/rebuild/shake/wiggle sequence. Alternatively, since the rebuilds are based on database info (dunbrack db?), a secondary structure detector could run after the rebuild. The links from the discussion are collected below:

http://dunbrack.fccc.edu/bbdep/

http://dunbrack.fccc.edu/Software.php

http://jmol.sourceforge.net/

http://www.sigmaaldrich.com/life-science/metabolomics/learning-center/amino-acid-reference-chart.html

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1450267/

http://en.wikipedia.org/wiki/List_of_standard_amino_acids

http://en.wikipedia.org/wiki/AMBER

https://code.google.com/p/etherpad/

http://github.com/ether/pad

http://etherpad.org/public-sites/

http://piratepad.net/G7xfgayN4c

http://piratepad.net/uVsgDiaqkc

http://alternativeto.net/software/piratepad/

http://www.wiley.com/college/fob/quiz/quiz06/6-8.html

https://secure.wikimedia.org/wikipedia/en/wiki/Collaborative_real-time_editor

http://cloud9ide.com/

http://fold.it/portal/recipe/28489

http://fold.it/portal/recipe/28498

In jmol, does the environment itself assert AMBER forces or do you have to give it info? jmol renders based upon the information given to it: no molecular inferences are made, and we cannot manipulate the protein into a new conformation. Seagat2011 and phi have been discussing a recipe for a few weeks, so it seems rather big, but there’s nothing on paper yet. The thinking is that a mutate would affect sidechains (sc's), which pushes the backbone (bb) in a new direction, so we should stabilize first and mutate last, correct? Hold on, from a physical perspective why would it be important to stabilize, then mutate? Because it might sort out the bad areas; then we can rebuild, stabilize and compare. But why would order matter? I think order is important: it prioritizes things. We want optimal placement of the bb first. I agree. Then we match the space and polarization using sidechains. But the question becomes whether placement of the backbone using the default AA would lead to an *optimal* placement.

So we mutate to valine? I was thinking proline? Maybe glycine? Maybe we should make that a parameter. The script would be like a walker: mutate to X, place bb, compare scores? The main body of the mutate rebuilder comes after? Would the phobicities interfere? If we were to walk it "linear" then that would introduce personality affinities and such, making it a nonlinear optimization problem. Sorry, let’s mutate ALL to X. Proline (being small) would allow a more compact solution and also not interfere so much with the bb, so most of the score would be attributable to the configuration of the bb. This operation goes before the stabilize? But secondary structure (SS) comes first. Do we assign structures? No, let's assume the recipe user has done that already. Like any good project, when in doubt, reduce scope! [Sea and phi’s secondary structure predictor could do this. It takes into account van der Waals volume and currently uses the AMBER scoring function (AMBER force field, energy equation). It's elegant because it only includes 4 out of 56 possible Rosetta terms.]

OK, so first we want to mutate all to X, then optimize the bb (throw in the jitter code from glycine hinge). Do we test for glycine, proline... and alanine? We *could* try them all. But there are like 20! So maybe we have an array of desired mutate AAs, with an intelligent default of course. Maybe rather than working on a mutating rebuilder, we start with a mutating stabilizer first? I don't think there is a mutating stabilizer script in the public recipes, is there? Are we really "mutating"? Mutating implies a degree of unpredictability. It’s more like a Q-Stab AA (Quick STABilization AA). I was also thinking EtherPad (or equivalent) for joint editing of scripts.
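The "array of desired mutate AAs with an intelligent default" idea can be sketched as a small harness. This is plain Python, not the recipe itself: `score_fn` is a hypothetical callback standing in for "mutate every segment to X, optimize the backbone, read the score", and the default candidate list (glycine, alanine, proline) is an assumption taken from the residues named in the chat.

```python
DEFAULT_CANDIDATES = ("G", "A", "P")  # glycine, alanine, proline: assumed default

def best_uniform_residue(n_segments, score_fn, candidates=DEFAULT_CANDIDATES):
    """Score an all-X sequence for each candidate X; return (best_aa, scores)."""
    scores = {aa: score_fn(aa * n_segments) for aa in candidates}
    best = max(scores, key=scores.get)
    return best, scores

def toy_score(sequence):
    """Placeholder scoring function for the demo; it simply rewards alanine."""
    return sequence.count("A") - sequence.count("P")

print(best_uniform_residue(4, toy_score))
```

Passing the candidate list as a parameter keeps the "try all 20" option open without making it the default.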

EtherPad is a web-based realtime collaborative document editor. Active development is going on at the github repository. Services and servers based on the EtherPad software include PiratePad, TypeWith.me, Sync.in, EtherPad Foundation, iEtherPad.com, Google Docs. EtherPad is the only web-based word processor that allows people to work together in really real-time. When multiple people edit the same document simultaneously, any changes are instantly reflected on everyone's screen. The result is a new and productive way to collaborate on text documents, useful for meeting notes, drafting sessions, education, team programming, and more. Btw, what is $Id$? It expands to the 40-character hexadecimal blob object name so you can definitively say which version it is. http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html

Recipe shared. testing on 420 and rising in rank... gained 73 points! Now we can work on the mutate script.

TonyOrigami used a mutating rebuilder on 418 and he did very well. Basically, he just added do_shake = do_mutate as the first line of a loop rebuilder recipe [sic: function do_shake(x) return do_mutate(x) end] and achieved 15095. He also tried mutating all to alanine and then rebuilding, but that didn't work for him.

I would like to tweak QS-Fuse for mutate puzzles (and also a rebuilder). I think a fuser for mutate puzzles (if one hasn't already been done!) shows lots of promise. What's the difference between blue/pink/yellow? Bluefuse is the traditional one. Pink is quick-stabilizer tested only once at a single ci (and then restored best). Yellow: not so sure. I think adding jitter to the fuses would help, specifically to ci and number of iterations. I wonder if there is a bug in foldit regarding mutations: how can mutating a segment reduce the flexibility? Well, mutate tries to close that space, which probably results in strong bonds. Still, I wonder if mutate does a local wiggle in the code?

Sliding Strands script

Thom says: for the queue in the future, I have an idea on how to simulate Tweak in Lua. So here's my idea: sheets like to lie next to one another, right? For example, marie's "after alignment 4" recipe tries to thread the 2 sheets together. So you get the concept of making 2 sheets align with each other in an anti-parallel fashion. We could write a “slide” script that moves one sheet up or down relative to the other sheet to find the best match between the 2 sheets. Strands are NOT always the same length, that's the point. This would operate on any 2 sheets that are in physical proximity.

https://secure.wikimedia.org/wikipedia/en/wiki/Beta_sheet

http://www.ncbi.nlm.nih.gov/pubmed/10595526

http://mibpaste.com/nMPROQ

https://secure.wikimedia.org/wikipedia/en/wiki/Pure_research

http://www.ics.uci.edu/~baldig/betasheet.html

http://sandwalk.blogspot.com/2008/03/strands-and-sheets.html

http://fold.it/portal/recipe/27518

https://secure.wikimedia.org/wikipedia/en/wiki/Ubiquitin

http://www.web-books.com/MoBio/Free/Ch2C6.htm

http://www-lbit.iro.umontreal.ca/bShuffle/index.html

http://www-lbit.iro.umontreal.ca/bSpider/index.html

http://piratepad.net/b9thDvg81b

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1450267/

http://bioinformatics.oxfordjournals.org/content/21/suppl_1/i75.abstract

http://piratepad.net/7Mt1KvCMz5

http://titanpad.com/ep/pro-signup/

http://foldit_research.titanpad.com/1

To code this, for now, let’s have the user identify strands 1 and 2 as segment ranges. Freeze one strand. Band between corresponding segments of the 2 strands. Shake/wiggle. Check via get_segment_distance() AND score. Then repeat for each possible relative position between the strands to see which is best. "The amino acid composition of β strands tends to favor hydrophobic (water fearing) amino acid residues."
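The "repeat for each possible relative position" step amounts to enumerating every register of one strand against the other. The sketch below is plain Python under a few assumptions: the strands are given as inclusive segment ranges, they pair antiparallel, and at least `min_overlap` residue pairs must remain in contact. The function name and the `min_overlap` parameter are hypothetical. Each list of pairs is what a recipe would band together before shaking, wiggling and scoring.

```python
def sliding_pairs(strand1, strand2, min_overlap=2):
    """Yield (offset, pairs) for each antiparallel register of two strands.

    strand1, strand2: (first_segment, last_segment), inclusive.
    pairs: [(segment_in_strand1, segment_in_strand2), ...] to band together.
    """
    a = list(range(strand1[0], strand1[1] + 1))
    b = list(range(strand2[1], strand2[0] - 1, -1))  # reversed: antiparallel
    for offset in range(-(len(b) - min_overlap), len(a) - min_overlap + 1):
        pairs = [(a[i], b[i - offset])
                 for i in range(len(a)) if 0 <= i - offset < len(b)]
        yield offset, pairs

# Strands of different lengths: segments 5-8 against segments 12-14.
for offset, pairs in sliding_pairs((5, 8), (12, 14)):
    print(offset, pairs)
```

Because the strands need not be the same length, the number of banded pairs varies with the offset; the recipe would keep whichever register gives the best score.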

I find trying to move frozen segments very frustrating. It seems that one is always stuck in place. They don't move intuitively; they seem pinned. You CAN move a frozen segment, but I don't know when, or why it doesn't always work. The only way I've found is banding the frozen one hard and wiggling the backbone at very low ci, 0.1 or less. Then see if the sheets are aligned or not. In most cases you have to rebuild the loops between the sheets. Perhaps we should have a script that makes all bands length 3.6 and strength 3, and another script as a band reducer from strength 2.5 and ci 0.05 to strength 0.1 and ci 0.9, to get the sheets aligned.
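The band reducer idea needs a schedule of (strength, ci) pairs between the two endpoints mentioned. Here is a minimal sketch in plain Python; linear interpolation and the number of stages are assumptions, since the chat only gives the start and end values, and the function name is hypothetical.

```python
def band_reducer_schedule(steps, str_start=2.5, str_end=0.1,
                          ci_start=0.05, ci_end=0.9):
    """Linearly interpolate band strength and clash importance over `steps` stages."""
    if steps < 2:
        raise ValueError("need at least two steps")
    schedule = []
    for k in range(steps):
        t = k / (steps - 1)  # runs from 0.0 at the first stage to 1.0 at the last
        strength = str_start + t * (str_end - str_start)
        ci = ci_start + t * (ci_end - ci_start)
        schedule.append((round(strength, 3), round(ci, 3)))
    return schedule

for strength, ci in band_reducer_schedule(5):
    # A recipe would set band strength and ci here, then wiggle.
    print(strength, ci)
```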

Compatibility Matrices

Thom says: does anyone know if there are recipes that use the Hydrophobe Compatibility Matrix (HCM)? What about the Charge Compatibility Matrix (CCM) or Size Compatibility Matrix (SCM)? ST - Charge Compatibility Index (CCI) recipe shared: useful for ligand and catalytic area puzzles. Run this on Amino Acid (AA) sequences you want evaluated for their charge compatibility. ST - Hydropathy Compatibility Index (HCI) recipe shared. Run this on AA sequences you want evaluated for hydropathy compatibility. ST - Size Compatibility Index (SCI) recipe shared. Recommended for pocket design puzzles, advanced structure puzzles, or puzzles with mutable segments. Run this on AA sequences you want evaluated for size compatibility. We should also do a "hydropathy plot" which walks the backbone, gathers the hydrophobic moments and identifies groups (ranges) where the AAs are either hydrophobic or hydrophilic, so we can "guess" at which ranges go on the outside/inside of the protein. Maybe it would be better to summarize (e.g. segments 5-13 are hydrophobic, segments 41-52 are hydrophilic).
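The summarizing hydropathy plot described above can be sketched in a few lines of plain Python. This version uses the Kyte-Doolittle hydropathy scale and a sliding-window average; the window size, the zero threshold and the function name are assumptions, and the shared HCI recipe may well work differently.

```python
# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

def hydropathy_ranges(seq, window=5):
    """Return [(start, end, label), ...] runs of window-averaged hydropathy."""
    seq = seq.upper()
    half = window // 2
    labels = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(seq), i + half + 1)
        mean = sum(KD[aa] for aa in seq[lo:hi]) / (hi - lo)
        labels.append("hydrophobic" if mean > 0 else "hydrophilic")
    ranges, start = [], 0
    for i in range(1, len(seq) + 1):
        # Close the current run at the end of the sequence or when the label flips.
        if i == len(seq) or labels[i] != labels[start]:
            ranges.append((start + 1, i, labels[start]))
            start = i
    return ranges

for start, end, label in hydropathy_ranges("FLPKGWEVRHAPNGRPFFIDHNTKTTTWEDPRLKITAPPEYMEA"):
    print("segments %d-%d are %s" % (start, end, label))
```

A wider window smooths the profile and gives fewer, longer ranges; a window of 1 simply labels each residue on its own.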

Lehninger "Principles of Biochemistry" David L. Nelson and Michael M. Cox, (5th edition) - it's pricey but you *can* find it heavily discounted. about 1200 pages, but lots of pictures. Ramachandran plot is on p 117 (phi vs. psi). Hydropathy plot is p. 378. Chapter 4 is "The Three-Dimensional Structure of Proteins"

ST - mini rosetta energy model

shared https://fold.it/portal/recipe/28823

Recipe quickly scores a fold based on compactness, hiding, clashing, and disulfide bridges. Recommended for puzzles (Exploration, Design, Multi-start, Freestyle) and combinations thereof. OK, on the mini rosetta script: when I run it, it says get seg dist, then setting bands, and then print (sometimes), and then it closes. So what is it doing, and what and where does it print? And what does it all mean? Sorry, I know these scripts are important but I just don't understand how to run them. Oh, you need to examine the output window. Sorry, but where is the output window? Nothing is left on my screen. At the bottom of your recipe editor, click the black thing next to the notepad which says "show recipe output". Let me know when you have it. The first line says st-mini rosetta model. Then step 1, using bands to indicate contracting seg. Step 2, tabulating scores. Correct hydropathy matching 53%. Keep going... do you see your score fraction, near the bottom? Yes, OK, I see, thanks. OK, now if you move the protein and run again the score should change.

xCI recipes

How about modifying the xCI scripts to run between 2 different ranges? It could be used to decide whether or not to flip sheets. Did you try the CCI recipe? You can use it to compare the charge affinity of 2 AAs. I really don't understand these programs. I can guess that they are short programs that feed data to other programs. When I run them they only last an instant and are gone. So where does the output go? Or do they need some other kind of input from me? Like, where do I enter the AA sequences? So the idea is that the script will tell you if 2 different AAs are compatible, if they will fit together, right? Yes. And it will tell you what alternate AAs might fit in a tight space. Right. OK, now I see: I needed to open the output window!

https://foldit_research.titanpad.com/2 for CCI

But how is the new CCI different from the 1st CCI script? The difference is that you can specify segment ranges to be used for comparison. The original recipe only compares 2 AAs.
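To make the "2 AAs versus segment ranges" distinction concrete, here is a toy version in plain Python. It is not the shared CCI recipe: the charge table (D and E negative, K and R positive, everything else neutral), the scoring rule and both function names are assumptions chosen only to illustrate extending a pairwise comparison to two ranges.

```python
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}  # assumed side-chain charges; others 0

def pair_compatibility(aa1, aa2):
    """+1 for opposite charges, -1 for like charges, 0 if either is neutral."""
    return -CHARGE.get(aa1.upper(), 0) * CHARGE.get(aa2.upper(), 0)

def range_compatibility(seq, range1, range2):
    """Sum pair scores between two equal-length, 1-based inclusive segment ranges."""
    a = seq[range1[0] - 1:range1[1]]
    b = seq[range2[0] - 1:range2[1]]
    if len(a) != len(b):
        raise ValueError("ranges must be the same length")
    return sum(pair_compatibility(x, y) for x, y in zip(a, b))

print(pair_compatibility("K", "E"))                    # original style: 2 AAs
print(range_compatibility("KKAEE", (1, 2), (4, 5)))    # new style: 2 ranges
```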

Isosurface recipes?

I would like to do more with isosurfaces and see if we can find recipes that take advantage of using them. the "packing" term may be a sequence-independent term, derived from a simulated annealing algorithm run on the sidechains.

http://www.ncbi.nlm.nih.gov/pubmed/10526365

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2253451/

I wonder if it's based on steric factors... or energetics?

Protein optimizer script

So like I was saying in foldit's chat, I had a relatively simple idea for a script that would help (at least I know it would help my process) with protein folding. Basically, I wanted to create a set of bands between hydrophobics, set the strength (or weighted spatial importance) to 10, wiggle for an iteration, then cyclically improve on the various attributes of each segment. This method would be encapsulated and performed again with a strength of 1 between hydrophobes and hydrophiles (while the original set of bands is deactivated), and yet again between just the hydrophiles at the lowest strength (0.1, I believe). Theoretically this should result in the protein iteratively going towards a sphere with the hydrophiles on the outside and the hydrophobes on the inside.

Is this similar to a compressor recipe by rav that pulls on phobics and pushes philics? Similar, yes; in fact I use that recipe quite often along with some of my own. However, by keeping persistent sets of bands, the spring force can be calculated (relative to the other groups) and used to automatically adjust for efficiency. So the increased magnitude of the spring tension (force) can be equally counter-balanced by the bands to create the tension required to produce a gain, balanced across all segments. Exactly. So in theory your optimization removes the need for random bands! Essentially you gain extra dimensions, extra degrees of freedom, by adjusting the weight of the bands in the spherical mesh. Well, you can always add as many extra forces as you need, hence the n-dimensional linear algebra component. I guess the strength could be random, but controlling it directly would be most efficient, I'd think. With the new threading ability, a version of the sphere could be saved for each attribute of each segment (with only # segments of spheres) and then compared to the mesh after adjusting each band in each step of an iterative process. Initially an idea would be to reduce the search space. This could be useful on large segment puzzles. Basically I had the idea to use embedded vectors represented by recursive functions. Tail-recursion is actually very efficient in Lua! I'll try to work on some math proofs surrounding the implementation.
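The three banding passes described above (strength 10 between hydrophobics, 1 between hydrophobic and hydrophilic segments, 0.1 between hydrophilics) can be laid out as data before any wiggling happens. The sketch below is plain Python; the hydrophobic residue set and the function name are assumptions, since the chat does not say how segments are classified, and a real recipe would create the bands through the Foldit Lua API.

```python
from itertools import combinations, product

HYDROPHOBIC = set("AVILMFWC")  # assumed classification, for illustration only

def band_plan(seq):
    """Return three passes of (strength, [(i, j), ...]) using 1-based segments."""
    seq = seq.upper()
    phobic = [i + 1 for i, aa in enumerate(seq) if aa in HYDROPHOBIC]
    philic = [i + 1 for i, aa in enumerate(seq) if aa not in HYDROPHOBIC]
    return [
        (10.0, list(combinations(phobic, 2))),  # pass 1: phobic-phobic
        (1.0, list(product(phobic, philic))),   # pass 2: phobic-philic
        (0.1, list(combinations(philic, 2))),   # pass 3: philic-philic
    ]

for strength, pairs in band_plan("AKLDV"):
    print(strength, pairs)
```

Only one pass would be active at a time, matching the description of deactivating the previous set of bands before applying the next.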

AMBER force-field equation –

V(r^n) = sum_of_bonds [ 0.5 * kb * ( len [1] - len [0] ) ^ 2 ] + sum_of_angles [ 0.5 * ka * ( theta [1] - theta [0] ) ^ 2 ] + sum_of_torsions [ 0.5 * vn * ( 1 + cos ( n * omega - lambda ) ) ] + sum_of_intermolecular_forces_j [ sum_of_intermolecular_forces_i [ lennard_Jones_potential (r,i,j) + (( q [i] * q [j] ) / ( 4 * pi * epsilon [0] * r [i][j] )) ]]

where lennard_Jones_potential (r,i,j) = epsilon [i] * [ ((r_init [i][j]/r [i][j]) ^ 12) - 2 * ((r_init [i][j]/r [i][j]) ^ 6) ]
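As a quick check on the individual terms, here is a minimal Python sketch of the bond, Lennard-Jones and Coulomb pieces of the equation above. The function names are made up for illustration, and `epsilon_0` defaults to 1.0 (reduced units), which is an assumption and not a physical constant.

```python
import math


def bond_term(length, length_0, kb):
    """Harmonic bond term: 0.5 * kb * (len - len0)^2."""
    return 0.5 * kb * (length - length_0) ** 2


def lennard_jones(r, r_init, epsilon):
    """12-6 term: epsilon * [(r_init/r)^12 - 2 * (r_init/r)^6]."""
    ratio = r_init / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)


def coulomb(r, q_i, q_j, epsilon_0=1.0):
    """Electrostatic term: q_i * q_j / (4 * pi * epsilon_0 * r).

    epsilon_0 = 1.0 is an assumed reduced-unit value for illustration.
    """
    return (q_i * q_j) / (4.0 * math.pi * epsilon_0 * r)


if __name__ == "__main__":
    # At r == r_init the 12-6 term sits at its minimum, -epsilon.
    print(lennard_jones(2.0, 2.0, 0.5))
    print(bond_term(1.5, 1.0, 4.0))
    print(coulomb(1.0, 1.0, -1.0))
```

In this form of the 12-6 term, r_init is the separation at which the pair energy is lowest and epsilon is the depth of that well, which is why moving the atoms either closer or further apart raises the energy.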

Let S = { langle Protein Segment 1, Protein Segment 2, ... , Protein Segment n,` forall n` in` setZ, n` =` Total Segments rangle }

Let vec bold P = { left langle langle vec bold x_{seg0}, vec bold x_{seg1}, vec bold x_{seg2}, vec bold y_{seg0}, vec bold y_{seg1}, vec bold y_{seg2}, vec bold z_{seg0}, vec bold z_{seg1}, vec bold z_{seg2}, vec bold t, vec bold a rangle, vec bold P subset S,` forall vec bold x, vec bold y, vec bold z, vec bold t, vec bold a` in` setR^{1}_10 right rangle }

Let vec bold B(vec bold s_1, vec bold s_2, w) = -w*(vec bold s_2-vec bold s_1) ,` forall vec bold s_1,vec bold s_2` in` vec bold P

(Note: this depends on w, the weight or strength of the band, being proportional to the spring constant k of the band.) seg1 represents the segments connected to seg0 by bands, and seg2 represents the segments connected to all segments in seg1.

Let vec bold a(x,y,z) = langle "_score", "_hci", "_sci", "_cci", "_clashing", "_hiding", "_packing", "_disulfides" rangle

Let seg0 = langle vec bold x_seg0, vec bold y_seg0, vec bold z_seg0 rangle,~ seg1 = langle vec bold x_seg1, vec bold y_seg1, vec bold z_seg1 rangle,~ seg2 = langle vec bold x_seg2, vec bold y_seg2, vec bold z_seg2 rangle,~ `in` sum vec bold P

Let b = sum vec bold B(seg0,seg1,W(i))` and` Let W(i) = w forall vec bold B_i` in` b, i `in` setZ, i` >` 0

Therefore, as b and inf(ldline %DELTA vec bold a_"_score" rdline) %tendto 0,~ vec bold a_"_score" %tendto max(vec bold a_"_score")
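For readers who prefer code to notation, the band term B and its sum b from the definitions above might look like this in Python. This is only a sketch: positions are plain (x, y, z) tuples, and the names `band_force` and `net_band_force` are assumptions, not part of any Foldit API.

```python
def band_force(s1, s2, w):
    """B(s1, s2, w) = -w * (s2 - s1), applied per coordinate."""
    return tuple(-w * (b - a) for a, b in zip(s1, s2))


def net_band_force(seg0, banded_segments, weights):
    """b = sum of B(seg0, seg1_i, W(i)) over every band attached to seg0."""
    total = [0, 0, 0]
    for seg1, w in zip(banded_segments, weights):
        for axis, component in enumerate(band_force(seg0, seg1, w)):
            total[axis] += component
    return tuple(total)


if __name__ == "__main__":
    print(band_force((0, 0, 0), (1, 2, 3), 10))
    # Two equal bands pulling in opposite directions cancel out.
    print(net_band_force((0, 0, 0), [(1, 0, 0), (-1, 0, 0)], [2, 2]))
```

The second call shows the balanced case the limit statement is about: when the band terms cancel, b is zero.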

...and that's what I've got so far. Should be a solid base for induction. These definitions would make a good platform to start with, I think, although keep in mind the methods aren't explicitly stated (yet). I just wanted to double check before continuing that the data I'm manipulating is all exposed through the mini rosetta script you just released.

In this statement here: -w*(vec bold s_2-vec bold s_1), do you want a convolution? Or a conjugate (to form a normal vector)?

Neither, it's just a scalar weight times a vector: namely f = -kx, the spring force equation.

Hmm.. I can see the angles are going to be challenging. We should derive langle and rangle from segment lengths.

Haha, no no, that's all format. Those just mean it's literally in brackets. Here, I finally got a png of it: http://foldit.wikia.com/wiki/Protein_Optimizer_Proposition

Okay, now where did the modulus "%tendto" come from? Is it formatting?

Yeah, that's just the arrow pointing to the right; it comes from limits. setZ is the integer set. setR is the reals. Basically the variables of Z are just counters, like an index would be in a for loop.

Should x_{seg0}, x_{seg1}, x_{seg2}, y_{seg0}, y_{seg1}, y_{seg2}, z_{seg0}, z_{seg1}, z_{seg2} instead be by segment?

It certainly could be, yes.

Basically, let M(x) = 2^(M(x-1)) for x >= 1, and M(0) = 1. M(1) represents the two states of the bit, and then you work upwards with embedded vectors. Taking log_2 M(x) or 2^(M(x)) moves down or up a meta level. I found a nice Taylor series expansion for the base n log. Can't wait to use it.
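The recurrence grows very quickly, which a few lines of Python make concrete. This sketch uses a loop where the Lua version discussed here would use tail recursion, since Python does not optimize tail calls; the function name `M` simply mirrors the notation above.

```python
import math


def M(x):
    """M(0) = 1 and M(x) = 2 ** M(x - 1) for x >= 1."""
    if x < 0:
        raise ValueError("x must be a non-negative integer")
    value = 1
    for _ in range(x):
        value = 2 ** value
    return value


if __name__ == "__main__":
    for level in range(5):
        print(level, M(level))
    # log base 2 steps one meta level down: log2(M(4)) equals M(3).
    print(math.log2(M(4)) == M(3))
```

Note that M(5) is already 2 raised to the power 65536, so only the first few levels are practical to compute directly.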

I could really use some fact-checking and revision of this work: http://fold.it/portal/node/989764. New article: http://foldit.wikia.com/wiki/Folding_Methods_-_Fold_Theory

Residue Binder

I was wondering if it's possible to write a script that will be able to determine whether there's a donor or acceptor at the end of a sidechain or on a segment of the backbone of a protein.

Yes, it is possible. Are you trying to match up 2 segments?

Not necessarily. I was thinking about the possibilities of making a script that could maybe maximize hydrogen bonding by selectively pulling and/or pushing segments and/or sidechains.

Polarity characteristics tend (I believe) to remain consistent among similar amino acids, so the get_aa foldit API may be enough for us to tell. We won't know what rotamer (sidechain position) it takes, but once we know what type of AA it is, we can perform a pull/push as necessary and then snap through the sidechains to find the optimal position.

How is that different from sidechain flipper?

His idea is to maximize hydrogen bonding by selectively pulling and/or pushing these targeted segments; imagine sidechain flipper as a bander. Maybe call it Residue Bonder.

Residue Binder - https://foldit_research.titanpad.com/5
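The lookup step of that idea, deciding from the amino acid type alone whether a sidechain can donate or accept a hydrogen bond, could be sketched as below. This is Python for illustration only, not a Foldit recipe. The donor and acceptor sets follow a commonly quoted textbook classification and should be treated as an assumption: weak cases such as Cys and Met are left out, and backbone donors and acceptors are ignored. The function names are hypothetical.

```python
# Assumed sidechain classification by one-letter code (textbook-style table).
SIDECHAIN_DONORS = set("RKWNQHSTY")
SIDECHAIN_ACCEPTORS = set("DENQHSTY")


def hbond_role(aa):
    """Classify one amino acid letter as 'donor', 'acceptor', 'both' or 'none'."""
    code = aa.upper()
    donor = code in SIDECHAIN_DONORS
    acceptor = code in SIDECHAIN_ACCEPTORS
    if donor and acceptor:
        return "both"
    if donor:
        return "donor"
    if acceptor:
        return "acceptor"
    return "none"


def candidate_pairs(sequence):
    """List 1-based (donor_index, acceptor_index) pairs worth trying to band."""
    roles = [hbond_role(aa) for aa in sequence]
    donors = [i for i, r in enumerate(roles, start=1) if r in ("donor", "both")]
    acceptors = [i for i, r in enumerate(roles, start=1) if r in ("acceptor", "both")]
    return [(d, a) for d in donors for a in acceptors if d != a]


if __name__ == "__main__":
    print(hbond_role("k"), hbond_role("D"), hbond_role("S"), hbond_role("A"))
    print(candidate_pairs("KDA"))
```

A real script would still need to filter these pairs by distance and then try the pull/push and sidechain snapping described above, since the table says nothing about geometry.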