Protein Analysis (Proteiinianalyysi) 6: Secondary structure prediction ads/teaching/spring2006/proteiinianalyysi
Secondary structure
Amino acid sequence => secondary structure
–Conformational preferences of amino acids
–13-17 residue window
–Correlations between positions => neural networks
Biophysical background …
Appendix A
DSSP algorithm to define secondary structure
Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features
W. Kabsch & C. Sander, Biopolymers 22 (1983)
Hydrogen bonds
Partial charges: N -0.20e, H +0.20e, O -0.42e, C +0.42e
$$E \sim q_1 q_2 \left[ \frac{1}{r(ON)} + \frac{1}{r(CH)} - \frac{1}{r(CN)} - \frac{1}{r(OH)} \right]$$
Ideal H-bond is co-linear, r(NO) = 2.9 Å and E = -3.0 kcal/mol
Cutoffs in DSSP allow 2.2 Å excess distance and ±60° angle
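The electrostatic H-bond energy can be evaluated directly from atom coordinates. The sketch below is a minimal illustration, not the DSSP program itself: the conversion factor 332 and the -0.5 kcal/mol acceptance threshold are taken from Kabsch & Sander (1983) and do not appear on the slide, and the bond lengths in the usage example (C=O 1.23 Å, N-H 1.0 Å) are assumed typical values.

```python
import math

Q1, Q2 = 0.42, 0.20   # partial charges (in e) on the C=O and N-H groups, from the slide
F = 332.0             # assumed conversion factor to kcal/mol (Kabsch & Sander 1983)
CUTOFF = -0.5         # assumed H-bond energy threshold in kcal/mol (Kabsch & Sander 1983)


def hbond_energy(n, h, c, o):
    """Electrostatic energy (kcal/mol) between an N-H group and a C=O group.

    Each argument is an (x, y, z) coordinate in Angstrom.
    """
    return Q1 * Q2 * F * (
        1.0 / math.dist(o, n)
        + 1.0 / math.dist(c, h)
        - 1.0 / math.dist(c, n)
        - 1.0 / math.dist(o, h)
    )


def is_hbond(n, h, c, o):
    """True if the energy is below the assumed cutoff."""
    return hbond_energy(n, h, c, o) < CUTOFF


# Co-linear C=O ... H-N geometry along the x axis with r(NO) = 2.9 A
c_atom = (0.0, 0.0, 0.0)
o_atom = (1.23, 0.0, 0.0)
h_atom = (3.13, 0.0, 0.0)
n_atom = (4.13, 0.0, 0.0)
ideal_energy = hbond_energy(n_atom, h_atom, c_atom, o_atom)
```

For this ideal geometry the energy comes out close to the -3.0 kcal/mol quoted on the slide.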
Elementary H-bond patterns
n-turn(i) =: Hbond(i,i+n), n=3,4,5
Parallel bridge(i,j) =: [ Hbond(i-1,j) AND Hbond(j,i+1) ] OR [ Hbond(j-1,i) AND Hbond(i,j+1) ]
Antiparallel bridge(i,j) =: [ Hbond(i,j) AND Hbond(j,i) ] OR [ Hbond(i-1,j+1) AND Hbond(j-1,i+1) ]
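These definitions translate almost literally into Boolean predicates. In the sketch below, the H-bond list is assumed to be available as a set of (i, j) residue-index pairs, and the helper `make_hbond` is a hypothetical name introduced only for illustration.

```python
def make_hbond(pairs):
    """Return a predicate Hbond(i, j) backed by a set of (i, j) pairs."""
    bonds = set(pairs)
    return lambda i, j: (i, j) in bonds


def n_turn(hbond, i, n):
    """n-turn(i) =: Hbond(i, i+n), n = 3, 4, 5."""
    return hbond(i, i + n)


def parallel_bridge(hbond, i, j):
    """Parallel bridge(i, j) as defined on the slide."""
    return (hbond(i - 1, j) and hbond(j, i + 1)) or (
        hbond(j - 1, i) and hbond(i, j + 1)
    )


def antiparallel_bridge(hbond, i, j):
    """Antiparallel bridge(i, j) as defined on the slide."""
    return (hbond(i, j) and hbond(j, i)) or (
        hbond(i - 1, j + 1) and hbond(j - 1, i + 1)
    )


# Toy H-bond list: one 4-turn at residue 1, one antiparallel pair 10/20
hbond = make_hbond([(1, 5), (10, 20), (20, 10)])
```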
N-turns
Figure: backbone diagrams (-N-C-C- repeats with H and O labels) of a 3-turn, a 4-turn and a 5-turn
Parallel bridge
Figure: two backbone strands running in the same direction
Antiparallel bridge
Figure: two backbone strands running in opposite directions, with facing H and O atoms
Antiparallel beta-sheet is significantly more stable due to the well aligned H-bonds.
Cooperative H-bond patterns
4-helix(i,i+3) =: [4-turn(i-1) AND 4-turn(i)]
3-helix(i,i+2) =: [3-turn(i-1) AND 3-turn(i)]
5-helix(i,i+4) =: [5-turn(i-1) AND 5-turn(i)]
Longer helices are defined as overlaps of minimal helices
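The way overlapping minimal helices merge into longer ones can be sketched as follows. The input representation (a set of residue indices at which an n-turn starts) and the function name are assumptions made for illustration.

```python
def helix_residues(turn_starts, n, length):
    """Residues covered by n-helices.

    turn_starts: set of residue indices i for which n-turn(i) holds.
    A minimal n-helix (i, i+n-1) exists when both n-turn(i-1) and
    n-turn(i) hold; the union of minimal helices gives longer helices.
    """
    covered = set()
    for i in range(1, length):
        if (i - 1) in turn_starts and i in turn_starts:
            covered.update(range(i, i + n))  # residues i .. i+n-1
    return covered


# Two consecutive 4-turns give one minimal 4-helix (residues 3-6);
# a third consecutive turn extends it by one residue.
minimal = helix_residues({2, 3}, n=4, length=20)
longer = helix_residues({2, 3, 4}, n=4, length=20)
```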
Beta-ladders and beta-sheets
Ladder =: set of one or more consecutive bridges of identical type
Sheet =: set of one or more ladders connected by shared residues
Bulge-linked ladder =: two ladders or bridges of the same type connected by at most one extra residue on one strand and at most four extra residues on the other strand
3-state secondary structure
Helix
Strand
Loop
Quoted consistency of secondary structure state definition in structures between sequence-similar proteins is ~70 %
Richer descriptions possible
–E.g. phi-psi regions
Amino acid preferences for different secondary structure
Alpha helix may be considered the default state for secondary structure. Although the potential energy is not as low as for beta sheet, H-bond formation is intra-strand, so there is an entropic advantage over beta sheet, where H-bonds must form from strand to strand, with strand segments that may be quite distant in the polypeptide sequence. The main criterion for alpha helix preference is that the amino acid side chain should cover and protect the backbone H-bonds in the core of the helix. Most amino acids do this, with some key exceptions.
–alpha-helix preference: Ala, Leu, Met, Phe, Glu, Gln, His, Lys, Arg
The extended structure leaves the maximum space free for the amino acid side chains: as a result, those amino acids with large bulky side chains prefer to form beta sheet structures:
–just plain large: Tyr, Trp, (Phe, Met)
–bulky and awkward due to branched beta carbon: Ile, Val, Thr
–large S atom on beta carbon: Cys
The remaining amino acids have side chains which disrupt secondary structure, and are known as secondary structure breakers:
–side chain H is too small to protect backbone H-bond: Gly
–side chain linked to alpha N, has no N-H to H-bond; rigid structure due to ring restricts to phi = -60: Pro
–H-bonding side chains compete directly with backbone H-bonds: Asp, Asn, Ser
Clusters of breakers give rise to regions known as loops or turns which mark the boundaries of regular secondary structure, and serve to link up secondary structure segments.
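The qualitative grouping above can be captured in a small lookup table. The sketch below only encodes the residue lists given on these two slides; the group labels and the function name are hypothetical, and no quantitative propensities are implied.

```python
# Residue groups exactly as listed on the slides (three-letter codes)
PREFERENCES = {
    "helix": set("Ala Leu Met Phe Glu Gln His Lys Arg".split()),
    "sheet": set("Tyr Trp Phe Met Ile Val Thr Cys".split()),
    "breaker": set("Gly Pro Asp Asn Ser".split()),
}


def preferences(residue):
    """Return the sorted list of groups a residue belongs to."""
    return sorted(name for name, group in PREFERENCES.items() if residue in group)


phe = preferences("Phe")   # listed under both helix and sheet
gly = preferences("Gly")
```

Phe and Met appear in two groups because the slides list them as helix formers and, in parentheses, among the large sheet-preferring residues.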
Secondary structure prediction
GOR method
Visual, expert assessment
Neural networks
Nearest neighbour assignment
…
consensus filters
GOR method
State of central residue is influenced by adjacent positions in a window
–…A...X…….
–…A...Q…….
–…A…X…L..
Superseded by more accurate methods
Structure parsing
Multiple alignment
Conservation => core elements
Gaps, Pro, Gly, polar stretch => loops
3.5 periodicity => amphiphilic helix
2 periodicity => amphiphilic strand
Row of hydrophobics => buried strand
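One way to test a segment for the periodicities listed above is a hydrophobic moment calculation. This is a swapped-in technique not named on the slide: the sketch uses the Kyte-Doolittle hydropathy scale and assumes rotation angles of 100° per residue for a helix (360°/3.6) and 180° per residue for a strand (period 2).

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale, not from the slide)
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq, angle_deg):
    """Magnitude of the hydropathy vector sum for a given angle per residue."""
    angle = math.radians(angle_deg)
    x = sum(KD[aa] * math.cos(k * angle) for k, aa in enumerate(seq))
    y = sum(KD[aa] * math.sin(k * angle) for k, aa in enumerate(seq))
    return math.hypot(x, y)


helix_like = "LKKLLKK" * 3    # hydrophobic at i, i+3, i+4, i+7, ...
strand_like = "VKVKVKVK"      # hydrophobic at every second position
```

A segment whose moment is much larger at 100° than at 180° is a candidate amphiphilic helix; the reverse suggests an amphiphilic strand.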
What are neural networks?
Parallel, distributed information processing structures which draw their ultimate inspiration from neurons in the brain
Main class = feed-forward network alias multi-layer perceptron
Paradigm for tackling pattern classification and regression tasks
Why (not) use neural networks?
Efficient at secondary structure prediction
“Black boxes”
–Can deal with non-linear combination of multiple factors
–Rule-based explanation can over-simplify and mislead
Neural networks are made of units that are often assumed to be simple in the sense that their state can be described by a single number, their "activation" value. Each unit generates an output signal based on its activation. Units are connected to each other very specifically, each connection having an individual "weight" (again described by a single number). Each unit sends its output value to all other units to which it has an outgoing connection. Through these connections, the output of one unit can influence the activations of other units. The unit receiving the connections calculates its activation by taking a weighted sum of the input signals (i.e. it multiplies each input signal by the weight that corresponds to that connection and adds these products). The output is determined by the activation function based on this activation (e.g. the unit generates output or "fires" if the activation is above a threshold value). Networks learn by changing the weights of the connections.
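A single unit of this kind fits in a few lines. The sketch below assumes a hard threshold activation function (output 1 if the activation exceeds the threshold, otherwise 0); the function name and the example weights are hypothetical.

```python
def unit_output(inputs, weights, threshold):
    """Weighted sum of the inputs followed by a hard threshold."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation > threshold else 0


# With weights 0.6 and 0.6 and threshold 1.0 the unit fires only
# when both inputs are active.
both_on = unit_output([1, 1], [0.6, 0.6], threshold=1.0)
one_on = unit_output([1, 0], [0.6, 0.6], threshold=1.0)
```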
Feed-forward architecture Typical output 1.0 for all patterns
Output of each node in the network, for a given pattern p Squashing function f(x) is typically a sigmoid or logistic function
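A common way to write the node output for a feed-forward network is given below as an assumption, using the symbols of the back-propagation slide ($s_i$ for the signal of node $i$, $w_{ij}$ for the weight from node $j$ to node $i$); the superscript $p$ labels the pattern.

$$s_i^{p} = f\Big(\sum_j w_{ij}\, s_j^{p}\Big), \qquad f(x) = \frac{1}{1 + e^{-x}}$$

For the logistic function the derivative is $f'(x) = f(x)\,(1 - f(x))$, which is where the factor $s_i (1 - s_i)$ in the back-propagation gradient comes from.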
A two-layer neural network capable of calculating XOR. The numbers within the neurons represent each neuron's explicit threshold (which can be factored out so that all neurons have the same threshold, usually 1). The numbers that annotate arrows represent the weight of the inputs. This net assumes that if the threshold is not reached, zero (not -1) is output.
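The figure itself is not reproduced here, so the sketch below uses one well-known choice of weights and thresholds that computes XOR with three hidden threshold units; these particular numbers and the unit names A, B, C are assumptions and may differ from the figure. As stated in the caption, a unit outputs zero when its threshold is not reached.

```python
def step(total, threshold):
    """Output 1 if the threshold is reached, otherwise 0."""
    return 1 if total >= threshold else 0


def xor_net(x, y):
    """Two-layer threshold network computing XOR (assumed weights)."""
    a = step(1 * x, 1)            # hidden unit A: fires if x is on
    b = step(1 * x + 1 * y, 2)    # hidden unit B: fires if both are on
    c = step(1 * y, 1)            # hidden unit C: fires if y is on
    return step(1 * a - 2 * b + 1 * c, 1)   # output unit z


truth_table = {(x, y): xor_net(x, y) for x in (0, 1) for y in (0, 1)}
```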
X  Y  A-in  B-in  C-in  A-out  B-out  C-out  z
Training a feed-forward net
Supervised learning
–Training pattern and associated target = training pair
Input patterns in training set must have the same number of elements as the net has input nodes
Every target must have the same number of elements as the net has output nodes
Ability to generalise
The number of training patterns versus the number of network weights
–Rule of thumb: need at least 20 times as many patterns as network weights
The number of hidden nodes
–Too few nodes impedes learning
–Too many nodes impedes generalisation
The number of training iterations
Number of training iterations
Basic approach
Each training pair is of the form
–Pattern: **LSADQISTVQASFDK
–Target: H
Three target classes
DSSP classes          Prediction class
H, G                  helix
E                     strand
B, I, S, T, e, g, h   coil
Encoding
–Alanine: #
–Helix: 1 0 0
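The pattern and target encoding can be sketched as below. Several details are assumptions: a one-hot code over 21 symbols (20 amino acids plus `*` for positions beyond the chain end), the alphabetical symbol order, and the target vectors for strand and coil (only the helix vector 1 0 0 is given on the slide).

```python
ALPHABET = "ACDEFGHIKLMNPQRSTVWY*"   # assumed symbol order; '*' pads chain ends

# Helix target from the slide; strand and coil targets are assumed analogues
TARGETS = {"H": (1, 0, 0), "E": (0, 1, 0), "L": (0, 0, 1)}


def encode_window(window):
    """Concatenate one-hot vectors for every position of the window."""
    vector = []
    for symbol in window:
        one_hot = [0] * len(ALPHABET)
        one_hot[ALPHABET.index(symbol)] = 1
        vector.extend(one_hot)
    return vector


def reduce_dssp(code):
    """Map a DSSP class to the three prediction classes of the slide."""
    if code in ("H", "G"):
        return "H"      # helix
    if code == "E":
        return "E"      # strand
    return "L"          # everything else is treated as coil


pattern = encode_window("**LSADQISTVQASFDK")   # 17-residue window from the slide
target = TARGETS[reduce_dssp("H")]
```

With a 17-residue window this gives 17 x 21 = 357 input values, 17 of which are set to 1.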
Back-propagation algorithm
Gradient descent
$$w_{ij} \leftarrow w_{ij} - n \frac{\partial E}{\partial w_{ij}} + m\, \Delta w_{ij} \qquad (1)$$
Partial derivative of error E with respect to weights
$$\frac{\partial E}{\partial w_{ij}} = (s_i - d_i)\, s_i (1 - s_i)\, s_j \qquad (2)$$
s_i = signal emitted by output node O_i
s_j = signal emitted by hidden node H_j
d_i = desired value of output
n = rate of training (typical value 0.03)
m = smoothing factor (typical value 0.2), applied to the previous weight change Δw_ij
Example: signal s_j sent from H_j to O_i = 0.2; output s_i = 0.2; desired output = 1
∂E/∂w_ij = (0.2 - 1) × 0.2 × 0.8 × 0.2 = -0.0256, so w_ij will be increased according to (1)
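The worked example on this slide can be checked numerically. The sketch below assumes that the smoothing factor multiplies the previous weight change (a common momentum form) and that the output node itself emits 0.2, as the factors in the example imply; the function names and the starting weight are hypothetical.

```python
def gradient(s_i, d_i, s_j):
    """Partial derivative of the error with respect to w_ij, equation (2)."""
    return (s_i - d_i) * s_i * (1.0 - s_i) * s_j


def update(w, grad, prev_delta=0.0, n=0.03, m=0.2):
    """One gradient descent step with smoothing, equation (1).

    Returns the new weight and the weight change that was applied.
    """
    delta = -n * grad + m * prev_delta
    return w + delta, delta


grad = gradient(s_i=0.2, d_i=1.0, s_j=0.2)     # example from the slide
new_w, delta = update(w=0.5, grad=grad)        # 0.5 is an arbitrary start
```

Because the gradient is negative, the step raises the weight, which pushes the output towards the desired value of 1.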
Typical numbers
Training set
–Several hundred non-homologous protein chains
–Total number of residues = number of training patterns
Architecture
–Fully-connected
–17(21) input nodes
–1,808 weights
Prediction
–winner-takes-all
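The weight count can be reproduced with simple arithmetic. The slide does not state the number of hidden nodes; the sketch assumes 5 hidden nodes, 3 output nodes and one bias weight per node, a combination that happens to give 1,808. The winner-takes-all step is shown alongside, with an assumed class order H, E, L.

```python
def count_weights(n_inputs, n_hidden, n_outputs):
    """Weights of a fully connected net with one hidden layer and biases."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs


def winner_takes_all(outputs, classes="HEL"):
    """Predict the class whose output node has the highest value."""
    best = max(range(len(outputs)), key=lambda k: outputs[k])
    return classes[best]


n_inputs = 17 * 21                       # 17 window positions, 21 symbols each
n_weights = count_weights(n_inputs, n_hidden=5, n_outputs=3)
min_patterns = 20 * n_weights            # rule of thumb from the generalisation slide
prediction = winner_takes_all((0.2, 0.7, 0.1))
```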
Performance measures
Q3
–Three-state residue prediction
Correlation coefficient
SOV
–Segment overlap
Reliability index
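Q3 is the percentage of residues whose three-state class is predicted correctly. A minimal sketch, assuming predicted and observed states are given as equal-length strings over H, E and L:

```python
def q3(predicted, observed):
    """Percentage of residues with a correctly predicted three-state class."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


score = q3("HHHEEELLL", "HHHEELLLL")   # 8 of 9 residues agree
```

Q3 counts residues independently, so it does not penalise predictions that break a helix or strand into fragments; segment-based measures such as SOV address that.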
Improvements on basic approach
Using evolutionary information
–Up 6 %-points
Balanced training
–Equal representation of H, E, L patterns
Increase the amount of training data
–Up 4 %-points training on 128 / 318 proteins
Post-processing and filtering
Use an ensemble of networks
–Jury of 10 nets: up 2 %-points
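A jury decision can be sketched as an average over the output vectors of the individual networks followed by winner-takes-all. Averaging is an assumption here (the slide does not say how the jury combines its members), and the class order H, E, L is likewise assumed.

```python
def jury_prediction(net_outputs, classes="HEL"):
    """Average the output vectors of several nets and pick the top class."""
    n_nets = len(net_outputs)
    averaged = [sum(column) / n_nets for column in zip(*net_outputs)]
    best = max(range(len(averaged)), key=lambda k: averaged[k])
    return classes[best], averaged


# Three hypothetical nets that disagree on one residue
outputs = [(0.6, 0.3, 0.1), (0.2, 0.5, 0.3), (0.5, 0.4, 0.1)]
label, averaged = jury_prediction(outputs)
```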
PredictProtein server
PSIPRED PSI-Blast multiple alignment analysed by two feed-forward neural networks
Prediction of secondary structure by nearest neighbor analysis
Examples of two of the most accurate nearest neighbor prediction programs:
(1) NNSSP (accuracy up to 73.5%), run by choosing the PSSP / NNSSP option. The output probabilities Pa and Pb give a normalized score by converting the values of fa, fb and fcoils to a scale of 0-9.
(2) Predator (accuracy 75%), using the FSSP assignments of secondary structure to the training sequences. Predator does not provide a normalized score.
Predator predictions are shown below the NNSSP prediction on each line. The input sequence was the α subunit of S. typhimurium tryptophan synthase, Swiss-Prot ID TRPA_SALTY, accession P00929, which is in the training sequences since the 3D structure is known.
PredSS   aaaaaaaaaaaaa bbbbbb aaaaaaaaaaaaaaaaaaaaa
AA seq   MERYESLFAQLKERKEGAFVPFVTLGDPGIEQSLKIIDTLIEAGADALEL
Prob a
Prob b
Predator ___HHHHHHHHHHHHHH_EEEEEE_______HHHHHHHHHHH________

PredSS   aaaaaaaaaa aaaaaaaaaaaaa bbba
AA seq   GIPFSDPLADGPTIQNATLRAFAAGVTPAQCFEMLALIRQKHPTIPIGLL
Prob a
Prob b
Predator ______________HHHHHHHHH______HHHHHHHHHHH______HHHH

PredSS   aaaaaaa aaaaaaaaaaa bbbbb aaaaaaa
AA seq   MYANLVFNKGIDEFYAQCEKVGVDSVLVADVPVEESAPFRQAALRHNVAP
Prob a
Prob b
Predator HHHHH______HHHHHHHHH____EEEEEE________HHHHHHHH___E

PredSS   bbb aaaaaaaaa bbbb aaaaaaaaaaaaaaaa
AA seq   IFICPPNADDDLLRQIASYGRGYTYLLSRAGVTGAENRAALPLNHLVAKL
Prob a
Prob b
Predator EEE_______HHHHHHHH_____EEEEE______HHHHH_____HHHHHH

PredSS   aaa aaaaaaaaa aaaaaaaaaaa aaa
AA seq   KEYNAAPPLQGFGISAPDQVKAAIDAGAAGAISGSAIVKIIEQHINEPEK
Prob a
Prob b
Predator HHH_______________HHHHHHH___________HHHHHHHHH__HHH

PredSS   aaaaaaaaaaaaaaaaa
AA seq   MLAALKVFVQPMKAATRS
Prob a
Prob b
Predator HHHHHHHH__________
The Paracelsus challenge
Paracelsus was an alchemist active in the 16th century.
Protein design challenge: design an amino acid sequence that is at least 50 % identical to a known protein but folds into a different structure.
The first artificial sequence to meet the challenge, named Janus (Dalal et al. 1997, Nat. Struct. Biol. 4), converts the B1 domain from a beta structure (bbabb) into an alpha-helical structure (aa). Janus is structurally similar to the Rop protein. The Rop monomer forms a hairpin of two antiparallel helices. In nature Rop dimerises and forms a four-helix bundle.
(a) Structure of the B1 domain. Amino acids conserved in the Janus sequence are marked in red. (b) Structure of the ROP dimer. Amino acids occurring in the Janus sequence are marked in blue.
(a) Calculate the pairwise sequence identities of the B1 domain, Janus and Rop.
B1-Janus 27/56
B1-Rop 3/56
Janus-Rop 23/56
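The pairwise identities can be counted directly from the gap-free alignment of the three 56-residue sequences. A minimal sketch, with the sequences copied from the alignment slide and a hypothetical function name:

```python
B1 = "MTYKLILNGKTLKGETITEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE"
JANUS = "MTKKAILALNTAKFLRTQAAVLAAKLEKLGAQEANDNAVDLEDTADDLYKTLLVLA"
ROP = "GTKQEKTALNMARFIRSQTLTLLEKLNELDADEQADICESLHDHADELYRSCLARF"


def identical_positions(seq_a, seq_b):
    """Number of aligned positions carrying the same amino acid."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(a == b for a, b in zip(seq_a, seq_b))


identities = {
    "B1-Janus": identical_positions(B1, JANUS),
    "B1-Rop": identical_positions(B1, ROP),
    "Janus-Rop": identical_positions(JANUS, ROP),
}
```

Dividing each count by the alignment length of 56 gives the fractional identity.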
CEEEEECCCSSCEEEEECCCSCHHHHHHHHHHHHHHTTCCEEEEECCCEEEEEECC
MTYKLILNGKTLKGETITEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE  B1 domain
|| | ||   | |      || ||  ||   | |||| || | | ||  ||  |
MTKKAILALNTAKFLRTQAAVLAAKLEKLGAQEANDNAVDLEDTADDLYKTLLVLA  Janus
 ||    ||| | | | |   |  ||  | | |  |    | | || ||   |
GTKQEKTALNMARFIRSQTLTLLEKLNELDADEQADICESLHDHADELYRSCLARF  Rop monomer
CCHHHHHHHHHHHHHHHHHHHHHHHHHHTTCHHHHHHHHHHHHHHHHHHHHHHHHH
(b) In addition to the identical amino acids, mark in the alignment the substitutions that have a positive score in the BLOSUM62 matrix.
None between B1 and Janus. Eight between Janus and Rop.
(c) Do the mutations chosen for Janus occur naturally in the B1 family or the Rop family?
None of the B1/Janus mutations occurs in the B1 family. Of the Janus/Rop mutations, 7 occur in other members of the Rop family. 5 of these mutations are shared with the B1 sequence. Core amino acids of B1 have been changed, whereas the core of Rop has been preserved in Janus.
(d) How has the beta tendency of Janus been weakened and its alpha tendency strengthened?
Sequence alignment
(c) Sequence alignment of the B1 domain, Janus and the ROP monomer. Identical amino acids are marked with a vertical bar.
CEEEEECCCSSCEEEEECCCSCHHHHHHHHHHHHHHTTCCEEEEECCCEEEEEECC  B1 domain secondary structure
MTYKLILNGKTLKGETITEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE  B1 domain
|| | ||   | |      || ||  ||   | |||| || | | ||  ||  |
MTKKAILALNTAKFLRTQAAVLAAKLEKLGAQEANDNAVDLEDTADDLYKTLLVLA  Janus
 ||    ||| | | | |   |  ||  | | |  |    | | || ||   |
GTKQEKTALNMARFIRSQTLTLLEKLNELDADEQADICESLHDHADELYRSCLARF  Rop (monomer)
CCHHHHHHHHHHHHHHHHHHHHHHHHHHTTCHHHHHHHHHHHHHHHHHHHHHHHHH
Strong beta-formers: I, M, V
Beta-formers: C, F, L, Q, T, W, Y
Strong beta-breaker: E
Beta-breakers: H, K, N, P, S
Strong alpha-formers: A, E, L
Alpha-formers: F, H, M, Q, V, W
Strong alpha-breakers: G, P
Alpha-breakers: Y, N
Figure: valine and isoleucine side chains
Figure: proline (backbone atoms N, CA, CO) and glycine side chain
Figure: asparagine side chain (NH2, O)
(d) How has the beta tendency of Janus been weakened and its alpha tendency strengthened?
The figure is annotated with strong beta former B=IMV, beta former b=CTY, strong alpha former A=AEL, alpha former a=HQ, beta breaker i=KNPS, strong alpha breaker I=G. Helix formers and beta breakers have been favoured in the Janus sequence.
CEEEEECCCSSCEEEEECCCSCHHHHHHHHHHHHHHTTCCEEEEECCCEEEEEECC  B1 domain secondary structure
MTYKLILNGKTLKGETITEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE  B1 domain
BbbiABAiIibAiIAb_bAAB_AAbAAi__i_bA__iIB_IA_bb__Ab_b_bBbA
|| | ||   | |      || ||  ||   | |||| || | | ||  ||  |
MTKKAILALNTAKFLRTQAAVLAAKLEKLGAQEANDNAVDLEDTADDLYKTLLVLA  Janus
BbiiABAAAibAi_A_b_AABAAAiAAiAIA_AAi_iAB_AA_bA__AbibAABAA
 ||    ||| | | | |   |  ||  | | |  |    | | || ||   |
GTKQEKTALNMARFIRSQTLTLLEKLNELDADEQADICESLHDHADELYRSCLARF  Rop monomer
Ib__A_bAA_BA__B___bAbAAA_A_AA_A_A_A_BbA_A___A_AAb__bAA__
CCHHHHHHHHHHHHHHHHHHHHHHHHHHTTCHHHHHHHHHHHHHHHHHHHHHHHHH
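The shift in propensities can be quantified by counting residues of each class in the two sequences. The sketch below uses only the strong alpha-former (A, E, L) and strong beta-former (I, M, V) classes from the classification table in these slides; the function name is hypothetical and the counts are a rough indicator, not a Chou-Fasman prediction.

```python
B1 = "MTYKLILNGKTLKGETITEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE"
JANUS = "MTKKAILALNTAKFLRTQAAVLAAKLEKLGAQEANDNAVDLEDTADDLYKTLLVLA"

STRONG_ALPHA_FORMERS = set("AEL")   # class from the slide table
STRONG_BETA_FORMERS = set("IMV")    # class from the slide table


def count_class(sequence, residue_class):
    """Number of residues in the sequence that belong to the class."""
    return sum(aa in residue_class for aa in sequence)


alpha_b1 = count_class(B1, STRONG_ALPHA_FORMERS)
alpha_janus = count_class(JANUS, STRONG_ALPHA_FORMERS)
beta_b1 = count_class(B1, STRONG_BETA_FORMERS)
beta_janus = count_class(JANUS, STRONG_BETA_FORMERS)
```

Janus carries clearly more strong alpha formers and fewer strong beta formers than the B1 domain, in line with the answer given above.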
(e) The point mutation D30G has been observed to increase the thermodynamic stability of natural Rop. How else does this mutation promote the folding of the Janus protein, on the basis of the Chou-Fasman classification?
Glycine breaks helices. In Janus a tight turn is wanted at this position.