Palabra!

Welcome to the labor-intensive Isoelectric Point "estimator" / calculator.

Note: If a pKa is left at '0', the default pKa value from [2] will be used.  **This includes the N-terminal side chain and the C-terminal side chain: if you leave the slot at '0' for the side chain of the N-terminal or C-terminal amino acid, the default will be used, so check your sequence and enter the pKa you want if it is not the default value.  If the pKa is '0' for modified Tyrosine, phospho-Serine/Threonine, and/or New Residue 1-8, these slots will be considered neutral.  Please see additional information below.
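A minimal sketch of the '0' rule described in the note above, assuming a simple lookup table.  The function name, the slot names, and the numeric values in PLACEHOLDER_DEFAULTS are hypothetical and only illustrate the fallback; they are not the calculator's code and not the defaults from [2].

```python
# Placeholder values for illustration only; the real defaults come from
# reference [2] and are not reproduced here.
PLACEHOLDER_DEFAULTS = {"Glu": 4.5, "Lys": 10.0}


def resolve_pka(slot, entered, defaults):
    """Return the pKa to use for a slot, or None if the slot is neutral.

    An entered value of 0 means "not specified": the default is used when
    one exists, and the slot is treated as neutral when none exists.
    """
    if entered != 0:
        return entered
    return defaults.get(slot)  # None when the slot has no default


# A standard residue left at 0 falls back to its default.
print(resolve_pka("Glu", 0, PLACEHOLDER_DEFAULTS))
# A New Residue slot left at 0 has no default and stays neutral (None).
print(resolve_pka("New Residue 1", 0, PLACEHOLDER_DEFAULTS))
```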

Enter FASTA sequence for Isoelectric Point estimation:

Glutamic Acid pKa

Aspartic Acid pKa

Lysine pKa

Arginine pKa

Cysteine pKa     (note)

Tyrosine pKa

Histidine pKa

N-terminal amine pKa    Number of residues      (note)

N-terminal side pKa    (note)

C-terminal acid pKa    Number of residues      (note)

C-terminal side pKa    (note)

 

*the following residues and pKa values will be neutral if a pKa value is not entered*

Modified Tyrosine pKa     Number of Tyrosines modified    (note)

Phospho-Serine/Threonine pKa 1     Number of Serine/Threonine phosphorylated    (note)

Phospho-Serine/Threonine pKa 2

New Residue #1 pKa     Number of residues     (note)

New Residue #2 pKa     Number of residues  

New Residue #3 pKa     Number of residues  

New Residue #4 pKa     Number of residues  

New Residue #5 pKa     Number of residues  

New Residue #6 pKa     Number of residues  

New Residue #7 pKa     Number of residues  

New Residue #8 pKa     Number of residues  

 

Notes:

1)  For N-terminal residues, the default pKa is an average of the N-terminal pKa values in the table in reference [2].  The pI may differ significantly for sequences in which a single amino acid pKa value carries a lot of weight.  For example, compare short peptide sequences against [1] and consider the effect of the averaged N-terminal pKa noted here.  Sequence-specific N-terminal pKa values from [2] can be entered above to match the output of [1].  The N-terminal amine pKa values in reference [2] vary by as much as ~1.5 pKa units.

2)  The remaining default pKa values are available in reference [2].  Where a pKa value is not given in this reference, the default pKa is that of the residue as if it were an internal residue.

3)  Default pKa values should be closest to pKa values for functional groups on peptides in 9.8M Urea at 25°C.  Reference [3] does contain pKa values for functional groups in 8M Urea and water.  See also additional information link below.

4)  For Tyrosine modifications:  The pKa for modified Tyrosine replaces internal Tyrosine residues only.  If a Tyrosine is the N-terminal or C-terminal residue, the pKa of that Tyrosine (modified or unmodified) is set by N-terminal side pKa or C-terminal side pKa respectively.  **This means that if you expect all of the internal Tyrosines to be modified, you need to enter the number of modified Tyrosines as [Total #Tyr - 1] (if only the N-terminal or only the C-terminal residue is a modified Tyrosine) or [Total #Tyr - 2] (if both the N-terminal and C-terminal residues are modified Tyrosines).  This is necessary because, while counting residues, the program subtracts '1' from the total number of Tyrosines for each Tyrosine present at an end of the sequence.  This lets the program calculate the charge from N-terminal side pKa (or C-terminal side pKa) separately from the charge from Tyrosine pKa.  If only some of the Tyrosines are modified, the number you enter is subtracted from the total number of internal Tyrosines.
The second pKa of a phosphorylated Tyrosine can be entered in the New Residue #1 slot, with New Residue #2 and New Residue #3 used as well if you want to specify internal, N-terminal, and C-terminal values separately.  If you have 3 phosphoTyrosines in total and expect the secondary pKa of the phosphate to be the same for internal, N-terminal, and C-terminal positions, you can enter the secondary pKa in a single New Residue slot and specify 3 as the number of residues.  For more help go here.  In all cases of phosphorylation, select acidic residue.  Example pKa values for a phosphotyrosine residue include a primary pKa of ~2 and a secondary pKa of ~5-7 [14].  The pKa values for Serine/Threonine phosphorylations should not be drastically different [15].  Tyrosine sulfate can be represented by entering an appropriate pKa, as is done for phosphate but with one fewer pKa value.
For predictions of possible phosphorylation sites, see the following page at the Center for Biological Sequence Analysis (CBS) (note: limited non-academic use).
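The Tyrosine counting rule in note 4 can be sketched as follows.  This is an illustration under stated assumptions, not the calculator's code: the function name, the returned keys, and the choice to raise an error when too many modified Tyrosines are requested are all hypothetical.

```python
def tyrosine_counts(sequence, n_modified):
    """Split the Tyrosines in a sequence into terminal and internal groups.

    Terminal Tyrosines are handled by the N-terminal/C-terminal side pKa
    slots, so they are removed from the internal count before the number of
    modified Tyrosines is subtracted.
    """
    seq = sequence.upper()
    total = seq.count("Y")
    terminal = 0
    if seq and seq[0] == "Y":
        terminal += 1
    if len(seq) > 1 and seq[-1] == "Y":
        terminal += 1
    internal = total - terminal
    if not 0 <= n_modified <= internal:
        raise ValueError("modified Tyrosines must be between 0 and the internal count")
    return {
        "terminal": terminal,
        "internal_modified": n_modified,
        "internal_unmodified": internal - n_modified,
    }


# Three Tyrosines, one of them N-terminal: to modify every internal
# Tyrosine, the number entered is Total #Tyr - 1 = 2.
print(tyrosine_counts("YAYGYK", 2))
```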

5)  For Serine/Threonine modifications: the default of 0 for the number of residues modified leaves Serine and Threonine neutral.  Entering a number of modified residues also requires entering a pKa, since no default is provided for any pKa derived from phosphorylation.  The modification is assumed to produce a 'negative' residue.

6)  Use the New Residue slots for new charged groups added to residues by modification.  If a residue was previously charged and has a new charge due to modification, remove the modified residue(s) from the sequence before submission, and enter the corresponding number of New Residues for the modified residues that were removed.

7)  If the N-terminal amine is blocked, selecting 'blocked' excludes its pKa from the charge estimation algorithm.  The same applies to a blocked C-terminal acid.  If there are multiple N-terminal amines or C-terminal acids, account for blocking of some of them by entering the appropriate number of each (example: if there are 8 N-terminal amines and 1 is blocked, the input number should be 7).  Selecting 'blocked' will block all of the designated functional group.
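The arithmetic in note 7 is small enough to show directly.  The helper below is a hypothetical illustration (name and parameters are assumptions): it returns the number of terminal groups that still contribute charge.

```python
def groups_to_enter(total_groups, blocked_groups, block_all=False):
    """Number of terminal groups that still carry charge after blocking.

    block_all mirrors selecting 'blocked' on the form, which removes every
    group of that type; otherwise only the individually blocked groups are
    subtracted.
    """
    if block_all:
        return 0
    if not 0 <= blocked_groups <= total_groups:
        raise ValueError("blocked_groups must be between 0 and total_groups")
    return total_groups - blocked_groups


# 8 N-terminal amines with 1 blocked: enter 7.
print(groups_to_enter(8, 1))
```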

8)  Alkylation of Cysteines on predominantly basic proteins, or lack of alkylation, can affect pI significantly.  Try Hen Egg White Lysozyme for example.  Designating "blocked" for Cysteines will neglect charge contribution from all Cysteine residues, not just an even number of them, as might be expected for retained disulphide structure.  In the case you expect disulphides to be stable, please remove the Cysteine residues manually prior to submitting sequence.

9)  For N-terminal side and C-terminal side groups, whether the group is treated as a positively or negatively charged residue depends on which amino acid is present at that position in the submitted sequence.  To reflect a modification accurately, substitute an appropriate residue into the sequence and enter an appropriate pKa value, since leaving this value at '0' may apply a default value that is not desired.

10)  The pI estimation is based on the equations described in reference [4].  This reference also contains background on the limitations of this estimation, as well as additional pKa values from other sources.
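As a rough illustration of this kind of estimate (a sketch only, not the calculator's code and not taken verbatim from [4]): each ionizable group contributes a fractional charge given by the Henderson-Hasselbalch relation, and the pI is the pH at which the net charge crosses zero.  The function names, the 0-14 search range, the tolerance, and the use of bisection are assumptions made for this example.

```python
def net_charge(ph, basic_pkas, acidic_pkas):
    """Net charge at a given pH from lists of basic and acidic group pKa values."""
    # A basic group carries +1 when protonated (pH well below its pKa).
    positive = sum(1.0 / (1.0 + 10.0 ** (ph - pka)) for pka in basic_pkas)
    # An acidic group carries -1 when deprotonated (pH well above its pKa).
    negative = sum(1.0 / (1.0 + 10.0 ** (pka - ph)) for pka in acidic_pkas)
    return positive - negative


def estimate_pi(basic_pkas, acidic_pkas, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection.

    Net charge decreases monotonically with pH, so the search keeps the
    half of the interval in which the sign change lies.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(mid, basic_pkas, acidic_pkas) > 0:
            lo = mid  # still positive: the pI lies at higher pH
        else:
            hi = mid  # zero or negative: the pI lies at lower pH
    return (lo + hi) / 2.0


# One basic group (pKa 9) and one acidic group (pKa 3): by symmetry the
# net charge is zero midway between the two pKa values.
print(round(estimate_pi([9.0], [3.0]), 2))
```

Each residue count from the form would simply repeat its pKa in the corresponding list, for example three Lysines contributing three entries to basic_pkas.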

11)  Reference [5] illustrates the limitations referred to in [4].  Granted, the conditions in [5] are expected to differ drastically from those that apply to the calculator above, given the nature of the default pKa values.  The references here and in note 12 are provided as quick examples of these effects; the actual changes in pKa for these exceptions under the conditions of notes 2 and 3 are not given here.

12)  Nearest neighbor effects [16] can be accounted for to a limited extent by modifying the input sequence appropriately and using the New Residue slots above.  Additional residue slots for specifying pKa values in proteins and peptides where nearest neighbor effects are important, such as Protamine (SwissProt accession # P04553), may be added at some point in the future.  Please do not focus Protamine, it is just an example.

13)  For sample sequence and contact information go here.

14)  For examples of ways to test the program to see that it is working, and examples of usage with modified residues and expected output check here.  See this page for another example of usage(click).

 

 

references:

[1] Gasteiger E., Hoogland C., Gattiker A., Duvaud S., Wilkins M.R., Appel R.D., Bairoch A.;  Protein Identification and Analysis Tools on the ExPASy Server;  (In) John M. Walker (ed): The Proteomics Protocols Handbook, Humana Press (2005) pp. 571-607.  ExPASy ProtParam Tool :  Protein and peptide sequence analysis at the SwissProt website.  http://www.expasy.ch/tools/protparam.html  

[2] Bjellqvist, B., Basse, B., Olsen, E., Celis, J.E., Electrophoresis 1994, 15, 529-539.

[3] Bjellqvist, B., Hughes, G., Pasquali, C., Paquet, N., Ravier, F., Sanchez, JC., Frutiger, S., Hochstrasser, D., Electrophoresis, 1993, 14, 1023-1031.

[4] pI algorithm by Dr. David Tabb at the Yates Research Group website:  http://fields.scripps.edu/DTASelect/20010710-pI-Algorithm.pdf  Tabb DL, McDonald WH, Yates JR 3rd, DTASelect and Contrast: Tools for Assembling and Comparing Protein Identifications from Shotgun Proteomics. J. Proteome Res. 2002 1:21-26.

[5] Nielsen, JE., Andersen, KV., Honig, B., Hooft, RW., Klebe, G., Vriend, G., Wade, RC., Protein Eng. 1999 Aug;12(8):657-62.

[6] Java & XML for Dummies, by Barry Burd.  Copyright 2002 Wiley Publishing Inc.  ISBN:0764516582  (an apparently rare book in that it describes defining your classpath variable, and ensures you will have a working java program before getting to Chapter 2)

[7]  http://www.java2s.com/ExampleCode/CatalogExampleCode.htm  

[8]  http://www.coreservlets.com/Apache-Tomcat-Tutorial/Tomcat-5.0-and-4.0.html#Java-Home  -not sure how long this will be available.  From "Core Servlets and JavaServer Pages, vol. 1: Core Technologies, Second Edition"  by Marty Hall and Larry Brown.  Check out the book for excellent help with Tomcat : ISBN 0130092290

[9]  Biochemical Calculations 2nd Edition, by Irwin H. Segel.  ISBN: 0471774219 -excellent resource for calculations in biochemistry, pI estimation.. etc.

[10]  Sam's Teach Yourself Java 2 in 21 Days, by Laura Lemay and Rogers Cadenhead.  ISBN: 0672323702

[11]  Skoog, B, and Wichman, A, Calculation of the isoelectric points of polypeptides from the amino acid composition, 1986, Trends Anal. Chem. 5, 82-83.

[12]  Costas S. Patrickios and Edna N. Yamasaki, Polypeptide Amino Acid Composition and Isoelectric Point, Analytical Biochemistry 231, 82-91 (1995).

[13]  Ribeiro, JM and Sillero, A, An algorithm for the computer calculation of the coefficients of a polynomial that allows determination of isoelectric points of proteins and other macromolecules, Comput. Biol. Med. 1990, 20, 235-242.

[14]  M. Wojciechowski, T. Grycuk, J.M. Antosiewicz, and B. Lesyng;  Prediction of Secondary Ionization of the Phosphate Group in Phosphotyrosine Peptides;  Biophysical Journal Volume 84 February 2003 pp 750-756.

[15]  Hoffman R, Reichart I, Wachs WO, Zeppezauer M, Kalbitzer HR;  1H and 31P NMR spectroscopy of phosphorylated model peptides.  International Journal of Peptide and Protein Research 1994 Sep;44(3):193-8. PMID : 7529751

[16]  Rabenstein DL, Hari SP, Kaerner A;  Determination of acid dissociation constants of peptide side-chain functional groups by two-dimensional NMR;  Analytical Chemistry 1997 Nov 1;69(21):4310-6

[17] Another good explanation by Dr. Putnam of why pI estimation remains an 'estimation'  http://www.scripps.edu/~cdputnam/protcharge.html

 

 

 

Thanks to ABRF discussion list members for the feedback!

Thanks to Dr. Rooney, for pointing out possible areas of improvement in estimation of pI.

Disclaimer:  Please take care in using information from this site.  Check information against other resources.