Isotopident Documentation

Isotopident estimates the theoretical isotopic distribution of a peptide or protein, a polynucleotide, or a chemical compound from its composition: a sequence of amino acids in 1-letter or 3-letter code, a sequence of nucleotides, or a chemical formula. It also computes the monoisotopic mass, and it predicts the most likely isotope combination and the exact mass of the given input. Please read the following notes (on the permitted input syntax, for example) to make better use of the tool.

Name of the query: Fill this field with a descriptive name for the protein, peptide, polynucleotide or chemical compound being studied. By default, the program assigns the name "Unknow".

Type of composition: Indicate the type of input sequence here. There are four possibilities: Protein One-letter code, Protein Three-letter code, Bases code and Chemical formula.

Significative ciphers: By default, the output shows three significant digits. The user can change this setting to obtain results with four, two, or even one significant digit.

Molecular composition: This field lets the user enter (by copy and paste, or by uploading a file) the set of characters that corresponds to the peptide/protein, polynucleotide or chemical compound under consideration. Use only one of these two methods (copy-paste or file upload).

For example (the three sequences marked with an asterisk * represent the same peptide):

Sequence entered                    Type of composition
AGVYNEPLCR*                         Protein One-letter code
AlaGlyValTyrAsnGluProLeuCysArg*     Protein Three-letter code
GATACA                              Bases code
C48H76N14O15S*                      Chemical formula

Protonated or deprotonated molecules ([M+H]+ or [M-H]-) and neutral molecules ([M]) can be specified within the Type of composition option for the entered peptide/protein.

If a particular isotope is not specified, the program uses the average mass of each entered element.

It is also possible to use parentheses to group symbols within a Chemical formula.

For example:


Supported data formats: This tool only accepts data that follows these rules:

  • Protein one-letter and protein three-letter codes must agree with the IUPAC standard table of amino acids.
  • Admitted bases: A, C, T, G and U.
  • When querying the 'Chemical formula' type, nested parentheses are not permitted.
  • It is possible to specify a particular isotope, e.g. [13]C, which corresponds to the isotope of carbon with mass number 13.
  • The input is case sensitive. Therefore, h2o will fail, but H2O will work.
  • Amino acid names and element names such as 'Valine' or 'Carbon' are not permitted.
  • The uploaded file must be a plain text file (marked-up files are not permitted).
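
The rules for the 'Chemical formula' type can be illustrated with a small parser. This is only a sketch under stated assumptions, not the code used by Isotopident: the names parse_flat and parse_formula are hypothetical, element symbols are checked for shape only (an uppercase letter with an optional lowercase letter), and isotopes are kept as separate keys such as "[13]C".

```python
import re
from collections import Counter

# One token: optional "[mass]" isotope prefix, an element symbol, optional count.
ATOM = re.compile(r"(?:\[(\d+)\])?([A-Z][a-z]?)(\d*)")
# One non-nested parenthesised group with an optional multiplier.
GROUP = re.compile(r"\(([^()]*)\)(\d*)")

def parse_flat(text, factor=1):
    """Count atoms in a parenthesis-free fragment such as '[13]C2H6'."""
    counts = Counter()
    pos = 0
    while pos < len(text):
        m = ATOM.match(text, pos)
        if m is None:
            # Lowercase symbols, stray parentheses, etc. end up here.
            raise ValueError(f"unrecognized character {text[pos]!r}")
        isotope, symbol, n = m.groups()
        key = f"[{isotope}]{symbol}" if isotope else symbol
        counts[key] += (int(n) if n else 1) * factor
        pos = m.end()
    return counts

def parse_formula(formula):
    """Return {symbol: count}; nested parentheses raise ValueError."""
    counts = Counter()
    pos = 0
    for m in GROUP.finditer(formula):
        counts.update(parse_flat(formula[pos:m.start()]))
        counts.update(parse_flat(m.group(1), int(m.group(2) or 1)))
        pos = m.end()
    counts.update(parse_flat(formula[pos:]))
    return dict(counts)

print(parse_formula("(Mn4O2)5FeBr2H2SO4"))
print(parse_formula("([56]Fe[16]O)6"))
```

With this sketch, H2O parses while h2o raises an error, which mirrors the case-sensitivity rule above.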

More input examples:

    One-letter code
    Three-letter code
  • MetValProIleTyrCysGlyAlaProAspPheIleTyr
  • MetProCysPheGlyGluAspProIleLeuSerValCys
  • MetHisTyrGluThrGlyAsnIleSerPheValHisPro
    Bases code
    Chemical formula
  • [12]C33[2]H45[15]N35[16]O44
  • (Mn4O2)5FeBr2H2SO4
  • ([56]Fe[16]O)6

Output: The output page displays a table summarizing the chemical elements and their corresponding (average) masses. For example, if the input is:


the corresponding output would be:

Average mass:

Note: whenever the entered sequence is a Protein, the system adds one water molecule (H-NH-CH(R)-CO-OH), since the N-terminal end needs an H and the C-terminal end an OH. The system also indicates if an entered isotope is not defined.

The elements are shown in the following order: Cn1 Hn2 Nn3 On4 Sn5...
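
As an illustration of the water addition and the element ordering described above, the following sketch derives the chemical formula of a peptide given in one-letter code. It is not the tool's own code: the function names are hypothetical, and the residue table lists the standard composition of each amino acid minus one water molecule.

```python
# Residue compositions (amino acid minus one H2O) as (C, H, N, O, S) counts.
RESIDUES = {
    "G": (2, 3, 1, 1, 0),  "A": (3, 5, 1, 1, 0),  "S": (3, 5, 1, 2, 0),
    "P": (5, 7, 1, 1, 0),  "V": (5, 9, 1, 1, 0),  "T": (4, 7, 1, 2, 0),
    "C": (3, 5, 1, 1, 1),  "L": (6, 11, 1, 1, 0), "I": (6, 11, 1, 1, 0),
    "N": (4, 6, 2, 2, 0),  "D": (4, 5, 1, 3, 0),  "Q": (5, 8, 2, 2, 0),
    "K": (6, 12, 2, 1, 0), "E": (5, 7, 1, 3, 0),  "M": (5, 9, 1, 1, 1),
    "H": (6, 7, 3, 1, 0),  "F": (9, 9, 1, 1, 0),  "R": (6, 12, 4, 1, 0),
    "Y": (9, 9, 1, 2, 0),  "W": (11, 10, 2, 1, 0),
}
ELEMENTS = ("C", "H", "N", "O", "S")

def peptide_composition(sequence):
    """Sum the residue compositions and add one water for the termini."""
    totals = [0, 0, 0, 0, 0]
    for aa in sequence:
        if aa not in RESIDUES:
            raise ValueError(f"unrecognized character {aa!r}")
        for i, n in enumerate(RESIDUES[aa]):
            totals[i] += n
    totals[1] += 2  # H on the N-terminus plus the H of the C-terminal OH
    totals[3] += 1  # O of the C-terminal OH
    return dict(zip(ELEMENTS, totals))

def format_formula(composition):
    """Write the formula in the order C, H, N, O, S, omitting counts of 1."""
    return "".join(
        f"{el}{n if n > 1 else ''}" for el, n in composition.items() if n
    )

print(format_formula(peptide_composition("AGVYNEPLCR")))  # C48H76N14O15S
```

The printed formula matches the chemical formula listed for the same peptide in the input examples.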

In addition, the program plots an interactive 2D graph showing the theoretical isotopic distribution of the entered sequence. For example, the theoretical isotopic distribution for the peptide entered previously would be:

Isotopic distribution
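
One common way to obtain such a distribution is to convolve the isotope abundances of every atom in the formula. The sketch below does this at integer-mass resolution; it is an illustration only and not necessarily the algorithm Isotopident uses. The abundance values are approximate natural abundances, and the names convolve and isotopic_distribution are hypothetical.

```python
# Approximate natural abundances, indexed by the number of extra neutrons
# relative to the lightest stable isotope of the element.
ABUNDANCE = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
    "S": [0.9499, 0.0075, 0.0425, 0.0, 0.0001],
}

def convolve(a, b, limit):
    """Convolve two peak lists, keeping at most `limit` peaks."""
    out = [0.0] * min(len(a) + len(b) - 1, limit)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < len(out):
                out[i + j] += x * y
    return out

def isotopic_distribution(composition, limit=20):
    """Probability of each nominal mass offset (M+0, M+1, ...)."""
    dist = [1.0]
    for element, count in composition.items():
        for _ in range(count):  # add one atom at a time
            dist = convolve(dist, ABUNDANCE[element], limit)
    return dist

peptide = {"C": 48, "H": 76, "N": 14, "O": 15, "S": 1}
dist = isotopic_distribution(peptide)
# Normalise so that the most intense peak is 1, as on the plot's Y axis.
relative = [p / max(dist) for p in dist]
for offset, value in enumerate(relative[:5]):
    print(f"M+{offset}: {value:.3f}")
```

For large proteins, adding one atom at a time becomes slow; repeated squaring of each element's distribution is a usual refinement.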

Zooming in and out is supported. To zoom in, drag the mouse downwards to draw a box. To zoom out, drag the mouse upwards. If you zoom in far enough, the plot becomes unreliable. In particular, if the total extent of the plot is more than 2^32 times the extent of the visible area, quantization errors can distort the displayed points or lines. Note that 2^32 is over 4 billion.

A text file with the plotted points can be downloaded. After some plot-related header lines, this file has two columns: the mass in Daltons and the intensity, which correspond to the X and Y axes respectively. For the previous example, this flat file would be:

TitleText: Isotopic distribution
YRange: 0,1
XLabel: m/z
YLabel: Intensity ( x 100)
Marks: points
Lines: off
Impulses: on
DataSet: peak
9006.388 0.029
9007.388 0.147
9008.388 0.378
9009.388 0.666
9010.388 0.902
9011.388 1.000
9012.388 0.943
9013.388 0.778
9014.388 0.572
9015.388 0.380
9016.388 0.231
9017.388 0.130
9018.388 0.068
9019.388 0.033
9020.388 0.015
9021.388 0.007
9022.388 0.003
9023.388 0.001
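
A downloaded file of this shape can be read with a few lines of code. The sketch below assumes, based only on the example above, that header lines have the form "Key: value" and that data lines hold two whitespace-separated numbers; the function name parse_plot_file and the shortened SAMPLE text are hypothetical.

```python
def parse_plot_file(text):
    """Split a plot file into a header dict and a list of (mass, intensity)."""
    header, peaks = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if ":" in line:
            # Header line such as "XLabel: m/z"
            key, _, value = line.partition(":")
            header[key.strip()] = value.strip()
        else:
            # Data line: mass in Daltons, then relative intensity
            mass, intensity = line.split()
            peaks.append((float(mass), float(intensity)))
    return header, peaks

SAMPLE = """TitleText: Isotopic distribution
YRange: 0,1
XLabel: m/z
DataSet: peak
9010.388 0.902
9011.388 1.000
9012.388 0.943
"""

header, peaks = parse_plot_file(SAMPLE)
base_peak = max(peaks, key=lambda p: p[1])
print(header["TitleText"], base_peak)
```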

In addition, the system shows the following information:

Most likely isotope combination:

The system calculates the most likely combination of isotopes fitting the given formula, together with the probability of this combination. This can be useful when a high-precision mass is wanted. If the combination is the dominant contributor to the response at a particular integer mass, then it should be readily measurable on an accurate mass spectrometer.

Exact mass is 9010.402
Probability of combination is 7.595%
The most likely combination is 52.91% of those masses rounding to 9006 amu.
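
To make the idea of a most likely isotope combination concrete, the sketch below searches, element by element, for the isotope counts with the highest multinomial probability and multiplies the results. It is a brute-force illustration under assumed approximate abundances, not the algorithm used by Isotopident, and all names in it are hypothetical.

```python
from math import exp, lgamma, log

# Approximate natural abundances keyed by mass number.
ISOTOPES = {
    "C": {12: 0.9893, 13: 0.0107},
    "H": {1: 0.999885, 2: 0.000115},
    "N": {14: 0.99636, 15: 0.00364},
    "O": {16: 0.99757, 17: 0.00038, 18: 0.00205},
    "S": {32: 0.9499, 33: 0.0075, 34: 0.0425, 36: 0.0001},
}

def log_multinomial(counts, probs):
    """Log-probability of drawing exactly `counts` atoms of each isotope."""
    result = lgamma(sum(counts) + 1)
    for k, p in zip(counts, probs):
        result += k * log(p) - lgamma(k + 1)
    return result

def splits(n, parts):
    """Yield every way to distribute n atoms among `parts` isotopes."""
    if parts == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in splits(n - first, parts - 1):
            yield (first,) + rest

def most_likely_combination(composition):
    """Return ({element: {mass number: count}}, probability)."""
    best, total_logp = {}, 0.0
    for element, n in composition.items():
        isotopes = ISOTOPES[element]
        probs = list(isotopes.values())
        # Elements are independent, so each one can be maximised separately.
        counts = max(splits(n, len(probs)),
                     key=lambda c: log_multinomial(c, probs))
        best[element] = dict(zip(isotopes, counts))
        total_logp += log_multinomial(counts, probs)
    return best, exp(total_logp)

combo, prob = most_likely_combination(
    {"C": 48, "H": 76, "N": 14, "O": 15, "S": 1})
print(combo["C"], f"{prob:.1%}")
```

For a small peptide the most likely combination contains only the lightest isotopes; for larger molecules it usually includes some heavy isotopes (with 100 carbon atoms, for instance, one 13C is more likely than none).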

The system only computes the exact mass when the entered Chemical formula in the Molecular composition field specifies one or more particular isotopes for a given element. Conversely, the average mass is computed when the input has no specific isotopes. For example, if the entered sequence is:


Then the system will only compute the exact mass, as follows:

Not stable isotope
Exact mass:

Note that if the user enters an isotope that is not defined, the system shows a message indicating that it is not a stable isotope. For instance, in the previous example [12]O is not a stable isotope.

Finally, if the sequence entered in the Molecular composition field is invalid, the system displays a message identifying the unrecognized character(s), highlighted in red. In this case, please also check that you have selected the right Type of composition.

Notes and definitions:

  • An atomic mass (average atomic mass or atomic weight) found in the periodic table is the weighted average mass of all the isotopes of an element. The weighting is based on the percent distribution of isotopes on Earth.

  • Monoisotopic mass is calculated using the mass of the most abundant natural isotope of each constituent element. An average mass is calculated using the "atomic weight" of each constituent element, which is the weighted average of all its natural isotopes. This definition of monoisotopic mass conforms to that given in Standard Definitions of Terms Relating to Mass Spectrometry, Phil. Price, J. Am. Soc. Mass Spectrom. (1991) 2 336-348. Some sources suggest that monoisotopic mass is calculated from the lightest natural isotope of each element, but this is wrong.

  • An isotopic mass is the mass of one atom of a specific isotope in atomic mass units. For example, the isotopic mass of ¹H is 1.0078 amu and that of ²H is 2.0141 amu. Note: The atomic mass is the weighted average of the naturally occurring isotopic masses. The weighting is done according to the (%) natural abundance on Earth.

  • One atomic mass unit (amu) is 1/12 the mass of one atom of ¹²C. The masses of all other isotopes are derived by comparison to ¹²C, one atom of which weighs 12 amu.

  • A mass spectrum is a plot of m/z or mass (abscissae) versus the intensity, frequently normalised to 100% for the most intense ion detected (ordinates). This is produced by scanning the analyser to transmit ions (or release them from a trap) for a predefined range of m/z values over a fixed period of time. Thomson is the unit for mass-to-charge ratio.

  • m/z is the ratio of mass to charge of the ion detected. z is often unity but can be a larger integer, especially in ESI-MS.

  • You can also see some notes about the MS Identification process.
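
The difference between the monoisotopic mass, the average mass and m/z defined above can be shown with a short calculation. This is a sketch with assumed, rounded reference values for the element masses and the proton mass; the function names mass and mz are hypothetical, and the example formula is the peptide C48H76N14O15S from the input examples.

```python
# Mass of the most abundant isotope of each element (amu, rounded).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
                "O": 15.9949146, "S": 31.9720707}
# Atomic weights: abundance-weighted averages over natural isotopes (rounded).
AVERAGE = {"C": 12.0107, "H": 1.00794, "N": 14.0067,
           "O": 15.9994, "S": 32.065}
PROTON = 1.00727647  # mass of a proton in amu (rounded)

def mass(composition, table):
    """Sum the per-element masses from the chosen table."""
    return sum(table[el] * n for el, n in composition.items())

def mz(neutral_mass, z):
    """m/z of the ion [M+zH]z+ formed by adding z protons."""
    return (neutral_mass + z * PROTON) / z

peptide = {"C": 48, "H": 76, "N": 14, "O": 15, "S": 1}
mono = mass(peptide, MONOISOTOPIC)
avg = mass(peptide, AVERAGE)
print(f"monoisotopic {mono:.4f}  average {avg:.3f}")
print(f"[M+H]+ {mz(mono, 1):.4f}  [M+2H]2+ {mz(mono, 2):.4f}")
```

The two masses differ by less than one unit for this peptide, but the gap grows with molecular size, which is why the distinction matters for proteins.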

Acknowledgments: I would like to express my gratitude to Dr. Anton Erasmuson for his support with the algorithm used, and to Prof. Denis Hochstrasser and the Swiss Institute of Bioinformatics for the academic opportunity. In addition, I would like to thank Elisabeth Gasteiger and Willy Bienvenut for their supervision.
