Which amino acids can be phosphorylated?

Definitely serine, tyrosine, and threonine. What about aspartate and glutamate? I thought the O- on the carboxylic acid side group could be phosphorylated but am getting mixed responses to this. Any thoughts / any other amino acids I'm missing?

There's a Wikinome page called Phosphorylation of unusual amino acids describing phosphorylation of histidine, aspartate, cysteine, lysine, and arginine. Lysine and arginine are apparently phosphorylated on the nitrogens. I was a little surprised to see histidine and aspartate listed as unusual, as histidine kinases seem to come up often in my area of research.
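As a rough way to keep the residues mentioned so far straight, here is a minimal Python sketch that tabulates them and scans a sequence for candidate sites. The table name, the function name, the acceptor-atom labels, and the example peptide are all made up for illustration. Glutamate is left out because nothing in this thread settles it, and real phosphorylation depends on kinase specificity and structural context, not just on residue identity.

```python
# Side-chain atom that accepts the phosphoryl group for each residue
# mentioned in this thread. The labels are descriptive shorthand only.
PHOSPHO_ACCEPTORS = {
    "S": ("serine", "O (hydroxyl)"),
    "T": ("threonine", "O (hydroxyl)"),
    "Y": ("tyrosine", "O (hydroxyl)"),
    "H": ("histidine", "N (imidazole)"),
    "K": ("lysine", "N (amine)"),
    "R": ("arginine", "N (guanidino)"),
    "D": ("aspartate", "O (carboxyl, acyl phosphate)"),
    "C": ("cysteine", "S (thiol)"),
}


def candidate_sites(sequence):
    """Return (1-based position, residue name, acceptor atom) for every
    residue in a one-letter sequence that appears in the table above."""
    sites = []
    for position, letter in enumerate(sequence.upper(), start=1):
        if letter in PHOSPHO_ACCEPTORS:
            name, atom = PHOSPHO_ACCEPTORS[letter]
            sites.append((position, name, atom))
    return sites


if __name__ == "__main__":
    # "MASDHK" is an invented peptide, not a real protein fragment.
    for site in candidate_sites("MASDHK"):
        print(site)
```

The scan only answers "could this side chain chemically carry a phosphate", which is the question asked here; it says nothing about whether a given site is actually modified in a cell.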

I'm also aware of at least one non-proteinogenic amino acid, creatine, which is phosphorylated by a dedicated creatine kinase. It functions as a sort of phosphate shuttle for rapid regeneration of cellular ATP from ADP in situations of high metabolic demand.


Phosphorylation on unusual amino acids. (

PubMed Abstracts:

Guimarães-Ferreira L. Role of the phosphocreatine system on energetic homeostasis in skeletal and cardiac muscles. Einstein (Sao Paulo). 2014;12(1):126-131. doi:10.1590/s1679-45082014rb2741


In biochemistry, a ribonucleotide is a nucleotide containing ribose as its pentose component. It is considered a molecular precursor of nucleic acids. Nucleotides are the basic building blocks of DNA and RNA. Ribonucleotides themselves are the monomers that make up RNA. Reduction of a ribonucleotide by the enzyme ribonucleotide reductase (RNR) yields a deoxyribonucleotide, which is the essential building block for DNA. [1] There are several differences between DNA deoxyribonucleotides and RNA ribonucleotides. Successive nucleotides are linked together by 3'-5' phosphodiester bonds.

Ribonucleotides are also utilized in other cellular functions. These special monomers are utilized in both cell regulation and cell signaling as seen in adenosine-monophosphate (AMP). Furthermore, ribonucleotides can be converted to adenosine triphosphate (ATP), the energy currency in organisms. Ribonucleotides can be converted to cyclic adenosine monophosphate (cyclic AMP) to regulate hormones in organisms as well. [1] In living organisms, the most common bases for ribonucleotides are adenine (A), guanine (G), cytosine (C), or uracil (U). The nitrogenous bases are classified into two parent compounds, purine and pyrimidine.

Proteins are one of the most abundant organic molecules in living systems and have the most diverse range of functions of all macromolecules. Proteins may be structural, regulatory, contractile, or protective; they may serve in transport, storage, or membranes; or they may be toxins or enzymes. Each cell in a living system may contain thousands of proteins, each with a unique function. Their structures, like their functions, vary greatly. They are all, however, polymers of amino acids, arranged in a linear sequence.

Figure 1. Amino acids have a central asymmetric carbon to which an amino group, a carboxyl group, a hydrogen atom, and a side chain (R group) are attached.

Amino acids are the monomers that make up proteins. Each amino acid has the same fundamental structure, which consists of a central carbon atom, also known as the alpha (α) carbon, bonded to an amino group (NH2), a carboxyl group (COOH), and to a hydrogen atom. Every amino acid also has another atom or group of atoms bonded to the central atom known as the R group (Figure 1).

The name “amino acid” is derived from the fact that these molecules contain both an amino group and a carboxylic acid group in their basic structure. There are 20 amino acids commonly present in proteins. Nine of these are considered essential amino acids in humans because the human body cannot produce them and they are obtained from the diet.

For each amino acid, the R group (or side chain) is different (Figure 2).

Practice Question

Figure 2. There are 20 amino acids commonly found in proteins, each with a different R group (variant group) that determines its chemical nature.

Which categories of amino acid would you expect to find on the surface of a soluble protein, and which would you expect to find in the interior? What distribution of amino acids would you expect to find in a protein embedded in a lipid bilayer?

The chemical nature of the side chain determines the nature of the amino acid (that is, whether it is acidic, basic, polar, or nonpolar). For example, the amino acid glycine has a hydrogen atom as the R group. Amino acids such as valine, methionine, and alanine are nonpolar or hydrophobic in nature, while amino acids such as serine, threonine, and cysteine are polar and have hydrophilic side chains. The side chains of lysine and arginine are positively charged, and therefore these amino acids are also known as basic amino acids. Proline has an R group that is linked to the amino group, forming a ring-like structure. Proline is an exception to the standard structure of an amino acid since its amino group is not separate from the side chain (Figure 2).

Amino acids are represented by a single upper case letter or a three-letter abbreviation. For example, valine is known by the letter V or the three-letter symbol Val. Just as some fatty acids are essential to a diet, some amino acids are necessary as well. They are known as essential amino acids, and in humans they include isoleucine, leucine, and lysine. Essential amino acids are those necessary for construction of proteins in the body but not produced by the body; which amino acids are essential varies from organism to organism.

Figure 3. Peptide bond formation is a dehydration synthesis reaction. The carboxyl group of one amino acid is linked to the amino group of the incoming amino acid. In the process, a molecule of water is released.

The sequence and the number of amino acids ultimately determine the protein’s shape, size, and function. Each amino acid is attached to another amino acid by a covalent bond, known as a peptide bond, which is formed by a dehydration reaction. The carboxyl group of one amino acid and the amino group of the incoming amino acid combine, releasing a molecule of water. The resulting bond is the peptide bond (Figure 3).
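Because each peptide bond releases one water molecule, a chain of n amino acids weighs less than the sum of its free amino acids by (n - 1) waters. The Python sketch below illustrates that bookkeeping. The function names are hypothetical, and the mass table holds approximate average molecular weights for only three amino acids as an illustrative subset.

```python
WATER_MASS = 18.015  # g/mol, approximate average mass of H2O

# Approximate average molecular weights of the FREE amino acids (g/mol).
# Illustrative subset only; extend as needed.
FREE_AMINO_ACID_MASS = {
    "G": 75.07,   # glycine
    "A": 89.09,   # alanine
    "S": 105.09,  # serine
}


def waters_released(sequence):
    """Number of water molecules lost when the chain is assembled:
    one per peptide bond, i.e. n - 1 for a linear chain of n residues."""
    return max(len(sequence) - 1, 0)


def peptide_mass(sequence):
    """Mass of a linear peptide: sum of the free amino acid masses minus
    one water for every peptide bond formed."""
    total = sum(FREE_AMINO_ACID_MASS[letter] for letter in sequence)
    return total - waters_released(sequence) * WATER_MASS


if __name__ == "__main__":
    print(waters_released("GAS"))        # tripeptide, two peptide bonds
    print(round(peptide_mass("GA"), 3))  # glycyl-alanine dipeptide
```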

The products formed by such linkages are called peptides. As more amino acids join to this growing chain, the resulting chain is known as a polypeptide. Each polypeptide has a free amino group at one end. This end is called the N terminal, or the amino terminal, and the other end has a free carboxyl group, also known as the C or carboxyl terminal. While the terms polypeptide and protein are sometimes used interchangeably, a polypeptide is technically a polymer of amino acids, whereas the term protein is used for a polypeptide or polypeptides that have combined together, often have bound non-peptide prosthetic groups, have a distinct shape, and have a unique function. After protein synthesis (translation), most proteins are modified. These are known as post-translational modifications. They may undergo cleavage, phosphorylation, or may require the addition of other chemical groups. Only after these modifications is the protein completely functional.

The Evolutionary Significance of Cytochrome c

Cytochrome c is an important component of the electron transport chain, a part of cellular respiration, and it is normally found in the cellular organelle, the mitochondrion. This protein has a heme prosthetic group, and the central ion of the heme gets alternately reduced and oxidized during electron transfer. Because this essential protein’s role in producing cellular energy is crucial, it has changed very little over millions of years. Protein sequencing has shown that there is a considerable amount of cytochrome c amino acid sequence homology among different species; in other words, evolutionary kinship can be assessed by measuring the similarities or differences among various species’ DNA or protein sequences.

Scientists have determined that human cytochrome c contains 104 amino acids. For each cytochrome c molecule from different organisms that has been sequenced to date, 37 of these amino acids appear in the same position in all samples of cytochrome c. This indicates that there may have been a common ancestor. On comparing the human and chimpanzee protein sequences, no sequence difference was found. When human and rhesus monkey sequences were compared, the single difference found was in one amino acid. In another comparison, human to yeast sequencing shows a difference in the 44th position.
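The kind of comparison described here amounts to counting the positions at which two aligned sequences differ. A minimal Python sketch follows, assuming the sequences are already aligned and of equal length. The function name and the short example strings are invented for illustration and are not real cytochrome c sequences.

```python
def differing_positions(seq_a, seq_b):
    """Return the 1-based positions at which two pre-aligned,
    equal-length sequences carry different residues."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        position
        for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


if __name__ == "__main__":
    # Toy sequences, made up for this example.
    species_1 = "GDVEKGKK"
    species_2 = "GDVEKGKK"
    species_3 = "GDVAKGKK"
    print(differing_positions(species_1, species_2))
    print(differing_positions(species_1, species_3))
```

Real comparisons across distant species also need an alignment step to handle insertions and deletions, which this sketch deliberately omits.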

Nociceptin Opioid

2.2 Identification of prepronociceptin gene associating cAMP-dependently with phosphorylated CREB

To identify novel genes that are regulated by CREB phosphorylated at Ser133 in Sertoli cells, we performed chromatin immunoprecipitation (ChIP) from Sertoli B cells. After cells were stimulated for 10 min with db-cAMP, extracts were prepared and processed for ChIP with the same antibody to phosphorylated CREB. We screened by PCR several genes whose proximal promoter regions associate with phosphorylated CREB, and investigated the murine prepronociceptin gene (Zaveri, Waleh, & Toll, 2006). The proximal promoter of the murine prepronociceptin gene has one functional CRE site in a different location from the human promoter (Zaveri, Green, Polgar, Huynh, & Toll, 2002; Zaveri et al., 2006). The DNA fragment from the putative transcription start site to the ATG translation start codon (252 bp) was detected only in cells treated with db-cAMP but not in untreated cells. None could be detected from immunoprecipitates with an unrelated antibody. Nucleotide sequencing of the detected DNA fragment confirmed the presence of a consensus CRE sequence (CGTCA) at 30 bp upstream of the ATG translation start codon in the proximal promoter of the murine prepronociceptin gene, as reported (Zaveri et al., 2006). These results indicated that phosphorylated CREB associates with the proximal promoter region of the prepronociceptin gene in Sertoli B cells (Fig. 1). This gene encodes the precursor protein prepronociceptin, from which the mature nociceptin peptide consisting of 17 amino acid residues is produced. Nociceptin, also known as orphanin FQ, is a neuropeptide belonging to the opioid peptide family, and its amino acid sequence is identical between mice and other species.
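Locating a short consensus motif such as CGTCA relative to a start codon is a simple string search. The Python sketch below shows one way to do it. The function name and the toy promoter are hypothetical, the offset is measured from the first base of the motif to the first base of the first ATG (an assumption, since the excerpt does not say how the 30 bp was counted), and only the given strand is searched.

```python
def motif_offsets_upstream(promoter, motif="CGTCA", start_codon="ATG"):
    """Return, for each occurrence of `motif` upstream of the first
    `start_codon`, the distance in bp from motif start to codon start."""
    promoter = promoter.upper()
    codon_index = promoter.find(start_codon)
    if codon_index == -1:
        raise ValueError("start codon not found in sequence")
    offsets = []
    position = promoter.find(motif)
    while position != -1 and position < codon_index:
        offsets.append(codon_index - position)
        position = promoter.find(motif, position + 1)
    return offsets


if __name__ == "__main__":
    # Invented sequence: motif, 25 filler bases, then ATG.
    toy_promoter = "TT" + "CGTCA" + "G" * 25 + "ATG" + "AAA"
    print(motif_offsets_upstream(toy_promoter))
```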

Upgrading protein synthesis for synthetic biology

Genetic code expansion for synthesis of proteins containing noncanonical amino acids is a rapidly growing field in synthetic biology. Creating optimal orthogonal translation systems will require re-engineering central components of the protein synthesis machinery on the basis of a solid mechanistic biochemical understanding of the synthetic process.

The genetic code was thought to be immutable. At the time the genetic code table was defined, its evolution was most readily explained by Crick's frozen accident theory [1]. Central to the theory was the idea that stability of the proteome and the accuracy of protein synthesis were essential for cell viability. These ideas emerged alongside a nascent molecular biology that developed before genome sequencing and proteomics, when much of what we now know of life's microbial and molecular biodiversity was still undiscovered and unknown. Since the elucidation of the genetic code in the 1960s, the most exciting and surprising findings are those related to exceptions to the standard code and the discovery of diversity far greater than was anticipated in the mechanisms of aminoacyl-tRNA formation and protein synthesis.

Section Summary

The breakdown and synthesis of carbohydrates, proteins, and lipids connect with the pathways of glucose catabolism. The carbohydrates that feed into glucose catabolism include galactose, fructose, glycogen (a polymer of glucose rather than a simple sugar), and pentoses. These are converted to intermediates that are catabolized during glycolysis. The amino acids from proteins connect with glucose catabolism through pyruvate, acetyl CoA, and components of the citric acid cycle. Cholesterol synthesis starts with acetyl groups, and the components of triglycerides come from glycerol-3-phosphate from glycolysis and acetyl groups produced in the mitochondria from pyruvate.

Materials and Methods

Dataset selection.

We searched the PDB [58] for phosphorylated protein structures determined by X-ray crystallography (with better than 2.5 Å resolution) that are phosphorylated on well-ordered loop structures that were less than 15 residues in length. We exclude structures in which phosphorylation causes a large, global rearrangement of the protein structure, such as a hinge-bending movement or domain rearrangement, as is the case with glycogen phosphorylase [59] and insulin receptor tyrosine kinase [60,61]. In order to test the limitations of our method, we include one test case, ERK2, in which phosphorylation causes a domain rearrangement [3]. We also include a prokaryotic response regulator, FixJ, in which phosphorylation of an Asp induces a significant conformational change in the orientation of a helix [57,62]; this case is successfully treated with an extension of our methods described below. The test set is listed in Table 3.

We determined the loop length that we would predict for each phosphorylated structure using visual inspection. In the cases in which crystallographic structures are available for both the unphosphorylated and phosphorylated protein, these structures were superimposed and the residues to be predicted were defined as the portion of the loop that deviated in the superposition. For the reconstruction of phosphorylated structures without knowledge of the unphosphorylated form, the loop residues to be optimized were determined using a combination of visual inspection of secondary structure and crystallographic B-factors.

Molecular mechanics energy function.

All energy calculations use the OPLS-AA force field [37,63,64] and the Surface Generalized Born (SGB) model of solvation [49,50]. The molecular mechanics energy function represents electrostatics by a relatively simple model of fixed atomic partial charges interacting through the Coulomb approximation. The solvent model captures key effects of desolvation with relatively modest computational expense. Despite the simplicity of the energy function (i.e., it neglects polarizability contributions to electrostatics, and implicit solvent models have well-known limitations), it performs well in predicting conformations of phosphorylated loops (see Results).
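The electrostatic term described above, fixed partial charges interacting through Coulomb's law, can be sketched in a few lines of Python. This is only an illustration of the functional form, not the OPLS-AA or Impact implementation: it omits the bonded-pair exclusions and scaling that real force fields apply, and it ignores the SGB solvation term entirely. The function name, the `dielectric` parameter, and the two-charge example are assumptions made for this sketch.

```python
import math

# Conversion factor so that charges in units of e and distances in Å
# give energies in kcal/mol.
COULOMB_CONSTANT = 332.0637


def coulomb_energy(atoms, dielectric=1.0):
    """Sum the pairwise Coulomb energy over all distinct atom pairs.

    `atoms` is a list of (partial_charge, (x, y, z)) tuples, with
    coordinates in Å. No pairs are excluded or scaled in this sketch.
    """
    energy = 0.0
    for i in range(len(atoms)):
        charge_i, position_i = atoms[i]
        for j in range(i + 1, len(atoms)):
            charge_j, position_j = atoms[j]
            distance = math.dist(position_i, position_j)
            energy += COULOMB_CONSTANT * charge_i * charge_j / (dielectric * distance)
    return energy


if __name__ == "__main__":
    # Two opposite unit charges 1 Å apart (illustrative values only).
    pair = [(+1.0, (0.0, 0.0, 0.0)), (-1.0, (1.0, 0.0, 0.0))]
    print(coulomb_energy(pair))
```

Polarizability, which the text notes is neglected, would require the charges themselves to respond to their environment; in this form they stay fixed.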

The force field parameters for the phosphorylated amino acids were generated by an automated atom-typing algorithm provided in the Impact software package. The atomic partial charges for the phosphorylated amino acid side chains were adjusted slightly from the default values by performing quantum chemistry calculations. The partial charges for phosphoserine (pSer) and phosphothreonine (pThr) were taken from previous work by Wong et al. [65], whereas charges for phosphotyrosine (pTyr) and phosphoaspartate (pAsp) were determined in their −2 and −1 charge states by performing quantum mechanical calculations with the software program Jaguar [66]. Methyl-benzyl-phosphate was used to represent the pTyr side chain, and acetyl phosphate was used for pAsp. Geometry optimization of the phosphate ion was carried out at the HF/6-31G** level, incorporating a condensed-phase environment via a self-consistent reaction field (SCRF) algorithm [67,68]. Single point calculations were performed at the LMP2/cc-pVTZ(-f) level, also with SCRF treatment of solvation. Electrostatic potential fitting was used to determine the partial charges. The atomic partial charges for all four phosphorylated amino acids are provided in Tables S1–S4.

Loop prediction methodology.

This study uses the method of Jacobson et al. [40] for predicting loop conformations. In brief, the loop prediction methodology uses an ab initio dihedral sampling scheme to enumerate conformations of the loop backbone that are free from steric clashes. Other methods have employed similar dihedral angle sampling schemes, including ICM [69], CONGEN [70], and the work of DePristo, et al [71]. Unlike Monte Carlo and Molecular Dynamics sampling schemes, the algorithm itself has no knowledge of the starting conformation of the loop, and therefore, does not start predictions based on a starting structure of the loop. These closed backbone conformations are clustered, and a single member of each cluster is then selected for side chain addition and optimization, followed by complete energy minimization. The lowest energy structure is selected as the output of the loop prediction algorithm. The algorithm also permits explicit treatment of crystal packing. We have used this capability in this work (in the “loop reconstruction” cases, as described below), but did not identify any clear crystal-packing artifacts relevant to the conformations of the phosphorylated loops.

The Jacobson et al. paper [40] describes a hierarchical refinement procedure in which multiple iterations of the loop prediction algorithm are used to reduce errors caused by insufficient sampling. The parameters in this scheme have been slightly modified in this study due to the inclusion of some rather long loops (up to 15 residues) in our test set. The first stage allows for unrestrained sampling. After this stage is complete, the top ten lowest energy structures are passed to a first refinement stage in which more extensive sampling is performed around these low-energy basins. Specifically, loop conformations in this stage are only retained if all of the Cα atoms in the loop are within 10 Å of the starting loop structure (which is one of the ten lowest-energy structures from the initial stage). The five resulting lowest-energy structures from this stage are subjected to a second round of refinement, in which the maximum Cα deviation is restricted to 5 Å. Finally, the five lowest-energy structures from this stage are subjected to a third and final round of refinement, in which the maximum Cα deviation is restricted to 2.5 Å. In all, this procedure provides a rank-ordered list consisting of about 250 loops and their associated energies in the context of the full protein. The final prediction is the loop conformation with the lowest energy.
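The retention rule in the refinement stages (keep a conformation only if every Cα atom lies within a cutoff of the starting loop) can be sketched as a simple filter. The Python below is an illustrative reading of that rule and of the stage schedule quoted above, not the authors' code. The names `REFINEMENT_STAGES`, `within_deviation`, and `filter_candidates`, and the toy coordinates, are all hypothetical, and the sampling and energy evaluation steps are left out.

```python
import math

# (number of low-energy structures carried into the stage,
#  maximum allowed C-alpha deviation from the starting loop, in Å)
REFINEMENT_STAGES = [
    (10, 10.0),
    (5, 5.0),
    (5, 2.5),
]


def within_deviation(candidate, reference, cutoff):
    """True if every C-alpha in `candidate` lies within `cutoff` Å of the
    corresponding C-alpha in `reference` (lists of (x, y, z) tuples)."""
    return all(
        math.dist(atom, ref_atom) <= cutoff
        for atom, ref_atom in zip(candidate, reference)
    )


def filter_candidates(candidates, reference, cutoff):
    """Keep only the candidate loops that satisfy the deviation restraint."""
    return [loop for loop in candidates if within_deviation(loop, reference, cutoff)]


if __name__ == "__main__":
    # Two-residue toy loops; coordinates are invented.
    start = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
    near = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0)]
    far = [(0.0, 0.0, 0.0), (3.8, 6.0, 0.0)]
    for carried, cutoff in REFINEMENT_STAGES:
        kept = filter_candidates([near, far], start, cutoff)
        print(carried, cutoff, len(kept))
```

Tightening the cutoff from stage to stage is what concentrates sampling around the low-energy basins found earlier.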

Most published tests of loop prediction methods, including the Jacobson et al. paper [40], evaluate their success by the ability to reconstruct loops in a native protein structure. In these tests, all portions of the protein other than the loop in question remain in the native conformation during the simulation. Success of a prediction methodology in such a test is an important prerequisite for more realistic applications. We perform loop reconstruction tests in this work to assess the ability of the molecular mechanics energy function to identify correct conformations of phosphorylated loops, and to ensure that the sampling methods are sufficient to generate near-native conformations. However, predicting conformational changes induced by phosphorylation, i.e., by phosphorylating a protein in silico, is qualitatively more challenging. In the cases we consider, the sites of phosphorylation are located on loops, and most of the conformational change is localized to that loop. However, there is always some degree of conformational rearrangement in the vicinity of the loop, especially in the conformations of side chains contacting it. Similarly, predicting the conformation of a loop in a homology model is more challenging than reconstructing a loop in a native protein structure because the environment surrounding any given loop in a homology model contains errors that can affect the loop prediction accuracy. We address this issue by performing rotamer optimization and minimization of side chains in the immediate vicinity of the loop concurrently with the optimization of the side chains on the loop itself. In the test set presented here, this strategy performs well in predicting local structural changes, despite the fact that there are also some changes in backbone conformation in the surroundings. We speculate that small changes in the conformations of the surrounding side chains can compensate for not explicitly allowing backbone relaxation. 
We also use this strategy in the control studies of reconstructing phosphorylated loops in which, interestingly, it sometimes improves the accuracy.

Helix prediction methodology.

The helix prediction algorithm used in this paper is based on the work of Li et al [42]. Briefly, the helix backbone is treated as a rigid body and sampled in six degrees of freedom (three translations and three rotations), and the two flanking loops are sampled using the loop prediction algorithm described above. Again, the method broadly samples the possible configurations of the helix and surrounding loops, independent of any starting configuration, and then hierarchically samples more finely around low-energy basins. As with the loop prediction algorithm, side chains on the loop-helix-loop region, and the surroundings if desired, are sampled using a rotamer-based optimization algorithm.

Examples of Nucleic Acids

The most common nucleic acids in nature are DNA and RNA. These molecules form the foundation for the majority of life on Earth, and they store the information necessary to create proteins, which in turn complete the functions necessary for cells to survive and reproduce. However, DNA and RNA are not the only nucleic acids; artificial nucleic acids have also been created. These molecules function in much the same way as natural nucleic acids. In fact, scientists are using these molecules to build the basis of an “artificial life form”, which could maintain the artificial nucleic acid and extract information from it to build new proteins and survive.

Generally speaking, nucleic acids themselves differ in every organism based on the sequence of nucleotides within the nucleic acid. This sequence is “read” by cellular machinery to connect amino acids in the correct sequence, building complex protein molecules with specific functions.

Additional data files

The following additional data are available with the online version of the paper. Additional data file 1 is a figure showing the accessibilities of phosphorylation sites as calculated by SABLE. Additional data file 2 is a figure showing Protein Data Bank structures of phosphoproteins. Additional data file 3 is a table listing phosphorylation sites located in parts of phosphoproteins that are too flexible for structure determination. Additional data file 4 is a figure that illustrates the conservation of the region surrounding the phosphosite (-20 to +20 amino acids). Additional data file 5 is a table listing the optimal parameters for the SVM prediction. Additional data file 6 is a table listing the prediction accuracies of the SVM approach.
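Extracting a fixed window around a phosphosite, such as the -20 to +20 region mentioned for Additional data file 4, is straightforward. Here is a minimal Python sketch of one way to do it. The function name, the padding character used when the window runs past either end of the protein, and the example strings are assumptions for illustration rather than details from the paper.

```python
def site_window(sequence, site, flank=20, pad="-"):
    """Return the residues from site - flank to site + flank (inclusive).

    `site` is a 1-based position. Positions that fall outside the
    sequence are filled with `pad`, so the result always has
    2 * flank + 1 characters.
    """
    if not 1 <= site <= len(sequence):
        raise ValueError("site is outside the sequence")
    index = site - 1
    left = sequence[max(0, index - flank):index]
    right = sequence[index + 1:index + 1 + flank]
    left_padding = pad * (flank - len(left))
    right_padding = pad * (flank - len(right))
    return left_padding + left + sequence[index] + right + right_padding


if __name__ == "__main__":
    # Invented sequence with the site of interest at position 4.
    print(site_window("ABCSDEF", 4, flank=2))
    print(site_window("SAB", 1, flank=2))
```

Fixed-length windows like this make it easy to stack many sites into a single alignment for conservation analysis.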

Other functions

Amino acids are precursors of a variety of complex nitrogen-containing molecules. Prominent among these are the nitrogenous base components of nucleotides and the nucleic acids (DNA and RNA). Furthermore, there are complex amino-acid derived cofactors such as heme and chlorophyll. Heme is the iron-containing organic group required for the biological activity of vitally important proteins such as the oxygen-carrying hemoglobin and the electron-transporting cytochrome c. Chlorophyll is a pigment required for photosynthesis.

Several α-amino acids (or their derivatives) act as chemical messengers. For example, γ-aminobutyric acid (gamma-aminobutyric acid, or GABA; a derivative of glutamic acid), serotonin and melatonin (derivatives of tryptophan), and histamine (synthesized from histidine) are neurotransmitters. Thyroxine (a tyrosine derivative produced in the thyroid gland of animals) and indole acetic acid (a tryptophan derivative found in plants) are two examples of hormones.

Several standard and nonstandard amino acids often are vital metabolic intermediates. Important examples of this are the amino acids arginine, citrulline, and ornithine, which are all components of the urea cycle. The synthesis of urea is the principal mechanism for the removal of nitrogenous waste.