Algorithmic Arts

The 20 amino acids, their single-letter database codes (SLC), and their corresponding DNA codons


Amino Acid       SLC    DNA codons

Isoleucine       I      ATT, ATC, ATA
Leucine          L      CTT, CTC, CTA, CTG, TTA, TTG
Valine           V      GTT, GTC, GTA, GTG
Phenylalanine    F      TTT, TTC
Methionine       M      ATG
Cysteine         C      TGT, TGC
Alanine          A      GCT, GCC, GCA, GCG
Glycine          G      GGT, GGC, GGA, GGG
Proline          P      CCT, CCC, CCA, CCG
Threonine        T      ACT, ACC, ACA, ACG
Serine           S      TCT, TCC, TCA, TCG, AGT, AGC
Tyrosine         Y      TAT, TAC
Tryptophan       W      TGG
Glutamine        Q      CAA, CAG
Asparagine       N      AAT, AAC
Histidine        H      CAT, CAC
Glutamic acid    E      GAA, GAG
Aspartic acid    D      GAT, GAC
Lysine           K      AAA, AAG
Arginine         R      CGT, CGC, CGA, CGG, AGA, AGG
Stop codons      Stop   TAA, TAG, TGA

This table lists the twenty amino acids found in proteins, along with the single-letter code used to represent each of them in protein databases and the DNA codons that encode it. All 64 possible 3-letter combinations of the DNA coding units T, C, A and G are used: 61 encode one of these amino acids, and the remaining three are stop codons that signal the end of a sequence. Because each codon maps to exactly one amino acid (or to stop), DNA can be decoded unambiguously. The reverse is not true: most amino acids have multiple codons, so many different DNA sequences can represent the same protein sequence, and the original DNA sequence cannot be recovered from the protein sequence alone.
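
The one-way nature of the code can be illustrated with a short Python sketch. It is only a minimal illustration built from the table above, not a reference implementation: the names CODONS_BY_AMINO_ACID, CODON_TO_AMINO_ACID, translate and count_encodings are hypothetical, "*" is an assumed marker for a stop codon, and math.prod assumes Python 3.8 or later.

```python
from math import prod

# Codons transcribed from the table above, keyed by single-letter code.
# "*" is an assumed marker for the three stop codons.
CODONS_BY_AMINO_ACID = {
    "I": ["ATT", "ATC", "ATA"],
    "L": ["CTT", "CTC", "CTA", "CTG", "TTA", "TTG"],
    "V": ["GTT", "GTC", "GTA", "GTG"],
    "F": ["TTT", "TTC"],
    "M": ["ATG"],
    "C": ["TGT", "TGC"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "P": ["CCT", "CCC", "CCA", "CCG"],
    "T": ["ACT", "ACC", "ACA", "ACG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "Y": ["TAT", "TAC"],
    "W": ["TGG"],
    "Q": ["CAA", "CAG"],
    "N": ["AAT", "AAC"],
    "H": ["CAT", "CAC"],
    "E": ["GAA", "GAG"],
    "D": ["GAT", "GAC"],
    "K": ["AAA", "AAG"],
    "R": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "*": ["TAA", "TAG", "TGA"],
}

# Invert the table: every codon maps to exactly one amino acid (or stop),
# which is why decoding DNA is unambiguous.
CODON_TO_AMINO_ACID = {
    codon: amino_acid
    for amino_acid, codons in CODONS_BY_AMINO_ACID.items()
    for codon in codons
}


def translate(dna):
    """Decode a DNA coding sequence codon by codon, stopping at the first stop codon.

    Trailing bases that do not fill a complete codon are ignored; a codon
    containing letters other than T, C, A, G raises KeyError.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TO_AMINO_ACID[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def count_encodings(protein):
    """Count the distinct DNA sequences that encode the given protein sequence.

    The choice of codon at each position is independent, so the total is the
    product of the number of codons available for each amino acid.
    """
    return prod(len(CODONS_BY_AMINO_ACID[amino_acid]) for amino_acid in protein)
```

For example, translate("ATGTGGTAA") returns "MW", and that is the only possible reading of those nine letters. Going the other way, count_encodings("ML") returns 6, because methionine has one codon and leucine has six; the count grows multiplicatively with every additional amino acid that has more than one codon.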