The Central Dogma of Biology: How DNA Becomes Protein

Every cell in your body is running the same basic program. Whether it is a brain cell firing electrical signals or a muscle cell contracting your bicep, the underlying instructions come from the same source: your DNA. But DNA does not do the work itself. It needs messengers and builders. The process by which DNA's instructions become working proteins is called the Central Dogma of molecular biology, and understanding it is the single most important step toward grasping how gene editing, genetic disease, and modern medicine actually work.

Think of it this way. Your DNA is a master recipe book locked inside a vault. You never take the book out of the vault — it is far too valuable to risk damaging. Instead, you copy individual recipes onto index cards and carry those cards to the kitchen, where a chef reads them and assembles the finished dish. In molecular biology, the vault is the cell nucleus, the index cards are messenger RNA (mRNA), the kitchen is the ribosome, and the finished dishes are proteins.

That one-directional flow — DNA to RNA to protein — is the Central Dogma.

The Central Dogma of molecular biology. Genetic information flows from DNA to RNA (transcription) and from RNA to protein (translation). Image: Wikimedia Commons, public domain.

Where the Idea Came From

In 1958, British molecular biologist Francis Crick proposed what he called the "Central Dogma." Crick, who had co-discovered the double-helix structure of DNA with James Watson just five years earlier, wanted to describe the fundamental rule governing how genetic information moves inside cells [1].

Crick's original formulation was precise: once information passes from nucleic acids (DNA or RNA) into protein, it cannot flow back out. In other words, proteins cannot be used as templates to create RNA or DNA. Information flows in one direction. This was a bold claim in 1958, when scientists were still working out the basic mechanics of gene expression. But decades of research have confirmed Crick's core insight, with a few fascinating exceptions we will get to later.

It is worth noting that the word "dogma" can be misleading. In everyday language, a dogma is an unquestionable belief. Crick later admitted he did not fully appreciate the weight of the word when he chose it — he simply meant it as a hypothesis that had strong support [2]. In science, even the most fundamental ideas remain open to revision when new evidence appears.

Step One: DNA Replication — Copying the Master Book

Before we get to the main flow of information from DNA to protein, there is a preliminary step that makes everything else possible: DNA replication. Every time a cell divides, it must first make a complete copy of its entire DNA so that both daughter cells get a full set of instructions.

DNA replication is astonishingly accurate. The enzyme DNA polymerase copies roughly three billion base pairs of human DNA, and once proofreading and mismatch repair have caught most mistakes, the overall error rate is about one per billion bases [3]. To put that in perspective, imagine typing out every letter in a library of one thousand books and making only one typo across the entire collection.
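As a back-of-the-envelope check on those figures, the short Python sketch below divides genome size by the number of bases copied per mistake. The constant and function names are illustrative, and the inputs are just the approximate values quoted above, not measured data.

```python
# Rough estimate only: both figures are the approximate values quoted in the text.
GENOME_SIZE_BP = 3_000_000_000     # about three billion base pairs
BASES_PER_ERROR = 1_000_000_000    # about one mistake per billion bases copied


def expected_errors(genome_size_bp: int, bases_per_error: int) -> float:
    """Expected number of copying mistakes in one pass over the genome."""
    return genome_size_bp / bases_per_error


print(expected_errors(GENOME_SIZE_BP, BASES_PER_ERROR))  # 3.0
```

At these rates, a cell would be expected to introduce only a handful of new errors each time it copies its DNA.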

Replication is essential because it ensures continuity — every new cell inherits the same genetic blueprint. But the real action, in terms of turning genes into something the cell can actually use, happens in the next two steps.

Step Two: Transcription — Copying the Recipe Card

Transcription is the process of copying a specific section of DNA (a gene) into a molecule of messenger RNA (mRNA). This is where the cell selects which recipes it needs right now and makes portable copies of them.

Here is how it works:

The gene is located. Proteins called transcription factors bind to a region of DNA near the gene, marking it as ready to be read.
RNA polymerase binds. This large enzyme attaches to the DNA at the gene's starting point (the promoter region) and begins to unwind the double helix.
The mRNA strand is built. RNA polymerase reads one strand of the DNA and assembles a complementary strand of mRNA, one nucleotide at a time. Where DNA has adenine (A), the mRNA gets uracil (U). Where DNA has cytosine (C), the mRNA gets guanine (G), and vice versa. Where DNA has thymine (T), the mRNA gets adenine (A).
The mRNA is processed. In eukaryotic cells (cells with a nucleus, which includes all human cells), the raw mRNA transcript gets edited before it leaves the nucleus. Sections called introns are cut out, and the remaining sections (exons) are spliced together. A protective cap is added to one end and a poly-A tail to the other.
The mRNA exits the nucleus. The finished mRNA molecule travels through pores in the nuclear membrane and enters the cytoplasm, where it will be read by a ribosome.
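The base-pairing rule at the heart of these steps can be sketched in a few lines of Python. This is a simplified illustration, not a model of the real enzyme: the function name and the example sequence are made up for this sketch, and it assumes the template strand is written in the order RNA polymerase reads it. Promoters, splicing, capping, and the poly-A tail are all left out.

```python
# Template-strand DNA base -> complementary mRNA base (the pairing rules described above).
DNA_TO_MRNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_strand: str) -> str:
    """Build the mRNA complementary to a DNA template strand.

    Assumes the template is written in the order RNA polymerase reads it,
    so the result comes out in the order the mRNA is built.
    """
    return "".join(DNA_TO_MRNA[base] for base in template_strand.upper())


# Hypothetical nine-base template, chosen only for illustration.
print(transcribe("TACGGACTC"))  # AUGCCUGAG
```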

Transcription: RNA polymerase reads the DNA template strand and builds a complementary mRNA molecule. The mRNA then carries the gene's instructions out of the nucleus. Image: Wikimedia Commons, public domain.

Returning to our cooking analogy: transcription is when you open the master recipe book to the page for chocolate cake, copy the recipe onto an index card, and carry it to the kitchen. The master book stays safely in the vault, and the kitchen gets only the instructions it needs.

This selectivity is crucial. A liver cell and a skin cell contain identical DNA, but they transcribe very different sets of genes. That is why they look and behave differently despite sharing the same genome. Transcription is where cells make choices about which parts of the genetic blueprint to actually use.

Step Three: Translation — Cooking the Dish

Translation is the step where the information encoded in mRNA is finally used to build a protein. This process takes place on ribosomes — molecular machines found in the cytoplasm of the cell.

Translation: a ribosome reads the mRNA sequence three letters at a time (codons) and assembles a corresponding chain of amino acids, which folds into a functional protein. Image: Wikimedia Commons, public domain.

Here is the step-by-step process:

The ribosome attaches to the mRNA. The ribosome clamps onto the mRNA at a start signal (the start codon, AUG).
Transfer RNA (tRNA) delivers amino acids. Each tRNA molecule carries a specific amino acid and has an anticodon: a three-letter sequence that is complementary to a codon on the mRNA. When the anticodon on a tRNA pairs with its codon on the mRNA, the tRNA delivers its amino acid to the ribosome.
The protein chain grows. The ribosome moves along the mRNA, reading one codon at a time. Each new amino acid is linked to the growing chain by a peptide bond.
The ribosome hits a stop codon. When the ribosome reaches one of three stop codons (UAA, UAG, or UGA), translation ends. The finished protein is released.
The protein folds. The chain of amino acids folds into a precise three-dimensional shape, often with help from other proteins called chaperones. This final shape determines what the protein can do.
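The reading logic in these steps (find the start codon, read three letters at a time, stop at a stop codon) can be sketched in Python. This is a minimal illustration under stated assumptions: the function name and the example mRNA are hypothetical, amino acids are written as standard one-letter codes, and the codon table is built from a compact 64-character encoding of the standard genetic code. Protein folding is not modeled.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first stop codon (one-letter codes)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so translation never begins
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: release the finished chain
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical mRNA: two leading bases, then AUG CCU GAG UAA, then two trailing bases.
print(translate("GGAUGCCUGAGUAAGC"))  # MPE
```

Here M, P, and E stand for methionine, proline, and glutamic acid.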

In the cooking analogy, this is the chef in the kitchen reading the recipe card step by step, combining ingredients (amino acids), and producing the finished dish (a functional protein). The ribosome is the kitchen, the tRNA molecules are the assistants fetching ingredients from the pantry, and the completed protein is your chocolate cake.

The Genetic Code: Three Letters at a Time

The language connecting mRNA to protein is called the genetic code, and it operates on a beautifully simple principle: every three consecutive nucleotide bases in mRNA (called a codon) specify one amino acid.

Since mRNA is built from four possible bases (A, U, G, and C), and codons are three bases long, there are 4 x 4 x 4 = 64 possible codons. But there are only 20 standard amino acids used to build proteins. This means the code has built-in redundancy — most amino acids are specified by more than one codon. For example, the amino acid leucine can be coded by six different codons (UUA, UUG, CUU, CUC, CUA, CUG) [4].
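The counting argument above can be checked directly. The sketch below rebuilds the standard codon table from a compact 64-character encoding (one-letter amino acid codes, with "*" for stop) and groups codons by the amino acid they specify. The variable names are illustrative.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Group codons by the amino acid (or stop signal) they encode.
codons_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    codons_for[amino_acid].append(codon)

print(len(CODON_TABLE))          # 64
print(len(codons_for) - 1)       # 20 (amino acids, not counting the stop signal)
print(sorted(codons_for["L"]))   # ['CUA', 'CUC', 'CUG', 'CUU', 'UUA', 'UUG']
```

The last line lists the six leucine codons mentioned above; "L" is leucine's one-letter code.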

This redundancy is not a flaw. It is a buffer. Some mutations in the DNA — and therefore in the mRNA — will change the third letter of a codon without changing the amino acid it codes for. These are called silent mutations, and they offer a degree of protection against the harmful effects of random errors.
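To see the buffering effect concretely, the sketch below compares the amino acid before and after a single-codon change. The function name and example codons are illustrative. The labels "missense" (a different amino acid) and "nonsense" (a premature stop) are standard genetics terms that this article does not otherwise use.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify(original: str, mutated: str) -> str:
    """Label a single-codon change as silent, missense, or nonsense."""
    before, after = CODON_TABLE[original], CODON_TABLE[mutated]
    if before == after:
        return "silent"      # same amino acid: the protein is unchanged
    if after == "*":
        return "nonsense"    # new stop codon: the protein is cut short
    return "missense"        # a different amino acid is inserted


print(classify("CUU", "CUC"))  # silent
print(classify("GAU", "GAA"))  # missense
print(classify("UGG", "UGA"))  # nonsense
```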

One codon, AUG, serves double duty: it codes for the amino acid methionine and also acts as the start codon, signaling the ribosome to begin translation. Three codons (UAA, UAG, UGA) do not code for any amino acid; instead, they serve as stop codons, signaling the ribosome to release the finished protein.

The genetic code is nearly universal across all life on Earth. The same codons specify the same amino acids in bacteria, plants, fungi, and animals. This universality is powerful evidence that all living organisms share a common ancestor [5].

The standard genetic code. Each three-letter codon on the mRNA corresponds to one of 20 amino acids (or a stop signal). Notice the redundancy: most amino acids are encoded by multiple codons. Image: Wikimedia Commons, CC BY-SA 3.0.

When the Dogma Breaks: Exceptions and Surprises

Crick's Central Dogma has held up remarkably well, but biology loves to find workarounds. Several important exceptions have been discovered since 1958.

Reverse Transcriptase: RNA to DNA

In 1970, Howard Temin and David Baltimore independently discovered an enzyme called reverse transcriptase that does something the simple one-way picture of DNA to RNA to protein had no room for: it copies RNA back into DNA [6]. This enzyme is used by retroviruses, the most famous of which is HIV. When HIV infects a cell, it uses reverse transcriptase to convert its RNA genome into DNA, which then integrates into the host cell's chromosomes. This discovery earned Temin and Baltimore the Nobel Prize in Physiology or Medicine in 1975.

Reverse transcriptase turned out to be far more widespread than anyone initially expected. It is also found in retrotransposons — "jumping gene" elements that make up a surprisingly large fraction of the human genome.

Prions: Protein That Copies Itself

Prions represent perhaps the most radical exception to the Central Dogma. These misfolded proteins can cause normal versions of the same protein to adopt the misfolded shape, effectively allowing protein to propagate information without any involvement of DNA or RNA [7]. Prion diseases include mad cow disease (BSE), Creutzfeldt-Jakob disease in humans, and scrapie in sheep. The concept of infectious proteins was so controversial when Stanley Prusiner proposed it in 1982 that many scientists dismissed it outright. He received the Nobel Prize in 1997 when the evidence became overwhelming.

Epigenetics: Same DNA, Different Reading

Epigenetics does not violate the Central Dogma in the strict sense, but it adds an important layer of complexity. Chemical modifications to DNA (such as methylation) and to the histone proteins that package DNA can change which genes are transcribed without altering the DNA sequence itself. These modifications can even be inherited across cell divisions and, in some cases, across generations. Epigenetics explains how identical twins with the same DNA can develop different traits and disease risks over their lifetimes.

RNA Editing

After mRNA is transcribed, enzymes can chemically alter individual bases in the mRNA sequence, effectively changing the protein that will be produced without changing the underlying DNA. This process, called RNA editing, adds yet another layer of information flow that Crick could not have anticipated.

Why Mutations Matter: The Case of Sickle Cell Disease

Understanding the Central Dogma makes it immediately clear why even tiny changes in DNA can have devastating consequences. Consider sickle cell disease, one of the most common genetic disorders worldwide.

Sickle cell disease is caused by a single nucleotide change in the gene for beta-globin, a component of hemoglobin. At position six of the beta-globin protein, the DNA codon GAG (which codes for glutamic acid) is changed to GTG (GUG in the mRNA, which codes for valine). One amino acid out of 146 in the mature protein is different [8].

That one substitution causes the hemoglobin molecules to stick together when oxygen levels are low, deforming red blood cells into a rigid, sickle-like shape. These misshapen cells clog blood vessels, causing excruciating pain, organ damage, and shortened life expectancy. A single letter change in DNA, flowing through the Central Dogma — DNA to mRNA to protein — produces a disease that affects millions of people worldwide.
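That single-letter change can be followed through the pathway in a few lines of Python. This sketch covers only the one affected codon, not the full beta-globin gene. The function name is illustrative, and it assumes the codons are given as they appear on the coding strand, so the mRNA has the same letters with U in place of T.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
NAMES = {"E": "glutamic acid", "V": "valine"}  # only the two amino acids needed here


def express(dna_codon: str) -> tuple[str, str]:
    """Follow one coding-strand DNA codon through transcription and translation."""
    mrna_codon = dna_codon.replace("T", "U")  # mRNA matches the coding strand, with U for T
    return mrna_codon, CODON_TABLE[mrna_codon]


for label, dna_codon in [("typical", "GAG"), ("sickle", "GTG")]:
    mrna_codon, amino_acid = express(dna_codon)
    print(f"{label}: DNA {dna_codon} -> mRNA {mrna_codon} -> {NAMES[amino_acid]}")
# typical: DNA GAG -> mRNA GAG -> glutamic acid
# sickle: DNA GTG -> mRNA GUG -> valine
```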

This example illustrates a critical point: the Central Dogma is not just an abstract concept. It is the mechanism by which genetic information becomes biological reality. Every inherited disease, every physical trait, every protein your body produces follows this pathway.

Why Gene Editors Care About the Central Dogma

If you are reading this on a site about gene editing, you might be wondering: why does any of this matter for technologies like CRISPR?

The answer is that every therapeutic strategy, gene editing included, targets a specific step in the Central Dogma.

CRISPR-Cas9 targets DNA. The most widely known gene editing tool works by cutting the DNA itself, at the very top of the information flow. By changing the master recipe book, you permanently alter every mRNA copy that will ever be made from that gene and, consequently, every protein produced. This is why CRISPR-based therapies have the potential to be one-time cures. To learn more about the molecular target of CRISPR, see our guide to What Is DNA? The Blueprint of Life.

RNA editing and RNA interference target mRNA. Some newer therapeutic approaches do not touch the DNA at all. Instead, they modify or destroy the mRNA — the index card — before it reaches the ribosome. This can silence a harmful gene without permanently altering the genome. The trade-off is that mRNA is temporary, so these therapies typically require repeated dosing.

Protein-targeted therapies work at the end of the pipeline. Traditional drugs — from aspirin to monoclonal antibodies — often work by binding to proteins and changing their activity. These therapies do not alter the flow of genetic information at all; they intervene at the final product.

Understanding where in the Central Dogma a therapy intervenes helps you understand its strengths and limitations. DNA-level editing is permanent but carries risks of off-target cuts. RNA-level approaches are reversible but temporary. Protein-level drugs are well-understood but must be taken continuously.

For a deeper dive into genes themselves and how they encode information, read our article on What Is a Gene? A Beginner's Guide.

The structure of a eukaryotic gene, showing how DNA is transcribed into pre-mRNA, processed, and ultimately translated into protein. Therapies can intervene at the DNA level (CRISPR), the RNA level (antisense oligonucleotides, siRNA), or the protein level (traditional drugs). Image: Wikimedia Commons, CC BY-SA 3.0.

The Big Picture

The Central Dogma of molecular biology is one of the most important ideas in all of science. It tells us that genetic information flows from DNA to RNA to protein — and that this one-directional flow is the basis for how every gene in every living organism is expressed.

Here is the full picture in a single paragraph: DNA replication copies the master blueprint so it can be passed to new cells. Transcription copies individual genes from DNA into mRNA. Translation reads the mRNA and assembles amino acids into proteins. The genetic code — 64 codons mapping to 20 amino acids — is the Rosetta Stone that connects the language of nucleic acids to the language of proteins. Exceptions like reverse transcriptase, prions, and epigenetics add complexity but do not overthrow the fundamental rule.

For anyone interested in gene editing, genetic medicine, or simply understanding how life works at the molecular level, the Central Dogma is where it all begins. Master this concept, and every other topic in molecular biology — from CRISPR to cancer genomics to synthetic biology — will make dramatically more sense.

Key Takeaways

The Central Dogma describes the flow of genetic information: DNA is transcribed into RNA, which is translated into protein.
Francis Crick proposed this framework in 1958; it remains the foundation of molecular biology.
Transcription copies a gene from DNA to mRNA inside the nucleus. Translation reads that mRNA on a ribosome to build a protein.
The genetic code uses three-letter codons (64 total) to specify 20 amino acids, with built-in redundancy that buffers against some mutations.
Exceptions include reverse transcriptase (RNA to DNA, as in HIV), prions (protein-to-protein information transfer), and epigenetic modifications.
Therapies target different stages of the Central Dogma: CRISPR edits DNA, RNA therapies target mRNA, and traditional drugs act on proteins.
Even a single DNA change — as in sickle cell disease — can cascade through the entire pathway, altering the mRNA and producing a defective protein with severe health consequences.

Sources

[1] Crick, F. (1958). "On protein synthesis." Symposia of the Society for Experimental Biology, 12, 138-163.
[2] Crick, F. (1970). "Central Dogma of Molecular Biology." Nature, 227, 561-563. doi:10.1038/227561a0
[3] Kunkel, T.A. (2004). "DNA replication fidelity." Journal of Biological Chemistry, 279(17), 16895-16898. doi:10.1074/jbc.R400006200
[4] Alberts, B. et al. (2022). Molecular Biology of the Cell, 7th Edition. W.W. Norton & Company.
[5] Knight, R.D., Freeland, S.J., & Landweber, L.F. (2001). "Rewiring the keyboard: evolvability of the genetic code." Nature Reviews Genetics, 2, 49-58. doi:10.1038/35047500
[6] Baltimore, D. (1970). "RNA-dependent DNA polymerase in virions of RNA tumour viruses." Nature, 226, 1209-1211. doi:10.1038/2261209a0
[7] Prusiner, S.B. (1998). "Prions." Proceedings of the National Academy of Sciences, 95(23), 13363-13383. doi:10.1073/pnas.95.23.13363
[8] Ingram, V.M. (1957). "Gene mutations in human haemoglobin: the chemical difference between normal and sickle cell haemoglobin." Nature, 180, 326-328. doi:10.1038/180326a0