What is DNA? The molecule, the code, and what it doesn't determine

DNA is a four-letter molecular code that stores the information needed to build and run a cell. Understanding what it does — and the equally important things it does not do — is the foundation of modern biology.

Dr. Mira Brandt

Computational Biologist, EMBL Affiliate

DNA (deoxyribonucleic acid) is a four-letter molecular code that stores the information needed to build and run a cell. It is the carrier of heredity and the substrate on which evolution operates. Its role is also routinely overstated in popular discussion, in ways that the actual biology does not support.

This piece walks through what DNA is, what it does, and the equally important set of things it does not do.

The molecule

DNA is a polymer assembled from four kinds of subunit, the nucleotides, identified by the bases they carry: adenine (A), thymine (T), guanine (G), and cytosine (C). Each nucleotide attaches to the next through a sugar-phosphate backbone, producing a long chain. Two such chains pair together — A with T, G with C — and twist into the famous double helix.

The double-stranded structure is the source of DNA's most useful property: each strand contains the information needed to reconstruct the other. This is why DNA can be copied with high fidelity, and why damage to one strand is in principle correctable from the other. The structure of the molecule answered the question of how heredity could work physically: complementary pairing is itself a copying mechanism.
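The pairing rule is simple enough to express in a few lines. The sketch below is illustrative only: the function names and the example sequence are made up for this purpose, a strand is treated as a plain string of letters, and the letter N is an assumed marker for a damaged, unreadable base. The two strands of a helix run in opposite directions, which is why the partner strand is built by reading the original in reverse.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement_strand(strand: str) -> str:
    """Return the partner strand, read in the opposite (antiparallel) direction."""
    return "".join(PAIRING[base] for base in reversed(strand))

def repair(damaged: str, partner: str) -> str:
    """Fill unreadable positions (marked 'N') using the intact partner strand."""
    template = complement_strand(partner)
    return "".join(t if d == "N" else d for d, t in zip(damaged, template))

original = "ATGCGTAC"  # hypothetical example sequence
partner = complement_strand(original)

print(partner)                                  # GTACGCAT
print(complement_strand(partner) == original)   # True
print(repair("ATGNGTAC", partner))              # ATGCGTAC
```

Applying the pairing rule twice returns the original sequence, which is the sense in which each strand carries the information needed to rebuild the other.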

In a typical human cell, the DNA is organized into 46 chromosomes (23 pairs). One copy of the genome is about 3 billion base pairs, so the two copies in a cell total roughly 6 billion. Laid end to end, this is roughly two meters of molecule packed into a nucleus a few micrometers across. The packing is not arbitrary: it is hierarchical, with DNA wound around histone proteins, organized into loops and domains, and folded into territories within the nucleus. Where in this structure a given gene sits affects whether and when it is read.
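The two-meter figure can be checked with a back-of-the-envelope calculation. The constants below are approximations supplied for this sketch rather than values from the article: about 0.34 nanometers of length per base pair in the common B form of DNA, and about 3.1 billion base pairs per genome copy, with two copies in a diploid cell.

```python
# Rough check of the "two meters" figure; all constants are approximate.
RISE_PER_BASE_PAIR_NM = 0.34   # spacing between adjacent base pairs in B-form DNA
HAPLOID_GENOME_BP = 3.1e9      # size of one copy of the human genome
COPIES_PER_CELL = 2            # a diploid cell carries two copies

total_bp = HAPLOID_GENOME_BP * COPIES_PER_CELL
length_m = total_bp * RISE_PER_BASE_PAIR_NM * 1e-9   # nanometers to meters

print(f"{total_bp:.1e} base pairs")     # 6.2e+09 base pairs
print(f"{length_m:.2f} m end to end")   # 2.11 m end to end
```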

The code

Protein-coding DNA is read in groups of three bases, called codons, each of which specifies one of twenty amino acids (or a stop signal). The same code is used, with only minor variations, in essentially every organism, from bacteria to humans. This shared code is one of the strongest single pieces of evidence for the common ancestry of life on Earth. There is no known biochemical reason it had to be exactly this code rather than another; the universality reflects historical contingency more than functional necessity.
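As a rough illustration of how a triplet code works, the sketch below builds the standard codon table and translates a short, made-up sequence. It simplifies in two ways that should be read as assumptions: it reads the coding strand of DNA directly, skipping the RNA intermediate, and it starts at the first base instead of searching for a start codon. The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Amino acids (one-letter codes, '*' = stop) listed in the conventional
# TCAG ordering of the standard genetic code.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 possible triplets to its amino acid or stop signal.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(coding_dna: str) -> str:
    """Translate a coding-strand DNA sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        aa = CODON_TABLE[coding_dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(len(CODON_TABLE))                # 64
print(len(set(AMINO_ACIDS) - {"*"}))   # 20
print(translate("ATGGCCAAATGA"))       # MAK
```

Four bases taken three at a time give 64 codons for only twenty amino acids and a stop signal, so most amino acids are specified by more than one codon.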

The flow of information from DNA to function follows a well-characterized path: DNA is transcribed into RNA, RNA is translated into protein, and protein performs most of the cellular work. This is the central dogma, articulated by Francis Crick in 1958 and elaborated since. There are well-characterized exceptions (RNA viruses, retroelements, regulatory RNAs that act without being translated), but the dogma's core remains a useful organizing principle.

The central dogma is a claim about sequence information: once it has passed into protein, it does not flow back into nucleic acid. It is sometimes misread as saying that all biological information flows outward from DNA. It says nothing about how the expression of DNA is regulated, which involves substantial information flowing in the other direction, from environment to chromatin state to gene activity.

What DNA determines

DNA determines the set of proteins a cell can build. It determines (within limits) the structure of those proteins. It determines (within limits) when and where they are likely to be expressed. It determines the developmental constraints within which a cell or organism can vary.

In some cases, a change in a single gene produces a specific phenotypic change with high reliability: sickle-cell disease, cystic fibrosis, achondroplasia. These are the conditions that fit the popular image of "the gene for X." They exist; they are real; they are important for the people affected. They are also a small fraction of the cases where DNA matters.

For most traits — height, susceptibility to common diseases, behavioral predispositions — the picture is fundamentally polygenic. Hundreds or thousands of variants, each with a small effect, combine to produce the heritable component of variation. The effects are statistical, not deterministic. They describe populations, not individuals.
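A toy simulation can make "statistical, not deterministic" concrete. Everything in the sketch below is an assumption chosen for illustration: the number of variants, the allele frequency, the purely additive effects, and environmental noise sized to match the genetic spread. It is not a model of any real trait.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

N_VARIANTS = 200    # hypothetical number of contributing variants
N_PEOPLE = 2000     # hypothetical population size
ALLELE_FREQ = 0.5   # assumed frequency of each variant allele

# Each variant gets one small effect, shared by everyone who carries it.
effects = [random.gauss(0, 1) for _ in range(N_VARIANTS)]

def genetic_value() -> float:
    """Sum of small effects over the 0, 1 or 2 copies carried at each variant."""
    total = 0.0
    for effect in effects:
        copies = (random.random() < ALLELE_FREQ) + (random.random() < ALLELE_FREQ)
        total += effect * copies
    return total

def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

genetic = [genetic_value() for _ in range(N_PEOPLE)]
env_sd = statistics.pstdev(genetic)  # assumption: noise as large as the genetic spread
trait = [g + random.gauss(0, env_sd) for g in genetic]

r = pearson(genetic, trait)
g_med, t_med = statistics.median(genetic), statistics.median(trait)
discordant = sum((g > g_med) != (t > t_med) for g, t in zip(genetic, trait)) / N_PEOPLE

print(f"correlation between genetic value and trait: {r:.2f}")
print(f"people on opposite sides of the two medians: {discordant:.0%}")
```

Under these assumptions the correlation should land near 0.7, a strong population-level signal. Even so, roughly a quarter of the simulated individuals have an above-median genetic value and a below-median trait, or the reverse.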

What DNA does not determine

There are several things DNA does not determine, and the gap between popular discourse and actual biology runs through this list.

The state of any individual cell at any given time. Two cells with identical DNA can be in radically different states because of differences in chromatin structure, transcription factor concentrations, signaling history, and stochastic noise in molecular abundance. The genotype is one input among many to the cellular state.

The course of development. Development is not the unfolding of a genetic program in a vacuum. It is a continuous interaction between the genome, the cellular environment (which includes maternal contributions in early development), and the physical constraints of the developing organism. The same genome in a different uterine environment produces a measurably different organism.

Most of the variation in most complex traits. For traits with substantial environmental and behavioral inputs — most disease risk, most psychological traits, most social outcomes — DNA explains a fraction of the variance, often a minority. Heritability is not the same as genetic determination; it is a measure of how much of the variation in a trait, in a given population, in a given environment, can be statistically attributed to genetic differences.

The future trajectory of an individual. Knowing a person's complete DNA sequence does not let you predict their life with precision. The deterministic implications of polygenic risk scores are routinely overstated in popular coverage. The same scores that meaningfully shift population-level risk are usually noisy at the individual level.

These limits are not gaps in our knowledge that more sequencing will fill. They are constraints inherent to the way DNA participates in biology. The gap between "DNA matters" and "DNA determines" is large, and most popular discourse collapses it.
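The point about heritability depending on the population and environment can be illustrated with a simple variance ratio. The sketch assumes a purely additive model in which trait variance is genetic variance plus environmental variance, with no correlation or interaction between the two; the numbers are hypothetical.

```python
def heritability(genetic_variance: float, environmental_variance: float) -> float:
    """Share of trait variance attributable to genetic differences,
    assuming trait = genetic + environmental with no interaction."""
    return genetic_variance / (genetic_variance + environmental_variance)

GENETIC_VARIANCE = 4.0  # hypothetical, identical in both populations

# Same genetic variation, measured in two different environments.
uniform_env = heritability(GENETIC_VARIANCE, environmental_variance=1.0)
varied_env = heritability(GENETIC_VARIANCE, environmental_variance=12.0)

print(f"uniform environment: {uniform_env:.2f}")   # 0.80
print(f"varied environment:  {varied_env:.2f}")    # 0.25
```

Nothing about the genes differs between the two cases. Only the spread of environments changes, and the heritability estimate moves with it.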

The takeaway

DNA is the storage medium for biological information and the substrate on which selection operates. It does enormous work; it is the right starting point for almost every question in modern biology. It is not, and could not be, a complete description of what an organism is or what it will become.

The most defensible position on DNA in contemporary public discourse is something like: DNA matters, the genome is real, heredity is real, and individuals are not reducible to their genome. All four parts are necessary. Drop any one and you produce one of the standard misunderstandings.

Frequently asked questions

  • If DNA is the blueprint, why aren't identical twins identical?

    DNA is not a blueprint; it is closer to a recipe whose execution is sensitive to context. Identical twins share a genotype but develop in different prenatal positions, accumulate different mutations, experience different environments, and undergo different epigenetic modifications. The phenotypic differences between identical twins are evidence that DNA underdetermines the organism.

  • How much of the human genome is 'junk'?

    The strong claim that most of the genome is functional has not survived scrutiny. The current best estimate is that 10–25% of the human genome is under purifying selection (and therefore likely functional), with the rest dominated by transposable element relics, pseudogenes, and other sequences without detectable function. 'Junk' is an unfortunate label, but the size of the non-functional fraction is real.

  • Are most diseases genetic?

    Most common diseases have a genetic component but are far from purely genetic. Heritability estimates for conditions like diabetes, depression, and most cancers often fall in the 30–50% range, meaning environment, behavior, and chance account for half or more of the variation in risk. Single-gene disorders exist and are important for affected individuals, but they are a small fraction of overall disease burden.
