Genome Sizes

The genome of an organism is the complete set of genes specifying how its phenotype will develop (under a certain set of environmental conditions). In this sense, then, diploid organisms (like ourselves) contain two genomes, one inherited from our mother, the other from our father.

The table below presents a selection of representative genome sizes from the rapidly growing list of organisms whose genomes have been sequenced.



Table of Genome Sizes (haploid)

Organism | Base pairs | Genes | Notes
φX174 | 5,386 | 11 | virus of E. coli
Human mitochondrion | 16,569 | 37 |
Epstein-Barr virus (EBV) | 172,282 | 80 | causes mononucleosis
nucleomorph of Guillardia theta | 551,264 | 511 | all that remains of the nuclear genome of a red alga (a eukaryote) engulfed long ago by another eukaryote
Mycoplasma genitalium | 580,073 | 517 | one of the smallest true organisms
Mycoplasma pneumoniae | 816,394 | 679 | one of the smallest true organisms
Rickettsia prowazekii | 1,111,523 | 834 | bacterium that causes epidemic typhus
Treponema pallidum | 1,138,011 | 1,039 | bacterium that causes syphilis
Pelagibacter ubique | 1,308,759 | 1,354 | smallest genome yet found in a free-living organism (marine α-proteobacterium)
Helicobacter pylori | 1,667,867 | 1,589 | chief cause of stomach ulcers (not stress and diet)
Methanocaldococcus jannaschii | 1,664,970 | 1,783 | These unicellular microbes look like typical bacteria but their genes are so different from those of either bacteria or eukaryotes that they are classified in a third kingdom: Archaea.
Aeropyrum pernix | 1,669,695 | 1,885 | an archaeon; see the note for Methanocaldococcus jannaschii
Methanothermobacter thermoautotrophicus | 1,751,377 | 2,008 | an archaeon; see the note for Methanocaldococcus jannaschii
Streptococcus pneumoniae | 2,160,837 | 2,236 | the pneumococcus
Pandoravirus | 2,473,870 | 2,556 | A virus (of an amoeba) with a genome larger than that of the bacteria and archaea above and about the same as that of some parasitic eukaryotes.
Listeria monocytogenes | 2,944,528 | 2,926 | 2,853 of these encode proteins; the rest RNAs
Synechocystis | 3,573,470 | 4,003 | a marine cyanobacterium ("blue-green alga")
E. coli K-12 | 4,639,221 | 4,377 | 4,290 of these genes encode proteins; the rest RNAs
E. coli O157:H7 | 5.44 x 10^6 | 5,416 | strain that is pathogenic for humans; has 1,346 genes not found in E. coli K-12
Schizosaccharomyces pombe | 12,462,637 | 4,929 | Fission yeast. A eukaryote with fewer genes than the three bacteria below.
Agrobacterium tumefaciens | 4,674,062 | 5,419 | Useful vector for making transgenic plants; shares many genes with Sinorhizobium meliloti
Pseudomonas aeruginosa | 6.3 x 10^6 | 5,570 | Increasingly common cause of opportunistic infections in humans.
Sinorhizobium meliloti | 6,691,694 | 6,204 | The rhizobial symbiont of alfalfa. Genome consists of one chromosome and 2 large plasmids.
Saccharomyces cerevisiae | 12,495,682 | 5,770 | Budding yeast. A eukaryote.
Neurospora crassa | 38,639,769 | 10,082 | Plus 498 RNA genes.
Thalassiosira pseudonana | 34.5 x 10^6 | 11,242 | A diatom. Plus 144 chloroplast and 40 mitochondrial genes encoding proteins
Naegleria gruberi | 41 x 10^6 | 15,727 | This free-living unicellular organism lives as both an amoeboid and a flagellated form. 4,133 of its genes are also found in other eukaryotes, suggesting that they were present in the common ancestor of all eukaryotes. The great variety of functions encoded by these genes also suggests that the common ancestor of all eukaryotes was itself as complex as many of the present-day unicellular members.
Drosophila melanogaster | 122,653,977 | ~17,000 | the "fruit fly"
Caenorhabditis elegans | 100,258,171 | 21,733 |
Humans | 3.3 x 10^9 | ~21,000 |
Tetraodon nigroviridis (a pufferfish) | 3.42 x 10^8 | 27,918 | Although Tetraodon seems to have more protein-encoding genes than we do, it has much less non-coding DNA so its total genome is about a tenth the size of ours.
Mouse | 2.8 x 10^9 | ~23,000 |
Amphibians | 10^9–10^11 | ? |
Arabidopsis thaliana | 0.135 x 10^9 | 27,407 | a flowering plant (angiosperm) with one of the smallest genomes known in the plant kingdom.
Picea abies | 19.6 x 10^9 | 28,354 | the Norway spruce, a conifer (gymnosperm). Even though it has only ~900 more genes than Arabidopsis, it has 145 times as much DNA. Most of this appears to be derived from transposons.
Psilotum nudum | 2.5 x 10^11 | ? | see the note below the table

Even though Psilotum nudum (sometimes called the "whisk fern") is a far simpler plant than Arabidopsis (it has no true leaves, flowers, or fruit), it has roughly 1,850 times as much DNA (2.5 x 10^11 versus 0.135 x 10^9 base pairs). No one knows why, but 80% or more of it is repetitive DNA containing no genetic information. This is also the case for some amphibians, which contain 30 times as much DNA as we do but certainly are not 30 times as complex.
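
The size comparisons quoted in this section can be rechecked from the table's own figures. The following is only a minimal sketch: the dictionary GENOME_BP and the helper size_ratio are illustrative names, not part of the original page, and the values are the rounded ones given in the table (the amphibian entry uses the upper end of the quoted range).

```python
# Haploid genome sizes in base pairs, copied from the table above.
# Names and structure are illustrative assumptions, not from the source page.
GENOME_BP = {
    "Arabidopsis thaliana": 0.135e9,
    "Picea abies": 19.6e9,
    "Psilotum nudum": 2.5e11,
    "Tetraodon nigroviridis": 3.42e8,
    "Human": 3.3e9,
    "Amphibians (upper bound)": 1e11,
}


def size_ratio(larger: str, smaller: str) -> float:
    """Return the genome size of `larger` divided by that of `smaller`."""
    return GENOME_BP[larger] / GENOME_BP[smaller]


if __name__ == "__main__":
    comparisons = [
        ("Picea abies", "Arabidopsis thaliana"),
        ("Psilotum nudum", "Arabidopsis thaliana"),
        ("Amphibians (upper bound)", "Human"),
        ("Tetraodon nigroviridis", "Human"),
    ]
    for a, b in comparisons:
        print(f"{a} / {b}: {size_ratio(a, b):,.2f}")
```

With these figures the spruce comes out at about 145 times Arabidopsis, the largest amphibian genomes at about 30 times ours, and Tetraodon at about a tenth of ours.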

The total amount of DNA in the haploid genome is called its C value. The lack of a consistent relationship between the C value and the complexity of an organism (e.g., amphibians vs. mammals) is called the C value paradox.
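
One way to see the paradox in numbers is to divide each C value by the organism's gene count. The sketch below does this for a few rows of the table; the names GENOMES and bp_per_gene are hypothetical, the gene counts are the approximate ones listed above, and "base pairs per gene" is used here only as a crude illustration, not as a standard measure from the source.

```python
# (haploid genome size in bp, approximate gene count), taken from the table above.
# The selection of organisms and all identifiers are illustrative assumptions.
GENOMES = {
    "E. coli K-12": (4_639_221, 4_377),
    "Saccharomyces cerevisiae": (12_495_682, 5_770),
    "Arabidopsis thaliana": (0.135e9, 27_407),
    "Tetraodon nigroviridis": (3.42e8, 27_918),
    "Human": (3.3e9, 21_000),
    "Picea abies": (19.6e9, 28_354),
}


def bp_per_gene(name: str) -> float:
    """Return the C value divided by the gene count for one organism."""
    size_bp, gene_count = GENOMES[name]
    return size_bp / gene_count


if __name__ == "__main__":
    # Sort by DNA per gene to show how weakly it tracks the gene count.
    for name in sorted(GENOMES, key=bp_per_gene):
        size_bp, gene_count = GENOMES[name]
        print(f"{name:26s} genes={gene_count:>6,}  bp/gene={bp_per_gene(name):>12,.0f}")
```

Arabidopsis, Tetraodon, humans, and the Norway spruce all have gene counts in the range of roughly 21,000 to 28,000, yet the DNA per gene differs by more than a hundredfold among them.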

How many genes does it take to make an organism?

The scientists at The Institute for Genomic Research (now known as the J. Craig Venter Institute) who determined the Mycoplasma genitalium sequence followed this work by systematically destroying its genes (by mutating them with insertions) to see which ones are essential to life and which are dispensable. They concluded that only 381 of its 485 protein-encoding genes are essential to life.
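
As a quick piece of arithmetic on those two numbers, the sketch below works out how many genes appear dispensable and what share is essential. The constant and function names are illustrative only; the counts are the ones quoted in the paragraph above.

```python
# Counts quoted above for Mycoplasma genitalium; identifiers are illustrative.
PROTEIN_GENES = 485
ESSENTIAL_GENES = 381


def essential_fraction(essential: int, total: int) -> float:
    """Return the share of genes judged essential, as a value between 0 and 1."""
    if total <= 0 or not 0 <= essential <= total:
        raise ValueError("need 0 <= essential <= total and total > 0")
    return essential / total


dispensable = PROTEIN_GENES - ESSENTIAL_GENES
fraction = essential_fraction(ESSENTIAL_GENES, PROTEIN_GENES)

if __name__ == "__main__":
    print(f"dispensable genes: {dispensable}")
    print(f"essential share:   {fraction:.1%}")
```

So about 104 of the protein-encoding genes could be disrupted, while nearly four out of five appear to be required.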


19 April 2014