The genome of an organism is the complete set of genes specifying how its phenotype will develop (under a certain set of environmental conditions). In this sense, then, diploid organisms (like ourselves) contain two genomes, one inherited from our mother, the other from our father.
The table below presents a selection of representative genome sizes from the rapidly growing list of organisms whose genomes have been sequenced.
| Organism | Base pairs | Genes | Notes |
|---|---|---|---|
| Phi-X 174 | 5,386 | 10 | virus of E. coli |
| Human mitochondrion | 16,569 | 37 | |
| Epstein-Barr virus (EBV) | 172,282 | 80 | causes mononucleosis |
| Nanoarchaeum equitans | 490,885 | 552 | This parasitic member of the Archaea has the smallest genome of a true organism yet found. |
| nucleomorph of Guillardia theta | 551,264 | 511 | all that remains of the nuclear genome of a red alga (eukaryote) engulfed long ago by another eukaryote |
| Mycoplasma genitalium | 580,073 | 485 | This and the next two entries are three of the smallest true organisms. |
| Ureaplasma urealyticum | 751,719 | 652 | |
| Mycoplasma pneumoniae | 816,394 | 680 | |
| Chlamydia trachomatis | 1,042,519 | 936 | most common sexually-transmitted disease (STD) bacterium in the U.S. |
| Rickettsia prowazekii | 1,111,523 | 834 | bacterium that causes epidemic typhus |
| Treponema pallidum | 1,138,011 | 1,039 | bacterium that causes syphilis |
| Mimivirus | 1,181,404 | 1,262 | A virus (of an amoeba) with a genome larger than the six cellular organisms above |
| Rickettsia conorii | 1,268,755 | 1,374 | causes Mediterranean spotted fever |
| Pelagibacter ubique | 1,308,759 | 1,354 | smallest genome yet found in a free-living organism (marine α-proteobacterium) |
| Borrelia burgdorferi | 1.44 x 10^6 | 1,738 | bacterium that causes Lyme disease (see note below) |
| Aquifex aeolicus | 1,551,335 | 1,749 | bacterium isolated from a hot spring in Yellowstone National Park |
| Campylobacter jejuni | 1,641,481 | 1,708 | frequent cause of food poisoning |
| Helicobacter pylori | 1,667,867 | 1,589 | chief cause of stomach ulcers (not stress and diet) |
| Thermoplasma acidophilum | 1,564,905 | 1,509 | These unicellular microbes (this entry and the four that follow) look like typical bacteria, but their genes are so different from those of either bacteria or eukaryotes that they are classified in a third kingdom: Archaea. |
| Methanococcus jannaschii | 1,664,970 | 1,783 | |
| Aeropyrum pernix | 1,669,695 | 1,885 | |
| Pyrococcus horikoshii | 1,738,505 | 1,994 | |
| Methanobacterium thermoautotrophicum | 1,751,377 | 2,008 | |
| Haemophilus influenzae | 1,830,138 | 1,738 | bacterium that causes middle ear infections |
| Thermotoga maritima | 1,860,725 | 1,879 | marine bacterium |
| Streptococcus pneumoniae | 2,160,837 | 2,236 | the pneumococcus |
| Archaeoglobus fulgidus | 2,178,400 | 2,437 | another member of the Archaea |
| Neisseria meningitidis | 2,184,406 | 2,185 | Group A; causes occasional epidemics of meningitis in less developed countries. |
| Neisseria meningitidis | 2,272,351 | 2,221 | Group B; the most frequent cause of meningitis in the U.S. |
| Encephalitozoon cuniculi | 2,507,519 | 1,997 | (plus 69 RNA genes); a parasitic eukaryote. |
| Propionibacterium acnes | 2,560,265 | 2,333 | causes acne |
| Listeria monocytogenes | 2,944,528 | 2,926 | 2,853 of these encode proteins; the rest RNAs |
| Deinococcus radiodurans | 3,284,156 | 3,187 | on 2 chromosomes and 2 plasmids; bacterium noted for its resistance to radiation damage |
| Synechocystis | 3,573,470 | 4,003 | a marine cyanobacterium ("blue-green alga") |
| Vibrio cholerae | 4,033,460 | 3,890 | in 2 chromosomes; causes cholera |
| Mycobacterium tuberculosis | 4,411,532 | 3,959 | causes tuberculosis |
| Mycobacterium leprae | 3,268,203 | 1,604 | causes leprosy |
| Bacillus subtilis | 4,214,814 | 4,779 | another bacterium |
| E. coli K-12 | 4,639,221 | 4,377 | 4,290 of these genes encode proteins; the rest RNAs |
| E. coli O157:H7 | 5.44 x 10^6 | 5,416 | strain that is pathogenic for humans; has 1,346 genes not found in E. coli K-12 |
| Agrobacterium tumefaciens | 4,674,062 | 5,419 | Useful vector for making transgenic plants; shares many genes with Sinorhizobium meliloti |
| Salmonella enterica var Typhi | 4,809,037 | 4,395 | + 2 plasmids with 372 active genes; causes typhoid fever |
| Salmonella enterica var Typhimurium | 4,857,432 | 4,450 | + 1 plasmid with 102 active genes |
| Yersinia pestis | 4,826,100 | 4,052 | on 1 chromosome + 3 plasmids; causes plague |
| Schizosaccharomyces pombe | 12,462,637 | 4,929 | Fission yeast. A eukaryote with fewer genes than the four bacteria below. |
| Ralstonia solanacearum | 5,810,922 | 5,129 | soil bacterium pathogenic for many plants; 1681 of its genes on a huge plasmid |
| Pseudomonas aeruginosa | 6.3 x 10^6 | 5,570 | Increasingly common cause of opportunistic infections in humans. |
| Streptomyces coelicolor | 6,667,507 | 7,842 | An actinomycete whose relatives provide us with many antibiotics |
| Sinorhizobium meliloti | 6,691,694 | 6,204 | The rhizobial symbiont of alfalfa. Genome consists of one chromosome and 2 large plasmids. |
| Saccharomyces cerevisiae | 12,495,682 | 5,770 | Budding yeast. A eukaryote. |
| Cyanidioschyzon merolae | 16,520,305 | 5,331 | A unicellular red alga. |
| Plasmodium falciparum | 22,853,764 | 5,268 | Plus 53 RNA genes. Causes the most dangerous form of malaria. |
| Thalassiosira pseudonana | 34.5 x 10^6 | 11,242 | A diatom. Plus 144 chloroplast and 40 mitochondrial genes encoding proteins |
| Neurospora crassa | 38,639,769 | 10,082 | Plus 498 RNA genes. |
| Caenorhabditis elegans | 100,258,171 | 19,427 | The first multicellular eukaryote to be sequenced. |
| Arabidopsis thaliana | 115,409,949 | ~28,000 | a flowering plant (angiosperm); see note below |
| Drosophila melanogaster | 122,653,977 | 13,379 | the "fruit fly" |
| Anopheles gambiae | 278,244,063 | 13,683 | Mosquito vector of malaria. |
| Tetraodon nigroviridis (a pufferfish) | 3.42 x 10^8 | 27,918 | Although Tetraodon seems to have about the same number of genes as we do, it has much less "junk" DNA so its total genome is about a tenth the size of ours. |
| Rice | 3.9 x 10^8 | 37,544 | |
| Sea urchin | 8.14 x 10^8 | ~23,300 | |
| Dogs | 2.4 x 10^9 | 19,300 | |
| Humans | 3.3 x 10^9 | ~20,500 | |
| Amphibians | 10^9–10^11 | ? | |
| Psilotum nudum | 2.5 x 10^11 | ? | see note below |
Note: The gene total for Borrelia burgdorferi is based on 853 genes on its single chromosome (of 910,724 base pairs) plus 430 genes on 11 of the 17 plasmids it contains.
Arabidopsis thaliana is a plant (in the mustard family) that has the smallest genome known in the plant kingdom and for this reason has become a favorite of plant molecular biologists. The sequences of two of its five chromosomes (#2 and #4) were published in December 1999. The others were reported in December 2000.
Even though Psilotum nudum (sometimes called the "whisk fern") is a far simpler plant than Arabidopsis (it has no true leaves, flowers, or fruit), it has 3000 times as much DNA. No one knows why, but 80% or more of it is repetitive DNA containing no genetic information. This is also the case for some amphibians, which contain 30 times as much DNA as we do but certainly are not 30 times as complex.
The total amount of DNA in the haploid genome is called its C value. The lack of a consistent relationship between the C value and the complexity of an organism (e.g., amphibians vs. mammals) is called the C value paradox.
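The lack of a consistent relationship can be made concrete with figures from the table above. The following is a minimal sketch in Python, not part of the original text: the names `GENOMES`, `genes_per_megabase`, and `size_ratio` are illustrative assumptions, and the genome sizes that the table gives as powers of ten (and the approximate human gene count) are rounded values.

```python
# Genome size (base pairs) and gene count, copied from the table above.
# Sizes given in the table as powers of ten are rounded, and the human
# gene count is the table's approximate figure (~20,500).
GENOMES = {
    "Mycoplasma genitalium": (580_073, 485),
    "E. coli K-12": (4_639_221, 4_377),
    "Saccharomyces cerevisiae": (12_495_682, 5_770),
    "Caenorhabditis elegans": (100_258_171, 19_427),
    "Tetraodon nigroviridis": (342_000_000, 27_918),
    "Humans": (3_300_000_000, 20_500),
}


def genes_per_megabase(base_pairs, genes):
    """Return gene density as genes per million base pairs."""
    return genes / (base_pairs / 1_000_000)


def size_ratio(name_a, name_b):
    """Return how many times larger genome A is than genome B."""
    return GENOMES[name_a][0] / GENOMES[name_b][0]


if __name__ == "__main__":
    for name, (base_pairs, genes) in GENOMES.items():
        density = genes_per_megabase(base_pairs, genes)
        print(f"{name:<26} {density:8.1f} genes/Mb")
    ratio = size_ratio("Tetraodon nigroviridis", "Humans")
    print(f"Tetraodon genome is {ratio:.2f} times the size of ours")
```

With these inputs, gene density falls from hundreds of genes per megabase in the bacteria and yeast to fewer than ten in humans, and the Tetraodon genome comes out at roughly a tenth the size of ours, in line with the note in the table.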
The scientists at The Institute for Genomic Research (now known as the J. Craig Venter Institute) who determined the Mycoplasma genitalium sequence followed this work by systematically disrupting its genes (mutating them with insertions) to see which ones are essential to life and which are dispensable. They concluded that only 381 of its 485 protein-encoding genes are essential to life.