The Genetic Code

Structure of DNA

The famous double helix was discovered in 1953 by Watson and Crick. It is made up of two backbones of sugar and phosphate, held together by base-pairs. A simple metaphor is a rope ladder: twist it and you get the familiar backbones and rungs. It is these rungs that are so important. They are chemicals, and these chemicals have names beginning with the letters A (Adenine), C (Cytosine), G (Guanine) and T (Thymine). These letters make up the genetic code.

If you were to type out the 6,000,000,000 letters of a single person's DNA in a standard-sized font, they would fill 12,000 standard-sized paperback books - a whole library!

It is the order of these letters (these chemicals) that encodes the information that we are trying to decipher. The letters on the two strands pair up into base-pairs, and these follow a rule: A is always partnered with T and C is always partnered with G. 99.9% of these letters are identical in all of us - it is the 0.1% that makes us different - and this is what we are interested in.
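The pairing rule can be written out in a few lines of Python. This is only an illustrative sketch: the names PAIRS and partner_strand are made up for this example, and it reads the partner letter by letter, ignoring the direction in which each strand is conventionally read.

```python
# Pairing rule from the text: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def partner_strand(strand: str) -> str:
    """Return the letter that sits opposite each letter of `strand`."""
    return "".join(PAIRS[base] for base in strand.upper())


print(partner_strand("GATTACA"))  # CTAATGT
```

Because each letter has exactly one partner, knowing one side of the ladder is enough to write down the other.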


It is in the copying of DNA, during the production of sperm and eggs, that changes, variants or markers arise. If you were asked to copy six billion letters, think how many mistakes you would make! The ‘molecular machine’ that copies our DNA is much more reliable, but not infallible, and it is these changes that make us different and show that we descend from different lineages.
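To make the idea of a copying change concrete, the Python sketch below compares an original stretch of letters with a copy and lists every position where they differ. The function name find_variants and the two short sequences are invented for illustration. Real comparisons also have to handle inserted and deleted letters, which this sketch does not.

```python
from __future__ import annotations


def find_variants(reference: str, copy: str) -> list[tuple[int, str, str]]:
    """List (position, original letter, copied letter) wherever the two differ."""
    if len(reference) != len(copy):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (position, original, copied)
        for position, (original, copied) in enumerate(zip(reference, copy))
        if original != copied
    ]


original_dna = "ACGTTGCAAC"
copied_dna = "ACGTTGTAAC"  # one letter miscopied: C became T

print(find_variants(original_dna, copied_dna))  # [(6, 'C', 'T')]
```

Positions are counted from zero, so the single change shown is at the seventh letter.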

These markers, or Single-Nucleotide Polymorphisms (SNPs - pronounced snips), can be traced to particular parts of the world, and through a measurement called the molecular clock they can be dated. The molecular clock hypothesis suggests that there is a roughly constant rate at which the ‘molecular machine’ makes these errors - approximately once every generation or two on the Y chromosome.
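The reasoning behind the molecular clock can be sketched as simple multiplication: the number of markers built up along a lineage, times the number of generations per new marker, times the length of a generation. In the Python sketch below, the rate of one marker every one to two generations comes from the text. The 30-year generation length and the count of 100 markers are assumptions chosen purely for illustration, and estimate_age_years is a made-up name. Real dating relies on calibrated rates and statistical models rather than a single multiplication.

```python
def estimate_age_years(
    markers: int,
    generations_per_marker: float,
    years_per_generation: float = 30.0,  # assumed value, not from the text
) -> float:
    """Rough age of a lineage under a constant-rate molecular clock."""
    return markers * generations_per_marker * years_per_generation


# Hypothetical lineage that has accumulated 100 markers.
youngest = estimate_age_years(100, generations_per_marker=1.0)
oldest = estimate_age_years(100, generations_per_marker=2.0)

print(f"roughly {youngest:.0f} to {oldest:.0f} years")  # roughly 3000 to 6000 years
```

The width of the range shows how strongly the answer depends on the assumed rate.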

If the 6,000,000,000 letters of a single person's DNA were notes in a song, played at an allegro tempo of 120 beats per minute, it would take almost a century to play! We extract DNA from your saliva and read your genetic code at a number of variable points in your sequence of letters in order to determine your personal DNA signature.
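The song comparison, and the earlier paperback comparison, can be checked with a little arithmetic. The Python sketch below assumes one letter per beat, played without a break, and a year of 365.25 days. The variable names are chosen only for this example.

```python
LETTERS = 6_000_000_000   # letters in one person's DNA (figure from the text)
BOOKS = 12_000            # paperback books (figure from the text)
BEATS_PER_MINUTE = 120    # allegro tempo (figure from the text)

# Paperback comparison: how many letters each book would need to hold.
letters_per_book = LETTERS // BOOKS

# Song comparison: one letter per beat, played non-stop.
minutes = LETTERS / BEATS_PER_MINUTE
years = minutes / (60 * 24 * 365.25)

print(letters_per_book)  # 500000
print(round(years, 1))   # 95.1
```

Half a million letters per book and about 95 years of continuous music are consistent with the figures quoted.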

Gene Chips

Previously, DNA was read by Sanger sequencing, the most popular DNA sequencing method for twenty-five years from its conception in 1977. It was a very slow process: it could take a week to check one marker in a small number of people. Now, however, we have the revolutionary technology to read one million, or indeed five million, markers in one person in a single experiment, with high accuracy.

Gene chips use synthetic DNA, attached to beads on a chip, to match each of the two variants (e.g. C or T) at the markers of interest. Each of these beads interrogates a marker - a piece of DNA that we know is variable and interesting. The synthetic bits of DNA, which are attached to red and green fluorescent dyes, find the relevant bits of DNA in your genome and bind to them, bringing two strands together and causing the fluorescent dyes to light up.
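One way to picture how the glowing dyes become a reading at a marker is to compare the red and green brightness of a bead. The text does not describe how this step is actually done, so the Python sketch below is a guess at the general idea. It assumes red reports one variant and green the other; the 0.8 and 0.2 cut-offs, the minimum signal of 100 and the name call_genotype are all hypothetical.

```python
def call_genotype(
    red: float,
    green: float,
    variant_red: str = "C",     # assumed: red dye reports C
    variant_green: str = "T",   # assumed: green dye reports T
    min_signal: float = 100.0,  # hypothetical brightness cut-off
) -> str:
    """Turn two dye intensities into a reading for one marker."""
    total = red + green
    if total < min_signal:
        return "no call"  # too dim to trust either colour
    red_share = red / total
    if red_share >= 0.8:
        return variant_red * 2             # mostly red: both copies carry C
    if red_share <= 0.2:
        return variant_green * 2           # mostly green: both copies carry T
    return variant_red + variant_green     # mixed: one copy of each


print(call_genotype(950, 40))   # CC
print(call_genotype(30, 900))   # TT
print(call_genotype(500, 480))  # CT
print(call_genotype(10, 20))    # no call
```

A mixed signal arises because each person carries two copies of most markers, one from each parent, and the two copies can differ.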