" An exceptionally comprehensive reference, touching on every relevant aspect of current forensic DNA typing practice.......will serve many practitioners and students of forensic DNA typing as a single source reference......It is hard to think of a topic in forensic DNA typing that is not treated in this book.... "

                                                                          - Journal of Forensic Science 

             DNA Typing

Genetic fingerprinting, DNA testing, DNA typing, and DNA profiling are names for techniques used to distinguish between individuals of the same species using only samples of their DNA. The invention of the technique by Sir Alec Jeffreys at the University of Leicester in the UK was announced in 1985. Any two humans have the vast majority of their DNA sequence in common. Genetic fingerprinting exploits highly variable repeating sequences known as “microsatellites” or Simple Sequence Repeats (SSRs): polymorphic loci that consist of repeating units of 1-4 base pairs in length.

Unrelated humans are likely to have different numbers of repeat units at a given microsatellite locus. By using PCR to detect the number of repeats at several loci, it is possible to establish a match that is extremely unlikely to have arisen by coincidence, except in the case of identical twins, who have identical genomes.
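
As a rough illustration of why testing several loci makes a coincidental match so unlikely, the sketch below multiplies per-locus genotype frequencies together (the product rule, which assumes the loci are inherited independently). The five frequencies are hypothetical values chosen for illustration, not real population data.

```python
from math import prod

# Hypothetical frequencies with which one particular genotype occurs
# at each of five independent loci (illustrative values only).
genotype_frequencies = [0.10, 0.05, 0.08, 0.12, 0.06]

def random_match_probability(frequencies):
    """Multiply per-locus frequencies (assumes the loci are independent)."""
    return prod(frequencies)

rmp = random_match_probability(genotype_frequencies)
print(f"Chance that an unrelated person matches at every locus: about 1 in {1 / rmp:,.0f}")
```

Each additional locus multiplies the probability by another small number, which is why profiles built from many loci are so discriminating.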

Genetic fingerprinting (DNA testing) is used in forensic science to match suspects to samples of blood, hair, saliva or semen. It has also led to a number of exonerations of formerly convicted suspects. It is also used in such applications as identifying human remains, paternity testing, matching organ donors, studying populations of wild animals, and establishing the provenance or composition of foods. It has also been used to generate hypotheses on the pattern of the human diaspora in prehistoric times.

Testing is subject to the legal code of the jurisdiction in which it is performed. Usually the testing is voluntary, but it can be made compulsory by such instruments as a search warrant or court order. Several jurisdictions have also begun to assemble databases containing DNA information of convicts.

The United Kingdom currently has the most extensive DNA database in the world, the National DNA Database (NDNAD), with well over 2 million records as of 2005. The size of this database, and its rate of growth, are causing concern among civil liberties groups in the UK, where police have wide-ranging powers to take samples and retain them even in the event of acquittal.

In the last few years, DNA evidence has started to play a big part in many nations' Criminal Justice systems. It has been used to prove that suspects were involved in crimes and to free people who were wrongly convicted. In the United States, it has been integral to several high-profile criminal cases, including the trial of Orenthal James (O.J.) Simpson and the investigation of the 1996 murder of JonBenet Ramsey.

The key to DNA evidence lies in comparing the DNA from the scene of a crime with a suspect's DNA. To do this, investigators have to do three things:

1)   Collect DNA at the crime scene and from the suspect
2)   Analyze the DNA to create a DNA profile
3)   Compare the profiles to each other
 
 Authorities can extract DNA from almost any tissue, including hair, fingernails, bones, teeth and bodily fluids. Sometimes, investigators have DNA evidence but no suspects. In that case, law enforcement officials can compare crime scene DNA to profiles stored in a database. The most commonly used database in the United States is called CODIS, which stands for Combined DNA Index System. CODIS is maintained by the FBI. By law, authorities in all 50 states must collect DNA samples from convicted sex offenders for inclusion in CODIS. Some states also require all convicted felons to submit DNA.
  
Law enforcement officials have used a variety of methods to examine DNA. The exact steps in preparing and analyzing the DNA can vary based on which method the investigators use. But, in general, the tests examine non-coding portions of DNA strands. Genes, which serve as templates for making proteins in your cells, make up only five percent of a DNA strand. The remainder of your DNA is non-coding and includes lots of repeating base pairs. Different types of tests look for and analyze different base pair repetition patterns.
  
Restriction Fragment Length Polymorphism (RFLP) analysis was one of the first forensic methods used to analyze DNA. It analyzes the length of strands of DNA that include repeating base pairs. These repetitions are known as variable number tandem repeats (VNTRs) because the number of repeats varies from person to person, anywhere from one to thirty times. RFLP analysis requires investigators to treat the DNA with a restriction enzyme that cuts the strand at specific points. The number of repeats affects the length of each resulting strand of DNA. Investigators compare samples by comparing the lengths of the strands. RFLP analysis requires a fairly large sample of DNA that hasn't been contaminated with dirt, and is rarely used in modern forensic laboratories.
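
The link between repeat number and fragment length can be sketched in a few lines. This is a toy model, not a laboratory protocol: the recognition site GAATTC is the one recognized by the enzyme EcoRI, but the repeat unit, the flanking sequences, the helper names and the simplification of cutting at the start of each site are all assumptions made for illustration.

```python
SITE = "GAATTC"          # recognition site (EcoRI); cut position simplified
REPEAT_UNIT = "CAGGT"    # hypothetical repeat unit

def make_sample(repeat_count):
    """Build a hypothetical DNA string with a VNTR between two cut sites."""
    return "TTAC" + SITE + REPEAT_UNIT * repeat_count + SITE + "ACCA"

def fragment_lengths(dna, site):
    """Cut the string at the start of every site and return fragment lengths."""
    cut_points = []
    pos = dna.find(site)
    while pos != -1:
        cut_points.append(pos)
        pos = dna.find(site, pos + 1)
    bounds = [0] + cut_points + [len(dna)]
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]

# Two samples that differ only in the number of repeats
print(fragment_lengths(make_sample(3), SITE))
print(fragment_lengths(make_sample(7), SITE))
```

Only the middle fragment changes length between the two samples, and that difference is what an RFLP comparison detects.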

Polymerase Chain Reaction (PCR) analysis is a newer technique that can amplify the DNA in a much smaller sample. It does this by making lots of identical copies of a small amount of DNA. It's often used as a preliminary step in Short Tandem Repeat (STR) analysis, which is the most commonly-used type of forensic analysis today.

STR analysis examines how often base pairs repeat in specific loci (or locations) on a DNA strand. These can be dinucleotide, trinucleotide, tetranucleotide or pentanucleotide repeats -- that is, repetitions of two, three, four or five base pairs. Investigators often look for tetranucleotide or pentanucleotide repeats in samples that have been through PCR amplification since these are the most likely to be accurate. In STR Analysis, examiners have to:

1)   Extract the DNA from the cells in the sample
2)   Quantify the DNA
3)   Amplify the DNA using PCR
4)   Use capillary electrophoresis to separate the amplified DNA fragments by size

Several of these steps are fairly labor-intensive, but many of them can now be performed by robots and machines. The FBI's CODIS database uses samples that have undergone STR analysis examining 13 loci. The odds of two people having identical 13-loci STR profiles are about one in a billion.
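
The core idea of STR analysis, counting how many times a short motif repeats at a locus, can be sketched as follows. In practice the count is inferred from fragment size after electrophoresis rather than read from the sequence; the motif GATA, the flanking bases and the function name here are chosen only for illustration.

```python
import re

def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)

# Two hypothetical alleles at the same locus, differing in repeat number
allele_1 = "CCTA" + "GATA" * 9 + "TTGC"
allele_2 = "CCTA" + "GATA" * 12 + "TTGC"

print(longest_repeat_run(allele_1, "GATA"))
print(longest_repeat_run(allele_2, "GATA"))
```

A profile is then simply the list of repeat counts observed at each locus tested.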

* Note: Most forensic DNA tests use material from the nucleus of a cell. Sometimes, especially in older samples of tissue like hair and teeth, there is no nucleus remaining in the sample. In these cases, investigators often use mitochondrial DNA analysis, which uses DNA from a cell's mitochondria.

In 1985, DNA entered the courtroom for the first time as evidence in a trial, but it wasn't until 1988 that DNA evidence actually sent someone to jail. This is a complex area of forensic science that relies heavily on statistical predictions; in early cases where jurors were hit with reams of evidence heavily laden with mathematical formulas, it was easy for defense attorneys to create doubt in jurors' minds. Since then, a number of advances have allowed criminal investigators to perfect the techniques involved and face down legal challenges to DNA fingerprinting. Improvements include:
 
1)   New testing procedures - RFLP analysis required large amounts of relatively high-quality DNA. Newer procedures require far less DNA and can be completed faster.
2)   Source of DNA - Science has devised ingenious ways of extracting DNA from sources that used to be too difficult or too contaminated to use.
3)   Expanded DNA databases - Several countries, including the United States and Britain, have built elaborate databases with hundreds of thousands of unique individual DNA profiles. However, these databases also raise questions about privacy. DNA holds a lot more information about a person than fingerprints do. For example, a person's DNA includes information about everything from eye color to genetic defects. Some people fear that the widespread use of DNA databases could encourage governments to discriminate against people because of information encoded in their DNA. However, the DNA used for the FBI's CODIS database is not currently thought to correlate to a person's actual traits.
4)   Training - Crime labs have developed formal protocols for handling and processing evidence, reducing the likelihood of contamination of samples. On the courtroom side, prosecutors have become more savvy at presenting genetic evidence, and many states have come up with specific rules governing its admissibility in court cases.
5)   Science education - In recent years, a number of debates have erupted around the world over issues like using DNA evidence, cloning animals or selling genetically modified crops. Since that time, classroom study of DNA and its properties has in many places become more in-depth and widespread.
 
 

Using DNA Evidence
Given the high profile DNA evidence had during the O.J. Simpson trial, most people know DNA profiles are used by criminal investigators to:
1)   Prove guilt - Matching DNA profiles can link a suspect to a crime or crime scene.
2)   Exonerate an innocent person - Innocent people have been freed from death row in the United States based on DNA evidence. So far, DNA evidence has been almost as useful in excluding suspects as in fingering and convicting them; about 30 percent of DNA profile comparisons done by the FBI result in excluding someone as a suspect.
 
DNA evidence is also useful beyond the criminal courtroom in:
 
1)   Paternity testing and other cases where authorities need to prove whether or not individuals are related - One of the more infamous paternity cases of late revolved around a 1998 paper in the journal "Nature" that studied whether or not Thomas Jefferson, the third president of the United States, fathered children with one of his slaves.
 
2)   Identification of John or Jane Does - Police investigators often face the unpleasant task of trying to identify a body or skeletal remains. DNA is a fairly resilient molecule, and samples can be easily extracted from hair or bone tissue; once a DNA profile has been created, it can be compared to samples from families of missing persons to see if a match can be made. The military even uses DNA profiles in place of the old-school dog tag. Each new recruit must provide blood and saliva samples, and the stored samples can subsequently be used as a positive ID for soldiers killed in the line of duty. Even without a DNA match to conclusively identify a body, a profile is useful because it can provide important clues about the victim, such as his or her sex and race.
 

3)   Studying the evolution of human populations - Scientists are trying to use samples extracted from skeletons and from living people around the world to show how early human populations might have migrated across the globe and diversified into so many different races.
 

4)   Studying inherited disorders - Scientists also study the DNA fingerprints of families with members who have inherited diseases like Alzheimer's Disease to try to ferret out chromosomal differences between those without the disease and those who have it, in the hope that these differences might be linked to getting the disease.
 

  ~~~~~~~~~~~~~~~~~~~

                  DNA:

        The Genetic Code

  ~~~~~~~~~~~~~~~~~~~~~~

Overview

Organic Chemistry is the scientific study of the structure, properties, composition, reactions, and preparation by synthesis (or other means) of chemical compounds of carbon and hydrogen. These compounds may contain any number of other elements, including nitrogen, oxygen, and the halogens (fluorine, chlorine, bromine, iodine). They may also contain the elements phosphorus or sulfur. Because of their unique properties, multi-carbon hydrocarbon compounds exhibit extremely large variety and the range of application of organic compounds is enormous. They form the chemical basis of many products (e.g. paints, plastics, explosives, pharmaceuticals, fossil fuels, petrochemicals) and of course they form the basis of all life processes.

The original definition of organic chemistry came from the misperception that these compounds were always related to life and vital functions. Those compounds that are related to life processes are dealt with in the branch of organic chemistry which is called biochemistry. Living organisms maintain themselves by continuously processing the fuel of nutrient molecules contained in edible matter, or food. These molecules provide building blocks for new living matter and energy to sustain the vital functions of life. Nutrients include organic compounds such as complex carbohydrates (polysaccharides such as starch, glycogen, cellulose), fats, proteins, and vitamins, as well as metallic elements or minerals such as iron and copper, and water. The human body is composed of chemical compounds such as water, amino acids (or proteins), fatty acids (or lipids), nucleic acids (DNA / RNA), and carbohydrates (or sugars).

Carbohydrates are carbon, hydrogen and oxygen (C-H-O) containing compounds, the simplest of which are known as simple sugars or monosaccharides. Most carbohydrates have one carbon atom (C) for each water molecule (H2O). The formula for a simple sugar such as glucose (C6H12O6) is easy to recognize because there are equal numbers of carbons and oxygens and twice as many hydrogens. The ending –ose indicates that you are dealing with a carbohydrate. Carbohydrates provide both energy and shape to certain cells, and are an essential player in the field of molecular genetics.

Complex carbohydrates form when simple sugars combine with one another: two units form a disaccharide (e.g. sucrose), three form a trisaccharide, and long chains form polysaccharides (“many sugar units”).  Some common examples of polysaccharides are cellulose (a fibrous sugar used in the construction of cell walls), starch (an easily digestible energy source), and glycogen (for storing energy in muscle cells). Sugar molecules are also an essential part of genetic molecules such as DNA (deoxyribonucleic acid), RNA (ribonucleic acid) and ATP (adenosine triphosphate).

                                       

Proteins (or polypeptides) include enzymes, the catalysts that speed the rate of chemical reactions in living things. Proteins such as hemoglobin serve as carriers of other molecules such as oxygen. Others, such as collagen, provide shape and support, while contractile proteins such as actin and myosin are responsible for muscle movement. Hormones are chemical messengers secreted by endocrine glands to regulate other parts of the body. Antibodies are globular proteins made by the body in response to the presence of a foreign or harmful molecule called an antigen. Antigens are often proteins also.

Chemically speaking, proteins are polymers made up of monomers known as amino acids. An amino acid is a short carbon skeleton that contains an amino group (- NH2) on one end of the skeleton and a carboxylic acid group (- COOH) at the other end. (The acidic properties derive from the carboxylic acid group, which can give up a hydrogen ion; the non-bonding or “extra” pair of electrons associated with the Group V nitrogen makes the amino group basic.)

                                                 The general structure of an amino acid molecule, with the amine group on the left and the carboxylic acid group on the right. The R group is dependent on the amino acid.

Structural proteins (e.g. collagen) are important for maintaining the shape of cells and organisms. Regulator proteins, such as enzymes and hormones, help to determine what activities will occur in the organism. Carrier proteins transport molecules from one location in the body to another. 

Genes - The Central Dogma

A gene is a segment of DNA that is able to: 1) replicate itself; 2) mutate (or change) itself; 3) store information about itself; and 4) synthesize new structural and regulatory proteins essential to the operation of the cell or organism. 

This stylized schematic diagram shows a gene in relation to the double helix structure of DNA and to a chromosome (right). Introns are regions often found in eukaryote genes which are removed in the splicing process: only the exons encode the protein. This diagram labels a region of only 40 or so bases as a gene. In reality many genes are much larger, as are introns and exons.

It is our genetic programming which determines our individual physical traits, such as hair, skin and eye color. In order to understand how genes work, we need to look at a concept developed by the first scientists to really question the function of nucleic acids in the overall scheme of molecular genetics. In an attempt to understand how DNA and RNA relate to inheritance, cell structure and cell activities, the central dogma was developed.

The essence of the concept is that at the center of it all is DNA, the genetic material of all living cells. One of the primary functions of DNA is to reproduce itself -- a process which we call replication. DNA is also capable of supervising the manufacture of RNA -- a process known as transcription. This process is closely linked to the production of new protein molecules under the direction of RNA -- a process which we call translation.

Nucleic Acids: DNA & RNA  

Nucleic acids are complex polymers that store and transfer information within a cell. There are two types of nucleic acids: DNA and RNA. DNA serves as genetic material determining which proteins will be manufactured. RNA plays a vital role in the process of protein manufacture. All nucleic acids are composed of fundamental monomers known as nucleotides.

 Each nucleotide monomer is composed of:

1) a sugar molecule (5-member carbon ring skeleton)

2) a phosphate (- PO4) group

3) a nitrogenous base (a hexagonal 6-member carbon / nitrogen ring skeleton, fused to a second ring in adenine and guanine)

There are five types of nitrogenous bases:

                       Adenine, Guanine, Cytosine, Uracil and Thymine

The bases A, G, T, C, and U project from the sugar-phosphate backbone of the polymer molecule. It is possible to classify nucleic acids into two main groups (DNA and RNA) based on the kinds of sugars and bases used in the nucleotides. DNA contains the sugar deoxyribose and has the bases A, T, G and C. RNA contains the sugar ribose and has the bases A, U, G and C.

DNA is actually a double molecule, as it is composed of two strands to form a ladder-like structure thousands of nucleotide bases long. The two strands are attached between their bases according to the base pair rule: A – T and G – C.       
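
The base pair rule means that one strand completely determines the other. A minimal sketch, using an arbitrary example sequence and ignoring the fact that the two strands of a real molecule run in opposite directions:

```python
# Base pair rule: A pairs with T, G pairs with C
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complementary_strand(strand):
    """Return the partner base for each position of the given strand."""
    return "".join(PAIRS[base] for base in strand)

print(complementary_strand("CATTAGACT"))  # GTAATCTGA
```

Applying the function twice returns the original strand, which is the property that makes faithful copying of DNA possible.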

One strand of DNA is called the coding strand because it has a meaningful genetic message written using the nitrogenous bases as letters (e.g. the base sequence CATTAGACT). The opposite strand is called the non-coding strand since it makes no sense, but protects the coding strand from chemical and physical damage. Both strands are twisted into a three-dimensional spiral helix as proposed in 1953 by biologists James Watson and Francis Crick.

                            www.dnaftb.org/dnaftb/19/concept/index.html

 

www.accessexcellence.org/RC/VL/GG/structure.html

You can actually “write” a message in the form of a stable DNA molecule by combining the four different bases (A, G, C and T) in different sequences. The four letters in the nucleic acid “alphabet” yield sixty-four possible three-letter words (Table 26.4). This is the basis of the genetic code for all organisms. If these bases are read in groups of three, they make sense to us (CAT, TAG, ACT). To make sense out of such a code, it is necessary to read in only one direction (as in English).
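
The count of sixty-four follows from having four choices at each of three positions (4 x 4 x 4). A short sketch that enumerates the words and reads a sequence in groups of three, left to right; the helper name is hypothetical:

```python
from itertools import product

BASES = "AGCT"

# Every possible three-letter word over the four-letter alphabet
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(codons))  # 64

def read_codons(sequence):
    """Split a sequence into consecutive three-letter words, left to right."""
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]

print(read_codons("CATTAGACT"))  # ['CAT', 'TAG', 'ACT']
```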

The genetic information contained by DNA can be compared to the information in a textbook. Thus, books are composed of words (constructed from individual letters) in particular combinations, organized into chapters. Similarly, DNA is composed of tens of thousands of nucleotides (letters) in specific sequences (words) organized into genes (chapters). Each gene carries the information for producing a protein, just as each chapter carries the information relating to one idea.

The order of nucleotides in a gene is directly related to the order of amino acids in a protein. Just as chapters in a book are identified by beginning and ending statements, different genes along a DNA strand have beginning and ending signals. They tell when to start and when to stop reading a particular gene.

Human body cells contain forty-six strands (books) of helical DNA, each containing thousands of genes (chapters). These strands are called chromosomes when they become super-coiled in preparation for cellular reproduction. Before reproduction, the DNA makes copies of the coding and non-coding strands in order to ensure that all offspring will receive the sufficient number of genes required for their survival. All replicated gene strands must possess the inherent capacity to: 

1)  Produce enzymes for the digestion of nutrients.

2)  Eliminate toxic waste from the cell interior.

3)  Repair and assemble cell parts.

4)  Reproduce healthy offspring.

5)  Respond to unfavorable conditions in the environment.

6)  Coordinate and regulate all of the essential functions of life.

                     If any of these functions is not performed properly, the cell may die.  

The processes of DNA replication should be reviewed in order to gain an understanding of the fundamentals of molecular genetics. The replication process is stunningly accurate, with only one error made for every 2 billion nucleotides. A human cell contains forty-six chromosomes consisting of about 3 billion base pairs. This averages about 1.5 errors per cell. Because this error rate is so small, DNA replication is considered to be, for all practical purposes, error-free.
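
The figure of about 1.5 errors per cell is simple arithmetic on the two numbers quoted above, as this short sketch shows (the variable names are illustrative):

```python
nucleotides_per_error = 2_000_000_000   # one error per 2 billion nucleotides
base_pairs_per_cell = 3_000_000_000     # about 3 billion base pairs

expected_errors = base_pairs_per_cell / nucleotides_per_error
print(expected_errors)  # 1.5
```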

The distribution of DNA involves splitting the cell and distributing a set of genetic blueprints to two new daughter cells. The mother cell ceases to exist when it divides its contents between the two smaller daughter cells. In this way it does not really die -- it merely starts over again.

The process of DNA transcription (scribe = to write) should also be reviewed by the student of molecular genetics. Thus, the second major function of DNA is the production of single-stranded, complementary RNA copies of the double-stranded DNA. Transcription means literally “to transfer information or data from one form to another”. Although many types of RNA are synthesized from the genes, the three most important are messenger RNA (m-RNA), transfer RNA (t-RNA) and ribosomal RNA (r-RNA). Each of these actors has its own critical role on the genetic stage.
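
Transcription can be sketched as base pairing in which RNA uses uracil (U) wherever DNA would use thymine (T). The example below builds an RNA copy complementary to an arbitrary DNA strand; strand direction and RNA processing are ignored, and the function name is an assumption made for illustration.

```python
# Pairing of each DNA base with its RNA partner (A pairs with U in RNA)
RNA_PAIRS = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(dna_strand):
    """Return the RNA strand complementary to the DNA strand being copied."""
    return "".join(RNA_PAIRS[base] for base in dna_strand)

print(transcribe("GTAATCTGA"))  # CAUUAGACU
```

The result carries the same message as the DNA strand opposite the one copied (CATTAGACT), with U in place of T.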

Messenger RNA is a mature, straight-chain copy of a gene that describes the exact sequence in which amino acids should be bonded together to form polypeptides, or proteins. Transfer RNA molecules are responsible for picking up particular amino acids and transferring them to the ribosome for protein assembly.

Ribosomal RNA is a highly coiled molecule and is used along with protein molecules in the manufacture of ribosomes. The ribosomes are located in the cytoplasm of the cell, where all 3 types of RNA (mRNA, tRNA and rRNA) are then used in concert in the synthesis of new proteins.

 

Translation as Communication

The mRNA molecule is a coded message written in the language of the nucleic acids. The information is used to assemble amino acids into protein by a process called translation. The word “translation” refers to the fact that nucleic acid language is being changed to protein language. In order for the information not to become literally “lost in translation”, a dictionary is necessary. Thus, the protein language has twenty words in the form of twenty common amino acids.

It is worth noting here that more than one codon may be used to code for the same amino acid. Such “phonetic redundancy” can prove essential for genetic communication, and thus the survival of the organism. The value of redundancy was the essential contribution to the sciences made by Claude Shannon, then at Bell Laboratories, in his classic 1948 paper “A Mathematical Theory of Communication”. Shannon’s remarkable theorem shows that codes do exist that preserve order (signal) in the face of disorder (noise), if a certain amount of redundancy is built into the message at the source.
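
The “dictionary” and its redundancy can be made concrete with the standard genetic code. The sketch below builds the codon table from a compact 64-letter string of one-letter amino acid abbreviations ("*" marks a stop signal), translates a short hypothetical mRNA, and groups codons by the amino acid they specify. The example message and the function names are assumptions made for illustration.

```python
from itertools import product
from collections import defaultdict

BASES = "UCAG"
# Standard genetic code, listed in UCAG order for all three positions
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Read codons left to right and stop at the first stop signal."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Group the 64 codons by the amino acid (or stop signal) they encode
synonyms = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    synonyms[amino_acid].append(codon)

print(translate("AUGGCUUGGUAA"))  # MAW
print(synonyms["L"])              # the six codons for leucine
```

Sixty-four codons map onto only twenty amino acids plus the stop signal, so most amino acids have several synonymous codons, and a single-letter copying error often leaves the protein unchanged.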

A sentence of English prose is a series of letters and words in a language obeying certain statistical rules. Some letters and groups of letters have a higher probability of occurring than others. Anyone who has played the game HangMan or watched the Wheel of Fortune spin knows that certain vowels such as the letter “e” are a good first guess. These rules are internally consistent, so that if a person knows the rules, the sequence is not completely unpredictable.
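
These statistical regularities can be measured. The sketch below computes the Shannon entropy of the single-letter frequencies in a short sample and compares it with the maximum that would be reached if all 26 letters were equally likely; the gap between the two is one simple measure of redundancy. The sample text is arbitrary and far too short to give a reliable estimate for English as a whole.

```python
from collections import Counter
from math import log2

def entropy_bits_per_letter(text):
    """Shannon entropy of the single-letter frequencies in text."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return -sum((n / total) * log2(n / total) for n in counts.values())

sample = "a sentence of english prose is a series of letters and words"
h = entropy_bits_per_letter(sample)
h_max = log2(26)  # entropy if all 26 letters were equally likely

print(f"{h:.2f} bits per letter, versus a maximum of {h_max:.2f}")
print(Counter(ch for ch in sample if ch.isalpha()).most_common(3))
```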

In any type of communications system, a message can be sent from one place to another without having its order thrown into disorder by noise, and be free from error to the extent that one chooses, as long as it is properly coded. The most efficient code uses the least amount of letters or words. In addition, a certain degree of redundancy is often necessary to transmit the message. But the whole purpose of sending a message or code is to continually generate ideas that were not totally predictable.
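
A very simple example of redundancy protecting a message is the repetition code sketched below: every bit is sent three times and the receiver takes a majority vote. This is far cruder than the codes used in real communications systems and is shown only to illustrate the principle; the message and the positions of the flipped bits are arbitrary.

```python
def encode(bits):
    """Send every bit three times."""
    return [bit for bit in bits for _ in range(3)]

def decode(received):
    """Recover each bit by majority vote over its group of three."""
    return [
        1 if sum(received[i:i + 3]) >= 2 else 0
        for i in range(0, len(received), 3)
    ]

message = [1, 0, 1, 1]
sent = encode(message)

# Simulate noise by flipping one bit in the first and last groups
noisy = sent.copy()
noisy[1] ^= 1
noisy[9] ^= 1

print(decode(noisy) == message)  # True
```

The price of this protection is that three bits are transmitted for every bit of information, which is why more efficient codes are preferred in practice.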

Some of the most elaborate codes were devised for the U.S. Space Program. The redundancy of these elaborately devised codes is large. For the Voyager II Mission, which in 1981 sent pictures of the rings of Saturn back to earth, the redundancy was 100 percent (one redundant bit sent for every bit that contained information). The rate of error in the Voyager II Code was one in 10,000 bits. This was not that bad for such physical phenomena as galactic temperature and background radiation -- information which itself contains a large amount of randomness, and therefore uncertainty!

 

The Origins & Language of Life

Scientists are now wondering whether nature has designed similar, redundant codes to protect the reliability of living forms. Evolution is clearly an inventive process, and has produced an immense amount of variety. On the other hand, there are limits on the amount of variety permitted by natural selection. Big breakthroughs in evolution, when really promising creatures emerged from primitive forms, may also have been when the message system in the genes optimized the amount of variety and error control.

In the course of evolution, some argue that certain living organisms acquired DNA messages which were coded in this optimal way, giving them a highly successful balance between variety and accuracy, a property also displayed by human languages. These winning creatures were the vertebrates, immensely innovative and versatile forms of life, whose arrival led to the speeding up of evolution.

In summary, if molecules carrying information copy themselves, then those which make the largest number of copies with the fewest mistakes are likely to win the competition for survival. In the first, primitive stirrings of pre-life on the planet earth, it is possible that short sequences of chemical symbols tended to survive because they had a high redundancy. Then this simple information system, with its high level of accuracy, could be expanded into something more complex by making the sequence longer.

A development in 1977 resulted in the deciphering of the entire genetic text of one of the smallest bacterial viruses. A mystery had surrounded this particular virus. It did not seem to contain enough information for making the nine different kinds of proteins which are in fact produced when the virus infects a healthy cell. When the mystery was finally unraveled, it turned out that there indeed was enough information in the viral DNA, but it was stored in an unexpectedly tricky way. The words of the text overlapped, so that more messages could be squeezed into a small space.
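
How overlapping words can pack more than one message into the same stretch of text is easy to see by reading one sequence from different starting points. The sequence below is hypothetical and is not taken from the viral genome; the function name is likewise an assumption.

```python
def read_frame(sequence, offset):
    """Read three-letter words starting at the given offset (0, 1 or 2)."""
    return [sequence[i:i + 3] for i in range(offset, len(sequence) - 2, 3)]

text = "ATGACTGAAGCTTAA"  # hypothetical sequence

for offset in range(3):
    print(offset, read_frame(text, offset))
```

Each starting point yields a completely different series of words from the same letters, so a single stretch of DNA can in principle carry more than one message.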

The discovery that this tiny virus stores information by means of DNA text so cunningly composed as to tax the ingenuity of a master anagrammatist came as a revelation. It had been thought that in the genetic code, precision and accuracy were too important to risk playing clever word games to increase the complexity of the organism. Surely, evolution would insist on messages so clear that they left room for one, and only one, interpretation.

But apparently the structure of DNA, even at such a primitive level, is more interesting than that. As Sir Frederick Sanger remarked, in one of the classic understatements of molecular biology:

“Something rather subtle seems to be at work.”

~~~~~~~~~~~~~~~~~~~~~~~~