SNiPs: Single Nucleotide Polymorphism

A single nucleotide polymorphism, or SNP (pronounced "snip"), is a DNA sequence variation in which a single nucleotide (A, T, C, or G) in the genome (or other shared sequence) differs between members of a species, or between the paired chromosomes of an individual. For example, two sequenced DNA fragments from different individuals, AAGCCTA and AAGCTTA, differ at a single nucleotide. In this case we say that there are two alleles: C and T. Almost all common SNPs have only two alleles.
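The comparison of the two example fragments can be sketched in a few lines of Python. This is only an illustration: it assumes the fragments are already aligned and of equal length, and the function name is hypothetical rather than taken from any genomics library.

```python
def find_single_nucleotide_differences(seq_a, seq_b):
    """Return (position, base_a, base_b) for every mismatch.

    Assumes the two sequences are already aligned and equally long.
    Positions are 1-based, as is usual when describing sequences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (position, base_a, base_b)
        for position, (base_a, base_b) in enumerate(zip(seq_a, seq_b), start=1)
        if base_a != base_b
    ]


# The two fragments from the text differ only at position 5 (C versus T).
print(find_single_nucleotide_differences("AAGCCTA", "AAGCTTA"))
# [(5, 'C', 'T')]
```

Real SNP discovery also has to handle alignment, sequencing errors, and insertions or deletions, none of which this sketch attempts.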

Within a population, SNPs can be assigned a minor allele frequency: the proportion of chromosomes in the population that carry the less common variant. Allele frequencies vary between human populations, so a SNP allele that is common in one geographical or ethnic group may be much rarer in another. In the past, only single nucleotide polymorphisms with a minor allele frequency of at least 1% (or 0.5%, etc.) were given the title "SNP", an unwieldy definition. With the advent of modern bioinformatics and a better understanding of evolution, this definition is no longer necessary.
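As a rough illustration, the minor allele frequency can be computed from a list of observed alleles, one per sampled chromosome. The sketch below takes the minor allele frequency in its usual sense, the fraction of all sampled chromosomes that carry the less common allele. The sample data and the function name are hypothetical.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Fraction of sampled chromosomes carrying the less common allele.

    `alleles` holds one allele per sampled chromosome, e.g. "C" or "T".
    Only biallelic sites are handled in this sketch.
    """
    counts = Counter(alleles)
    if len(counts) != 2:
        raise ValueError("expected exactly two alleles at this site")
    return min(counts.values()) / sum(counts.values())


# Hypothetical sample of 20 chromosomes: 17 carry C and 3 carry T.
sample = ["C"] * 17 + ["T"] * 3
print(minor_allele_frequency(sample))  # 3 of 20 chromosomes carry T
```

Running the same function on samples drawn from different populations would show how a variant can be common in one group and rare in another.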

Single nucleotide polymorphisms may fall within coding sequences of genes, non-coding regions of genes, or the intergenic regions between genes. SNPs within a coding sequence will not necessarily change the amino acid sequence of the protein that is produced, because the genetic code is degenerate. A SNP in which both forms lead to the same polypeptide sequence is termed synonymous (sometimes called a silent mutation); if a different polypeptide sequence is produced, the SNP is non-synonymous. SNPs that are not in protein-coding regions may still have consequences for gene splicing, transcription factor binding, or the sequence of non-coding RNA.
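The distinction between synonymous and non-synonymous changes can be made concrete with the standard genetic code. The sketch below builds the standard codon table (DNA codons, bases ordered T, C, A, G) and compares the amino acids encoded by two codons. The function name and the example codons are illustrative choices, not taken from the text.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(reference_codon, variant_codon):
    """Label a codon change as 'synonymous' or 'non-synonymous'."""
    same = CODON_TABLE[reference_codon] == CODON_TABLE[variant_codon]
    return "synonymous" if same else "non-synonymous"


# GAA and GAG both encode glutamate, so the change is silent.
print(classify_coding_snp("GAA", "GAG"))  # synonymous
# GAA encodes glutamate but GAT encodes aspartate.
print(classify_coding_snp("GAA", "GAT"))  # non-synonymous
```

The sketch assumes the reading frame is known and ignores effects outside translation, such as changes to splicing.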

By the traditional definition, a SNP variant occurs in more than 1 percent of the human population. Because only about 3 to 5 percent of a person's DNA sequence codes for the production of proteins, most SNPs are found outside of "coding sequences". SNPs found within a coding sequence are of particular interest to researchers because they are more likely to alter the biological function of a protein. Advances in technology, coupled with the unique ability of these genetic variations to facilitate gene identification, have led to a recent flurry of SNP discovery and detection.

As a result of recent advances in SNP research, diagnostics for many diseases may improve. Finding single nucleotide changes in the human genome seems like a daunting prospect, but over the last 20 years biomedical researchers have developed a number of techniques that make it possible to do just that. Each technique uses a different method to compare selected regions of a DNA sequence obtained from multiple individuals who share a common trait. In each test, the result shows a physical difference in the DNA samples only when a SNP is detected in one individual and not in the other.

Many common diseases in humans are not caused by a genetic variation within a single gene but are influenced by complex interactions among multiple genes as well as environmental and lifestyle factors. Although both environmental and lifestyle factors add tremendously to the uncertainty of developing a disease, it is currently difficult to measure and evaluate their overall effect on a disease process. Therefore, we refer here mainly to a person's genetic predisposition, or the potential of an individual to develop a disease based on genes and hereditary factors.

Genetic factors may also confer susceptibility or resistance to a disease and determine the severity or progression of disease. Because we do not yet know all of the factors involved in these intricate pathways, researchers have found it difficult to develop screening tests for most diseases and disorders. By studying stretches of DNA that have been found to harbor a SNP associated with a disease trait, researchers may begin to reveal relevant genes associated with a disease. Defining and understanding the role of genetic factors in disease will also allow researchers to better evaluate the role that non-genetic factors (such as behavior, diet, lifestyle, and physical activity) play in disease.

Because genetic factors also affect a person's response to drug therapy, DNA polymorphisms such as SNPs will be useful in helping researchers determine and understand why individuals differ in their abilities to absorb or clear certain drugs, as well as to determine why an individual may experience an adverse side effect to a particular drug. Therefore, the recent discovery of SNPs promises to revolutionize not only the process of disease detection but the practice of preventative and curative medicine.
 
It will only be a matter of time before physicians can screen patients for susceptibility to a disease by analyzing their DNA for specific SNP profiles. Each person's genetic material contains a unique SNP pattern that is made up of many different genetic variations. Researchers have found that most SNPs are not responsible for a disease state. Instead, they serve as biological markers for pinpointing a disease on the human genome map, because they are usually located near a gene found to be associated with a certain disease. Occasionally, a SNP may actually cause a disease and, therefore, can be used to search for and isolate the disease-causing gene.

To create a genetic test that will screen for a disease in which the disease-causing gene has already been identified, scientists collect blood samples from a group of individuals affected by the disease and analyze their DNA for SNP patterns. Next, researchers compare these patterns to patterns obtained by analyzing the DNA from a group of individuals unaffected by the disease. This type of comparison, called an "association study", can detect differences between the SNP patterns of the two groups, thereby indicating which pattern is most likely associated with the disease-causing gene. Eventually, SNP profiles that are characteristic of a variety of diseases will be established. Then, it will only be a matter of time before physicians can screen individuals for susceptibility to a disease just by analyzing their DNA samples for specific SNP patterns.
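One simple way to compare the two groups at a single SNP is an allelic chi-square test on a 2x2 table of allele counts. The sketch below is one common choice, not necessarily the method used in any particular association study. The allele counts are invented for illustration and the function name is hypothetical. The p-value uses the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math


def allelic_chi_square(case_minor, case_major, control_minor, control_major):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele table.

    Returns (chi_square_statistic, p_value). No continuity correction.
    """
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one count")
    chi_square = n * (a * d - b * c) ** 2 / denominator
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical counts: 100 affected and 100 unaffected individuals,
# two chromosomes each, so 200 alleles per group.
chi_square, p_value = allelic_chi_square(
    case_minor=60, case_major=140, control_minor=30, control_major=170
)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.5f}")
```

In a study that tests many SNPs at once, the significance threshold would need to be adjusted for multiple comparisons, which this sketch does not do.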

Using SNPs to study the genetics of drug response will help in the creation of "personalized" medicine. As mentioned earlier, SNPs may also be associated with the absorption and clearance of therapeutic agents. Currently, there is no simple way to determine how a patient will respond to a particular medication. A treatment proven effective in one patient may be ineffective in others. Worse yet, some patients may experience an adverse immunologic reaction to a particular drug. Today, pharmaceutical companies are limited to developing agents to which the "average" patient will respond. As a result, many drugs that might benefit a small number of patients never make it to market.

In the future, the most appropriate drug for an individual could be determined in advance of treatment by analyzing a patient's SNP profile. The ability to target a drug to those individuals most likely to benefit, referred to as "personalized medicine", would allow pharmaceutical companies to bring many more drugs to market and allow doctors to prescribe individualized therapies specific to a patient's needs.
 
Most SNPs are not responsible for a disease state. Instead, they serve as biological markers for pinpointing a disease on the human genome map. Because SNPs occur frequently throughout the genome and tend to be relatively stable genetically, they serve as excellent biological markers. Biological markers are segments of DNA with an identifiable physical location that can be easily tracked and used for constructing a chromosome map that shows the positions of known genes, or other markers, relative to each other. These maps allow researchers to study and pinpoint traits resulting from the interaction of more than one gene. NCBI plays a major role in facilitating the identification and cataloging of SNPs through its creation and maintenance of the public SNP database (dbSNP). This powerful genetic tool may be accessed by the biomedical community worldwide and is intended to stimulate many areas of biological research, including the identification of the genetic components of disease.

Thus, SNPs are small genetic changes, single base nucleotides in DNA (individual A, T, G, or C), that vary among individuals. Human populations are estimated to be 99 percent identical at the level of genetic sequence. Diversity arises from the remaining 1 percent variation, most of which is accounted for by SNPs (although a small percentage is due to deletions or insertions of DNA). There are estimated to be approximately 10 million SNPs in the human genome. They are found, on average, every 100 to 300 base pairs in the 3-billion-base pair genome, although their density varies between regions. SNPs are found in both coding and non-coding regions, and the majority of them (two thirds) are substitutions of thymine (T) for cytosine (C). They are relatively stable evolutionarily, and are therefore useful in population studies. Also, because SNPs are distributed more or less evenly throughout the human genome, they can serve as helpful landmarks in the construction of genetic maps.

Most SNPs are silent -- that is, they exert no discernible effect on gene function or phenotype. They can, however, have important consequences for individual susceptibility to disease and reactions to medical treatment. One of the better-known associations of SNPs with disease involves the apolipoprotein E (APOE) gene: its E4 allele is associated with a higher risk of developing Alzheimer's disease than the E2 allele. SNPs in the genes BRCA1 (breast cancer gene 1) and BRCA2 (breast cancer gene 2) that inactivate these tumor suppressors occur in five percent of all breast cancer cases and also put carriers at risk for developing ovarian cancer. The lifetime breast cancer risk for women who carry such genetic mutations is in the range of 50-80 percent.

In addition to changes in single genes that affect disease risk, it is thought that particular combinations of SNPs located across multiple genes contribute to a predisposition to developing medical conditions. SNPs are also believed to underlie individual variation in response to medical treatments. An understanding of the genetic basis for drug response, usually referred to as pharmacogenomics, would have important clinical implications. By being able to predict how different individuals are likely to react to different drugs, a physician could tailor treatment to a specific patient's genetic profile, thus maximizing therapeutic benefit and minimizing hazardous side effects.

Currently, a vast literature exists reporting possible associations between SNPs and diseases. Some links are supported by multiple reports; other associations are plagued by conflicting reports possibly due to false positives, false negatives or true variations among the populations studied. Those associations with high penetrance -- ones that confer a relatively high disease risk of 50 to 80 percent -- have been most clearly defined. The situation for those with a lower penetrance is considerably murkier.

A recent study published in Nature examined the impact of possible false positives on the literature. In the study, a collaborative team spearheaded by investigators at the Whitehead Institute used a meta-analysis approach to examine 300 published studies covering 25 reported associations. The studies selected were follow-up reports that attempted to corroborate an association. If the initial reports had all been false positives, by chance only 5 percent, or 15, of the follow-up studies should contain statistically significant data. In contrast, the meta-analysis showed that a much higher number, 59 (about 20 percent), of the replication studies were significant. Therefore, while false positives no doubt occur, many associations in the literature can in fact be replicated.
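The reasoning behind these figures can be checked with a small binomial calculation. The sketch below assumes that the 300 follow-up studies are independent and that each has a 5 percent chance of reaching significance when no true association exists. These are simplifying assumptions made for illustration, not a description of the method the study's authors used. The function name is hypothetical.

```python
import math


def binomial_upper_tail(n, k, p):
    """Probability of at least k successes in n independent trials."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )


n_studies = 300        # follow-up studies examined, from the text
false_positive = 0.05  # chance of significance with no real association
observed = 59          # significant follow-up studies, from the text

expected = n_studies * false_positive
tail = binomial_upper_tail(n_studies, observed, false_positive)

print(f"expected by chance: about {expected:.0f} studies")
print(f"observed fraction: {observed / n_studies:.1%}")
print(f"P(at least {observed} significant by chance) = {tail:.2e}")
```

Under these assumptions the chance of seeing 59 or more significant results is vanishingly small, which is why the excess over 15 is read as evidence that some of the associations are real.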

Investigators in the field are advocating approaches to SNP disease association studies that take into account function and biological plausibility and make use of large sample sizes that will increase the power to detect associations with small magnitude effects.