DNA Sequencing

Genotyping, Sequencing

Published

November 18, 2023

DNA, short for deoxyribonucleic acid, is the genetic code inherited from your mother and father that exists in almost every cell in your body. The code is made up of about 3 billion bases, or letters, each one an A, T, G, or C (adenine, thymine, guanine, and cytosine). Together they spell out the personal biological instructions that tell your body how to function over time. You are born with this code, and it stays essentially the same throughout your life. Genotyping is the process of determining which genetic variants an individual possesses, while sequencing is a method used to determine the exact order of bases along a given stretch of DNA.

When someone has their “DNA sequenced,” it almost always means one of four things (ref).

Genotyping vs. Sequencing

To explain the difference between the technologies that read DNA, think of a book. Imagine that the letters making up your genetic code are like words on a page, telling a story chapter by chapter. Genotyping is like reading a few scattered words on a page. Sequencing reads whole sentences, paragraphs, and chapters. In short, genotyping gives you small packets of data to compare, while sequencing gives you more data, with more meaning and context, today and down the road.

Genotyping looks for information at specific places in the DNA where we know important data will be. Microarrays (or “arrays”, for short) are just one approach to genotyping, but they have paved the way for understanding how common variations in our DNA may be associated with health conditions like diabetes and heart disease. However, while identifying a specific point in the DNA (a “word” on the page) is incredibly important, it also signals to researchers that something in that “paragraph” could be even more significant or provide context. This is the equivalent of reading the word “knife” on a page, yet not knowing whether the book is a cookbook or a murder mystery. This is what genotyping is good at: finding what we know today, where we know it will be. That’s a great tactic if you know what you are looking for. But what about what we don’t know? And what about context?
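To make the “scattered words” idea concrete, here is a minimal Python sketch of genotyping as a lookup at a handful of known positions. The sequence, the positions, and the `genotype` helper are all made up for illustration; a real array measures hundreds of thousands of sites chemically rather than by indexing a string.

```python
# Toy DNA sequence for one individual (hypothetical data, 0-based positions).
sample = "ATGCGAACGTGA"

# Hypothetical panel of known variant sites:
# position -> (reference base, known variant base)
known_sites = {
    3: ("C", "T"),
    8: ("C", "G"),
}

def genotype(sequence, sites):
    """Check only the listed positions; every other base is ignored."""
    calls = {}
    for pos, (ref_base, alt_base) in sites.items():
        base = sequence[pos]
        if base == ref_base:
            calls[pos] = "reference"
        elif base == alt_base:
            calls[pos] = "variant"
        else:
            calls[pos] = "other"
    return calls

calls = genotype(sample, known_sites)
print(calls)  # {3: 'reference', 8: 'variant'}
```

Note that the sketch never looks at position 5, so any change there would go unseen. That is the limitation described above: genotyping only reports on the sites it was designed to check.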

Sequencing looks at all the letters, in the order they are spelled out in your DNA. In some cases, it looks only at a gene, a stretch of DNA that has the instructions for a specific protein. In other cases, it looks at the entire sequence, all 3 billion or so letters. Significant advances in technology have drastically decreased the time and labor costs of sequencing. The ability to read the entire sequence of genes faster and more cheaply through next-generation sequencing (NGS) allows us to see beyond the commonly known variations in your DNA, enabling scientists to identify more of the unique variations from person to person. In turn, this leads to deeper discovery about the genetic underpinnings of your health. Without sequencing, we simply miss these nuances of the story. We only know, “there is an apple in this book”. This is what sequencing is good at: uncovering the context, meaning, detail, and granularity.
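By contrast, reading every letter can be sketched as a position-by-position comparison against a reference. This is a deliberately simplified illustration: the two sequences and the `find_differences` helper are hypothetical, and real sequencing pipelines work from many short reads that must be aligned first and must also handle insertions and deletions.

```python
# Hypothetical reference sequence and one individual's sequence (same length).
reference = "ATGCGTACCTGA"
sample = "ATGCGAACGTGA"

def find_differences(ref, seq):
    """Return (position, reference base, sample base) for every mismatch."""
    if len(ref) != len(seq):
        raise ValueError("this sketch only handles equal-length sequences")
    return [(i, r, s) for i, (r, s) in enumerate(zip(ref, seq)) if r != s]

differences = find_differences(reference, sample)
print(differences)  # [(5, 'T', 'A'), (8, 'C', 'G')]
```

Because every position is compared, the result includes a difference at position 5 even though nobody listed that position in advance.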

WGS vs WES

Exome

An exome is the sequence of all the exons in a genome, reflecting the protein-coding portion of a genome. In humans, the exome is about 1.5% of the genome.

Proteins are encoded by genes, and genes consist of two major components: exons and introns. Exons contain the nucleotides that directly encode proteins, whereas introns are stretches of DNA between the exons that do not encode proteins. The entire collection of exons from all the genes in a genome is called an exome. In the case of the human genome, the exome corresponds to only about 1.5% of the genome’s roughly 3 billion nucleotides. Genome scientists have developed laboratory methods that allow them to sequence just a genome’s exome; in other words, just the part of the genome that directly encodes proteins. So, you will often hear genomicists talk about an exome sequence, which is a very small part of the overall genome (or whole-genome) sequence.
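The relationship between exons, introns, and the exome can be sketched with a toy gene. The segment sequences below are invented for illustration and are far shorter than real exons and introns; only the 1.5% and 3 billion figures come from the text above.

```python
# Hypothetical toy gene, written as alternating exon and intron segments.
segments = [
    ("exon", "ATGGCC"),
    ("intron", "GTAAGTCCTAG"),
    ("exon", "GATTGC"),
    ("intron", "GTCTTTACAG"),
    ("exon", "AAATAA"),
]

# The full gene is every segment joined in order.
gene = "".join(seq for _, seq in segments)

# The "exome" of this one-gene toy genome is the exons only, joined in order.
coding = "".join(seq for kind, seq in segments if kind == "exon")

toy_fraction = len(coding) / len(gene)
print(len(gene), len(coding))  # 39 18

# Human scale, using the figures quoted in the text:
# about 1.5% of roughly 3 billion nucleotides (integer arithmetic).
genome_size = 3_000_000_000
exome_bases = genome_size * 15 // 1000
print(exome_bases)  # 45000000
```

At human scale, then, an exome sequence covers on the order of 45 million of the roughly 3 billion nucleotides.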

When to Use Whole-Genome or Whole-Exome Sequencing

The complete genomic information within a sample or individual is known as the whole genome. Exons are the genome’s protein-coding regions and are collectively known as the exome. Although the exome makes up only a small proportion of the whole genome (approximately 1.5% to 2%), it contains most known disease-related variants.

Use WGS when you need to:
Analyze the whole genome, including coding, non-coding, and mitochondrial DNA
Discover novel genomic variants (structural, single nucleotide, insertion-deletion, copy number)
Identify previously unknown variants for future targeted studies

Use WES when you need to:
Increase throughput capabilities
Optimize cost per sample
Analyze manageable data sets and maximize data storage

Note: Both exons and introns are also present in untranslated regions (UTRs) and non-coding RNAs. Misuse of the term exon is problematic. For example, “whole-exome sequencing” technology targets <25% of the human exome, primarily regions that are protein coding. Source: “Not all exons are protein coding: Addressing a common misconception”.