Rare Disease Day Sale | Up to 75% off + free shipping
main logo
Search
loading...

What is Whole Genome Sequencing? | What is WGS?

Whole genome sequencing is a genetic testing technology that reads 100% of your DNA, covering every gene and all of your chromosomes. Other DNA tests read only a small part of your DNA. Single gene tests (PCR) look at one gene, and genotyping microarrays read scattered spots across the genome. Whole genome sequencing reads all of it.

What is WGS?

WGS is the abbreviation for whole genome sequencing. The two terms mean the same thing and are used interchangeably.

WGS reads every chromosomal coordinate, beginning at the start of chromosome 1 and continuing to its end, then chromosome 2, chromosome 3, and so on. It also fully sequences the sex chromosomes (X, and Y in males) and the mitochondrial chromosome.

This is what separates WGS from the tests used by companies such as MyHeritage, 23andMe, and FamilyTreeDNA. Those companies use genotyping with DNA microarrays, which reads data only at selected spots in the genome. For example, a typical 23andMe test reads roughly 700,000 data points, while whole genome sequencing reads around 3 billion data points, which is 100% of the genome. Put simply, many direct to consumer genetic tests read less than 0.1% of your DNA.

What does 30x WGS mean?

You may have seen WGS written with a number and an x, such as 0.4x or 30x. That number is the depth of sequencing, meaning how many times the genome was read. 30x WGS means the genome was sequenced 30 times, and the data from those reads is combined and analyzed together.

Sequencing a genome only once can leave errors in the data. Reading it many times lets the analysis software compare copies and filter those errors out. Here is a simple way to picture it. Imagine copying a dictionary by hand from the first page to the last. No matter how careful you are, you will make some mistakes. Now imagine copying it 30 times. If you made an error on page 488 in one copy, the other 29 copies almost certainly got that page right, so a computer comparing all 30 can identify and discard the error.

A genome contains about 6 billion letters. Even a sequencing technology that is 99.9998% accurate will still produce an error roughly 0.0001% of the time, which works out to around 6,000 errors across a single read of the genome. Sequencing the genome 30 separate times allows the software to resolve those errors and produce one highly accurate sequence. That accuracy is why 30x is the standard for clinical grade results.

If 30x means reading the genome 30 times, what does 0.4x mean? We cover that in a separate article, What is 0.4x and 30x Whole Genome Sequencing?

How is whole genome sequencing done?

First, a DNA sample is collected. It contains DNA from the chromosomes as well as DNA from the mitochondria. The DNA is broken into smaller fragments, and a sequencing instrument reads the order of the nucleotide bases (A, T, C, and G) in each fragment.

Once the fragments are read, they are reassembled into the correct order using bioinformatics. This step is called assembly. With the assembled genome in hand, researchers, physicians, and citizen scientists can analyze it further, including looking at whether specific genetic variations are associated with health conditions.

There are several sequencing methods. Whole genome sequencing and whole exome sequencing are the two most widely used in healthcare and research. They should not be confused, because whole exome sequencing reads less than 1% of the genome, while whole genome sequencing reads all of it. Sequencing technologies include Illumina dye sequencing, SMRT sequencing, pyrosequencing, and newer approaches such as nanopore sequencing, which passes a strand of DNA through a protein pore to read it directly.

How big is one person's whole genome?

It is large. A single genome contains roughly 6 billion letters, and once sequenced it is stored as data files such as FASTQ, BAM, and VCF. A single person's whole genome file can exceed 300 GB, which is larger than the storage on many computers.

Because of that size, most people store their genome in the cloud rather than on a personal device. Sequencing is built for exactly this. It provides safe, confidential storage of your whole genome so you and your healthcare providers can securely access it and continue to get value from it over time.

Why whole genome sequencing matters

The first human genome was completed in 2003 through the Human Genome Project, an international research effort that cost about 2.7 billion dollars and took 15 years. Today the same sequencing can be done in a matter of weeks at a tiny fraction of that cost, which has made whole genome sequencing accessible to far more people.

This matters because whole genome sequencing is one of the most powerful tools available for understanding genetic contributions to health, including variants linked to cancer, heart disease, and medication response, as well as for tracking infectious disease outbreaks. Public health agencies, including the US Food and Drug Administration, use genomic sequencing to identify pathogens involved in foodborne illness outbreaks.

It is also increasingly used to guide personalized treatment for patients with cancer and certain genetic conditions, and it has become a practical option for consumers and citizen scientists who want a complete view of their own health related genetics.

Where to get whole genome sequencing

Sequencing offers clinical grade 30x whole genome sequencing that you can order directly. You can review the service and current pricing on the Whole Genome Sequencing page.

Store, analyze, and understand your genome with Sequencing

Once you have a whole genome sequence, you can upload it to Sequencing, store it for free, and analyze it with apps in the Partner Marketplace. Because your full genome is available, your reports can draw on 100% of your DNA and all of your over 30,000 genes across areas such as health, nutrition, and fitness. From there you can use your results to outsmart your genes.