GEDmatch DNA Upload Instructions

By Dr. Brandon Colby, MD, a physician-expert in the fields of genomics and personalized preventive medicine.

If you’re looking for one of the most popular free DNA upload sites, GEDmatch is it.

GEDmatch, founded by Curtis Rogers and John Olson in 2010, is a website where people can upload their DNA data from several different DNA testing companies. After uploading data to GEDmatch’s DNA database, a user can then collaborate with other GEDmatch users and use GEDmatch’s genetic genealogy features.

GEDmatch does not offer its own DNA testing services, but it does allow you to upload DNA test data from tests offered by other providers.

About the DNA Site GEDmatch

GEDmatch is a free genetic genealogy site that allows users to upload their autosomal DNA test results to the site’s DNA database. Users can then use the site’s tools to identify DNA matches and learn more about family history through the GEDmatch DNA database. The site analyzes X chromosome and autosomal genetic data to deliver its results, which has proven to be highly effective.

GEDmatch is known for its use by law enforcement, but many people use it for personal DNA research. Law enforcement searches are what make GEDmatch’s privacy worrisome for many people, because DNA data uploaded to the site has been used in law enforcement investigations.

The privacy concerns came to light in April 2018, when police used the DNA database to identify the suspect in the Golden State Killer case in California. GEDmatch tried to act fast to regain the confidence of users by changing its site policy and terms of service. In May 2019, it also added a privacy option that lets users opt in or opt out of police being able to access their DNA data.

The opt-in or opt-out option may have helped GEDmatch regain confidence from users, but the effect was short-lived. A few months later, in November 2019, a law enforcement officer obtained a warrant to access all of the GEDmatch DNA data. This included the data of people who had opted out of having their data searchable by law enforcement investigators.

A month later, in December 2019, Verogen purchased GEDmatch. Verogen is a forensic testing company that works with law enforcement agencies.

Despite the focus on law enforcement use, GEDmatch.com continues to be a popular genealogical tool used by genetic genealogists. The reality is that the genealogy website offers a wealth of information that can help family members find each other through the family finder tool, identify common ancestors with DNA ancestry tools, and learn more about ethnicity and family history for building family trees.

The GEDmatch admixture/heritage tool allows users to pick the project they want to compare their DNA to and each project produces different results.

When considering all of the DNA information GEDmatch has available, many people do not find the privacy concerns a deterrent. After all, the only people who should really be concerned about privacy are those involved in cold cases.

Uploading DNA Data to GEDmatch

You can upload your DNA file to the genealogy database within a few minutes. All you have to do is go to the registration page to register.

After registering, you can upload the DNA data file from the SNP-based testing company you used for testing. You first need to download that file from your DNA testing company by following its tutorial on how to download raw DNA data.

Depending on which DNA testing company you used, you will have either a file or a folder. GEDmatch will only accept “23andMe format.” This is the most popular personal genomic data format.

What GEDmatch Will Not Accept

In addition to VCF, GEDmatch will not accept full exomes or genomes. It also does not recommend uploading imputed DNA data.

GEDmatch does not accept DNA data in any of the following formats:

  • FASTA
  • FASTQ
  • BAM
  • CRAM
  • genome VCF (Variant Call Format)
  • standard VCF
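Before uploading, it can help to check whether a file's name suggests one of the formats listed above. The following is a minimal sketch, not a GEDmatch tool: the function name and the extension table are assumptions based on conventional file suffixes, and a file extension is only a hint about the actual contents.

```python
from pathlib import Path
from typing import Optional

# Conventional suffixes for the formats the list above says are not accepted.
# This mapping is an assumption for illustration, not an official GEDmatch list.
REJECTED_SUFFIXES = {
    ".fasta": "FASTA", ".fa": "FASTA",
    ".fastq": "FASTQ", ".fq": "FASTQ",
    ".bam": "BAM",
    ".cram": "CRAM",
    ".vcf": "VCF",
    ".gvcf": "genome VCF",
}
COMPRESSION_SUFFIXES = {".gz", ".bgz", ".zip"}


def rejected_format(filename: str) -> Optional[str]:
    """Return the rejected format the filename suggests, or None."""
    suffixes = [s.lower() for s in Path(filename).suffixes]
    # Ignore trailing compression suffixes so "x.vcf.gz" is treated as VCF.
    while suffixes and suffixes[-1] in COMPRESSION_SUFFIXES:
        suffixes.pop()
    if not suffixes:
        return None
    # "sample.g.vcf" is a common naming convention for a genome VCF.
    if suffixes[-2:] == [".g", ".vcf"]:
        return "genome VCF"
    return REJECTED_SUFFIXES.get(suffixes[-1])


if __name__ == "__main__":
    for name in ["genome.vcf.gz", "reads.fastq.gz", "genome_data.txt"]:
        print(name, "->", rejected_format(name))
```

A result of None only means the name does not look like one of the rejected formats; the file still has to be in 23andMe format to be accepted.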

If you have a file in one of these formats, you can upload the file for free to Sequencing.com. Once uploaded, you can then use tools such as Genome Explorer and Next-Gen Disease Screen to analyze your data for more than 15,000 diseases, traits, and medication reactions.

How to Convert a VCF to 23andMe Format

If you want to convert your VCF to 23andMe’s format, start by downloading the codebase from this GitHub repo: VCF-to-23andMe

Run the data_to_db.py script using your VCF as input. This will generate a genome.db file (python3 data_to_db.py input.vcf.gz vcf genome.db).

Then run the db_to_23.py script using the genome.db file as input. This step produces the 23andMe-format file (python3 db_to_23.py genome.db blank_v3.txt 23andMe.txt).
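The two conversion steps can also be chained from a single script. This is a minimal sketch that assumes you have a local copy of the VCF-to-23andMe repo and that the script and template names match the commands above; the helper names build_commands and convert, and the repo_dir parameter, are hypothetical and not part of that repo.

```python
import subprocess
import sys


def build_commands(vcf_path, db_path="genome.db",
                   template="blank_v3.txt", out_path="23andMe.txt"):
    """Return the two command lines from the steps above, in order."""
    return [
        # Step 1: load the VCF into an intermediate genome.db file.
        [sys.executable, "data_to_db.py", vcf_path, "vcf", db_path],
        # Step 2: write the 23andMe-format text file from genome.db.
        [sys.executable, "db_to_23.py", db_path, template, out_path],
    ]


def convert(vcf_path, repo_dir=".", **kwargs):
    """Run both steps inside repo_dir, stopping at the first failure.

    Relative paths (including vcf_path) are resolved against repo_dir,
    because the scripts are started with that directory as their cwd.
    """
    for cmd in build_commands(vcf_path, **kwargs):
        # check=True raises CalledProcessError if a script exits non-zero.
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

For example, convert("input.vcf.gz", repo_dir="VCF-to-23andMe") would run both scripts in a folder named VCF-to-23andMe, assuming the VCF has been copied there. Stopping on the first failure avoids running the second step against a missing or partial genome.db.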

Open the text file to make sure it is in the correct format. It should have # rsid chromosome position genotype at the top of the document.

The file is ready for uploading if it contains the above.
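The manual check described above can be sketched in a few lines of code. This example assumes the data rows are tab-separated with four fields (rsid, chromosome, position, genotype), which is how 23andMe raw data files are normally laid out; the function name looks_like_23andme is hypothetical, and passing this check does not guarantee that GEDmatch will accept the file.

```python
EXPECTED_COLUMNS = ["rsid", "chromosome", "position", "genotype"]


def looks_like_23andme(lines, max_rows=100):
    """Check for the '# rsid chromosome position genotype' header
    and for well-formed data rows in the first max_rows rows."""
    header_seen = False
    rows_checked = 0
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        if line.startswith("#"):
            # Comment lines are skipped; one of them must name the columns.
            if line.lstrip("#").split() == EXPECTED_COLUMNS:
                header_seen = True
            continue
        fields = line.split("\t")
        # Each data row needs four tab-separated fields and a numeric position.
        if len(fields) != 4 or not fields[2].isdigit():
            return False
        rows_checked += 1
        if rows_checked >= max_rows:
            break
    return header_seen and rows_checked > 0
```

To use it on the converted file, open the file and pass the handle in, for example: with open("23andMe.txt") as fh: print(looks_like_23andme(fh)). Only the first rows are inspected, which keeps the check quick on large files.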

Setting Up a GEDmatch Genetic Profile

When uploading raw DNA data to the genetic database, users must answer questions to complete their profile. These questions help build the kit on the site. This is when users can opt in or opt out of granting law enforcement officials access to their raw DNA data for criminal searches.

Keep in mind that even if users opt out of granting access, a court can grant a law enforcement officer access to GEDmatch data. This usually happens in cold cases and violent crime investigations, as in the case of the Golden State Killer.

In other words, while the information may not be easily accessible, it can still be accessed if it is needed to match DNA collected at a crime scene.

It’s also important to know that even if you were not involved in a crime, you could lead police to distant relatives simply because your DNA partially matches what was collected at the crime scene.

DNA Kit Numbers: How It Works

Every profile and kit has a number. GEDmatch kit numbers are unique identifiers that every uploaded DNA kit receives. A kit number is not the same as the number your DNA testing company uses to label your DNA results.

This kit number is what you will use whenever using your DNA data with the GEDmatch site. It is attached to each person’s DNA profile.

Tier 1 GEDmatch Tools for Members

GEDmatch is mostly free, but the GEDmatch Tier 1 tools require a subscription. This subscription costs $10 per month. If you are interested in genetic genealogy, this cost may be worthwhile.


About The Author

Dr. Brandon Colby MD is a US physician specializing in the personalized prevention of disease through the use of genomic technologies. He’s an expert in genetic testing, genetic analysis, and precision medicine. Dr. Colby is also the Founder of Sequencing.com and the author of Outsmart Your Genes.

Dr. Colby holds an MD from the Mount Sinai School of Medicine, an MBA from Stanford University’s Graduate School of Business, and a degree in Genetics with Honors from the University of Michigan. He is an Affiliate Specialist of the American College of Medical Genetics and Genomics (ACMG), an Associate of the American College of Preventive Medicine (ACPM), and a member of the National Society of Genetic Counselors (NSGC).

© 2024 Sequencing.com