By Dr. Brandon Colby MD, a physician-expert in the fields of Genomics and Personalized Preventive Medicine.
If you’re looking for one of the most popular free DNA upload sites, GEDmatch is it.
GEDmatch, founded by Curtis Rogers and John Olson in 2010, is a website where people can upload their DNA data from several different DNA testing companies. After uploading data to GEDmatch’s DNA database, a user can then collaborate with other GEDmatch users and use GEDmatch’s genetic genealogy features.
GEDmatch does not offer its own DNA testing services but does allow you to upload your DNA test data from testing offered by any of the following test providers:
GEDmatch is a free genetic genealogy site. After uploading autosomal DNA test results to the site’s DNA database, users can use the site’s tools to identify DNA matches and learn more about their family history. The site analyzes both autosomal and X chromosome data to produce its results, an approach that has proven highly effective at identifying relatives.
GEDmatch is known for its use by law enforcement, but many people use it for personal DNA research. Law enforcement searches are what make GEDmatch’s privacy practices worrisome for many people, because DNA data uploaded to the site has been used in criminal investigations.
The privacy concerns came to light in April 2018, when police used the DNA database to identify the suspect in the Golden State Killer case in California. GEDmatch tried to regain the confidence of users by changing its site policy and terms of service. In May 2019, it added a privacy option that lets users opt in to or opt out of law enforcement access to their DNA data.
The opt-in or opt-out option may have helped GEDmatch regain confidence from users, but the effect was short-lived. In November 2019, news broke that a law enforcement officer had obtained a warrant to search all of the GEDmatch DNA data, including data from people who had opted out of having their data searchable by law enforcement investigators.
A month later, in December 2019, Verogen purchased GEDmatch. Verogen is a forensic testing company that works with law enforcement agencies.
Despite the focus on law enforcement use, GEDmatch.com continues to be a popular genealogical tool used by genetic genealogists. The genealogy website offers a wealth of information that can help family members find each other through the family finder tool, identify common ancestors with DNA ancestry tools, and learn more about ethnicity and family history for building family trees.
The GEDmatch admixture/heritage tool allows users to pick the project they want to compare their DNA to, and each project produces different results.
When considering all of the DNA information GEDmatch has available, many people do not find the privacy concerns a deterrent. After all, the only people who should really be concerned about privacy are those involved in cold cases.
You can upload your DNA file to the genealogy database within a few minutes. All you have to do is go to the registration page to register.
After registering, you can upload your DNA data file from the SNP-based testing company you used for testing. You can download your data from your DNA testing company by following the links below to tutorials on how to download raw DNA data.
Depending on which DNA testing company you used, you will have either a file or a folder. GEDmatch will only accept the “23andMe format,” which is the most popular personal genomic data format.
In addition to VCF files, GEDmatch will not accept full exome or whole genome data. It also does not recommend uploading imputed DNA data.
GEDmatch does not accept DNA data in any of the following formats:
- genome VCF (Variant Call Format)
- standard VCF
If you have DNA data in one of the formats above, you can use the Ultimate Compatibility app at Sequencing.com to convert your file into an Ultimate Compatibility File, which is compatible with GEDmatch and most other third-party sites that allow DNA data uploads.
To automatically convert your DNA data file to an Ultimate Compatibility File, start by uploading your DNA data and then use the Ultimate Compatibility app.
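If you are not sure whether your file is a VCF, a quick check can tell you before you try to upload it. The sketch below relies on the fact that a VCF file begins with a "##fileformat=VCF" line. The function names open_text and looks_like_vcf are made up for this illustration and are not part of GEDmatch or any testing company’s software.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    with path.open("rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"  # gzip files start with these two bytes
    if is_gzip:
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return path.open("r", encoding="utf-8", errors="replace")


def looks_like_vcf(path):
    """Return True if the first non-empty line is a VCF file-format line."""
    with open_text(path) as fh:
        for line in fh:
            if line.strip():
                return line.startswith("##fileformat=VCF")
    return False  # empty file
```

Calling looks_like_vcf("input.vcf.gz") on your own file returns True for a VCF, in which case the file needs to be converted before GEDmatch will accept it.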
If you want to convert your VCF to 23andMe’s format, start by downloading the codebase from this GitHub repo: VCF-to-23andMe
Run the data_to_db.py script using your VCF as input. This will generate a genome.db file:
    python3 data_to_db.py input.vcf.gz vcf genome.db
Then run the db_to_23.py script using the genome.db file as input. This is what produces the 23andMe format:
    python3 db_to_23.py genome.db blank_v3.txt 23andMe.txt
Open the text file to make sure it is in the correct format. It should have the header line “# rsid chromosome position genotype” at the top of the document.
If it does, the file is ready for uploading.
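For a large file, checking by eye is tedious, so the header check described above can also be scripted. The following is a minimal sketch, not an official validator. It assumes that comment lines start with "#" and that each data row has four tab-separated fields (rsid, chromosome, position, genotype). The function name check_23andme_text and the max_rows parameter are hypothetical.

```python
EXPECTED_COLUMNS = ["rsid", "chromosome", "position", "genotype"]


def check_23andme_text(text, max_rows=1000):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    header_found = False
    rows_checked = 0
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            # Comment line: compare its words to the expected column names.
            if line.lstrip("#").split() == EXPECTED_COLUMNS:
                header_found = True
            continue
        # Data row (assumed layout): four tab-separated fields.
        fields = line.split("\t")
        if len(fields) != 4:
            problems.append(
                f"line {number}: expected 4 tab-separated fields, found {len(fields)}"
            )
        elif not fields[2].isdigit():
            problems.append(f"line {number}: position is not a number")
        rows_checked += 1
        if rows_checked >= max_rows:
            break  # only sample the first rows of a large file
    if not header_found:
        problems.append("missing '# rsid chromosome position genotype' header line")
    if rows_checked == 0:
        problems.append("no data rows found")
    return problems
```

To use it, read your converted file into a string, for example with open("23andMe.txt").read(), and pass the result to check_23andme_text. An empty list means the header and the sampled rows match the assumed layout. It does not guarantee that GEDmatch will accept the file.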
When uploading raw DNA data to the genetic database, users must answer questions to complete their profile. These questions help build the kit on the site. This is when users can opt in to or opt out of granting law enforcement officials access to their raw DNA data for criminal searches.
Keep in mind that even if users opt out of granting access, a court can grant law enforcement access to GEDmatch data. This usually happens in cold cases, such as the Golden State Killer case, and in violent crime investigations.
In other words, while the information may not be easily accessible, it can still be accessed if it is needed to find DNA matches to DNA collected at a crime scene.
It’s also important to know that even if you have not been involved in a crime, you could lead police to distant relatives simply because your DNA matches what was collected at the crime scene.
Every profile and kit has a number. GEDmatch kit numbers are unique identifiers that every uploaded DNA kit receives. A kit number is not the same as the number your DNA testing company assigns to your DNA results.
This kit number is what you will use whenever using your DNA data with the GEDmatch site. It is attached to each person’s DNA profile.
GEDmatch is mostly free, but the GEDmatch Tier 1 tools require a subscription, which costs $10 per month. If you are interested in genetic genealogy, this cost may be worthwhile.
Dr. Brandon Colby MD is a US physician specializing in the personalized prevention of disease through the use of genomic technologies. He’s an expert in genetic testing, genetic analysis, and precision medicine. Dr. Colby is also the Founder of Sequencing.com and the author of Outsmart Your Genes.
Dr. Colby holds an MD from the Mount Sinai School of Medicine, an MBA from Stanford University’s Graduate School of Business, and a degree in Genetics with Honors from the University of Michigan. He is an Affiliate Specialist of the American College of Medical Genetics and Genomics (ACMG), an Associate of the American College of Preventive Medicine (ACPM), and a member of the National Society of Genetic Counselors (NSGC).