Clinical+ VCF | Sequencing.com

Clinical+ VCF Format

The Clinical+ VCF format eliminates the ambiguity of standard VCFs while providing comprehensive information in a single file.

The Clinical+ VCF format was created by Sequencing.com for use in clinical applications. Clinical+ VCFs resolve the ambiguity that exists in standard VCF data while still maintaining a manageable file size.

Standard VCF files include data for a specific chromosomal coordinate only if a variant (alt) allele is detected. Ambiguity arises, however, when a chromosomal coordinate is not included in a VCF, because this could mean either that the reference allele was detected or that there was a no-call at that coordinate. This ambiguity is not acceptable for clinical applications because the difference between a reference allele and a no-call can have a significant impact on the interpretation of the data.

Clinical+ VCF resolves this limitation by explicitly recording each no-call and indicating its cause, such as a low genotype quality score, low coverage or conflicting prediction. A coordinate listed in a Clinical+ VCF can therefore have one of two possible results: a variant (alt) allele was detected, or there was a no-call. If a chromosomal coordinate is not listed, the call was the same as the reference allele.
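The lookup rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the `classify_site` helper, the use of `./.` as the no-call genotype and the `LowCoverage` filter label are all hypothetical, and the actual Clinical+ VCF encoding may differ.

```python
from typing import Dict, Tuple

# Hypothetical, simplified records keyed by (chromosome, position).
# Assumption: a no-call carries genotype "./." and its cause in FILTER.
Record = Dict[str, str]


def classify_site(records: Dict[Tuple[str, int], Record], chrom: str, pos: int) -> str:
    """Return 'variant', 'no-call (<cause>)' or 'reference' for one coordinate."""
    rec = records.get((chrom, pos))
    if rec is None:
        # Not listed: the call matched the reference allele.
        return "reference"
    if rec["GT"] == "./.":
        # Listed as a no-call, with the cause recorded alongside it.
        return "no-call (" + rec["FILTER"] + ")"
    return "variant"


records = {
    ("chr1", 1000): {"GT": "0/1", "FILTER": "PASS"},
    ("chr1", 2000): {"GT": "./.", "FILTER": "LowCoverage"},
}

for pos in (1000, 2000, 3000):
    print(pos, classify_site(records, "chr1", pos))
```

With a standard VCF, the third lookup could not distinguish a reference call from a no-call. Here the absence of a record is itself informative.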

While similar to the gVCF (Genome VCF) format, Clinical+ VCFs are considerably smaller. gVCFs include three possible results (variant, reference, no-call), but the 'reference' result is extraneous and can be safely excluded, since that information can be obtained from the reference genome. As the majority of calls in a gVCF are reference, excluding reference calls from a Clinical+ VCF while still identifying no-calls means that a Clinical+ VCF conveys the same information in a much smaller file.
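The size argument can be illustrated with a toy reduction. This is a sketch only: it assumes per-site records with a `GT` field in which `0/0` marks a homozygous reference call, whereas real gVCFs typically encode reference stretches as blocks, and `drop_reference_calls` is a hypothetical helper rather than part of any Sequencing.com tool.

```python
from typing import Dict, List


def drop_reference_calls(records: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Keep variant and no-call records; drop homozygous reference ones."""
    return [r for r in records if r["GT"] != "0/0"]


# Toy gVCF-like input: mostly reference calls, one variant, one no-call.
gvcf_like = [
    {"POS": 100, "GT": "0/0"},
    {"POS": 101, "GT": "0/0"},
    {"POS": 102, "GT": "0/1"},
    {"POS": 103, "GT": "./."},
    {"POS": 104, "GT": "0/0"},
]

reduced = drop_reference_calls(gvcf_like)
print(len(gvcf_like), "records reduced to", len(reduced))
```

No information is lost in the reduction, because every dropped position can be recovered as "same as the reference genome".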

 

Clinical+ VCFs also include comprehensive information that may have clinical relevance. This includes:

 

Clinical+ VCFs can be automatically generated from most genetic data files (such as FASTA, FASTQ, BAM and SEQUENCE.TXT) using the following apps:

 

There are three versions of Clinical+ VCFs:

Clinical+ WGS VCF

  • Included: Calls, No-Calls and Coordinates not interrogated
    • Coordinates not interrogated included as blocks
  • Excluded: Homozygous Reference calls

 

Clinical+ Exome VCF

  • Included: Calls, No-Calls and Coordinates not interrogated
    • Coordinates not interrogated included as blocks
  • Excluded: Homozygous Reference calls

 

Clinical+ Array VCF

  • Included: Calls (including Homozygous Reference) and No-Calls
  • Excluded: Coordinates not interrogated
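One practical consequence of the three versions is that a coordinate missing from the file means different things depending on the version. The mapping below is inferred from the Included/Excluded lists above, not taken from a published specification, and the `ABSENT_MEANS` table and `interpret_absent` helper are hypothetical names used only for illustration.

```python
# What an unlisted coordinate implies, inferred from each version's
# Included/Excluded lists (an assumption, not an official specification).
ABSENT_MEANS = {
    "WGS": "homozygous reference",    # reference calls are the only exclusion
    "Exome": "homozygous reference",  # same rules as WGS
    "Array": "not interrogated",      # reference calls are listed explicitly
}


def interpret_absent(version: str) -> str:
    """Return the meaning of a coordinate that has no record in the file."""
    try:
        return ABSENT_MEANS[version]
    except KeyError:
        raise ValueError("unknown Clinical+ VCF version: " + version)


for version in ("WGS", "Exome", "Array"):
    print(version, "->", interpret_absent(version))
```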

 

The Altruist Endeavor uses the Clinical+ VCF format, with the data stored in a unique Cassandra cluster, to enable rapid, comprehensive search of Altruist data.