Technical details

The Altruist Database runs on a dynamically expanding Cassandra (NoSQL) cloud cluster with GEMINI and ADAM integrations. Sequencing.com’s Cassandra implementation is customized and optimized for genetic data storage, providing free, fast, and robust querying and analysis of large, rapidly growing genomic datasets.

To keep data consistent throughout the database, genotypic data is obtained from contributed files, anonymized, automatically converted into Clinical+ VCF files, and indexed. Sequencing.com also automatically detects each file’s reference genome and the sex of its sample, and takes both into account when indexing and storing the data.
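
How Sequencing.com detects a file's reference genome is not described here. As an illustration only, one common heuristic is to compare the chromosome 1 length declared in a VCF header's `##contig` line against the known length for each assembly. The function name `guess_assembly` and the lookup table below are assumptions made for this sketch, not part of the Altruist Database code, and the parser assumes simple, unquoted `key=value` fields.

```python
import re

# Chromosome 1 lengths (in bases) for three of the supported assemblies.
# hg17 / NCBI35 is omitted from this sketch.
CHR1_LENGTHS = {
    248_956_422: "hg38 / GRCh38",
    249_250_621: "hg19 / GRCh37",
    247_249_719: "hg18 / NCBI36",
}

CONTIG_RE = re.compile(r"^##contig=<(.*)>$")


def guess_assembly(vcf_header_lines):
    """Guess the assembly from ##contig header lines; return None if unknown."""
    for line in vcf_header_lines:
        match = CONTIG_RE.match(line.strip())
        if not match:
            continue
        # Split "ID=chr1,length=248956422" into a {"ID": ..., "length": ...} dict.
        fields = dict(
            part.split("=", 1) for part in match.group(1).split(",") if "=" in part
        )
        is_chr1 = fields.get("ID") in ("1", "chr1")
        if is_chr1 and fields.get("length", "").isdigit():
            return CHR1_LENGTHS.get(int(fields["length"]))
    return None


header = [
    "##fileformat=VCFv4.2",
    "##contig=<ID=chr1,length=248956422>",
]
print(guess_assembly(header))
```

Files without `##contig` lines (common for array-based genotype exports) would need a different signal, such as checking known positions of reference SNPs; that case is not covered here.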

Specs

  • Genotypic Data

    • Variant Search
      • Input can be a chromosomal coordinate or a reference SNP ID number (dbSNP rs#)
      • Input can optionally include a specific allele or genotype
        • If no allele or genotype is included, data on each possible allele and genotype at that coordinate is returned
    • Haplotype Search
      • Input can be two or more chromosomal coordinates or reference SNP ID numbers (dbSNP rs#)
      • Must indicate allele for each coordinate
    • Sequence Search
      • Input can be two or more sequential alleles
      • Input must include starting chromosomal coordinate (the coordinate for the first allele in the sequence)
      • Uses a one-based coordinate system
    • Analysis
      • Analysis can be performed on a user-defined subset of the data within the Altruist Database
      • Uses ADAM, which is fully integrated into the Altruist Database
    • Input coordinate can be based on any of the following assemblies (reference genome coordinate systems)
      • hg38 / GRCh38
      • hg19 / GRCh37
      • hg18 / NCBI36
      • hg17 / NCBI35
    • One-based coordinate system
    • Analysis groups (one or more Altruist IDs) can be easily saved and shared with others
    • Results can also be easily saved and shared
  • Phenotypic Data

    • Phenotype Search
      • Input can be any phenotype including:
        • Disease, Condition or Trait
        • Medication name or adverse drug reaction
        • Biomarker name
    • Phenotypic data derived from:
      • Genetic interpretation
        • GEMINI (includes ClinVar)
        • Nexus® (Sequencing.com’s proprietary genotype-phenotype database)
      • User provided (questionnaires)
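
To make the search input rules above concrete, here is a minimal sketch of how a client might validate Variant Search and Sequence Search input before sending it. Everything in it is hypothetical: `VariantQuery`, `parse_variant_query`, `expand_sequence`, and the `chr:position` input format are assumptions for illustration and are not taken from the Altruist API.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

RSID_RE = re.compile(r"^rs(\d+)$", re.IGNORECASE)
COORD_RE = re.compile(r"^(?:chr)?([0-9]{1,2}|X|Y|MT?):(\d+)$", re.IGNORECASE)


@dataclass(frozen=True)
class VariantQuery:
    rsid: Optional[str] = None
    chrom: Optional[str] = None
    pos: Optional[int] = None       # one-based position
    allele: Optional[str] = None    # None means "return every allele and genotype"


def parse_variant_query(locus: str, allele: Optional[str] = None) -> VariantQuery:
    """Accept either a dbSNP rs# or a chromosomal coordinate such as 'chr19:44908684'."""
    locus = locus.strip()
    match = RSID_RE.match(locus)
    if match:
        return VariantQuery(rsid="rs" + match.group(1), allele=allele)
    match = COORD_RE.match(locus)
    if match:
        pos = int(match.group(2))
        if pos < 1:
            raise ValueError("coordinates are one-based; position must be >= 1")
        return VariantQuery(chrom=match.group(1).upper(), pos=pos, allele=allele)
    raise ValueError(f"unrecognized locus: {locus!r}")


def expand_sequence(chrom: str, start: int, alleles: str) -> List[Tuple[str, int, str]]:
    """Expand a Sequence Search into (chrom, position, allele) triples.

    The first allele sits at `start` itself, because coordinates are one-based.
    """
    if start < 1:
        raise ValueError("coordinates are one-based; start must be >= 1")
    if len(alleles) < 2:
        raise ValueError("a sequence search needs two or more alleles")
    return [(chrom, start + i, base) for i, base in enumerate(alleles.upper())]


print(parse_variant_query("rs429358"))
print(parse_variant_query("chr19:44908684", allele="C"))
print(expand_sequence("19", 100, "acg"))
```

A Haplotype Search could reuse `parse_variant_query` once per locus, rejecting any query whose `allele` is `None`, since an allele is required for each coordinate in that mode.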

Open source

The code for the database schema and the Altruist API is available on GitHub.

  • We invite you to create your own tools and apps.
  • You can host the tools yourself or we’ll be happy to host them for free.
  • You can also choose to make them available to the global community via Sequencing.com’s App Market.

© 2023 Sequencing.com