Datasets | Sequencing.com

Datasets

Datasets are automatically created when paired-end files are uploaded

When paired-end files are uploaded into the same Sequencing.com account, Sequencing.com automatically links them as a dataset. To check whether paired-end files have been linked, look for a green 'Compatibility' light to the right of each filename on your My Files page.

If paired-end files are uploaded into your account, then whenever you start an app with one of the paired-end files, Sequencing.com will process the entire dataset together.

For example, if you start the Empower app and select your R1 file, the app will automatically process data from both R1 and R2 (i.e., from the dataset R1+R2). If you instead start the app with the R2 file, the app will still process data from R1+R2.
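
The behavior described above can be pictured as a lookup from any member file to its whole dataset. The following is only an illustrative sketch, not Sequencing.com's implementation: the function names and the example filenames are hypothetical.

```python
def build_dataset_index(datasets):
    """Map every member filename to the full tuple of files in its dataset."""
    index = {}
    for members in datasets:
        for name in members:
            index[name] = tuple(members)
    return index


def files_to_process(selected, index):
    """Return all files an app would process when `selected` is the input."""
    # Assumption: a file that belongs to no dataset is processed on its own.
    return index.get(selected, (selected,))


# Hypothetical paired-end files linked as one dataset (R1+R2).
index = build_dataset_index([("sample_R1.fastq.gz", "sample_R2.fastq.gz")])

# Selecting either R1 or R2 resolves to the same pair.
from_r1 = files_to_process("sample_R1.fastq.gz", index)
from_r2 = files_to_process("sample_R2.fastq.gz", index)
```

In this sketch, `from_r1` and `from_r2` are the same tuple, which mirrors the rule that starting an app with either file processes R1+R2 together.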

 

Datasets can also be manually created

Although Sequencing.com generates datasets automatically, you can also create your own datasets manually. To create a dataset, go to your My Files page and click the +New Dataset button. Follow the instructions and select the two files that will form the dataset.

You can create datasets using files you've uploaded as well as files that are shared with you.

Once created, you'll see the prefix "DS1" added to the two filenames that are part of your first dataset (DS1). If you create a second dataset then the prefix "DS2" will be added to each of the two filenames that are part of your second dataset (DS2).

  • Note: The "DS#" prefix is only added to manually created datasets. There is no prefix (no filename change) for automatically created datasets.

You can use a dataset with any app. For example, if you create a dataset of two FASTQ files you'll then be able to select that dataset when using the EvE app. Simply select either file as the input file and the app will process the entire dataset (R1+R2) together.

If the population and/or gender has already been assigned to either file, the same information will be assigned to the dataset. If neither file has an assigned population or gender, this information can be selected when the dataset is created. You may also select N/A for population or gender if unknown.

Datasets can only be created with files that have one of the extensions listed below.* Both files must have the same extension.

 .fastq        .sequence.txt
 .fastq.gz     .sequence.txt.gz
 .fastq.bz2    .sequence.txt.bz2
 .fq           .seq.txt
 .fq.gz        .seq.txt.gz
 .fq.bz2       .seq.txt.bz2
 .fasta
 .fasta.gz
 .fasta.bz2

 

*Please submit a Support Request to create a dataset with an extension not listed above.
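
If you want to check filenames before trying to create a dataset, the extension rule above can be sketched as a small local check. This is a minimal sketch and not a Sequencing.com tool: the function names are hypothetical, and treating extensions as case-insensitive is an assumption.

```python
# Extensions listed in this article for manually created datasets.
DATASET_EXTENSIONS = (
    ".fastq", ".fastq.gz", ".fastq.bz2",
    ".fq", ".fq.gz", ".fq.bz2",
    ".fasta", ".fasta.gz", ".fasta.bz2",
    ".sequence.txt", ".sequence.txt.gz", ".sequence.txt.bz2",
    ".seq.txt", ".seq.txt.gz", ".seq.txt.bz2",
)


def dataset_extension(filename):
    """Return the listed extension that `filename` ends with, or None."""
    # Assumption: matching ignores letter case.
    name = filename.lower()
    for ext in DATASET_EXTENSIONS:
        if name.endswith(ext):
            return ext
    return None


def can_form_dataset(file_a, file_b):
    """True if both files have a listed extension and the extensions match."""
    ext_a = dataset_extension(file_a)
    ext_b = dataset_extension(file_b)
    return ext_a is not None and ext_a == ext_b
```

For example, `can_form_dataset("s_R1.fastq.gz", "s_R2.fastq.gz")` passes the check, while pairing `s_R1.fastq.gz` with `s_R2.fastq` does not, because the two extensions differ.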

 

Related

Assign demographic information to a file

Assign gender to a file

Sample data (sample genomes)