Reference: Samplesheet schema

The documentation below is automatically generated from the schema. The JSON file contains additional technical detail not shown in the table below. See How to set up a samplesheet for a user-friendly step-by-step introduction to the genotype inputs.

Each row in a samplesheet can specify only one genomic data format (the formats are mutually exclusive). This reference is helpful if you want to:

  • Use the JSON input format (instead of CSV samplesheets) and validate the structure of your JSON

  • Deeply understand samplesheet data structure

Most users won’t need this schema, so it’s OK to ignore it!
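As a rough illustration of the JSON input format mentioned above, the sketch below builds a one-row samplesheet in Python and serialises it. The property names are taken from the schema on this page; the sampleset name, the file paths, and the order of the three files are made-up placeholders, and the full JSON schema may require details not shown here.

```python
import json

# The top level is an array (a Python list) with one object per samplesheet row.
# Every value below is a hypothetical placeholder, not real data.
samplesheet = [
    {
        "sampleset": "examplecohort",
        "path": [
            "/data/examplecohort.pgen",
            "/data/examplecohort.pvar",
            "/data/examplecohort.psam",
        ],
        "chrom": None,  # becomes JSON null: data is not split by chromosome
        "format": "pfile",
    }
]

# Serialise to a JSON string that could be written to a file.
samplesheet_json = json.dumps(samplesheet, indent=2)
print(samplesheet_json)
```

Python's `None` is written out as JSON `null`, which is the value the `chrom` description asks for when a file holds all chromosomes.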

Target genome schema

https://raw.githubusercontent.com/pgscatalog/pgsc_calc/dev/assets/schemas/samplesheet.json

Validates the JSON representation of a samplesheet

type: array
items:
    type: object
    properties:

  • sampleset

Sampleset name must be provided and cannot contain spaces or reserved characters (‘_’ or ‘.’)

type: string
pattern: ^[a-zA-Z0-9]*$
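To see which names the pattern above accepts, here is a minimal sketch that applies it with Python's `re` module. The function name and the candidate names are hypothetical, chosen only for illustration.

```python
import re

# Pattern copied verbatim from the schema.
SAMPLESET_PATTERN = r"^[a-zA-Z0-9]*$"

def matches_sampleset_pattern(name: str) -> bool:
    """Return True if the whole name consists only of ASCII letters and digits."""
    # fullmatch is used because Python's '$' would otherwise also match
    # just before a trailing newline.
    return re.fullmatch(SAMPLESET_PATTERN, name) is not None

# Hypothetical sampleset names: one plain, three with disallowed characters.
for candidate in ["cohort1", "my_cohort", "my cohort", "v1.2"]:
    print(f"{candidate!r}: {matches_sampleset_pattern(candidate)}")
```

Because the pattern ends in `*`, it also matches an empty string on its own, so a separate non-empty check may be worth adding if you validate names yourself.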

  • path

A list of resolved target genome file paths

type: array
items:
    type: string
maxItems: 3
minItems: 1
uniqueItems: True
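The array constraints on `path` (one to three unique strings) can be checked by hand with a few lines of Python. This is only a sketch of the rules listed above, not the validator the pipeline itself uses; the function name, messages, and file names are hypothetical.

```python
def path_errors(paths) -> list:
    """Collect violations of the constraints listed for the `path` property."""
    if not isinstance(paths, list):
        return ["type: path must be an array"]
    errors = []
    if len(paths) < 1:
        errors.append("minItems: at least 1 path is required")
    if len(paths) > 3:
        errors.append("maxItems: at most 3 paths are allowed")
    if not all(isinstance(p, str) for p in paths):
        errors.append("items: every path must be a string")
    elif len(set(paths)) != len(paths):
        # Uniqueness is only checked once all items are known to be strings.
        errors.append("uniqueItems: paths must not repeat")
    return errors

# Hypothetical file names, used only to exercise the checks.
print(path_errors(["cohort.vcf.gz"]))
print(path_errors(["cohort.bed", "cohort.bed"]))
print(path_errors([]))
```

Returning a list of messages rather than raising on the first problem makes it easier to report every issue in a row at once.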

  • chrom

Specify the chromosome of associated genotyping data (must be in {1-22, X, XY, Y}). If all chromosomes are in the associated file (e.g. your data is not split by chromosome), set to null.

type: null / string
minLength: 1
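The table above lists only the type and minimum length for `chrom`, while the allowed values (1-22, X, XY, Y, or null) are stated in the description. The sketch below checks against that described set; whether the schema itself enforces the set is not shown on this page, so treat this as an assumption. The names `ALLOWED_CHROMS` and `is_valid_chrom` are hypothetical.

```python
# Allowed values as given in the description: autosomes 1-22 plus X, XY and Y.
ALLOWED_CHROMS = {str(n) for n in range(1, 23)} | {"X", "XY", "Y"}

def is_valid_chrom(chrom) -> bool:
    """Return True for None (JSON null) or one of the described chromosome labels."""
    if chrom is None:
        # null means the file contains all chromosomes.
        return True
    # Chromosomes must be strings, so the integer 1 is rejected but "1" is accepted.
    return isinstance(chrom, str) and chrom in ALLOWED_CHROMS

for value in [None, "1", "22", "XY", "23", "", 1]:
    print(f"{value!r}: {is_valid_chrom(value)}")
```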

  • format

Target genome data format.

type: string
enum: pfile, bfile, vcf

  • vcf_genotype_field

Specify whether to import genotypes (GT, the default) or imputed dosages (DS) from the VCF file.

type: boolean

minItems: 1
uniqueItems: True