How to use scoring files in the PGS Catalog#

The easiest way to calculate a polygenic score is to use a scoring file that’s been published in the PGS Catalog!

1. Samplesheet setup#

First, you need to describe the structure of your genomic data in a standardised way. To do this, set up a spreadsheet that looks like:

Example samplesheet for a combined plink2 file set#
sampleset	path_prefix	chrom	format
cineca	/path/to/target_genomes/cineca_synthetic_subset		pfile

Save the file as samplesheet.csv. See How to set up a samplesheet for more details.
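As a minimal sketch, the samplesheet above could be written from the command line like this. It assumes the file is comma-separated (as the .csv name suggests) and that the chrom field is left empty because the example describes a combined file set. The awk check is an illustrative sanity test, not part of the pipeline.

```shell
# Write a one-row samplesheet. The empty third field is the chrom column,
# assumed blank here because the example is a single combined file set.
cat > samplesheet.csv <<'EOF'
sampleset,path_prefix,chrom,format
cineca,/path/to/target_genomes/cineca_synthetic_subset,,pfile
EOF

# Illustrative check: every row should have exactly four comma-separated fields.
awk -F',' 'NF != 4 { printf "line %d has %d fields\n", NR, NF; bad = 1 } END { exit bad + 0 }' samplesheet.csv
```

Replace the path_prefix value with the location of your own target genomes.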

2. Pick scores from the PGS Catalog#

Accessions#

Individual scores are selected by their Polygenic Score IDs, which start with the prefix “PGS” (for example, PGS001229). The --pgs_id parameter accepts polygenic score IDs:

--pgs_id PGS001229

Multiple scores can be set by using a comma-separated list:

--pgs_id PGS001229,PGS000802
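If the accessions come from a longer list, the comma-separated value can be assembled rather than typed by hand. The sketch below uses a hypothetical helper, join_pgs_ids, which is not part of the pipeline. It assumes accessions have the form PGS followed by six digits, as in the examples above, and rejects anything else.

```shell
# Hypothetical helper: check each accession, then join them with commas.
# The body runs in a subshell so the IFS change does not leak to the caller.
join_pgs_ids() (
    for id in "$@"; do
        case "$id" in
            PGS[0-9][0-9][0-9][0-9][0-9][0-9]) ;;   # assumed accession shape
            *) echo "not a PGS accession: $id" >&2; exit 1 ;;
        esac
    done
    IFS=,
    printf '%s\n' "$*"   # "$*" joins arguments with the first character of IFS
)

pgs_ids=$(join_pgs_ids PGS001229 PGS000802)
echo "--pgs_id ${pgs_ids}"
```

The printed flag can then be pasted into the nextflow command line.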

Traits#

If you would like to calculate every polygenic score in the Catalog for a trait, like coronary artery disease, then you can use the --trait_efo parameter:

--trait_efo EFO_0001645

Multiple traits can be set by using a comma-separated list.

Publications#

If you would like to calculate every polygenic score associated with a publication in the PGS Catalog, you can use the --pgp_id parameter:

--pgp_id PGP000001

Multiple publications can be set by using a comma-separated list.

Note

PGS, trait, and publication IDs can be combined to calculate multiple polygenic scores.

3. Calculate!#

$ nextflow run pgscatalog/pgscalc \
    -profile <docker/singularity/conda> \
    --input samplesheet.csv \
    --pgs_id PGS001229 \
    --trait_efo EFO_0001645 \
    --pgp_id PGP000001
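The command above passes all three kinds of selector at once. As a rough sketch, a wrapper script could add a flag only for the selectors that are actually set. The variable names and the choice of the docker profile are illustrative assumptions, and the script prints the command instead of running it so that it can be inspected first.

```shell
# Selectors taken from the examples above; leave any of them empty to skip it.
pgs_id="PGS001229"
trait_efo="EFO_0001645"
pgp_id=""

# Build the argument string, adding a flag only when its value is non-empty.
args="--input samplesheet.csv"
if [ -n "$pgs_id" ]; then args="$args --pgs_id $pgs_id"; fi
if [ -n "$trait_efo" ]; then args="$args --trait_efo $trait_efo"; fi
if [ -n "$pgp_id" ]; then args="$args --pgp_id $pgp_id"; fi

# Dry run: show the command rather than executing it.
cmd="nextflow run pgscatalog/pgscalc -profile docker $args"
echo "$cmd"
```

Once the printed command looks right, it can be run directly in the terminal.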

Note

For more details about calculating multiple scores, see How to apply multiple scores in parallel.
