How do I run the pipeline on an M1 Mac?

How do I run the pipeline on an M1 Mac?#

If you have a modern Mac you’ll need to use the docker execution profile to run the pipeline on computers with the arm64 architecture. An extra arm profile can be used to get the pipeline working.

$ nextflow run pgscatalog/pgsc_calc \
    -profile docker,test,arm

The default platform is amd64, so if you’re not running on an ARM computer you don’t need to include this extra profile. If you don’t set arm profile on M1 Macs, you’ll probably get a segmentation fault during variant matching:

WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped line 2:    69 Segmentation fault      match_variants --min_overlap 0.75 --dataset cineca_synthetic_subset --scorefile scorefiles.txt --target "$PWD/*.vars" --split -n 2 --outdir $PWD -v

The pipeline has been developed and tested using Docker desktop on an M1 Macbook.

pgsc_calc seems slow on a Mac#

Running pgsc_calc on a mac will generally be slower than running on Linux. This is because the workflow is running in a Linux virtual machine on top of macOS. Some things to consider:

  • Enable VirtIO to improve I/O performance

  • Allocate more CPU / RAM to the Docker Desktop virtual machine

  • Run big jobs on Linux