Quick start
nf-core/pixelator helps you go from raw sequencing data (FASTQ) to PXL output files that you can use for downstream analysis. It runs with Nextflow, which can execute the same workflow on a single server, a large HPC cluster, or in the cloud. This quick start guide is aimed at first-time users of Nextflow and nf-core/pixelator.
If you are not used to working with servers and/or high performance computing systems, and parts of this guide feel difficult, consider contacting your local IT support.
If you are using a centralized HPC center, many administrators will be comfortable setting up and configuring nf-core pipelines. There are also many cluster-specific configurations available via nf-core/configs. Perhaps your cluster is among them?
If you do not have access to these resources, that’s okay: the guide below will still walk you through the basics you need to get started.
Before you start (checklist)
- Your system is supported
- Linux or macOS (Windows is supported via WSL2)
- You have x86_64 CPUs (ARM systems such as Apple M series are currently not supported)
- You have enough RAM
- Plan for at least 512 GB RAM available on one or more machines for memory-heavy steps
- You have enough disk space
- Plan for several times the total size of your FASTQ files (the pipeline creates intermediate files while it runs)
- Nextflow works on your machine
- nextflow -v runs successfully
- You have a container runtime installed; any one of the following is enough
- Docker: docker -v works
- Apptainer: apptainer --version works
- Singularity: singularity --version works
- You know where results should be written
- You have chosen an output folder for --outdir (for example ./results)
- You have prepared a samplesheet with the paths to the files you want to process
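Parts of the checklist above can be checked from a terminal. The following is a minimal sketch, not an official nf-core tool: the function names (is_supported_arch, find_runtime, preflight) are made up for this example, and it only tests whether the commands are on your PATH, not whether they are configured correctly. It does not check RAM or disk space.

```shell
#!/usr/bin/env bash
# Preflight sketch for the checklist above. Function names are illustrative only.

# Succeed only if the given architecture string (as printed by `uname -m`) is x86_64.
is_supported_arch() {
  [ "$1" = "x86_64" ]
}

# Print the first container runtime found on PATH; fail if none is found.
find_runtime() {
  local candidate
  for candidate in docker apptainer singularity; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# Run all checks and print one line per checked item.
preflight() {
  local arch runtime
  arch="$(uname -m)"
  if is_supported_arch "$arch"; then
    echo "OK: CPU architecture is $arch"
  else
    echo "WARN: CPU architecture is $arch, not x86_64"
  fi

  if command -v nextflow >/dev/null 2>&1; then
    echo "OK: nextflow found on PATH"
  else
    echo "WARN: nextflow not found on PATH"
  fi

  if runtime="$(find_runtime)"; then
    echo "OK: container runtime found ($runtime)"
  else
    echo "WARN: no container runtime (docker, apptainer, singularity) found"
  fi
}

preflight
```

A WARN line points at the checklist item to fix before continuing.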
Choose your setup
Pick the option that best matches where your data lives and how much computing power you need.
Option 1: Single server / workstation
Best when you want the simplest setup, and your dataset fits comfortably on one machine.
- Pros
- Quick to get started (ideal for first-time users)
- Great for testing and smaller runs
- Cons
- Limited by the CPU, RAM, and disk of a single machine
- Large datasets may run slowly because fewer tasks can run in parallel
Go to Run on a single server.
Option 2: HPC cluster and/or cloud
Best when you need to scale up (more samples, faster turnaround) or your organization runs compute through a scheduler (e.g. Slurm or PBS).
- Pros
- Can use many compute nodes in parallel (faster, more scalable)
- Better for large datasets and shared infrastructure
- Often provides fast scratch storage for the pipeline workDir
- Cons
- Requires extra preparation (usually a nextflow.config for executor/queues/storage)
- More environment-specific settings (you may need help from an HPC admin)
Go to Run on HPC or cloud.
Run on a single server
1) Run the test dataset (recommended first run)
This command downloads a small public test dataset and runs the pipeline end-to-end.
Docker:
nextflow run nf-core/pixelator \
  -profile test,docker \
  --outdir "./results"
Apptainer:
nextflow run nf-core/pixelator \
  -profile test,apptainer \
  --outdir "./results"
Singularity:
nextflow run nf-core/pixelator \
  -profile test,singularity \
  --outdir "./results"
What you should see
- Nextflow prints a banner that includes nf-core/pixelator and a pipeline version.
- A results/ folder is created in your current directory.
- The run finishes without errors.
If the test run fails, fix that first (container runtime, Java/Nextflow, permissions, disk space) before moving on to real data.
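Two of the signs listed above (a clean exit and a populated output folder) can be checked with a short script. This is a sketch under the assumption that you launched the run with --outdir ./results; outdir_has_files and report_run are hypothetical helper names, not part of Nextflow or nf-core.

```shell
#!/usr/bin/env bash
# Succeed if the given directory exists and contains at least one entry.
outdir_has_files() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Report on a finished run.
# $1 = exit code of `nextflow run`, $2 = output directory passed to --outdir.
report_run() {
  local exit_code="$1" outdir="$2"
  if [ "$exit_code" -ne 0 ]; then
    echo "FAIL: nextflow exited with code $exit_code"
    return 1
  fi
  if ! outdir_has_files "$outdir"; then
    echo "FAIL: $outdir is missing or empty"
    return 1
  fi
  echo "OK: run finished and $outdir contains files"
}

# Example usage (uncomment and pick the profile that matches your runtime):
# nextflow run nf-core/pixelator -profile test,docker --outdir "./results"
# report_run "$?" "./results"
```

If the script reports a failure, the .nextflow.log file that Nextflow writes in the launch directory is usually the first place to look.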
2) Run your own data
To run real data, you typically provide a samplesheet (a CSV file that tells the pipeline where your FASTQ files are) and choose an output directory.
Docker:
nextflow run nf-core/pixelator \
  -profile docker \
  --input "samplesheet.csv" \
  --outdir "./results"
Apptainer:
nextflow run nf-core/pixelator \
  -profile apptainer \
  --input "samplesheet.csv" \
  --outdir "./results"
Singularity:
nextflow run nf-core/pixelator \
  -profile singularity \
  --input "samplesheet.csv" \
  --outdir "./results"
Replace these values
- samplesheet.csv: the path to your samplesheet file
- ./results: where you want output files to be written
Keep your first real run small (a subset of samples) so you can validate runtime, disk usage, and output before scaling up.
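One way to prepare a small first run is to cut the samplesheet down to a few rows and confirm that every FASTQ file it lists actually exists. The sketch below makes several assumptions: the samplesheet is a simple CSV with one header line and no commas inside fields, FASTQ paths end in .fastq, .fq, .fastq.gz or .fq.gz, and the files are local (remote URLs would be reported as missing). make_subset and missing_fastqs are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Write the header line plus the first N data rows of a samplesheet to a new file.
# $1 = input samplesheet, $2 = number of data rows to keep, $3 = output file.
make_subset() {
  head -n "$(( $2 + 1 ))" "$1" > "$3"
}

# Print every field that looks like a FASTQ path but does not exist on disk.
# Returns non-zero if at least one listed file is missing.
# Relative paths are resolved from the current directory.
missing_fastqs() {
  local paths path status=0
  # Split on commas, drop quotes and carriage returns, keep fields with a FASTQ suffix.
  paths="$(tr ',' '\n' < "$1" | tr -d '"\r' | grep -E '\.(fastq|fq)(\.gz)?$' || true)"
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    if [ ! -e "$path" ]; then
      echo "$path"
      status=1
    fi
  done <<EOF
$paths
EOF
  return "$status"
}

# Example usage:
# make_subset samplesheet.csv 2 samplesheet_subset.csv
# missing_fastqs samplesheet_subset.csv && echo "All listed FASTQ files exist"
```

The subset file can then be passed to --input in place of the full samplesheet.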
Run on HPC or cloud
On a cluster or cloud environment, the pipeline usually cannot “just run” with defaults, because Nextflow needs to know things like:
- Which scheduler/executor to use (for example Slurm or PBS)
- Which queue/partition to submit jobs to
- Where to put the work directory (often a fast scratch filesystem)
- Which container runtime is allowed (often Apptainer/Singularity)
There are two common approaches.
Option A: Use an existing nf-core institutional profile (recommended)
Many institutes already have a ready-made configuration in nf-core/configs. If your cluster is listed, this is usually the fastest path.
- See nf-core/configs
- More background: nf-core configuration documentation
You would then run with something like:
nextflow run nf-core/pixelator \
-profile <institution>,apptainer \
--input "samplesheet.csv" \
--outdir "./results"
Replace <institution> with the name of your cluster profile from nf-core/configs.
Option B: Create a minimal nextflow.config
If there is no existing profile (or you need custom settings), create a small config file in your run directory. Below is a minimal starting point. You will still need to fill in the correct values for your system.
/*
* Minimal Nextflow config for running nf-core/pixelator on HPC/cloud.
* Replace placeholders (<>), and ask your HPC admin if unsure.
*/
process {
    executor = '<slurm|pbs|lsf|...>'
    queue = '<partition_or_queue_name>'
    // clusterOptions = '--account <project> --qos <qos>' // optional, system-specific
}
// Put temporary work files on fast storage if available
workDir = '<path_to_fast_scratch_workdir>'
// Choose ONE container runtime (set the one you use to true)
docker.enabled = false
apptainer.enabled = false
singularity.enabled = false
// Optional: set a default output folder (you can still override with --outdir)
// params.outdir = './results'
Then run the pipeline using your config file:
nextflow run nf-core/pixelator \
-c nextflow.config \
-profile apptainer \
--input "samplesheet.csv" \
--outdir "./results"
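To illustrate what the template might look like once the placeholders are filled in, the sketch below writes a config for a hypothetical Slurm cluster that uses Apptainer. Every concrete value in the usage example (the partition name core, the scratch path) is invented; substitute the values your administrator gives you. write_config is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Write a filled-in version of the minimal config for a HYPOTHETICAL Slurm cluster
# that runs containers with Apptainer.
# $1 = partition name, $2 = scratch work directory, $3 = output config file.
write_config() {
  cat > "$3" <<EOF
process {
    executor = 'slurm'
    queue    = '$1'
}

// Temporary work files go to fast scratch storage
workDir = '$2'

// Exactly one container runtime is enabled
apptainer.enabled = true
EOF
}

# Example usage (all values are placeholders invented for illustration):
# write_config "core" "/scratch/my_project/pixelator-work" "nextflow.config"
```

Generating the file from a script is optional; editing the template by hand gives the same result.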
Common gotchas
- Start with the test dataset on your cluster first (it’s faster to debug): -profile test,apptainer or -profile test,singularity
- Use a workDir on a filesystem that is intended for heavy temporary I/O (often scratch). Writing files to network attached storage can be very slow and degrade the performance of the pipeline.
- If your cluster requires an account or project, you may need to add scheduler-specific flags (ask your admin).
- You do not need to merge your FASTQ files manually before running nf-core/pixelator. If you have multiple FASTQ files per sample, add one entry per file in the samplesheet, each with the same sample name. The pipeline will recognize these entries as data from the same sample and merge them before proceeding.
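To confirm that files meant to be merged really share a sample name, you can count the rows per sample. The sketch below assumes that the sample name is in the first column and that the first line is a header; check the samplesheet documentation for the actual column layout. rows_per_sample is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print "<sample> <number of rows>" for each sample name in a samplesheet.
# ASSUMPTION: the sample name is in the first column and the first line is a header.
rows_per_sample() {
  awk -F',' 'NR > 1 && $1 != "" { n[$1]++ } END { for (s in n) print s, n[s] }' "$1" | sort
}

# Example usage:
# rows_per_sample samplesheet.csv
```

A sample with more than one row is one whose files the pipeline would merge; a misspelled sample name shows up as an unexpected extra sample.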
Where to go next
- Full running guide: Running the Pipeline
- Nextflow profile/config overview: Nextflow Parameters