User-configurable parameters

Analysis parameters

Any of these {scamp} parameters can be provided in the project `_defaults` stanza or under a specific dataset's stanza.

| Tag | Description | Type | Provider |
|---|---|---|---|
| adt set path | Path to antibody-derived tags reference file. | path | |
| barcode | A barcode identifier, for example BC001. | string(s) | |
| dataset id | A directory-safe name for a dataset, taken from the dataset name if omitted. | string | |
| dataset name | Human-readable name for a dataset in an analysis. The YAML key will be used if omitted. | string | |
| dataset tag | A very short name for the dataset. This will be appended to cell barcodes so should be very short and concise with no spaces or funny characters. An unhelpful default will be provided but should not be trusted. | string | |
| description | A short textual description of the dataset, mainly as an aide-memoire. A default value of dataset name is used if missing. | string | |
| fastq paths | Paths to any directory (non-recursively) containing FastQ files for the project. | paths | |
| feature identifiers | Whether to use accession or name as the feature index. In the Seurat workflows, the non-selected identifier may be saved into an `RNA_alt` assay in the object. The default is to use feature names. | string | |
| feature types | A map of assay types in the project and the relevant LIMS IDs. | map of strings | |
| hto set path | Path to hashtag oligos reference file. | path | |
| index path | The path to an index for the analysis. If omitted, it is assumed that an index is to be created and will be provided by a {scamp} process. | path | |
| limsid | Identifier(s) for the sample in the project. This will be used to identify FastQ files for the dataset/sample. No default value can be provided. Some samples provide multiple libraries, so this may be a collection of strings in certain cases. | string(s) | |
| probe set path | A 10x-provided file linking probes and gene targets. | path | |
| quantification method | The method used to create the data in quantification path. This is a curated set of methods and depends on the analysis workflows: `cell_ranger` and `cell_ranger_arc` for example. This will be provided by {scamp} if a quantification workflow is applied, otherwise it is required. | string | |
| quantification path | Path to quantified data that can be read and used by an analysis workflow. Can be provided by a {scamp} workflow. | path | |
| vdj index path | Path to VDJ reference index. | path | |
| workflows | A collection of (unordered) workflows to apply in an analysis. These are a curated list of workflows available in {scamp} and should be specified as a path. Spaces will be converted to underscores. Omitting this parameter will prevent workflows from launching but will not cause {scamp} to fail. | strings | |
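
As a rough illustration of how the fallbacks described in the table could combine, the following Python sketch resolves the parameters of one dataset. It is not {scamp}'s implementation. It assumes that values in a dataset's stanza override those in `_defaults` (the table does not state the precedence), that the feature-name default is spelled `name`, and that a directory-safe name is made by replacing unusual characters with underscores. The helpers `directory_safe` and `resolve_dataset`, the workflow name and the identifiers are hypothetical.

```python
import re


def directory_safe(name: str) -> str:
    """Replace runs of characters other than letters, digits, '.', '_' or '-' with '_'.

    Assumption: the rule {scamp} really uses for directory-safe names may differ.
    """
    return re.sub(r"[^A-Za-z0-9._-]+", "_", name.strip())


def resolve_dataset(key: str, defaults: dict, dataset: dict) -> dict:
    """Combine `_defaults` with one dataset stanza and fill the documented fallbacks."""
    # Assumption: dataset-specific values win over project-wide defaults.
    params = {**defaults, **dataset}

    # "dataset name": the YAML key is used if omitted.
    params.setdefault("dataset name", key)
    # "dataset id": taken from the dataset name if omitted.
    params.setdefault("dataset id", directory_safe(params["dataset name"]))
    # "description": the dataset name is used if missing.
    params.setdefault("description", params["dataset name"])
    # "feature identifiers": feature names are the documented default
    # (the literal value "name" is an assumption).
    params.setdefault("feature identifiers", "name")
    # "workflows": spaces become underscores; omitting it launches nothing.
    params["workflows"] = [w.replace(" ", "_") for w in params.get("workflows", [])]
    return params


defaults = {
    "fastq paths": ["/path/to/fastq"],            # placeholder path
    "workflows": ["quantification/cell ranger"],  # illustrative workflow name
}
dataset = {"limsid": "SAMPLE001", "dataset tag": "S1"}  # made-up identifiers

resolved = resolve_dataset("my first dataset", defaults, dataset)
print(resolved["dataset id"])  # my_first_dataset
print(resolved["workflows"])   # ['quantification/cell_ranger']
```

Keeping the fallbacks in one function makes it easy to see which values a dataset would end up with before anything is launched.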

Project parameters

A reserved stanza that defines the project, rather than specific data.

| Tag | Description | Type | Provider |
|---|---|---|---|
| babs id | Unique identifier for the project. | string | |
| lab | The `<last name><first initial>` formatted name of the lab. | string | |
| lims id | Unique identifier for the project. | string | |
| scientist | The `<first name>.<last name>` formatted name of the scientist, which may help find data in the filesystem. Be careful with double-barrelled or multiple last names. | string | |
| type | Type of project as recorded by ASF. This is a curated list of: “10X-3prime”, “10X-multiome” etc. The default value is 10X-3prime. | string | |

Genome parameters

A dictionary of parameters that define a genome. This can be used to define the parameters for a custom genome.

| Tag | Description | Type | Provider |
|---|---|---|---|
| assembly | Name of the genome assembly, such as “mm10”. | string | |
| ensembl release | Number of Ensembl release, such as 98. | string | |
| fasta file | Genomic sequence in FastA format. Can be provided by the fasta path option. This parameter takes precedence over the fasta path parameter. | file | |
| fasta path | A directory with FastA files that can be used to create a genome index. When provided, the files in the directory will be concatenated together into a genome FastA. No default is provided, but this is probably only needed to build an index. | path | |
| gtf file | GTF file of features in the genome. This parameter will be used in preference to the gtf path. | file | |
| gtf path | A directory with GTF files that can be used to quantify activity of features. The files in this directory will be concatenated into a single GTF file and the result used in gtf file. No default is provided, but this is probably only needed to build an index. | path | |
| id | A directory-safe name of the genome, which will be converted from assembly if missing. | string | |
| motifs file | A JASPAR-formatted file of motifs that can be used by Cell Ranger ARC to build an index. No default is provided. | file | |
| non-nuclear contigs | A collection of chromosomes in the genome that may be treated differently, for example by Cell Ranger ARC to create an index. | strings | |
| organism | Latin name for the species, such as “mus musculus”. | string | |
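
The precedence between the file and path parameters can be sketched in Python as below. This is only an illustration under stated assumptions, not how {scamp} does it: the helper `resolve_genome_file`, the output name `genome.<kind>` and the sorted concatenation order are all hypothetical, and the paths are placeholders.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_genome_file(genome: dict, kind: str, out_dir: Path) -> Optional[Path]:
    """Pick the file to use for `kind`, which is "fasta" or "gtf".

    "<kind> file" takes precedence. Otherwise the files found directly in
    "<kind> path" are concatenated into a single file.
    """
    explicit = genome.get(f"{kind} file")
    if explicit is not None:
        return Path(explicit)

    directory = genome.get(f"{kind} path")
    if directory is None:
        return None  # no default is provided

    # Assumption: files are joined in sorted name order.
    parts = sorted(p for p in Path(directory).iterdir() if p.is_file())
    combined = out_dir / f"genome.{kind}"  # hypothetical output name
    with combined.open("wb") as out:
        for part in parts:
            out.write(part.read_bytes())
    return combined


with tempfile.TemporaryDirectory() as tmp_name:
    tmp = Path(tmp_name)
    fasta_dir = tmp / "fasta"
    fasta_dir.mkdir()
    (fasta_dir / "chr1.fa").write_text(">chr1\nACGT\n")
    (fasta_dir / "chr2.fa").write_text(">chr2\nTTGA\n")

    # Only "fasta path" is given, so the directory contents are concatenated.
    genome = {"assembly": "mm10", "fasta path": str(fasta_dir)}
    combined_text = resolve_genome_file(genome, "fasta", tmp).read_text()

    # Once "fasta file" is given, it is used and "fasta path" is ignored.
    genome["fasta file"] = "/path/to/genome.fa"  # placeholder path
    explicit_choice = resolve_genome_file(genome, "fasta", tmp)

print(combined_text)
print(explicit_choice)
```

The same function covers the gtf file and gtf path pair by passing `"gtf"` as `kind`.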

Nextflow parameters

Parameters that are used by the pipeline but are not directly part of {scamp}. They are specified on the command line with the `--` prefix. Default values are defined in `params.config`.

| Tag | Description | Type | Provider |
|---|---|---|---|
| only_validate_parameters | Do not start the pipeline but check and validate that the parameters in `--scamp_file` are probably OK to use. The checks compare each parameter's type against the expected type and confirm that sufficient parameters were provided for each of a dataset’s workflows. Defaults to false. | boolean | |
| publish_dir | The root directory (default: results) under which task results will be published. | path | |
| publish_mode | How results of tasks are output, defaults to copy. Other modes may affect the pipeline, so the only alternative to copy is symlink. | string | |
| scamp_file | YAML file (default: scamp_file.yaml) that contains the configuration parameters for the analyses. | file | |
| show_parameter_validation | Show a summary of the parameters that were checked and validated for each dataset. The default is to not show the summary (`--show_parameter_validation false`). If any parameter fails validation, the summary of failed parameters is printed and {scamp} will stop. | boolean | |
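
The two kinds of check described for `only_validate_parameters` (types against the expected, and sufficient parameters for each workflow) could look roughly like the Python sketch below. It is a toy model rather than the validation that {scamp} performs: `EXPECTED_TYPES` covers only a few tags, and the workflow `example_analysis` with its required parameters is invented for illustration.

```python
# Expected Python types for a few tags, following the Type column above.
EXPECTED_TYPES = {
    "limsid": (str, list),           # string(s)
    "dataset tag": (str,),           # string
    "fastq paths": (list,),          # paths
    "quantification path": (str,),   # path
    "quantification method": (str,), # string
    "workflows": (list,),            # strings
}

# Hypothetical workflow and requirements, not taken from {scamp}.
REQUIRED_BY_WORKFLOW = {
    "example_analysis": {"quantification path", "quantification method"},
}


def validate_dataset(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    # Check 1: each known parameter has one of its expected types.
    for tag, value in params.items():
        expected = EXPECTED_TYPES.get(tag)
        if expected and not isinstance(value, expected):
            names = " or ".join(t.__name__ for t in expected)
            problems.append(f"{tag}: expected {names}, got {type(value).__name__}")

    # Check 2: each workflow has the parameters it needs.
    for workflow in params.get("workflows", []):
        missing = REQUIRED_BY_WORKFLOW.get(workflow, set()) - set(params)
        for tag in sorted(missing):
            problems.append(f"{workflow}: missing '{tag}'")

    return problems


dataset = {
    "limsid": "SAMPLE001",           # made-up identifier
    "dataset tag": 7,                # wrong type on purpose
    "fastq paths": ["/path/to/fastq"],
    "workflows": ["example_analysis"],
}

problems = validate_dataset(dataset)
for problem in problems:
    print(problem)
```

Collecting every problem before reporting mirrors the described behaviour of printing a summary of failed parameters and then stopping.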