Input: A Snakefile that uses the SSP software to calculate various quality metrics for sequencing data. The input to SSP is a BAM file.

sample1.sorted.bam

Output: Various files, but the only one I care about is a file named {prefix}.stats.txt.

sample1.stats.txt

Snakefile: ($SCIF_DATA = /scif/data)

configfile: "config.yaml"
workdir: "/scif/data"

# define samples
SAMPLES, = glob_wildcards("raw_data/{sample}.fastq.gz")

rule all:
    input:
        expand("processed_data/qc/{sample}/{sample}.stats.txt", sample=SAMPLES),

rule quality_metrics:
    input:
        "processed_data/{sample}.sorted.bam"
    params:
        prefix="{sample}",
        gt="raw_data/hg38.chrom.sizes"
    output:
        "processed_data/qc/{sample}/{sample}.stats.txt"
    shell:
        "scif run ssp '-i $SCIF_DATA/{input} -o {params.prefix} --gt {params.gt} -p 50 --odir $SCIF_DATA/{params.prefix}'"

When I run ssp -i sample1.sorted.bam -o sample1 --gt {params.gt} -p 50 --odir sample1 in the terminal, I get the correct output:

{path}/sample1/sample1.stats.txt

However, when I run my Snakemake workflow, I get the following error:

Waiting at most 5 seconds for missing files.
MissingOutputException in line 58 of /scif/data/Snakefile:
Missing files after 5 seconds:
processed_data/qc/THP-1_PU1-cMyc_PU1_sc_S40_R1_001/THP-1_PU1-cMyc_PU1_sc_S40_R1_001.stats.txt
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Will exit after finishing currently running jobs.
Shutting down, this might take some time.

Increasing the latency wait time does not help.

Any ideas?

1 Answer

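I think you are missing part of the output path in the prefix. With --odir $SCIF_DATA/{params.prefix}, SSP writes to /scif/data/{sample}/, but the rule declares its output under processed_data/qc/{sample}/. Try: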

params:
    prefix="processed_data/qc/{sample}"
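
Note that {params.prefix} is also passed to -o, so changing it alters the file prefix as well as the output directory. A minimal sketch of an alternative, assuming SSP writes {odir}/{prefix}.stats.txt as in your terminal run: keep the prefix as the bare sample name and put the directory in a separate parameter. The parameter name odir is only an illustrative choice, and I have not tested this against SSP or SCIF.

```snakemake
rule quality_metrics:
    input:
        "processed_data/{sample}.sorted.bam"
    params:
        # file prefix only, as in the original rule
        prefix="{sample}",
        # hypothetical extra parameter: directory matching the declared output
        odir="processed_data/qc/{sample}",
        gt="raw_data/hg38.chrom.sizes"
    output:
        "processed_data/qc/{sample}/{sample}.stats.txt"
    shell:
        "scif run ssp '-i $SCIF_DATA/{input} -o {params.prefix} --gt {params.gt} -p 50 --odir $SCIF_DATA/{params.odir}'"
```

If the file still does not appear, check under /scif/data for where {sample}.stats.txt actually lands and adjust odir so it matches the path in output.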