I'm trying to combine these two rules into one:

rule fastqc:
    input:
        fastq = "{sample}.fastq.gz",
    output:
        zip1 = "{sample}_fastqc.zip",
        html = "{sample}_fastqc.html",
    threads:8
    shell:
        "fastqc -t {threads} {input.fastq}"

rule renamefastqc:
    input:
        zip1 = "{sample}_fastqc.zip",
        html = "{sample}_fastqc.html",
    output:
        zip1 = "{sample}__fastqc.zip",
        html = "{sample}__fastqc.html",
    shell:
        "mv {input.zip1} {output.zip1} && "
        "mv {input.html} {output.html}"

The combined rule should look like this:

rule fastqc:
    input:
        fastq = "{sample}.fastq.gz"
    output:
        zip1 = "{sample}__fastqc.zip",
        html = "{sample}__fastqc.html"
    threads:8
    shell:
        "fastqc -t {threads} {input.fastq} && "
        "mv {outfile.zip} {output.zip1} && "
        "mv {outfile.html} {output.html}"

FastQC does not let you choose output file names: given an input ending in .fastq.gz, it always creates two files ending in _fastqc.zip and _fastqc.html. Normally I just write a second rule (renamefastqc) that takes those outputs and produces the names with two underscores. But this means that every time I run the pipeline, Snakemake sees that the outputs of the fastqc rule are gone and wants to rebuild them. Therefore I'm trying to combine both rules into one step.

1 Answer

You could use params to derive the names of the files FastQC actually writes from the declared outputs, then rename them inside the same rule:

rule all:
    input:
        "a123__fastqc.zip",

rule fastqc:
    input:
        fastq = "{sample}.fastq.gz",
    output:
        zip1 = "{sample}__fastqc.zip",
        html = "{sample}__fastqc.html",
    threads:8
    params:
        zip1 = lambda wildcards, output: output.zip1.replace('__', '_'),
        html = lambda wildcards, output: output.html.replace('__', '_')
    shell:
        """
        fastqc -t {threads} {input.fastq}
        mv {params.zip1} {output.zip1} \\
            && mv {params.html} {output.html}
        """