0
votes

I have a file listing the prefixes of the input files for a Snakemake rule:

> "PB0206,SRR10694951,PB0216,SRR10694963" 
> "PB0212,PB0202"
> "PB0199,PB0205" 
> "PB0215,PB0219" 
> "PB0197,PB0211"
> "PB0204,SRR10694964,PB0214" 
> "PB0210,SRR10694969,PB0209,SRR10694950"
> "PB0217"

I want to make a rule that takes each row of sample IDs as the wildcard inputs for a single job, with one job per row. For the first row, the inputs in the shell command would look like this:

shell:"""
      -I PB0206.bam -I SRR10694951.bam -I PB0216.bam -I SRR10694963.bam \
      -O PB0206_SRR10694951_PB0216_SRR10694963.marked.bam
      """

The name of the output file would be the wildcards pooled together. With the last row as input, the shell command for the rule would look like:

shell:"""
      -I PB0217.bam \
      -O PB0217.marked.bam
      """

Is there a way to make nested wildcards from a list or dictionary from a snakemake rule? Thanks in advance.

1
I just split each row of the CSV into separate files preformatted as -I name -I name2... and then used those files as the rule inputs. In the shell, I stored the file contents in a variable and used that variable as the input. - Apis_delorean

1 Answer

1
votes

You can capture all the values as a single wildcard and split it into separate values inside the rule:

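rule do_your_job:
    output: "{values_separated_with_underscore}.marked.bam"
    run:
        # Recover the individual sample IDs from the pooled wildcard.
        values = wildcards.values_separated_with_underscore.split("_")
        # Build "-I a.bam -I b.bam ..."; the tool name is omitted, as in the question.
        inputs = " ".join(f"-I {value}.bam" for value in values)
        shell(inputs + " -O {output}")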

The drawback is that the rule as written declares no input, so Snakemake does not track the separate .bam files as dependencies of the job.
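One way around this limitation may be to compute the file list from the pooled wildcard in plain Python. The following is a minimal sketch, not a tested Snakefile: the helper names `bam_inputs` and `input_flags` are hypothetical, and it assumes that no sample ID contains an underscore.

```python
def bam_inputs(pooled_ids):
    """Split a pooled wildcard such as 'PB0212_PB0202' into per-sample BAM paths."""
    # Assumption: sample IDs never contain "_", so "_" is a safe separator.
    return [f"{sample}.bam" for sample in pooled_ids.split("_")]


def input_flags(bam_paths):
    """Build the repeated '-I file.bam' arguments for the command line."""
    return " ".join(f"-I {path}" for path in bam_paths)


paths = bam_inputs("PB0212_PB0202")
print(paths)               # ['PB0212.bam', 'PB0202.bam']
print(input_flags(paths))  # -I PB0212.bam -I PB0202.bam
```

Since Snakemake accepts a function of the wildcards as a rule's input, something like `input: lambda wildcards: bam_inputs(wildcards.values_separated_with_underscore)` should let it see the individual files, but I have not run this against the rule above.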

Another option is to define several rules, one for each expected number of values:

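rule do_your_job1:
    input: lambda wildcards: expand("{value}.bam", value=wildcards)
    output: "{value1}.marked.bam"
    # Without a constraint, value1 would also match pooled names such as "A_B".
    wildcard_constraints:
        value1="[^_]+"
    shell:"""
          -I {wildcards.value1}.bam \
          -O {output}
          """

rule do_your_job2:
    input: lambda wildcards: expand("{value}.bam", value=wildcards)
    output: "{value1}_{value2}.marked.bam"
    wildcard_constraints:
        value1="[^_]+",
        value2="[^_]+"
    shell:"""
          -I {wildcards.value1}.bam -I {wildcards.value2}.bam \
          -O {output}
          """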