Snakemake: access a list within a dict by using a wildcard

Question

To break it down, I have a dict that looks like this:

dict = {'A': ["sample1","sample2","sample3"], 
        'B': ["sample1","sample2"], 
        'C': ["sample1","sample2","sample3"]}

And I have a rule:

rule example:
      input:
          #some input
      params:
          # some params
      output:
          expand('{{x}}{sample}', sample=dict[wildcards.x])
          # the alternative I tried was
          # expand('{{x}}{sample}', sample=lambda wildcards: dict[wildcards.x])
      log:
          log = '{x}.log'
      run:
        """
        foo
        """

My problem is how to access the dictionary with wildcards.x as the key, so that I get the list of items stored under that key. The first example just gives me

name 'wildcards' is not defined

while the alternative just gives me

Missing input files for rule all

since Snakemake never even runs the example rule.

I need to use expand, since I want the rule to run only once for each x wildcard while creating multiple samples in this one run.

Dmitry Kuzminov · Accepted Answer · 2020-12-23T04:28:06

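You can use a lambda as a function of the wildcards in the input section, but not in the output section. The output doesn't receive any wildcards; it defines them. Snakemake works out the wildcard values by matching the output pattern against the file that was requested, so no wildcards object exists yet at the moment the output is declared.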
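To illustrate why the output defines the wildcards, here is a minimal plain-Python sketch of pattern matching against a requested file name. It is a simplification for illustration, not Snakemake's actual implementation; the function name `wildcards_for` is hypothetical, and it assumes every wildcard matches the regex `.+`.

```python
import re

def wildcards_for(pattern, target):
    """Return the wildcard values that make `pattern` match `target`, or None."""
    # re.split with a capture group yields literal text at even indices
    # and wildcard names at odd indices.
    pieces = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, piece in enumerate(pieces):
        if i % 2 == 0:
            regex += re.escape(piece)       # literal part of the file name
        else:
            regex += f"(?P<{piece}>.+)"     # wildcard becomes a named group
    match = re.fullmatch(regex, target)
    return match.groupdict() if match else None

log_wildcards = wildcards_for("{x}.log", "A.log")
sample_wildcards = wildcards_for("{x}sample{n}", "Asample1")
no_match = wildcards_for("{x}.log", "A.txt")
```

The values of `x` only become known after a concrete file name has been matched, which is why `dict[wildcards.x]` cannot be evaluated inside the output declaration itself.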

Let's look at your task from another angle. How do you decide how many samples the output produces? You define the dict, but where does that information come from? You have not shown the actual script, so how does it know how many outputs to produce?

Logically you might have three separate rules (or at least two): one that knows how to produce two samples and another that knows how to produce three.
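The dictionary lookup itself works fine wherever the key is a concrete value rather than a wildcard, for example when writing one rule per key or when building the target list for rule all. A minimal sketch in plain Python, assuming the dict from the question is renamed to `samples_by_group` (to avoid shadowing the built-in `dict`) and using a hypothetical helper `outputs_for`:

```python
# Assumed rename of the question's dict, to avoid shadowing the built-in.
samples_by_group = {
    "A": ["sample1", "sample2", "sample3"],
    "B": ["sample1", "sample2"],
    "C": ["sample1", "sample2", "sample3"],
}

def outputs_for(x):
    """List the files one run for key `x` is expected to create."""
    return [f"{x}{sample}" for sample in samples_by_group[x]]

# Every file across all keys, e.g. as the input list of a target rule.
all_targets = [path for x in samples_by_group for path in outputs_for(x)]
```

Here `outputs_for("B")` gives a fixed list of two files, so a rule written specifically for "B" has a fully known output list without needing any wildcard.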

As far as I can see, this is an XY problem: you are asking the same question twice without stating your actual problem (X), while forcing an unsuitable implementation that defines all outputs through a dictionary (Y).

Update: another possible solution to your artificial example would be to use dynamic output:

rule example:
      input:
          #some input
      output:
          dynamic('{x}sample{n}')

That would work in your case because the files match the common pattern "sample{n}".
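The idea behind a dynamic output is that the values of n are discovered from the files that actually exist after the job has run, instead of being declared up front. A rough plain-Python sketch of that discovery step, using a hypothetical helper `discover_n` and a hard-coded file list in place of a real directory listing (as far as I know, newer Snakemake releases deprecate dynamic() in favor of checkpoints, so check the documentation for your version):

```python
import re

def discover_n(x, filenames):
    """Collect the n values of all files named '<x>sample<n>'."""
    pattern = re.compile(re.escape(x) + r"sample(?P<n>.+)")
    found = []
    for name in filenames:
        match = pattern.fullmatch(name)
        if match:
            found.append(match.group("n"))
    return found

# Stand-in for the files present after the rule has run.
files = ["Asample1", "Asample2", "Asample3", "Bsample1", "A.log"]
a_samples = discover_n("A", files)
b_samples = discover_n("B", files)
```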
