The issue is {input}/{output} versus {wildcards.namedVar} access in the shell directive. See here in the documentation. That said, I do not see a driver rule (a "rule all") in your Snakemake setup, which I would also recommend (I've added one in my answer below). It is the equivalent of the .PHONY and all target pattern (the messy convention that GNU Make forced us into).
In your shell directive, {filename} is not available as a bare variable; it is an attribute of the wildcards object, so you need Python dot notation to access it: {wildcards.filename}. That said, the better way is to use the output object directly ({output}), because it has a built-in string conversion: it carries only a list of strings, whereas the wildcards object can contain several individual wildcard attributes, so its string form is not predictable.
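To illustrate the difference between attribute access and string conversion, here is a minimal plain-Python sketch. It is not Snakemake's actual implementation: `NamedList` and the `wildcards`/`output` objects below are hypothetical stand-ins that only mimic the behaviour described above.

```python
from types import SimpleNamespace


class NamedList(list):
    """Hypothetical stand-in for a rule's output object: a list of file names."""

    def __str__(self):
        # Join all entries with spaces so the object can be dropped into a shell command.
        return " ".join(self)


# Hypothetical stand-ins for the objects visible inside a shell directive.
wildcards = SimpleNamespace(filename="a")
output = NamedList(["a"])

# Wildcards are reached by attribute access on the wildcards object.
cmd_wildcards = "echo x > {wildcards.filename}".format(wildcards=wildcards)

# The output object is used directly and relies on its string conversion.
cmd_output = "echo x > {output}".format(output=output)

print(cmd_wildcards)
print(cmd_output)
```

Both commands end up identical here, which is why the two rule variants in this answer behave the same for a single output file.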
You can ignore the ".snk" suffix, I just think it's nice for Snakemake rule files. In code, this is what I mean:
test.snk:
rule test:
    output:
        "{filename}"
    wildcard_constraints:
        filename = "[abc]"
    shell:
        "echo x > {wildcards.filename}"
In identical fashion, you can also do this, test.snk:
rule test:
    output:
        "{filename}"
    wildcard_constraints:
        filename = "[abc]"
    shell:
        "echo x > {output}"
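As a rough picture of what the wildcard constraint does, the sketch below turns an output pattern into a regular expression and checks requested targets against it. This is a simplified illustration in plain Python, not Snakemake's real matching code; `match_target` is a hypothetical helper, and the ".+" default for unconstrained wildcards is an assumption of the sketch.

```python
import re


def match_target(pattern, constraints, target):
    """Return the wildcard values if target matches pattern, else None.

    Simplified sketch: each {name} becomes a named regex group that uses the
    constraint for that name, or ".+" when no constraint is given.
    """
    regex = ""
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", pattern):
        # Literal text between wildcards must match exactly.
        regex += re.escape(pattern[pos:m.start()])
        name = m.group(1)
        regex += "(?P<{}>{})".format(name, constraints.get(name, ".+"))
        pos = m.end()
    regex += re.escape(pattern[pos:])
    found = re.fullmatch(regex, target)
    return found.groupdict() if found else None


constraints = {"filename": "[abc]"}
for target in ["a", "b", "d", "ab"]:
    print(target, match_target("{filename}", constraints, target))
```

With the constraint "[abc]", only the single-character targets a, b and c match the pattern; anything else is rejected instead of being swallowed by the catch-all "{filename}".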
Recommended Code Base:
test1.snk:
rule test:
    output:
        "{filename}"
    wildcard_constraints:
        filename = "[abc]"
    shell:
        "echo x > {output}"
Snakefile:
configfile: "config.yaml"

rule all:
    input:
        expand("{sample}", sample=config["fileName"])

include: "test1.snk"
config.yaml
fileName: ['a','b','c']
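To make the role of expand() in "rule all" concrete, here is a small plain-Python approximation. `expand_sketch` is a hypothetical re-implementation written for illustration only (it assumes the default behaviour of combining all wildcard values as a product), and the `config` dict simply mirrors config.yaml above.

```python
from itertools import product


def expand_sketch(pattern, **wildcards):
    """Fill the pattern with every combination of the given wildcard values."""
    names = list(wildcards)
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*(wildcards[name] for name in names))
    ]


# Mirrors the content of config.yaml.
config = {"fileName": ["a", "b", "c"]}

# The list of files that "rule all" asks for.
targets = expand_sketch("{sample}", sample=config["fileName"])
print(targets)
```

The resulting list is what "rule all" declares as its input, so Snakemake has to find a rule (here the pattern rule "test") that can produce each of those files.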
$ snakemake -n:
rule test:
    output: a
    jobid: 1
    wildcards: filename=a

rule test:
    output: c
    jobid: 2
    wildcards: filename=c

rule test:
    output: b
    jobid: 3
    wildcards: filename=b

localrule all:
    input: a, b, c
    jobid: 0

Job counts:
    count   jobs
    1       all
    3       test
    4
Additional info
Also, this setup scales VERY well :) Run it with the plain CLI call snakemake, without any arguments:
snakemake
Although this is terrible practice, it's technically also possible to request the targets directly on the command line, if you are more "outcome" oriented and don't care about reproducibility:
snakemake -n -s "test1.snk" a b c
That will essentially use just the rule file "test1.snk" and request "a", "b", and "c" from it.
rule test:
    output: c
    jobid: 0
    wildcards: filename=c

rule test:
    output: b
    jobid: 1
    wildcards: filename=b

rule test:
    output: a
    jobid: 2
    wildcards: filename=a

Job counts:
    count   jobs
    3       test
    3
You can see the dry-run output is actually different: since it does not go through "rule all", there is no fourth job. Overall, the processing done by Snakemake itself is usually trivial compared to the processing performed by the shell commands, so with or without an "all" rule I would expect very little difference in performance. Yet with the all rule it's infinitely clearer what your code is supposed to be doing, and you can easily re-run the exact same command without having to 'grep' your 'history'.
x. The target files are a, b and c, which are all created by calling the pattern rule three times. – Michael Schubert