I recently started using Snakemake. I can't find anything that fits my needs, either on Stack Overflow or in the Snakemake docs. I feel like I'm missing something and could use some explanation.
I am trying to write a simple Snakemake workflow that takes as input a fastq file and a sequencing-summary file (which contains information about the reads) and splits the reads in the fastq into several files (low.fastq and high.fastq).
My input data and the Snakefile I'm trying to execute are laid out like this:
.
├── data
│ ├── sequencing-summary-example.txt
│ └── tiny-example.fastq
├── Snakefile
└── split_fastq
And this is what I've tried so far:

    *imports*

    rule targets:
        input:
            "split_fastq/low.fastq",
            "split_fastq/high.fastq"

    rule split_fastq:
        input:
            "data/{reads}.fastq",
            "data/{seqsum}.txt"
        output:
            "split_fastq/low.fastq",
            "split_fastq/high.fastq"
        run:
            * do the thing *

I expected to get a "split_fastq" directory containing a "low" and a "high" fastq file. Instead I got this error:
Building DAG of jobs...
WildcardError in line 10 of /work/sbsuser/test/roxane/alignement-ont/Snakefile:
Wildcards in input files cannot be determined from output files:
'reads'
Even though this seems to be a very common error, I'm not sure whether I misunderstand how to use wildcards or whether there is another problem. Am I using "input" and "output" correctly?
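
To make the error message more concrete, here is a rough plain-Python sketch of how wildcard values flow from a requested output file back to the inputs. It is a simplification for illustration, not Snakemake's actual implementation; the helper names (`pattern_to_regex`, `resolve_inputs`) and the `{sample}` patterns in the second call are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Turn 'dir/{name}.ext' into a regex with one named group per wildcard.

    Assumes each wildcard name appears at most once in the pattern.
    """
    # re.split with a capture group alternates literal text and wildcard names
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)      # literal text
        else:
            regex += f"(?P<{part}>.+)"    # wildcard
    return re.compile(regex)


def resolve_inputs(target, output_patterns, input_patterns):
    """Fill the input patterns with wildcard values taken from the target."""
    for out in output_patterns:
        match = pattern_to_regex(out).fullmatch(target)
        if match:
            values = match.groupdict()
            break
    else:
        raise ValueError(f"{target} matches no output pattern")

    # Every wildcard used in an input must have been found in the output
    needed = set()
    for inp in input_patterns:
        needed |= set(re.findall(r"\{(\w+)\}", inp))
    missing = sorted(needed - set(values))
    if missing:
        raise LookupError(
            "Wildcards in input files cannot be determined "
            f"from output files: {missing}"
        )
    return [inp.format(**values) for inp in input_patterns]


# The rule from the question: the outputs contain no wildcards at all,
# so there is nothing to fill {reads} and {seqsum} with.
try:
    resolve_inputs(
        "split_fastq/low.fastq",
        ["split_fastq/low.fastq", "split_fastq/high.fastq"],
        ["data/{reads}.fastq", "data/{seqsum}.txt"],
    )
except LookupError as err:
    print(err)

# Hypothetical variant: the wildcard appears in the output as well,
# so its value can be read off the requested file name.
print(resolve_inputs(
    "split_fastq/low-example.fastq",
    ["split_fastq/low-{sample}.fastq"],
    ["data/tiny-{sample}.fastq"],
))  # ['data/tiny-example.fastq']
```

The point of the sketch is only that information travels from output to input: a wildcard that never appears in an output pattern has no value to substitute.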
snakemake split_fastq/low-example.fastq
The other is that you use the globbing approach in the link, which then finds the relevant samples and decides which output needs to be generated. – Maarten-vd-Sande
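
As an illustration of the globbing idea in that comment, here is a small standard-library sketch that scans the data directory for fastq files and derives the list of output files from what it finds. Inside a Snakefile this role is normally played by Snakemake's `glob_wildcards` and `expand` helpers; the functions `discover_samples` and `build_targets` and the `{sample}-low.fastq` naming scheme below are hypothetical stand-ins so the example runs without Snakemake.

```python
import tempfile
from pathlib import Path


def discover_samples(data_dir):
    """Return the sorted base names of all '*.fastq' files in data_dir."""
    return sorted(p.stem for p in Path(data_dir).glob("*.fastq"))


def build_targets(samples, patterns):
    """Fill every output pattern with every discovered sample name."""
    return [p.format(sample=s) for s in samples for p in patterns]


# Recreate the layout from the question in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data"
    data.mkdir()
    (data / "tiny-example.fastq").touch()
    (data / "sequencing-summary-example.txt").touch()

    samples = discover_samples(data)

# Assumed naming scheme: one low and one high file per sample
targets = build_targets(
    samples,
    ["split_fastq/{sample}-low.fastq", "split_fastq/{sample}-high.fastq"],
)
print(samples)  # ['tiny-example']
print(targets)
```

With this approach the target list is computed from the files that exist, so nothing has to be requested by name on the command line.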