I'm creating a pipeline using Snakemake to call methylation in nanopore sequencing data. I've run snakemake with the --dryrun option, and the DAG is constructed successfully. But when I add the option --profile slurm, I get the following error:
(nanopolish) [danielle.perley@talonhead2 nanopolish-CpG-calling]$ snakemake -np --use-conda --profile slurm test_data/20-001-002/20-001-002_fastq_pass.gz
Building DAG of jobs...
Job counts:
	count	jobs
	1	combine_tech_reps
	1
InputFunctionException in line 32 of /home/danielle.perley/nanopolish-CpG-calling/Snakefile:
Error:
  SyntaxError: invalid syntax (<string>, line 1)
Wildcards:
  sample=20-001-002
Traceback:
  File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 115, in run_jobs
  File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 120, in run
  File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 131, in _run
  File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 151, in printjob
  File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 137, in printjob
The line reported in the error corresponds to rule combine_tech_reps in my Snakefile. (I'm only showing the first part of my Snakefile here.)
from snakemake.utils import validate
import pandas as pd
import os.path
import glob

configfile: "config.yaml"

samples_df = pd.read_table(config["samples"], sep='\t')
samples_df = samples_df.set_index("Sample")
samples = list(samples_df.index.unique())

wildcard_constraints:
    sample = "|".join(samples)

def get_fast5(wildcards):
    f5 = glob.glob(os.path.join(config["raw_data"], wildcards.sample, "2*", "fast5_pass"))
    return f5

localrules: all, build_index

rule all:
    input:
        expand("results/Methylation/{sample}_frequency.tsv", sample=samples),
        expand("results/alignments/{sample}_flagstat.txt", sample=samples),
        expand("resources/QC/{sample}_pycoQC.json", sample=samples),
        expand("results/QC/{sample}_pycoQC.html", sample=samples),
        "report/multiQC.html"

rule combine_tech_reps:
    input:
        fqs = lambda wildcards: glob.glob(os.path.join(config["raw_data"], "{sample}", "2*", "{sample}_fastq_pass.gz").format(sample=wildcards.sample))
    output:
        fq = os.path.join(config["raw_data"], "{sample}", "{sample}_fastq_pass.gz")
    shell: """
        zcat {input} > {output}
        """
My slurm profile is the file ~/.config/snakemake/slurm/config.yaml:
jobs: 10
cluster: "sbatch -p talon -t {resources.time} --mem={resources.mem} -c {resources.cpus} -o logs_slurm/{rule}_{wildcards} -e logs_slurm/{rule}_{wildcards}"
default-resources: [cpus=1, mem=2000, time=10:00]
use-conda: true
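
One thing that might be worth checking in the profile above: the SyntaxError mentions `<string>, line 1`, which is what Python reports when a string is evaluated as code. Assuming Snakemake evaluates each default-resources value as a Python expression (an assumption here, not something the traceback confirms), `time=10:00` would fail to parse, while a quoted string would not. A minimal sketch of that idea, using a hypothetical helper `try_eval_resource` that is not part of Snakemake:

```python
# Sketch only: mimics evaluating each "name=value" default resource as a
# Python expression. This is an ASSUMPTION about the mechanism, not
# Snakemake's actual implementation.

def try_eval_resource(assignment):
    """Split 'name=expr' and try to evaluate expr as a Python expression.

    Returns (name, value, error), where error is None on success or the
    exception class name on failure.
    """
    name, _, expr = assignment.partition("=")
    try:
        # Empty builtins: we only want to evaluate plain literals here.
        return name, eval(expr, {"__builtins__": {}}), None
    except SyntaxError as err:
        return name, None, type(err).__name__


# The three values from the profile, plus a quoted variant of the time value.
for item in ["cpus=1", "mem=2000", "time=10:00", 'time="10:00"']:
    print(try_eval_resource(item))
```

If that assumption holds, quoting the value in the profile (for example `time="10:00"`) or giving the time as a plain number of minutes would be things to try.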
I'd really like to use this pipeline on our HPC, but I'm not sure what's causing this error.