0
votes

I'm creating a pipeline with Snakemake to call methylation in nanopore sequencing data. When I run snakemake with the --dryrun option, the DAG is constructed successfully. But when I add the option --profile slurm, I get the following error:

(nanopolish) [danielle.perley@talonhead2 nanopolish-CpG-calling]$ snakemake -np --use-conda --profile slurm test_data/20-001-002/20-001-002_fastq_pass.gz

Building DAG of jobs...
Job counts:
    count   jobs
    1   combine_tech_reps
    1
InputFunctionException in line 32 of /home/danielle.perley/nanopolish-CpG-calling/Snakefile:
Error:
  SyntaxError: invalid syntax (<string>, line 1)
Wildcards:
  sample=20-001-002
Traceback:

  File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 115, in run_jobs
  File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 120, in run
  File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 131, in _run
  File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 151, in printjob
  File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 137, in printjob

Line 33 is rule combine_tech_reps in my Snakefile (I'm only showing the first part of the Snakefile here):

from snakemake.utils import validate
import pandas as pd
import os.path
import glob

configfile: "config.yaml"

samples_df = pd.read_table(config["samples"],sep = '\t')
samples_df = samples_df.set_index("Sample")
samples = list(samples_df.index.unique())


wildcard_constraints:
    sample = "|".join(samples)
         
def get_fast5(wildcards):
      
    f5 = glob.glob(os.path.join(config["raw_data"],wildcards.sample,"2*","fast5_pass"))
    return(f5)

localrules: all,build_index

rule all:
    input: 
        expand("results/Methylation/{sample}_frequency.tsv",sample=samples),
        expand("results/alignments/{sample}_flagstat.txt",sample=samples),
        expand("resources/QC/{sample}_pycoQC.json",sample=samples),
        expand("results/QC/{sample}_pycoQC.html",sample=samples),
        "report/multiQC.html"


rule combine_tech_reps:
    input:
        fqs = lambda wildcards: glob.glob(os.path.join(config["raw_data"],"{sample}","2*","{sample}_fastq_pass.gz").format(sample=wildcards.sample))

    output:
        fq = os.path.join(config["raw_data"],"{sample}","{sample}_fastq_pass.gz")

    shell: """
        zcat {input} > {output}
    """

I have a slurm profile config at ~/.config/snakemake/slurm/config.yaml:

jobs: 10
cluster: "sbatch -p talon -t {resources.time} --mem={resources.mem} -c {resources.cpus} -o logs_slurm/{rule}_{wildcards} -e logs_slurm/{rule}_{wildcards}"
default-resources: [cpus=1, mem=2000, time=10:00]
use-conda: true

I'd really like to use this pipeline on our HPC, but I'm not sure what's causing this error.


2 Answers

1
votes

I was able to solve my problem with the help of this post:

InputFunctionException: unexpected EOF while parsing

By adding the verbose flag:

snakemake -np --verbose --use-conda --profile slurm test_data/20-001-002/20-001-002_fastq_pass.gz

I could see that snakemake was failing to parse the time value in the default resources:

10:00
   ^
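
As far as I can tell, snakemake evaluates each default-resources entry as a Python expression, and 10:00 is not valid Python, which matches the SyntaxError: invalid syntax (<string>, line 1) in the traceback. A minimal illustration of that parse failure, assuming the value really is passed through eval:

# Assumption: snakemake evaluates each default-resources value as a Python expression
eval("600")     # fine: a plain integer expression
eval("10:00")   # SyntaxError: invalid syntax (<string>, line 1)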

Changing the default-resources line of my profile config.yaml to:

default-resources: [cpus=1, mem=2000, time=600]

removed the error.
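
For completeness, the whole profile config.yaml would then presumably look like this (only the default-resources line has changed; everything else is as in the question):

jobs: 10
cluster: "sbatch -p talon -t {resources.time} --mem={resources.mem} -c {resources.cpus} -o logs_slurm/{rule}_{wildcards} -e logs_slurm/{rule}_{wildcards}"
default-resources: [cpus=1, mem=2000, time=600]
use-conda: true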

0
votes

I am not sure if default-resources is a valid key in the config.

What happens if you try this as config.yaml:

jobs: 10
cluster: "sbatch -p talon -t {resources.time} --mem={resources.mem} -c {resources.cpus} -o logs_slurm/{rule}_{wildcards} -e logs_slurm/{rule}_{wildcards}"
use-conda: true

__default__:
    time: 10
    cpus: 1
    mem: 2GB
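
As far as I know, __default__ is the convention from the older --cluster-config mechanism rather than a profile key, so it may need to live in a separate file passed with --cluster-config, with the sbatch command referencing {cluster.*} instead of {resources.*}. A sketch under that assumption (cluster.yaml is a hypothetical file name):

# cluster.yaml (hypothetical file, passed via --cluster-config)
__default__:
    time: 10
    cpus: 1
    mem: 2GB

and in the profile config.yaml:

jobs: 10
cluster: "sbatch -p talon -t {cluster.time} --mem={cluster.mem} -c {cluster.cpus}"
cluster-config: "cluster.yaml"
use-conda: true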