13
votes

Even though all the output files of my Snakemake build already exist, Snakemake wants to rerun the entire pipeline just because I modified one of the first input files or an intermediary output file.

I found this out by doing a Snakemake dry run with -n, which gave the following reason for the updated input file:

Reason: Updated input files: input-data.csv

and this message for updated intermediary files:

reason: Input files updated by another job: intermediary-output.csv

How can I force Snakemake to ignore the file update?


2 Answers

10
votes

You can use the option --touch to mark them up to date:

--touch, -t
Touch output files (mark them up to date without really changing them) instead of running their commands. This is used to pretend that the rules were executed, in order to fool future invocations of snakemake. Fails if a file does not yet exist.

Beware that this will touch all of your output files, modifying their timestamps to put them back in order.

8
votes

In addition to Eric's answer, see also the ancient() flag, which makes Snakemake ignore the timestamps of the input files it wraps.
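
As a rough sketch of how ancient() might be applied to the files from the question: the rule name `process`, the `cp` command, and the temporary directory are hypothetical stand-ins, not taken from the original pipeline. The snippet only writes a minimal Snakefile; the dry run is left commented out because it needs Snakemake installed.

```shell
# Hypothetical scratch directory so nothing in a real project is overwritten.
ancient_demo=$(mktemp -d)

# Write a minimal Snakefile. The rule name and the cp command are placeholders.
cat > "$ancient_demo/Snakefile" <<'EOF'
rule process:
    input:
        # ancient() asks Snakemake to ignore this file's modification time
        ancient("input-data.csv")
    output:
        "intermediary-output.csv"
    shell:
        "cp {input} {output}"
EOF

# With Snakemake installed, a dry run can be used to check the effect:
# (cd "$ancient_demo" && snakemake -n)
```

With the input wrapped this way, a later edit to input-data.csv should no longer show up as "Updated input files" for that rule. Depending on the Snakemake version, other rerun triggers (for example changed code or parameters) may still apply.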

Also note that the Unix command touch can be used to modify the timestamp of an existing file and make it appear older than it actually is:

touch --date='2004-12-31 12:00:00' foo.txt 
ls -l foo.txt 
-rw-rw-r-- 1 db291g db291g 0 Dec 31  2004 foo.txt
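
Instead of picking an arbitrary date, one option is to copy the timestamp of an existing output onto the modified input with touch -r, so the input no longer looks newer. A minimal sketch, assuming the file names from the question and a throwaway temporary directory (both the directory and the example dates are made up for illustration):

```shell
# Hypothetical scratch directory holding stand-ins for the real files.
touch_demo=$(mktemp -d)

# Simulate an output built in 2020 and an input that was edited later, in 2021.
touch -t 202001011200 "$touch_demo/intermediary-output.csv"
touch -t 202101011200 "$touch_demo/input-data.csv"

# Copy the output's timestamp onto the input (-r uses a reference file).
touch -r "$touch_demo/intermediary-output.csv" "$touch_demo/input-data.csv"

# Check the result: the input should no longer be newer than the output.
if [ "$touch_demo/input-data.csv" -nt "$touch_demo/intermediary-output.csv" ]; then
    echo "input is newer than output"
else
    echo "input is not newer than output"
fi
```

This changes the timestamp of the input file itself, so any other tool that relies on that file's modification time will be affected as well.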