I wish to write several rules that extract the contents of tar archives to produce a number of files that are then used as input dependencies for other rules. I wish this to work even with parallel builds. I'm not using recursive make.
First up, sorry for the marathon question, but I don't think I can explain it well in a shorter form.
Think of untarring a collection of source files and then compiling them with rules stored outside of the archive to produce various build artefacts that are then, in turn, used further. I am not looking for alternative arrangements that sidestep this problem. Just take it for granted that I have good reason to do this. :)
I'll demonstrate my issue with a contrived example. Of course, I started with something basic:
    TAR := test.tar.bz2
    CONTENTS := $(addprefix out/,$(filter-out %/,$(shell tar -tf $(TAR))))

    out: $(TAR)
    	rm -rf out
    	mkdir out
    	tar -xvf $< -C out --touch || (rm -rf out; exit 1)

    $(CONTENTS): out

    sums: $(CONTENTS)
    	md5sum $^ > $@

    .DELETE_ON_ERROR:
    .DEFAULT_GOAL := all
    .PHONY: all clean

    all: sums

    clean:
    	rm -rf out sums
The thinking here is that since $(CONTENTS) are all of the files in the archive, and they all depend on out, then to run the sums target we need to end up extracting the archive.
Unfortunately, this doesn't (always) work in a parallel build that follows an earlier build when only test.tar.bz2 has been updated. make may check the timestamps of $(CONTENTS) before running the out rule, conclude that each of the sources is older than sums, and decide there is nothing to do:
    $ make clean
    rm -rf out sums
    $ make -j6
    rm -rf out
    mkdir out
    tar -xvf test.tar.bz2 -C out --touch || (rm -rf out; exit 1)
    data.txt
    file
    weird.file.name
    dir/
    dir/another.c
    dir/more
    md5sum out/data.txt out/file out/weird.file.name out/dir/another.c out/dir/more > sums
    $ touch test.tar.bz2
    $ make -j6
    rm -rf out
    mkdir out
    tar -xvf test.tar.bz2 -C out --touch || (rm -rf out; exit 1)
    data.txt
    file
    weird.file.name
    dir/
    dir/another.c
    dir/more
Oops! The sums rule didn't run!
So, the next attempt was to tell make that the one untar rule actually does make all the $(CONTENTS) directly. This seems better since we're telling make what's really going on, so it knows when to forget any cached timestamps for targets when they are remade through their rule.
First, let's look at what seems to work, and then I'll get to my problem:
    TAR := test.tar.bz2
    CONTENTS := $(addprefix out/,$(filter-out %/,$(shell tar -tf $(TAR))))

    # Here's the change.
    $(addprefix %/,$(patsubst out/%,%,$(CONTENTS))): $(TAR)
    	rm -rf out
    	mkdir out
    	tar -xvf $< -C out --touch || (rm -rf out; exit 1)

    sums: $(CONTENTS)
    	md5sum $^ > $@

    .DELETE_ON_ERROR:
    .DEFAULT_GOAL := all
    .PHONY: all clean

    all: sums

    clean:
    	rm -rf out sums
In this case, we've effectively got a rule that says:
    %/data.txt %/file %/weird.file.name %/dir/another.c %/dir/more: test.tar.bz2
    	rm -rf out
    	mkdir out
    	tar -xvf $< -C out --touch || (rm -rf out; exit 1)
Now you can see one of the reasons I forced the output into an out directory: to give me a place to use the % so I could use a pattern rule. I am forced to use a pattern rule even though there isn't a strong pattern here because it is the only way make can be told that one rule creates multiple output files from a single invocation. (Isn't it?)
This works if any of the files are touched (not important for my use case) or if the test.tar.bz2 file is touched, even in parallel builds, because make has the information it needs: running this recipe makes all these files and will change all their timestamps.
For example, after a previous successful build:
    $ touch test.tar.bz2
    $ make -j6
    rm -rf out
    mkdir out
    tar -xvf test.tar.bz2 -C out --touch || (rm -rf out; exit 1)
    data.txt
    file
    weird.file.name
    dir/
    dir/another.c
    dir/more
    md5sum out/data.txt out/file out/weird.file.name out/dir/another.c out/dir/more > sums
So, if I have a working solution, what's my problem?
Well, I have many of these archives to extract, each with their own set of $(CONTENTS). I can manage that, but the trouble comes in writing a nice pattern rule. Since each archive needs its own rule defined, the patterns for each rule must not overlap even if the archives have similar (or identical) content. That means the output paths for the extracted files must be made unique for each archive, as in:
    TAR := test.tar.bz2
    CONTENTS := $(addprefix out.$(TAR)/,$(filter-out %/,$(shell tar -tf $(TAR))))

    $(patsubst out.$(TAR)/%,out.\%/%,$(CONTENTS)): $(TAR)
    	rm -rf out.$(TAR)
    	mkdir out.$(TAR)
    	tar -xvf $< -C out.$(TAR) --touch || (rm -rf out.$(TAR); exit 1)

    sums: $(CONTENTS)
    	md5sum $^ > $@

    .DELETE_ON_ERROR:
    .DEFAULT_GOAL := all
    .PHONY: all clean

    all: sums

    clean:
    	rm -rf out.$(TAR) sums
So, this can be made to work with the right target-specific variables, but it now means that the extraction points are all "ugly" in a way that is very specifically tied to how the makefile is constructed:
    $ make -j6
    rm -rf out.test.tar.bz2
    mkdir out.test.tar.bz2
    tar -xvf test.tar.bz2 -C out.test.tar.bz2 --touch || (rm -rf out.test.tar.bz2; exit 1)
    data.txt
    file
    weird.file.name
    dir/
    dir/another.c
    dir/more
    md5sum out.test.tar.bz2/data.txt out.test.tar.bz2/file out.test.tar.bz2/weird.file.name out.test.tar.bz2/dir/another.c out.test.tar.bz2/dir/more > sums
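To make the overlap constraint mentioned earlier concrete, here is a minimal sketch of what goes wrong without unique output paths. The archive names a.tar.bz2 and b.tar.bz2 and the directories outA and outB are hypothetical, and both archives are assumed to contain data.txt and file. Assuming GNU Make's documented implicit rule search (rules matching with the same stem are considered in makefile order, and the first one whose prerequisites exist wins), the second rule should never be selected while a.tar.bz2 exists:

```makefile
# Hypothetical setup: a.tar.bz2 and b.tar.bz2 both contain data.txt and file.
all: outA/data.txt outB/data.txt

# First pattern rule: matches outA/data.txt AND outB/data.txt (stem = directory).
%/data.txt %/file: a.tar.bz2
	rm -rf outA
	mkdir outA
	tar -xf $< -C outA --touch

# Identical target patterns with a different prerequisite. Make keeps this as
# a separate rule, but for outB/data.txt it considers the rule above first and
# picks it because a.tar.bz2 exists, so this recipe is not expected to run.
%/data.txt %/file: b.tar.bz2
	rm -rf outB
	mkdir outB
	tar -xf $< -C outB --touch
```

Making the literal part of each pattern unique per archive, as in the out.$(TAR) variant, is what keeps the rules from competing for the same targets.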
The next natural step I took was to try to combine static pattern rules with the multiple-targets-via-pattern-rule approach. This would let me keep the patterns very general, but limit their application to a specific set of targets:
    TAR := test.tar.bz2
    CONTENTS := $(addprefix out/,$(filter-out %/,$(shell tar -tf $(TAR))))

    # Same as second attempt, except "$(CONTENTS):" static pattern prefix
    $(CONTENTS): $(addprefix %/,$(patsubst out/%,%,$(CONTENTS))): $(TAR)
    	rm -rf out
    	mkdir out
    	tar -xvf $< -C out --touch || (rm -rf out; exit 1)

    sums: $(CONTENTS)
    	md5sum $^ > $@

    .DELETE_ON_ERROR:
    .DEFAULT_GOAL := all
    .PHONY: all clean

    all: sums

    clean:
    	rm -rf out sums
Great! Except it doesn't work:
    $ make
    Makefile:5: *** multiple target patterns. Stop.
    $ make --version
    GNU Make 4.0
So, is there a way to use multiple target patterns with a static pattern rule? If not, is there another way to achieve what I have in the last working example above, but without the constraint on the output paths to make unique patterns? I basically need to tell make "when you unpack this archive, all of the files in this directory (which I am willing to enumerate if necessary) have new timestamps". A solution where I can force make to restart if and only if it unpacks an archive would also be acceptable, but less ideal.
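For reference, the semantics being asked for ("one recipe run updates all of these explicitly listed files") correspond to what GNU Make 4.3 and later call grouped targets, written with `&:`. This is only a sketch of the intent: it assumes a newer make than the 4.0 shown above, where the `&:` separator is not available, so it does not solve the problem for that version.

```makefile
TAR := test.tar.bz2
CONTENTS := $(addprefix out/,$(filter-out %/,$(shell tar -tf $(TAR))))

# "&:" (GNU Make 4.3+ only) declares $(CONTENTS) as a grouped target:
# a single run of the recipe is treated as updating every listed file.
$(CONTENTS) &: $(TAR)
	rm -rf out
	mkdir out
	tar -xvf $< -C out --touch || (rm -rf out; exit 1)

sums: $(CONTENTS)
	md5sum $^ > $@

.DELETE_ON_ERROR:
```

Because the targets are explicit rather than patterns, nothing here constrains the output directory names, which is the property the pattern-rule workaround lacks.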