Yes, Alloy is a falsifier. Alloy's primary novelty when it was introduced 20 years ago was to argue that falsification was often more important than verification, since most designs are not correct, so the role of an analyzer should be to find the errors, not to show that they are not present. For a discussion of this issue, see Section 1.4, Verification vs. Refutation in Software analysis: A roadmap (Jackson and Rinard, 2000); Section 5.1.1, Instance Finding and Undecidability Compromises in Software Abstractions (Jackson 2006).
In Alloy's case though, there's another aspect, which is the argument that scope-complete analysis is actually quite effective from a verification standpoint. This claim is what we called the "small scope hypothesis" -- that most bugs can be found in small scopes (that is analyses that are bounded by a small fixed number of elements in each basic type).
BTW, Alloy was one of the earliest tools to suggest using SAT for bounded verification. See, for example, Boolean Compilation of Relational Specifications (Daniel Jackson, 1998), a tech report that was known to the authors of the first bounded model checking paper, which discusses Alloy's predecessor, Nitpick, in the following terms:
The hypothesis underlying Nitpick is a controversial one. It is that,
in practice, small scopes suffice. In other words, most errors can be
demonstrated by counterexamples within a small scope. This is a purely
empirical hypothesis, since the relevant distribution of errors cannot
be described mathematically: it is determined by the specifications
people write.
Our hope is that successful use of the Nitpick tool will justify the
hypothesis. There is some evidence already for its plausibility. In
our experience with Nitpick to date, we have not gained further
information by increasing the scope beyond 6.
A similar notion of scope is implicit in the context of model checking
of hardware. Although the individual state machines are usually
finite, the design is frequently parameterized by the number of
machines executing in parallel. This metric is analogous to scope; as
the number of machines increases, the state space increases
exponentially, and it is rarely possible to analyze a system involving
more than a handful of machines. Fortunately, however, it seems that
only small configurations are required to find errors. The celebrated
analysis of the Futurebus+ cache protocol [C+95], which perhaps marked
the turning point in model checking’s industrial reputation, was
performed for up to 8 processors and 3 buses. The reported flaws,
however, could be demonstrated with counterexamples involving at most
3 processors and 2 buses.