An entity is a primary unit representing the interface to a design specification implemented in an architecture, a secondary unit. An entity and its selected architecture together implement a block, which can be instantiated by component, direct entity instantiation, or configuration.
Is it possible to simulate something without an entity? Not with a VHDL simulator. It's possible to have an empty entity:
entity foo is
end entity;
architecture fum of foo is
begin
end architecture;
(This should actually analyze; there's no requirement that an architecture have any concurrent statements either.)
A testbench is generally an entity providing no interfaces (or sometimes just a generic interface), and an architecture that contains concurrent statements (and processes are concurrent statements).
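As a minimal sketch of that arrangement (the names inverter, inverter_tb, rtl and sim are made up for illustration, and everything is assumed to be analyzed into library work), a testbench with no interface can instantiate a design that does have one:

```vhdl
-- Hypothetical design under test: it has an interface and could be synthesized.
entity inverter is
    port (a : in bit; y : out bit);
end entity;

architecture rtl of inverter is
begin
    y <= not a;
end architecture;

-- Testbench: the entity declares no ports and no generics.
entity inverter_tb is
end entity;

architecture sim of inverter_tb is
    signal a, y : bit;
begin
    -- Direct entity instantiation, so no component declaration is needed.
    dut : entity work.inverter(rtl)
        port map (a => a, y => y);

    -- A process is a concurrent statement.
    stimulus : process
    begin
        a <= '0';
        wait for 10 ns;
        assert y = '1' report "expected y = '1' when a = '0'" severity error;
        a <= '1';
        wait for 10 ns;
        assert y = '0' report "expected y = '0' when a = '1'" severity error;
        wait;  -- suspend forever so the simulation can run out of events
    end process;
end architecture;
```

Simulating inverter_tb should finish quietly if both assertions hold.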
So, you need an entity, but it doesn't need to provide an interface. An entity declaration also provides a declarative region where you can supply declarations needed by declarations in the architecture body's declarative region.
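A small sketch of using the entity's declarative region, with hypothetical names (decl_tb, CLK_PERIOD, half_period_t) chosen only for illustration. The constant and subtype declared in the entity are visible in the architecture's declarations:

```vhdl
entity decl_tb is
    -- Entity declarative part: no ports, but declarations are allowed here.
    constant CLK_PERIOD : time := 10 ns;
    subtype half_period_t is time range 0 ns to CLK_PERIOD;
end entity;

architecture sim of decl_tb is
    -- These declarations depend on the ones made in the entity.
    constant HALF_PERIOD : half_period_t := CLK_PERIOD / 2;
    signal clk : bit := '0';
begin
    clock : process
    begin
        -- Toggle the clock a few times, then stop scheduling events.
        for i in 1 to 4 loop
            clk <= not clk;
            wait for HALF_PERIOD;
        end loop;
        wait;
    end process;
end architecture;
```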
It's called a testbench because it has no practical value in hardware. It's only useful for testing one or more design specifications that do have interfaces (and can be synthesized) or for executing a VHDL design specification (your multiple processes in an architecture body) on a simulator.
The length of simulation time a collection of processes can run is bounded by two things. First, there is no standard for how many delta simulation cycles can run without advancing simulation time.
There are simulators that can execute an unlimited number of delta cycles. Early VHDL simulators had consecutive delta cycle limits measured in hundreds. It's common to use a delta cycle limit of 5,000 (Modelsim's default).
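To see where a delta cycle limit comes into play, here is a sketch (the entity name delta_loop is made up) of a model that never advances simulation time. What a given simulator does with it, stopping at an iteration limit or running indefinitely, is tool-dependent:

```vhdl
entity delta_loop is
end entity;

architecture sim of delta_loop is
    signal a : bit := '0';
begin
    -- No delay in the waveform: every update of a schedules another
    -- update one delta cycle later, all at simulation time 0.
    a <= not a;
end architecture;
```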
The other limit is the range of type Time, which some simulators can optionally extend by setting a resolution limit that constrains the smallest unit of Time allowed to appear in a design description, in effect scaling Time to run longer at a coarser granularity.
If Time advances in a simulation, it will eventually run up against Time'HIGH and stop. Implementing a resolution limit is optional for an implementation.
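If you want to know what that bound is on a particular simulator, a sketch like the following (the entity name time_limit is hypothetical) reports it. The value printed depends on the implementation and on any resolution limit in effect:

```vhdl
entity time_limit is
end entity;

architecture sim of time_limit is
begin
    process
    begin
        -- Time'HIGH is the largest value of type Time for this implementation.
        report "Time'HIGH = " & time'image(time'high);
        wait;  -- suspend forever
    end process;
end architecture;
```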
In effect, trying to write the equivalent of a daemon in VHDL is guaranteed to be non-portable. VHDL isn't a general-purpose parallel programming language, wiki articles stating the contrary aside.
It's also possible to write a VHDL design specification that leaks memory. There is no requirement for garbage collection.
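A sketch of such a leak, using made-up names (leak, int_ptr). Each pass through the process allocates a new object and drops the only reference to the previous one without calling deallocate, so whether that storage is ever reclaimed is up to the simulator:

```vhdl
entity leak is
end entity;

architecture sim of leak is
begin
    process
        type int_ptr is access integer;
        variable p : int_ptr;
    begin
        -- Overwriting p makes the previously allocated integer unreachable.
        -- Calling deallocate(p) before this assignment would release it.
        p := new integer'(0);
        wait for 1 ns;
    end process;
end architecture;
```

Left running, this keeps allocating until simulation stops or memory runs out.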