First, I'm not entirely sure how clipping works. I assume it cuts away the parts of the geometry that lie outside what the viewer can see, but I don't know how this works in practice. Does it happen before or after primitive assembly?
The official documentation says this:
The purpose of the primitive assembly step is to convert a vertex stream into a sequence of base primitives. For example, a primitive which is a line list of 12 vertices needs to generate 11 line base primitives.
The full primitive assembly step (including the processing below) will always happen after Vertex Post-Processing. However, some Vertex Processing steps require that a primitive be decomposed into a sequence of base primitives. For example, a Geometry Shader operates on each input base primitive in the primitive sequence. Therefore, a form of primitive assembly must happen before the GS can execute.
This early primitive assembly only performs the conversion to base primitives. It does not perform any of the below processing steps.
Such early processing must happen if a Geometry Shader or Tessellation is active. The early assembly step for Tessellation is simplified, since Patch Primitives are always sequences of patches.
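To check my reading of "convert a vertex stream into a sequence of base primitives", here is a minimal sketch of how I picture that conversion for lines. It is not OpenGL code: `assemble_lines` and the mode strings are hypothetical names of my own, meant to mirror the `GL_LINES` and `GL_LINE_STRIP` interpretations. As far as I can tell, 12 vertices give 11 base lines only under the strip interpretation, and 6 when every pair is an independent line.

```python
def assemble_lines(vertices, mode):
    """Split a vertex stream into base line primitives (pairs of vertices).

    `mode` is a hypothetical stand-in for the OpenGL primitive type.
    """
    if mode == "lines":
        # Like GL_LINES: every 2 vertices form one independent line.
        # A trailing odd vertex cannot form a line and is ignored.
        usable = len(vertices) - len(vertices) % 2
        return [(vertices[i], vertices[i + 1]) for i in range(0, usable, 2)]
    if mode == "line_strip":
        # Like GL_LINE_STRIP: each vertex is connected to the next one.
        return [(vertices[i], vertices[i + 1]) for i in range(len(vertices) - 1)]
    raise ValueError(f"unknown mode: {mode}")


stream = list(range(12))  # 12 vertices, identified by index only
print(len(assemble_lines(stream, "lines")))       # 6
print(len(assemble_lines(stream, "line_strip")))  # 11
```

In this picture the input is one arbitrarily long vertex stream plus a mode, and the output is an ordered list of individual two-vertex lines.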
So there seem to be two forms of primitive assembly, and this is what confuses me.
First, when the vertex data is fed into the vertex shader for rendering, OpenGL has to interpret the stream of vertices as triangles, lines, and so on. I suppose this is what is meant by "rendering".
On the other hand, primitive assembly as quoted above does something very similar. What is the difference between the two processes?
The article on primitives says this:
The term Primitive in OpenGL is used to refer to two similar but separate concepts. The first is the interpretive scheme used by OpenGL to determine what a stream of vertices represents when being rendered. Such sequences of vertices can be arbitrarily long.
The other meaning of "Primitive" is as the result of the interpretation of a vertex stream, as part of Primitive Assembly. Therefore, processing a vertex stream by one of these primitive interpretations results in an ordered sequence of primitives. The individual primitives are sometimes called "base primitives".
Going by this quote, I can't see a real difference between the two supposedly separate concepts. The "interpretation step" can view, say, a sequence of 10 vertices as 8 dependent triangles. But so can the primitive assembly step, which views those "dependent triangles" as base primitives. What, concretely, is the difference between the two?
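To make the 10-vertex example concrete, this is a rough sketch of my current mental model, which may well be wrong. The function name `assemble_triangle_strip` is hypothetical, and the vertex swap on odd triangles is only my assumption about how a consistent winding order could be kept; I have not checked it against the exact ordering in the specification.

```python
def assemble_triangle_strip(indices):
    """Turn a triangle-strip vertex stream into a list of base triangles.

    Each base triangle is a tuple of three vertex indices.
    """
    triangles = []
    for i in range(len(indices) - 2):
        a, b, c = indices[i], indices[i + 1], indices[i + 2]
        # Assumption: swap the first two vertices of every odd triangle
        # so that all triangles keep the same winding order.
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


strip = list(range(10))  # the "interpretation": one strip of 10 vertices
base = assemble_triangle_strip(strip)
print(len(base))  # 8 base triangles
print(base[:2])   # [(0, 1, 2), (2, 1, 3)]
```

Here the strip itself (a vertex stream plus the rule for reading it) would be the first meaning of "primitive", and each three-vertex tuple in `base` would be a "base primitive" in the second meaning.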