I'm not quite sure I'm clear on the queries you are referring to, but let's say the situation is something like this:
Doc A: Name = "Carlos Fernando Luís Maria Víctor Miguel Rafael Gabriel Gonzaga Xavier Francisco de Assis José Simão de Bragança, Sabóia Bourbon e Saxe-Coburgo-Gotha"
Doc B: Name = "Tomás António Gonzaga"
If you search for "gonzaga", Doc B will be given the higher score. There is one match in each name, but Doc B's name is much shorter (only three terms), and matches in shorter fields are weighted more heavily. This is the lengthNorm referred to in the TFIDFSimilarity documentation.
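To put rough numbers on that length effect, here is a minimal sketch in plain Java (no Lucene dependency). It assumes the classic TF-IDF norm of 1 / sqrt(number of terms in the field), with no index-time boost and no stop word removal. The class and method names are made up for illustration, and Lucene stores norms in a lossy compressed form, so the values in a real index will only be approximately these.

```java
class LengthNormSketch {
    // Assumed classic norm: 1 / sqrt(number of terms in the field).
    // Lucene encodes this lossily, so real stored values are only close to this.
    public static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    public static void main(String[] args) {
        int termsInDocA = 23; // terms in Doc A's name, assuming nothing is dropped as a stop word
        int termsInDocB = 3;  // tomas, antonio, gonzaga

        System.out.println("Doc A norm: " + lengthNorm(termsInDocA));
        System.out.println("Doc B norm: " + lengthNorm(termsInDocB));
    }
}
```

Under those assumptions Doc B's norm comes out at roughly 0.58 against roughly 0.21 for Doc A, so the single "gonzaga" match counts for nearly three times as much in the shorter field.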
There are other factors though. If we just chuck each name into the queryparser, and see what comes up, something like:
Query queryA = queryparser.parse(docA.name);
Query queryB = queryparser.parse(docB.name);
Then the generated queries are very different:
name:carlos name:fernando name:luis name:maria name:victor name:miguel name:rafael name:gabriel name:gonzaga name:xavier name:francisco name:de name:assis name:jose name:simao name:de name:braganca name:saboia name:bourbon name:e name:saxe name:coburgo name:gotha
vs
name:tomas name:antonio name:gonzaga
There is a wealth of reasons why these would generate different scores: the lengthNorm discussed above; the coord factor, which boosts results that match more of the query's terms and would very likely come into play here; tf, which weights documents with more matches for a term more heavily; idf, which favors terms that appear less frequently across the entire index; and so on.
Scores are only meaningful within the result set of a single query run. A change to the query or to the state of the index can lead to different scores, and scores are not intended to be comparable across queries. You can use IndexSearcher.explain to understand how a score was calculated.
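As a rough illustration of how those factors behave, here is a plain-Java sketch (again no Lucene dependency) of the formulas as described for the classic TF-IDF similarity: tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1)), and coord = overlap / maxOverlap. The class name, method names, and the 1000-document index are hypothetical. The exact idf formula and whether coord exists at all depend on the Lucene version, so treat this as an approximation rather than what your index actually computes.

```java
class ScoreFactorsSketch {
    // tf: grows with the number of times the term occurs in the field
    public static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // idf: terms that appear in fewer documents get a larger value
    public static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    // coord: fraction of the query's clauses that the document matched
    public static double coord(int overlap, int maxOverlap) {
        return overlap / (double) maxOverlap;
    }

    public static void main(String[] args) {
        // queryA has 23 clauses, queryB has 3; "gonzaga" is the only shared term
        System.out.println("coord(Doc A, queryA) = " + coord(23, 23));
        System.out.println("coord(Doc B, queryA) = " + coord(1, 23));
        System.out.println("coord(Doc A, queryB) = " + coord(1, 3));
        System.out.println("coord(Doc B, queryB) = " + coord(3, 3));

        // Hypothetical index of 1000 documents: a rare term vs. a common one
        System.out.println("idf(rare term)   = " + idf(2, 1000));
        System.out.println("idf(common term) = " + idf(500, 1000));
    }
}
```

The coord lines show why the two queries treat the same pair of documents so differently: Doc B matches only 1 of queryA's 23 clauses, while it matches all 3 clauses of queryB.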