You should look at your Solr logs to see if there's anything about "duplicate" documents, or just go look in the solrconfig.xml file for the core into which you are pushing the documents. There is likely a "dedupe" update.chain being applied by the update handler, and the fields it uses may be causing documents that are duplicates (based on a few fields) to be dropped. You'll see something like this:
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="update.chain">dedupe</str>  <!-- change "dedupe" to "uuid", or comment out this line -->
    <str name="config">dih-config.xml</str>
  </lst>
</requestHandler>
and, later in the file, the definition of the "dedupe" update.chain:
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">id</str>
    <bool name="overwriteDupes">true</bool>
    <str name="fields">url,date,rawline</str>  <!-- the fields used to determine uniqueness -->
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
The "fields" element is what will select which input data is used to determine the uniqueness of the record. Of course, if you know there's no duplication in your input data, this is not the issue. But the above configuration will throw out any records which are duplicate on the fields shown.
You may not be using the dataimport requestHandler but rather the "update" requestHandler; I'm not sure which one Nutch uses. Either way, you can simply comment out the update.chain, change it to a different processorChain such as "uuid" (a sketch of what that could look like is below), or add more fields to the "fields" declaration.
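For illustration, a minimal sketch of a "uuid" chain and of pointing the "/update" handler at it. The chain name, the field name, and the handler class here are assumptions (the handler class in particular varies between Solr versions), so adjust them to your setup:

<updateRequestProcessorChain name="uuid">
  <!-- generate a UUID for the id field when a document arrives without one,
       instead of computing a signature from other fields -->
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">uuid</str>
  </lst>
</requestHandler>

With a chain like this in place, documents are no longer collapsed by signature, so any duplicates in your input will all be indexed.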