I am trying to understand Cassandra by playing with a public dataset.
I inserted 1.5M rows from a CSV file into a table on my local Cassandra instance. The keyspace was created WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }.
The table was created with one field as the partition key and one more field as part of the primary key (a clustering column).
I got confirmation that 1.5M rows were processed and that the COPY completed.
But when I run SELECT or SELECT COUNT(*) on the table, I always get a maximum of 182 rows. Secondly, the number of records returned when the key includes clustering columns seems to be higher than with a single-column key, which does not make sense to me. What am I missing about Cassandra's architecture and the way it handles queries?
Lastly, I also tried reading the same Cassandra table from the pyspark shell, and it reads 182 rows too.
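
One check that might narrow this down: as far as I understand, Cassandra treats an insert whose primary key already exists as an overwrite, so the table can never hold more rows than there are distinct primary key values in the CSV. Below is a minimal sketch for counting those values. The column names `station` and `day` and the sample rows are made up for illustration and are not from my actual dataset; the real file would be opened with `open(path, newline="")` instead of `io.StringIO`.

```python
import csv
import io


def count_rows_and_distinct_keys(lines, key_columns):
    """Return (total_rows, distinct_key_count) for CSV input with a header row.

    `lines` is any iterable of text lines (an open file or io.StringIO).
    `key_columns` lists the columns that make up the table's primary key.
    """
    reader = csv.DictReader(lines)
    seen = set()
    total = 0
    for row in reader:
        total += 1
        # Rows sharing this tuple would collapse into one row in the table.
        seen.add(tuple(row[col] for col in key_columns))
    return total, len(seen)


# Hypothetical sample data: 4 CSV rows, two of which share (station, day).
sample = io.StringIO(
    "station,day,value\n"
    "A,1,10\n"
    "A,1,11\n"
    "A,2,12\n"
    "B,1,13\n"
)
total, distinct = count_rows_and_distinct_keys(sample, ["station", "day"])
print(total, distinct)
```

If the distinct count for my real key columns comes out at 182, the missing rows were overwritten rather than lost. Running the same function with only the partition key column, and then with the partition key plus the clustering column, should also show why a key with more columns keeps more rows.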