A single thread reads from a single partition. To read from multiple partitions, spawn multiple threads and have each one read from its own partition. Run them as separate threads; otherwise you lose the benefit of having partitions and your throughput will take a hit.
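As a rough illustration of the one-thread-per-partition idea, here is a minimal sketch that uses only the Java standard library. The partitions are simulated with in-memory lists, and the class and method names (`PartitionWorkers`, `consumeAll`) are made up for this example; a real consumer would read from Kafka instead.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PartitionWorkers {
    // Hypothetical helper: drains each simulated partition on its own thread
    // and returns how many messages were handled per partition.
    static Map<Integer, Integer> consumeAll(Map<Integer, List<String>> partitions) {
        // One thread per partition (at least one, so an empty map is still valid).
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        Map<Integer, Future<Integer>> pending = new HashMap<>();
        for (Map.Entry<Integer, List<String>> entry : partitions.entrySet()) {
            final List<String> messages = entry.getValue();
            Callable<Integer> worker = () -> {
                int handled = 0;
                for (String message : messages) {
                    // Real message processing would go here.
                    handled++;
                }
                return handled;
            };
            pending.put(entry.getKey(), pool.submit(worker));
        }
        Map<Integer, Integer> counts = new TreeMap<>();
        try {
            for (Map.Entry<Integer, Future<Integer>> entry : pending.entrySet()) {
                counts.put(entry.getKey(), entry.getValue().get());
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> partitions = new HashMap<>();
        partitions.put(0, Arrays.asList("a", "b", "c"));
        partitions.put(1, Arrays.asList("d", "e"));
        System.out.println(consumeAll(partitions));
    }
}
```

Each worker owns exactly one partition, so no two threads in this process ever touch the same one.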
To start with, you can run all consumers on one machine. Eventually, though, you will have to spread consumption across different machines. At that point you need to ensure that each partition is processed only once. Concretely, the problem you need to solve is two threads (on different machines) trying to read from the same partition. At any given time, only one of them must be allowed to process it.
Additionally, you need to manage offsets: you have to flush them to ZooKeeper at regular intervals.
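To give a feel for what flushing offsets at a regular interval involves, here is a small standard-library sketch. `OffsetFlusher` and `OffsetStore` are hypothetical names; `OffsetStore` is only a stand-in for whatever writes to ZooKeeper, not a real Kafka or ZooKeeper API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class OffsetFlusher {
    // Hypothetical stand-in for the durable store (ZooKeeper in the setup above).
    interface OffsetStore {
        void save(int partition, long offset);
    }

    private final Map<Integer, Long> latest = new ConcurrentHashMap<>();
    private final OffsetStore store;
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    OffsetFlusher(OffsetStore store) {
        this.store = store;
    }

    // Called by a consumer thread after it has processed a message.
    void record(int partition, long offset) {
        latest.put(partition, offset);
    }

    // Writes the most recently recorded offset of every partition to the store.
    void flush() {
        for (Map.Entry<Integer, Long> entry : latest.entrySet()) {
            store.save(entry.getKey(), entry.getValue());
        }
    }

    // Starts flushing every intervalMs milliseconds.
    void start(long intervalMs) {
        timer.scheduleAtFixedRate(this::flush, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
    }

    // Stops the timer, waits briefly for a running flush, then flushes one last time.
    void stop() {
        timer.shutdown();
        try {
            timer.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        flush();
    }

    public static void main(String[] args) {
        OffsetFlusher flusher = new OffsetFlusher(
                (partition, offset) -> System.out.println("partition " + partition + " -> " + offset));
        flusher.start(100);
        flusher.record(0, 42L);
        flusher.record(1, 7L);
        flusher.stop();
    }
}
```

Anything processed after the last flush is read again after a crash, so the interval is a trade-off between write load on the store and the amount of reprocessing you can tolerate.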
I suggest you use the High Level Consumer. It is much easier to use than the Simple Consumer: it coordinates the different threads that access the same partition and manages offsets on its own.
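For orientation, here is a sketch of the settings such a consumer typically needs, assuming the ZooKeeper-based High Level Consumer from the Kafka 0.8 line. The property keys are the ones that consumer reads; the class name, ZooKeeper address, group name, and commit interval are placeholders chosen for this example.

```java
import java.util.Properties;

public class HighLevelConsumerConfig {
    // Builds the properties that would be passed to kafka.consumer.ConsumerConfig.
    static Properties build(String zookeeper, String groupId) {
        Properties props = new Properties();
        // ZooKeeper ensemble used for coordination and offset storage.
        props.setProperty("zookeeper.connect", zookeeper);
        // Consumers sharing a group id divide the partitions among themselves.
        props.setProperty("group.id", groupId);
        // Let the consumer commit offsets itself, at this interval (placeholder value).
        props.setProperty("auto.commit.enable", "true");
        props.setProperty("auto.commit.interval.ms", "1000");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address and group name.
        Properties props = build("localhost:2181", "example-group");
        props.list(System.out);
    }
}
```

Running the same group id on several machines is what lets the consumer handle the "one partition, one reader" problem for you.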