Choosing between len() and count() depends on the situation, and it's worth understanding how they work in order to use them correctly.
Here are a few scenarios:
- (most crucial) When you only want to know the number of elements and do not plan to process them in any way, use count():
DO: queryset.count() - this performs a single SELECT COUNT(*) FROM some_table query; all computation is carried out on the RDBMS side, and Python only needs to retrieve the resulting number at a fixed O(1) cost
DON'T: len(queryset) - this performs a SELECT * FROM some_table query, fetching the whole table in O(N) and requiring an additional O(N) memory to store it. This is the worst thing you can do
- When you intend to fetch the queryset anyway, it's slightly better to use len(), which won't cause an extra database query as count() would:
len() (one db query):
    len(queryset)  # SELECT * fetching all the data - NO extra cost - data would be fetched anyway in the for loop
    for obj in queryset:  # data is already fetched by len() - using cache
        pass
count() (two db queries!):
    queryset.count()  # First db query: SELECT COUNT(*)
    for obj in queryset:  # Second db query (fetching data): SELECT *
        pass
Reversed second case (when the queryset has already been fetched):
    for obj in queryset:  # iteration fetches the data
        len(queryset)  # using already cached data - O(1), no extra cost
        queryset.count()  # using cache - O(1), no extra db query

    len(queryset)  # the same: O(1)
    queryset.count()  # the same: no query, O(1)
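To make the first scenario concrete at the SQL level, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The table name some_table comes from the queries above; the name column and the 1000 rows are made up for illustration.

```python
import sqlite3

# In-memory database with an illustrative table (schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO some_table (name) VALUES (?)",
    [(f"row {i}",) for i in range(1000)],
)

# Roughly what queryset.count() boils down to:
# the database does the counting, Python reads back a single number.
(count_in_db,) = conn.execute("SELECT COUNT(*) FROM some_table").fetchone()

# Roughly what len(queryset) boils down to on an unevaluated queryset:
# every row is transferred and held in memory just to be counted.
rows = conn.execute("SELECT * FROM some_table").fetchall()
count_in_python = len(rows)

print(count_in_db, count_in_python)  # 1000 1000
```

Both approaches give the same number, but the second one had to build a list of 1000 rows to get it.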
Everything will be clear once you take a glance "under the hood":
    class QuerySet(object):

        def __init__(self, model=None, query=None, using=None, hints=None):
            # (...)
            self._result_cache = None

        def __len__(self):
            self._fetch_all()
            return len(self._result_cache)

        def _fetch_all(self):
            if self._result_cache is None:
                self._result_cache = list(self.iterator())
            if self._prefetch_related_lookups and not self._prefetch_done:
                self._prefetch_related_objects()

        def count(self):
            if self._result_cache is not None:
                return len(self._result_cache)

            return self.query.get_count(using=self.db)
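The caching logic above can be tried out without Django. The following is a toy sketch, not Django's implementation: ToyQuerySet and its queries log are hypothetical names, and sqlite3 stands in for the real database backend. It only mirrors the _result_cache behaviour so that the number of queries in each scenario can be counted.

```python
import sqlite3


class ToyQuerySet:
    """Hypothetical stand-in for a QuerySet; mirrors only the caching logic."""

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table  # trusted, hard-coded name in this sketch
        self.queries = []  # log of the SQL statements actually executed
        self._result_cache = None

    def _run(self, sql):
        self.queries.append(sql)
        return self.conn.execute(sql).fetchall()

    def _fetch_all(self):
        # Hit the database only if nothing has been cached yet.
        if self._result_cache is None:
            self._result_cache = self._run(f"SELECT * FROM {self.table}")

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def count(self):
        # Reuse the cache when it exists, otherwise ask the database.
        if self._result_cache is not None:
            return len(self._result_cache)
        return self._run(f"SELECT COUNT(*) FROM {self.table}")[0][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO some_table (id) VALUES (?)", [(i,) for i in range(5)])

# len() first, then iterate: one query in total.
qs = ToyQuerySet(conn, "some_table")
len(qs)
for _ in qs:
    pass
len_then_loop = len(qs.queries)

# count() first, then iterate: two queries.
qs = ToyQuerySet(conn, "some_table")
qs.count()
for _ in qs:
    pass
count_then_loop = len(qs.queries)

# Iterate first, then count(): the cache answers, still one query.
qs = ToyQuerySet(conn, "some_table")
for _ in qs:
    pass
qs.count()
loop_then_count = len(qs.queries)

print(len_then_loop, count_then_loop, loop_then_count)  # 1 2 1
```

The printed query counts match the scenarios described earlier: count() is only "free" once the queryset has already been evaluated.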
Good references in Django docs: