10
votes

I need to fill in a template with a summary of user activity in a simple messaging system. For each message sender, I want the number of messages sent and the number of distinct recipients.

Here's a simplified version of the model:

class Message(models.Model):
    sender = models.ForeignKey(User, related_name='messages_from')
    recipient = models.ForeignKey(User, related_name='messages_to')
    timestamp = models.DateTimeField(auto_now_add=True)

Here's how I'd do it in SQL:

SELECT sender_id, COUNT(id), COUNT(DISTINCT recipient_id)
    FROM myapp_messages
    GROUP BY sender_id;

I've been reading through the documentation on aggregation in ORM queries, and although annotate() can handle the first COUNT column, I don't see a way to get the COUNT(DISTINCT) result (even extra(select={}) hasn't been working, although it seems like it should). Can this be translated into a Django ORM query or should I just stick with raw SQL?

2
The .distinct() filter might come in handy. - Bartek Banachewicz
If we ignore the COUNT(DISTINCT recipient_id) it is pretty straightforward using annotate and Count function. Here, django's distinct() does not help. I don't think it is possible to integrate these two things (annotate and distinct) together. You have to write two queries I guess. - jurgenreza

2 Answers

8
votes

You can indeed use distinct and count together, as seen on this answer: https://stackoverflow.com/a/13145407/237091

In your case:

SELECT sender_id, COUNT(id), COUNT(DISTINCT recipient_id)
FROM myapp_messages
GROUP BY sender_id;

would become:

Message.objects.values('sender').annotate(
    message_count=Count('sender'),
    recipient_count=Count('recipient', distinct=True))
4
votes
from django.db.models import Count

messages = Message.objects.values('sender').annotate(message_count=Count('sender'))

for m in messages:
    m['recipient_count'] = len(Message.objects.filter(sender=m['sender']).\
                              values_list('recipient', flat=True).distinct())