4
votes

On my blog, I display in the right nav the 10 most popular articles in terms of page hits. Here's how I get that:

SELECT *
FROM entries
WHERE is_published = 1
ORDER BY hits DESC, created DESC
LIMIT 10

What I would like to do is show the top 10 in terms of page hits per day. I'm using MySQL. Is there a way I can do this in the database?

BTW, the created field is a DATETIME.

UPDATE: I think I haven't made myself clear. What I want is for the blog post with 10,000 hits that was posted 1,000 days ago to have the same popularity as the blog post with 10 hits that was posted 1 day ago. In pseudo-code:

ORDER BY hits / days since posting

...where hits is just an int that is incremented each time the blog post is viewed.
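To make the intended ordering concrete, here is a minimal Python sketch of the same ranking done outside the database. The `hits_per_day` helper and the sample posts are made up for illustration. Counting a post created today as one day old is an assumption, chosen so the division never hits zero.

```python
from datetime import datetime, timedelta

def hits_per_day(hits, created, now):
    """Average hits per calendar day since posting."""
    # Assumption: a post created today counts as one day old,
    # which also avoids dividing by zero.
    days = max((now.date() - created.date()).days, 1)
    return hits / days

now = datetime(2009, 1, 1, 12, 0)

# Hypothetical sample data: (title, hits, created)
posts = [
    ("old", 10000, now - timedelta(days=1000)),
    ("new", 10, now - timedelta(days=1)),
    ("fresh", 30, now - timedelta(days=2)),
]

ranked = sorted(posts, key=lambda p: hits_per_day(p[1], p[2], now), reverse=True)
titles = [title for title, _, _ in ranked]
print(titles)
```

Here "old" and "new" both come out at 10 hits per day, so they rank equally, while "fresh" (15 hits per day) ranks above both.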

OK, here's what I'm going to use:

SELECT *,
    -- GREATEST(..., 1) avoids dividing by zero for posts created today
    hits / GREATEST(DATEDIFF(NOW(), created), 1) AS avg_hits
FROM entries
WHERE is_published = 1
ORDER BY avg_hits DESC, hits DESC, created DESC
LIMIT 10

Thanks, Stephen! (I love this site...)


2 Answers

6
votes

I'm not sure you can get true per-day hit counts with the table structure in your query, since it only stores a running total. The only way I can think of is to rank the top 10 by highest average hits per day since posting. With that approach, your query becomes:

SELECT *,
    -- GREATEST(..., 1) avoids dividing by zero for posts created today
    hits / GREATEST(DATEDIFF(NOW(), created), 1) AS avg_hits
FROM entries
WHERE is_published = 1
ORDER BY avg_hits DESC
LIMIT 10

This query assumes your created field is a DATETIME (or similar) data type.

1
votes

I guess you could have a hits_day_count column, which is incremented on each view, and a hits_day_current column, which records when the count was last updated.

On each page view, you check whether the date in hits_day_current is today. If not, reset the hit count. Then you increment the hits_day_count column, and set hits_day_current to the current datetime.

Pseudo-code:

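from datetime import datetime

now = datetime.now()

# Reset the counter if the last recorded view was on an earlier day
if article_data['hits_day_current'].date() != now.date():
    article_data['hits_day_count'] = 0

article_data['hits_day_count'] += 1  # count this view
article_data['hits_day_current'] = now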

The obvious problem with this is time zones: the totals get reset at 00:00 wherever the server is located, which may not be useful.
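One way to at least make the reset independent of the server's location is to compare dates in UTC. The sketch below assumes timezone-aware datetimes and a dict-like `article_data`. The `record_hit` function is a hypothetical helper, not part of any library. It does not make midnight match each reader's local day; it only pins the boundary to a fixed zone.

```python
from datetime import datetime, timedelta, timezone

def record_hit(article_data, now=None):
    """Count one page view, resetting the daily counter at 00:00 UTC."""
    now = now or datetime.now(timezone.utc)
    last = article_data.get('hits_day_current')
    # Compare calendar dates in UTC so the reset does not depend
    # on the server's local time zone.
    if last is None or last.astimezone(timezone.utc).date() != now.astimezone(timezone.utc).date():
        article_data['hits_day_count'] = 0
    article_data['hits_day_count'] = article_data.get('hits_day_count', 0) + 1
    article_data['hits_day_current'] = now
    return article_data

article = {}
start = datetime(2024, 1, 1, 23, 30, tzinfo=timezone.utc)
record_hit(article, start)
record_hit(article, start + timedelta(minutes=10))  # still the same UTC day
same_day_count = article['hits_day_count']
record_hit(article, start + timedelta(hours=1))     # past midnight UTC
next_day_count = article['hits_day_count']
```

If most readers are in one region, the same pattern should work with that region's time zone in place of UTC.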

A better solution would be a rolling 24-hour total, though I'm not quite sure how to do this neatly. The easiest (although not so elegant) way would be to parse your web server logs periodically: get the last 24 hours of logs, count the number of requests to each article, and put those numbers in the database.
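As a rough sketch of the log-parsing idea, the following counts requests per article over the last 24 hours. It assumes the logs are in the common Apache/nginx access-log format and that articles live at URLs like /entries/<id>. Both are assumptions, as are the sample lines and the `hits_last_24h` name. Writing the counts back to the database is left out.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Matches the timestamp and request path of a common-log-format line
LOG_LINE = re.compile(r'\[(?P<ts>[^\]]+)\] "GET (?P<path>\S+)')
# Hypothetical URL scheme for article pages
ARTICLE_PATH = re.compile(r'^/entries/(?P<id>\d+)/?$')

def hits_last_24h(lines, now):
    """Return a Counter of article id -> requests in the 24 hours before `now`."""
    cutoff = now - timedelta(hours=24)
    counts = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if not m:
            continue  # skip lines that are not GET requests in the expected format
        ts = datetime.strptime(m.group('ts'), '%d/%b/%Y:%H:%M:%S %z')
        path = m.group('path').split('?')[0]  # ignore any query string
        article = ARTICLE_PATH.match(path)
        if article and cutoff < ts <= now:
            counts[int(article.group('id'))] += 1
    return counts

# Made-up sample log lines
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /entries/42 HTTP/1.1" 200 2326',
    '203.0.113.6 - - [10/Oct/2023:09:00:00 +0000] "GET /entries/42?ref=rss HTTP/1.1" 200 2326',
    '203.0.113.7 - - [10/Oct/2023:12:00:00 +0000] "GET /entries/7 HTTP/1.1" 200 1820',
    '203.0.113.8 - - [09/Oct/2023:13:00:00 +0000] "GET /entries/42 HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:00:00 +0000] "GET /about HTTP/1.1" 200 512',
]
now = datetime(2023, 10, 10, 14, 0, 0, tzinfo=timezone.utc)
counts = hits_last_24h(sample, now)
```

Run from a periodic job, the resulting counts could then be stored in a column and used in the ORDER BY.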