0
votes

I'm working on a discussion board that lists all topics according to their hotness/rank (like reddit). So I took reddit's algorithm and started experimenting. I used this example: http://blog.sodhanalibrary.com/2014/04/reddit-ranking-algorithm-implementation.html

function score($ups,$downs){
    return $ups - $downs;
}

function epoch_seconds($timestamp){
    $epoch = new DateTime("1970-01-01 00:00:00");
    $unix = new DateTime($timestamp);
    $td = $epoch->diff($unix);

    $days = $td->format('%a');
    $hours = $td->format('%h');
    $minutes = $td->format('%i');
    $seconds = $td->format('%s');
    $age = ($days * 86400) + ($hours * 3600) + ($minutes * 60) + $seconds;

    return $age;
}

function calculateRank($ups,$downs,$date){
    $s = score($ups,$downs);
    $order = log10(max(abs($s), 1)); // log10() takes a single argument

    if($s > 0) {
        $sign = 1;
    } elseif($s < 0) {
        $sign = -1;
    } else {
        $sign = 0;
    }

    $seconds = epoch_seconds($date) - 1134028003;

    return round($order + (($sign * $seconds)/45000), 7);
}

Example:

echo calculateRank(1,0,"2015-02-14 12:00:00"); // = 6441.9377111

What I do not understand is that if the score (the difference between upvotes and downvotes) is 0, the rank is 0. This would mean that a completely new article with +1/-1 would be ranked in nirvana.

echo calculateRank(1,1,"2015-02-14 12:00:00"); // = 0

Also, if the score is negative, the rank is negative. This means that a completely new article with +1/-2 would be ranked even further away than nirvana.

echo calculateRank(1,2,"2015-02-14 12:00:00"); // = -6441.9377111

The SELECT query would look something like this:

SELECT * FROM articles ORDER BY rank DESC 

According to the results I showed you, a 10-year-old article with a positive score (e.g. 1 upvote / 0 downvotes) would be ranked higher than EVERY article with a score of 0 or a negative score, regardless of the date. That cannot be right, and it confuses me.
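Written out as a formula, the code above appears to compute the following. The symbols $s$ (score) and $t$ (seconds since the reference timestamp 1134028003) are shorthand introduced here for illustration, not notation from the linked post:

```latex
\[
  s = \text{ups} - \text{downs}, \qquad
  \text{rank}(s, t) = \log_{10}\bigl(\max(|s|, 1)\bigr) + \frac{\operatorname{sgn}(s)\, t}{45000}
\]
\[
  s = 0 \;\Rightarrow\; \text{rank} = 0, \qquad
  s < 0 \;\Rightarrow\; \text{rank} = \log_{10}|s| - \frac{t}{45000}
\]
```

Because the sign multiplies the time term, a score of 0 removes the date from the rank entirely, and a negative score makes newer articles rank lower than older ones.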

What I'm looking for is something similar. I already got rid of zero ranks by not allowing a score of 0. However, negative scores (e.g. 0 upvotes / 2 downvotes) should lower the rank instead of inverting it.

Any help is highly appreciated! Thanks.

1
Have you read articles such as this one that try to explain the ranking method? If so, I believe your question is a better fit for CodeReview (the code looks good, though DateTime already implements getTimestamp()) or Mathematics (to learn that numbers aren't near nirvana just because they are high) - kero
The thing is, even an article that is 10 years old would be ranked higher than every other article that has either a score of 0 or a negative score. That confuses me because it cannot be right. - user3614800
@kingkero - This question is not asking for a code review, and would be closed as off-topic there. This question is asking "how does this ranking algorithm deal with new posts?", not "Does my code look good and follow best practices?". - rolfl
@user3614800 You need to evaluate if this fits your needs. Reddit has lots of threads coming up, so they value recent posts higher. And judging from SO experience, if the score is negative that means it is usually not worth a read - kero

1 Answer

1
votes

I adjusted the algorithm to my needs. I came up with the following:

if($score >= 0) {
    $sign = 1;
} elseif($score < 0) {
    $sign = -1;
}
return round( ($sign * $order) + ($seconds / 45000) , 7);

This way, articles with a negative score just get a lower rank instead of an inverted one. (The punishment for a score of -1, for example, shouldn't be: "go to nirvana!")
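For completeness, here is a minimal self-contained sketch of how the adjusted fragment could fit into a full function. The name adjustedRank() and its parameters are hypothetical, the constants 1134028003 and 45000 are taken from the question, and the sketch assumes the dates are stored in UTC:

```php
<?php
// Sketch only: adjustedRank() is a hypothetical name, not part of the original code.
function adjustedRank(int $ups, int $downs, string $date): float
{
    $score = $ups - $downs;

    // Magnitude of the vote difference on a log scale; max() avoids log10(0).
    $order = log10(max(abs($score), 1));

    // The sign is applied to the vote term only, so a score of 0 counts as non-negative.
    $sign = ($score >= 0) ? 1 : -1;

    // Seconds since the reference timestamp from the question.
    // Assumption: $date is a UTC date string such as "2015-02-14 12:00:00".
    $utc = new DateTimeZone('UTC');
    $seconds = (new DateTime($date, $utc))->getTimestamp() - 1134028003;

    // The time term is always added, so newer articles always gain rank.
    return round(($sign * $order) + ($seconds / 45000), 7);
}

echo adjustedRank(1, 0, "2015-02-14 12:00:00"), "\n";  // score  1
echo adjustedRank(1, 1, "2015-02-14 12:00:00"), "\n";  // score  0, same rank as score 1
echo adjustedRank(0, 10, "2015-02-14 12:00:00"), "\n"; // score -10, exactly 1 lower
```

With this shape, a score of -10 costs one rank point, which is the same as being posted 45000 seconds (12.5 hours) earlier, so a heavily downvoted new article still outranks a 10-year-old one.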