6
votes

I'm implementing a tag system similar to Stack Overflow's, and I'm wondering how to find related tags and define the relationship weights between them, like the "Related Tags" list on any tag page, for example https://stackguides.com/questions/tagged/php. They define the relationship weight by the co-occurrence of two or more tags.

How can I do this in PHP/MySQL: find the most related tags for a tag "X", and keep all weights up to date as users add more and more posts/questions?

3 Answers

2
votes

You probably want to look into statistics for this:

  1. given a tag X
  2. check all other tags Y
  3. count how often Y and X show up at the same time
  4. divide by how often Y shows up
  5. ???
  6. Profit!!!

As for step 5: these statistics change only very slowly, so you can cache the results and recompute them only when you have spare time.

What you want in the end is a relation

conditional_probability(X, Y, P)

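This tells you how probable (P) tag Y is, given X. Note that dividing by how often Y shows up, as in step 4, actually yields the probability of X given Y; to get the probability of Y given X, divide the co-occurrence count by how often X shows up instead.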
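For illustration, here is a minimal Python sketch of the counting steps above. The input shape (one set of tags per post), the function name `conditional_probabilities` and the sample data are all assumptions made for the example, not something from the original answer. The result holds both orderings of every pair, so either normalisation (by the count of X or by the count of Y) can be read off.

```python
from collections import Counter
from itertools import permutations


def conditional_probabilities(posts):
    """Return {(x, y): P(y | x)} for every ordered pair of co-occurring tags.

    `posts` is an iterable of tag collections, one per post (assumed shape).
    """
    tag_counts = Counter()   # number of posts carrying each tag
    pair_counts = Counter()  # number of posts carrying both x and y
    for tags in posts:
        unique = set(tags)   # ignore duplicate tags within one post
        tag_counts.update(unique)
        pair_counts.update(permutations(unique, 2))
    # P(y | x) = count(x and y) / count(x)
    return {(x, y): n / tag_counts[x] for (x, y), n in pair_counts.items()}


# Made-up sample data: each set is the tag list of one post.
posts = [
    {"php", "mysql"},
    {"php", "mysql", "tags"},
    {"php", "arrays"},
    {"mysql", "database-design"},
]
probs = conditional_probabilities(posts)

# Tags related to "php", most probable first (ties broken by name).
related = sorted(
    ((y, p) for (x, y), p in probs.items() if x == "php"),
    key=lambda item: (-item[1], item[0]),
)
print(related)
```

Since the numbers change slowly, the returned dictionary is the kind of thing you could cache and rebuild occasionally rather than on every request.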

1
votes

I used this blog entry for calculating relative tag size within a cloud. You can use this algorithm on the entire cloud or a particular found set.

Instead of storing the denormalized weights for all tags in the database, I cache them in my (Ruby) process, and rebuild them when tags are added/removed or when the process restarts.

As for how to store them, you generally want:

  1. A tags table associating unique tag names with row IDs, and
  2. A tags_items table providing you with your n-to-n mapping between tags and items.

Once you have that, and once you have a found set of items on a results page, a simple join plus a uniqueness filter (e.g. SELECT DISTINCT) gives you the set of 'related' tags.
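As a rough sketch of that two-table layout and the join, the following uses Python's built-in sqlite3 module so it runs standalone. The column names, the sample rows and the choice of "all items tagged php" as the found set are assumptions for the example. The schema is written for SQLite, so column types would need adjusting for MySQL, though the SELECT itself is plain SQL. It groups and counts instead of just taking the unique names, so the shared-item count can double as a weight.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE tags_items (
        tag_id  INTEGER NOT NULL REFERENCES tags(id),
        item_id INTEGER NOT NULL,
        PRIMARY KEY (tag_id, item_id)
    );
""")
# Made-up sample data: four tags, four items.
conn.executemany(
    "INSERT INTO tags (id, name) VALUES (?, ?)",
    [(1, "php"), (2, "mysql"), (3, "tags"), (4, "arrays")],
)
conn.executemany(
    "INSERT INTO tags_items (tag_id, item_id) VALUES (?, ?)",
    [(1, 10), (2, 10), (1, 11), (2, 11), (3, 11), (1, 12), (4, 12), (2, 13)],
)

# For one tag: find its items, then every other tag on those items,
# counting how many items each other tag shares with it.
RELATED_SQL = """
    SELECT other.name, COUNT(*) AS shared_items
    FROM tags AS t
    JOIN tags_items AS ti  ON ti.tag_id = t.id
    JOIN tags_items AS ti2 ON ti2.item_id = ti.item_id
                          AND ti2.tag_id <> ti.tag_id
    JOIN tags AS other     ON other.id = ti2.tag_id
    WHERE t.name = ?
    GROUP BY other.id, other.name
    ORDER BY shared_items DESC, other.name
"""
related = conn.execute(RELATED_SQL, ("php",)).fetchall()
print(related)  # [('mysql', 2), ('arrays', 1), ('tags', 1)]
```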

0
votes

  1. Each post ID can be tagged with one or more tags (PHP plus other tags).
  2. Conversely, each tag has a set of associated post IDs.
  3. For each post ID tagged PHP, get all of its tags other than PHP.
  4. Show only those tags whose count is more than a particular number (say 4000).

Think about it: this question has been tagged "Mysql", "Database-design", "Tags" and "Tagging". Do you see how you have related PHP to those other tags?
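Here is a small Python sketch of those steps, with the counts kept up to date as each post comes in rather than recomputed from scratch. The incremental update is my addition, prompted by the question, and is not part of this answer. The class name `CooccurrenceIndex`, its methods and the sample posts are hypothetical, and the threshold is tiny only because the sample data is.

```python
from collections import defaultdict


class CooccurrenceIndex:
    """Keeps tag pair counts current as posts are added (hypothetical helper)."""

    def __init__(self):
        # counts[a][b] = number of posts tagged with both a and b
        self.counts = defaultdict(lambda: defaultdict(int))

    def add_post(self, tags):
        """Record one new post; only the pairs on this post are touched."""
        unique = set(tags)
        for a in unique:
            for b in unique:
                if a != b:
                    self.counts[a][b] += 1

    def related(self, tag, min_count=1):
        """Tags sharing at least `min_count` posts with `tag`, most frequent first."""
        pairs = [
            (other, n)
            for other, n in self.counts.get(tag, {}).items()
            if n >= min_count
        ]
        return sorted(pairs, key=lambda p: (-p[1], p[0]))


index = CooccurrenceIndex()
index.add_post(["php", "mysql", "database-design", "tags", "tagging"])
index.add_post(["php", "mysql"])
index.add_post(["php", "tags"])
print(index.related("php", min_count=2))  # [('mysql', 2), ('tags', 2)]
```

In a PHP/MySQL setup the same idea would presumably map to a pair-count table that is incremented whenever a post is saved, but that mapping is an assumption on my part.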