4
votes

The idea is to have a tagging system between users and content (images, videos, posts), kind of like the tagging system for questions here on SO.

I like the achievements system on SO, where after a certain number of points a user can start creating his/her own tags. I want the same idea for my system.

My current table design looks like this:

Tag           UserTag       User
---           -------       ----
tag_id        user_id       user_id
tag_name      tag_id        username
usage_count                 .... 

It brings me to this question.

Q: How can you have a tagging system for content in different languages?

  • Yet at the same time be able to search for the same content with tags in different languages.
  • Have auto-complete with different languages for the same tag

When I use autocomplete, I search for tag names matching the characters the user is typing.

E.g. I have a tag named "nightclub" in English,

yet if French users were tagging the same thing, the translation would be "discothèque".


Or is there no way of doing this, and should I just let people make tags in different languages?


2 Answers

2
votes

Yes, you can. But be aware that some words in one language may have several translations in others.

You may have a languages table, a tags table with only a tag_id, and a many-to-many table with language_id, tag_id, and tag_name.
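
A minimal sketch of that layout, using Python's built-in sqlite3 module. The table names (language, tag, tag_name), the column named code, and the helpers make_db and autocomplete are assumptions made for illustration, not something from the question. The point is that "nightclub" and "discothèque" both resolve to the same tag_id, so content tagged in one language can be found from the other.

```python
import sqlite3

# Hypothetical schema: one language-neutral tag row, many localized names.
SCHEMA = """
CREATE TABLE language (
    language_id INTEGER PRIMARY KEY,
    code        TEXT NOT NULL UNIQUE
);
CREATE TABLE tag (
    tag_id INTEGER PRIMARY KEY
);
CREATE TABLE tag_name (
    tag_id      INTEGER NOT NULL REFERENCES tag(tag_id),
    language_id INTEGER NOT NULL REFERENCES language(language_id),
    name        TEXT NOT NULL,
    PRIMARY KEY (tag_id, language_id, name)
);
"""


def make_db():
    """Create an in-memory database with one tag named in two languages."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO language VALUES (?, ?)", [(1, "en"), (2, "fr")])
    conn.execute("INSERT INTO tag (tag_id) VALUES (1)")
    conn.executemany(
        "INSERT INTO tag_name VALUES (?, ?, ?)",
        [(1, 1, "nightclub"), (1, 2, "discothèque")],
    )
    conn.commit()
    return conn


def autocomplete(conn, prefix):
    """Return (tag_id, language code, name) rows whose name starts with prefix."""
    # Escape LIKE wildcards so the user's input is matched literally.
    escaped = prefix.replace("!", "!!").replace("%", "!%").replace("_", "!_")
    rows = conn.execute(
        """
        SELECT tn.tag_id, l.code, tn.name
        FROM tag_name AS tn
        JOIN language AS l ON l.language_id = tn.language_id
        WHERE tn.name LIKE ? ESCAPE '!'
        ORDER BY tn.name
        """,
        (escaped + "%",),
    )
    return rows.fetchall()
```

Here autocomplete searches names in every language at once; adding a language filter to the WHERE clause would restrict suggestions to the user's own language.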

Like I said previously, you might run into problems when people want to make refinements that their own language allows but other languages don't. To stay with the French example, talking about bread, you may have 'baguette', 'flûte', 'recuit', 'demi-recuit', etc. tags, whereas English would merely have a 'bread' tag. The mapping between the tags is then significantly more complicated. But that's a general translation problem, not one specific to programming.


Regarding your comment: a compromise would be to add a "tag_related_to_tag" table that lets you couple tags together. Users could say which tag is related to which other tag in a different language. This would allow maximum flexibility with minimum complexity, but it would need some administration (otherwise you might have evil users making very unexpected relationships between tags, breaking the usefulness of the system).
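
A rough sketch of such a coupling table, again with Python's sqlite3. Only the table name tag_related_to_tag comes from the paragraph above; the columns, the approved flag (one possible way to handle the administration part), and the helpers demo_db and related_tags are assumptions for illustration. In this variant each tag belongs to a single language.

```python
import sqlite3


def demo_db():
    """Create an in-memory database with a few per-language tags and couplings."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tag (
            tag_id   INTEGER PRIMARY KEY,
            language TEXT NOT NULL,
            name     TEXT NOT NULL
        );
        CREATE TABLE tag_related_to_tag (
            tag_id         INTEGER NOT NULL REFERENCES tag(tag_id),
            related_tag_id INTEGER NOT NULL REFERENCES tag(tag_id),
            approved       INTEGER NOT NULL DEFAULT 0,  -- set to 1 by a moderator
            PRIMARY KEY (tag_id, related_tag_id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO tag VALUES (?, ?, ?)",
        [(1, "en", "bread"), (2, "fr", "baguette"),
         (3, "fr", "flûte"), (4, "fr", "discothèque")],
    )
    conn.executemany(
        "INSERT INTO tag_related_to_tag VALUES (?, ?, ?)",
        [(1, 2, 1),   # bread <-> baguette, approved
         (3, 1, 1),   # flûte <-> bread, approved
         (1, 4, 0)],  # bread <-> discothèque, not approved
    )
    conn.commit()
    return conn


def related_tags(conn, tag_id):
    """Return names of approved related tags, treating a coupling as symmetric."""
    rows = conn.execute(
        """
        SELECT t.name
        FROM tag_related_to_tag AS r
        JOIN tag AS t ON t.tag_id = r.related_tag_id
        WHERE r.tag_id = ? AND r.approved = 1
        UNION
        SELECT t.name
        FROM tag_related_to_tag AS r
        JOIN tag AS t ON t.tag_id = r.tag_id
        WHERE r.related_tag_id = ? AND r.approved = 1
        """,
        (tag_id, tag_id),
    )
    return sorted(name for (name,) in rows)
```

With this data, looking up the English 'bread' tag returns both French refinements, while the unapproved coupling to 'discothèque' is ignored.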

That's something I was actually thinking of implementing for a website with a very narrow field (stoic philosophy) and target audience. If the field is too broad, it might be very ineffective.

1
votes

Interesting question! Just some thoughts (not intended as a complete solution, it's more a set of questions):

A straightforward approach is having an internal tag ID and for each language a localized name.

If no localized name was created yet, you may need to fall back to the tag name in a 'primary' language - usually English - or the language the tag was created in.
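
The fallback could look roughly like the following. The function name localized_name, the PRIMARY_LANGUAGE constant, and the lookup order (requested language, then primary language, then creation language) are assumptions; the paragraph above leaves open which fallback to prefer.

```python
from typing import Dict, Optional

PRIMARY_LANGUAGE = "en"  # assumed site-wide primary language


def localized_name(
    names: Dict[str, str],
    language: str,
    created_in: Optional[str] = None,
) -> str:
    """Pick the best available display name for a tag.

    names maps a language code to the tag's name in that language.
    """
    # Try the requested language, then the primary one, then the creation language.
    for candidate in (language, PRIMARY_LANGUAGE, created_in):
        if candidate is not None and candidate in names:
            return names[candidate]
    raise KeyError("tag has no name in any candidate language")
```

For example, a German user asking for a tag that only has English and French names would get the English one under these assumptions.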

Translation needs to be done by a user who knows both languages; automatic translations are IMO too imprecise. So there should probably be a user right (bound to rewards?) to rename tags.

Are all languages equal, or are tags only created in a "primary language" understood by most users, and translations added separately? (The latter looks less fair, but would probably make some things easier)

You need an ability to merge tags - e.g. when users independently created "discothèque" and "nightclub".
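
One way a merge might work, sketched with Python's sqlite3. The tables tag, tag_name and content_tag, and the helpers make_db and merge_tags, are hypothetical; the idea is to repoint everything from the duplicate tag to the surviving one inside a single transaction.

```python
import sqlite3


def make_db():
    """Create an in-memory database where two tags mean the same thing."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tag (tag_id INTEGER PRIMARY KEY);
        CREATE TABLE tag_name (
            tag_id   INTEGER NOT NULL REFERENCES tag(tag_id),
            language TEXT NOT NULL,
            name     TEXT NOT NULL,
            PRIMARY KEY (tag_id, language, name)
        );
        CREATE TABLE content_tag (
            content_id INTEGER NOT NULL,
            tag_id     INTEGER NOT NULL REFERENCES tag(tag_id),
            PRIMARY KEY (content_id, tag_id)
        );
        """
    )
    conn.executemany("INSERT INTO tag VALUES (?)", [(1,), (2,)])
    conn.executemany(
        "INSERT INTO tag_name VALUES (?, ?, ?)",
        [(1, "en", "nightclub"), (2, "fr", "discothèque")],
    )
    conn.executemany(
        "INSERT INTO content_tag VALUES (?, ?)", [(10, 1), (10, 2), (11, 2)]
    )
    conn.commit()
    return conn


def merge_tags(conn, keep_id, drop_id):
    """Fold drop_id into keep_id: move its content links and names, then delete it."""
    with conn:  # commits on success, rolls back if any statement fails
        # Repoint links; rows that would duplicate an existing link are skipped...
        conn.execute(
            "UPDATE OR IGNORE content_tag SET tag_id = ? WHERE tag_id = ?",
            (keep_id, drop_id),
        )
        # ...and those skipped leftovers are removed here.
        conn.execute("DELETE FROM content_tag WHERE tag_id = ?", (drop_id,))
        conn.execute(
            "UPDATE OR IGNORE tag_name SET tag_id = ? WHERE tag_id = ?",
            (keep_id, drop_id),
        )
        conn.execute("DELETE FROM tag_name WHERE tag_id = ?", (drop_id,))
        conn.execute("DELETE FROM tag WHERE tag_id = ?", (drop_id,))
```

After merging, the surviving tag carries both the English and the French name, and content that was tagged with either one is linked to it exactly once.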

Do I see only the tags that are available in my language, or can I see tags available in other languages that don't have translations to my language? Can I search for tags in other languages?

Is the tag name included in a query string? Will my German query link work when I send it to a friend in the US?

How do you resolve disputes regarding the tag meaning? Example: the closest German translation is "Nachtclub" for "nightclub", and "Diskothek" for "discothèque". But in German, a "Nachtclub" is quite different from a "Diskothek" (though there is some overlap).