1
votes

Imagine I have a table with posts, and I want to make a full-text search through that posts. And posts can be in different languages and count of posts is too big. So I can add text column language and we'll have:

Posts:
post_id bigserial PK
post text
post_ts tsvector
language text

(additionally I'll add GIN index)

And when I search for English query text I'll add "where language = 'english'":

Select post from posts where to_tsquery('english', 'some query string') @@ post_ts and language = 'english';

so postgres process will find only rows with english posts and skips other. Is there will be additional time to skip others rows? Can adding column language to Primary Key helps with this? Or it will be better for performance to make different tables for different languages?(Posts_en, Posts_fr, Posts_it)

1
hard to say if it worth of separating to different table, but if you your query will have where language = 'english' and and language will be indexed (and using index will be benefitial for optimyzes) - yes index should be used - Vao Tsun

1 Answers

0
votes

Last year I wrote an article about Multilingual Data Modeling Patterns and present all existing data models to support Multi Language in Information Systems and I evaluated the advantages and disadvantages of them in different context.

I'm sure that my article is useful for your problem. have fun.
http://www.ijcaonline.org/archives/volume143/number10/irani-2016-ijca-910364.pdf