0
votes

I have a list of titles and descriptions in a table which are indexed in a tsvector column. How can I implement Google Search like full text search functionality in Postgres for these fields. I tried various functions offered by standard Postgres like

to_tsquery('apple | orange') -- apple | orange

This function returns rows as long as it has one of these terms so it doesn't produce highly relevant results at top which should have both of the terms.

plainto_tsquery('apple orange') -- apple & orange

This function requires all of the terms in the query. But I want results including both apple and orange first but can still have results including even one of these terms just later in the results.

phraseto_tsquery('apple orange') -- apple <> orange

This function only matches orange followed by apple but not vice versa. But for me orange <> apple is also still relevant.

I also tried websearch_to_tsquery() but it behaves very similar to above functions.

How can I ask Postgres to list highly relevant rows first which contains most of the terms in the search query no matter the order of the terms and then followed by rows with less number of terms?

1

1 Answers

0
votes
to_tsquery('apple | orange')   --  apple | orange

This function returns rows as long as it has one of these terms so it doesn't produce highly relevant results at top which should have both of the terms.

Unless you tell it how to order the rows, rows of a single query are returned in arbitrary order. There is no "top" without an ORDER BY, there is just something which happens to be seen first.

How can I ask Postgres to list highly relevant rows first which contains most of the terms in the search query no matter the order of the terms and then followed by rows with less number of terms?

Use the | operator, then rank those rows using ts_rank, ts_rank_cd, or a custom ranking function you write yourself. For performance, you might want to use the & operator first, then revert to | if you don't get enough rows.

The built in ranking functions don't care about order, but also don't care about proximity. So they might not do what you want. But writing your own won't be particularly easy, so I'd at least try them out first.

It would be nice if the introduction of websearch_to_tsquery or phraseto_tsquery had also introduced some corresponding ranking functions. But since they invented only ordered proximity, not proximity without order, it is unlikely they would do you want if they did exist.