7
votes

I am currently trying to develop a basic fulltext search for my website, and I noticed that certain words like "regarding" are listed as stopwords for MySQL fulltext searches. This doesn't bother me too much right now since people searching for a given news item wouldn't necessarily search using the word "regarding" (but I certainly can't speak for everyone!). However, I was hoping someone here could enlighten me about the rationale for having a stopwords list. Thanks!

For clarification: I'm using MyISAM for my fulltext table. The stopwords are words that MySQL won't index (for any fulltext index). As noted in a comment on this question, there is a full list of stopwords without any kind of explanation. I'd just like to know whether there was a rationale behind the words "they" chose.

1
Do you want to use MySQL for your searching? Would you not be happier implementing something else? - Layke
@Laykes I might be happier using a different framework. I'm developing very conservatively right now since I'm not in control of the server for which I'm developing. I also don't need a terribly advanced search for my site. Either way, I'm still curious about the stopword list. - just_wes
weird, I never knew about stopwords - here is a full list but without explanation: dev.mysql.com/doc/refman/5.1/en/fulltext-stopwords.html - Otto Allmendinger

1 Answer

8
votes

The stop words are just common words in the English language. In most cases, your search results will be more relevant -- and your indices will be smaller and faster -- if you don't index these words.

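You can replace the stop word list by pointing the ft_stopword_file variable at your own file, or set it to '' to disable stop word filtering so that every word at least ft_min_word_len characters long is indexed, if that suits your needs better. You can also change the minimum indexed word length using the ft_min_word_len variable (default 4), which exists for the same reason. Both variables are set at server startup, and existing FULLTEXT indexes have to be rebuilt (for example with REPAIR TABLE tbl_name QUICK) before a change takes effect.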
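To make the size argument concrete, here is a rough Python sketch of a toy inverted index. It is not how MySQL implements fulltext indexing; the function names, the tiny stop word set, and the sample documents are all made up for illustration. The `stopwords` and `min_word_len` parameters only mimic the effect of ft_stopword_file and ft_min_word_len.

```python
import re
from collections import defaultdict

# Tiny illustrative stop word set (an assumption); MySQL's built-in list is much longer.
STOPWORDS = frozenset({"the", "a", "is", "of", "and", "regarding", "they", "for"})


def build_index(docs, stopwords=STOPWORDS, min_word_len=4):
    """Map each indexed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in re.findall(r"[a-z']+", text.lower()):
            # Words that are too short or are stop words never reach the index.
            if len(word) < min_word_len or word in stopwords:
                continue
            index[word].add(doc_id)
    return dict(index)


def posting_count(index):
    """Total number of (term, document) entries stored in the index."""
    return sum(len(doc_ids) for doc_ids in index.values())


# Hypothetical news headlines, used only to compare index sizes.
docs = [
    "The mayor spoke regarding the budget",
    "They are voting regarding the new budget",
    "The council and the mayor approved the budget",
]

filtered = build_index(docs)
unfiltered = build_index(docs, stopwords=frozenset(), min_word_len=1)

print(len(filtered), posting_count(filtered))      # 6 terms, 9 postings
print(len(unfiltered), posting_count(unfiltered))  # 12 terms, 18 postings
```

Even with three short headlines, filtering halves the index, and the entries that disappear ("the", "regarding", "they") are the ones that would match nearly every document and so do little to tell results apart.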