The most common approach is to either lowercase or uppercase the search string and the data. But there are two problems with that.
- It works in English, but not in all languages. (Maybe not even in
most languages.) Not every lowercase letter has a corresponding
uppercase letter; not every uppercase letter has a corresponding
lowercase letter.
- Using functions like lower() and upper() will give you a sequential
scan. It can't use indexes. On my test system, using lower() takes
about 2000 times longer than a query that can use an index. (Test data has a little over 100k rows.)
There are at least three less frequently used solutions that might be more effective.
- Use the citext module, which mostly mimics the behavior of a case-insensitive data type. Having loaded that module, you can create a case-insensitive index by
CREATE INDEX ON groups (name::citext);
. (But see below.)
- Use a case-insensitive collation. This is set when you initialize a
database. Using a case-insensitive collation means you can accept
just about any format from client code, and you'll still return
useful results. (It also means you can't do case-sensitive queries. Duh.)
- Create a functional index. Create a lowercase index by using
CREATE
INDEX ON groups (LOWER(name));
. Having done that, you can take advantage
of the index with queries like SELECT id FROM groups WHERE LOWER(name) = LOWER('ADMINISTRATOR');
, or SELECT id FROM groups WHERE LOWER(name) = 'administrator';
You have to remember to use LOWER(), though.
The citext module doesn't provide a true case-insensitive data type. Instead, it behaves as if each string were lowercased. That is, it behaves as if you had called lower()
on each string, as in number 3 above. The advantage is that programmers don't have to remember to lowercase strings. But you need to read the sections "String Comparison Behavior" and "Limitations" in the docs before you decide to use citext.