Due to many questions about the PostgreSQL collation that were asked after my last short article about that, I decided to write some more about the current collation situation in PostgreSQL.
What The Collation Is
The collation setting is responsible for correct working functions such as
- converting letter to upper
- converting letter to lower
- sorting texts
- ability to use indexes in `like` clauses
If we have text written in chineese or polish and we’ve got the japaneese collation, we can be sure that sorting texts won’t look like we want it.
Types of Language Settings
Generally there are two types of languge settings in PostgreSQL. One is can be set for each connection and the other only during the initdb and can’t be changed .
How to Set Collation For The Cluster
There is only one way for setting the cluster collation – during initdb. Initdb is a program for creating initial files and directory structure for the database cluster. Once the collation is set, you can’t change it. For setting the collation, you can initialize the cluster with the command:
initdb --locale=fr_CA
Of course you can use the initdb without any option, then the language settings will be taken from the environment variables.
How to Set Language Settings for Client
For client connection there can be set language settings such as:
- LC_MESSAGES – language of messages
- LC_MONETARY – formatting of currency amounts
- LC_NUMERIC – formatting of numbers
- LC_TIME – formatting of dates and times
Setting the above variables won’t change anything for the current database or the whole cluster. You can’t change or improve that way how texts are sorted.
The above options can be changed for the whole database using:
ALTER DATABASE mydb SET LC_NUMERIC TO ‘my value’;
How to Check The Variables
For checking all the above variables you can connect to you database and write a query such as:
show LC_COLLATE;
Possible output is:
# show LC_COLLATE;
lc_collate
————-
en_US.UTF-8
(1 row)
Other options can be checked the same way.
How To Reinitdb
That’s not so easy. There is no way to, so called, reinitdb. You can initdb and nothing more. To change those two language settings that can’t be changed normally you can do:
- Just make dump of all databases
- Stop the server.
- Delete the whole PostgreSQL data directory.
- Run initdb once again but with other language settings.
- Load the dump that has been made in the first point.
pg_dumpall > file.sql
What next?
Fortunately in the next version 8.4 there will be some progress in playing with the languge settings. I wrote that before in PostgreSQL Collation.
Related posts:
- PostgreSQL Collation – part 1 PostgreSQL is (IMHO) much better than MySQL but unfortunately it ...
- PostgreSQL Collation – 8.4 In the last PostgreSQL version (8.4) there is a small...
- PostgreSQL FOUND Problem FOUND is a global variable that exists in the plpgsql...
- Trying to Use Database in Python Why the f**** there is nothing like DBI in Python?...













Thanks man, i was having problem changing to pt_PT colattion, this post helped me to Reinitdb.
Cheers