3.6.4 Using a Collation

Table of Contents Previous Next



A newly defined ICU collation can be used anywhere the COLLATION "collation_name" clause can be used in a SQL command such as in the column specifications of the CREATE TABLE command or appended to an expression in the ORDER BY clause of a SELECT command.
Collation icu_collate_lowercase forces the lowercase form of a letter to sort before its uppercase counterpart (CL).
Collation icu_collate_uppercase forces the uppercase form of a letter to sort before its lowercase counterpart (CU).
Collation icu_collate_ignore_punct causes variable characters (white space and punctuation marks) to be ignored during sorting (AS).
Collation icu_collate_ignore_white_sp causes white space and other non-visible variable characters to be ignored during sorting, but visible variable characters (punctuation marks) are not ignored (AS, T0020).
Note: When creating collations, ICU may generate notice and warning messages when attributes are given to modify the LROOT collation.
The following psql command lists the collations.
The following query sorts on column c2 using the default collation. Note that variable characters (white space and punctuation marks) with id column values of 9, 10, and 11 are ignored and sort with the letter B.
The following query sorts on column c2 using collation icu_collate_lowercase, which forces the lowercase form of a letter to sort before the uppercase form of the same base letter. Also note that the AN attribute forces variable characters to be included in the sort order at the same level when comparing base characters so rows with id values of 9, 10, and 11 appear at the beginning of the sort list before all letters and numbers.
The following query sorts on column c2 using collation icu_collate_uppercase, which forces the uppercase form of a letter to sort before the lowercase form of the same base letter.
The following query sorts on column c2 using collation icu_collate_ignore_punct, which causes variable characters to be ignored so rows with id values of 9, 10, and 11 sort with the letter B as that is the character immediately following the ignored variable character.
The following query sorts on column c2 using collation icu_collate_ignore_white_sp. The AS and T0020 attributes of the collation cause variable characters with code points less than or equal to hexadecimal 0020 to be ignored while variable characters with code points greater than hexadecimal 0020 are included in the sort.
The row with id value of 11, which starts with a space character (hexadecimal 0020) sorts with the letter B. The rows with id values of 9 and 10, which start with visible punctuation marks greater than hexadecimal 0020, appear at the beginning of the sort list as these particular variable characters are included in the sort order at the same level when comparing base characters.


Table of Contents Previous Next