3.6.1 Basic Unicode Collation Algorithm Concepts

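The Unicode Collation Algorithm is officially specified in Unicode Technical Report #10, which is available on the Unicode Consortium website. The algorithm compares strings at up to five levels, and each level is consulted only when all earlier levels find no difference: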
Level 1 – Primary Level for Base Characters. The order of base characters such as letters and digits determines the result, for example A < B.
Level 2 – Secondary Level for Accents. If there are no primary level differences, the presence or absence of accents and similar marks determines the order, for example a < á.
Level 3 – Tertiary Level for Case. If there are no primary or secondary level differences, a difference in case determines the order, for example a < A.
Level 4 – Quaternary Level for Punctuation. If there are no primary, secondary, or tertiary level differences, the presence or absence of white space characters, control characters, and punctuation determines the order, for example -A < A.
Level 5 – Identical Level for Tie-Breaking. If there are no primary, secondary, tertiary, or quaternary level differences, some other difference, such as the code point values, determines the order.
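
The level-by-level comparison described above can be illustrated with a small Python sketch. This is a toy model only, not the Unicode Collation Algorithm itself: it uses no collation element table, and the function name toy_sort_key, the choice of Unicode categories treated as ignorable, and the weight values are all assumptions made for illustration. It builds one key per level and relies on tuple comparison, so a later level matters only when every earlier level is equal.

```python
import unicodedata


def toy_sort_key(text):
    """Build a five-level key (hypothetical helper, not a real UCA key)."""
    primary, secondary, tertiary, quaternary = [], [], [], []
    # Decompose so that an accented letter becomes base letter + combining mark.
    for ch in unicodedata.normalize("NFD", text):
        category = unicodedata.category(ch)
        if category.startswith("M"):
            # Combining mark (accent): contributes only at the secondary level.
            secondary.append(ord(ch))
        elif category[0] in "PZC":
            # Assumption: punctuation, separators, and control characters are
            # ignored at levels 1-3 and recorded at the quaternary level,
            # sorting before any ordinary character.
            quaternary.append((0, ord(ch)))
        else:
            primary.append(ch.casefold())            # base character, case removed
            secondary.append(0)                      # placeholder: no accent yet
            tertiary.append(1 if ch.isupper() else 0)  # lowercase before uppercase
            quaternary.append((1, 0))
    # Level 5: fall back to the raw code points of the original string.
    identical = tuple(ord(ch) for ch in text)
    return (tuple(primary), tuple(secondary), tuple(tertiary),
            tuple(quaternary), identical)


words = ["\u00e1", "A", "a", "B", "-A"]   # "\u00e1" is a with acute accent
print(sorted(words, key=toy_sort_key))
```

Under these assumptions the sketch reproduces the example orderings given for levels 1 through 4, and shows that an accent difference (level 2) outranks a case difference (level 3). A production system would use a real collation library instead of a hand-built key.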
