When we consider that Ethiopia and Eritrea together have upwards of 80 languages, it should come as no surprise that no single language uses all of the characters in the Unicode definition of Ethiopic. Ethiopic in Unicode is indeed a collection of the character sets of these languages integrated under a set of linguistic rules. As is the case in Europe and North America, a person generally learns the letters needed for their particular language and might go a lifetime oblivious to additional letters, or even slightly different uses of common letters, in neighboring languages.
Fortunately, the variations in convention are not too unwieldy to manage in software, and there is a good degree of tolerance in the area of sorting. First, there is no real concern for the order of punctuation. Numerals should of course be sequenced by their value, but there is no concern whether they sort higher or lower than the syllographs or punctuation.
The syllabic series that go unused by a given language are generally just dropped from its syllabary. Accordingly, users are not concerned with the sorting order of jettisoned letters, as these were unlikely to have been used in their literature anyway. The traditional order is then suitable. For example, the ቨ ("ve") family does not represent a phoneme in most Ethiopian languages but is important for transcribing foreign words such as "university" (ዩኒቨርስቲ). "ቨ" may then acceptably sort after "በ", from which the glyph is derived, though some conventions will place it after "ፐ", the last series of the syllabary.
Under Amharic practices the phonetically redundant (homophonic) characters are not dropped from the syllabary as they do retain their relevance for the canonical spelling of words. It is however a common convention in Amharic dictionaries to "fold" the series onto one another for brevity.
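The folding convention can be sketched as a sort key. The Python below is a minimal illustration and not a definitive implementation: it assumes the folds shown in the "Amharic Dictionary" column of the comparison table later in this section (ሐ, ኀ and ኸ onto ሀ; ሰ onto ሠ; ዐ onto አ), and it relies on the Unicode layout in which each basic series begins at a code point that is a multiple of eight. The names `FOLDS`, `fold_char` and `amharic_dictionary_key` are hypothetical, and labialized forms such as ኈ are deliberately left unfolded.

```python
# Assumed folds, keyed by the first code point of each homophonic series.
FOLDS = {
    0x1210: 0x1200,  # ሐ series -> ሀ series
    0x1280: 0x1200,  # ኀ series -> ሀ series
    0x12B8: 0x1200,  # ኸ series -> ሀ series
    0x1230: 0x1220,  # ሰ series -> ሠ series
    0x12D0: 0x12A0,  # ዐ series -> አ series
}


def fold_char(ch):
    """Map a syllable to the same vowel order in its folded series."""
    cp = ord(ch)
    series = cp & ~0x7          # start of the 8-code-point block
    target = FOLDS.get(series)
    if target is None:
        return ch
    return chr(target + (cp - series))


def amharic_dictionary_key(word):
    """Sort key: folded spelling first, original spelling as a tie-breaker."""
    folded = "".join(fold_char(ch) for ch in word)
    return (folded, word)


words = ["ረ", "ሰ", "መ"]
print(sorted(words, key=amharic_dictionary_key))
```

Keeping the original spelling as the second element of the key means that homophonic variants of the same word still sort in a stable, predictable order rather than arbitrarily.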
The convention of placing relatively newer characters at the end of the syllabary (including the zemede extensions) is required to maintain the Gematria encoding of the script and is also common in primary education, but is generally not a concern when applied to electronic collation. The Unicode approach of integrating the extension members into the syllabary proper will meet with little objection. The approach also has a precedent set by Kidane Wold Kifle, who applied the same technique in 1955 in an updated form of the "Abegede" syllabary [12].
The Abegede syllabary does indeed demand our attention and consideration when addressing collation issues. Proponents of the sequencing argue that it is the original order of the Ge'ez script, as would be required by the Semitic alphabetic template. Many references on ancient writing and archaeology will use the Abegede order for the sake of comparison when displaying the Mino-Sabaean script (the likely progenitor of Ge'ez), "Thamudic" style Ge'ez and later Ge'ez alongside other Semitic scripts. The more thorough references will add the caveat to the comparison that the South Arabian convention (albeit sans the vowel demarcation) was indeed "Halehame".
The Ethiopian Orthodox Church did for a time embrace the "Abegede" order, and to the present day materials are still published in which the sequence is used for indexing [12]. The "Abegede" collation should remain an option in a Ge'ez language locale when dealing with this class of literature, where it is expected. For every other case the "Halehame" collation convention is appropriate. Kidane Wold Kifle introduced a modernization of the Abegede order for the Amharic syllabary [11]. The extent to which it was adopted and influenced the literature that followed is unknown to the author.
The Abugida syllabary is more familiar to the average person than the Abegede syllabary from which it is derived. The two are often thought to be the same syllabary. This is not the case, however: the Abugida syllabary is a simple rotation of the Abegede order, intended to challenge the student who has learned the syllabary through rote memorization. The "Abugida" in the first column curiously follows the Gematria order. Each consecutive rightward column then shifts upward by one further row, according to its distance from the first column (shift = column number - 1). Elements shifted off the top then wrap around to the bottom.
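The rotation rule can be made concrete with a short sketch. The Python below is only an illustration under stated assumptions: it takes the 26 series of the "Ge'ez Abegede" column from the comparison table in this section as rows, the seven basic vowel orders as columns, and assumes that the seven orders of each series occupy consecutive Unicode code points. The names `ABEGEDE` and `abugida_grid` are hypothetical.

```python
# Series in Ge'ez Abegede order, first-order (Ge'ez) forms only.
ABEGEDE = "አበገደሀወዘሐጠየከለመነሠዐፈጸቀረሰተኀፀጰፐ"


def abugida_grid(series=ABEGEDE, orders=7):
    """Rotate each column upward by (column number - 1) rows, wrapping around."""
    n = len(series)
    grid = []
    for row in range(n):
        line = []
        for col in range(orders):          # col is zero-based, so shift == col
            base = series[(row + col) % n]  # series shifted up, wrapped at the end
            line.append(chr(ord(base) + col))  # same series, vowel order col + 1
        grid.append(line)
    return grid


for line in abugida_grid()[:3]:
    print(" ".join(line))
```

Read across, the first row of the resulting grid begins with the syllables that give the Abugida its name, and the last rows show the wrap-around, with the first series reappearing in the later columns.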
A modern ordering of the syllabary, aimed at introducing it to kindergarteners, follows the glyph similarities of the elements. The kindergarten order is shown here for comparison. We need only be aware of such conventions; they do not need to be addressed as a collation system.
Unicode | Amharic Dictionary | Ge'ez Halehame | Ge'ez Abegede | KWK Abegede | Glyph Based |
---|---|---|---|---|---|
ሀ | ሀ/ሐ/ኀ/ኸ | ሀ | አ | አ | በ |
ለ | ለ | ለ | በ | በ | ሰ |
ሐ | መ | ሐ | ገ | ገ | ሸ |
መ | ሠ/ሰ | መ | ደ | ደ | ከ |
ሠ | ረ | ሠ | ሀ | ጀ | ኸ |
ረ | ሸ | ረ | ወ | ሀ | ለ |
ሰ | ቀ | ሰ | ዘ | ወ | አ |
ሸ | በ | ቀ | ሐ | ዘ | ዘ |
ቀ | ተ | በ | ጠ | ዠ | ዠ |
ቐ | ቸ | ተ | የ | ሐ | ሀ |
በ | ነ | ኀ | ከ | ኀ | ሠ |
ቨ | ኘ | ነ | ለ | ጠ | መ |
ተ | አ/ዐ | አ | መ | ጨ | ገ |
ቸ | ከ | ከ | ነ | የ | ፐ |
ኀ | ወ | ወ | ሠ | ከ | ተ |
ነ | ዘ | ዐ | ዐ | ኸ | ቸ |
ኘ | ዠ | ዘ | ፈ | ለ | ቀ |
አ | የ | የ | ጸ | መ | የ |
ከ | ደ | ደ | ቀ | ነ | ደ |
ኸ | ጀ | ገ | ረ | ኘ | ጀ |
ወ | ገ | ጠ | ሰ | ሠ | ጸ |
ዐ | ጠ | ጰ | ተ | ዐ | ጰ |
ዘ | ጨ | ጸ | ኀ | ፈ | ነ |
ዠ | ጰ | ፀ | ፀ | ጸ | ኘ |
የ | ጸ | ፈ | ጰ | ፀ | ኀ |
ደ | ፀ | ፐ | ፐ | ቀ | ዐ |
ዸ | ፈ | | | ረ | ፀ |
ጀ | ፐ | | | ሰ | ወ |
ገ | | | | ሸ | ጠ |
ጘ | | | | ተ | ጨ |
ጠ | | | | ቸ | ሐ |
ጨ | | | | ጰ | ረ |
ጰ | | | | ፐ | ፈ |
ጸ | | | | | |
ፀ | | | | | |
ፈ | | | | | |
ፐ | | | | | |
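Treating the collation convention as a selectable option, as suggested above for the Ge'ez locale, might be sketched as follows. This Python is a simplified illustration rather than a complete collator: the two series strings are copied from the "Ge'ez Halehame" and "Ge'ez Abegede" columns of the table, the function name `make_collation_key` is hypothetical, and any character outside the listed series (including labialized forms) is simply sorted after them by code point.

```python
# First-order forms of each series, copied from the table columns above.
HALEHAME = "ሀለሐመሠረሰቀበተኀነአከወዐዘየደገጠጰጸፀፈፐ"
ABEGEDE = "አበገደሀወዘሐጠየከለመነሠዐፈጸቀረሰተኀፀጰፐ"


def make_collation_key(series_order):
    """Build a sort-key function for the given ordering of series."""
    rank = {ord(base): i for i, base in enumerate(series_order)}
    unknown = len(rank)  # anything unlisted sorts after the listed series

    def key(word):
        parts = []
        for ch in word:
            cp = ord(ch)
            series, order = cp & ~0x7, cp & 0x7  # 8-code-point block, vowel order
            if series in rank:
                parts.append((rank[series], order))
            else:
                parts.append((unknown, cp))
        return parts

    return key


sample = ["አ", "ለ", "በ"]
print(sorted(sample, key=make_collation_key(HALEHAME)))
print(sorted(sample, key=make_collation_key(ABEGEDE)))
```

Because the ordering is passed in as data, supporting another column of the table would only require another series string, not new sorting logic.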
It must be noted that for the six series (ቀ, ቐ, ኀ, ከ, ኸ and ገ) having the full complement of 12 forms, the Unicode ordering attempts a linguistic style of sequencing. Indeed the zemede-kaib and zemede-sadis can be shown to be separated by very little distance under linguistic metrics. In the present day this spoken difference is not stressed, and we are likely seeing a phonemic decay of the zemede-kaib into a zemede-sadis form. While the Unicode sequencing can be found in Eritrean and Ethiopian references, it is fairly common knowledge that the traditional sequence recognizes the zemede-kaib position as correct for the syllographs in question. Our next table demonstrates the difference using the ከ series:
| | ግዕዝ Ge'ez | ካዕብ Kaib | ሣልስ Salis | ራዕብ Rabi | ኃምስ Hamis | ሳድስ Sadis | ሳብዕ Sabi |
|---|---|---|---|---|---|---|---|
| Traditional | ኰ | ኵ | ኲ | ኳ | ኴ | | |
| Unicode | ኰ | | ኲ | ኳ | ኴ | ኵ | |
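A collation that follows the traditional row of this table could be sketched as below. The Python is an illustration under stated assumptions: it uses the Unicode code points of the labialized blocks of the six series (starting at U+1248, U+1258, U+1288, U+12B0, U+12C0 and U+1310), and it moves the form at offset 5 of each block into the slot at offset 1, which Unicode leaves unassigned. The names `LABIALIZED_BLOCKS` and `traditional_key` are hypothetical.

```python
# Start of the labialized (zemede) block for ቀ, ቐ, ኀ, ከ, ኸ and ገ.
LABIALIZED_BLOCKS = {0x1248, 0x1258, 0x1288, 0x12B0, 0x12C0, 0x1310}


def traditional_key(text):
    """Sort key placing the offset-5 labialized form in the kaib position."""
    key = []
    for ch in text:
        cp = ord(ch)
        block, offset = cp & ~0x7, cp & 0x7
        if block in LABIALIZED_BLOCKS and offset == 5:
            cp = block + 1  # reuse the unassigned second slot of the block
        key.append(cp)
    return key


unicode_order = sorted("ኵኴኰኳኲ")
traditional_order = sorted("ኵኴኰኳኲ", key=traditional_key)
print(" ".join(unicode_order))
print(" ".join(traditional_order))
```

All other characters keep their code point as their weight, so the only difference from plain Unicode order is the position of these six syllographs.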
The final three syllographic elements in Unicode, ፘ, ፙ and ፚ, are in an order slightly different from the order in which their base components appear in the syllabary. These last three elements could potentially sort as-is, or in order with the base syllabary as ፙ, ፘ and ፚ, or as the last members of their syllabic families, following ሟ, ሯ and ፏ respectively. The last option would likely be found the most intuitive of the three. Alas, since the three ligatures are virtually unknown archaic relics of the syllabary's history, the matter of their sort order may be no more than academic.
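The third option, sorting each ligature as the last member of its syllabic family, can be sketched as a small sort key. This is an illustrative assumption about how such a tailoring might be expressed in Python, not an established practice; the names `LIGATURE_ANCHOR` and `ligature_key` are hypothetical.

```python
# Each ligature is anchored to the syllable it should immediately follow.
LIGATURE_ANCHOR = {
    0x1359: 0x121F,  # ፙ sorts right after ሟ
    0x1358: 0x122F,  # ፘ sorts right after ሯ
    0x135A: 0x134F,  # ፚ sorts right after ፏ
}


def ligature_key(text):
    """Sort key: a ligature takes its anchor's weight plus a secondary 1."""
    key = []
    for ch in text:
        cp = ord(ch)
        if cp in LIGATURE_ANCHOR:
            key.append((LIGATURE_ANCHOR[cp], 1))
        else:
            key.append((cp, 0))
    return key


print(sorted(["ሠ", "ፙ", "ሟ", "መ"], key=ligature_key))
```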
Collation is otherwise straightforward and no practices are known for Ethiopic script whereby two or more characters together sort in a higher or lower order than separately.