Adding collation to utf8mb4 charset
Hello experts,
I wonder how to create a custom collation for utf8mb4 charsets.
If you want to add a custom collation in MySQL/MariaDB for utf8 charsets, you can modify .../charsets/Index.xml and extend the charset using LDML syntax:
<charset name="utf8">
  ...
  <collation name="utf8_myown_ci" id="1234">
    <rules>
      <reset>\u0000</reset>
      <i>\u0020</i>
      ...
    </rules>
  </collation>
  ...
</charset>
But there is no charset tag with the name "utf8mb4". So I created one with name="utf8mb4" and added the base collation tags and my own collation:
<charset name="utf8mb4">
  <family>Unicode</family>
  <description>UTF-8 MB4 Unicode</description>
  <collation name="utf8mb4_general_ci" id="45">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="utf8mb4_bin" id="46">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="utf8mb4_myown_ci" id="213">
  </collation>
</charset>
In phpMyAdmin I could select the newly created collation, but I couldn't insert four-byte characters; I get the error
"#1366 - Incorrect string value: '\xF0\x9F\x8D\xB5\xF0\x9F...' for field ..."
(with a built-in mb4-collation like utf8mb4_general_ci it works).
To be more precise: I have one column (a) with the built-in collation utf8mb4_general_ci and one column (b) with my own collation utf8mb4_myown_ci (defined in Index.xml). I insert the same data into both columns; column a accepts it without error, while column b produces the error described above.
An empty collation tag does not seem to be the problem, because I created an empty utf8_myown_ci inside <charset name="utf8"> and that works.
In the column with utf8mb4_myown_ci I can still insert three-byte characters, so it seems the collation is being interpreted as a utf8 collation.
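To illustrate the three-byte versus four-byte observation, here is a minimal Python sketch (not part of the original setup; the helper names utf8_width and fits_utf8mb3 are made up for this example). It decodes the first four bytes from the error message and checks whether a string would fit a charset limited to three bytes per character, which is the limit of MySQL's utf8:

```python
def utf8_width(ch: str) -> int:
    """Number of bytes the character needs when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def fits_utf8mb3(text: str) -> bool:
    """True if every character needs at most 3 bytes (the limit of MySQL's utf8)."""
    return all(utf8_width(ch) <= 3 for ch in text)


# First character from the error message: '\xF0\x9F\x8D\xB5'
rejected = b"\xF0\x9F\x8D\xB5".decode("utf-8")

print(hex(ord(rejected)), utf8_width(rejected))  # 0x1f375 4
print(fits_utf8mb3(rejected))                    # False
print(fits_utf8mb3("\u20ac"))                    # True (euro sign, 3 bytes)
```

A column that accepts the euro sign but rejects U+1F375 behaves as the second check predicts, which matches the suspicion that column b is treated as utf8 rather than utf8mb4.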
I searched Google several times and also looked here, but I couldn't find any hints on how to add collations to charsets that aren't present in Index.xml.
Any ideas how to do it? Thank you for any hints!
Answer (by KnutEdelbert):
It turns out I used a collation ID that was already occupied. If I use, e.g., 501 instead of 213, it works.
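For anyone who wants to check for this kind of clash before restarting the server, here is a rough Python sketch. It assumes you have already exported the IDs the server reports (for example from the ID and COLLATION_NAME columns of INFORMATION_SCHEMA.COLLATIONS, taken before adding your own collation) into a dict. The function name find_id_conflicts, the sample XML with its <charsets> root, and the placeholder name for ID 213 are all made up for illustration; only the IDs 45, 46, 213 and 501 come from the question.

```python
import xml.etree.ElementTree as ET


def find_id_conflicts(index_xml, server_collations):
    """Return (name, id, existing_name) for every collation in index_xml whose
    id is already used by a differently named collation on the server."""
    conflicts = []
    for coll in ET.fromstring(index_xml).iter("collation"):
        name, raw_id = coll.get("name"), coll.get("id")
        if raw_id is None:
            continue  # no id attribute, nothing to compare
        existing = server_collations.get(int(raw_id))
        if existing is not None and existing != name:
            conflicts.append((name, int(raw_id), existing))
    return conflicts


# Trimmed-down stand-in for the <charset name="utf8mb4"> block in Index.xml.
SAMPLE_INDEX = """<charsets>
  <charset name="utf8mb4">
    <collation name="utf8mb4_general_ci" id="45"/>
    <collation name="utf8mb4_bin" id="46"/>
    <collation name="utf8mb4_myown_ci" id="213"/>
  </charset>
</charsets>"""

# Hypothetical snapshot of the server's collations (id -> name).
# The name for 213 is a placeholder; look up the real one on your server.
SERVER_COLLATIONS = {
    45: "utf8mb4_general_ci",
    46: "utf8mb4_bin",
    213: "some_builtin_collation",
}

print(find_id_conflicts(SAMPLE_INDEX, SERVER_COLLATIONS))
```

With the ID changed to 501 in the sample, the returned list is empty for this snapshot. Note that Index.xml alone is not enough for the check, since compiled-in collations are not necessarily listed there; the server's own list is the thing to compare against.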