@jbp:
We have two cases here:
1. pfsense lets people use characters from their own languages (until now)
2. pfsense doesn't let people use characters from their own languages
-> pfsense should reject characters it doesn't support.
They are supported in 2.0 where it's feasible to do so. They were not supported in 1.2.3; they just happened to work without exploding the config in certain specific spots. In other spots they would explode the config. Congratulations, you stepped on a land mine and it didn't go off.
The way the characters were stored in the config in 1.2.3 was invalid XML. It doesn't meet the spec, which is why the config is now rejected on 2.0. If you run your 1.2.3 config through a standard tool such as xmllint, it will show you that the file is invalid XML.
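If you don't have xmllint handy, a rough equivalent check can be sketched in Python. This is only an illustration, not what pfSense itself runs: the helper name `is_well_formed` and the sample `<descr>` values are made up, and it assumes the problem is a byte sequence that is not valid in the document's encoding (here, a Latin-1 byte in a document that defaults to UTF-8).

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_bytes: bytes) -> bool:
    """Return True if the bytes parse as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


# The same text stored two ways. With no XML declaration the parser
# assumes UTF-8, so the Latin-1 byte for "u with umlaut" is rejected.
good = "<descr>B\u00fcro</descr>".encode("utf-8")
bad = "<descr>B\u00fcro</descr>".encode("latin-1")

print(is_well_formed(good))
print(is_well_formed(bad))
```

The first sample parses and the second does not, which is the kind of failure a validating tool would report for a config that stores such characters incorrectly.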
On 2.0, in description fields and some others that can take such characters, we wrap the values in CDATA sections so that they are handled properly in XML.
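To show what CDATA wrapping looks like in general, here is a minimal Python sketch. It is not the pfSense code (which is PHP); the function name `descr_element` and the sample value are assumptions for illustration, and it uses the standard library's `xml.dom.minidom` to build the element.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET


def descr_element(text: str) -> str:
    """Build a <descr> element whose value is wrapped in a CDATA section.

    Hypothetical helper for illustration only.
    """
    doc = minidom.Document()
    element = doc.createElement("descr")
    # Content inside CDATA is taken literally, so "&" and "<" need no escaping.
    element.appendChild(doc.createCDATASection(text))
    return element.toxml()


value = "B\u00fcro & Lager"
xml_text = descr_element(value)
print(xml_text)

# Round trip: a standard parser hands back the original value unchanged.
parsed = ET.fromstring(xml_text.encode("utf-8"))
print(parsed.text == value)
```

Note that a CDATA section cannot contain the sequence `]]>`, so a real implementation has to handle that case; minidom raises a `ValueError` for it when serializing.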