Gergely Buday wrote:
There was a long discussion on the MLton list in November 2005, seemingly leading to no conclusion:
See the "[Sml-basis-discuss] Unicode and WideChar support" and the "Unicode / WideChar" threads.
Yes, I remember that thread also. It was a bit disheartening.
For the moment, I am "handling" UTF-8 by using the %full option on my ml-lex lexer and treating characters in the range \128-\255 as valid constituents of "symbols". This is pretty weak, but it makes things look a deal nicer with minimal effort. If my users don't set out to break it, it should behave reasonably too.
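For illustration, a minimal sketch of what such an ml-lex specification might look like. The token constructors (SYMBOL, EOF), the definition names (symchar, ws) and the exact set of ASCII characters allowed in a symbol are assumptions made up for this example, not taken from the actual lexer:

```sml
(* User declarations. ml-lex requires a lexresult type and an eof function.
   SYMBOL and EOF are hypothetical constructors used only in this sketch.
   In the definitions section below, %full asks ml-lex for an 8-bit lexer
   (characters 0-255); the default only accepts 7-bit input.
   symchar admits every byte in \128-\255, so the lead and continuation
   bytes of a UTF-8 sequence are all absorbed into the enclosing symbol. *)
datatype lexresult = SYMBOL of string | EOF
fun eof () = EOF
%%
%full
symchar = [A-Za-z0-9_\128-\255];
ws = [\ \t\n];
%%
{ws}+      => (lex());
{symchar}+ => (SYMBOL yytext);
```

Here yytext is still a plain string of bytes, and nothing checks that the high bytes form well-formed UTF-8. Multi-byte characters simply pass through untouched as long as the input is sensible, which is the "weak but cheap" trade-off described above.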
Michael.