[polyml] Windows Unicode

7 Sep 2015


This came out of a discussion with Makarius about using the native
Windows version with Isabelle, but it is probably of general interest
to anyone who uses, or might use, the native Windows version.

The current master version uses the ANSI interface to Windows API calls
rather than Unicode.  I've been experimenting with a version that uses
the Unicode, or more correctly UTF-16, interface.  What this means is
that the conversion between ML strings and, for example, file-names is
handled by Poly/ML itself rather than defaulting to the current
code-page.  It only affects non-ASCII characters.
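
As a rough illustration of why only non-ASCII characters are affected, here is a minimal sketch in Python (not Poly/ML code). It models an ML string as a sequence of bytes that is decoded with a chosen code-page and re-encoded as UTF-16 for the Windows call. The helper to_utf16 and the sample names are hypothetical, and the real conversion inside Poly/ML may differ in detail.

```python
# Sketch only: models converting an ML string (a byte sequence) into the
# UTF-16 text that the Unicode Windows API expects.
# to_utf16 is a hypothetical helper, not part of Poly/ML.

def to_utf16(ml_bytes: bytes, codepage: str) -> bytes:
    """Decode the bytes with the given code-page, then encode as UTF-16 (little-endian)."""
    return ml_bytes.decode(codepage).encode("utf-16-le")

ascii_name = b"test.sml"
# "caf" + e-acute + ".sml" held as UTF-8: the e-acute is the two bytes C3 A9.
accented_name = "caf\u00e9.sml".encode("utf-8")

# A pure ASCII name converts to the same UTF-16 under either code-page.
same_for_ascii = to_utf16(ascii_name, "utf-8") == to_utf16(ascii_name, "cp1252")

# A non-ASCII name does not: read as cp1252, each UTF-8 byte becomes its own character.
as_utf8 = to_utf16(accented_name, "utf-8").decode("utf-16-le")
as_cp1252 = to_utf16(accented_name, "cp1252").decode("utf-16-le")

print(same_for_ascii)
print(ascii(as_utf8))
print(ascii(as_cp1252))
```

The same bytes name two different files depending on the code-page, which is why the choice matters as soon as a name leaves the ASCII range.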

The experimental code is in the Windows-Unicode branch.  To build the
Unicode version, configure with

    ./configure CPPFLAGS="-DUNICODE -D_UNICODE"

The resulting poly takes a --codepage option that sets the code-page
used for conversion.  "utf8" is probably the most useful argument to
give here.

I've been wondering whether to make the Unicode version the default
rather than ANSI.  There's also the question of how best to specify the
code-page.  For backwards compatibility I think it should default to the
system code-page, but UTF-8 is likely to be very popular.  Perhaps there
should be some programmatic way (PolyML.setWindowCodePage ???) to set it,
as well as or instead of the command-line argument.

Setting the code-page affects file-names: both those used for reading
and writing files and the names returned by OS.FileSys.readDir.  It
also affects command-line arguments and environment variables.  When
using the Windows GUI in "poly.exe" it also affects the way characters
are displayed when text is written to TextIO.stdOut.
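
To make the effect on names returned from a directory listing more concrete, here is a small Python sketch (again not Poly/ML code) that checks which UTF-16 file-names can be represented as a byte string in a given code-page. The helper representable and the sample names are hypothetical. Python's strict encoding raises an error for unrepresentable characters, whereas a real Windows conversion may substitute a replacement character instead.

```python
# Sketch only: models which file-names can be turned into a byte string
# in a chosen code-page without losing characters.
# representable is a hypothetical helper, not part of Poly/ML.

def representable(name: str, codepage: str) -> bool:
    """Return True if every character of name can be encoded in codepage."""
    try:
        name.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False

# An ASCII name, a Western European name, and a Japanese name.
names = ["readme.txt", "caf\u00e9.sml", "\u65e5\u672c\u8a9e.txt"]

# For each code-page, record which of the names survive the conversion.
results = {cp: [representable(n, cp) for n in names] for cp in ("cp1252", "utf-8")}

for cp, flags in results.items():
    print(cp, flags)
```

With a single-byte system code-page some names cannot be expressed at all, while UTF-8 can represent every name, which is one reason it is likely to be the popular choice.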

David
