Does anyone know what should happen to threads (created with Thread.Thread.fork) when PolyML.SaveState.loadState is performed? It appears that they are no longer reported as active but, despite this, carry on running. Is it up to the user to kill these before doing a loadState?
Thanks, Phil
The information contained in this E-Mail and any subsequent correspondence is private and is intended solely for the intended recipient(s). The information in this communication may be confidential and/or legally privileged. Nothing in this e-mail is intended to conclude a contract on behalf of QinetiQ or make QinetiQ subject to any other legally binding commitments, unless the e-mail contains an express statement to the contrary or incorporates a formal Purchase Order.
For those other than the recipient any disclosure, copying, distribution, or any action taken or omitted to be taken in reliance on such information is prohibited and may be unlawful.
Emails and other electronic communication with QinetiQ may be monitored and recorded for business purposes including security, audit and archival purposes. Any response to this email indicates consent to this.
Telephone calls to QinetiQ may be monitored or recorded for quality control, security and other business purposes.
QinetiQ Limited Registered in England & Wales: Company Number:3796233 Registered office: 85 Buckingham Gate, London SW1E 6PD, United Kingdom Trading address: Cody Technology Park, Cody Building, Ively Road, Farnborough, Hampshire, GU14 0LX, United Kingdom http://www.qinetiq.com/home/notices/legal.html
Philip Clayton wrote:
Does anyone know what should happen to threads (created with Thread.Thread.fork) when PolyML.SaveState.loadState is performed? It appears that they are no longer reported as active but, despite this, carry on running. Is it up to the user to kill these before doing a loadState?
Phil, Threads continue running after loadState. However there is an issue with thread identifiers and I think this is what is causing the confusion.
Saved states are primarily intended to allow the state to be restored in a different session. Certain values, such as thread identifiers and stream identifiers, only have meaning within a particular session and aren't persistent. When they are saved and loaded, even within the same session, the link with the original object is broken. It is possible to "carry over" such a value within a session by passing it on the stack. Consider: val y = Thread.Mutex.mutex(); Thread.Mutex.lock y; val v = Thread.Thread.fork(fn () => Thread.Mutex.lock y, []); (* v now refers to a blocked but running thread. *) PolyML.SaveState.saveState "save";
val w = (PolyML.SaveState.loadState "save"; v);
After reloading Poly/ML reports:
Thread.Thread.isActive v;
val it = false : bool
Thread.Thread.isActive w;
val it = true : bool
There is a similar situation with streams. The reason it works this way is that there is no guarantee that the saved state loaded by loadState actually has been loaded in the same session it was saved in. To solve this would require (expensive) globally unique identifiers.
David
David,
Thanks. This makes sense and explains why we were seeing an inactive thread. It's useful to know that values can be passed from one sessions to the next via the stack - we hadn't fully appreciated that feature.
This behaviour with threads works well for what we are trying to do but there is still something we can't explain or see how to work around: after a loadState, when the running thread retrieves a referenced value, it no longer appears to pick up the latest value. I've provided an example that demonstrates this below. Is there a way for the thread to pick up assignments to the ref variable 'c' after the loadState?
Thanks, Phil
val c = ref 0;
open Thread;
fun run () : Thread.thread = let val cv = ConditionVar.conditionVar (); val m = Mutex.mutex ();
fun aux () = let open Time; val _ = ConditionVar.waitUntil (cv, m, now () + fromSeconds 1); in print (Int.toString (!c)); aux () end; in Thread.fork (aux, []) end;
val t = run (); (* prints 0 on standard output every second *)
c := 2; (* now prints 2 *)
PolyML.SaveState.saveState "test1.polydb"; (* save state with c = ref 2 *)
c := 3; (* now prints 3 *)
val t = (PolyML.SaveState.loadState "test1.polydb"; t); (* reload state *)
Thread.isActive t; (* thread still active *)
(* still printing 3, even though saved when c = ref 2 *)
c := 1; (* no effect, still prints 3 *)
David Matthews wrote:
Philip Clayton wrote:
Does anyone know what should happen to threads (created with Thread.Thread.fork) when PolyML.SaveState.loadState is performed? It appears that they are no longer reported as active but, despite this, carry on running. Is it up to the user to kill these before doing a loadState?
Phil, Threads continue running after loadState. However there is an issue with thread identifiers and I think this is what is causing the confusion.
Saved states are primarily intended to allow the state to be restored in a different session. Certain values, such as thread identifiers and stream identifiers, only have meaning within a particular session and aren't persistent. When they are saved and loaded, even within the same session, the link with the original object is broken. It is possible to "carry over" such a value within a session by passing it on the stack. Consider: val y = Thread.Mutex.mutex(); Thread.Mutex.lock y; val v = Thread.Thread.fork(fn () => Thread.Mutex.lock y, []); (* v now refers to a blocked but running thread. *) PolyML.SaveState.saveState "save";
val w = (PolyML.SaveState.loadState "save"; v);
After reloading Poly/ML reports:
Thread.Thread.isActive v;
val it = false : bool
Thread.Thread.isActive w;
val it = true : bool
There is a similar situation with streams. The reason it works this way is that there is no guarantee that the saved state loaded by loadState actually has been loaded in the same session it was saved in. To solve this would require (expensive) globally unique identifiers.
David
The information contained in this E-Mail and any subsequent correspondence is private and is intended solely for the intended recipient(s). The information in this communication may be confidential and/or legally privileged. Nothing in this e-mail is intended to conclude a contract on behalf of QinetiQ or make QinetiQ subject to any other legally binding commitments, unless the e-mail contains an express statement to the contrary or incorporates a formal Purchase Order.
For those other than the recipient any disclosure, copying, distribution, or any action taken or omitted to be taken in reliance on such information is prohibited and may be unlawful.
Emails and other electronic communication with QinetiQ may be monitored and recorded for business purposes including security, audit and archival purposes. Any response to this email indicates consent to this.
Telephone calls to QinetiQ may be monitored or recorded for quality control, security and other business purposes.
QinetiQ Limited Registered in England & Wales: Company Number:3796233 Registered office: 85 Buckingham Gate, London SW1E 6PD, United Kingdom Trading address: Cody Technology Park, Cody Building, Ively Road, Farnborough, Hampshire, GU14 0LX, United Kingdom http://www.qinetiq.com/home/notices/legal.html
Phil, The idea of loadState is to be able to load information saved in a previous session and it doesn't always work as expected when you reload the state into the session that saved it. That isn't something I imagined someone wanting to do.
loadState restores the values of references in the executable to the values they had when the state was saved and this typically involves loading additional mutable and immutable data. Any existing saved state in the session is discarded but that raises the question of what happens if a thread, in particular the one actually calling loadState, is using some of the data in the old saved state. To allow this to work the contents of the old saved state is moved into the local heap before the new saved state is loaded. Most or all of it will then get garbage collected away. The effect of this is that if you happen to load the same state that has previously been loaded or saved then any objects will be duplicated. There's no check that a state is actually the same as one that was previously in the session, either loaded or saved, and in any case I'm not sure there should be.
David
Philip Clayton wrote:
David,
Thanks. This makes sense and explains why we were seeing an inactive thread. It's useful to know that values can be passed from one sessions to the next via the stack - we hadn't fully appreciated that feature.
This behaviour with threads works well for what we are trying to do but there is still something we can't explain or see how to work around: after a loadState, when the running thread retrieves a referenced value, it no longer appears to pick up the latest value. I've provided an example that demonstrates this below. Is there a way for the thread to pick up assignments to the ref variable 'c' after the loadState?
Thanks, Phil
David,
That behaviour sounds reasonable to me. I now realize that when I was simplifying the example we have, I went a little too far and eliminated the hierarchical aspect of the state, making the problem I ended up describing subtly different.
My understanding of loading a state is that new objects are created whether or not the same object is already loaded. That seems the right thing to do, as you say. When loading a child state however, I would expect only objects in the child state to be created and any objects not in the child to remain as they are. This does not appear to fit with our observations in relation to threads though. Loading a child state appears to prevent a running thread from picking up the latest mutable data, even though all objects used in the running thread are not in the child state. I can't recreate the issue without threads. I've updated the previous example to show this.
Sorry if I'm missing something obvious here.
Thanks, Phil
val c = ref 0;
open Thread;
fun run () : Thread.thread = let val cv = ConditionVar.conditionVar (); val m = Mutex.mutex ();
fun aux () = let open Time; val _ = ConditionVar.waitUntil (cv, m, now () + fromSeconds 1); in print (Int.toString (!c)); aux () end; in Thread.fork (aux, []) end;
PolyML.SaveState.saveState "test1.polydb"; (* save state with c = ref 0 *)
(* save a child of test1.polydb that contains nothing extra *)
PolyML.SaveState.saveChild ("test2.polydb", 1);
val t = run (); (* prints 0 on standard output every second *)
c := 2; (* now prints 2 *) c := 3; (* now prints 3 *)
(* load the child test2.polydb *)
val t = (PolyML.SaveState.loadState "test2.polydb"; t);
(* still printing 3, even though saved when c = ref 0 *)
Thread.isActive t; (* thread still active *)
c := 1; (* no effect, still prints 3... *)
!c; (* ...even though !c = 1 *)
(* kill thread and start again with same function run: now picks up current c *)
Thread.kill t;
val t = run ();
(* uses current c: now prints 1 - ok *)
c := 2; (* now prints 2 - ok *)
(* this seems strange because run is not in the child state so we would *) (* expect it to be the same object. *)
David Matthews wrote:
Phil, The idea of loadState is to be able to load information saved in a previous session and it doesn't always work as expected when you reload the state into the session that saved it. That isn't something I imagined someone wanting to do.
loadState restores the values of references in the executable to the values they had when the state was saved and this typically involves loading additional mutable and immutable data. Any existing saved state in the session is discarded but that raises the question of what happens if a thread, in particular the one actually calling loadState, is using some of the data in the old saved state. To allow this to work the contents of the old saved state is moved into the local heap before the new saved state is loaded. Most or all of it will then get garbage collected away. The effect of this is that if you happen to load the same state that has previously been loaded or saved then any objects will be duplicated. There's no check that a state is actually the same as one that was previously in the session, either loaded or saved, and in any case I'm not sure there should be.
David
Philip Clayton wrote:
David,
Thanks. This makes sense and explains why we were seeing an inactive thread. It's useful to know that values can be passed from one sessions to the next via the stack - we hadn't fully appreciated that feature.
This behaviour with threads works well for what we are trying to do but there is still something we can't explain or see how to work around: after a loadState, when the running thread retrieves a referenced value, it no longer appears to pick up the latest value. I've provided an example that demonstrates this below. Is there a way for the thread to pick up assignments to the ref variable 'c' after the loadState?
Thanks, Phil
The information contained in this E-Mail and any subsequent correspondence is private and is intended solely for the intended recipient(s). The information in this communication may be confidential and/or legally privileged. Nothing in this e-mail is intended to conclude a contract on behalf of QinetiQ or make QinetiQ subject to any other legally binding commitments, unless the e-mail contains an express statement to the contrary or incorporates a formal Purchase Order.
For those other than the recipient any disclosure, copying, distribution, or any action taken or omitted to be taken in reliance on such information is prohibited and may be unlawful.
Emails and other electronic communication with QinetiQ may be monitored and recorded for business purposes including security, audit and archival purposes. Any response to this email indicates consent to this.
Telephone calls to QinetiQ may be monitored or recorded for quality control, security and other business purposes.
QinetiQ Limited Registered in England & Wales: Company Number:3796233 Registered office: 85 Buckingham Gate, London SW1E 6PD, United Kingdom Trading address: Cody Technology Park, Cody Building, Ively Road, Farnborough, Hampshire, GU14 0LX, United Kingdom http://www.qinetiq.com/home/notices/legal.html
Phil,
Philip Clayton wrote:
My understanding of loading a state is that new objects are created whether or not the same object is already loaded. That seems the right thing to do, as you say. When loading a child state however, I would expect only objects in the child state to be created and any objects not in the child to remain as they are. This does not appear to fit with our observations in relation to threads though. Loading a child state appears to prevent a running thread from picking up the latest mutable data, even though all objects used in the running thread are not in the child state. I can't recreate the issue without threads. I've updated the previous example to show this.
I've had a look through your example and I'm not certain exactly what's going on but I think the answer is that loading a child state always (re)loads the parents, at least as the code is currently implemented. This causes a new version of your ref "c" to be created and breaks the link between the ref that the thread is using and the ref that is found by looking up "c".
I guess it would be possible to change this so that saved states that were already loaded were not reloaded. I can't think of any reasons why this wouldn't work. It would require changes to the code. However, I wonder if it's what users would always want. Aren't there cases where a user might want loadState on a child to restore the whole state?
David
On Friday 28 Nov 2008 11:06 am, David Matthews wrote:
Phil,
Philip Clayton wrote:
My understanding of loading a state is that new objects are created whether or not the same object is already loaded. That seems the right thing to do, as you say. When loading a child state however, I would expect only objects in the child state to be created and any objects not in the child to remain as they are. This does not appear to fit with our observations in relation to threads though. Loading a child state appears to prevent a running thread from picking up the latest mutable data, even though all objects used in the running thread are not in the child state. I can't recreate the issue without threads. I've updated the previous example to show this.
I've had a look through your example and I'm not certain exactly what's going on but I think the answer is that loading a child state always (re)loads the parents, at least as the code is currently implemented. This causes a new version of your ref "c" to be created and breaks the link between the ref that the thread is using and the ref that is found by looking up "c".
This may be obvious, but you don't need threads to exhibit this issue. E.g.,
val c1 = ref 1;
PolyML.SaveState.saveState "test1.polydb"; (* save state with c1 = ref 1 *)
val c2 = c1; (c1 := 2; c2); (* val it = ref 2 *)
PolyML.SaveState.saveChild ("test2.polydb", 1); (* save state with c2 = c1 & c1 = ref 1 *)
val c3 = c1; (c1 := 3; c3); (* val it = ref 2 *)
(* load the child test2.polydb *) val c4 = (PolyML.SaveState.loadState "test2.polydb"; c3);
c1; (* val it = ref 2 *) c4; (* val it = ref 3 *) c3; (* Error-Value or constructor (c3) has not been declared *)
(c4 := 9; (c1, c4));(* val it = (ref 2, ref 9) *) (c1 := 5; (c1, c4));(* val it = (ref 5, ref 9) *)
I guess it would be possible to change this so that saved states that were already loaded were not reloaded. I can't think of any reasons why this wouldn't work. It would require changes to the code. However, I wonder if it's what users would always want. Aren't there cases where a user might want loadState on a child to restore the whole state?
I would expect it to restore only that part of the state that is referred to by the child and that does not exist, if that makes sense. I would not have expected it to undeclare c3 above and would have expected it to leave c1, c2, c3 and c4 all referencing the same cell.
What would be very useful would be if you could save immediate children of some state (say each containing some data and some code to operate on that data) and load them in independently with any pointers into the common ancestry being resolved in the obvious way.
In ProofPower, for example, we currently have to save and reload the entire state of the theory hierarchy. It would be nice if theories and associated proof procedures could be saved separately and only loaded when needed. HOL IV has some kind of home-rolled way of achieving something that looks like this under Moscow ML (which effectively doesn't allow data to be saved at all, but does allow modules of code to be saved separately).
Regards,
Rob.
Rob, The current saveState/loadState saves and restores the whole "state". It works at the level of the memory and has no knowledge of what the values represent. When you load "test2.polydb" this simply restores all global references to the values they had when you saved the state. In particular the global name space is restored so any declarations made after the save will be lost. Using a hierarchy doesn't change this behaviour; using child saved states merely reduces the amount of data that may need to be written out.
To change this behaviour the process of saving and restoring would need to be more closely integrated with ML itself. I think it would be possible to adapt the existing scheme so that instead of treating the global mutable values as roots the "save" operation would take a list of pointers and write out the data structure reachable from this, apart from any data in the parents. The corresponding "load" function would be passed these pointers once the data structures had been brought into memory. Using these operations as primitives it ought to be possible to write a "saveStructure" function that would look up an ML structure in the global state and write it out. E.g. structure S = struct val x = 1 end PolyML.SaveState.saveStructure("S", "saveS.polydb", 1 (*Child level*));
The subsequently PolyML.SaveState.loadStructure "saveS.polydb"; S.x; Here "loadStructure" adds the structure declared within the saved state to the global name space. Unlike the existing "loadState" function it would know about the global name space.
There could be similar functions to save functors and signatures and, if necessary, types, values and fixity.
Since these saved modules would be relative to an existing executable and, possibly saved states, references to global types would match when the module was read in and there would be no need to package up type information explicitly. If a module referred to a type in a different module, though, the types would be different when the modules were read in.
I haven't thought this through in detail or made any experiments to check that it would all work but it seems feasible.
David
Rob Arthan wrote:
This may be obvious, but you don't need threads to exhibit this issue. E.g.,
val c1 = ref 1;
PolyML.SaveState.saveState "test1.polydb"; (* save state with c1 = ref 1 *)
val c2 = c1; (c1 := 2; c2); (* val it = ref 2 *)
PolyML.SaveState.saveChild ("test2.polydb", 1); (* save state with c2 = c1 & c1 = ref 1 *)
val c3 = c1; (c1 := 3; c3); (* val it = ref 2 *)
(* load the child test2.polydb *) val c4 = (PolyML.SaveState.loadState "test2.polydb"; c3);
c1; (* val it = ref 2 *) c4; (* val it = ref 3 *) c3; (* Error-Value or constructor (c3) has not been declared *)
(c4 := 9; (c1, c4));(* val it = (ref 2, ref 9) *) (c1 := 5; (c1, c4));(* val it = (ref 5, ref 9) *)
I would expect it to restore only that part of the state that is referred to by the child and that does not exist, if that makes sense. I would not have expected it to undeclare c3 above and would have expected it to leave c1, c2, c3 and c4 all referencing the same cell.
What would be very useful would be if you could save immediate children of some state (say each containing some data and some code to operate on that data) and load them in independently with any pointers into the common ancestry being resolved in the obvious way.
In ProofPower, for example, we currently have to save and reload the entire state of the theory hierarchy. It would be nice if theories and associated proof procedures could be saved separately and only loaded when needed. HOL IV has some kind of home-rolled way of achieving something that looks like this under Moscow ML (which effectively doesn't allow data to be saved at all, but does allow modules of code to be saved separately).