[scheme-reports-wg2] Generative records, eval, and sample implementations

Marc Nieper-Wißkirchen

2016-07-19 13:37:11 UTC

The following example

((eval 'port? (environment '(scheme base)))
 (current-input-port))

evaluates to #t because the standard does not list any value types that may
be incompatible between the program's environment and the eval's
environment.

On the other hand,

((eval 'hash-table? (environment '(srfi 125)))
 (make-hash-table (make-eq-comparator)))

may not need to return #t when the sample implementation of SRFI 125 is
used. (I could have replaced the sample implementation of SRFI 125 with any
other sample implementation that uses record types to generate new disjoint
types.) The reason is that nothing in the report forbids the environment
procedure from loading the library (srfi 125) a second time, creating a
fresh record type that is incompatible with the record type of hash tables
in the program's environment.
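
The effect can be imitated without eval or libraries. The following is
only a sketch: load-table-library is a hypothetical stand-in for the body
of a library such as the SRFI 125 sample implementation, and calling it
twice models loading that library twice. It assumes an implementation in
which each evaluation of define-record-type creates a new type; the
report's wording ("each use") may leave room for other behaviour, so no
output is claimed for the second predicate.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical stand-in for loading a library body: every call
;; evaluates the define-record-type form again.
(define (load-table-library)
  (define-record-type table
    (make-table items)
    table?
    (items table-items))
  (cons make-table table?))

(define first-load (load-table-library))
(define second-load (load-table-library))

;; A table built by the constructor from the first loading.
(define t ((car first-load) '()))

(display ((cdr first-load) t))   ; same loading: the predicate accepts t
(newline)
(display ((cdr second-load) t))  ; other loading: may well reject t
(newline)
```

If the second line printed is #f, the two loadings have produced
incompatible types, which is the situation described above.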

This makes the types introduced by the SRFIs, which lead, in particular, to
the Red Edition, second-class types when compared with the basic types in
the report.

I see two obvious solutions for R7RS-large here:

1) During evaluation of a program, every library is only loaded at most
once.

2) The sample implementations should use a type of nongenerative record
(which should then be defined for R7RS-large as soon as possible).

The disadvantage of 1) is that it puts huge constraints on implementations.
A compiler will have to put the libraries initially imported by the program
on the same footing as the libraries later referenced by evaluating an
application of the environment procedure. The huge advantage of 1) is that
it won't come with any surprises for a user of the system.

The disadvantage of 2) is that it only partially solves the problem. Some
libraries like SRFI 128 maintain inner state (think of the registered
comparators). A user may (rightfully?) expect that this still works for
evaluated code.

--

Marc

--
You received this message because you are subscribed to the Google Groups "scheme-reports-wg2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scheme-reports-wg2+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Alaric Snell-Pym

2016-07-19 14:14:16 UTC

Post by Marc Nieper-Wißkirchen
The disadvantage of 2) is that it only partially solves the problem. Some
libraries like SRFI 128 maintain inner state (think of the registered
comparators). A user may (rightfully?) expect that this still works for
evaluated code.

Relatedly, I have always worried about the implementation of generic
functions. As far as I can tell, these tend to be implemented by having
the operation of loading a module that defines some methods on a GF
mutate a table stored inside the GF (effectively). This is alarming, as
the loading of some distant module (perhaps triggered by an eval) can
then change the behaviour of existing code; if the existing code is
currently calling a * GF with arguments of type integer, and currently
getting an implementation defined on numbers in general - then the
loading of a module, somewhere distant in the runtime, that defines a
more specialised integer implementation can affect that.

Aside from detailed mechanics of the interactions of eval and modules,
what sort of semantics for that sort of thing are desirable? It's
tricky, as GFs are a mainstream and useful thing that just so happen to
violate lexical AND dynamic scope. What *is* the scope of a generic
function, within which method implementations and calls can be brought
together? In that case, one might want to have "generative GFs", where
two different parts of the program get their own definition of the GF to
which methods are attached, and what methods you get depend on which
declaration of the GF you call it through... But you wouldn't generally
want that to be the case for every library that loads the same module
defining the GF. Some larger declaration of scope seems necessary, and
I've been meaning for a while to go and see if this has been addressed
in any existing systems, because I've not bumped into mention of it yet!

Post by Marc Nieper-Wißkirchen
Marc

ABS

--
Alaric Snell-Pym
http://www.snell-pym.org.uk/alaric/

Marc Nieper-Wißkirchen

2016-07-19 14:48:58 UTC

Post by Alaric Snell-Pym

Post by Marc Nieper-Wißkirchen
The disadvantage of 2) is that it only partially solves the problem. Some
libraries like SRFI 128 maintain inner state (think of the registered
comparators). A user may (rightfully?) expect that this still works for
evaluated code.

Relatedly, I have always worried about the implementation of generic
functions. As far as I can tell, these tend to be implemented by having
the operation of loading a module that defines some methods on a GF
mutate a table stored inside the GF (effectively). This is alarming, as
the loading of some distant module (perhaps triggered by an eval) can
then change the behaviour of existing code; if the existing code is
currently calling a * GF with arguments of type integer, and currently
getting an implementation defined on numbers in general - then the
loading of a module, somewhere distant in the runtime, that defines a
more specialised integer implementation can affect that.
Aside from detailed mechanics of the interactions of eval and modules,
what sort of semantics for that sort of thing are desirable? It's
tricky, as GFs are a mainstream and useful thing that just so happen to
violate lexical AND dynamic scope. What *is* the scope of a generic
function, within which method implementations and calls can be brought
together? In that case, one might want to have "generative GFs", where
two different parts of the program get their own definition of the GF to
which methods are attached, and what methods you get depend on which
declaration of the GF you call it through... But you wouldn't generally
want that to be the case for every library that loads the same module
defining the GF. Some larger declaration of scope seems necessary, and
I've been meaning for a while to go and see if this has been addressed
in any existing systems, because I've not bumped into mention of it yet!

Couldn't this be solved under 1) using parameter objects?

A module (GF) defines a parameter holding the table of methods. Whenever
a method is added or looked up, the parameter is used to locate the
table of methods. Around a call of eval, parameterize is used to let the
parameter point to a copy of the table of methods (using functional
data structures, this can be quite efficient).

Another idea would be to define a new type of location, say an "instance
variable". Each instance variable is like a mutable parameter object
that, however, is set to its original initial value in the environment
returned by the procedure named environment.
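
A minimal sketch of the parameter-object idea, under stated assumptions:
current-methods, add-method!, dispatch and call-with-private-methods are
hypothetical names, the method table is a plain association list of
(predicate . procedure) entries, and the thunk stands in for whatever
code would call eval. Because the alist itself is never mutated, the
"copy" is just a fresh cell pointing at the same list.

```scheme
(import (scheme base))

;; The parameter holds a mutable cell (a one-element list) whose car is
;; an alist of (predicate . procedure) entries, newest first.
(define current-methods (make-parameter (list '())))

;; Add a method to whichever table is current.
(define (add-method! pred proc)
  (let ((cell (current-methods)))
    (set-car! cell (cons (cons pred proc) (car cell)))))

;; Call the first method whose predicate accepts the argument.
(define (dispatch arg)
  (let loop ((entries (car (current-methods))))
    (cond ((null? entries) (error "no applicable method" arg))
          (((caar entries) arg) ((cdar entries) arg))
          (else (loop (cdr entries))))))

;; Run thunk with a private cell that starts out sharing the current
;; alist; methods added inside do not leak out.
(define (call-with-private-methods thunk)
  (parameterize ((current-methods (list (car (current-methods)))))
    (thunk)))

(add-method! number? (lambda (x) 'number))

(define inside
  (call-with-private-methods
    (lambda ()
      ;; This is where a call of eval would go.
      (add-method! integer? (lambda (x) 'integer))
      (dispatch 1))))

(define outside (dispatch 1))
```

Here inside is the symbol integer and outside is the symbol number: the
more specialised method added within the private extent is not visible
to the surrounding program.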


Marc Nieper-Wißkirchen

2016-07-19 14:52:58 UTC

On 19.07.2016 at 16:14, Alaric Snell-Pym wrote:

On 19/07/16 14:37, Marc Nieper-Wißkirchen wrote:

The disadvantage of 2) is that it only partially solves the problem. Some
libraries like SRFI 128 maintain inner state (think of the registered
comparators). A user may (rightfully?) expect that this still works for
evaluated code.

Relatedly, I have always worried about the implementation of generic
functions. As far as I can tell, these tend to be implemented by having
the operation of loading a module that defines some methods on a GF
mutate a table stored inside the GF (effectively). This is alarming, as
the loading of some distant module (perhaps triggered by an eval) can
then change the behaviour of existing code; if the existing code is
currently calling a * GF with arguments of type integer, and currently
getting an implementation defined on numbers in general - then the
loading of a module, somewhere distant in the runtime, that defines a
more specialised integer implementation can affect that.

Aside from detailed mechanics of the interactions of eval and modules,
what sort of semantics for that sort of thing are desirable? It's
tricky, as GFs are a mainstream and useful thing that just so happen to
violate lexical AND dynamic scope. What *is* the scope of a generic
function, within which method implementations and calls can be brought
together? In that case, one might want to have "generative GFs", where
two different parts of the program get their own definition of the GF to
which methods are attached, and what methods you get depend on which
declaration of the GF you call it through... But you wouldn't generally
want that to be the case for every library that loads the same module
defining the GF. Some larger declaration of scope seems necessary, and
I've been meaning for a while to go and see if this has been addressed
in any existing systems, because I've not bumped into mention of it yet!

Couldn't this be solved under 1) using parameter objects?

A module (GF) defines a parameter holding the table of methods. Whenever a
method is added or looked up, the parameter is used to locate the table of
methods. Around a call of eval, parameterize is used to let the parameter
point to a copy of the table of methods (using functional data
structures, this can be quite efficient).

Another idea would be to define a new type of location, say "instance
variables".


John Cowan

2016-07-20 20:44:55 UTC

Post by Alaric Snell-Pym
Aside from detailed mechanics of the interactions of eval and modules,
what sort of semantics for that sort of thing are desirable? It's
tricky, as GFs are a mainstream and useful thing that just so happen to
violate lexical AND dynamic scope. What *is* the scope of a generic
function, within which method implementations and calls can be brought
together? In that case, one might want to have "generative GFs", where
two different parts of the program get their own definition of the GF to
which methods are attached, and what methods you get depend on which
declaration of the GF you call it through... But you wouldn't generally
want that to be the case for every library that loads the same module
defining the GF.

The Chibi approach, which I am going to propose for the Yellow Edition,
is that generic functions are simply mutable procedures. There is a
simple procedural API with make-generic to construct an empty generic
function that always throws when you call it, and add-method! to install
a new method. The latter accepts an array of predicates corresponding
to the argument types, and a lambda to be called if all the types match.
This way you are not dependent on any particular type hierarchy. It's up
to the programmer to install the methods in the correct order.

Trivial macros define-generic and define-method are layered over these:
define-method must be a low-level macro, as non-hygienic call-next-method
can be used to invoke the next available method.

This is crude in some sense, but it does resolve the hygiene and sharing
difficulties: the GF object can be held in a global variable, a local
variable, a parameter, or what have you, and its visibility is controlled
just like any other variable.
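
To make the shape of such an API concrete, here is a rough sketch. It is
not Chibi's implementation and not the proposed text: only the names
make-generic and add-method! come from the description above, while the
argument order, the use of a list (rather than an array) of predicates,
the private add-method-tag marker and the applicable? helper are all
assumptions made for illustration. Methods are tried in installation
order, so the programmer decides which one wins.

```scheme
(import (scheme base))

;; Unique private marker used to tell "add a method" from a normal call.
(define add-method-tag (list 'add-method))

;; True if preds and args have the same length and every predicate
;; accepts the corresponding argument.
(define (applicable? preds args)
  (cond ((and (null? preds) (null? args)) #t)
        ((or (null? preds) (null? args)) #f)
        (((car preds) (car args)) (applicable? (cdr preds) (cdr args)))
        (else #f)))

;; A generic function is an ordinary procedure closing over its own
;; mutable method list; with no methods, every call signals an error.
(define (make-generic name)
  (let ((methods '()))
    (lambda args
      (if (and (pair? args) (eq? (car args) add-method-tag))
          (set! methods
                (append methods
                        (list (cons (list-ref args 1) (list-ref args 2)))))
          (let loop ((ms methods))
            (cond ((null? ms) (error "no applicable method" name args))
                  ((applicable? (caar ms) args) (apply (cdar ms) args))
                  (else (loop (cdr ms)))))))))

(define (add-method! generic preds proc)
  (generic add-method-tag preds proc))

;; Usage: the more specific method is installed first.
(define describe (make-generic 'describe))
(add-method! describe (list integer?) (lambda (n) 'integer))
(add-method! describe (list number?) (lambda (n) 'number))
```

With these definitions (describe 1) returns integer and (describe 1.5)
returns number. Since describe is just a value, it can be bound
globally, locally, or held in a parameter like any other object.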
--
John Cowan http://www.ccil.org/~cowan ***@ccil.org
I am expressing my opinion. When my honorable and gallant friend is
called, he will express his opinion. This is the process which we
call Debate. --Winston Churchill


Taylan Ulrich Bayırlı/Kammer

2016-07-19 14:48:26 UTC

Post by Marc Nieper-Wißkirchen
1) During evaluation of a program, every library is only loaded at
most once.

I think this is a Good Thing to specify in general. If programmers can
rely on the top-level of every library only being evaluated once during
the run-time of a Scheme program, they will have an easier time
reasoning about the behavior of a program. Moreover, a programmer may
forget about the fact that a top-level may be evaluated multiple times
(since it's arguably counter-intuitive) and create hard-to-find bugs.

The issue you raise is one concrete example where the possibility of the
top-level of a library being evaluated multiple times leads to a
problem, and I expect more such issues will pop up in the future if the
problem isn't solved from its root.

Taylan


Per Bothner

2016-07-19 16:09:49 UTC

Post by Marc Nieper-Wißkirchen
1) During evaluation of a program, every library is only loaded at
most once.

I think this is the only thing that makes sense.

R7RS states that when module M1 imports M2, evaluation of M2 happens
before evaluation of M1 - regardless of where the import declaration
is in M1, or how many import declarations for M2 there are in M1.
I.e. evaluation of M2 happens once, before evaluation of M1.

R7RS does state "If a library is imported by more than one program or library,
it may possibly be loaded additional times." However, I don't think that makes
sense - it would make for a weird mental model, considering the previous.
And it would violate transitivity: If M1 imports M2 and M3, and M2 and M3
both import M4, then in a sense M1 imports M4 twice. If evaluation
happened twice when M4 is imported twice indirectly, but only happened once
when M4 is imported directly, I think it would be really weird.

--
--Per Bothner
***@bothner.com http://per.bothner.com/

Marc Nieper-Wißkirchen

2016-07-19 18:29:13 UTC

In order to make it easy for implementations to support the most sensible
model, 1), it might have been a good idea to make the `environment'
procedure a special form that only accepts literals as arguments. If this
was the case, a compiler could load all needed libraries ahead and exactly
once.

--

Marc

Post by Per Bothner

Post by Marc Nieper-Wißkirchen
1) During evaluation of a program, every library is only loaded at
most once.

I think this is the only thing that makes sense.
R7RS states that when module M1 imports M2, evaluation of M2 happens
before evaluation of M1 - regardless of where the import declaration
is in M1, or how many import declarations for M2 there are in M1.
I.e. evaluation of M2 happens once, before evaluation of M1.
R7RS does state "If a library is imported by more than one program or library,
it may possibly be loaded additional times." However, I don't think that makes
sense - it would make for a weird mental model, considering the previous.
And it would violate transitivity: If M1 imports M2 and M3, and M2 and M3
both import M4, then in a sense M1 imports M4 twice. If evaluation
happened twice when M4 is imported twice indirectly, but only happened once
when M4 is imported directly, I think it would be really weird.
--
--Per Bothner


Alaric Snell-Pym

2016-07-19 19:56:41 UTC

Post by Marc Nieper-Wißkirchen
In order to make it easy for implementations to support the most sensible
model, 1), it might have been a good idea to make the `environment'
procedure a special form that only accepts literals as arguments. If this
was the case, a compiler could load all needed libraries ahead and exactly
once.

Perhaps it can notice when 'environment' is called with a literal as an
argument, and do that, in the common case: and then only have to do
anything complicated if it's called with non-literal arguments.

ABS

--
Alaric Snell-Pym
http://www.snell-pym.org.uk/alaric/

Marc Nieper-Wißkirchen

2016-07-22 13:17:18 UTC

Post by Alaric Snell-Pym

Post by Marc Nieper-Wißkirchen
In order to make it easy for implementations to support the most sensible
model, 1), it might have been a good idea to make the `environment'
procedure a special form that only accepts literals as arguments. If this
was the case, a compiler could load all needed libraries ahead and exactly
once.

Perhaps it can notice when 'environment' is called with a literal as an
argument, and do that, in the common case: and then only have to do
anything complicated if it's called with non-literal arguments.
ABS

I have just realised that restricting library names to literals wouldn't be
very restrictive. Using `eval', one could still create mutable environments.


John Cowan

2016-07-20 00:50:28 UTC

Post by Marc Nieper-Wißkirchen
1) During evaluation of a program, every library is only loaded at most
once.

I think this would be a good thing, but I don't want to jump to conclusions.

Post by Marc Nieper-Wißkirchen
2) The sample implementations should use a type of nongenerative record
(which should then be defined for R7RS-large as soon as possible).

Alternatively:

3) We could simply declare by fiat that the R7RS-large standard types
are unique and disjoint to one another and to all user-created types,
without saying how this is done. It's possible, for example, that an
implementation uses a variant of define-record-type to define ports.
--
John Cowan http://www.ccil.org/~cowan ***@ccil.org
Let's face it: software is crap. Feature-laden and bloated, written under
tremendous time-pressure, often by incapable coders, using dangerous
languages and inadequate tools, trying to connect to heaps of broken or
obsolete protocols, implemented equally insufficiently, running on
unpredictable hardware -- we are all more than used to brokenness.
--Felix Winkelmann


Per Bothner

2016-07-20 01:10:36 UTC

Post by Marc Nieper-Wißkirchen
1) During evaluation of a program, every library is only loaded at most
once.

On further thought, I feel even more strongly that no other option is usable.
Consider the case where the body of a library initializes some data structure.
It could be disastrous if that happened multiple times.

Also consider diamond import: M1 imports M2 and M3, which both import M4.
Suppose M4 exports a variable v initialized using some expression.
Then we risk two versions of v that are not eq? to each other, which
could easily lead to weird behavior.

--
--Per Bothner
***@bothner.com http://per.bothner.com/

Marc Nieper-Wißkirchen

2016-07-20 06:05:08 UTC

Post by Per Bothner

Post by Marc Nieper-Wißkirchen
1) During evaluation of a program, every library is only loaded at most
once.

On further thought, I feel even more strongly that no other option is usable.
Consider the case where the body of a library initializes some data structure.
It could be disastrous if that happened multiple times.
Also consider diamond import: M1 imports M2 and M3, which both import M4.
Suppose M4 exports a variable v initialized using some expression.
Then we risk two versions of v that are not eq? to each other, which
could easily lead to weird behavior.

Parameter objects are one of these use cases that won't work well if a
library is loaded multiple times.


Marc Nieper-Wißkirchen

2016-07-20 06:35:44 UTC

Post by John Cowan

Post by Marc Nieper-Wißkirchen
1) During evaluation of a program, every library is only loaded at most
once.

I think this would be a good thing, but I don't want to jump to conclusions.

Post by Marc Nieper-Wißkirchen
2) The sample implementations should use a type of nongenerative record
(which should then be defined for R7RS-large as soon as possible).

3) We could simply declare by fiat that the R7RS-large standard types
are unique and disjoint to one another and to all user-created types,
without saying how this is done. It's possible, for example, that an
implementation uses a variant of define-record-type to define ports.

My point is that the Red Edition was advertised as one which can be
readily implemented on any R7RS-small system by using the sample
implementations. I don't see how 3) can help here because the sample
implementations have no way to enforce it as long as loading of
libraries can happen in unpredictable ways.

If M1 imports M2 and M3, and M2 and M3 both import (srfi 125) (statically
or by using eval), and if multiple loading were allowed, it could happen
that M1 cannot pass objects between imported procedures of M2 and M3
because each of them has received its implementation of hash tables from
a different loading (the resulting types are incompatible with the
current sample implementation).

And even if 3) were enforced, it still wouldn't be possible for a user or
for further SRFIs to extend R7RS-large with types that are at least as
first class as the standard types. Or, think of the example of the radix
parameter in R7RS-small. Assume that it is defined in M3, which is
imported by M1 and M2. M1 parameterizes the radix and calls a procedure
in M2. It may happen that M2 does not see the change of the radix.
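
The radix scenario can be modelled in a few lines. This is a sketch, not
real library code: load-m3 is a hypothetical stand-in for the body of
M3, and the two calls model M1 and M2 each receiving their own loading.

```scheme
(import (scheme base))

;; Hypothetical stand-in for the body of M3: each call models one loading
;; and therefore creates a distinct parameter object.
(define (load-m3) (make-parameter 10))

(define radix-in-m1 (load-m3))
(define radix-in-m2 (load-m3))

;; A procedure "in M2" that consults the radix M2 imported.
(define (m2-number->string n)
  (number->string n (radix-in-m2)))

;; M1 parameterizes the radix it imported and calls into M2.
(define result
  (parameterize ((radix-in-m1 2))
    (m2-number->string 5)))
```

Here result is "5" rather than "101": the parameterize in M1 rebinds a
different parameter object from the one M2 reads.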

I'd like to see these fundamental issues resolved as early as possible in
the standardization process of R7RS-large because they influence
everything else and they decide how usable R7RS-large will be. (On the
other hand, libraries like SRFI 135 (just to pick an example) are not
that crucial because it has been shown that they can even be implemented
on an R7RS-small system, so even if we somehow forgot to include such a
library in R7RS-large, users could easily amend the language in that
corner.)

--

Marc

P.S.: Another fundamental thing that should be specified soon is
extensible feature detection, so that the rest of the language can rely
on it. At the moment, there is no standard way for libraries like (srfi
125) to statically share information about whether they, say, support weak
hash tables. What I have in mind is something like the following:

- Add a <feature declaration> option of the form (feature <identifier>)
to library definitions. As soon as the library is loaded, add (<library
name part>+ (<identifier>)) to the list returned by (features).
- Besides <library name>, also allow (<library name part>+
(<identifier>)) as a cond-expand clause. This matches if and only if
<identifier> is declared as a feature in (<library name part>+).

Example:

(define-library (srfi 125)
  (cond-expand
    ((srfi 124)
     (import (srfi 124))
     (feature weak-hash-tables)))
  ...)

Or, even more useful:

(define-library (srfi 125)
  (cond-expand
    ((srfi 124 (true-ephemerons))
     (import (srfi 124))
     (feature weak-hash-tables)))
  ...)


Alex Shinn

2016-07-21 08:09:24 UTC

On Tue, Jul 19, 2016 at 10:37 PM, Marc Nieper-Wißkirchen

Post by Marc Nieper-Wißkirchen
1) During evaluation of a program, every library is only loaded at most
once.

You'd have to word that carefully to allow for sandboxed
environments, but even then I think this is too restrictive:
we want to support R6RS' tower of phases. Also the
problems you're trying to avoid will still be present with
batch compilation of libraries.

--
Alex

Marc Nieper-Wißkirchen

2016-07-21 08:22:33 UTC

On Tue, Jul 19, 2016 at 10:37 PM, Marc Nieper-Wißkirchen

Post by Marc Nieper-Wißkirchen
1) During evaluation of a program, every library is only loaded at most
once.

You'd have to word that carefully to allow for sandboxed
environments,

Can you explain this a little bit more?

we want to support R6RS' tower of phases.

As soon as phases are introduced (I am still not sure whether R6RS'
approach was that expedient, however), just read "during evaluation" as "in
phase 0". (And imply that during each phase, a library is loaded at most
once.)

It should also be emphasized that "eval" does not introduce a phase "-1"
(otherwise we run into the aforementioned problems).

Also the
problems you're trying to avoid will still be present with
batch compilation of libraries.

In what sense?

