[m-rev.] for tryout: type_spec_constrained_preds

Zoltan Somogyi zoltan.somogyi at runbox.com
Tue Jan 30 21:18:41 AEDT 2024


For tryout now and later review by Julien. A tryout on typeclass-heavy
code would either confirm or disprove that

- this design does what it is supposed to do, and that
- what this design is supposed to do is what its intended users
  actually need.

(For term_io.m, a single type_spec_constrained_preds pragma seems to be
sufficient to replace all of its 52 type_spec pragmas, *and* several more
that *should* be there but aren't.)

Unfortunately, with MLDS grades, one cannot tell from generated
.c/.java/.cs files what type specialized procedures they contain,
because the naming scheme we use for type-specialized procedures
requires name mangling, and mlds_to_X_func.m don't write out
the original, non-mangled procedure names even with --auto-comments.
However, the LLDS code generate does write out the original names
with --auto-comments, and a grep for TypeSpecOf in the .c file
will report them.

I am attaching the log and the diff, but a review would be worthwhile
only after we discuss and settle the three design questions below,
and I update the diff to implement the chosen solutions. I would also
add the required documentation then, as there is no point in
writing it before. However, I would like feedback now on the proposed
documentation of a new option in compiler/options.m.

-------------------------------------------------------------

The first is about syntax. At the moment, this diff supports
pragmas like this:

    :- pragma type_spec_constrained_preds(
        [stream.line_oriented(Stream, State),
            stream.unboxed_reader(Stream, char, State, Error),
            stream.putback(Stream, char, State, Error)],
        apply_to_superclasses,
        [subst([Stream = io.text_input_stream,
            State = io.state, Error = io.error])]).

The third argument here uses the syntax we agreed about two or three
months ago, but I wonder whether we should replace the equal signs
with an operator that is more like an arrow, such as -> or =>.
Unlike the existing syntax, this would make it clear that the
two sides of the operator are NOT equivalent; they have different
purposes, and therefore different restrictions.

Some terminology for the discussion below: I call

    subst([Stream = io.text_input_stream,
        State = io.state, Error = io.error])

a substitution, and I call its component

    Stream = io.text_input_stream

a type var substitution.

-------------------------------------------------------------

The second question has to do with possible restrictions on what
variables may appear on the left hand sides of those equal-signs/arrows.
It is a given that each left hand side must be a variable.
In the example above, all of these variables also occur in the
list of constraints in the first argument. This means that
when we apply that pragma to a predicate which uses those constraints
(or a subset of them), we replace the variable names on the LHSs
of the third arg of the type_spec_constrained_preds pragma
with the names of the variables in the predicate declaration
that occur in corresponding positions of those constraints
when we generate the type_spec pragma. For example, given

    :- pred p(....) is det
        <= (stream.line_oriented(Strm, St),
            stream.unboxed_reader(Strm, char, St, E),
            stream.putback(Strm, char, St, E)).

the type type_spec pragma we would generate for this predicate
would map Strm to io.text_input_stream, St to io.state, and
E to io.error.

If some of the variables on the LHSs of the third arg did NOT
occur in the first arg, as with this pragma and pred p,

:- pragma type_spec_constrained_preds(
    [stream.line_oriented(Stream, State),
        stream.unboxed_reader(Stream, char, State, Error),
        stream.putback(Stream, char, State, Error)],
    apply_to_superclasses,
    [subst([Stream = io.text_input_stream, Unit = char,
        State = io.state, Error = io.error])]).

we would not know what variable in the tvarset of this predicate
declaration Unit corresponds to. And indeed, the code in the attached
diff would generate a type_spec pragma that maps an unnamed variable
to char because of this, which makes that generated type_spec pragma
invalid.

I can see four possible approaches to solving this problem.

Approach one would simply disallow the occurrence of such variables
on the LHSs of the third arg, and generate an error message if they
nevertheless occur.

Approach two would allow such variables, but ignore them.

Approach three would allow such variables, but ignore them *unless*
the predicate declaration contains a type var with that name,
in which case it would use that name in the generated type_spec pragma.

Approach four would disallow such variables in the LHSs of the
third argument of the type_spec_constrained_preds pragma,
but would allow such variables to specified in a new fourth
argument of that pragma. This fourth arg would be a single
substitution, and the rule would be:

    if a type var occurs in the first arg, then
        it may occur on the LHS of a type var substitution
        in the third arg, but not in the fourth
    if a type var does not occur in the first arg, then
        it may occur on the LHS of a type var substitution
        in the fourth arg, but not in the third

But apart from this syntactic difference, the semantics would match
approach three: we would pay attention to a given type var substitution
in the fourth arg only in predicates whose declarations have
a type variable with the name on its LHS.

I would prefer approach four, though I could also live with approach one.
I view approach two as violating the law of least astonishment. As for
approach three, it could work, but any documentation of its semantics
would be signicantly more complicated than a documentation of the
semantics of approach four, and its use in practice would require
more attention as well. With approach four, the compiler can help
diagnose e.g. unintentional spelling mismatches between the first arg
and the later args; with approach three, it cannot.

-------------------------------------------------------------

The third question has to do with possible duplicates.
At the moment, the compiler does not try to diagnose duplicate
type_spec pragmas, and if any occur, it has to be a user-written
pragma duplicating another user-written pragma, so any problem
is clearly the user's fault. After this diff, there can also be
compiler generated pragmas duplicating user pragmas (or vice versa),
and compiler generated pragmas duplicating other compiler generated
pragmas. The latter can occur e.g. when two type_spec_constrained_preds
pragmas specify distinct but overlapping sets of constraints in their
first args, and they are both applied to a predicate whose class context
consists of just that overlap.

I believe that

- we should report duplicates when both type_spec pragmas are user-provided,
- but not if one or both pragmas are compiler generated.

This raises the question of how exactly we can test whether one type_spec
pragma duplicates another. Such pragmas consist of two arguments,
a predicate or procedure specification and a substitution, and two pragmas
clearly differ if they differ in one or both arguments.

I believe the rules should be the following.

- The first args of two type_spec pragmas differ if and only if
  they specify distinct sets of procedures. If they both

  - specify the same predicate,
  - specify the same procedure,
  - one specifies a predicate and the other specifies a procedure
    in that procedure,

  then the procedures they specify overlap, and this makes the two pragmas
  duplicates (if the second args also match).

- The second args of two type_spec pragmas differ if and only if they differ
  after

  - sorting the type var substitutions of which they consist
    on the name of the LHS variable, and

  - replacing all type variables that occur in the types on the RHSs
    of those type var substitutions with distinct anonymous type vars.
    (In fact, we should consider whether we should allow ANY non-anonymous
    type variables to occur in the RHS types.)

Zoltan.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Log.tscp
Type: application/octet-stream
Size: 5464 bytes
Desc: not available
URL: <http://lists.mercurylang.org/archives/reviews/attachments/20240130/7e3a3366/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DIFF.tscp
Type: application/octet-stream
Size: 191709 bytes
Desc: not available
URL: <http://lists.mercurylang.org/archives/reviews/attachments/20240130/7e3a3366/attachment-0003.obj>


More information about the reviews mailing list