[m-rev.] for review: Overhaul of the Syntax chapter of the reference manual

Mark Brown mark at mercurylang.org
Sat Sep 3 17:20:22 AEST 2022


On Fri, Sep 2, 2022 at 5:32 PM Peter Wang <novalazy at gmail.com> wrote:
>
> On Thu, 01 Sep 2022 16:40:15 +1000 Mark Brown <mark at mercurylang.org> wrote:
> > Hi Julien.
> >
> > On Thu, Sep 1, 2022 at 12:36 AM Julien Fischer <jfischer at opturion.com> wrote:
> > >
> > >
> > > Hi Mark,
> > >
> > > On Wed, 31 Aug 2022, Mark Brown wrote:
> > >
> ...
> > > >     Changes to document structure
> > > >     -----------------------------
> > > >
> > > >     Sections and subsections of the Syntax chapter have been merged, to provide
> > > >     a more reasonable amount of information at each node.
> > >
> > >
> > > >     The Syntax chapter
> > > >     as a whole has been split into two chapters, Syntax and Semantics (the
> > > >     existing Semantics chapter is renamed Formal Semantics). The new structure
> > > >     is as follows:
> > >
> > > I think "Semantics" is a problematic name: there's a load of syntax
> > > stuff in there as well. Rather than "Semantics" I suggest simply calling
> > > it "Goals and Expressions".
> >
> > Well, there's some pretty significant things other than goals and
> > expressions, and most of the remaining chapters provide syntax along
> > with the semantics. If you want a more accurate name then it could be
> > "Clause semantics" for example. If you just want to suggest that it's
> > not _all_ of the language semantics it could be called "Basic
> > semantics" or something like that.
> >
> > >
> > > Thinking about this a bit more: there's no particular reason that we need to
> > > have a chapter named "Syntax" that introduces all the syntax up front. Here,
> > > for the sake of discussion, is an alternative structure for the beginning of
> > > the reference manual:
> > >
> > >       1. Introduction
> > >       2. Lexical structure
> > >          2.1 Character set
> > >          2.2 Whitespace
> > >          2.3 Comments
> > >          2.4 Tokens
> > >          2.5 Operators
> >
> > I think the difference here mainly depends on whether we want to use
> > the "article" style or the "structured style". The structure you've
> > given here will definitely be familiar to just about any programmer,
> > so I'll ask a few non-Prolog programmers who have learned Mercury what
> > they think (aside from waiting for others on this list). I'll be happy
> > to change to this structure if need be.

The first comment that came back agrees with Julien on wanting links
in a table of contents for easy reference (so I'll assume we're going
with that style), but the example given was variables rather than
whitespace. Which raises a good point: one of the confusing things for
other programmers is that variables start with uppercase while data
constructors start with lowercase. So maybe the Tokens section also
needs to be split?
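
For instance, in a made-up fragment like the following (the names
'colour' and 'is_red' are only for illustration, not from the manual),
the case of the first letter is the only thing that distinguishes the
variable from the constructors:

    :- type colour
        --->    red
        ;       green.

    :- pred is_red(colour::in) is semidet.

        % Colour is a variable; red is a data constructor.
    is_red(Colour) :-
        Colour = red.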

I'm fine with the name Lexical structure, but I still don't think it's
the right grouping as it covers all of the syntax except terms and
items (and the latter is to be greatly simplified). I think the
distinction between syntax and semantics is more significant to the
reader than the difference between lexical syntax and term syntax.

(As a case in point, one of the devs who recently learned Mercury
reported that it was hard to figure out how to add two ints together,
in part because they didn't realize operator syntax could be used in
function declarations and they didn't see the declaration for '+'. So
the important grouping, I think, is the one that lets us say "this
stuff is common to clauses, declarations, goals, etc, but doesn't have
a meaning of its own - that depends on the context".)
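
As a rough sketch of what tripped them up (the module name is made up,
and the declaration quoted in the comment is paraphrased from memory of
the int module rather than copied from it):

    :- module add_example.
    :- interface.
    :- import_module io.

    :- pred main(io::di, io::uo) is det.

    :- implementation.

        % The int module declares addition using operator syntax,
        % roughly like this:
        %
        %   :- func int + int = int.
        %
        % Without this import, the '+' below parses fine as a term
        % but has no definition.
    :- import_module int.

    main(!IO) :-
        X = 1 + 2,      % syntactically just the term '+'(1, 2)
        io.write_int(X, !IO),
        io.nl(!IO).

The syntax of '1 + 2' is the same whether or not int is imported; only
the meaning depends on the import.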

So here's another structure, also for the sake of discussion:

2. Syntax
  2.1 Character set
  2.2 Whitespace
  2.3 Comments
  2.4 Line number directives
  2.5 Variables
  2.6 Names
  2.7 Literals
  2.8 Punctuation
  2.9 Operators
  2.10 Terms
  2.11 Items

The literals section could also be split.

> >
> > >       3. Terms
> >
> > This is better placed under something entitled "Syntax", in my view,
> > otherwise it just sounds like a glossary.
> >
> > One of the reasons why I wanted to make a clear distinction between
> > syntax and semantics, at least early on, is so I could explain why you
> > can use operators all over the place (including where you don't want
> > to, like character literals), but you can't e.g. add two integers
> > together anywhere you please because you need to import a definition.
> > Also, so I could explain why you sometimes need parentheses in weird
> > places, like declarations.
> >
> > >       4. Goals and Expressions
> > >          4.1 Goals
> > >          4.2 DCG-goals
> > >          4.3 Expressions
> > >          4.4 Variable scoping
> > >          4.5 Implicit quantification
> > >          4.6 Elimination of double negation
> > >       5. State variables
> > >       6. Items
> >
> > >       7. Types
> > >       ...
> > >
> > > >     Syntax
> > > >
> > > >     - The opening blurb covers the material from "Syntax overview" and
> > > >     "Character set". The definitions are put first, and comparisons with
> > > >     Prolog are put after (in general we try to make such comparisons only
> > > >     as secondary information).
> > > >
> > > >     - "Lexical syntax" covers the material from "Whitespace" and "Tokens".
> > > >     We also provide comment syntax. Line number directives are no longer
> > > >     considered tokens, since we are supposed to ignore them when parsing
> > > >     terms anyway.
> > >
> > > I would title this subsection "Lexical structure".  That is a fairly common name
> > > for it in other language specifications (e.g. Java, C#, Haskell, Swift etc).
> >
> > Sure.
> >
>
> I like Julien's suggestion for "Lexical structure".
>
> After that, I like your (Mark's) "Terms" section, describing how
> terms work.
>
> Then, I like how there is a grammar presenting the structure of a
> Mercury module. However, I think the description of items is too
> verbose, and throws a lot of stuff at the reader upfront.
> There are a lot of forward references which are distracting.
>
> I'm thinking, instead, we should present a grammar that goes down to the
> level of terms, without describing yet what each piece of syntax is for.
> People who read that kind of thing can get an overview of how a Mercury
> module is structured, other people can skip over it.

That sounds better than what I have. So I'll just keep the very top
bit of grammar, say that declarations are covered in relevant chapters
(without giving forward references), and quickly show the four clause
types.

One question about terminology, though. Sometimes I want to use "head"
to mean the part of a clause to the left of the ':-' (for example, when
saying that the body implies the head), and sometimes I want to exclude
the '= Res' for function clauses (for example, when saying that the
principal functor determines which function is being defined). For
which of these can I use the term "head", and what should I call the
other one?
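
To make the two senses concrete, take a made-up function clause (the
name 'double' is just for illustration):

    double(X) = Res :-
        Res = X + X.

In the first sense the head is 'double(X) = Res', which is the whole of
what the body implies. In the second sense it is just 'double(X)',
whose principal functor double/1 identifies the function being defined.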

Fwiw, I think it's more useful to use "head" in the first sense.

>
> In the sections that follow, I think it would be nice to present each
> feature, syntax then a discussion of semantics or whatever is relevant,
> e.g.
>
>     Conjunctions
>     ============
>
>     A conjunction goal is written:
>
>         Goal1, Goal2
>
>     The declarative semantics of a conjunction is ...
>
>     Operationally, a conjunction is executed first by executing Goal1,
>     and if that succeeds, executing Goal2.

I considered adding operational semantics for all the goals, but
realized I would have to explain mode re-ordering, and then realized
that was probably why the previous author didn't add them :-D

>
> I realise that's far beyond what you were intending to change.

Not a problem.

Cheers,
Mark

>
> Peter

