Next: , Previous: parsing_utils, Up: Top


48 pprint

     %--------------------------------------------------%
     % vim:ts=4 sw=4 expandtab tw=0 wm=0 ft=mercury
     %--------------------------------------------------%
     % Copyright (C) 2000-2007, 2010-2011 The University of Melbourne
     % This file may only be copied under the terms of the GNU Library General
     % Public License - see the file COPYING.LIB in the Mercury distribution.
     %--------------------------------------------------%
     %
     % File: pprint.m
     % Main author: rafe
     % Stability: medium
     %
     % NOTE: this module has now been superceded by pretty_printer.m which is
     % more economical, produces better output, has better control over
     % the amount of output produced, and supports user-specifiable formatting
     % for arbitrary types.
     %
     % ABOUT
     % -----
     %
     % This started off as pretty much a direct transliteration of Philip Wadler's
     % Haskell pretty printer described in "A Prettier Printer", available at
     % http://cm.bell-labs.com/cm/cs/who/wadler/topics/recent.html
     %
     % Several changes have been made to the algorithm to preserve linear running
     % time under a strict language and to ensure scalability to extremely large
     % terms without thrashing the VM system.
     %
     % Wadler's approach has three main advantages:
     % 1. the layout algebra is small and quite intuitive (more so than Hughes');
     % 2. the pretty printer is optimal in the sense that it will never generate
     %    output that over-runs the specified width unless that is unavoidable; and
     % 3. the pretty printer is bounded in that it never needs to look more than
     %    k characters ahead to make a formatting decision.
     %
     % I have made the following changes:
     %
     % (a) rather than having group/1 as a non-primitive function (for
     % allowing line-breaks to be converted into spaces at the pretty
     % printer's discretion) over docs, I have extended the doc type to
     % include a `GROUP' constructor and made the appropriate algorithmic
     % changes.  Because `UNION' only arises as a consequence of processing
     % a 'GROUP' it turns out to be simpler to do away with `UNION'
     % altogether and convert clauses that process `UNION' terms to
     % processing `GROUP's.
     %
     % (b) Flattened `line' breaks become empty strings rather than spaces.
     %
     % (c) The third change is the introduction of the `LABEL' constructor,
     % which acts much like `NEST', except that indentation is defined
     % using a string rather than a number of spaces.  This is useful for,
     % e.g., multi-line compiler errors and warnings that should be
     % prefixed with the offending source file and line number.
     %
     % (d) The formatting decision procedure has been altered to preserve
     % linear runtime behaviour in a strict language.
     %
     % (e) Naively marking up a term as a doc has the drawback that the
     % resulting doc is significantly larger than the original term.
     % Worse, any sharing structure in the original term leads to
     % duplicated sub-docs, which can cause an exponential blow-up in the
     % size of the doc w.r.t. the source term.  To get around this problem
     % I have introduced the 'DOC' constructor which causes on-demand
     % conversion of arguments.
     %
     % [This is not true laziness in the sense that the 'DOC', once
     % evaluated, will be overwritten with its value.  This approach would
     % lead to garbage retention and not solve the page thrashing behaviour
     % otherwise experienced when converting extremely large terms.
     % Instead, each 'DOC' is reevaluated each time it is examined.  This
     % trades off computation time for space.]
     %
     % I have added several obvious general purpose formatting functions.
     %
     %
     % USAGE
     % -----
     %
     % There are two stages in pretty printing an object of some type T:
     % 1. convert the object to a pprint.doc using the constructor functions
     %    described below or by simply calling pprint.to_doc/[1,2];
     % 2. call pprint.write/[4,5] or pprint.to_string/2 passing the display width
     %    and the doc.
     %
     %
     % EXAMPLES
     % --------
     %
     % The doc/1 type class has types string, char, int, float and doc as instances.
     % Hence these types can all be converted to docs by applying doc/1.
     % This happens automatically to the arguments of ++/2. Users may find it
     % convenient to add other types as instances of the doc/1 type class.
     %
     % Below are some docs followed by the ways they might be displayed by the
     % pretty printer given various line widths.
     %
     % 1. "Hello " ++ line ++ "world"
     %
     %   Hello
     %   world
     %
     % 2. group("Hello " ++ line ++ "world")
     %
     %   Hello world
     %
     %   Hello
     %   world
     %
     % 3. group("Hello " ++ nest(3, line ++ "world"))
     %
     %   Hello world
     %
     %   Hello
     %      world
     %
     % 4. group("Goodbye " ++ nest(3, line ++ "cruel " ++ line ++ "world")
     %
     %   Goodbye cruel world
     %
     %   Goodbye
     %      cruel
     %      world
     %
     % 5. group("Goodbye " ++ nest(3, line ++ group("cruel " ++ line ++ "world")))
     %
     %   Goodbye cruel world
     %
     %   Goodbye
     %      cruel world
     %
     %   Goodbye
     %      cruel
     %      world
     %
     % 6. label("Look! ", line ++
     %                    group("Goodbye " ++
     %                          nest(3, line ++ group("cruel " ++ line ++ "world"))))
     %
     %   Look! Goodbye cruel world
     %
     %   Look! Goodbye
     %   Look!    cruel world
     %
     %   Look! Goodbye
     %   Look!    cruel
     %   Look!    world
     %
     %--------------------------------------------------%
     %--------------------------------------------------%
     
     :- module pprint.
     :- interface.
     
     :- import_module char.
     :- import_module io.
     :- import_module list.
     :- import_module stream.
     :- import_module string.
     :- import_module univ.
     
     %--------------------------------------------------%
     
         % Clients must translate data structures into docs for
         % the pretty printer to display.
         %
     :- type doc.
     
         % This typeclass can be used to simplify the construction of docs.
         %
     :- typeclass doc(T) where [
         % Convert a T to a doc, placing a limit on how much of the term
         % will be fully converted as follows:
         %
         % doc(_, f         ) = f
         % doc(N, f(A, B, C)) = f/3 if N =< 0
         % doc(N, f(A, B, C)) = some representation of the term whereby
         %   A is converted as doc(N - 1, A),
         %   B is converted as doc(N - 2, B), and
         %   C is converted as doc(N - 3, C)
         %   - if there are more than N arguments, the N+1th and subsequent
         %     arguments should be replaced with a single ellipsis.
         %
         func doc(int, T) = doc
     ].
     
     :- instance doc(doc).
     :- instance doc(string).
     :- instance doc(int).
     :- instance doc(float).
     :- instance doc(char).
     
         % Fully convert an instance of doc/1.
         %
     :- func doc(T) = doc <= (doc(T)).
     
         % An alternative to the <>/2 concatenation operator that works
         % on members of the doc/1 typeclass.
         %
     :- func T1 ++ T2 = doc <= (doc(T1), doc(T2)).
     
         % The empty document corresponding to the null string.
         %
     :- func nil                 = doc.
     
         % The document consisting of a single string.
         %
         % NOTE: since string is now an instance of the doc/1
         % type class, it is simpler to just apply the doc/1
         % method.
         %
     :- func text(string)        = doc.
     
         % The composition of two docs with no intervening space.
         %
         % NOTE: with the addition of the doc/1 type class, it is
         % simpler to construct compound docs using ++/2.
         %
     :- func doc `<>` doc        = doc.
     
         % The newline document. In a group doc (see below) the pretty printer
         % may choose to instead `flatten' all line docs into nil docs in order
         % to fit a doc on a single line.
         %
     :- func line                = doc.
     
         % Any `line' docs in the body that are not flattened out by the
         % pretty printer are followed by the given number of spaces
         % (nested `nest's add up).
         %
     :- func nest(int, T)        = doc <= (doc(T)).
     
         % Identical to a nest doc except that indentation is extended with
         % a string label rather than some number of spaces.
         %
     :- func label(string, T)    = doc <= (doc(T)).
     
         % A group doc gives the pretty printer a choice: if the doc can be printed
         % without line wrapping then it does so (all line, label, nest and group
         % directives within the group are ignored); otherwise the pretty printer
         % treats the group body literally, although nested group docs remain as
         % choice points.
         %
     :- func group(T)            = doc <= (doc(T)).
     
         % This function can be used to convert strings, chars, ints and floats
         % to their text doc equivalents.
         %
         % NOTE: since these types are now instances of the doc/1 type class,
         % it is simpler to just apply the doc/1 method to these types.
         %
     :- func poly(string.poly_type) = doc.
     
         % Shorthand for doc ++ line ++ doc.
         %
     :- func doc `</>` doc       = doc.
     
         % Various bracketing functions.
         %
         %   bracketed(L, R, Doc) = L ++ Doc ++ R
         %       parentheses(Doc) = bracketed("(", ")", Doc)
         %          brackets(Doc) = bracketed("[", "]", Doc)
         %            braces(Doc) = bracketed("{", "}", Doc)
         %
     :- func bracketed(T1, T2, T3)  = doc <= (doc(T1), doc(T2), doc(T3)).
     :- func parentheses(T)         = doc <= (doc(T)).
     :- func brackets(T)            = doc <= (doc(T)).
     :- func braces(T)              = doc <= (doc(T)).
     
         % packed(Sep, [X1, X2, .., Xn]) = G1 `<>` G2 `<>` .. `<>` Gn where
         % Gi = group(line `<>` Xi `<>` Sep), except for Gn where
         % Gn = group(line `<>` Xn).
         %
         % For the singleton list case, packed(Sep, [X]) = group(line `<>` X).
         %
         % The resulting doc tries to pack as many items on a line as possible.
         %
     :- func packed(T1, list(T2)) = doc <= (doc(T1), doc(T2)).
     
         % A variant of the above whereby only the first N elements of the list
         % are formatted and the rest are replaced by a single ellipsis.
         %
     :- func packed(int, T1, list(T2)) = doc <= (doc(T1), doc(T2)).
     
         % packed_cs(Xs) = packed(comma_space, Xs).
         %
         % For example, to pretty print a Mercury list of docs one might use
         %
         %   brackets(nest(2, packed_cs(Xs)))
         %
     :- func packed_cs(list(T)) = doc <= (doc(T)).
     
         % A variant of the above whereby only the first N elements of the list
         % are formatted and the rest are replaced by a single ellipsis.
         %
     :- func packed_cs(int, list(T)) = doc <= (doc(T)).
     
         % This is like a depth-limited version of packed_cs/1 that first calls
         % to_doc/2 on each member of the argument list.
         %
     :- func packed_cs_to_depth(int, list(T)) = doc.
     
         % This is like a version of packed_cs_to_depth/1 that first calls
         % univ_value/1 for each member of the argument list.
         %
     :- func packed_cs_univ_args(int, list(univ)) = doc.
     
         % separated(PP, Sep, [X1,...,Xn]) =
         %   PP(X1) `<>` Sep `<>` ... Sep `<>` PP(Xn)
         %
     :- func separated(func(T1) = doc, T2, list(T1)) = doc <= (doc(T2)).
     
         % Handy punctuation docs and versions with following
         % spaces and/or line breaks.
         %
     :- func comma               = doc.
     :- func semic               = doc.      % Semicolon.
     :- func colon               = doc.
     :- func space               = doc.
     :- func comma_space         = doc.
     :- func semic_space         = doc.
     :- func colon_space         = doc.
     :- func comma_line          = doc.
     :- func semic_line          = doc.
     :- func colon_line          = doc.
     :- func space_line          = doc.
     :- func comma_space_line    = doc.
     :- func semic_space_line    = doc.
     :- func colon_space_line    = doc.
     :- func ellipsis            = doc.      % "...".
     
         % Performs word wrapping at the end of line, taking whitespace sequences
         % as delimiters separating words.
         %
     :- func word_wrapped(string) = doc.
     
         % Convert arbitrary terms to docs. This requires std_util.functor/3 to work
         % on all components of the object being converted. The second version
         % places a maximum depth on terms which are otherwise truncated in the
         % manner described in the documentation for the doc/2 method of the doc/1
         % type class.
         %
         % This may throw an exception or cause a runtime abort if the term
         % in question has user-defined equality.
         %
     :- func to_doc(T)           = doc.
     :- func to_doc(int, T)      = doc.
     
         % Convert docs to pretty printed strings. The int argument specifies
         % a line width in characters.
         %
     :- func to_string(int, doc) = string.
     
         % Write docs out in pretty printed format. The int argument specifies
         % a page width in characters.
         %
     :- pred write(int::in, T::in, io::di, io::uo) is det <= doc(T).
     
         % Write docs to the specified string writer stream in pretty printed
         % format. The int argument specifies a page width in characters.
         %
     :- pred write(Stream::in, int::in, T::in, State::di, State::uo) is det
         <= ( doc(T), stream.writer(Stream, string, State) ).
     
     %--------------------------------------------------%
     %--------------------------------------------------%