[m-rev.] for post-commit review: string builder instance for parse_tree_out_info.m

Peter Wang novalazy at gmail.com
Sat Jul 8 11:33:46 AEST 2023


On Sat, 08 Jul 2023 09:07:04 +1000 "Zoltan Somogyi" <zoltan.somogyi at runbox.com> wrote:
> Allow use of string builders in parse_tree_out_*.m ...
> 
> compiler/parse_tree_out_info.m:
>     ... by making string builders instances of the "output" typeclass,
>     whose operations are the base level of the code of parse_tree_out*.m.
> 
>     Strings were already instances of "output", but the use of this instance
>     was problematic for large structures such as entire files. This is because
>     if the final string was constructed by adding together N atomic strings,
>     then the construction of the final string involves copying each of those
>     atomic strings about N/2 times on average.
> 
>     With the string builder instance, each string is copied just twice:
>     when it is added to the string builder state, and when that state is
>     converted to the final string.
> 
> library/string.builder.m:
>     Add three new predicates, append_char, append_string and append_strings.
>     The first two are copied from existing instance methods; making them
>     separate predicates allows them to be invoked using a first order call,
>     instead of a method call. The third is just for convenience. All three
>     are used by the new code in parse_tree_out_info.m.
> 
> NEWS.md:
>     Announce the new predicates in string.builder.m.

> diff --git a/NEWS.md b/NEWS.md
> index 58f7e0dbc..3b5a8816c 100644
> --- a/NEWS.md
> +++ b/NEWS.md
> @@ -765,6 +765,14 @@ Changes to the Mercury standard library
>    to never break a row into more than one line. We have also documented
>    this behavior.
>  
> +### Changes to the `string` module

I changed this to say `string.builder`.

> diff --git a/library/string.builder.m b/library/string.builder.m
> index 19dae55f9..66bfc0fae 100644
> --- a/library/string.builder.m
> +++ b/library/string.builder.m
...
> @@ -58,13 +67,41 @@
>  
>  :- import_module list.
>  
> +%---------------------------------------------------------------------------%
> +
>  :- type state
>      --->    state(list(string)).
>              % A reversed list of the strings that, when unreversed
>              % and appended together, yields the string we have built.
>  
> +%---------------------------------------------------------------------------%
> +
>  init = state([]).
>  
> +:- pragma inline(pred(append_char/3)).      % inline in instance method
> +append_char(Char, !State) :-
> +    !.State = state(RevStrings0),
> +    RevStrings = [string.char_to_string(Char) | RevStrings0],
> +    !:State = state(RevStrings).
> +
> +:- pragma inline(pred(append_string/3)).    % inline in instance method
> +append_string(Str, !State) :-
> +    !.State = state(RevStrs0),
> +    copy(Str, UniqueStr),
> +    RevStrs = [UniqueStr | RevStrs0],
> +    !:State = state(RevStrs).

The strings themselves don't actually need to be unique.
I think it would be worth avoiding the copy using unsafe_promise_unique.

Peter


More information about the reviews mailing list