How to upgrade to a new version of the Boehm Garbage Collector in Mercury.

WARNING: This process is difficult and should not be undertaken lightly. Before attempting to upgrade Boehm GC, you should definitely discuss it on the reviews mailing list first.

The setup

This is the first attempt to update Boehm since Mercury switched from CVS to Git. Therefore, I have taken the opportunity to set up this process in a more git-ish way (Boehm GC also uses git). I set this up for version 7.4.2 of the collector and libatomic_ops.

Over time we have made some changes to the collector, some of which have not been pushed upstream. The changes that have not been pushed upstream must be managed by us. I've forked the bdwgc and libatomic_ops repositories. Our forks are currently located here:

BDW GC
    Web:    https://github.com/Mercury-Language/bdwgc
    Git:    https://github.com/Mercury-Language/bdwgc.git
    Branch: release-7_4-mercury

libatomic_ops
    Web:    https://github.com/Mercury-Language/libatomic_ops
    Git:    https://github.com/Mercury-Language/libatomic_ops.git
    Branch: release-7_4-mercury

On a clean checkout of the Mercury repository, I created a branch off of the master branch.

$ git branch upgrade_boehm master
$ git checkout upgrade_boehm

Then, on this branch I deleted the existing boehm_gc directory from the repository.

$ rm -rf boehm_gc
$ git commit -a

Next we add the bdwgc and libatomic_ops repositories as git submodules. This basically creates a reference from the Mercury repository to these other repositories without importing their history into the Mercury repository.

The references to submodules are relative. So if the remote named origin has the URL https://www.github.com/Mercury-Language/mercury.git then we can have git look for the bdwgc repository at the relative path ../bdwgc.git, which resolves to https://www.github.com/Mercury-Language/bdwgc.git.

$ git submodule add -b release-7_4-mercury ../bdwgc.git boehm_gc
$ git submodule add -b release-7_4-mercury ../libatomic_ops.git libatomic_ops
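
To make the relative resolution concrete, here is a minimal sketch that mimics how a leading ../ in a submodule URL is applied to the URL of origin. The function name resolve_submodule_url is hypothetical and only handles the simple cases shown here; git's own resolution is more thorough.

```shell
# Hypothetical helper: resolve a relative submodule URL against a base URL.
# Runs in a subshell so its variables do not leak into the caller.
resolve_submodule_url() (
    base=${1%/}     # URL of the superproject's remote, trailing slash removed
    rel=$2          # relative URL as given to "git submodule add"
    while :; do
        case $rel in
            ../*) base=${base%/*}; rel=${rel#../} ;;  # go up one component
            ./*)  rel=${rel#./} ;;                    # stay at this level
            *)    break ;;
        esac
    done
    printf '%s/%s\n' "$base" "$rel"
)

resolve_submodule_url "https://www.github.com/Mercury-Language/mercury.git" "../bdwgc.git"
# prints https://www.github.com/Mercury-Language/bdwgc.git
```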

I've written a script named prepare.sh and committed it; it can be used to initialize and check out the submodules.

Mercury's customisations to the Boehm GC

I've created a branch named mercury7_2 based on the gc7_2 tag in the bdwgc repository. This branch contains the Mercury customisations to boehm_gc as a series of patches. Then, to upgrade to 7.4.2, I created a new branch release-7_4-mercury (from mercury7_2), switched to it, and rebased it onto the point in the boehm_gc tree that represents the BDWGC 7.4.2 release, that is, the tag gc7_4_2:

$ git branch release-7_4-mercury mercury7_2
$ git checkout release-7_4-mercury
$ git rebase --onto gc7_4_2 gc7_2

I needed to resolve several merge conflicts to complete the rebase.
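
The effect of git rebase --onto can be tried safely in a throwaway repository. The sketch below is only an illustration: the tags old_release and new_release and the branches custom and custom_new are made-up stand-ins for gc7_2, gc7_4_2, mercury7_2 and release-7_4-mercury, and the patch touches a separate file so no conflicts arise.

```shell
# Build a scratch repository in a temporary directory.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

# The old upstream release.
echo "upstream 1" > gc.txt
git add gc.txt
git commit -q -m "upstream release A"
git tag old_release

# Local customisations kept as patches on top of the old release.
git checkout -q -b custom
echo "local patch" > local.txt
git add local.txt
git commit -q -m "local customisation"

# The new upstream release.
git checkout -q -b upstream old_release
echo "upstream 2" > gc.txt
git commit -q -a -m "upstream release B"
git tag new_release

# Replay the commits in old_release..custom_new on top of new_release.
git checkout -q -b custom_new custom
git rebase -q --onto new_release old_release

patches=$(git rev-list --count new_release..custom_new)
echo "patches on top of new_release: $patches"
```

After the rebase the custom_new branch contains the new upstream content plus the local patch, while custom still points at the old series.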

Pulling changes from upstream

The final step is to update Mercury's copy of the collector when there are changes upstream. At the time of writing there are some important patches on the collector's release-7_4 branch (the TSX bug). My clone has the following remotes:

$ git remote -v
github-ivan https://github.com/ivmai/bdwgc.git (fetch)
github-ivan https://github.com/ivmai/bdwgc.git (push)
github-mercury https://github.com/Mercury-Language/bdwgc.git (fetch)
github-mercury git@github.com:Mercury-Language/bdwgc.git (push)
origin  ../bdwgc.git (fetch)
origin  ../bdwgc.git (push)

Starting with libatomic_ops, I update the release-7_4 branch to the latest changes and then rebase Mercury's customisations (in the release-7_4-mercury branch) on top of those.

$ git checkout release-7_4
$ git pull github-ivan release-7_4
$ git push github-mercury release-7_4          # Optional
$ git checkout release-7_4-mercury
$ git pull github-mercury release-7_4-mercury
$ git rebase release-7_4

There are only two Mercury-specific changes, so this went smoothly. However, the second change commits some autogenerated files, which makes it easier to build libatomic_ops inside the Mercury tree. We need to regenerate those files.

$ ./autogen.sh
$ git commit --amend -a

Double check that you got everything:

$ git status --ignored
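
The reason for this check is that git commit -a only picks up files that are already tracked. The sketch below, run in a scratch repository with made-up file names, shows how newly generated files appear: untracked files are marked ?? and ignored files !! in the --porcelain form of the same command.

```shell
# Scratch repository with one ignore pattern.
demo=$(mktemp -d)
cd "$demo"
git init -q .
echo "*.cache" > .gitignore
git add .gitignore
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "initial"

touch configure       # stands in for a newly generated, untracked file
touch config.cache    # stands in for a generated file matching an ignore pattern

# Machine-readable listing of everything "commit -a" would have missed.
leftovers=$(git status --porcelain --ignored)
printf '%s\n' "$leftovers"
```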

Now, publish this change with --force because we are changing history.

$ git tag release-7_4-mercury-20160915
$ git push --force github-mercury release-7_4-mercury
$ git push github-mercury --tags

Now do the same for boehm_gc. I've documented two examples, depending on what you're trying to achieve.

  1. Upgrade to a newer patch within the 7.4 release.

    $ git checkout release-7_4
    $ git pull github-ivan release-7_4
    $ git push github-mercury release-7_4          # Optional
    $ git checkout release-7_4-mercury
    $ git pull github-mercury release-7_4-mercury
    $ git rebase release-7_4
    

    This rebase had two merge conflicts; however, they were simple.

  2. Upgrade from the 7.4 release to the 7.6 release

    $ git checkout -b release-7_6 github-ivan/release-7_6
    $ git push github-mercury release-7_6          # Optional
    $ git checkout -b release-7_6-mercury release-7_4-mercury
    $ git rebase release-7_6
    

    There were a number of conflicts. Some patches have since been merged upstream, so I was able to drop them entirely.

Now I need to update the version of libatomic_ops that we include as a submodule in the bdwgc repository. If the branch name in libatomic_ops was changing I would need to check the .gitmodules file, but in this example it isn't.

$ git submodule update --remote --checkout
$ git add libatomic_ops
$ git commit

Depending on which step we chose above, tag and push these changes.

  1. Newer patch within the 7.4 release.

    $ git tag release-7_4-mercury-YYYYMMDD
    $ git push --force github-mercury release-7_4-mercury
    $ git push github-mercury --tags

  2. Upgrade from the 7.4 release to the 7.6 release.

    $ git tag release-7_6-mercury-20160916
    $ git push github-mercury release-7_6-mercury
    $ git push github-mercury --tags
    

Back in the Mercury repository, the boehm_gc submodule may need to point to a different branch. Depending on which option we chose above, we may need to update the branch name that the Mercury repository refers to.

  1. The branch name has not changed.

  2. The branch name was release-7_4-mercury and is now release-7_6-mercury. The branch is adjusted by editing .gitmodules.

    $ vim .gitmodules
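
As an alternative to editing the file by hand, the same setting can be changed with git config -f. The sketch below works on a scratch copy of .gitmodules and assumes the submodule is registered under the name boehm_gc (the default when it is added at that path); check the section header in the real file before relying on this.

```shell
# Scratch copy of a .gitmodules file resembling the one described above.
demo=$(mktemp -d)
cat > "$demo/.gitmodules" <<'EOF'
[submodule "boehm_gc"]
    path = boehm_gc
    url = ../bdwgc.git
    branch = release-7_4-mercury
EOF

# Point the submodule at the new branch, then read the value back.
git config -f "$demo/.gitmodules" submodule.boehm_gc.branch release-7_6-mercury
new_branch=$(git config -f "$demo/.gitmodules" --get submodule.boehm_gc.branch)
echo "boehm_gc now tracks: $new_branch"
```

In the Mercury repository itself the equivalent would be the single git config -f .gitmodules line, run from the top-level directory.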
    

In any case you will need to tell git that the branches have been updated, and these submodules should now refer to different git IDs.

$ git submodule update --remote --checkout
$ git add .gitmodules boehm_gc
$ git commit

Once done, bootcheck the compiler in at least the asm_fast.gc and hlc.gc grades. Then use the new system to do an install (with non-empty LIBGRADES) and test that the installed version can compile some test programs. This is because the update may have added some new files which may not be copied into the install directories. Some build scripts may also need to be updated (in particular tools/bootcheck and scripts/prepare_install_dir.in).

Finally update .README.in (in the root directory) and bindist/bindist.README to reflect the current version of the collector being used. Then commit these changes and have the changes reviewed before pushing them into the public repository.

Modifying Mercury's customisations to the Boehm GC

It is sometimes necessary to modify the files in the boehm_gc directory that we have customised while not updating the collector. For example, the set of options accepted by the mgnuc script may change and boehm_gc/Mmakefile or boehm_gc/Makefile.direct may need to be updated accordingly.

The procedure for doing this is:
  1. Clone the Mercury bdwgc repository.
  2. Check out the release-M_N-mercury branch of the collector, where M and N are respectively the major and minor release numbers of the version of the collector that Mercury is currently using.
  3. Make your changes and commit them.
  4. Add an annotated tag (git tag -a) named release-M_N-mercury-YYYYMMDD where YYYYMMDD is the date.
  5. Push the changes and the tag.
  6. Check out the above tag in the mercury/boehm_gc submodule.
  7. The output of the git status command will say that boehm_gc has been modified. You can git add and commit that change.
  8. Remind other developers to run git submodule update --recursive in their workspaces.
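
The tag name in step 4 follows a fixed pattern, so it can be generated rather than typed. The helper below is a sketch only: mercury_gc_tag is a hypothetical name, and the commented git commands assume the remote of your bdwgc clone is called origin.

```shell
# Hypothetical helper: build a tag name of the form release-M_N-mercury-YYYYMMDD.
mercury_gc_tag() {
    # $1 = major version, $2 = minor version, $3 = optional date (default: today)
    printf 'release-%s_%s-mercury-%s\n' "$1" "$2" "${3:-$(date +%Y%m%d)}"
}

tag=$(mercury_gc_tag 7 6 20160916)
echo "$tag"    # prints release-7_6-mercury-20160916

# In the bdwgc clone, steps 4 and 5 would then look something like:
#   git tag -a "$tag" -m "Mercury customisations"
#   git push origin release-7_6-mercury "$tag"
```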