Redline changes

Ensure clear markup between versions of long, complex documents to help reduce review fatigue and improve quality.

Challenge

Giving reviewers and readers a clear way to identify changes between versions of a specification document proved to be a complex task.

Result

A scoped "release-notes" appendix that showed:

  • A list of all relevant changes
  • The change markup in context (git show output for a specific commit)
  • Why the change was made (reference to our engineering change management system and GitLab merge request)


Background

The task itself, from a legacy perspective, was quite straightforward: a single author tracked their changes in an MS Word document. Readers could then see the markup changes in red (or red strikethrough).

The difficulty was integrating this into a workflow in which dozens of authors worked on the same document simultaneously, drip-feeding edits over a few months. Our GitLab workflow helped scope our changes into relevant chunks: an author would make their change, apply a descriptive commit message according to our convention, and submit a merge request for review. This works internally, but our solution also needed to be available to external readers.

We identified three key issues:

  1. How can we distinguish between general editing / improvements and relevant technical updates:

    Git itself isn't smart enough to understand the context of the changes, so this must be done manually. For this, we have two types of commit messages: one that identifies general doc changes such as formatting or grammar fixes, and one that identifies pure technical changes. We did this by requiring users to provide a reference to our engineering change management system (-refs) for technical changes:

    (DOC): rewrote section to improve readability
    (DOC-refs #12345): added USB compliance testing
    

    This ensures the redlines omit changes such as spelling or layout improvements, which allows our technical writers to update documents in parallel with engineers.
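
As an illustration of how this convention could be checked in code, here is a minimal sketch that classifies a commit subject as editorial or technical. The regular expression and the classify_subject name are assumptions made for this example, inferred from the two sample messages above. They are not taken from our actual tooling.

```python
import re

# Assumed pattern, inferred from the two example subjects:
#   "(DOC): ..."             -> editorial change
#   "(DOC-refs #<id>): ..."  -> technical change with a change-management reference
SUBJECT_RE = re.compile(r"^\(DOC(?:-refs #(?P<ref>\d+))?\):\s*(?P<summary>.+)$")


def classify_subject(subject):
    """Return (kind, ref, summary); kind is 'technical', 'editorial' or 'unknown'."""
    text = subject.strip()
    match = SUBJECT_RE.match(text)
    if match is None:
        # Subject does not follow the convention at all.
        return ("unknown", None, text)
    ref = match.group("ref")
    kind = "technical" if ref else "editorial"
    return (kind, ref, match.group("summary"))
```

A check like this could run in a commit hook or merge request pipeline to reject subjects that return "unknown", although that use is a suggestion rather than a description of our setup.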

  2. How to apply to both HTML and PDF:

    Our document generator also isn't smart enough to understand the changes, as it can't attach any Git context to text updates - it only processes our raw text files.

    For this, we produce a separate Release Notes annex that we generate via Python. These release notes use the git show command and some filtering to produce a user-friendly output. In HTML, readers see this directly in the "Release notes" section. In PDF, readers see this as an annex (the HTML is converted to PDF and added post-build during our release pipeline).

  3. How to do this automatically for each update:

    We use a diff2html.py function during our pipeline to generate a "release-notes.html" file. The main steps are:

    1. Generate a list of all commits between the latest release commit and the relevant Git HEAD commit. Filter these to technical update commits only:

      import subprocess

      # List "<short hash> <subject>" for every commit in the range whose
      # message contains "-refs" (technical changes only).
      commit_list = subprocess.run(
          [
              "git", "log", "--grep=-refs", "--format=%h %s",
              f"{latest_approved_hash}..{next_release_hash}",
              "--", ".",
          ],
          capture_output=True, text=True,
      ).stdout.strip()
      
    2. Show the change markup for each commit and each file changed:

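      # Show the patch for one file in one commit. The empty --format=
      # suppresses the commit header so that only the diff is printed.
      diff_result = subprocess.run(
          [
              "git", "show", "-p", "--ignore-space-change",
              "--format=", commit_hash, "--", file,
          ],
          capture_output=True, text=True,
      )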
      

    The rest was a matter of writing this output to an HTML document and styling it.
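
To give a rough idea of what that last stage can look like, the sketch below parses the "%h %s" lines produced in step 1 and wraps each line of a unified diff from step 2 in a span that a stylesheet can color. The function names (parse_commit_list, diff_to_html) and the CSS class names are hypothetical and chosen for this example. Our actual diff2html.py is not reproduced here.

```python
import html


def parse_commit_list(log_output):
    """Split 'git log --format=%h %s' output into (hash, subject) pairs."""
    commits = []
    for line in log_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The short hash ends at the first space; the rest is the subject.
        commit_hash, _, subject = line.partition(" ")
        commits.append((commit_hash, subject))
    return commits


def diff_to_html(diff_text):
    """Wrap each line of a unified diff in a span carrying a CSS class."""
    rows = []
    in_header = True  # True until the first hunk of each file begins
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            in_header = True
        if line.startswith("@@"):
            in_header = False
            css = "hunk"
        elif in_header:
            css = "meta"      # file names, index lines and similar
        elif line.startswith("+"):
            css = "added"     # could be styled red
        elif line.startswith("-"):
            css = "removed"   # could be styled red strikethrough
        else:
            css = "context"
        # Escape the text so markup inside the document is shown literally.
        rows.append(f'<span class="{css}">{html.escape(line)}</span>')
    return '<pre class="diff">\n' + "\n".join(rows) + "\n</pre>"
```

Tracking the file header separately keeps a removed line whose content starts with "--" from being mistaken for a "---" file marker. Styling the "added" and "removed" classes in red then recreates the redline look that readers know from tracked changes.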