Confusion over "body" #70

Closed
opened 2026-02-17 11:42:02 -06:00 by GiteaMirror · 18 comments

Originally created by @epage on GitHub (Jul 26, 2019).

The way the spec is written, it sounds like there can only be one paragraph in the body, but I'm assuming that isn't the case because there are a lot of times when a commit needs multiple paragraphs. This confusion is exacerbated by the discussion in #98, which makes it sound like the footer can be assumed based on the number of paragraphs.

Also, it sounds like "BREAKING CHANGE" must precede the first paragraph in the body?


@bcoe commented on GitHub (Jul 29, 2019):

This is a conversation worth having. I believe the way things are written, it's assumed your `body` and first `footer` will be separated by two newlines, so if you did have multiple paragraphs you would write them like:

fix: a title

my first paragraph.
  my second paragraph.

my footer

There is an expectation that `BREAKING CHANGE:` will start the first paragraph when the PR represents a breaking change. My expectation currently is that the breaking change would then pull in the full body; I don't think we'd want the breaking change section to display only the first paragraph?
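
To make this reading concrete, here is a minimal sketch, assuming the message is split on blank lines and the last block after the header is taken as the footer. The function name `split_body_footer` and the returned keys are hypothetical; this is one interpretation of the comment above, not an algorithm defined by the spec.

```python
import re


def split_body_footer(message: str) -> dict:
    """Split a commit message on blank lines; treat the last block as the footer.

    Assumption: header, body, and footer are each separated by a blank line,
    and paragraphs inside the body are NOT separated by blank lines.
    """
    # One or more blank (or whitespace-only) lines separate the blocks.
    blocks = [b for b in re.split(r"\n\s*\n", message.strip()) if b.strip()]
    header, rest = blocks[0], blocks[1:]
    if len(rest) >= 2:
        # Everything but the last block is body; the last block is the footer.
        body, footer = "\n\n".join(rest[:-1]), rest[-1]
    elif len(rest) == 1:
        body, footer = rest[0], None
    else:
        body, footer = None, None
    return {"header": header, "body": body, "footer": footer}
```

With this rule, a message whose body uses blank lines between paragraphs and has no footer gets its last paragraph classified as the footer, which is the ambiguity under discussion.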


@epage commented on GitHub (Jul 29, 2019):

Since your comment in #171 was a little more explicit:

> We could then mention in the spec that multi paragraph commit bodies should be spaced in, not separated by two newlines.


@epage commented on GitHub (Jul 29, 2019):

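The suggestion to indent further paragraphs seems at odds with the wider git commit styles that [tpope](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html) and [chris beams](https://chris.beams.io/posts/git-commit/) talk about. Conventional could go its own direction on this but I suspect that will hurt adoption.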

Personally, I also treat git commit messages as markdown, which would have paragraphs separated by extra newlines. This is reinforced by common platforms, like github, taking commit messages and making them the PR message.


@bcoe commented on GitHub (Jul 29, 2019):

@epage this creates an ambiguity in the grammar between what's a footer and what's a body, is my concern; I don't want to define a specification that's not easily machine readable, I also don't want to introduce that many additional rules, e.g., that footers need to be key values (even that is ambiguous).


@epage commented on GitHub (Jul 29, 2019):

I understand the concern over ambiguity. I'm writing a parser, which led me to these questions. I also strongly value inter-operating with existing tooling, processes, and practices and am hopeful that we can find a way to satisfy all^W most of them. I suspect a wider look at ways of resolving the ambiguity would be helpful, similar to some of the discussion in #98 before people (seemingly) settled on the footer requiring a body.

And yes, the footer being key/value pairs only gives parse hints; it doesn't make anything definitive. I do not even know how strict that pattern is in various footers/trailers. Github's `Closed #<N>` fits a role similar to the footers and isn't a key/value pair.


@epage commented on GitHub (Jul 30, 2019):

I think a fairly unobtrusive, unambiguous rule would be for the footer to be defined as the section after the last markdown horizontal rule. We could even constrain it to a subset of markdown's [horizontal rule definition](https://github.github.com/gfm/#thematic-breaks) to simplify parsing, i.e., 3+ `-` in direct sequence. I'm mixed about whether to require a blank line before and after the horizontal rule.

Adapting your earlier example:

fix: a title

my first paragraph.

my second paragraph.

---

my footer

And one from the web page

fix: correct minor typos in code

see the issue for details on the typos fixed

---

closes issue #12
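
A minimal sketch of that rule, assuming the constrained form of a horizontal rule (a line consisting only of three or more `-`) and not requiring blank lines around it. The names `RULE` and `split_at_last_rule` are hypothetical, chosen for illustration only.

```python
import re

# Assumed subset of markdown's thematic break: a line of 3+ hyphens only.
RULE = re.compile(r"^-{3,}\s*$")


def split_at_last_rule(message: str):
    """Return (everything before the last rule, footer after it).

    The footer is None when the message contains no rule line.
    """
    lines = message.splitlines()
    # Scan upward from the end so that only the LAST rule counts;
    # index 0 (the header line) is never treated as a rule.
    for i in range(len(lines) - 1, 0, -1):
        if RULE.match(lines[i]):
            before = "\n".join(lines[:i]).strip()
            footer = "\n".join(lines[i + 1:]).strip()
            return before, footer
    return message.strip(), None
```

Because only the last rule is significant, a body can still contain its own `---` lines without being cut short.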

@damianopetrungaro commented on GitHub (Jul 30, 2019):

Jumping by to drop my 2 cents:
This is how it should be split IMHO (as it is described right now).
If you need to have a lot of paragraphs probably the PR is too big.

fix: a title

my first paragraph.
my second paragraph.

my footer

It is easy to automate, easy to understand just looking at it and already used by the whole community that has no issue with it.

What is the point of your issue? Where did you see a real limit of the convention?
Just asking so I can understand better :)


@epage commented on GitHub (Jul 30, 2019):

> (as it is described right now).

From my perspective, that is not described in the spec.

> If you need to have a lot of paragraphs probably the PR is too big.

Remember, this is not even the PR but the commit, which is even worse for it to be too big :)

And while I can understand the sentiment that too much text is a smell for a commit that does too much, there are a lot of valid cases for multi-paragraph commit messages. For one commit, [I included benchmark information which had newlines embedded in it](https://github.com/cobalt-org/liquid-rust/commit/98c1f66f03506adcce834476368ccda95e08a0ed) as one example.

Also, I already feel type + subject does a great job of encouraging atomic commits.

> It is easy to automate, easy to understand just looking at it and already used by the whole community that has no issue with it.

I wonder how much of the conventional community understands this limitation. I've been using conventional for years and seen other people using it as well. I've always seen paragraphs separated by newlines. I'll admit, there might be some bias in us sharing knowledge and learning from each other.

> Where did you see a real limit of the convention?

First, the issue was because the convention isn't stated and is left for people to read into it what they want to read into it.

As for limitations:

  • How do you tell the difference between a line-wrapped paragraph and the start of the next paragraph? While I can't think of a reason right now to parse out separate paragraphs, I think visually parsing them by a human is also an important factor.
  • As I stated earlier, a lot of platforms, like github, default to treating commit messages as markdown, which presumes a blank line between paragraphs.
  • As I stated earlier, common git practice (exhibit [1](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html) and [2](https://chris.beams.io/posts/git-commit/)) is to use newlines between paragraphs.

Personally, I feel augmenting rather than replacing conventions is important for Conventional's adoption. I know it will add more roadblocks for me in getting it adopted by my company.

@bcoe commented on GitHub (Jul 31, 2019):

@damianopetrungaro I think this is a conversation worth having; I'm personally not a fan of introducing an additional markdown token, like, ----.

I like the simplicity of having the footer sections and body separated by \n\n, but it does fall short of supporting some PR formats, such as:

https://github.com/nodejs/node/commit/e378e7d7b1db748532bd00269b744ae9f3952e62

When we hit points of friction like this, I like the discussion, even if ultimately we just add clarification to an FAQ.


@JeanMertz commented on GitHub (Aug 9, 2019):

I'm [writing a tool](https://github.com/rustic-games/jilu) that uses conventional commits as well, and [built a small parser in Rust](https://docs.rs/conventional-commit) to get the relevant commit details. I have to agree with the points @epage made.

Here are a couple of points worth considering:

  • The spec should be unambiguous, especially when it comes to such an important aspect as BREAKING CHANGE (so that includes "what is the footer", and "how to detect a breaking change").
  • Requiring paragraphs to not be separated by newlines is a no-go in my opinion, it hurts readability, and prevents people from writing commit messages the way they normally would.
  • Similarly for indenting paragraphs.
  • Saying "if you have more than one paragraph, you're doing it wrong", is not the answer in my opinion, I write many paragraphs for seemingly simple changes, because they have big consequences, and I can't split up those consequences based on the number of lines in a commit, it's just a fact of life that single line changes can have far reaching consequences, that require explanation (for which the commit message is the best place).

Adding to this, it's worth noting that even Git itself does some hacky/fancy (take your pick) string parsing. For example, in annotated signed git tags, they parse the message for -----BEGIN PGP SIGNATURE----- to strip the signature out of the message itself.

All in all, given the importance of parsing for breaking changes, I feel that a clear separation between the three "parts" of a commit would be a good design:

  • The description is the first line
  • The body is everything after the first empty line, until "the last footer marker"
  • The footer is everything below the last footer marker
  • We can pick anything as the footer marker, as long as parsers make sure to check that they found the last marker, and ignore the rest, so that the body can include that same marker (for example, a Markdown line such as ===) without messing up the parser.

@damianopetrungaro commented on GitHub (Aug 10, 2019):

Ok, I think that at this point PRs are welcome and we can simply discuss what's the best separator between the different sections.


@JeanMertz commented on GitHub (Aug 10, 2019):

Reading the spec again, and thinking out loud, I wonder if it even makes sense to have the concept of a "footer", or if we should simply have a way to identify commit metadata specifically.

If we tightened the spec a bit, you could say something like:

  1. The first line is the description (with the type, scope, etc).
  2. The second line MUST be either non-existent (single-line commit) or an empty line.
  3. The rest of the commit is considered to be the body.
  4. Any paragraph in the body that begins with [ALL CAPS] is considered to contain meta-information about the commit.
  5. A paragraph is delineated by an empty line before it.

Then, the metadata specification could mention the following:

  • The following meta information is defined in the spec:
    • [BREAKING CHANGE] for breaking changes introduced in the commit
    • [RESOLVES] for linking to issues that are resolved in this commit.
      • multiple links are allowed, separated by a comma or newlines, other whitespace is ignored
      • you can optionally prepend links with a combination of */-, and closes/fixes to have GH auto-close issues, and support lists, parsers should take this into account.
  • Any other paragraph starting with `[...]` is also considered part of the metadata, and should be parsed as `[<metadata type>] <metadata content>` by parsers, but its purpose is not (yet) encoded in the spec.
  • Starting from the first metadata paragraph onwards, all non-metadata paragraphs are considered to be part of the last metadata type.

That last one is important I think, to support the use-case of writing extensive breaking change documentation in a commit.

So let's take an example:

feat(amazing): this is an amazing commit with a breaking change

This is an explanation of this commit.

It introduces all the new features.

[BREAKING CHANGE] This paragraph goes into great depth on what
broke because of this change.

So much broke, that it requires summing up all breakage.

* by
* using
* bullet
* points

And then a final "sorry for the trouble!".

[RESOLVES]

- fixes https://my-issue-url/1
- fixes https://my-issue-url/2
- fixes https://my-issue-url/3
- fixes https://my-issue-url/4

[CUSTOM] 
My custom metadata!

Which doesn't have to start on the same line as "[CUSTOM]"

So the above would parse as:

| part | content |
| - | - |
| type | feat |
| scope | amazing |
| description | this is an amazing commit with a breaking change |
| body | This is an explanation of this commit.<br><br>It introduces all the new features. |
| meta/breaking | This paragraph goes into great depth on what<br>broke because of this change.<br><br>So much broke, that it requires summing up all breakage.<br><br>* by<br>* using<br>* bullet<br>* points<br><br>And then a final "sorry for the trouble!". |
| meta/resolves[0] | https://my-issue-url/1 |
| meta/resolves[1] | https://my-issue-url/2 |
| meta/resolves[2] | https://my-issue-url/3 |
| meta/resolves[3] | https://my-issue-url/4 |
| meta/custom | My custom metadata!<br><br>Which doesn't have to start on the same line as "[CUSTOM]" |

I feel like this would strike the best balance between:

  1. Readability (use of proper paragraphs, no weird characters other than the [ and ])
  2. A strict definition of metadata that can be parsed without ambiguity.

So basically the concept of a "footer" would no longer exist, unless you want to mention that the "footer" starts at the first occurrence of a metadata tag, but I don't think there's any value in including that in the spec, to be honest.

Let me know what you think, I would be willing to write the changes in the spec for this.
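
The rules proposed in this comment can be sketched roughly as follows. This is a hedged illustration, not a reference implementation: the names `parse`, `parse_links`, `HEADER`, and `TAG` are hypothetical, metadata keys are simply the lowercased tag text (so `[BREAKING CHANGE]` becomes `"breaking change"` rather than `meta/breaking`), and the header pattern `type(scope): description` is assumed to be the only accepted form.

```python
import re

# Assumed header shape: "type(scope): description", scope optional.
HEADER = re.compile(r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?: (?P<description>.+)$")
# A metadata paragraph begins with "[ALL CAPS]"; the content may start on
# the same line or on the next one.
TAG = re.compile(r"^\[(?P<tag>[A-Z][A-Z ]*)\]\s*(?P<rest>.*)$", re.DOTALL)


def parse_links(text: str) -> list:
    """Split [RESOLVES] content on commas/newlines and drop list/keyword prefixes."""
    links = []
    for item in re.split(r"[,\n]", text):
        item = re.sub(r"^\s*[*-]?\s*(?:closes|fixes)?\s*", "", item, flags=re.IGNORECASE)
        if item.strip():
            links.append(item.strip())
    return links


def parse(message: str) -> dict:
    lines = message.strip("\n").split("\n")
    header = HEADER.match(lines[0])
    if header is None:
        raise ValueError("first line must look like 'type(scope): description'")
    if len(lines) > 1 and lines[1].strip():
        raise ValueError("second line must be empty or absent")

    # Paragraphs are delineated by empty lines.
    rest = "\n".join(lines[2:])
    paragraphs = [p for p in re.split(r"\n[ \t]*\n", rest) if p.strip()]

    body, raw_meta, current = [], {}, None
    for paragraph in paragraphs:
        tag = TAG.match(paragraph)
        if tag:
            current = tag["tag"].lower()
            raw_meta.setdefault(current, [])
            if tag["rest"].strip():
                raw_meta[current].append(tag["rest"].strip())
        elif current is None:
            body.append(paragraph)  # before the first tag: plain body text
        else:
            raw_meta[current].append(paragraph)  # belongs to the last tag seen

    meta = {}
    for key, parts in raw_meta.items():
        text = "\n\n".join(parts)
        meta[key] = parse_links(text) if key == "resolves" else text

    return {
        "type": header["type"],
        "scope": header["scope"],
        "description": header["description"],
        "body": "\n\n".join(body) if body else None,
        "meta": meta,
    }
```

The "belongs to the last tag seen" branch is what lets a breaking-change explanation span several paragraphs, as in the example commit above.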


@JeanMertz commented on GitHub (Aug 10, 2019):

Building a bit on the above, I think just supporting [ALL CAPS] as metadata is not enough. I think single line metadata with colons to separate the type from the value is also needed, so an extension to the above could be to also support:

Co-Authored-By: <metadata value>
Fixes: <metadata value>

This is to support already existing non-standard usage of these tags to automate certain processes in big repositories and hosters such as Github (something @epage also brought up in https://github.com/conventional-commits/conventionalcommits.org/issues/171).

I haven't thought of what it would mean to merge this with what I wrote above, but at least both of these are unambiguous, and can co-exist together.
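
As a rough illustration of the single-line form, here is a sketch that collects `Key: value` lines from the end of a message. It is a simplification under stated assumptions: keys are letters, digits, and hyphens, trailers form one contiguous run at the very end, and the names `TRAILER` and `parse_trailers` are hypothetical. It does not reproduce the actual rules of `git interpret-trailers`.

```python
import re

# Assumed trailer shape: "Key: value" with a hyphenated alphanumeric key.
TRAILER = re.compile(r"^(?P<key>[A-Za-z][A-Za-z0-9-]*):\s+(?P<value>.+)$")


def parse_trailers(message: str) -> list:
    """Return (key, value) pairs from the trailing run of 'Key: value' lines."""
    # Skip the header line: "fix: a title" would otherwise look like a trailer.
    lines = message.rstrip().splitlines()[1:]
    trailers = []
    # Walk upward from the last line and stop at the first non-trailer line.
    for line in reversed(lines):
        match = TRAILER.match(line)
        if match is None:
            break
        trailers.append((match["key"], match["value"]))
    trailers.reverse()  # restore the order in which they were written
    return trailers
```

Since the scan stops at the first line that is not a trailer, a `Key: value` line in the middle of the body is left alone.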


@epage commented on GitHub (Aug 10, 2019):

> `Fixes: <metadata value>`

> This is to support already existing non-standard usage of these tags to automate certain processes in big repositories and hosters such as Github (something @epage also brought up in #171).

So hosting services, like github, support extracting Closes #<num>. Would adding the : break support for that feature? If so, I suspect that will surprise too many people that our recommendation does not align with github.


@JeanMertz commented on GitHub (Aug 14, 2019):

> So hosting services, like github, support extracting `Closes #<num>`. Would adding the `:` break support for that feature?

It [does not appear to break support](https://help.github.com/en/articles/closing-issues-using-keywords):

> For example, to close an issue numbered 123, you could use the phrase "Closes #123" or "Closes: #123" in your pull request description or commit message.

I'm pretty sure they added support for the colon because it's a [known standard in Git](https://git-scm.com/docs/git-interpret-trailers).


@blowmage commented on GitHub (Aug 15, 2019):

FWIW, I would prefer it if the footer was wrapped in []:

fix: a title

my first paragraph.

my second paragraph.

[my footer]

@epage commented on GitHub (Aug 16, 2019):

The downside to that is it isn't compatible with git conventions for trailers which I assume we'd want to be compatible with.


@bcoe commented on GitHub (Aug 20, 2019):

I think @JeanMertz has put together a very compelling proposal here:

https://github.com/conventional-commits/conventionalcommits.org/issues/179


Reference: github-starred/conventionalcommits.org#70