mirror of
https://github.com/semver/semver.git
synced 2026-03-22 22:20:28 -05:00
Dealing with forks #13
Originally created by @chrisdpratt on GitHub (Mar 1, 2012).
If I fork a package and make my own changes to it, I don't want to mess with the version number of the original package, but I should still somehow indicate that this is different.
As an example, let's say a package is at 0.2.0. I fork the package and make a minor change (something that would normally result in the version number changing to 0.2.1). Would it be appropriate to make my version number something like, 0.2.0-0.0.1, or is there a better approach?
@Cais commented on GitHub (Mar 29, 2012):
I am seriously considering this specification for my own works, and this is an interesting question. My impression would be along these lines:
Of course, all of this is dependent on being within the licensing of the package.
@haacked commented on GitHub (Oct 2, 2012):
If you fork a package, you probably don't care about the version number unless you plan to publish the package, right? After all, if you submit it back to the original author, they'll incorporate it in their next release and version that next release appropriately, ignoring whatever version number you have.
The only time the version of your fork matters is if you plan to publish it. In that case, wouldn't you have to change the identity of the package anyways? In that case, the version probably doesn't matter so much. But I do like the idea that you'd want to give it a version related to the original so people looking at your package know that you're sort of in lockstep with the original.
So if original is CoolPackage 0.2.0 and your package is CoolPackageFork 0.2.0 people get a sense of how your fork relates to the original.
I'm going to close this as there's nothing actionable to do unless you have a specific proposal for SemVer. Great question!
@datagrok commented on GitHub (Dec 27, 2013):
I've run into a situation where I need to fork an existing package that I don't plan to publish, and need to keep the identity the same:
If I version my locally forked BarPackage as 1.0.0-0.0.1, or 1.0.0-localfork or 1.0.0-anything, semver sorts it as a pre-release of 1.0.0; my build system won't consider it when installing dependencies for FooPackage, which requires >=1.0.0.
If I version my locally forked BarPackage as 1.0.0 or 1.0.1, it conflicts with the upstream BarPackage 1.0.1. Or it risks conflict the moment they release that version. This doesn't negatively affect anybody else since I'm not publishing the version, but it causes all kinds of confusion for me and my coworkers, especially after 6 months when somebody helpfully refreshes the local version cache by rsync'ing from the public archive.
If I change the identity of BarPackage 1.0.0 to BarPackageFork 1.0.0, my build system no longer considers it as a candidate that satisfies FooPackage's dependency on BarPackage. I would have to change FooPackage's dependencies, forking it, and all of the packages that depend on FooPackage in turn...
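The first of these three problems follows from the precedence rules in SemVer 2.0.0, where a pre-release sorts below its release and build metadata is ignored. A minimal Python sketch, assuming a plain >= check on precedence; the helper names precedence_key and satisfies_at_least are hypothetical, and real package managers may treat pre-releases in ranges differently:

```python
def ident_key(ident):
    # Numeric identifiers compare numerically and rank below alphanumeric ones.
    return (0, int(ident), "") if ident.isdigit() else (1, 0, ident)


def precedence_key(version):
    """Sort key for 'X.Y.Z[-pre][+build]' following SemVer 2.0.0 precedence."""
    version = version.split("+", 1)[0]  # build metadata does not affect precedence
    core, _, pre = version.partition("-")
    numbers = tuple(int(part) for part in core.split("."))
    if not pre:
        return (numbers, 1, ())  # a plain release outranks its pre-releases
    return (numbers, 0, tuple(ident_key(i) for i in pre.split(".")))


def satisfies_at_least(candidate, minimum):
    """Hypothetical '>=' check as a dependency resolver might apply it."""
    return precedence_key(candidate) >= precedence_key(minimum)


for fork_version in ("1.0.0-0.0.1", "1.0.0-localfork", "1.0.0+fork.0.0.1"):
    print(fork_version, satisfies_at_least(fork_version, "1.0.0"))
```

Under these assumptions the two pre-release spellings fail a >=1.0.0 requirement, while the build-metadata spelling passes only because it is indistinguishable from 1.0.0 itself.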
The simplest workable approach I can come up with is to overload the +-separated "build metadata" with a secondary local versioning scheme and configure my build system to recognize it, like BarPackage 1.0.0+fork.0.0.1. But if this were dealt with in the spec, then everyone who uses the same build system could take advantage of it.
A different approach might be to select another separator character that works like - but indicates a "post-release" modification of the same version number. If we chose, for example, ~, then 1.0.0-alpha < 1.0.0 < 1.0.0~fork.0.0.1 < 1.0.1-alpha < 1.0.1.
I anticipate this issue has already been discussed to death in other issues or a mailing list somewhere. If so, please accept my apologies for posting to this closed issue, and perhaps link to those discussions from here?
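The ordering proposed in this comment can be sketched as a sort key. This is only an illustration: the ~ separator is not part of SemVer 2.0.0, the function name sort_key is hypothetical, and the sketch assumes a version carries either a pre-release or a post-release part, not both.

```python
def ident_key(ident):
    # Numeric identifiers compare numerically and rank below alphanumeric ones.
    return (0, int(ident), "") if ident.isdigit() else (1, 0, ident)


def sort_key(version):
    """Key for 'X.Y.Z', 'X.Y.Z-pre' or the hypothetical 'X.Y.Z~post'."""
    core, _, post = version.partition("~")
    core, _, pre = core.partition("-")
    numbers = tuple(int(part) for part in core.split("."))
    if post:
        return (numbers, 2, tuple(ident_key(i) for i in post.split(".")))
    if pre:
        return (numbers, 0, tuple(ident_key(i) for i in pre.split(".")))
    return (numbers, 1, ())


versions = ["1.0.1", "1.0.0~fork.0.0.1", "1.0.0", "1.0.1-alpha", "1.0.0-alpha"]
print(sorted(versions, key=sort_key))
```

The middle element of the key (0 for pre-release, 1 for a plain release, 2 for post-release) is what places the fork above 1.0.0 but below anything belonging to 1.0.1.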
@EddieGarmon commented on GitHub (Dec 29, 2013):
@datagrok this is exactly the use case for build metadata. It is unspecified for this reason. If you wanted to formalize an addendum to semver for your org, that is great, but semver will most likely leave this open as an extension point.
@datagrok commented on GitHub (Dec 30, 2013):
@EddieGarmon okay. In that case the only thing I'd quibble about is the following language (from 2.0.0§10), which does not read as though it is "leaving this open as an extension point:"
So to accomplish my goals I'll need to in my addendum or extension explicitly violate that part of the spec. Should the spec perhaps instead read,
@neverfox commented on GitHub (Apr 18, 2014):
One solution, if you want to give "people get a sense of how your fork relates to the original" is to bump the version past what actually exists in the original, but then add a pre-release version, e.g. if the original is 2.0.1, release 2.0.2-1 or 0.0.1 or whatever. Then if the original moves to 2.0.2 and you incorporate those changes, you'll be in sync according to precedence. This seemed to be a good solution for a project where I ported a library's css to stylus and that was the only real change. For all intents and purposes it was still the same version as the original from the end-user perspective.
The disadvantage is that it's technically not indicating the original version. I might consider using the metadata approach instead, however it does seem you'd lose automatic precedence that way, and it's not recognized by npm. However, I really like the suggestion of the post-release indicator.
@datagrok commented on GitHub (Jul 22, 2014):
@neverfox a problem with that suggestion arises when you have package A depending on package B at version <=2.0.1. I need to modify package B, but if I bump the version past 2.0.1, automatic dependency-resolving installers will ignore it. I'd have to also modify package A, and potentially every package in the tree above it.
@peterjenkins commented on GitHub (Jul 22, 2014):
@datagrok posted on December 27 describing pretty well the problem I'm facing.
My company consumes open source projects and sometimes has to make modifications that are kept private and not pushed upstream. (For the reasons he lays out.)
I'm not sure how to deal with this using semver. I don't want to lose the original version number, e.g. by changing the name of the package to MyCompanyName-PackageName and resetting the version to 0.0.1. Would be nice to have some way to track these types of forks in a coherent manner.
Ideally any changes I make post-fork would version-sort higher than the original release. But I have no problem with them sorting below subsequent releases of the original package.
Because if I'm not ready to pull in changes from the original package and re-apply my patches on top of it, I can just explicitly reference the version I forked and I'll still get the code from my fork.
@peterjenkins commented on GitHub (Jul 22, 2014):
I'd like to provide some more detail here, although I don't yet have a solution.
This example is using CocoaPods, but it could just as easily be npm, or another semver consumer.
My organization recently created a commit with some changes to SDWebImage.
SDWebImage agreed that the changes were a good idea but deferred them until a major semantic version release. (Appropriately, since they change behavior.)
We need these changes now so we forked their repository and applied our changes on top of their current version.
Even if we change the name of the package and repository against which we are linking, what is the appropriate semantic version to use to represent our forked version of SDWebImage?
Note that we will continue to iterate on the code in this repository before SDWebImage releases their next major version. So we will need correct sorting for semantic version numbers. But the semantic version of the base that we are off of has not changed.
In addition, SDWebImage may release another minor version that has changes that we need. So that also needs to be taken into consideration: we'll need to bump the upstream version and re-apply our patches.
Now the response that you could have is to say the following:
The semantic versioning specification can't (or won't) solve these problems.
And I can't say that I would argue with you because I don't have a proposal for a way of handling this.
But what I hope I have done is describe the limitation of semantic versioning that we're currently experiencing, so that other people can link to this comment and say "this affects me as well."
@peterjenkins commented on GitHub (Jul 22, 2014):
One more quick note. Ideally, I would like to be able to version the patch that is applied on top of SDWebImage. So if we bump the minor version but our applied patch stays the same, the part of the version that represents our patch would also stay the same, even though it is applied on top of new code.
@neverfox commented on GitHub (Jul 22, 2014):
@datagrok Good point.
@neverfox commented on GitHub (Jul 22, 2014):
It seems to me more and more that the solution is to represent forks with a fork or "post-release" indicator. Each time the symbol occurs, semver is applied to what appears to the right. This can support theoretically infinite fork complexity, e.g. 1.0.1 > 1.0.0~0.1.2~0.0.1 > 1.0.0~0.1.2 > 1.0.0. I think that would address all of the cases @pjenkins-cc mentions. Of course, the key is getting buy-in from the major dependency managers.
@peterjenkins commented on GitHub (Jul 22, 2014):
👍 for "post-release" indicator proposed by @neverfox!
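The nested ordering @neverfox describes, with semver applied again to the right of each indicator, can be sketched by comparing versions layer by layer. This assumes ~ as the (hypothetical) indicator and strict X.Y.Z numbers in every layer; the function name layers is made up for illustration.

```python
def layers(version, separator="~"):
    """Split '1.0.0~0.1.2~0.0.1' into [(1, 0, 0), (0, 1, 2), (0, 0, 1)]."""
    return [
        tuple(int(number) for number in layer.split("."))
        for layer in version.split(separator)
    ]


versions = ["1.0.0~0.1.2", "1.0.1", "1.0.0", "1.0.0~0.1.2~0.0.1"]
# Python compares lists element by element, and a list that is a strict
# prefix of another sorts first, so a fork ranks above the version it forked.
print(sorted(versions, key=layers))
```

Because the comparison is element-wise, forks of forks need no special handling: each extra layer only matters when all layers to its left are equal.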
@neverfox commented on GitHub (Jul 22, 2014):
Credit goes to @datagrok. I'll add that you could accomplish the same thing by just having unlimited dot notation slots, but how would that get handled currently?
@FichteFoll commented on GitHub (Jul 22, 2014):
This indeed sounds like a perfect candidate for post-release versioning. My current favorite for this is the following suggestion by @zafarkhaja over at https://github.com/mojombo/semver/issues/200#issuecomment-47330860:
@neverfox commented on GitHub (Jul 22, 2014):
Ha, I had the same thought. +1
@peterjenkins commented on GitHub (Jul 23, 2014):
Let's try an example to see how it feels.
We're using SDWebImage version 3.6. So far we have two patches applied on top of it. Let's imagine some future changes beyond where we are now and see what the sequence of version numbers might look like. (Note the resulting version from each mutation is on the next line of the table.)
In step 8, we wouldn't actually need to bump the major version because we are pre-1.0. So my example is a bit contrived. But it does beg the question, what is the meaning of 1.0 for a patch?
In trying out the example, it seems there's still a few questions to figure out.
MyCompanySDWebimage 3.8+0.1.0~x86)
After writing this up, I feel like there's some more discussion that needs to take place, but the basic idea does seem plausible.
I haven't spent too much time examining the logic a semver implementor would need to code up.
Might be something roughly like this:
One more thing I will note is that we have two independent development teams relying on these changes to SDWebImage that can't yet be accepted upstream. And actually it would be helpful to use semantic versioning for the interface exposed by our patch. So I'd like to be able to link against version 0.1.x of the patch, and know that if backwards incompatible changes are introduced by the other team as version 1.0.0 of the patch, semver won't pull them in until I'm ready.
In summary, the problem is definitely real--we do have a fork of SDWebImage and need to pick a version number to use. (We're currently in a situation like MyCompanySDWebimage 3.6+0.0.2.) I don't love the idea of saying "just use MyCompanySDWebImage 0.0.1". But this is also getting complicated.
Hopefully these examples helped clarify things by making them more concrete and didn't just muddy the waters.
@neverfox commented on GitHub (Jul 23, 2014):
Nice example, but one minor difference in what I imagined:
Basically think of the post-release version as an overlay of the upstream version, reassessing at each merge. The post-release + the upstream version should answer the question: if the upstream incorporated my changes and only my changes, what would the next semver version be upstream?
@peterjenkins commented on GitHub (Jul 23, 2014):
@neverfox I agree about resetting the post-release version whenever you accept new changes from upstream.
Actually in this case, our patch does change the public interface exposed by the original package. (Which is why it can't be included now and has to be deferred for a major release.)
Even if it's only that one patch being applied, you'd need to set the version to 3.8+1.0.0 following your new guidelines.
The post-release versions are only valid for a single upstream version. Each time you pull new code from upstream, you MUST reset the post-release version back to one of (1.0.0, 0.1.0, 0.0.1) depending on how it relates to the public interface exposed by the original package. This would greatly simplify my example above.
Kind of like a Docker union file system, the version applies to the total public interface exposed by the software package.
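The reset rule above can be sketched as a small helper. It uses the + notation from this comment, which differs from SemVer 2.0.0, where + introduces build metadata. The names RESET_VALUES and rebase_onto are hypothetical.

```python
# Starting post-release version for each kind of change the fork makes
# relative to the public interface of the upstream version it sits on.
RESET_VALUES = {"major": "1.0.0", "minor": "0.1.0", "patch": "0.0.1"}


def rebase_onto(upstream_version, change_level):
    """Return the fork version after pulling a new upstream release.

    The previous post-release part is deliberately not an input: under the
    proposed rule it is discarded and re-derived from change_level.
    """
    if change_level not in RESET_VALUES:
        raise ValueError(f"unknown change level: {change_level!r}")
    return f"{upstream_version}+{RESET_VALUES[change_level]}"


# A patch that changes the public interface, re-applied on upstream 3.8:
print(rebase_onto("3.8", "major"))
```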
@peterjenkins commented on GitHub (Jul 23, 2014):
Here's a snippet from the current BNF:
The new version might look like this:
@neverfox commented on GitHub (Jul 23, 2014):
Is there a way of making that notation recursive to represent forks of forks etc.?
@peterjenkins commented on GitHub (Jul 23, 2014):
@neverfox You're right, what I posted is not recursive. How about this:
Note that the meaning of <semver core> has changed.
@neverfox commented on GitHub (Jul 23, 2014):
I think that captures it. Nice!
@FichteFoll commented on GitHub (Jul 23, 2014):
Would do exactly like @neverfox proposed regarding fork versioning.
I'm not sure about the proposed BNF though because
<version core>, this would look like 0.0.12 or 12.0.0 and you'd never use the first/latter two zeros for anything since you are in fact using a linear development cycle until (pre-)release.
However, the second issue in particular makes the enforcement of a <semver core> "+" <semver core> syntax impossible.
@peterjenkins commented on GitHub (Jul 24, 2014):
@FichteFoll The idea is that the same rules that normally apply to package versions also apply to post-release versions (forks):
So instead of evaluating what was changed against the API of the previous package release, you evaluate the changes against the API of the package version that you forked.
So if you fork a package to fix a bug without changing the API, your new version is +0.0.1.
If you fork to add a new feature, which changes the API but doesn't break consumers of the old API, your new version is +0.1.0. This can also include PATCH level changes.
If you fork and break existing consumers (for example, by renaming a method), you need to use +1.0.0.
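The three rules above amount to ordinary semver bumping applied to the post-release part, starting from an implicit 0.0.0. A minimal sketch under that assumption, using the + notation of this comment (not SemVer 2.0.0 build metadata); bump and bump_fork are hypothetical names.

```python
def bump(core, level):
    """Apply a normal semver bump to a (major, minor, patch) tuple."""
    major, minor, patch = core
    if level == "major":
        return (major + 1, 0, 0)
    if level == "minor":
        return (major, minor + 1, 0)
    if level == "patch":
        return (major, minor, patch + 1)
    raise ValueError(f"unknown change level: {level!r}")


def bump_fork(version, level):
    """Bump only the post-release part of 'BASE' or 'BASE+X.Y.Z'."""
    base, _, fork = version.partition("+")
    # An unforked version behaves as if its post-release part were 0.0.0.
    core = tuple(int(n) for n in fork.split(".")) if fork else (0, 0, 0)
    return base + "+" + ".".join(str(n) for n in bump(core, level))


print(bump_fork("0.2.0", "patch"))   # bug fix, API unchanged
print(bump_fork("0.2.0", "minor"))   # new feature, backwards compatible
print(bump_fork("0.2.0", "major"))   # breaks existing consumers
```

The upstream part of the version is never touched, so the fork keeps pointing at the release it was forked from.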
@zafarkhaja commented on GitHub (Jul 24, 2014):
@pjenkins-cc in case the post-release identifier makes its way into the spec, I think that the spec shouldn't impose any special constraints on its format. It should follow the same rules as the pre-release and build identifiers. In fact, these rules provide the flexibility that allows us to use datetimes and hashes for pre-release versions and build metadata, which is very convenient.
I believe that it should be defined on per project basis what kind of information you store in the post-release version provided that it follows the general rules of the spec. This will allow for broader adoption and application of SemVer.
@FichteFoll commented on GitHub (Jul 24, 2014):
@pjenkins-cc Yes, I know that.
The issue was that I tried to use the "nightly builds" or "developer builds" discussed in #200 with the post-release syntax here, and it's ... not as good. The entire reason of using post-releases and not pre-releases of a future version is that during development you don't know yet whether the next release will be a patch, minor or major because you haven't decided on the features you actually want to ship in the end. This also means that you can not push a major update to your dev branch, name that release 1.0.0+1.0.0 and later decide that you don't want that yet and revert it ... and now you are on 1.0.0+2.0.0?
By the way, imo metadata should be part of <semver core>.
Example:
SDWebimage 3.6.0~x86 -> MyCompanySDWebimage 3.6.0~x86+0.0.2~somelib
@neverfox commented on GitHub (Jul 24, 2014):
I think the issue is that we're really discussing two pretty different use cases that may have different needs. I think of forks as distinct from post-release. Recalling the OP, there are scenarios where a developer wants to use semver, but wants it to be prefixed by something that refers back to the point where the fork occurred. Disregarding that prefix, the expectation would be for semver to be followed as it normally would, which implies enforcing x.y.z. It's like saying "Use semver, but starting at the 7th character of the version string."
On the other hand, there are "post-release" scenarios (and I admit, as someone who personally doesn't make use of additional parts of the spec like pre-release, metadata etc., I'll need others to let me in on some realistic examples of what those scenarios might be) where your interest is just to mark something that isn't really to be considered a true release. This is aided by a no-enforcement approach.
I don't know if that means there should be two solutions instead of one, but it might help to keep that taxonomy in mind when discussing it.
@peterjenkins commented on GitHub (Jul 24, 2014):
Here are some more thoughts.
Metadata
In my view, the original purpose of metadata was information that differentiates multiple builds that have the same source code. If you can build a static library separately for x86 or arm7s, but it's the same source code, those two builds should sort the same. One is not "newer" than the other. Semver can't tell you which one you need.
So when you fork a project with the "fork" button on GitHub, you fork the code, not the specific build products. But the end product has one architecture--the same library isn't an x86 build of the original source code and then the patch on top of it is compiled for arm. So that's my reasoning for one single build metadata field. (At the end, preceded by a tilde.)
I understand people have shoehorned all sorts of things into the "build" metadata field. My hope is that adding post-release versions will help mitigate the need to do so.
Nightly builds (and related)
I haven't read all the discussion on #200 and the related issues, but I've read some of it. So far I've understood the following points.
From where I stand, all those points can be addressed while still keeping post-release identifiers as "real" semver versions. Let me explain.
Suppose you are part of a team developing a package. If you are committing code, you're generally changing the behavior of that package in some way, unless you are refactoring.
If you are changing the behavior of the package, someone else may be relying on the old behavior. One of the reasons semver exists is to have a standard way to communicate how that behavior is changing.
The nice thing about having hierarchical post-release versions is you can communicate those changes to behavior on a more granular level than you were able to do so before. Let me try an example. I'm going to add lots of details to make it concrete and model a real-world situation, but none of what I am discussing is specific to this situation.
Example situation
Imagine you have a closed source proprietary library and you are making regular releases to specific third party partners that have signed an NDA. This library is for iOS, and consists of several .h files and a .a static library. Your current "production intent" version that you have given your partners is 1.2.25, and they're currently using that version in their app store versions of their apps. Note that this is a case where your product has reached 1.0.0. If you were still 0.y.z, the situation would be different.
Now imagine you're working in an agile process in a two week sprint. You're planning to do four two-week sprints before you ship the next version of the library, and you already have a date that you have to ship by. You're time boxing at the release level and also at the sprint level.
Now imagine you have another group within your company that also depends on this library, because they're publishing an end-user product similar to the one made by your third party partners. But they get access to the pre-release versions of the library, where the partners only get the final "production intent" lib delivered to them.
So you're going to provide a version of your library to this other group at the end of each sprint.
Now on top of all this, you're doing nightly builds and you want a way to version them. You don't want to evaluate semver for each commit. (Too restrictive.) And since you don't have someone there to do the evaluation at the time the build is automatically cut, you don't want to evaluate semver for each nightly build either. But you do want to use semver with this other group, because they need to know what changed in the sprint.
Modelling that situation using semver and post-release versions
Let's try our hand at how this new post-release capability can model the change that is occurring in the public interface that your library exposes. Note that in this case the public interface is very clearly defined--it's exposed directly by the .h files that have viewable source by the internal and external consumers of your library.
External consumers
One way to examine this is to start from the outer level and work backward. You start with your current version that is already given to your 3rd party partners and is being used in apps published in the App Store. This version is 1.2.25. You will need to figure out what new version will be used for the update that you're shipping eight weeks from now. But you don't need to decide on that version until pretty close to the time you ship. So what you can do is add a post-release indicator and track the changes to that interface as a separate version. When you actually do need to give that production intent library to your partners, you're going to collapse the post-release indicator, pick the final "real" semantic version for the release, and send them an email with creds to an FTP site so they can download a zip file, containing the new .h files, the new .a, and some release notes telling them what's changed.
Internal consumers
Now before you get to your final real release, you're going to do four internal-only releases to the other team within your company that's consuming your library. The format of the library as it is delivered to them is the same--the public interface is defined in an identical manner.
That team will have a concern as they are integrating those four releases, and their goal will be to understand how those releases have changed when compared against the previous version you gave them. So you complete sprint number three and give them a new .a and header files, and they'll be wondering: How is this different from the version you gave me after sprint number two?
To express this, you can use a post-release indicator for each build you give to this other group. You start with build 1.2.25 and after two weeks of heavy development, you've added a bunch of new features and also fixed some bugs. But you didn't break backwards compatibility. So you give them 1.2.25+0.1.0. This is the first version on top of 1.2.25 that is exposed to these consumers, so there's no reason for the middle number to be greater than 1. Say you just fix bugs in sprint 2, and then you give them 1.2.25+0.1.1. Now you add new features in sprint 3 and break backwards compatibility with 1.2.25+0.1.1. (Note the distinction there.) In this case, they need to know adopting this new version means they're going to have to make changes on their side. So you finish up sprint 3, and you give them 1.2.25+1.0.0.
Pre-internal consumers
Within each of these two week sprints, the builds are undergoing rapid development. New features are being added and removed, depending on the unpredictable temperament of your product owner. Realizing the absurdity in trying to apply semver to every nightly build or even every commit, you strike that idea immediately as not valid. But you still want the ability to version your changes--having a single definitive version number for a given set of source code is immensely valuable.
In order to avoid making a decision about semantic version continuously, you assume that every nightly build breaks backwards compatibility with the previous nightly build. Note again, you're not making a comparison with your 1.2.25 release version--there's no need to do that until right before you send the new version out to your partners.
So suppose you're starting sprint 3, and you need to assign a version to the nightly builds you are producing. And you want to express that each nightly build is expected to break things from the previous one. You choose to express this by adding another post-release version component.
So on Monday of sprint 3, your most recent internally released version is 1.2.25+0.1.1. At midnight, your build machine chews up your source code from git and spits out 1.2.25+0.1.1+1.0.0. At midnight on Tuesday, it does the same thing and produces 1.2.25+0.1.1+2.0.0.
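The nightly numbering described here can be generated mechanically, since every nightly build is assumed to break compatibility with the previous one. A minimal sketch; the function name nightly_version and the idea of passing the night count as an argument are assumptions, and + is the post-release notation of this comment, not SemVer 2.0.0 build metadata.

```python
def nightly_version(internal_release, night):
    """Version for the Nth nightly build since the last internal release.

    Each nightly is treated as a breaking change relative to the previous
    one, so only the MAJOR slot of the extra component ever moves.
    """
    if night < 1:
        raise ValueError("night must be 1 or greater")
    return f"{internal_release}+{night}.0.0"


print(nightly_version("1.2.25+0.1.1", 1))  # Monday at midnight
print(nightly_version("1.2.25+0.1.1", 2))  # Tuesday at midnight
```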
Taking a step back
At this point you may be thinking to yourself, 2.0.0? Why do I need those extra version numbers? Why not just "2". Well, without the extra version numbers, you lose the meaning. One of the big purposes of semver is communication. Here's a quote from semver.org
Suppose I'm new to your company, and I know nothing about your build process or development workflow. But I know semver, and you tell me you're using semver. I can look at the auto-generated build number 1.2.25+0.1.1+2.0.0 and extract some meaning from it. "OK, so they have one round of MINOR level changes since the most recent release, plus a round of PATCH changes. Then they cut some sort of snapshot, then there's been two sets of breaking changes applied on top of that."
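Reading a layered version the way this paragraph does can be sketched as a small parser. It assumes every post-release layer is a strict X.Y.Z and that + separates layers, as in this comment rather than in SemVer 2.0.0; read_layers and highest_level are hypothetical names.

```python
def highest_level(layer):
    """Name the most significant non-zero slot of an 'X.Y.Z' string."""
    major, minor, patch = (int(n) for n in layer.split("."))
    if major:
        return "MAJOR"
    if minor:
        return "MINOR"
    if patch:
        return "PATCH"
    return "none"


def read_layers(version):
    """Split 'BASE+L1+L2...' into the base and (layer, highest level) pairs."""
    base, *post_layers = version.split("+")
    return base, [(layer, highest_level(layer)) for layer in post_layers]


base, post_layers = read_layers("1.2.25+0.1.1+2.0.0")
print("released base:", base)
for layer, level in post_layers:
    print(f"  then {layer}: changes up to {level} level")
```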
Another step back
Semantic means meaningful. Before semantic versioning, version numbers were pretty arbitrary. They often reflected a decision made by Product: "This next release is going to be 2.0."
That's fine for something facing consumers. But typically semver is used when the people consuming your product are not consumers. They're coders. Semantic versioning gives you a way to tell those coder consumers how pulling in your new version will affect the product they are building.
When I first read the semver spec, I was blown away. I'd been working with software with version numbers for years. But never in a way that had much meaning to it. And here this guy took the same constructs that I was familiar with, and said: "hey, if you follow these rules, we can communicate in a consistent manner and make everyone's life easier." If you're interested enough to be following this GitHub issue, you may have had a similar reaction.
Forks and snapshots play an important role in many software teams. We need the capability to create version numbers that communicate about forks in a consistent and clear manner. This capability isn't present in Semantic Versioning 2.0.0.
Let's keep driving the discussion forward.
@datagrok commented on GitHub (Jul 25, 2014):
I disagree with the motivation to make the post-release identifier a strict semver. Instead, it should work like the pre-release identifier, but have a higher rather than lower precedence.
I think we should use ~ for post-release identifiers because + is in use for build metadata. I don't think we want a Semver 3.0 that changes the meaning of + away from 2.0.
Semver already defines fairly simple logic for ordering pre-release identifiers. From 2.0.0§11:
It goes on to specify how to order versions that differ only in their pre-release identifiers.
If semver were to grow a means to specify post-release identifiers using the same sorting logic as pre-release identifiers, it would meet my needs for locally applied patches and local forks, without imposing a lot of additional complexity.
A new section that is a near-verbatim copy of §9 might read (changes in boldface):
§11 could include:
I'd add a cautionary statement that versions including post-release identifiers should probably not be published, but if they are, the identifier should include some alphanumeric text indicating the publisher. That way my SDWebImage-3.6~0.0.1 does not conflict with your SDWebImage-3.6~0.0.1
In this way,
1.0.0-x.x.x < 1.0.0-x.y.z < 1.0.0 < 1.0.0~x.x.x < 1.0.0~x.y.z
Package managers given the requirement of Package==1.0.0 should find all of the versions above and use the one with highest precedence, namely 1.0.0~x.y.z.
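How a package manager might resolve Package==1.0.0 under this proposal can be sketched as follows. The ~ separator is the proposed, non-standard one; identifiers are compared with the pre-release rules of SemVer 2.0.0; split_version and resolve_exact are hypothetical names. Following the comment, every version sharing the 1.0.0 core is treated as a candidate, pre-releases included.

```python
def ident_key(ident):
    # Numeric identifiers compare numerically and rank below alphanumeric ones.
    return (0, int(ident), "") if ident.isdigit() else (1, 0, ident)


def split_version(version):
    """Return (core, kind, identifiers); kind 0=pre-release, 1=plain, 2=post-release."""
    if "~" in version:
        core, _, idents = version.partition("~")
        kind = 2
    elif "-" in version:
        core, _, idents = version.partition("-")
        kind = 0
    else:
        core, idents, kind = version, "", 1
    return core, kind, tuple(ident_key(i) for i in idents.split(".") if i)


def resolve_exact(core, available):
    """Pick the highest-precedence version whose core equals the requirement."""
    candidates = [v for v in available if split_version(v)[0] == core]
    return max(candidates, key=lambda v: split_version(v)[1:])


available = ["1.0.0", "1.0.0-x.y.z", "1.0.0~x.x.x", "1.0.1", "1.0.0~x.y.z", "1.0.0-x.x.x"]
print(resolve_exact("1.0.0", available))
```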
BNF:
@s4y commented on GitHub (Oct 19, 2014):
Just poking this, as I ran into this issue in Iced CoffeeScript. They want to keep its version numbers in sync with main CoffeeScript but be able to release multiple versions per CoffeeScript release. Right now, it seems like the only options are:
…but it seems like this proposal would work better.
@peterjenkins commented on GitHub (Oct 20, 2014):
@datagrok "I disagree with the motivation to make the post-release identifier a strict semver."
I understand the reluctance to having two heavy-weight items in the version string. However, I feel that it is necessary. In my opinion, the primary point of semver is to describe the runtime behavior of the code. Pre-release indicators don't need semver because they don't have anything to do with code behavior: the main semver still applies.
However, the presence of a post-release indicator invalidates at least part of the original semver. So if the post-release modifier isn't a full semver, you lose the ability to compare two versions and understand the impact of the changes.