Should we adopt C4?

jospoortvliet · June 2, 2016, 12:06pm

The Collective Code Construction Contract is an evolution of the github.com Fork + Pull Model, aimed at providing an optimal collaboration model for free software projects.

It is absolutely interesting to read and we should see if we can use this, adopt it perhaps. From the link above:

Goals

C4 is meant to provide a reusable optimal collaboration model for open source software projects. It has these specific goals:

To maximize the scale and diversity of the community around a project, by reducing the friction for new Contributors and creating a scaled participation model with strong positive feedbacks;
To relieve dependencies on key individuals by separating different skill sets so that there is a larger pool of competence in any required domain;
To allow the project to develop faster and more accurately, by increasing the diversity of the decision making process;
To support the natural life cycle of project versions from experimental through to stable, by allowing safe experimentation, rapid failure, and isolation of stable code;
To reduce the internal complexity of project repositories, thus making it easier for Contributors to participate and reducing the scope for error;
To enforce collective ownership of the project, which increases economic incentive to Contributors and reduces the risk of hijack by hostile entities.

jwildeboer · June 2, 2016, 12:16pm

An exhaustive explanation of what C4 is and what it does is at http://hintjens.com/blog:117

Specifically the description of the development process at http://hintjens.com/blog:117#toc46 may look really counterintuitive at first (optimistic merging, accept all formally correct pull requests immediately) but it really works better that way.

As I brought up C4 over at G+, I am happy to explain it in more detail if required.

IMHO all projects that care about Software Freedom should use C4 for the community aspect. But that is strictly IMHO, of course.

jospoortvliet · June 2, 2016, 1:13pm

jwildeboer:

An exhaustive explanation of what C4 is and what it does is at Building Online Communities - Hintjens.com

Specifically the description of the development process at Building Online Communities - Hintjens.com may look really counterintuitive at first (optimistic merging, accept all formally correct pull requests immediately) but it really works better that way.[/quote]

Hehe, that does sound scary, yet at the same time - see the blog I just published, we really want to be better citizens and take our responsibility to merge stuff faster. If people send in a PR and don’t have time or skills to fix it right away, we have to be prepared to do the work for them. At least for the first few PR’s people submit.

[quote=“jwildeboer, post:2, topic:113, full:true”]As I brought up C4 over at G+, I am happy to explain it in more detail if required.

IMHO all projects that care about Software Freedom should use C4 for the community aspect. But that is strictly IMHO, of course.

Thanks for the input, this is really interesting.

Andy · June 2, 2016, 1:22pm

In general C4 seems perfectly fine even though some process related corner stone seems to be inefficient to some extend making collaboration a bit bureaucratic and also prohibits certain scenarios:

C4 is obviously following the trunk based development paradigm which I prefer over the “branch for everything” style simply for efficiency reasons. C4 defines that users need to fork, do their changes and open a PR. This makes collaboration a bit hard between long time members who want to collaborate on an issue since one would have to fork and grant the other one write permission so that collaboration between to persons does not have to be done via PRs (which would again be inefficient) targeting the fork.
One other issue has been mentioned by Joos. You cannot fix issues of a PR since that one comes from a PR that is hosted in another fork where you do not have write permission. So you would probably for the PRs originating fork, fix anything, open a knew PR and close the old PR. So this seems rather complicated.
Not working with branches also prevents release trains as in working on 9.2, 9.3 and 10.0 at the same time in parallel, since you only have one branch: master

Just my two scents feel free to challenge it. I would rather go for something between trunk-based and gitflow with boh being the extreme variant so aiming for a best of breed approach

jwildeboer · June 2, 2016, 1:53pm

Yes, it may look quite complicated. But the logic behind this is rather simple. It’s all about making it really easy for contributors to join. So merge all PRs ASAP that solve a problem (remember: all PRs MUST solve a documented and accepted issue in C4) as long as they are technically correct. Should the PR break stuff, revert the changes with a comment stating that. This makes sure you capture the reputation and code quality of all contributors and (more important) the contributor gets immediate gratification and feedback. Fighting over code before the merge of the PR takes place has the disadvantage of delying the merge and not having a record in the code of the problems. The fixes always happen in MASTER, so you don’t have to look at the fork at all (ideally). With fast merging, the risk of incompatibility is also reduced to almost zero. OTOH it actually means that you MUST merge fast to avoid those problems.

Will 9.2 ever be merged into 9.3 or 10.0? If not, aren’t they technically forks then? And isn’t keeping those forks synchronized a waste of presious resources? This is what C4 explains in more detail at Building Online Communities - Hintjens.com

Jan

Andy · June 2, 2016, 2:29pm

Yes, they have to be merged and should most likely be merged from time to time, again and again and technically they aren’t forks since they would live in the same repository, so contributors can fork the repo with the branches and contribute to the different branches as in different upcoming releases.
As for the wasting of resources I would argue it is the same waste as not doing it since trunk based development when you do a mayor, mayor change it means: large code changes and long time development with you being the poor one as the developer who has to resolve conflicts all the time and stay up to date with an all time changing base. It took me some days/weeks to implement the material design for the oC Android app which needed to be some kind of a big bang release since in order to get this done a lot of code had to be touched since the introduction broke the app at runtime in many areas. So finding bugs down the road and maybe having to revert this, when other commit already made it to master in between will be incredibly complex and painful.

The rather likely scenario is probably working on 9.1.x and 9.2.0 and having both of these branches in the “official” repo.

jwildeboer · June 2, 2016, 2:43pm

Yes, the pain of parallel maintenance of several versions and backporting and forward porting fixes and features, frustration over regressions that make life miserable - I work at Red Hat, I know it all too well

IMHO the main lessons to be learned from C4 are:

Optimistic Merging over Pessimistic Merging (which includes the separation between contributer and maintainers. IMHO that’s the secret ingredient for a good community)
Keeping the community focused and productive
Put people over code because people write code but code cannot create people
Have defined rules instead of flamewars to weed out the trolls and bad influences.

I think C4 reflects a lot of solutions to problems we have all seen in F/OSS communities. But it is not cast in stone. However, I think the main lessons I listed above should be agreeable by all nextcloud people as goals to work on. Again, all of this is just an outsider trying to help

Jan

Andy · June 2, 2016, 3:00pm

@jwildeboer I totally agree with you especially on the main lessons learned you mentioned and as a general guideline working trunk based should also be the way to go for reasons of speed but I would prefer to have something like C4 as a general guideline which everybody then follows but still keeping things pragmatic so that one can still slightly defer from the guide in cases where it makes sense and people agreed on doing thing slightly different in that special case.

jan · June 2, 2016, 7:57pm

Just as general info: At the moment we basically used Github flow (not git flow): https://guides.github.com/introduction/flow/ – much simpler than git flow.

Also, this section:

Contributors SHALL NOT have commit access to the repository unless they are also Maintainers.

Currently we do give everyone commit access. We do have a review policy (2 thumbs up) and no one is allowed to push directly to master. But people definitely do need commit access to collaborate on branches.

MariusBluem · June 2, 2016, 8:10pm

Good Point @jan

We should create an CONTRIBUTING.md file in the repository, where this is explained (will be forked with the whole core anyway … I think ).

However, I liked this video about the PR process: https://www.youtube.com/watch?v=M_-PtACnnz4

jyaworski · June 3, 2016, 12:57am

Here’s my take:

Use C4
Write unit tests for everything, and encourage our contributors to do so as well
Write acceptance tests for supported operating systems
If a PR passes unit tests, ping the maintainers about it. Possibly one acceptance test
When we do a release, choose a commit and run unit and acceptance testing on it. If it passes, we can tag that as a release and push tars

Thoughts?

stp · June 3, 2016, 6:19am

Yeah for a “merge most things” strategy we at least have to require a lot of tests. One awesome example of automation and testing the shit out of everything is kubernetes (https://github.com/kubernetes/kubernetes) they even test performance from commit to commit.

Henni · June 3, 2016, 12:05pm

I fully agree with this point, but it will probably still be difficult to translate it into action.
A lot of first time contributors have never written tests and might close their PRs before having to deal with this.
Additionally core/server (however you want to call it) is only to 60% covered by tests. So there is still a lot to do.

But nonetheless C4 sounds like a good idea to boost participation among the community.

Regarding encouraging contributors to write tests:
one way to do this, would be to give them some advice on how to do so. owncloud’s documentation only contains a rudimentary explanation on the tools used. This should be extended and maybe there are also some blog posts to be written about this.

orion1024 · June 3, 2016, 2:03pm

Fast merging would be great.

But automated testing is a must in order to achieve that without putting reliability at risk.

If fast merging is a priority for Nextcloud, then implementing proper automated tests should be as well.
I posted such a suggestion on the planning thread and it got a sizable amount of votes, even when it was only for the desktop client, so the need is definitely there.

Andy · June 3, 2016, 2:03pm

I would prefer a mixture and base it on Github Flow:

Use Github Flow with the (other) guidelines from C4
Write unit tests for everything, and encourage our contributors to do so as well
Write acceptance tests for supported operating systems
If code review is positive and passes unit tests, ping the maintainers about merge. Possibly one acceptance test
When we do a release, choose a commit and run unit and acceptance testing on it. If it passes, we can tag that as a release and push tars/APKs

Branches should be created within the main repo to make collaboration easier. The release/merge speed depends on the capacity in my opinion so C4 doesn’t oppose a speed benefit over Github flow since you need someone having the time to merge and probably fix smaller things in both paradigms. So I would prefer quality over speed. Even though this should be a theoretical thing as in merges have a higher priority than starting on new features. Prio 1 merges, Prio 2 starting new features. This way we shouldn’t end up having PRs in the digital graveyard. Besides this the Prio 0 is obvious which would be Hotfixes (which are technically also PRs and should be opened anyways). So I would propose 2 general rules of thumb while using Github Flow:

Reviewing, Testing and Merging open and ready PRs is more important than working on new issues
Getting Hotfixes done is more important than fixing bug is more important than implementing new features

This approach would put quality first at a relatively low cost and it also means to pause work on a feature implementation to do a review, test, merge of an open PR “in between”. This way PRs should still get merged pretty quickly with means adding value to the product as soon as possible.

jyaworski · June 3, 2016, 3:08pm

That’s a neat idea. The main idea is to go through existing PRs before merging/adding more where possible.

Andy · June 3, 2016, 3:11pm

exactly!

amenthes · June 4, 2016, 8:01pm

Using git and forbidding branches seems to me like an anti-pattern. Might as well go back to SVN then, no?

hook · June 4, 2016, 10:08pm

It’s great to see that the development model is being debated in the open and I hope that a lot of thought will be given to it as well.

I think in the C4 there are several good ideas that could be incorporated into Nextcloud’s development model, but adopting the whole thing in a vanilla fashion doesn’t seem to be a good idea to me.

The branching and forking models both have its social implications, and careful consideration should be taken about which to use and why.

From the legal and governance PoV, the parts that cause greatest concern to me are as follows:

The project SHALL use a share-alike license, such as the GPLv3 or a variant thereof (LGPL, AGPL), or the MPLv2.

All contributions to the project source code (“patches”) SHALL use the same license as the project.

All patches are owned by their authors. There SHALL NOT be any copyright assignment process.

I don’t want to open the CLA/CAA can of worms here, but simply can’t ignore it. Scattered authorship does bring its own issues.

That being said, an Inbound ≡ Outbound CLA¹ combined with an “or later” clause (e.g. AGPLv3+, which ownCloud and inevitable Nextcloud is under) is a less flexible, but IMHO the next best thing.

There are details that have to be thought of here though.

A patch SHALL NOT include non-trivial code from other projects unless the Contributor is the original author of that code.

This can’t work at all.

For starters Nextcloud will already from the very start break that rule almost 100%, as a huge majority of the code is owned by ownCloud, Inc. (for now, let’s see who buys it).
Even if e.g. Nextcloud GmbH is to buy that copyright from ownCloud Inc, there is still a bunch of 3rd party code already in ownCloud. Spreed.ME will be an additional one.
Also sticking to such a NIH rule makes much more sense for a project like ZeroMQ as than for Nextcloud, which aims to be a platform for others to build upon and to extend.

1: Yes, you heard me right. Inbound ≡ Outbound (and part of C4 for that matter) is a Contributor License Agreement, as the contributors agree upon the inbound (and outbound) licenses for the project. What ownCloud had/has and what C4 is against is an outright CAA – Copyright Assignment Agreement.

jyaworski · June 27, 2016, 1:57pm

I tend to agree, here. Branching is one of the main draws of git.