Contribution experience report: Forgejo

Welcome to my third contribution experience report.

My motivation to contribute

In short: I wanted to fix some bugs I had encountered, and I was curious about contributing to a “soft fork”.

Here is more background. In Kanthaus, we used to rely on GitLab to host about 20 Git repositories for various documents (governance, minutes of meetings), for website sources (such as for https://kanthaus.online or https://handbook.kanthaus.online) or some pieces of software used in the house (such as scripts to steer our heat pump or to gather statistics about the house).

Although we like publishing a lot of those things in the open, some repositories need to be private, for instance if they contain personal information or minutes of meetings about sensitive topics. A GitLab pricing change meant that we could no longer have private repositories for free there, unless we reduced the number of members of our GitLab organization to at most 5. For a while we kept using it, by setting up a communal account whose credentials were shared in the community. This was obviously not ideal from a security standpoint and it also meant that we couldn’t reliably keep track of who did what.

Self-hosting GitLab looked difficult given that it seemed to require a lot of resources. Once Forgejo / Gitea developed a continuous integration feature, it started to look like a credible replacement. So I started experimenting with it and encountered a few problems. I wanted to avoid requiring people to create accounts for this service by reusing Nextcloud for authentication, but in some cases this generated internal errors in Forgejo. Also, although Forgejo had support for migrating repositories from a wide range of sources (including GitLab), migrating some of our repositories there failed with a SQL error.

First contact with the project

I first opened an issue about Forgejo’s website, to encourage adding a more prominent link to the documentation. The suggestion was well received and although I wrote that I was open to submitting a PR for it, it quickly got implemented by someone else. It felt like a good start!

Development environment

I had never written Go code before so I did not know how to set up a development environment for this project. I did not intend to do a lot of work on Forgejo so I just tried editing the Go source files with my text editor. I discovered that the Docker image could be built via Docker itself, without having to install any particular tooling on the host machine, thanks to the “buildx” Docker plugin. Although that takes much longer than “normal” compilation, it felt attractive not to have to worry about the build dependencies at all. To submit my changes to the project I did need to install some tooling in the end, to run the tests and format my code. If I remember correctly, installing the golang Debian package was sufficient for that.

Finding my way into the code base

For the first bug I wanted to solve (about OAuth integration), I could find the place where the error was thrown by enabling the development mode on our Forgejo instance. From there, imitating the surrounding code was good enough to implement the fix I wanted. For the second bug it was a little harder. Although the bug was known (in Gitea’s bug tracker), the root cause was not understood (at least it was not clear from the issue). I think I investigated it by logging all SQL requests made during the migration of my GitLab repository, but I am not sure if I could do that from Forgejo itself or if it was on PostgreSQL’s side. Once I had understood the problem it was relatively easy to find where to make my fixes.

Reviewing experience

This is where Forgejo being a “soft fork” made the experience quite interesting. What they mean by “soft fork” is that the commits that diverge away from upstream (Gitea) are regularly rebased on top of upstream. This has a lot of consequences for contributors like me.

First, because all Gitea commits eventually make it into Forgejo, I had a choice of submitting my changes to either project. The reason why I had started using Forgejo and not Gitea is that it looked like Forgejo had put more thought into governance. The folks behind it started the fork because they disagreed with the lack of separation between Gitea as an open source project and as a commercial service provider. As I could relate to that concern, it felt natural to use Forgejo. I also appreciated the fact that Forgejo was developed using Forgejo itself, whereas Gitea’s repository is hosted on GitHub, which felt a bit ironic.

In this context, submitting my pull request to Forgejo felt more natural: otherwise, I’d just be interacting with the Gitea project and Forgejo’s governance improvements would be of little use to me. Another important factor was that Forgejo’s pull request backlog was much smaller than Gitea’s, so it felt like my contributions were more likely to be reviewed swiftly in Forgejo.

The review process in Forgejo was really great: I got blazingly quick feedback, people were very supportive and helped improve my not very idiomatic Go. I guess there is a particularly strong incentive to be nice to newcomers when you launch a fork, to gather the critical mass needed to make the fork viable.

However, I later realized that even if I only care about Forgejo, there is still a strong case for submitting contributions to Gitea. Indeed, by submitting pull requests to Forgejo, I am adding to the stack of commits that need to be rebased each week on top of Gitea, which is a significant burden for the project. If Gitea makes changes to the lines of code my patches touch, then my patches will need to be rewritten as part of this rebase process. This is a bit of a weird thing to do: in a sense, the person doing the rebase will be putting words in my mouth. This case did actually happen when Gitea wrote a different fix for my second issue (about migration from GitLab). The person doing the rebase did ask for my review when that happened, which was classy, but it does feel like a really complicated process, especially because the rebase must happen quite quickly and atomically, so the project can’t really afford lengthy reviewing rounds in such a process.

Another consequence of this weekly rebase is that if you have pull requests to Forgejo that are open for longer than a week, you’ll need to rebase them yourself on top of the newly-rebased Forgejo. It’s not a big deal, but still, it would be nicer without. On the plus side, this encourages everyone to merge pull requests swiftly, I guess.

Note that the Forgejo project is currently considering becoming a hard fork instead.

Testing infrastructure

Forgejo has a test suite, mostly consisting of unit or integration tests focusing on the backend, written in Go. For my first fix, I was lucky to find that the area I was working on already had a test which I could duplicate and adapt pretty easily to cover my changes.

For the second, it was a bit trickier. Because migrating a GitLab repository to Forgejo involves making a lot of HTTP requests to GitLab, those requests need to be mocked to make the test pass reliably. There were already some tests for the GitLab integration which did that, but the use case I needed to test involved making quite a few requests, so it did not feel like it would be really doable to set up all the required mock requests by hand. There was also a test which would run against the live GitLab.com instance, but only if an API key was provided in the test environment. Neither Gitea nor Forgejo provided such an API key in their CI, meaning that the test had not been run for a long time: it was actually failing, because it hadn’t been adapted for some recent changes in the code. Not great. I went on to update the test so that it passes, then turn it into a properly mocked test by capturing all the HTTP traffic it generated and turning that into mocked HTTP responses, and finally use the same machinery to write a test for my fix. So there was quite a bit of cleanup work needed there.

For this testing work it felt very useful to be able to run a single test and not the entire test suite every time. To me this is really a crucial feature in any project I work on, because the machines I work on are typically not very fast and will take many minutes to run the test suite of a project like OpenRefine or Forgejo. I asked on the project chat how people did that and I was surprised to find that it was not that easy (it required crafting a command to invoke the test runner manually), so I took this opportunity to investigate and document how to do that in VSCodium instead.

Code formatting

When I submitted my first pull request the CI complained about some format violations. I could fix them with the gofmt command, which was the one raising the error in the CI. I then realized I could use make lint to format everything. Also, switching to VSCodium helped to make sure the formatter is run in the background when I save a file.

Governance and roadmap

As mentioned before, the governance model of Forgejo is pretty thorough (which is to be expected given that this was the motivation to fork). They have a dedicated repository to document their rules, the composition of teams, the decisions they make, and so on. They were also quite proactive in pulling me in, first by adding me to the Contributors team without me even requesting it.

Because that did not let me merge pull requests as such, that prompted them to create a new team in their governance model (“Mergers”) and encourage me to apply to it (even though my Go is clearly quite wobbly, and I hadn’t requested the right to merge at all). That’s a pretty extraordinary thing to see in a FOSS project.

In OpenRefine I have tried to be similarly proactive in pulling people in but I have realized that because our governance model is pretty messy, it would be worth cleaning that up first.

Also, as a new team member in Forgejo I had the same sort of weird experience as I expect new OpenRefine contributors have. I do not remember getting any notification from the forge about me being added to the team (in GitHub, you would at least get an invitation by email), and then I got tons of notifications from various repositories in the organization because I started watching them automatically. This is an experience I would be interested in improving: maybe by making more changes to Forgejo, for instance to notify people when they get added to teams (ideally, explaining why they are added, what privileges it grants them, how they are invited to use them, and so on).

Would I contribute again?

For sure, it was a great experience. It gives me the confidence that I should be able to fix other problems in Forgejo as I encounter them in our use of the platform. I am excited to see how the project evolves, especially with the prospect of a hard fork.