All Posts By

Dawn Foster


What Happens to Relicensed Open Source Projects and Their Forks?

By Blog Post

Featured image by Ursula Gamez on Unsplash.

This post has been republished with permission from the original article on The New Stack: What Happens to Relicensed Open Source Projects and Their Forks?

Many popular open source projects are owned and driven by a single company, and in today’s difficult economic climate, those companies are under increasing pressure to deliver a strong return on their investments. One response to this pressure has been relicensing popular open source projects to more restrictive licenses in the hopes of generating more revenue. In some cases, relicensing has resulted in a hard fork of the original project. These relicensing events and their forks can be disruptive to the organizations and individuals using and contributing to affected open source projects.

Several companies have relicensed their open source projects in the past few years, so the CHAOSS project decided to look at how an open source project’s organizational dynamics evolve after relicensing, both within the original project and its fork. Our research compares and contrasts data from three case studies of projects that were forked after relicensing: Elasticsearch with fork OpenSearch, Redis with fork Valkey, and Terraform with fork OpenTofu.

These relicensed projects and their forks represent three scenarios that shed light on this topic in slightly different ways. The following summarizes what we found when we looked at the data, and you can dive into the details about these six projects in the paper, presentation and data we shared at the recent OpenForum Academy Symposium.

Elasticsearch and OpenSearch

Almost all contributions to the original Elasticsearch project came from employees of the relicensing company (Elastic), and the fork was created by new contributors and owned by a single company (Amazon).

Elasticsearch

Elasticsearch was an open source project under the Apache 2.0 license until Feb. 3, 2021, when the project was relicensed under the Server Side Public License (SSPL) and the Elastic License. On Aug. 29, 2024, it again became an open source project when Elastic announced it was adding the GNU Affero General Public License (AGPLv3) as an additional licensing option, but there isn’t yet enough data to include this in the analysis.

Both before and after the relicense, contributors to the Elasticsearch repository were mostly Elastic employees; they consistently accounted for over 95% of the lines added to and deleted from Elasticsearch, with almost no participation from contributors outside of Elastic. As a result, the 2021 relicense had little to no impact on contributors, but it had a bigger impact on the users or consumers of Elasticsearch, who were forced to decide whether to continue using it and, if so, under which of the two available licenses.

OpenSearch

OpenSearch was forked from Elasticsearch on April 12, 2021, under the Apache 2.0 license, by the Amazon Web Services (AWS) team so that it could continue to offer this service to its customers. OpenSearch was owned by Amazon until September 16, 2024, when it transferred the project to the Linux Foundation.

As with Elasticsearch, most contributions to the OpenSearch repository came from employees of a single company (Amazon), although to a lesser extent and with organizational diversity increasing over time. In the first year of the fork, a small number of Amazon employees made 80% of total additions and 91% of total deletions to the code. Only two people who didn’t work for Amazon made 10 or more commits, making up 7% of additions and 4% of deletions.

In the final year of the fork under Amazon’s ownership (before the project was moved under the Linux Foundation), its organizational diversity improved, with 63% of additions and 64% of deletions coming from Amazon employees who made 10 or more commits. Six people who didn’t work for Amazon made 10 or more commits, making up 11% of additions and 13% of deletions. In summary, the contributors are mostly from Amazon, but organizational diversity is gradually improving.
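To make percentages like these concrete, here is a minimal sketch of how an organization's share of additions and deletions could be computed. It is an illustration only, not the pipeline used in our research: the `Commit` record, the `org_shares` function, and the sample data are all hypothetical, and it assumes each commit has already been mapped to an author and an organizational affiliation. Only contributors at or above the commit threshold count toward an organization's share, while the denominator covers all commits.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical commit record with a pre-resolved affiliation."""
    author: str
    org: str
    additions: int
    deletions: int


def org_shares(commits, min_commits=10):
    """Return {org: (share_of_additions, share_of_deletions)}.

    Only authors with at least `min_commits` commits are counted in the
    numerator; totals in the denominator include every commit.
    """
    commits = list(commits)
    commit_counts = Counter(c.author for c in commits)
    total_added = sum(c.additions for c in commits)
    total_deleted = sum(c.deletions for c in commits)

    added = defaultdict(int)
    deleted = defaultdict(int)
    for c in commits:
        if commit_counts[c.author] >= min_commits:
            added[c.org] += c.additions
            deleted[c.org] += c.deletions

    return {
        org: (
            added[org] / total_added if total_added else 0.0,
            deleted[org] / total_deleted if total_deleted else 0.0,
        )
        for org in added
    }


# Made-up sample data: two active contributors and one occasional one.
sample_commits = (
    [Commit("alice", "OrgA", 6, 2)] * 10
    + [Commit("bob", "OrgB", 3, 1)] * 10
    + [Commit("carol", "OrgB", 10, 10)]
)
shares = org_shares(sample_commits)
for org, (add_share, del_share) in sorted(shares.items()):
    print(f"{org}: {add_share:.0%} of additions, {del_share:.0%} of deletions")
```

In this sample, carol's single commit counts toward the totals but not toward OrgB's share, because she falls below the 10-commit threshold.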

Terraform and OpenTofu

Almost all contributions to the relicensed Terraform project came from employees of the company (HashiCorp), and the fork (OpenTofu) was created by new contributors as a foundation project.

Terraform

Terraform was under the open source Mozilla Public License v2.0 (MPL 2.0) until Aug. 10, 2023, when it was relicensed along with HashiCorp’s other open source projects (e.g., Vagrant, Vault) to the Business Source License (BSL). Similar to Elasticsearch, the Terraform repository had very few contributors who weren’t HashiCorp employees. In the year before and the year after relicensing, there were only two contributors to Terraform who were not affiliated with HashiCorp, and they both made a very small number of contributions.

Since there were so few contributions from outside of the company, there was no substantial impact on the contributor community from the relicensing event, so the only people affected would likely have been Terraform users.

OpenTofu

OpenTofu was forked from Terraform on Aug. 25, 2023, by a group of users as a Linux Foundation project under the MPL 2.0. These users were starting from scratch with the codebase since no contributors to the OpenTofu repository had previously contributed to Terraform.

Contributions came from 31 people at 11 organizations who made five or more contributions to the OpenTofu repository in the first year. The most substantial contributions came from Spacelift, whose employees made over half of the additions and deletions. Employees from Env0 and Scalr have also made a few contributions, so there is some organizational diversity across the project.

Redis and Valkey

The relicensed project (Redis) had significant numbers of contributors who were not employed by the company, and the fork (Valkey) was created by those existing contributors as a foundation project.

Redis

The Redis project was an open source project under the Berkeley Software Distribution 3-clause (BSD-3) license until March 20, 2024, when the project was relicensed under the Redis Source Available License (RSALv2) and the SSPLv1. This was contrary to the 2018 Redis blog post stating that the Redis open source project would always remain under the BSD license.

The Redis project differs from Elasticsearch and Terraform in the number of contributions to the Redis repository from people who were not employees of Redis. In the year leading up to the relicense, when Redis was still open source, there were substantial contributions from employees of other companies: twice as many non-Redis employees as Redis employees made five or more commits, and about a dozen employees of other companies made almost twice as many commits as Redis employees made.

In the six months after the relicense, all of the external contributors from companies (including Amazon, Alibaba, Tencent, Huawei and Ericsson) who contributed over five commits to the Redis project in the year prior to the relicense stopped contributing. In sum, Redis had strong organizational diversity before the relicense, but only Redis employees made significant contributions afterward.

Valkey

Valkey was forked from Redis 7.2.4 on March 28, 2024, as a Linux Foundation project under the BSD-3 license. The fork was driven by a group of people who previously contributed to Redis with public support from their employers. Within its first six months, the Valkey repository had 29 contributors employed at 10 companies, and 18 of those people previously contributed to Redis. Valkey has a diverse set of contributors from various companies, with Amazon having the most contributors.
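A figure like "18 of 29 contributors previously contributed to Redis" comes down to comparing the sets of contributors in the two repositories. The sketch below shows one way that comparison could look. It is not the method from our paper: the function names and sample identities are hypothetical, and it assumes contributors can be matched on a normalized identity string such as an email address, which real data rarely allows without extra identity merging.

```python
def normalize(identity: str) -> str:
    """Assumed normalization: trim whitespace and lowercase the identity."""
    return identity.strip().lower()


def contributor_overlap(fork_authors, upstream_authors):
    """Summarize how many fork contributors also contributed upstream."""
    fork = {normalize(a) for a in fork_authors}
    upstream = {normalize(a) for a in upstream_authors}
    returning = fork & upstream
    return {
        "fork_contributors": len(fork),
        "returning": len(returning),
        "new": len(fork - upstream),
        "returning_share": len(returning) / len(fork) if fork else 0.0,
    }


# Made-up identities for illustration only.
upstream_sample = ["ana@example.com", "Ben@example.com", "cy@example.com"]
fork_sample = [
    "ben@example.com ",
    "cy@example.com",
    "dee@example.com",
    "eli@example.com",
]
overlap = contributor_overlap(fork_sample, upstream_sample)
print(overlap)
```

Matching on a single identity string undercounts people who commit under several names or addresses, so a real analysis would need to merge identities first.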

Next Steps

This is the first step in a much larger research project underway within the CHAOSS Data Science Working Group. To date, we’ve only looked at the primary repository and organizational affiliation data for each project, so we’re working toward including more repositories and additional metrics to better understand the project health dynamics within these projects. We also might expand to look at other projects that were forked after being relicensed.

Looking at all of these projects together, we see that the forks from relicensed projects tend to have more organizational diversity than the original projects. This is especially true when the forks are created under a neutral foundation, like the Linux Foundation, rather than forked by a single company.

It is still too early to understand the ultimate success or failure of these projects — both the original and the fork. The new forks have more organizational diversity, and projects with greater organizational diversity tend to be more sustainable. However, we don’t yet know if this will be true for these projects, especially for companies that continue struggling to meet their investors’ expectations.

CHAOSS will take part in State of Open Con, a conference covering open source software, open hardware, open data, open standards and AI openness on Feb. 4 and 5 in London. Alex Williams, founder and publisher of The New Stack, will moderate a track on the future of open source.


Unlocking Insights: Practitioner Guides for Interpreting Open Source Metrics


Photo by Martin Wilner on Unsplash

I am thrilled to announce that we have just launched a series of Practitioner Guides to help people develop meaningful open source project health insights. 

Today, we have released the first four guides in the series:

These guides are designed for practitioners who may or may not be experts in data analysis or open source. The goal is to help people interpret the data about an open source project and develop insights that can improve that project’s health. The Practitioner Guides are for Open Source Program Offices (OSPOs), project leads, community managers, maintainers, and anyone who wants to better understand project health and take action on what they learn from their metrics. Each guide contains details about how to identify trends, diagnose potential issues, gather additional data, make improvements in your project and monitor the results of those improvements.

We have more guides being developed already, and we welcome your contributions! You can propose a new guide, author a guide someone else has suggested, or submit a pull request to make our existing guides even better!

These guides are being developed within the CHAOSS Data Science Working Group. We have a Slack channel and meet every other week to talk about a wide range of data topics, so I hope you’ll join us!

Group picture of the attendees of OFA. Lots of smiling faces with the virtual attendees on the screen in the background.

OFA Symposium: Open Source Research Collaboration


Last week, I attended the OpenForum Academy Symposium in Berlin. This is OpenForum Europe’s (OFE) academic conference on open source. Its goal is to bring together researchers, industry, and folks working on policy to share ideas and eventually generate more academic research that is useful for open source policy people. In an effort to avoid a 10-page blog post, I’ll only cover the highlights of a few talks that I found particularly interesting and that seem more relevant for CHAOSS.

In the first keynote, Julia Ferraioli and Juniper Lovato talked about the Beyond the Repository ACM Paper that they co-authored with Amanda Casari. I personally think this should be required reading for anyone doing research in open source. The paper goes in depth into why researchers should think about how their methods and results impact entire open source ecosystems, including the people working within the projects being studied. The paper is organized into nine best practices that help researchers understand how they might design their studies in ways that keep the ethical implications and ecosystem impact of their research top of mind. In particular, they suggest that researchers actively work with the practitioners involved in the projects as they look beyond the repository to gather data and consider the ramifications of the research.

The keynote was followed by several presentations focused on Open Source Communities and Cooperatives. Jérémie Haese talked about the working paper (to be published in Management Science), Open at the Core: Moving from Proprietary Technology to Building a Product on Open Source Software, that he is writing jointly with Christian Peukert using Microsoft’s move to Chromium as a case study. Among other things, they saw an increase in the pool of contributors. They also saw more people reporting security vulnerabilities because of increased bug bounties offered by Microsoft, which resulted in more vulnerabilities being fixed.

Jorge Benet presented A Cooperative Model for Digital Infrastructure and Recommendations to Adopt It, which has been fully published. The report discusses their findings from 21 digital infrastructure projects from 12 cooperatives across 7 countries, using a model that looks at value creation, proposition, and capture, and it offers recommendations for projects wishing to adopt the model.

Elçin Yenişen Yavuz talked about how user-led open source foundations are different from other types of foundations. While many of us work on projects in foundations led by communities and vendors, the foundations led by users of the software (e.g., Apereo Foundation, Academy Software Foundation, openMDM) have more direct benefits for the end users, including more control over functionality, shared resources, sustainability, and productivity. Results of some of this research can be found in the Problems, Solutions, and Success Factors in the openMDM User-Led Open Source Consortium paper in Communications of the Association for Information Systems.

There were a few talks about Legal implications from Open Source, which is a bit less relevant for the CHAOSS audience, but there was one talk from Wayne Wei Wang, Open-Source Commons Made in China: A Case Study of OpenAtom Foundation and Mulan-series Licenses, that I found interesting partly because some of us have been working with the folks at openEuler, which is an OpenAtom project under a Mulan-series license. Wayne talked about some ways that open source is different in China due to Chinese state entrepreneurialism and the relationships between central planning and open source. This is based on Wayne’s research paper: China’s digital transformation: Data-empowered state capitalism and social governmentality.

I found Knut Blind’s talk, Open Source in the Context of Innovation, particularly interesting, since it covered various measures of innovation in open source. He shared the stage with a handful of others as they talked about how existing research on innovation, which uses patents and papers, can be compared to open source innovation by treating open source contributions (like commits) as a counterpart to patents and cited papers. These aren’t as dissimilar as they might seem if you think about how they all share a similar process that goes from submission through review and finally into publication or release. They also talked about using the GitHub Innovation Graph to look at open source innovation for various national economies. Finally, they talked about how dependencies can be used when looking at innovation, but noted that this approach has some challenges when you try to compare projects. For example, JavaScript modules tend to be designed for integration into projects, so they will have many more dependencies than C/C++ projects, which are often designed as standalone apps.

Nataliya Wright’s talk, Open Source Software and Global Entrepreneurial Growth, looked at how contributing to open source can spur global entrepreneurial growth. They found that contributing to open source predicts higher valuations and funding for IT ventures (note that some, but not all, of this is related to selection bias based on the types of companies and founders that contribute). While the talk was based on new research, some of their early stage results can be found in the Research Policy paper, Open Source Software and Global Entrepreneurship.

This was a really interesting conference with many more talks than I could cover here, and I’m already looking forward to next year’s conference!


Demonstrating OSPO Value and How CHAOSS Can Help


Recently, I’ve been thinking about how Open Source Program Offices (OSPOs) can demonstrate value within their organizations and how CHAOSS metrics and software can help. As a result, I’ve given presentations, had discussions on a podcast, and written a blog post about this topic. I wanted to write a summary blog post here to highlight the work in one place while tying some of those conversations together into a broader narrative.

I recently attended OSPOlogy Live in Frankfurt, which had presentations and engaging roundtable discussions about a wide variety of topics relevant for OSPOs. My presentation was about Getting More Value from your OSPO, and I talked about how OSPOs can take a more strategic approach by fostering alignment between individual contributors, business unit leadership, and the communities where employees contribute. The presentation also spent quite a bit of time on how an OSPO can demonstrate value toward accomplishing the overall goals of an organization while using metrics to make improvements and demonstrate success toward meeting those goals. Sean Goggins presented on the topic of Selecting the Right Collections of Sustainability Metrics with a focus on how OSPOs can be data providers that can help an organization make sense out of the mountains of data generated by open source software. CHAOSS Metrics Models help OSPOs focus on collections of meaningful data to create something that provides insight and wisdom about their open source efforts. Ulrike Fempel from SAP’s OSPO wrote a great wrap-up of the Frankfurt edition of OSPOlogy Live if you’d like more details. If you haven’t already attended an OSPOlogy Live event, it’s a great place to discuss challenges and solutions with your OSPO peers!

Building on my presentation about showing the value of an OSPO and Sean’s talk about metrics models as collections of metrics, I wrote a blog post about Measuring Open Source Project Health for Opensource.net that focused mostly on the CHAOSS Starter Project Health Metrics Model. OSPOs can be overwhelmed by the mountains of data and metrics available to understand open source projects, so this blog post and metrics model are designed to help new OSPOs (or ones new to metrics) get started with a few relatively easy metrics. Not only are these metrics relatively easy to gather, they also make it easy to understand how to take action on the data to make meaningful improvements to the health of open source projects. The goal is to get OSPOs started on their journey into using data to learn and improve with the idea that they can expand on this and start measuring other things that matter to an OSPO.

Another thing that OSPOs care deeply about is the long-term viability of the open source projects that their organizations rely on for infrastructure, along with the products and services that they deliver to their customers. It ultimately comes down to a complex assessment of risk vs. reward across many dimensions, including security, community, and governance, just to name a few. We recently released a podcast about Open Source Software Viability and Project Selection where Matt Germonprez, Sophia Vargas (Google), Gary White (Verizon), and I talked in depth about how to assess viability and some of the metrics used in those assessments. Gary is also working on publishing some metrics models and blog posts, so watch this space to learn more about measuring open source software viability.

If you are interested in learning more, we have an OSPO Working Group within the CHAOSS project that we run jointly with the TODO Group. We meet every other Thursday and have Slack channels in both the CHAOSS Slack and TODO Group Slack workspaces if you want to join these discussions or ask questions.

Photo by patricia serna on Unsplash.

A group of CHAOSS community members taking a selfie on the bridge leading to the Bilbao old town area

CHAOSS at Open Source Summit Europe


The CHAOSS crew was well-represented at the Linux Foundation’s Open Source Summit event in Bilbao last week with several talks and panels from CHAOSS community members. 

On Tuesday, we held a panel discussion: Demonstrating OSPO Value with Daniel Izquierdo, Chan Voong, David Hirsch, and me. The idea for this panel came out of the CHAOSS OSPO WG, and during the panel we talked about how to demonstrate OSPO impact using metrics, practical applications for OSPOs, tools, and how to build a narrative for your stakeholders out of your data.

CHAOSS board member, Brian Proffitt, along with his Red Hat colleague, Natalie Pazmiño, held a session about the challenges of Measuring the Impact of Community Events, which can be harder to measure than traditional industry events that rely mostly on lead generation. They talked about creating collateral that can be measured (e.g., whitepaper downloads, landing pages via QR code) and creating opportunities for later participation in a channel that you can measure. They had some creative approaches, so I talked to Brian about the possibility of creating some CHAOSS metrics / metrics models to share their ideas.

Daniel Izquierdo and Yehui Wang had a session about Building SaaS Services with CHAOSS Technology to Evaluate Community Health and Sustainability where they talked about how CHAOSS’ GrimoireLab software is based on 16 years of research, development, and testing in the market, which made it possible for OSS Compass to be built on top of GrimoireLab in just one year! OSS Compass is a SaaS solution implementing CHAOSS metrics and metrics models, and the slides at the link above show examples of how they’ve implemented them. CHAOSS has brought great visibility for GrimoireLab, and the community has been a great amplifier. 

I also gave a talk about Contributor Growth Strategies for OSS Projects where I talked about the challenges that maintainers face and how hard it can be to get more people participating in a project along with some ideas for ways that these challenges can be overcome. I used several graphs from CHAOSS tools to demonstrate how metrics can help maintainers decide where to focus their efforts for growing their contributor base. The slides in the link above have more details about the challenges, solutions, and metrics. 

In addition to the talks from the Chaotics at the event, there were a few others that I found interesting:

  • Nithya Ruff’s keynote about the Evolving OSPO touched on several topics that we’ve been talking about recently in the OSPO WG. She talked about how risk can slow innovation, and how OSPOs are working hard to manage risks that include licenses, AI, security, and regulations.
  • Building On-Ramps for Non-Code Contributors in Open Source by Natali Vlatko and Celeste Horgan echoed many of the conversations we’ve had over the years in the CHAOSS DEI WG with some solid ideas for both maintainers and contributors about how to get more people engaged in your project through documentation, community, project management, and other roles.
  • There were also a bunch of other talks that were relevant for CHAOSS folks, especially some from the Diversity Empowerment Summit, Open Source Leadership Summit, and OSPOCon.

The individual session videos aren’t yet available, but the full day videos for some tracks are available, and this video from the Leadership track contains the talks from Daniel and Yehui (time index 16:20), Natali and Celeste (1:08:50), and my talk (2:05:01).

In addition to the content from the talks, Linux Foundation Research also released four new reports: The 2023 State of OSPOs and OSS Initiatives, The World of Open Source Europe Spotlight 2023, The European Public Sector Open Source Opportunity, and Open Source for Sustainability.

Overall, it was great to see many of my CHAOSS friends, some of them for the first time in person. We had great conversations and fun both at the conference and over pintxos, a traditional food in northern Spain’s Basque region.

Emilio and Miguel Angel at the Bitergia booth

Survey: Help the CHAOSS project improve our tools and metrics


We know our metrics and tools can be overwhelming, even for experienced open source professionals. As we ramp up our new CHAOSS Data Science Initiatives, we wanted to start by learning more about what works well and what doesn’t for people using CHAOSS tools and metrics now or in the past. Understanding the challenges that people have experienced will help us drive improvements within the project to overcome those challenges.

We are launching a survey of current and past users of CHAOSS tools and metrics. It is designed to help us better understand the barriers and challenges that make it difficult for people to gain meaningful, empirically driven community health insights using CHAOSS tools and metrics.

If you have ever used tools based on CHAOSS technologies (e.g., Augur, GrimoireLab, Bitergia, Cauldron) or used other tools to implement CHAOSS metrics, we want to hear from you! 

Take Our Survey

If you’re interested in learning more about our new CHAOSS Data Science Initiatives or joining the community, you can join our #data-science Slack channel or attend our Data Science Working Group meetings to collaborate on data science work in the CHAOSS community.