CHAOSS Blog

If you would like to write an article for our blog, please reach out to our community manager – Elizabeth Barron.


Unlocking Insights: Practitioner Guides for Interpreting Open Source Metrics

By Blog Post

Photo by Martin Wilner on Unsplash

I am thrilled to announce that we have just launched a series of Practitioner Guides to help people develop meaningful open source project health insights. 

Today, we have released the first four guides in the series.

These guides are designed to be used by practitioners who may or may not be experts in data analysis or open source. The goal is to help people interpret the data about an open source project and develop insights that can improve that project’s health. The Practitioner Guides are for Open Source Program Offices (OSPOs), project leads, community managers, maintainers, and anyone who wants to better understand project health and take action on what they learn from their metrics. Each guide contains details about how to identify trends, diagnose potential issues, gather additional data, make improvements in your project, and monitor the results of those improvements.

We have more guides being developed already, and we welcome your contributions! You can propose a new guide, author a guide someone else has suggested, or submit a pull request to make our existing guides even better!

These guides are being developed within the CHAOSS Data Science Working Group. We have a Slack channel and meet every other week to talk about a wide range of data topics, so I hope you’ll join us!

GrimoireLab 1.0

By Blog Post, News


For eight years, we have been working to produce the best platform for software development analytics possible. With the work of more than 150 developers and after over 11,600 commits, we’re excited to announce the release of the first major version of GrimoireLab.

GrimoireLab is an evolution of more than 10 years of work by Bitergia, the LibreSoft research group at URJC, and several contributors to the MetricsGrimoire and VizGrimoire projects. Since 2017, GrimoireLab has been part of the Linux Foundation’s CHAOSS community as one of its founding software projects.

GrimoireLab has become a common choice for open source project health dashboards. It has been used by some of the most important software companies and open source foundations in the world. The platform has also served as the underlying foundation for other applications, including Bitergia Analytics, OSS Compass, LFX Insights, Cauldron, and Mystic.

What’s included in this GrimoireLab release?

  • An automated platform to generate software analytics and insights. 
  • Data collection from more than 30 data sources.
  • Generation of more than 150 metrics and visualizations to understand activity, performance, and community of open source projects.
  • Identities manager to track the activity of an individual across platforms and organizations. 
  • Integration with third-party applications to visualize and analyze data (Kibana/OpenSearch Dashboards/Jupyter Notebooks).
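
If you work in Jupyter notebooks, the enriched indexes that GrimoireLab stores in OpenSearch can be queried directly. The snippet below is a minimal sketch of that workflow using opensearch-py; the index name ("git"), field names ("author_name", "grimoire_creation_date"), and credentials are assumptions that vary by deployment, so check your own instance before relying on them.

```python
# A minimal sketch (not part of GrimoireLab itself) of pulling enriched data into
# a Jupyter notebook with opensearch-py. The connection details, index name, and
# field names below are assumptions; inspect your own enriched indexes first.
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=["https://localhost:9200"],
    http_auth=("admin", "admin"),  # replace with your credentials
    verify_certs=False,
)

# Top ten commit authors over the last year, as a terms aggregation.
query = {
    "size": 0,
    "query": {"range": {"grimoire_creation_date": {"gte": "now-1y"}}},
    "aggs": {"authors": {"terms": {"field": "author_name", "size": 10}}},
}

response = client.search(index="git", body=query)
for bucket in response["aggregations"]["authors"]["buckets"]:
    print(f"{bucket['key']}: {bucket['doc_count']} commits")
```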

Why are we releasing this major version right now?

As our roadmap lays out, we’ve identified some challenges that require a major shift in how the platform works. We expect that version 2.0 of GrimoireLab will be significantly different, improving on scalability and maintenance and addressing advancements in AI.  Therefore, we believe that releasing a stable version now will give our users predictability and stability moving forward.

What can you expect from now on?

Version 1.0 will be maintained as a stable release that will continue to power enterprise, open source, and research users. Meanwhile, we will create a branch named 1.x to fix bugs and to include new features that will be part of the next major release. Active development of GrimoireLab 2.0 will happen on the main branch.

Some of the architectural changes detailed in our roadmap for version 2.0 include:

  • Maintenance effort will be reduced in version 2.0 with a graphical user interface and an API for configuring data collection in GrimoireLab. Currently, system administrators need to manually update text files when new data is to be collected.
  • Scalability and performance will be improved to handle more than 5,000 data endpoints and deliver insights faster. Currently, 3,500 highly active repositories require three days of data analysis before the data is ready for the user.
  • Integration with other tools will be made easier. Users will be able to use different tools for visualizing and analyzing the data from GrimoireLab.

Our thanks!

This release would not have been possible without the help of the entire community. We are deeply thankful to all our users. We would especially like to thank Álvaro del Castillo, Valerio Cosentino, Jesús González-Barahona, Alberto Pérez García-Plaza, J. Manrique López, Venu Vardhan Reddy Tekula, David Moreno, Gregorio Robles, Andy Grunwald, and the members of the CHAOSS project. 

We recognize Bitergia and The Document Foundation for being early adopters and the first to add their names to the new ADOPTERS.md file. If you use GrimoireLab, please add your organization to our ADOPTERS.md file so that we can recognize you. If you’ve done research with GrimoireLab, please add a citation and link to your publication.

The GrimoireLab Developers

Embarking on an Exciting Journey: CHAOSS Unveils Ambitious Goals for 2024

By Blog Post

The CHAOSS project is gearing up for a transformative year in 2024, building on the momentum of a successful 2023. In a recent discussion, we shared insights into the project’s goals, which focus on significant growth, community engagement, and the establishment of international standards. Let’s delve into the exciting objectives that will shape the future of CHAOSS in the coming year.

Establishing CHAOSS Metrics as International Standards:

Imagine CHAOSS metrics and metrics models receiving a nod as formal international standards, specifically candidates for ISO standards. We’re collaborating with the Joint Development Foundation, part of the Linux Foundation, to create standards that will facilitate global engagement with these metrics and metrics models. This move lends more legitimacy to our hard work and opens doors for wider recognition and adoption, making it easier for management and C-level executives to embrace these metrics and metrics models. The journey involves thoughtful consideration of which metrics and metrics models are suitable candidates, navigating the standardization process, and even contemplating compliance programs. As we venture into this uncharted territory, we’re ready for the exciting challenges and opportunities.

Crafting Outreach Plans for Community Growth:

Building and sustaining the CHAOSS user community has always been a priority. In 2024, we’re taking a more deliberate approach to growing this community. There are many things we could do; as we craft a strategic advocacy plan, it will be key to prioritize the activities that will genuinely make a difference in raising awareness and engaging users. From refining user-specific key messages to exploring the possibility of an Ambassador’s program, our goal is to make every member of the CHAOSS community feel welcomed and included. We’re also gearing up to share our insights with other open source communities, envisioning a toolkit that aids others in navigating the challenges of community outreach and promotion. And we’ll continue to explore downstream use and engagement with the artifacts our community creates, as in the case of Augur and GrimoireLab.

Fostering Collaboration Within the Contributor Community:

Contributors are the heart and soul of CHAOSS, contributing in diverse ways, from coding to blogging to organizing meetups. To foster collaboration, we continue to find ways to make it easier for everyone to contribute, appreciating all forms of participation. The ‘Chaotic of the Week’ program continues to shine a spotlight on community members, and you’ll see us at even more events, engaging both users and contributors. We’re thankful to our community for their blogging efforts; one shining example is the series of blog posts penned by Gary White, principal engineer in Verizon’s Open Source Program Office, about the company’s use of CHAOSS metrics. Collaboration with other communities is a key aspect, including the TODO Group, Linux Foundation, and universities, and we’ll focus on making these partnerships more visible and explicit. We’ll also revisit our approach to mentorship, thinking strategically and innovatively about the programs we engage in.

Offering SaaS Solutions for CHAOSS Metrics:

By providing hosted Software as a Service (SaaS) offerings, we aim to make CHAOSS metrics more accessible and user-friendly, reduce resource constraints, and bring these metrics to a wider audience. Hosted solutions can help simplify software installation, a hurdle identified in an earlier survey, and give people a taste of different CHAOSS software pieces so they can decide which fits their needs. These solutions aim to cater to a broader audience, accommodating various project needs, from large corporate open source program offices to smaller scientific organizations. Today, Augur has a hosted instance, and the plan is to convert Augur data into metric model data and implement standards for these models. In addition, we’re looking at securing funding to host the OSS Compass project, CHAOSS-affiliated software that provides a user-friendly interface for metrics.

Data-Driven Insights for Informed Communities:

Historically, the CHAOSS project has taken an agnostic approach to metrics interpretation, providing a wealth of metrics and tools and leaving individual projects to determine how to use and interpret these metrics based on their unique community dynamics. Recognizing the complexity and abundance of metrics, the goal now is to help users, primarily those new to metrics, derive meaningful observations for their communities. Collaborating with context working groups and creating insight guides, we’re bridging the gap for users looking to enhance their communities through informed metric analysis. We’re starting to work on these guides, and we encourage you to get involved and contribute to them. Additionally, we’re focused on building use cases, and community examples will be vital in helping users better interpret data and implement insights effectively.

Adapting to New Technologies:

While we’re not fully prepared to tackle this goal, we’re keeping a watchful eye on emerging technologies like artificial intelligence (AI). Our earlier exploration into AI’s impact on open source communities is just the beginning. New technologies beyond AI will continue to emerge, and as they do, we’ll be ready to adapt our metrics policies and practices accordingly.

With these goals set for 2024, we’re excited about the growth we’ve witnessed and the intentional approach we’re taking as we look forward. We invite you to be a part of this exciting adventure, contributing your unique perspective and energy to enrich our shared experience. Join us in shaping the future, addressing challenges, and celebrating successes together. The CHAOSS community is a vibrant space; together, let’s continue building something amazing!

 

CHAOSS DEI Project Badging

By Blog Post, News

Here at CHAOSS, we are excited to announce the launch of CHAOSS DEI Project Badging. CHAOSS DEI Project Badging is an initiative developed to recognize open source projects that prioritize diversity, equity, and inclusion (DEI) work within their respective communities. The initiative uses CHAOSS DEI metrics as a benchmark to reflect on DEI efforts in an open source project. The objectives of CHAOSS DEI Project Badging are (1) to enable people to signal their ongoing efforts in improving and prioritizing DEI within their communities, (2) to recognize projects and communities for their DEI efforts, and (3) to help communities make informed decisions and take thoughtful actions in creating more inclusive and equitable environments. CHAOSS DEI Project Badging welcomes participation by badged projects in the evolution of the program.

How It Works

Self-Assessment and Documentation

Before applying for a CHAOSS Project DEI Badge, maintainers or project admins are encouraged to conduct a self-assessment of their project’s DEI efforts based on the following CHAOSS DEI metrics: Project Access, Inclusive Leadership, Communication Transparency, and Newcomer Experience. After the self-assessment, maintainers can document how the project attends to and prioritizes DEI around these areas in a markdown file called the DEI.md file. This DEI.md file should exist within the project’s repository for easy feedback from the community. A guide for putting together your DEI.md file is available through the DEI.md Guide.

Badging Application

Once the DEI.md file is published and publicly available, project owners can proceed to apply here. The applicant must be a project owner and the repository that holds the DEI.md file must be specified.

Badging Evaluation

The review follows an automated process in which a CHAOSS bot scans the project repository for the presence of a DEI.md file. The bot will review the DEI.md file for relevant information provided by the maintainer and its alignment with the CHAOSS DEI metrics stated in the DEI.md template to determine eligibility for the badge. A project badge will be issued upon successful review of the DEI.md file. The four CHAOSS metrics used in the DEI.md file include:

  • Project Access
  • Inclusive Leadership
  • Communication Transparency
  • Newcomer Experience

The evaluation of your DEI.md file is based on it being publicly available and demonstrating attention to the four CHAOSS metrics. Ultimately, we are ensuring that you are providing your community members with a well-formed DEI.md file. We do not evaluate how you reflect on the four metrics, as each community will do things differently. If community members have concerns about what is expressed in their community’s DEI.md file, we ask that you discuss this within your community to ensure the DEI.md file appropriately reflects your community’s DEI efforts.
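
The badging bot itself is not shown here, but as a rough sketch of the kind of automated check described above, the snippet below verifies that a DEI.md file exists in a local clone and that it mentions the four metric areas. The file location and matching rules are illustrative assumptions, not the actual review logic.

```python
# A rough illustration only; this is not the CHAOSS badging bot. It checks that
# a local clone of a project contains a DEI.md file and that the file mentions
# the four CHAOSS DEI metric areas used in the badging review.
from pathlib import Path

REQUIRED_SECTIONS = [
    "Project Access",
    "Inclusive Leadership",
    "Communication Transparency",
    "Newcomer Experience",
]

def check_dei_file(repo_path: str) -> list:
    """Return a list of problems found; an empty list means the basic check passes."""
    dei_path = Path(repo_path) / "DEI.md"
    if not dei_path.is_file():
        return ["DEI.md not found in the repository root"]
    text = dei_path.read_text(encoding="utf-8").lower()
    return [
        f"DEI.md does not mention '{section}'"
        for section in REQUIRED_SECTIONS
        if section.lower() not in text
    ]

if __name__ == "__main__":
    for problem in check_dei_file("."):
        print(problem)
```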

Recognition and Badging

Projects that meet the established criteria will receive the CHAOSS DEI Project Badge, which can be prominently displayed on the project’s website, documentation, or other relevant platforms. The badge signifies the project’s commitment to DEI and highlights its attention to DEI best practices.

Continued Engagement

Once the badging process is completed, you can re-apply for a project badge — we recommend it after about a year. We will also be developing CHAOSS Project Badges for Silver, Gold, and Platinum levels in the future that include new CHAOSS metrics.

Getting Involved

If you would like to help CHAOSS build the future of DEI Project Badging, we welcome your participation! You are encouraged to join our DEI Working Group meetings every Wednesday at 10:00 am US Central/Chicago Time. Details on how to join these meetings can be found on the CHAOSS Calendar. You are also welcome to join the CHAOSS Community Slack and connect with us there.

2023 CHAOSS Community Wrap-Up

By Blog Post

I’m a big fan of periodically taking time to reflect on the past, and the new year seems like an excellent time to do so. So let’s look at some of the highlights from the CHAOSS community in 2023!

  • Our Slack community doubled in 2023, as we went from about 900 members at the end of 2022 to about 1800 members currently.
  • We hired a Director of Data Science (the one and only Dr. Dawn Foster) and launched our Data Science Initiative.
  • We launched CHAOSS Latin America and CHAOSS Balkans as our newest chapters (Welcome to Selene Yang and Kristi Progri as our Chapter Leaders!)
  • We completed the DEI Project Badging pilot, working closely with GitHub’s All In project to host this initiative within the CHAOSS project.
  • In addition to our usual CHAOSScon EU and CHAOSScon NA events, we held our first CHAOSScon Africa event, which was a huge success!
  • We retired our Risk and Evolution working groups and launched two new Context Working Groups instead: OSS in Universities and Scientific Communities.
  • We ushered in some new Board Members: Anita ihuman, Ruth Ikegah, Brian Proffitt, and Kevin Lumbard. Dr. Dawn Foster became a new Board Co-Chair, joining Sean Goggins in the leadership role, and we thank Nicole Huesman, our outgoing Co-Chair, for all of her hard work in this role over the past two years.
  • We expanded our outreach to span across many global open source conferences (and we gave away a couple of LEGO Globes to represent the global nature of the CHAOSS project).
  • Our number of released Metrics Models grew from 1 to 16! 
  • We badged 50 open source events in our DEI Event Badging Initiative and grew our team of active Badgers to 24.
  • We shared our framework for surveying a community around DEI initiatives and released the results of our own internal DEI survey.
  • We shared Anita ihuman’s incredible research on CHAOSS DEI metrics.
  • We started a CHAOTIC of the Week series to highlight some of our community members and the great work they do in the community.
  • We launched a program called Tour Guides to help newcomers find their way.
  • We recently restarted the CHAOSSCast podcast with a new episode about every 2 weeks highlighting something interesting from the CHAOSS community.
  • We received continued support from the Alfred P. Sloan Foundation and the Ford Foundation to help us keep the CHAOSS community one of the best communities on the planet. 

2023 was a time for growth and represented a subtle shift in how CHAOSS thinks about metrics. We have even more in mind for 2024, which you can learn more about in our recent podcast: CHAOSS Goals for 2024 and Beyond. It has never been more exciting to be a CHAOTIC! We truly hope you can join our community and help all of us improve the health of our open source communities. 

Guide for OSS Viability: A CHAOSS Metric Model

By Blog Post
Photo by William Bout on Unsplash

This guide is part of a three-part series. This is part three. Read part one for context and part two for a deep dive into the metrics.

In the first two posts of this series, we introduced the CHAOSS Metrics (Super) Model for Viability. We then covered what exactly comprises that metrics model and gave brief impressions of why and how its parts form a whole.

In this guide, we’ll talk about what’s possible with the CHAOSS tools and how we can assemble the Viability metrics model with them. Namely, we’ll focus on GrimoireLab and Augur.

Consider the chart below to see the breakdown of what is available for which service.

Breakdown by Category

Category | Metric | GrimoireLab | Augur
Strategy | Programming Language Distribution | Available | Available
Strategy | Bus Factor | Available | Available
Strategy | Elephant Factor | Available | Available
Strategy | Organizational Influence | Available | Available
Strategy | Release Frequency | Not Available | Available
Community | Clones | Not Available | Not Available
Community | Forks | Available | Available
Community | Types of Contributions | Not Available | Not Available
Community | Change Requests | Available | Not Available
Community | Committers | Available | Not Available
Community | Change Request Closure Ratio | Available | Available
Community | Project Popularity | Available | Available
Community | Libyears | Not Available | Available
Governance | Issue Label Inclusivity | Available | Available
Governance | Documentation Usability | Not Available | Not Available
Governance | Time to Close | Available | Available
Governance | Change Request Closure Ratio | Available | Available
Governance | Project Popularity | Available | Available
Governance | Libyears | Not Available | Available
Governance | Issue Age | Available | Available
Governance | Release Frequency | Not Available | Not Available
Compliance / Security | OpenSSF Best Practices | Not Available | Not Available
Compliance / Security | License Coverage | Not Available | Available
Compliance / Security | OSI Approved Licenses | Not Available | Available
Compliance / Security | Licenses Declared | Not Available | Available
Compliance / Security | Defect Resolution Duration | Available | Not Available
Compliance / Security | Libyears | Not Available | Available
Compliance / Security | Upstream Code Dependencies | Not Available | Not Available
A Summary of Available CHAOSS metrics and their fit to Viability across GrimoireLab and Augur

Breakdown by Tool

Augur Summary

Category | Available | Not Available
Community | 50.00% | 50.00%
Compliance / Security | 57.14% | 42.86%
Governance | 75.00% | 25.00%
Strategy | 100.00% | 0.00%
Grand Total | 67.86% | 32.14%

GrimoireLab Summary

Category | Available | Not Available
Community | 62.50% | 37.50%
Compliance / Security | 14.29% | 85.71%
Governance | 62.50% | 37.50%
Strategy | 80.00% | 20.00%
Grand Total | 53.57% | 46.43%
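
These summary percentages follow mechanically from the availability matrix above. The sketch below shows one way to reproduce them; the rows are abbreviated to the Strategy category to keep the example short, and extending the list with the remaining rows yields the full summary.

```python
# A small sketch showing how the per-tool summary percentages can be derived
# from the availability matrix. Only the Strategy rows are included here.
from collections import defaultdict

# (category, metric, available in GrimoireLab, available in Augur)
ROWS = [
    ("Strategy", "Programming Language Distribution", True, True),
    ("Strategy", "Bus Factor", True, True),
    ("Strategy", "Elephant Factor", True, True),
    ("Strategy", "Organizational Influence", True, True),
    ("Strategy", "Release Frequency", False, True),
]

def coverage(rows, tool_index):
    """Percentage of metrics available per category for one tool column."""
    totals, available = defaultdict(int), defaultdict(int)
    for row in rows:
        category = row[0]
        totals[category] += 1
        if row[tool_index]:
            available[category] += 1
    return {cat: round(100 * available[cat] / totals[cat], 2) for cat in totals}

print("GrimoireLab:", coverage(ROWS, 2))  # {'Strategy': 80.0}
print("Augur:", coverage(ROWS, 3))        # {'Strategy': 100.0}
```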

While we can’t get every metric from every tool, we can get a good majority of what we need through a mix of GrimoireLab and Augur. We intend to continue building the ability to get this data into tools like GrimoireLab and Augur, then update the CHAOSS metrics wiki to reflect how we’ve done it.

Augur provides the most metrics overall, leading in three categories, while GrimoireLab is strongest for Community metrics. GrimoireLab also provides Sigils, which offers ready-made dashboard panels for a good number of the metrics you may want to use. Augur also has a Red Hat-supported tool that visualizes the metrics within it.

How Does this Guide My Decisions?

Depending on your use case, you may find different opportunities to use the Viability model. It was originally developed for evaluating open source products before adopting them, and your thresholds for each model category will vary based on how much risk you are willing to assume.

For example:

  • Organizations starting their journey in governing open source software usually begin with Compliance and Security, focusing on vulnerabilities and licensing information to choose their software.
  • Large companies may consider strategy to be the next-most important. Given that many organizations build software that is in use for years, the importance of the strategy in a project — and indeed who maintains that strategy — can be a critical decision.
  • Community is important for cutting-edge or newer implementations of technology. While older technology will likely have a less volatile community, where maintainers and the flow of new contributions can be judged over time, a new project may need a stronger community and more vigilance about its competitors to ensure that a software stack isn’t abandoned.
  • Governance is crucial for organizations that intend to engage with the open source community or contribute to a project to shape new functionality. If an organization is willing to commit time and resources to maintaining a project, the Governance of that project becomes important to consider.

Getting Started

Consult the documentation of GrimoireLab and Augur for more details on how to get started. Based on what your team needs or cares about, consider choosing the tool that has the highest coverage, or use them both to maximize your results. If you find that some of the metrics I’ve marked here are wrong, or that you can trace ones I’ve listed as unavailable, I’d love to know! Drop by our OSPO working group, metrics working group, or another CHAOSS channel to share your contributions!

Until then, you can find me on CHAOSS community slack, as Gary White. Thanks for reading!


OFA Symposium: Open Source Research Collaboration

By Blog Post

Last week, I attended The OpenForum Academy Symposium in Berlin. This is OpenForum Europe’s (OFE) academic conference around open source, with the goal of bringing together researchers, industry, and folks working on policy to share ideas and eventually generate more academic research that is useful for open source policy people. To avoid a 10-page blog post, I’ll only cover the highlights of a few talks that I found particularly interesting and most relevant for CHAOSS.

In the first keynote, Julia Ferraioli and Juniper Lovato talked about the Beyond the Repository ACM Paper that they co-authored with Amanda Casari. I personally think this should be required reading for anyone doing research in open source. The paper goes in depth into why researchers should think about how their methods and results impact entire open source ecosystems, including the people working within the projects being studied. The paper is organized into nine best practices that help researchers understand how they might design their studies in ways that keep the ethical implications and ecosystem impact of their research top of mind. In particular, they suggest that researchers actively work with the practitioners involved in the projects as they look beyond the repository to gather data and consider the ramifications of the research.

The keynote was followed by several presentations focused on Open Source Communities and Cooperatives. Jérémie Haese talked about the working paper (to be published in Management Science), Open at the Core: Moving from Proprietary Technology to Building a Product on Open Source Software, which he is writing jointly with Christian Peukert using Microsoft’s move to Chromium as a case study. Among other things, they saw an increase in the pool of contributors, along with more people reporting security vulnerabilities due to increased bug bounties offered by Microsoft, resulting in more vulnerabilities being fixed.

Jorge Benet presented A Cooperative Model for Digital Infrastructure and Recommendations to Adopt It, which has been fully published. The report discusses findings from 21 digital infrastructure projects from 12 cooperatives across 7 countries, with a model that looks at value creation, proposition, and capture, along with recommendations for projects wishing to adopt the model.

Elçin Yenişen Yavuz talked about how user-led open source foundations are different from other types of foundations. While many of us work on projects in foundations led by communities and vendors, the foundations led by users of the software (e.g., Apereo Foundation, Academy Software Foundation, openMDM) have more direct benefits for the end users, including more control over functionality, shared resources, sustainability, and productivity. Results of some of this research can be found in the Problems, Solutions, and Success Factors in the openMDM UserLed Open Source Consortium paper in Communications of the Association for Information Systems.

There were a few talks about Legal implications from Open Source, which is a bit less relevant for the CHAOSS audience, but there was one talk from Wayne Wei Wang, Open-Source Commons Made in China: A Case Study of OpenAtom Foundation and Mulan-series Licenses, that I found interesting partly because some of us have been working with the folks at openEuler, which is an OpenAtom project under a Mulan-series license. Wayne talked about some ways that open source is different in China due to Chinese state entrepreneurialism and the relationships between central planning and open source. This is based on Wayne’s research paper: China’s digital transformation: Data-empowered state capitalism and social governmentality.

I found Knut Blind’s talk, Open Source in the Context of Innovation, particularly interesting, since it talked about various measures of innovation in open source. He shared the stage with a handful of others as they talked about how existing research on innovation using patents and papers can be compared to open source innovation by looking at open source contributions (like commits) as a counterpart to patents and cited papers; these aren’t as dissimilar as they might seem if you consider how they all share a similar process that goes from submission through review and finally into publication or release. They also talked about using the GitHub Innovation Graph to look at open source innovation for various national economies. Finally, they talked about how dependencies can be used when looking at innovation, but that there are some challenges with this approach when you try to compare projects to understand innovation. For example, JavaScript modules tend to be designed for integration into projects, so they will have many more dependencies than C/C++ projects, which are often designed as standalone apps.

Nataliya Wright’s talk, Open Source Software and Global Entrepreneurial Growth, looked at how contributing to open source can spur global entrepreneurial growth. They found that contributing to open source predicts higher valuations and funding for IT ventures (note that some, but not all, of this is related to selection bias based on the types of companies and founders that contribute). While the talk was based on new research, some of their early stage results can be found in the Research Policy paper, Open Source Software and Global Entrepreneurship.

This was a really interesting conference with many more talks than I could cover here, and I’m already looking forward to next year’s conference!

Metrics for OSS Viability

By Blog Post

In the last post, we gave a background on Viability in Open Source. We covered the motivation and implementation plan for how we’ll collect and measure metrics about the open source software that we use or might use at Verizon.

Considering all these metrics together takes a list and a laptop, and maybe a ruler.
Photo by Marissa Grootes on Unsplash

In this post, we will cover the nitty-gritty details of which metrics we’re using in our model and why they fit together. Rather than covering each metric individually, I’ll summarize which metrics are in each model and give an overview of why they fit together to form a good picture of the model. We’ll also cover the value proposition of metrics that cross between the categories comprising the full model.

What follows is the list of metrics comprising the Viability model, and why they are included:

Compliance + Security

Just for this model:

OpenSSF Best Practices

  • This is a proxy metric that helps us ensure a project responds to security incidents and has enough protections in place for us to infer a generally reliable Compliance and Security strategy.
  • This allows us to avoid using costly SCA/SAST scanning on every open source project we consider.

License Coverage

  • We use this to make decisions about risk of using unlicensed software, or determine if our use case of the software is compatible with the license provided.

Licenses Declared

  • This lets us compare our intended usage and project policy against the policy of our dependencies.

OSI Approved Licenses

  • Knowing we’ve reviewed the implication of each OSI license provides confidence that we understand how to use the software in compliance with those licenses.

Defect Resolution Duration

  • This metric allows us to consider, apples to apples, radically different response rates to defects when they occur between projects. 
  • Our understanding of this metric across projects and across a suite of dependencies calibrates our risk profile and tolerance.

Upstream Code Dependencies

  • Our intent for this metric in the Viability model is to ensure that the dependencies of a project are also included in any viability evaluation we perform.
  • It is important to consider an application alongside dependencies it shares to give a full picture of a particular project’s risk portfolio.

Shared between models:

Libyears

  • “A simple measure of software dependency freshness. It is a single number telling you how up-to-date your dependencies are.”
  • This metric allows for apples-to-apples comparisons between projects of freshness. 
  • Scales to show complex projects with many dependencies, and the risk associated with using those projects with the massive maintenance cost behind the scenes.
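
As a rough sketch of the libyear idea described above, the snippet below sums the gap between the dependency versions in use and the latest releases; the dependency names and dates are made up for illustration.

```python
# A minimal sketch of the libyear idea: for each dependency, measure the time
# between the release in use and the newest available release, then sum across
# dependencies. The dependency data below is invented for illustration.
from datetime import date

# (dependency name, release date in use, latest release date)
DEPENDENCIES = [
    ("libfoo", date(2021, 3, 1), date(2023, 9, 15)),
    ("libbar", date(2023, 1, 10), date(2023, 2, 1)),
]

def libyears(deps):
    """Total dependency freshness lag, summed across dependencies, in years."""
    total_days = sum((latest - in_use).days for _, in_use, latest in deps)
    return total_days / 365.25

print(f"{libyears(DEPENDENCIES):.2f} libyears behind")  # roughly 2.6 for this example
```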

What the metrics mean for Viability

Overall, we use this metrics model to gauge how well both the community and the maintainers of the application consider the security and compliance of their application. We expect to use these indicators to gauge risk, ranging from showstoppers like licensing, where a license can be flatly incompatible with our intended use case, through security-centric badges and metrics, to how quickly and regularly a team maintains the dependencies and addresses the defects reported against their application.

Additionally, as in other models, some metrics are very tricky to trace or visualize. We leave a healthy amount of flexibility in how we rank applications against tricky-to-gather metrics, and we recommend that users of our models do the same. For example, much like Defect Resolution Duration, the appetite for how many Libyears are appropriate for a project will always be up to maintainers. Depending on how or where an app may run, and how frequently we can update it, we think about Libyears critically.

Notably, Libyears contributes to three of our metrics models: Compliance/Security, Governance, and Community. We believe that it fits particularly well in Compliance + Security, as it gives an indicator not only of how critically maintainers consider compliance and security in their own project, but also in the projects they depend on.

Governance

Just for this model:

Issue Label Inclusivity

  • Provides an effective measure of how intentionally issues are labeled and organized
  • Indicates how community skills are applied to project responsibilities.

Documentation Usability

  • Strong, usable documentation is required.
  • Though this can include a lot of manual effort, this is a very important metric to attempt to collect.

Time to Close

  • How long it usually takes for a contribution to the project to make its way to the codebase
  • Will give us an idea of consistency in the project (median, mean, mode)
    • This is not to be confused with defect resolution (for which we hold a higher standard)
  • Other processes may occur alongside opening and closing a PR, for example, but this provides enough of an indicator to be inherently useful to the Governance of a project.

Issue Age

  • How long questions / suggestions / etc. generally hang around a project.
  • Simple to understand, easy to dive further into by looking at what issues are about.

Shared Between Models:

Change Request Closure Ratio

  • Compare the drift of new requests to their rate of closure. 
  • Gives us an idea of how the project is maintained – or if more maintainers might be needed to keep up with demand for new features.
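
As a minimal sketch of that comparison, assuming simple opened and closed counts for a time window:

```python
# A minimal sketch of the Change Request Closure Ratio idea: compare how many
# change requests were closed in a window against how many were opened in the
# same window. The counts below are hypothetical.
def change_request_closure_ratio(opened: int, closed: int) -> float:
    """Closed-to-opened ratio for a window; well below 1.0 means requests are piling up."""
    return closed / opened if opened else float("nan")

# Example: 40 change requests opened last quarter, 28 closed in the same quarter.
print(f"CRCR: {change_request_closure_ratio(40, 28):.2f}")  # 0.70
```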

Project Popularity

  • Aggregate of other smaller metrics one might expect to find in a cursory glance over a project landing page. 
  • Likes, stars, badges, forks, clones, downstream dependencies, mentions on social media, and more.

Release Frequency

  • Knowing the timing of regular releases, and understanding the frequency and cadence at which we may expect security patches and new features, helps us identify how well our project’s release cadence and strategy fits with potential dependencies.
  • This is somewhat a proxy for LTS / release strategy that may otherwise be available for larger projects.

Libyears

  • “A simple measure of software dependency freshness. It is a single number telling you how up-to-date your dependencies are.”
  • This metric allows for apples-to-apples comparisons between projects of freshness. 
  • Scales to show complex projects with many dependencies, and the risk associated with using those projects with the massive maintenance cost behind the scenes.

What the metrics mean for Viability

These metrics are useful for showing the intention, or lack of intention, in a project’s Governance. For example, if there’s a lack of inclusive labels on issues, it identifies a gap in welcoming new contributors and shifting existing contributors through workstreams, and the project’s Governance is reflected in turn. The same goes for many of these metrics. The ability to contribute to, understand, or depend on a project is highly coupled to the effort behind its Governance.

This isn’t to say poor Governance metrics indicate that a project is governed by fools. Low CRCR, for example, may simply indicate that there are not enough maintainers to support a contributing community. A lack of new issues could be the result of a recent large release that addressed many recurring issues. We aggregate these metrics not to cast doubt on the maintainers of projects, but to identify the Governance capacity and effort across projects in a software portfolio.

If some of these metrics feel like they could be strong community metrics, I think they can be. Many of the shared metrics here are a combination of the effort a community has with a project, and the effort of the body governing a project. We think the overlap of shared metrics captures this relationship well, considering the responsibility contributors and maintainers share in creating OSS.

Community

Just for this model:

Clones

  • How many times a project has been pulled from a repository into a local machine. 
  • Indicator of how many people are using or evaluating the project.

Technical Forks

  • How many forks have been made of a given project.
  • Forks, in our estimation, are normally made to contribute changes back or to take the project in a new direction within another community.

Types of Contributions

  • Not all contributions are code: strategy, issues, reviews, events, writing articles, and so on give a strong indication of the maintainers’ ability to grow or continue building a project.
    • Likewise, if contributions are coming in only as requests with no code alongside them, we can assume the project doesn’t have active contributors.
    • A heavily skewed distribution might tip the scales on whether or not we should recommend a project as viable.

Change Requests

  • The volume of regular requests, or the emergence of a pattern of change requests (around holidays, weekends, weekdays) can tell us a lot about a project. 
  • By identifying trends, we can make many educated guesses about the strength, patterns, and sustainability of a project Community.

Committers

  • We don’t say “no project under x committers is viable”, because a raw committer count alone doesn’t determine viability.
    • We care about committer trends.

Shared Between Models:

Change Request Closure Ratio

  • Compare the drift of new requests to their rate of closure. 
  • Can help indicate a cooling or heating contribution community by monitoring merged community requests.

Project Popularity

  • Aggregate of other smaller metrics one might expect to find in a cursory glance over a project landing page. 
  • Likes, stars, badges, forks, clones, downstream dependencies, mentions on social media, and more.

Libyears

  • “A simple measure of software dependency freshness. It is a single number telling you how up-to-date your dependencies are.”
  • This metric allows for apples-to-apples comparisons between projects of freshness. 
  • Scales to show complex projects with many dependencies, and the risk associated with using those projects with the massive maintenance cost behind the scenes.

What the metrics mean for Viability

With Community, we seek to understand the “tinkering” that happens with a project, as well as being able to measure the contributions that are made. Clones and forks indicate how many users of software have pulled it to build from source, inspect the source code, submit a contribution, or take the project in a new direction. That flavor of popularity feels meaningful to trace community engagement in a project. 

With committer trends, types of contributions, and change requests, we can see how a Community is interacting. Maybe more markdown RFCs are created than features, maybe vice versa. With an understanding of what types of contributions are made, and how regular they are, we can make a more informed judgment on project viability. For example, we think it’s reasonable to expect that a project which has shed 90% of its committers in a three-month period is less viable than one with a stable (flat) committer trend. The inverse could indicate a growing or stable project gaining popularity around a particular technology trend. Where some “tinkering” metrics feel micro, other metrics take a macro lens.

By measuring some shared metrics, we give this model an opportunity to be viewed from the perspective of how much the community maintains a project, and how much interest there is generally. We find this distinct from the Governance angle, even with significant overlap, as trends in these metrics are almost never entirely the fault of the community or of the maintainers of a given project. The numbers could be meaningful for either space, so they exist in both models.

Strategy

Just for this model:

Programming Language Distribution

  • We have strong opinions on which languages are viable at Verizon.
    • Many companies have similar standards and expectations – or normally center around a particular language. 
  • A project built on languages we don’t support or use is a strong signal against its viability for us.

Bus Factor

  • The smallest number of committers whose combined contributions account for 50% of activity over a period.
  • We can better understand the risk of using a particular project or set of projects, regarding how much support the project would get if top contributors left.

Elephant Factor

  • Elephant factor is a lot like bus factor – but it counts the fewest “entities” that comprise 50% of activity on a project. 
  • We use this to infer the influence companies have on a project, or how detrimental it would be if that company shifted priority.
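
To make the Bus Factor and Elephant Factor definitions above concrete, here is a small sketch that finds the fewest contributors (or organizations) accounting for at least half of the activity; the activity counts are invented for illustration.

```python
# A minimal sketch of the Bus Factor / Elephant Factor calculation: find the
# smallest set of entities whose combined activity covers at least 50% of the
# total over a period. The activity counts below are made up.
from collections import Counter

def smallest_majority(activity: Counter) -> int:
    """Fewest entities that together account for at least 50% of all activity."""
    total = sum(activity.values())
    covered = 0
    for count, (_, contributions) in enumerate(activity.most_common(), start=1):
        covered += contributions
        if covered * 2 >= total:
            return count
    return 0

commits_by_author = Counter({"alice": 80, "bob": 60, "carol": 40, "dan": 20})
commits_by_org = Counter({"ExampleCorp": 90, "OtherOrg": 70, "ThirdOrg": 40})

print("Bus factor:", smallest_majority(commits_by_author))    # 2 (alice + bob cover 70%)
print("Elephant factor:", smallest_majority(commits_by_org))  # 2 (two orgs cover 80%)
```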

Organizational Influence

  • Organizational Influence measures the amount of control an organization may have in a project. This is an estimate, and an aggregation of several other metrics. 
  • Details in the link, but organizational diversity is one example of a metric that can aggregate to create an imprint of influence. 

Shared Between Models:

Release Frequency

  • Knowing the timing of regular releases, and understanding the frequency and cadence at which we may expect security patches and new features, identifies how well our project’s release cadence and strategy fits with potential dependencies.
  • This is somewhat a proxy for LTS / release strategy that may otherwise be available for larger projects.

What the metrics mean for Viability

The metrics in this model trace the strategy of a project and the influence we expect individuals and organizations to have on it. For example, with a bus factor of 1, it’s very possible that burnout or other factors could pull that one person away. With a more resilient count of folks, we are more likely to see a stable and viable maintenance strategy. As a highly regulated and large entity, Verizon considers which other entities might be developing critical infrastructure for our applications. We consider our risk appetite and tolerance in the scope of a project we use, to ensure we don’t rely too heavily on one particular provider. These metrics continue our mission of managing that risk profile.

We share Release Frequency between Strategy and Governance. This captures the overlap in how the maintainers of a project provide both a governance plan and a maintenance strategy.

Wrap Up

Compliance + Security, Governance, Community, and Strategy. These are the tenets we use for our Viability Metrics (Super) Model. I’m excited to share this model with the broader software community for input and feedback, and to make it better over time. We will share our lessons learned and what practices we find the most effective for maintaining a viable software portfolio as we iterate.

Tune in next time, when we’ll share a guide on OSS viability, including recommended tools to set up Viability monitoring on projects. If you’d like to connect to talk about this post, join the CHAOSS community Slack and find me, Gary White!

 

Artificial Intelligence: An Open Source Disruptor

By Blog Post

By Matt Germonprez, Dawn Foster, and Sean Goggins 

 

Corporations have increased their investments in open source because of its potential to share the weight of non-differentiating technology costs with other organizations that rely on the same core technologies, and consequently innovate more quickly and increase organizational value. In many cases, the financial leverage gained through open source engagement is substantial, visible, and measurable. However, open source engagement is, to some extent, a cost each organization must assess. For organizations considering open source engagement, it means evaluating the ratio of increased value over the costs of engagement – a ratio that may very well be directly affected by AI. 

Open source has benefited an untold number of industries. Open source carries forward well known and positive outcomes for engagement by companies. These include leveraged development, distribution of software maintenance costs, improved time to market, increased innovation, and talent acquisition. However, these positive outcomes, derived from the value leverage provided by open source, now have the potential to be found elsewhere, most notably through the use of artificial intelligence that leverages large language models. 

It is becoming increasingly clear that AI will be an open source disruptor that alters how companies think about things like the provenance of source code, as seen in the Linux Foundation’s recent release of their Generative AI Policy. The LF policy highlights key areas of concern including contributions, copyright, and licensing. Other key efforts to address open source and AI include the OSI’s deep dive in Defining Open Source AI and AI Verify Foundation’s focus on building ethical and trustworthy AI. These initiatives are critically motivated to address key issues of AI as part of open source processes and the needed accessibility of AI for all. Each initiative rightfully assumes a future that includes AI and also rightfully prepares an audience for key issues that require attention. 

AI is already emerging as a disruptive factor in the work of open source communities. Some open source communities are having issues with the volume of low quality AI-generated code contributions. People often contribute to open source projects (a secondary goal), particularly high-profile projects, to build their resumes and GitHub profiles (a primary goal). However, AI now provides an option for people to reduce the work needed to achieve secondary goals in hopes of achieving primary goals. As a result, open source projects are seeing an increase in nonsense code contributions that are causing additional work for already overloaded project maintainers. 

Within any company, AI has the capacity to impact how engagements with open source projects are evaluated and approached. We know the reasons for corporate engagement with open source projects include reducing the internal resources needed for software development and maintenance and improving product time to market. To obtain these positive outcomes, the costs of engaging with open source projects, such as assigning employees to contribute and become leaders, are offset by the benefits. Open source program offices aim to lower the costs and amplify the benefits of these engagements. But what if AI, used to increase development speed and reduce the expense of engagement with open source communities, further lowers the costs while retaining the benefits associated with developing software in the open? What if a company could still achieve cost and time savings without working in the public? What if conversations that were otherwise present in open source projects and communities could now take place as well-defined AI prompts? Should open source program offices be focusing on working with AI, in addition to working with open source projects?

The questions that need more exploration are premised on how AI carries the potential to alter the cost-benefit ratios of corporate software development in lieu of engaging with open source projects, across three key areas:

  1. Community-level: Working in a Community
    1. Does AI increase open source community level noise? 
    2. Are AI developed contributions distinguishable from those developed by individuals? 
    3. Does AI reduce the volume of corporate engagement within open source communities? 
  2. Ecosystem-level: Working in an Ecosystem
    1. Does AI reduce the need for companies to perform ecosystem level monitoring? 
    2. Does AI reduce the need for companies to engage with open source communities? 
  3. Policy-level: Addressing Licensing and Security Concerns
    1. Does AI provide a source of legal exposure for communities and companies?
    2. Will AI be used to mask malicious code within communities and companies?

Underlying these questions is a certainty that AI will alter the dynamics of collaboration in open source engagement, and we suggest that this new reality be addressed directly. There is a case that AI will alter cost ratios within individual companies, as well as uncertainty about how these changes will shift, or possibly erode, the critical value presently derived from open source engagement. One core challenge we face will be identifying corporate approaches to AI within open source that affect communities, ecosystems, and policies with deliberateness. To date, corporate engagement with open source recognizes that a rising tide lifts all boats. Will AI change our views of the tide?

Viability: An Open Source CHAOSS Metric (Super)Model

By Blog Post

This post is part of a three part series. This is part one. Stay tuned to the CHAOSS blog for a deep dive into the metrics around Viability, and a guide on how we can use this model.

Companies that use open source software (so, all of them) have been thinking more and more about bills of material, vulnerabilities, and license risks. This has especially been encouraged by recent United States efforts regarding Software Bills of Materials (SBOMs). This push is a good opportunity to observe and report on software supply chains. We can all learn a lot from knowing what’s in our software dependencies! The proliferation of SBOMs and of SAST and SCA scanning tools allows users and developers of software everywhere to better understand their risk portfolio. Most people find significant value in using these reports to surface critically important vulnerabilities and license information.

Choosing and Updating Components

When Verizon, and the OSPO within it, looked through our own SBOMs and assessed which dependencies we should update, we found that common practice sparsely dictates relative priority. Outside of CVEs (with their own criticality ratings) and licensing resolution (usually a hard yes/no from legal teams), barely any industry practice sets priority for updating dependencies. Old dependencies without any open vulnerabilities are regularly put behind new features and critical breakages. That felt wrong to us. Over time, dependencies that had “never needed updating” find themselves painfully entrenched in our applications, and when they do need upgrading, the effort can take dramatic time away from other development. We thought, surely there has to be a way to identify (and reduce) that risk.

Photo by Zaini Izzuddin on Unsplash

What else should we be looking for in our open source software choices? What guidance can we give to encourage good decisions, not just when maintaining software? Should that guidance be different from the guidance for people deciding which software to use? Software technologists regularly make decisions about which software they should use in their projects. Rarely is this done with an SCA tool, SAST tool, or an SBOM. Could we provide tools and metrics to guide decisions? How can we create less risk in our choices, providing more sustainable software infrastructure? From these questions, we set out to develop a model to make sense of the complexity.

More than Vulnerabilities and Licensing

Enter Viability: metrics (and metrics models) about open source projects that provide insight into the Community, Strategy, Governance, and Compliance/Security of those projects. This series of blog posts will cover the motivation and background, what metrics are included in the model, and finally how we can use those metrics to measure Viability in OSS. Viability is built with metrics detailed by CHAOSS, as CHAOSS has spent a long time working out how to make these metrics serviceable and understandable for technologists across geographies and industries.

For brevity and focus, Viability is split into four metrics models: Community, Strategy, Governance, and Compliance + Security. The individual value proposition of the metrics in each model is best described on their respective pages. The whole includes many metrics, with some overlap between sub-models. We did this so they may be used independently or in concert for a full picture of viable open source projects. We recommend using all of the models together, but mixing and matching them is also expected and appropriate.

The application of these metrics at Verizon allows us to operationalize tasks for choosing and maintaining software. While evaluating dependencies, engineers have a better idea of what parts of an open source project excel, and what may fall short. Depending on the application intended, OSPOs and application teams can decide if the risk is worth taking on a project together. 

Viability provides direction on which dependencies should be updated. Instead of lumping old dependencies into a ball of vague “technical debt”, we can estimate the fit of community, strategy, governance, and compliance/licensing against a particular project. We can also approximate the risk of not addressing projects, in a way that is quantifiable to stakeholders and useful when reviewing priorities.

Thanks, and come see us!

I’m excited to share this model with everyone; it’s been a while in the making. Be part of the next project, ask questions, and get involved with the CHAOSS working groups by dropping into the CHAOSS Slack. You can ping me (Gary White!) while you’re there.