CHAOSS Blog

If you would like to write an article for our blog, please reach out to our community manager – Elizabeth Barron.

The 2024 Community Survey is Open!


Big news! We have opened the 2024 CHAOSS Community Survey.

In 2022, CHAOSS conducted a survey to gain a deeper understanding of our community concerning diversity, equity, inclusion, welcomingness, and belonging. The insights we gained were incredibly beneficial, as DEI is something we continually strive to center and prioritize.

Much has changed in the past two years, so it’s time to revisit this survey. As before, we will use the results to get a sense of our community members’ experiences within CHAOSS, to identify any areas for improvement, and to gauge where we are on track.

If you are a CHAOTIC (either past or present), we highly encourage you to complete this survey! Even if you are brand new to the community, we’d love to hear from you. In this way, you can help make CHAOSS better for everyone.

This survey:

  • has 14 questions in 3 sections
  • should take around 10-15 minutes to complete
  • is completely anonymous and no personally identifiable information will be collected
  • is GDPR compliant
  • will be open until October 31

Let me stress that I am the only person with access to the raw data, and the scrubbed data will be available to a few CHAOTICS who will help us identify common themes. You are highly encouraged to be as candid as you like; we genuinely want your honest feedback.

Click here to take the 2024 CHAOSS Community Survey.

Thank you for your participation and for helping us center diversity, equity, and inclusion in the CHAOSS community!

Defining and Measuring the Value of Working in the Open: Key Takeaways from the workshop at “What’s Next for Open Source?”


On July 11, 2024 in New York City, open source enthusiasts and strategists gathered for an insightful workshop led by two CHAOTICS: Georg Link, Open Source Strategist at Bitergia, and Stephanie Lieggi, Executive Director at UCSC OSPO and CROSS, UC Santa Cruz. This event was held the day after the United Nations’ OSPOs for Good event, so a theme throughout the workshop was how to drive value for OSPOs across governments, agencies, non-profits, universities, and industry. The event focused on addressing a common challenge of OSPOs: demonstrating the value of working in the open and justifying resources for open source projects. Here are the key themes, talking points, and takeaways from the workshop.

Introduction: The Value of Working in the Open

The workshop opened with a compelling question: Have you ever struggled to show the value of working in the open? This question set the stage for exploring effective strategies to justify resources for open source initiatives. Georg introduced the Goal-Question-Metric (GQM) approach, which the CHAOSS project recommends. Because the workshop revolved around the GQM approach, we will explain it here.

The Goal Question Metric (GQM) Approach

At CHAOSS, we have advocated for the GQM approach because it is a structured method for deriving metrics that align with organizational goals. It involves three key steps:

  1. Goals: Identify and understand your organizational goals. These can vary significantly but typically include objectives like recruiting talent or enhancing community engagement.
  2. Questions: Break down these goals into specific, actionable questions. For example, to assess recruitment efforts, one might ask, “Who are important contributors?” or “How many did we help hire?”
  3. Metrics: Develop metrics to answer these questions. Metrics should be operational and data-driven, such as the number of contributions by name, hiring successes, or project activity levels. Some good data points, like the number of commits, may not be relevant to the question you need to answer.
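
To make this breakdown concrete, here is a minimal sketch of a GQM structure expressed in code. The goal, questions, and metric names are purely illustrative, not prescribed by CHAOSS.

```python
# Hypothetical Goal-Question-Metric breakdown; all names below are illustrative.
gqm = {
    "goal": "Recruit open source talent",
    "questions": {
        "Who are the important contributors?": [
            "Contributions per contributor",
            "Bus Factor",
        ],
        "How many contributors did we help hire?": [
            "Hires sourced from project contributors",  # internal HR data
        ],
    },
}

def metrics_for(goal_tree: dict) -> list[str]:
    """Flatten the questions into the list of metrics to collect."""
    return [m for metrics in goal_tree["questions"].values() for m in metrics]

print(metrics_for(gqm))
```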

Selecting the right metrics is crucial. They should align with organizational goals, and you should be mindful of data sources and potential unintended consequences.

After identifying the most effective metrics, tell a story with them to highlight how your OSPO contributed to the success of the organization.

Sharing Organizational Goals

In an interactive discussion, workshop participants shared diverse organizational goals, emphasizing the importance of aligning open source efforts with the broader mission statements of their organizations. We collected some bullet points on a screen and participants discussed many of them.

The most common goals included:

  • Being a thought leader: Working in the open to demonstrate the skill and knowledge of the organization to lead conversations.
  • Helping the community: Giving back to the community by working transparently and in the open.
  • Diversity: Including many voices, especially those who would not get a seat at the table if it weren’t for working in the open.
  • Tech literacy: Upskilling employees by allowing them to work in the open on new technologies.
  • Civic utility: Working in the open can bridge the gap between the government agency or organization and the constituents who are being served.
  • Interoperability: Working in the open allows for alignment of technological advances within the organization and in the larger open source community.

Case Studies

Finding ways to demonstrate the value of working in the open is not a new challenge, although the creation of new OSPOs is now elevating the stakes in sustaining these efforts.  To speed up the process of finding meaningful metrics and stories, the workshop participants were introduced to people who could share their past experiences.

CHAOSS Project: Community Health Analytics

Sophia Vargas from Google’s OSPO presented the CHAOSS Project, which focuses on community health analytics. Their key questions addressed workload distribution, project slowdown, and new contributor onboarding. The metrics they used included the percentage of work done by top contributors and response times.

Open Source Foundations

Arun Gupta from Intel discussed the benefits of open source foundations like the Linux Foundation, CNCF, the Eclipse Foundation, and the Apache Software Foundation. Metrics for evaluating these foundations included project counts, contributor numbers, code contributions, and financial investments.

Metrics for OSS Sustainability

Vladimir Filkov from UC Davis and Charlie Schweik from UMass Amherst highlighted the importance of process-driven sustainability metrics. Their approach involved comprehensive metrics and analytics using AI, linking the metrics to governance and policy actions.

Global Community Technology Challenge

Wilfred Pinfold from OpenCommons discussed efforts to get cities collaborating on consistent services through the Global Community Technology Challenge.

GitHub and WHO OSPO

Cynthia Lo from GitHub shared the development of an open source metrics dashboard for the social sector, focusing on repository health metrics that the World Health Organization found valuable.

https://github.com/WorldHealthOrganization/world-health-org-metrics

OSPO Levels and Responsibilities

Ana J. Santamaria from the TODO Group at the Linux Foundation discussed the various responsibilities of OSPOs and the importance of demonstrating ROI. She emphasized enabling non-technical managers and shared resources for further learning.

Small-Group Discussion

The second half of the workshop engaged participants in small-group discussions. In groups of three, they discussed their goals and the challenges of measuring success. Each member had the chance to share their own perspective while the others practiced active listening and then gave feedback. After the small-group discussion, the groups shared what they had discussed with the other groups.

Get Involved and Drive Open Source Success

This workshop provided a primer on strategies for measuring the value of working in the open. By leveraging the GQM approach, sharing organizational goals, and learning from real-world case studies, participants left with actionable insights to enhance their open source initiatives.

To continue your journey in open source success, we encourage you to explore the following resources:

  • Join the CHAOSS Community: Dive into the world of community health analytics and learn how to apply metrics to your projects. Visit CHAOSS’s Quick Start for New Contributors to get started and join CHAOSS on Slack.
  • Participate in OSPOs4Good: Join the Slack community for ongoing discussions, support, and collaboration.
  • Explore the OSPO Landscape: Learn more about OSPO responsibilities and best practices through the Linux Foundation’s TODO Group.

Your involvement is crucial to advancing the open source movement. Whether you are just starting or are an experienced strategist, there are always new ways to contribute and innovate. Let’s work together to demonstrate and maximize the value of working in the open!


Unlocking Insights: Practitioner Guides for Interpreting Open Source Metrics


Photo by Martin Wilner on Unsplash

I am thrilled to announce that we have just launched a series of Practitioner Guides to help people develop meaningful open source project health insights. 

Today, we released the first four guides in the series.

These guides are designed to be used by practitioners who may or may not be experts in data analysis or open source. The goal is to help people interpret the data about an open source project and develop insights that can improve that project’s health. The Practitioner Guides are for Open Source Program Offices (OSPOs), project leads, community managers, maintainers, and anyone who wants to better understand project health and take action on what they learn from their metrics. Each guide contains details about how to identify trends, diagnose potential issues, gather additional data, make improvements in your project, and monitor the results of those improvements.

We have more guides being developed already, and we welcome your contributions! You can propose a new guide, author a guide someone else has suggested, or submit a pull request to make our existing guides even better!

These guides are being developed within the CHAOSS Data Science Working Group. We have a Slack channel and meet every other week to talk about a wide range of data topics, so I hope you’ll join us!


GrimoireLab 1.0



For eight years, we have been working to produce the best platform for software development analytics possible. With the work of more than 150 developers and after over 11,600 commits, we’re excited to announce the release of the first major version of GrimoireLab.

GrimoireLab is an evolution of more than 10 years of work by Bitergia, the LibreSoft research group at URJC, and several contributors to the MetricsGrimoire and VizGrimoire projects. Since 2017, GrimoireLab has been part of The Linux Foundation’s CHAOSS community as one of its founding software projects.

GrimoireLab has become a common choice for open source project health dashboards. It has been used by some of the most important software companies and open source foundations in the world. The platform has also served as the underlying foundation for other applications, including Bitergia Analytics, OSS Compass, LFX Insights, Cauldron, and Mystic.

What’s included in this GrimoireLab release?

  • An automated platform to generate software analytics and insights. 
  • Data collection from more than 30 data sources.
  • Generation of more than 150 metrics and visualizations to understand activity, performance, and community of open source projects.
  • Identities manager to track the activity of an individual across platforms and organizations. 
  • Integration with third-party applications to visualize and analyze data (Kibana, OpenSearch Dashboards, Jupyter Notebooks).
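
For a taste of how the data collection layer can be used on its own, here is a minimal sketch based on Perceval, GrimoireLab’s data gathering component. The repository URL and local clone path are placeholders, and this is only an illustration, not a full platform setup.

```python
# Minimal sketch: collect raw commit data with Perceval (a GrimoireLab component).
# The repository URL and local clone path below are placeholders.
from perceval.backends.core.git import Git

repo = Git(uri="https://github.com/chaoss/grimoirelab-perceval",
           gitpath="/tmp/grimoirelab-perceval.git")

authors = set()
commits = 0
for item in repo.fetch():       # yields one dictionary per commit
    commit_data = item["data"]
    authors.add(commit_data["Author"])
    commits += 1

print(f"{commits} commits from {len(authors)} distinct author strings")
```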

Why are we releasing this major version right now?

As our roadmap lays out, we’ve identified some challenges that require a major shift in how the platform works. We expect that version 2.0 of GrimoireLab will be significantly different, improving on scalability and maintenance and addressing advancements in AI.  Therefore, we believe that releasing a stable version now will give our users predictability and stability moving forward.

What can you expect from now on?

Version 1.0 will be maintained as a stable release that will continue to power enterprise, open source, and research users. Meanwhile, we will create a branch named 1.x to fix bugs and to include new features that will be part of the next major release. The active development of GrimoireLab 2.0 will show up in the main branch. 

Some of the architectural changes detailed in our roadmap for version 2.0 include:

  • Maintenance effort will be reduced in version 2.0 with a graphical user interface and an API for configuring data collection in GrimoireLab. Currently, system administrators need to manually update text files when new data is to be collected.
  • Scalability and performance will be improved to handle more than 5,000 data endpoints and deliver insights faster. Currently, 3,500 highly active repositories require three days of data analysis before the data is ready for the user.
  • Integration with other tools will be made easier. Users will be able to use different tools for visualizing and analyzing the data from GrimoireLab.

Our thanks!

This release would not have been possible without the help of the entire community. We are deeply thankful to all our users. We would especially like to thank Álvaro del Castillo, Valerio Cosentino, Jesús González-Barahona, Alberto Pérez García-Plaza, J. Manrique López, Venu Vardhan Reddy Tekula, David Moreno, Gregorio Robles, Andy Grunwald, and the members of the CHAOSS project. 

We recognize Bitergia and The Document Foundation for being early adopters and the first to add their names to the new ADOPTERS.md file. If you use GrimoireLab, please add your organization to our ADOPTERS.md file so that we can recognize you. If you’ve done research with GrimoireLab, please add a citation and link to your publication.

The GrimoireLab Developers


Embarking on an Exciting Journey: CHAOSS Unveils Ambitious Goals for 2024


The CHAOSS project is gearing up for a transformative year in 2024, building on the momentum of a successful 2023. In a recent discussion, we shared insights into the project’s goals, which focus on significant growth, community engagement, and the establishment of international standards. Let’s delve into the exciting objectives that will shape the future of CHAOSS in the coming year.

Establishing CHAOSS Metrics as International Standards:

Imagine CHAOSS metrics and metrics models receiving a nod as formal international standards, specifically candidates for ISO standards. We’re collaborating with the Joint Development Foundation, part of the Linux Foundation, to create standards that will facilitate global engagement with these metrics and metrics models. This move lends more legitimacy to our hard work and opens doors for wider recognition and adoption, making it easier for management and C-level executives to embrace these metrics and metrics models. The journey involves thoughtful consideration of which metrics and metrics models are suitable candidates, navigating the standardization process, and even contemplating compliance programs. As we venture into this uncharted territory, we’re ready for the exciting challenges and opportunities.

Crafting Outreach Plans for Community Growth:

Building and sustaining the CHAOSS user community has always been a priority. In 2024, we’re taking a more deliberate approach to growing this community. We could do many things; as we craft a strategic advocacy plan, it will be key to prioritize activities that will genuinely make a difference in raising awareness and engaging users. From refining user-specific key messages to exploring the possibility of an Ambassador program, our goal is to make every member of the CHAOSS community feel welcomed and included. We’re also gearing up to share our insights with other open source communities, envisioning a toolkit that aids others in navigating the challenges of community outreach and promotion. And we’ll continue to explore downstream use and engagement with the artifacts our community creates, as in the case of Augur and GrimoireLab.

Fostering Collaboration Within the Contributor Community:

Contributors are the heart and soul of CHAOSS, contributing in diverse ways, from coding to blogging to organizing meetups. To foster collaboration, we continue to find ways to make it easier for everyone to contribute, appreciating all forms of participation. The ‘Chaotic of the Week’ program continues to shine a spotlight on community members, and you’ll see us at even more events, engaging both users and contributors. We’re thankful to our community for their blogging efforts; one shining example is the series of blogs penned by Gary White, principal engineer in Verizon’s Open Source Program Office, about the company’s use of CHAOSS metrics. Collaboration with other communities is a key aspect, including the TODO Group, the Linux Foundation, and universities, and we’ll focus on making these partnerships more visible and explicit. We’ll also revisit our approach to mentorship, thinking strategically and innovatively about the programs we engage in.

Offering SaaS Solutions for CHAOSS Metrics:

By providing hosted Software as a Service (SaaS) offerings, we aim to make CHAOSS metrics more accessible and user-friendly, reduce resource constraints, and bring these metrics to a wider audience. Hosted solutions can help simplify software installation, a hurdle identified in an earlier survey, and give people a taste of different CHAOSS software pieces so they can decide which fits their needs. These solutions aim to cater to a broader audience, accommodating various project needs, from large corporate open source program offices to smaller, scientific organizations. Today, Augur has a hosted instance, and the plan is to convert Augur data into metrics model data and implement standards for these models. In addition, we’re looking at securing funding to host the OSS Compass project, CHAOSS-affiliated software that provides a user-friendly interface for metrics.

Data-Driven Insights for Informed Communities:

Historically, the CHAOSS project has taken an agnostic approach to metrics interpretation, providing a wealth of metrics and tools and leaving individual projects to determine how to use and interpret these metrics based on their unique community dynamics. Recognizing the complexity and abundance of metrics, the goal now is to help users, primarily those new to metrics, derive meaningful observations for their communities. Collaborating with context working groups and creating insight guides, we’re bridging the gap for users looking to enhance their communities through informed metric analysis. We’re starting to work on these guides, and we encourage you to get involved and contribute to them. Additionally, we’re focused on building use cases, and community examples will be vital in helping users better interpret data and implement insights effectively.

Adapting to New Technologies:

While we’re not fully prepared to tackle this goal, we’re keeping a watchful eye on emerging technologies like artificial intelligence (AI). Our earlier exploration into AI’s impact on open source communities is just the beginning. New technologies beyond AI will continue to emerge, and as they do, we’ll be ready to adapt our metrics policies and practices accordingly.

With these goals set for 2024, we’re excited about the growth we’ve witnessed and the intentional approach we’re taking as we look forward. We invite you to be a part of this exciting adventure, contributing your unique perspective and energy to enrich our shared experience. Join us in shaping the future, addressing challenges, and celebrating successes together. The CHAOSS community is a vibrant space; together, let’s continue building something amazing!

 


CHAOSS DEI Project Badging


Here at CHAOSS, we are excited to announce the launch of CHAOSS DEI Project Badging. CHAOSS DEI Project Badging is an initiative developed to recognize open source projects that prioritize diversity, equity, and inclusion (DEI) work within their respective communities. The initiative uses CHAOSS DEI metrics as a benchmark to reflect on DEI efforts in an open source project. The objectives of CHAOSS DEI Project Badging are (1) to enable people to signal their ongoing efforts in improving and prioritizing DEI within their communities, (2) to recognize projects and communities for their DEI efforts, and (3) to help communities make informed decisions and take thoughtful actions in creating more inclusive and equitable environments. CHAOSS DEI Project Badging welcomes participation by badged projects in the evolution of the program.

How It Works

Self-Assessment and Documentation

Before applying for a CHAOSS DEI Project Badge, maintainers or project admins are encouraged to conduct a self-assessment of their project’s DEI efforts based on the following CHAOSS DEI metrics: Project Access, Inclusive Leadership, Communication Transparency, and Newcomer Experience. After the self-assessment, maintainers can document how the project attends to and prioritizes DEI in these areas in a markdown file called DEI.md. This DEI.md file should live in the project’s repository so the community can easily give feedback. A guide for putting together your DEI.md file is available through the DEI.md Guide.

Badging Application

Once the DEI.md file is published and publicly available, project owners can proceed to apply here. The applicant must be a project owner and the repository that holds the DEI.md file must be specified.

Badging Evaluation

The review follows an automated process in which a CHAOSS bot scans the project repository for the presence of a DEI.md file. The bot will review the DEI.md file for relevant information provided by the maintainer and its alignment with the CHAOSS DEI metrics stated in the DEI.md template to determine eligibility for the badge. A project badge will be issued upon successful review of the DEI.md file. The four CHAOSS metrics used in the DEI.md file include:

  • Project Access
  • Inclusive Leadership
  • Communication Transparency
  • Newcomer Experience

The evaluation of your DEI.md file is based on it being publicly available and demonstrating attention to the four CHAOSS metrics. Ultimately, we are ensuring that you are providing your community members with a well-formed DEI.md file. We do not evaluate how you reflect on the four metrics, as each community will do things differently. If community members have concerns about what is expressed in their community’s DEI.md file, we ask that you discuss this within your community to ensure the DEI.md file appropriately reflects your community’s DEI efforts.
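
For readers curious about the mechanics, the bot’s first step, verifying that a DEI.md file is published, could look roughly like the sketch below, which uses the GitHub contents API. This is not the actual CHAOSS bot, and the owner and repository names are placeholders.

```python
# Hedged sketch (not the actual CHAOSS bot): check whether a public repository
# publishes a DEI.md file at its root, using the GitHub contents API.
import requests

def has_dei_file(owner: str, repo: str) -> bool:
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/DEI.md"
    response = requests.get(url, timeout=10)
    return response.status_code == 200   # 404 means no DEI.md at the repo root

# Placeholder owner/repo for illustration.
print(has_dei_file("example-org", "example-project"))
```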

Recognition and Badging

Projects that meet the established criteria will receive the CHAOSS DEI Project Badge, which can be prominently displayed on the project’s website, documentation, or other relevant platforms. The badge signifies the project’s commitment to DEI and highlights its attention to DEI best practices.

Continued Engagement

Once the badging process is completed, you can re-apply for a project badge — we recommend it after about a year. We will also be developing CHAOSS Project Badges for Silver, Gold, and Platinum levels in the future that include new CHAOSS metrics.

Getting Involved

If you would like to help CHAOSS build the future of DEI Project Badging, we welcome your participation! You are encouraged to join our DEI Working Group meetings every Wednesday at 10:00 am US Central/Chicago Time. Details on how to join these meetings can be found on the CHAOSS Calendar. You are also welcome to join the CHAOSS Community Slack and connect with us there.


2023 CHAOSS Community Wrap-Up


I’m a big fan of periodically taking time to reflect on the past, and the new year seems like an excellent time to do so. So let’s look at some of the highlights from the CHAOSS community in 2023!

  • Our Slack community doubled in 2023, as we went from about 900 members at the end of 2022 to about 1800 members currently.
  • We hired a Director of Data Science (the one and only Dr. Dawn Foster) and launched our Data Science Initiative.
  • We launched CHAOSS Latin America and CHAOSS Balkans as our newest chapters (Welcome to Selene Yang and Kristi Progri as our Chapter Leaders!)
  • We completed the DEI Project Badging pilot, working closely with GitHub’s All In project to host this within the CHAOSS project.
  • In addition to our usual CHAOSScon EU and CHAOSScon NA events, we held our first CHAOSScon Africa event, which was a huge success!
  • We retired our Risk and Evolution working groups and launched two new Context Working Groups instead: OSS in Universities and Scientific Communities.
  • We ushered in some new Board Members: Anita ihuman, Ruth Ikegah, Brian Proffitt, and Kevin Lumbard. Dr. Dawn Foster became a new Board Co-Chair, joining Sean Goggins in the leadership role. Thank you to Nicole Huesman, our outgoing Co-Chair, for all of her hard work in this role over the past two years.
  • We expanded our outreach to span across many global open source conferences (and we gave away a couple of LEGO Globes to represent the global nature of the CHAOSS project).
  • Our number of released Metrics Models grew from 1 to 16! 
  • We badged 50 open source events in our DEI Event Badging Initiative and grew our team of active Badgers to 24.
  • We shared our framework for surveying a community around DEI initiatives and released the results of our own internal DEI survey.
  • We shared Anita ihuman’s incredible research on CHAOSS DEI metrics.
  • We started a CHAOTIC of the Week series to highlight some of our community members and the great work they do in the community.
  • We launched a program called Tour Guides to help newcomers find their way.
  • We recently restarted the CHAOSSCast podcast with a new episode about every 2 weeks highlighting something interesting from the CHAOSS community.
  • We received continued support from the Alfred P. Sloan Foundation and the Ford Foundation to help us keep the CHAOSS community one of the best communities on the planet. 

2023 was a time for growth and represented a subtle shift in how CHAOSS thinks about metrics. We have even more in mind for 2024, which you can learn more about in our recent podcast: CHAOSS Goals for 2024 and Beyond. It has never been more exciting to be a CHAOTIC! We truly hope you can join our community and help all of us improve the health of our open source communities. 


Guide for OSS Viability: A CHAOSS Metric Model

Photo by William Bout on Unsplash

This guide is part of a three-part series. This is part three. Read part one for context and part two for a deep dive into the metrics.

In the first two posts of this series, we introduced the CHAOSS Metrics (Super) Model for Viability. We then covered what exactly comprises that metrics model and gave brief impressions of why and how the metrics fit together as a whole.

In this guide, we’ll talk about what’s possible with the CHAOSS tools and how we can assemble a Viability metrics model from them. Namely, we’ll focus on GrimoireLab and Augur.

Consider the chart below to see the breakdown of what is available for which service.

Breakdown by Category

Category              | Metric                            | GrimoireLab   | Augur
Strategy              | Programming Language Distribution | Available     | Available
Strategy              | Bus Factor                        | Available     | Available
Strategy              | Elephant Factor                   | Available     | Available
Strategy              | Organizational Influence          | Available     | Available
Strategy              | Release Frequency                 | Not Available | Available
Community             | Clones                            | Not Available | Not Available
Community             | Forks                             | Available     | Available
Community             | Types of Contributions            | Not Available | Not Available
Community             | Change Requests                   | Available     | Not Available
Community             | Committers                        | Available     | Not Available
Community             | Change Request Closure Ratio      | Available     | Available
Community             | Project Popularity                | Available     | Available
Community             | Libyears                          | Not Available | Available
Governance            | Issue Label Inclusivity           | Available     | Available
Governance            | Documentation Usability           | Not Available | Not Available
Governance            | Time to Close                     | Available     | Available
Governance            | Change Request Closure Ratio      | Available     | Available
Governance            | Project Popularity                | Available     | Available
Governance            | Libyears                          | Not Available | Available
Governance            | Issue Age                         | Available     | Available
Governance            | Release Frequency                 | Not Available | Not Available
Compliance / Security | OpenSSF Best Practices            | Not Available | Not Available
Compliance / Security | License Coverage                  | Not Available | Available
Compliance / Security | OSI Approved Licenses             | Not Available | Available
Compliance / Security | Licenses Declared                 | Not Available | Available
Compliance / Security | Defect Resolution Duration        | Available     | Not Available
Compliance / Security | Libyears                          | Not Available | Available
Compliance / Security | Upstream Code Dependencies        | Not Available | Not Available

A summary of available CHAOSS metrics and their fit to Viability across GrimoireLab and Augur.

Breakdown by Tool

Augur Summary

Category              | Available | Not Available
Community             | 50.00%    | 50.00%
Compliance / Security | 57.14%    | 42.86%
Governance            | 75.00%    | 25.00%
Strategy              | 100.00%   | 0.00%
Grand Total           | 67.86%    | 32.14%

GrimoireLab Summary

Category              | Available | Not Available
Community             | 62.50%    | 37.50%
Compliance / Security | 14.29%    | 85.71%
Governance            | 62.50%    | 37.50%
Strategy              | 80.00%    | 20.00%
Grand Total           | 53.57%    | 46.43%
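
As a quick sanity check, the summary percentages above can be reproduced from the per-metric table. The sketch below uses only a small sample of rows to illustrate the arithmetic.

```python
# Reproduce per-tool coverage percentages from (a sample of) the table above.
from collections import Counter, defaultdict

# (category, metric, available in GrimoireLab, available in Augur)
rows = [
    ("Strategy", "Bus Factor", True, True),
    ("Strategy", "Release Frequency", False, True),
    ("Community", "Clones", False, False),
    ("Community", "Forks", True, True),
]

def coverage(tool_index: int) -> dict[str, float]:
    counts = defaultdict(Counter)
    for category, _metric, *availability in rows:
        counts[category][availability[tool_index]] += 1
    return {cat: c[True] / sum(c.values()) * 100 for cat, c in counts.items()}

print("GrimoireLab:", coverage(0))
print("Augur:", coverage(1))
```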

While we can’t get every metric from every service, we can get a good majority of what we need through a mix of GrimoireLab and Augur. We intend to continue building the ability to get this data into services like GrimoireLab and Augur, then update the CHAOSS metrics wiki to reflect how we’ve done it.

Augur provides the most metrics overall for three categories, while GrimoireLab is best for Community. GrimoireLab also provides Sigils, which creates dashboard panels for a good number of the metrics you may want to use. Augur also has a tool supported by Red Hat that visualizes the metrics within it.

How Does This Guide My Decisions?

Depending on your use case, you may find different opportunities to use the Viability model. It was originally developed for evaluating the use of open source products, and your thresholds for each model category will vary based on how much risk you are willing to assume.

For example:

  • Organizations starting their journey in governing Open Source Software usually start with Compliance and Security, focusing on vulnerabilities and licensing information to choose their software.
  • Large companies may consider strategy to be the next-most important. Given that many organizations build software that is in use for years, the importance of the strategy in a project — and indeed who maintains that strategy — can be a critical decision.
  • Community is important for cutting-edge or newer implementations of technology. While older technology will likely have a less volatile community, where maintainers and the flow of new contributions can be judged over time, a new project may need a stronger community with more vigilance on its competitors to ensure that a software stack isn’t abandoned.
  • Governance is crucial for organizations that intend to engage the open source community or contribute to a project to shape new functionality. If an organization is willing to commit time and resources to maintaining a project, the Governance of that project becomes important to consider.

Getting Started

Consult the documentation of GrimoireLab and Augur for more details on how to get started. Based on what your team needs or cares about, consider choosing the tool that has the highest coverage, or use them both to maximize your results. If you find that I’ve gotten some of these metrics wrong, I’d love to know! Drop by our OSPO working group, metrics working group, or somewhere else to publish your contributions!

Until then, you can find me on the CHAOSS Community Slack as Gary White. Thanks for reading!


Group picture of the attendees of OFA. Lots of smiling faces with the virtual attendees on the screen in the background.

OFA Symposium: Open Source Research Collaboration


Last week, I attended the OpenForum Academy Symposium in Berlin. This is OpenForum Europe’s (OFE) academic conference around open source, with the goal of bringing researchers, industry, and policy folks together to share ideas and eventually generate more academic research that is useful for open source policy people. In an effort to avoid a 10-page blog post, I’ll only cover the highlights of a few talks that I found particularly interesting and that seem most relevant for CHAOSS.

In the first keynote, Julia Ferraioli and Juniper Lovato talked about the Beyond the Repository ACM Paper that they co-authored with Amanda Casari. I personally think this should be required reading for anyone doing research in open source. The paper goes in depth into why researchers should think about how their methods and results impact entire open source ecosystems, including the people working within the projects being studied. The paper is organized into nine best practices that help researchers understand how they might design their studies in ways that keep the ethical implications and ecosystem impact of their research top of mind. In particular, they suggest that researchers actively work with the practitioners involved in the projects as they look beyond the repository to gather data and consider the ramifications of the research.

The keynote was followed by several presentations focused on Open Source Communities and Cooperatives. Jérémie Haese talked about the working paper (to be published in Management Science), Open at the Core: Moving from Proprietary Technology to Building a Product on Open Source Software, which he is writing jointly with Christian Peukert using Microsoft’s move to Chromium as a case study. Among other things, they saw an increase in the pool of contributors, along with more people reporting security vulnerabilities due to increased bug bounties offered by Microsoft, resulting in more vulnerabilities being fixed.

Jorge Benet presented A Cooperative Model for Digital Infrastructure and Recommendations to Adopt It, which has been fully published. The report discusses findings from 21 digital infrastructure projects from 12 cooperatives across 7 countries, with a model that looks at value creation, proposition, and capture, and recommendations for projects wishing to adopt the model.

Elçin Yenişen Yavuz talked about how user-led open source foundations are different from other types of foundations. While many of us work on projects in foundations led by communities and vendors, the foundations led by users of the software (e.g., Apereo Foundation, Academy Software Foundation, openMDM) have more direct benefits for the end users, including more control over functionality, shared resources, sustainability, and productivity. Results of some of this research can be found in the Problems, Solutions, and Success Factors in the openMDM UserLed Open Source Consortium paper in Communications of the Association for Information Systems.

There were a few talks about the legal implications of open source, which is a bit less relevant for the CHAOSS audience, but there was one talk from Wayne Wei Wang, Open-Source Commons Made in China: A Case Study of OpenAtom Foundation and Mulan-series Licenses, that I found interesting, partly because some of us have been working with the folks at openEuler, which is an OpenAtom project under a Mulan-series license. Wayne talked about some ways that open source is different in China due to Chinese state entrepreneurialism and the relationships between central planning and open source. This is based on Wayne’s research paper: China’s digital transformation: Data-empowered state capitalism and social governmentality.

I found Knut Blind’s talk, Open Source in the Context of Innovation, particularly interesting, since it talked about various measures of innovation in open source. He shared the stage with a handful of others as they talked about how existing research on innovation using patents and papers can be compared to open source innovation by looking at open source contributions (like commits) as a comparison to patents and cited papers, which aren’t as dissimilar as they might seem if you think about how they all share a similar process that goes from submission through review and finally into publication / release. They also talked about using the GitHub Innovation Graph to look at open source innovation for various national economies. Finally, they talked about how dependencies can be used when looking at innovation, but that there are some challenges with this approach when you try to compare projects to understand innovation. For example, JavaScript modules tend to be designed for integration into projects, so they will have many more dependencies than C/C++ projects, which are often designed as standalone apps.

Nataliya Wright’s talk, Open Source Software and Global Entrepreneurial Growth, looked at how contributing to open source can spur global entrepreneurial growth. They found that contributing to open source predicts higher valuations and funding for IT ventures (note that some, but not all, of this is related to selection bias based on the types of companies and founders that contribute). While the talk was based on new research, some of their early stage results can be found in the Research Policy paper, Open Source Software and Global Entrepreneurship.

This was a really interesting conference with many more talks than I could cover here, and I’m already looking forward to next year’s conference!


Metrics for OSS Viability


In the last post, we gave a background on Viability in Open Source. We covered the motivation and implementation plan for how we’ll collect and measure metrics about the open source software we use or might use at Verizon.

Considering all these metrics together takes a list and a laptop, and maybe a ruler.
Photo by Marissa Grootes on Unsplash

In this post, we will cover the nitty-gritty details of which metrics we’re using in our model and why they fit together. Rather than covering each metric individually, I’ll summarize which metrics are in each category and give an overview of why they combine into a good picture of the model. We’ll also cover the value proposition of metrics that cross between the categories comprising the full model.

What follows is the list of metrics comprising the Viability model, and why they are included:

Compliance + Security

Just for this model:

OpenSSF Best Practices

  • This is a proxy metric for us to ensure that a project responds to security incidents and has enough protections in place for us to infer a reliable Compliance and Security strategy.
  • This allows us to avoid using costly SCA/SAST scanning on every open source project we consider.

License Coverage

  • We use this to make decisions about the risk of using unlicensed software, or to determine whether our use case for the software is compatible with the license provided.

Licenses Declared

  • This lets us compare our intended usage and project policy against the policy of our dependencies.

OSI Approved Licenses

  • Knowing we’ve reviewed the implication of each OSI license provides confidence that we understand how to use the software in compliance with those licenses.

Defect Resolution Duration

  • This metric allows us to compare, apples to apples, the radically different rates at which projects respond to defects when they occur.
  • Our understanding of this metric across projects and across a suite of dependencies calibrates our risk profile and tolerance.

Upstream Code Dependencies

  • We include this metric in the Viability model to ensure that a project’s dependencies are also included in any viability evaluation we perform.
  • It is important to consider an application alongside dependencies it shares to give a full picture of a particular project’s risk portfolio.

Shared between models:

Libyears

  • “A simple measure of software dependency freshness. It is a single number telling you how up-to-date your dependencies are.”
  • This metric allows for apples-to-apples comparisons between projects of freshness. 
  • Scales to show complex projects with many dependencies, and the risk associated with using those projects with the massive maintenance cost behind the scenes.
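
To make the libyear idea concrete, here is a minimal sketch of the calculation; the dependency names and release dates are made up for illustration.

```python
# Minimal libyear sketch: for each dependency, how many years separate the
# release in use from the newest release. All names and dates are illustrative.
from datetime import date

dependencies = [
    # (name, release date of the version in use, release date of the latest version)
    ("libfoo", date(2021, 3, 1), date(2024, 6, 15)),
    ("libbar", date(2023, 11, 20), date(2024, 1, 5)),
]

libyears = sum(
    (latest - in_use).days / 365.25 for _name, in_use, latest in dependencies
)
print(f"Total dependency freshness lag: {libyears:.1f} libyears")
```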

What the metrics mean for Viability

Overall, we use this metrics model to gauge how well both the community and the maintainers of an application consider its security and compliance. We expect to use these indicators to gauge risk. They range from showstoppers like licensing, where a license can be flatly incompatible with our intended use case, through security-centric badges and metrics, to how quickly and regularly a team addresses the dependencies and defects reported against their application.

Additionally, as in other models, some metrics are very tricky to trace or visualize. We leave a healthy amount of flexibility in how we rank applications against tricky-to-gather metrics, and we recommend that users of our models do the same. For example, much like Defect Resolution Duration, the appetite for how many Libyears are appropriate for a project will always be up to maintainers. Depending on how or where an app may run, and how frequently we can update it, we think about Libyears critically.

Like other shared metrics, Libyears notably contributes to three of our metrics models: Compliance/Security, Governance, and Community. We believe it fits particularly well in Compliance + Security because it gives an indicator not only of how critically maintainers consider compliance and security in their own project, but also in the projects they depend on.

Governance

Just for this model:

Issue Label Inclusivity

  • Provides an effective measurement for intentional aggregation of issues
  • Indicates how community skills are applied to project responsibilities.

Documentation Usability

  • Strong, usable documentation is required.
  • Though this can include a lot of manual effort, this is a very important metric to attempt to collect.

Time to Close

  • How long it usually takes for a contribution to the project to make its way to the codebase
  • Will give us an idea of consistency in the project (median, mean, mode)
    • This is not to be confused with defect resolution (which we hold to a higher standard)
  • Other processes may occur alongside opening and closing a PR, for example, but this provides enough of an indicator to be inherently useful to the Governance of a project.

Issue Age

  • How long questions / suggestions / etc. generally hang around a project.
  • Simple to understand, easy to dive further into by looking at what issues are about.

Shared Between Models:

Change Request Closure Ratio

  • Compare the drift of new requests to their rate of closure. 
  • Gives us an idea of how the project is maintained – or if more maintainers might be needed to keep up with demand for new features.
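
As a quick illustration, the closure ratio itself is a simple calculation over a time window; the counts below are made up.

```python
# Change Request Closure Ratio sketch: change requests closed in a window
# divided by change requests opened in the same window. Counts are made up.
def closure_ratio(opened: int, closed: int) -> float:
    return closed / opened if opened else float("nan")

print(f"CRCR for the quarter: {closure_ratio(opened=80, closed=64):.2f}")  # -> 0.80
```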

Project Popularity

  • Aggregate of other smaller metrics one might expect to find in a cursory glance over a project landing page. 
  • Likes, stars, badges, forks, clones, downstream dependencies, mentions on social media, and more.

Release Frequency

  • Knowing the timing of regular releases, and understanding the frequency and cadence at which we may expect security patches and new features, helps us identify how well our project’s release cadence and strategy fits with potential dependencies.
  • This is somewhat a proxy for LTS / release strategy that may otherwise be available for larger projects.

Libyears

  • “A simple measure of software dependency freshness. It is a single number telling you how up-to-date your dependencies are.”
  • This metric allows for apples-to-apples comparisons between projects of freshness. 
  • Scales to show complex projects with many dependencies, and the risk associated with using those projects with the massive maintenance cost behind the scenes.

What the metrics mean for Viability

These metrics are useful to show the intention, or lack of intention, in a project’s Governance. For example, if there’s a lack of inclusive labels on issues, it identifies a gap in welcoming new contributors and sorting existing contributors into workstreams, and the Governance of the project is reflected in turn. The same goes for many of these metrics. The ability to contribute to, understand, or depend on a project is highly coupled to the effort behind its Governance.

This isn’t to say poor Governance metrics indicate that a project is governed by fools. A low CRCR, for example, may simply indicate that there are not enough maintainers to support a contributing community. A lack of new issues could be the result of a recent large release that addresses many recurring issues. We aggregate these metrics not to cast doubt on the maintainers of projects, but to identify the Governance capacity and effort across projects in a software portfolio.

If some of these metrics feel like they could be strong community metrics, I think they can be. Many of the shared metrics here are a combination of the effort a community has with a project, and the effort of the body governing a project. We think the overlap of shared metrics captures this relationship well, considering the responsibility contributors and maintainers share in creating OSS.

Community

Just for this model:

Clones

  • How many times a project has been pulled from a repository into a local machine. 
  • Indicator of how many people are using or evaluating the project.

Technical Forks

  • How many forks have been made of a given project.
  • Forks, in our estimation, are normally performed to create contributions through changes or to take the project in a new direction in their own community.

Types of Contributions

  • Not all contributions are code: strategy, issues, reviews, events, writing articles, and more give a strong indication of the maintainers’ ability to grow or continue building a project.
    • Likewise, if contributions are coming in only as requests with no code alongside them, we can assume the project doesn’t have active contributors.
    • A heavily skewed distribution might tip the scales on whether we should or should not recommend a project as viable.

Change Requests

  • The volume of regular requests, or the emergence of a pattern of change requests (around holidays, weekends, weekdays) can tell us a lot about a project. 
  • By identifying trends, we can make many educated guesses about the strength, patterns, and sustainability of a project Community.

Committers

  • We don’t say “no project under x committers is viable,” because committer count and viability are not directly related.
    • We care about committer trends.
Shared Between Models:

Change Request Closure Ratio

  • Compare the drift of new requests to their rate of closure. 
  • Can help indicate a cooling or heating contribution community by monitoring merged community requests.

Project Popularity

  • Aggregate of other smaller metrics one might expect to find in a cursory glance over a project landing page. 
  • Likes, stars, badges, forks, clones, downstream dependencies, mentions on social media, and more.

Libyears

  • “A simple measure of software dependency freshness. It is a single number telling you how up-to-date your dependencies are.”
  • This metric allows for apples-to-apples comparisons between projects of freshness. 
  • Scales to show complex projects with many dependencies, and the risk associated with using those projects with the massive maintenance cost behind the scenes.

What the metrics mean for Viability

With Community, we seek to understand the “tinkering” that happens with a project, as well as being able to measure the contributions that are made. Clones and forks indicate how many users of software have pulled it to build from source, inspect the source code, submit a contribution, or take the project in a new direction. That flavor of popularity feels meaningful to trace community engagement in a project. 

With committer trends, types of contributions, and change requests, we can see how a Community is interacting. Maybe more markdown RFCs are created than features, maybe vice versa. With an understanding of what types of contributions are made, and how regular they are, we make a more informed judgment on project viability. For example, we think it’s reasonable to expect that a project which has shed 90% of its committers in a three-month period is less viable than one with a stable (flat) committer trend. The inverse could indicate a growing or stable project gaining popularity around a particular technology trend. Where some “tinkering” metrics feel micro, other metrics take a macro lens.

By measuring some shared metrics, we give this model an opportunity to be viewed from the perspective of how much the community maintains a project, and how much interest there is generally. We find this distinct from the Governance angle, even with significant overlap, as trends in these metrics are almost never entirely the fault of the community or of the maintainers of a given project. The numbers could be meaningful for either space, so they exist in both models.

Strategy

Just for this model:

Programming Language Distribution

  • We have strong opinions on which languages are viable at Verizon.
    • Many companies have similar standards and expectations – or normally center around a particular language. 
  • Unsupported or unused languages are a strong indicator against project viability for us.

Bus Factor

  • A count of the fewest number of committers that comprise 50% of activity over a period. 
  • We can better understand the risk of using a particular project or set of projects, regarding how much support the project would get if top contributors left.

Elephant Factor

  • Elephant factor is a lot like bus factor – but it counts the fewest “entities” that comprise 50% of activity on a project. 
  • We use this to infer the influence companies have on a project, or how detrimental it would be if that company shifted priority.
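
Both factors boil down to the same calculation applied to different groupings, as in this hedged sketch with made-up commit counts.

```python
# Bus factor and elephant factor sketch: the smallest number of committers
# (or organizations) whose commits cover 50% of activity in a window.
def smallest_majority(counts: dict[str, int]) -> int:
    total = sum(counts.values())
    covered, members = 0, 0
    for n in sorted(counts.values(), reverse=True):
        covered += n
        members += 1
        if covered >= total / 2:
            return members
    return members

# Made-up commit counts for illustration.
commits_by_author = {"alice": 120, "bob": 40, "carol": 25, "dave": 15}
commits_by_org = {"example-corp": 150, "unaffiliated": 50}

print("Bus factor:", smallest_majority(commits_by_author))    # -> 1
print("Elephant factor:", smallest_majority(commits_by_org))  # -> 1
```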

Organizational Influence

  • Organizational Influence measures the amount of control an organization may have in a project. This is an estimate, and an aggregation of several other metrics. 
  • Details in the link, but organizational diversity is one example of a metric that can aggregate to create an imprint of influence. 

Shared Between Models:

Release Frequency

  • Knowing the timing of regular releases, and understanding the frequency and cadence at which we may expect security patches and new features, helps us identify how well our project’s release cadence and strategy fits with potential dependencies.
  • This is somewhat a proxy for LTS / release strategy that may otherwise be available for larger projects.

What the metrics mean for Viability

The metrics we trace in this model capture the strategy of a project and the expected influence from individuals and organizations. For example, with a bus factor of 1, it’s very possible that burnout or other factors could pull that one person away. With a more resilient count of folks, we are more likely to see a stable and viable maintenance strategy. As a highly regulated and large entity, Verizon considers which other entities might be developing critical infrastructure for our applications. We consider our risk appetite and tolerance in the scope of a project we use, to ensure we don’t rely too heavily on one particular provider. These metrics continue our mission of managing that risk profile.

We share Release Frequency between Strategy and Governance. This captures the overlap between how the maintainers of a project provide both a governance plan and a maintenance strategy.

Wrap Up

Compliance + Security, Governance, Community, and Strategy. These are the tenets we use for our Viability Metrics (Super) Model. I’m excited to share this model with the broader software community for input and feedback, and to make it better over time. We will share our lessons learned and what practices we find the most effective for maintaining a viable software portfolio as we iterate.

Tune in next time, when we’ll share a guide on OSS viability, including recommended tools to set up Viability monitoring on projects. If you’d like to connect and talk about this post, join the CHAOSS Community Slack and find me, Gary White!

 
