Artificial Intelligence: An Open Source Disruptor

By Matt Germonprez, Dawn Foster, and Sean Goggins 

 

Corporations have increased their investments in open source because it allows them to share the weight of non-differentiating technology costs with other organizations that rely on the same core technologies, and consequently to innovate more quickly and increase organizational value. In many cases, the financial leverage gained through open source engagement is substantial, visible, and measurable. However, open source engagement is also a cost each organization must assess. For organizations considering open source engagement, this means evaluating the ratio of increased value to the costs of engagement – a ratio that may very well be directly affected by AI. 

Open source has benefited an untold number of industries, and engagement carries well-known, positive outcomes for companies: leveraged development, distributed software maintenance costs, improved time to market, increased innovation, and talent acquisition. However, these outcomes, derived from the value leverage that open source provides, can now potentially be found elsewhere, most notably through artificial intelligence built on large language models. 

It is becoming increasingly clear that AI will be an open source disruptor that alters how companies think about things like the provenance of source code, as seen in the Linux Foundation’s recent release of its Generative AI Policy. The LF policy highlights key areas of concern, including contributions, copyright, and licensing. Other key efforts to address open source and AI include the OSI’s deep dive in Defining Open Source AI and the AI Verify Foundation’s focus on building ethical and trustworthy AI. These initiatives address key issues raised by AI in open source processes and the need to make AI accessible to all. Each rightfully assumes a future that includes AI and prepares its audience for the key issues that will require attention. 

AI is already emerging as a disruptive factor in the work of open source communities. Some communities are struggling with a volume of low-quality, AI-generated code contributions. People often contribute to open source projects, particularly high-profile ones, to build their resumes and GitHub profiles; the contribution itself is a secondary goal in service of that primary one. AI now offers a way to reduce the work needed for the secondary goal while still pursuing the primary one. As a result, open source projects are seeing an increase in nonsense code contributions that create additional work for already overloaded project maintainers. 

Within any company, AI has the capacity to change how engagements with open source projects are evaluated and approached. Known reasons for corporate engagement with open source projects include reducing the internal resources needed for software development and maintenance and improving product time to market. To obtain these positive outcomes, companies accept the costs of engagement – assigning employees to contribute to projects and become leaders in them – because the benefits offset those costs. Open source program offices aim to lower the costs and amplify the benefits of these engagements. But what if AI, used to increase development speed, lowers those costs further and retains the benefits associated with developing software in the open, without the expense of engaging with open source communities? What if a company could still achieve cost and time savings without working in public? What if conversations that were otherwise present in open source projects and communities could now take place as well-defined AI prompts? Should open source program offices be focusing on working with AI, in addition to working with open source projects? 

The questions that need more exploration are premised on how AI could alter the cost-benefit ratio of corporate software development relative to engaging with open source projects, across three key areas: 

  1. Community-level: Working in a Community
    1. Does AI increase open source community-level noise? 
    2. Are AI-developed contributions distinguishable from those developed by individuals? 
    3. Does AI reduce the volume of corporate engagement within open source communities? 
  2. Ecosystem-level: Working in an Ecosystem
    1. Does AI reduce the need for companies to perform ecosystem-level monitoring? 
    2. Does AI reduce the need for companies to engage with open source communities? 
  3. Policy-level: Addressing Licensing and Security Concerns
    1. Does AI provide a source of legal exposure for communities and companies?
    2. Will AI be used to mask malicious code within communities and companies?

Underlying these questions is a certainty that AI will alter the dynamics of collaboration in open source engagement, and we suggest that this new reality be addressed directly. AI will likely alter cost ratios within individual companies, and there is uncertainty about how these changes will shift, or possibly erode, the critical value presently derived from open source engagement. One core challenge will be identifying, with deliberateness, corporate approaches to AI within open source that affect communities, ecosystems, and policies. To date, corporate engagement with open source has recognized that a rising tide lifts all boats. Will AI change our views of the tide?

Starter Project Health Metrics Model

Have you ever been in a position in your company or community where you would like to start getting a sense of the health of a project – but you don’t know where to begin? 

People often struggle to get started with measuring project health in a way that allows them to draw meaningful conclusions without becoming overwhelmed. Measuring key aspects of project health is an essential first step toward understanding how an open source project can be improved and deciding where to focus improvement efforts. The Starter Project Health Metrics Model was published by the CHAOSS project to address this very issue. The following four metrics are a great way to get started: 

  • Time to First Response: Determine the amount of time between when an activity (e.g., an issue or change request) was opened and when it received its first response from a human. A quick response helps contributors feel welcome and appreciated.
  • Change Request Closure Ratio: Measure the ratio between the total number of change requests opened during a time period and the total number closed in that same period. This helps to determine whether your project has enough maintainers to keep up with incoming contributions.
  • Bus Factor: Determine the smallest number of people who make 50% of contributions to understand whether your project would be in jeopardy if one or more key contributors left.
  • Release Frequency: Determine the frequency of project releases (including point releases with bug fixes) to make sure that security fixes, new features, and bug fixes are available to your users.

The model is available at: https://chaoss.community/kb/metrics-model-starter-project-health/ 
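
For readers who want to experiment before adopting dedicated tooling (CHAOSS software such as Augur or GrimoireLab can compute these for you), here is a minimal Python sketch of three of the four metrics. It assumes you have already exported commit authors and change-request timestamps from your repository (via git log or a platform API); the function names and data shapes are illustrative assumptions, not part of the CHAOSS definitions.

```python
# A minimal sketch, not a CHAOSS reference implementation. It assumes commit
# authors and issue/PR timestamps have already been extracted elsewhere.
from collections import Counter
from datetime import datetime


def time_to_first_response(opened: datetime, first_human_response: datetime) -> float:
    """Hours between an activity being opened and its first human response.
    Bot responses should be filtered out before calling this."""
    return (first_human_response - opened).total_seconds() / 3600


def change_request_closure_ratio(opened: list[datetime], closed: list[datetime],
                                 start: datetime, end: datetime) -> float:
    """Ratio of change requests closed to change requests opened in [start, end)."""
    n_opened = sum(start <= t < end for t in opened)
    n_closed = sum(start <= t < end for t in closed)
    return n_closed / n_opened if n_opened else float("nan")


def bus_factor(commit_authors: list[str]) -> int:
    """Smallest number of people who together account for 50% of contributions."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    covered = 0
    for rank, (_author, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered * 2 >= total:  # this contributor pushes coverage past 50%
            return rank
    return 0


# Example: "ana" and "ben" together account for over half of these commits.
print(bus_factor(["ana", "ana", "ana", "ben", "ben", "cho", "dee"]))  # -> 2
```

Release Frequency is omitted here since it can be read directly from a project’s tags or release page.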

CHAOSScon EU 2023 Summary

By Matt Germonprez and Georg Link

CHAOSScon 2023 Europe is now complete! Thanks to everyone who helped put on another wonderful event and thanks to everyone who took the time to attend and participate! It was really great to connect with everyone in the beautiful city of Brussels. 

Photo Credit: Sean Goggins

This CHAOSScon, we led several discussions centered on two key questions. We asked participants to reflect on the questions in small groups and report back. This post highlights a few of the key takeaways and also provides the full set of captured comments, including those that didn’t make it into the highlights. 

What challenges exist for using metrics within your OSPO or Community?

The Consistent Use of Metrics within an Organization or Community

Many of the responses centered on the difficulties associated with the consistent use of metrics within an organization or community. For example, with the sheer size of many organizations and communities, the deployment and interpretation of metrics can vary a lot between people. This leads us, in the CHAOSS Community, to a 2023 goal of developing ways to communicate simple metric strategies and results that can be easily shared between people. 

The full set of recorded comments for the first question with general categorizations is here: 

  • Organization/Community: 
    • There isn’t a central place for metrics discussions within an organization.
    • There isn’t a common taxonomy for how to speak about metrics.
    • Fragmentation of metrics within an organization can make consistent use difficult.
    • The size of an organization can make consistent use of metrics difficult.
    • The size of a community can make consistent use of metrics difficult.
    • Different stages of project maturity can make consistent use of metrics difficult.
    • There is minimal guidance on replicating CHAOSS structures locally.
  • Metrics/Metrics Model: 
    • How to determine business value from a project?
    • How to measure the value of participation in a project?
    • How to determine business risk from a project?
    • How to determine company impact on a project?
    • How to measure the cost of non-participation?
    • Metrics themselves can be quite complex.

Photo Credit: Sean Goggins

What should the CHAOSS project be working on in the future?

Building Collaborative Communities and Helping People Communicate Better

There are many, many things that we could work on within the CHAOSS project in 2023. Recurring themes from CHAOSScon include (1) helping people connect with others in similar contexts to discuss health-related concerns (i.e., corporate open source program offices) and (2) developing ways to help people communicate about metrics within their respective organization or community. 

The full set of recorded comments for the second question with general categorizations is here: 

  • CHAOSS Software: 
    • Enable all contributions to be seen in a dashboard.
    • Add social, design, and educational material use (like a social calendar) into a dashboard of contributor metrics to cross-visualize data about the ecosystem beyond code.
  • CHAOSS Operations: 
    • Continue to support the newcomer experience.
    • Assist others with the interpretation of results.
    • Have mechanisms to help interpret the data in non-biased ways, without manipulating it toward certain predetermined hypotheses.
    • Provide badging to validate business metrics strategies.
    • Provide metrics and metrics model validation.
    • Support user groups that need specific metrics and metrics models to help in a variety of contexts.
    • Develop personas for groups of people with similar interests.
    • Build better starting points for people who may not want to build metrics.
    • Have social justice built into the metric development process, not as charity, but to address systemic issues continuously.
    • Provide pathways into the metrics, with more context on where they have been used in the past (e.g., to answer the question, “What do people like me use?”).
    • Think about people who work in different languages – is CHAOSS accessible for them?
  • CHAOSS Communication: 
    • Talk with other communities to understand different views of metrics.
    • Tell user stories of how others are using the metrics and metrics models in practice.
    • Provide a framework for how to apply metrics.
    • Provide ways to represent and talk about metrics in business meetings.
    • Push a goal-oriented approach.
    • Articulate why metrics are important to help make the business case.
    • Provide a set of five metric models to use that do five different things.
  • Metrics/Metrics Model: 
    • Software license compliance between repos.
    • Test coverage in repos.
    • Code quality.
    • Pipeline status.
    • Ensuring a broad community of voices.
    • Build productivity and efficiency metrics and metrics models.
    • Need to understand the signal-to-noise ratio.
    • More human-centric metrics, the kind that help to signal/prevent burnout.
    • An estimate of how difficult a metric might be to implement: a signal of whether technical experience is required, or whether it can be gathered with a quick survey, identified automatically, or requires a big program management lift. 

Thanks again to everyone for an amazing CHAOSScon 2023 in Brussels! We hope to see you soon!

Photo Credit: Sean Goggins