
Group picture of the attendees of OFA. Lots of smiling faces with the virtual attendees on the screen in the background.

OFA Symposium: Open Source Research Collaboration

Last week, I attended the OpenForum Academy Symposium in Berlin. This is Open Forum Europe’s (OFE) academic conference on open source, with the goal of bringing researchers, industry, and people working on policy together to share ideas and, over time, generate more academic research that is useful for open source policy work. In an effort to avoid a 10-page blog post, I’ll only cover the highlights of a few talks that I found particularly interesting and that seem most relevant for CHAOSS.

In the first keynote, Julia Ferraioli and Juniper Lovato talked about the ACM paper, Beyond the Repository, that they co-authored with Amanda Casari. I personally think this should be required reading for anyone doing research in open source. The paper goes in depth into why researchers should think about how their methods and results impact entire open source ecosystems, including the people working within the projects being studied. It is organized into nine best practices that help researchers design their studies in ways that keep the ethical implications and ecosystem impact of their research top of mind. In particular, the authors suggest that researchers actively work with the practitioners involved in the projects as they look beyond the repository to gather data and consider the ramifications of the research.

The keynote was followed by several presentations focused on Open Source Communities and Cooperatives. Jérémie Haese talked about the working paper (to be published in Management Science), Open at the Core: Moving from Proprietary Technology to Building a Product on Open Source Software, which he is writing jointly with Christian Peukert using Microsoft’s move to Chromium as a case study. Among other things, they saw an increase in the pool of contributors, along with more people reporting security vulnerabilities due to increased bug bounties offered by Microsoft, which resulted in a growing number of vulnerabilities being fixed.

Jorge Benet presented A Cooperative Model for Digital Infrastructure and Recommendations to Adopt It, which has been published in full. The report discusses findings from 21 digital infrastructure projects from 12 cooperatives across 7 countries, presenting a model that looks at value creation, proposition, and capture, along with recommendations for projects wishing to adopt the model.

Elçin Yenişen Yavuz talked about how user-led open source foundations differ from other types of foundations. While many of us work on projects in foundations led by communities and vendors, foundations led by users of the software (e.g., Apereo Foundation, Academy Software Foundation, openMDM) deliver more direct benefits to end users, including more control over functionality, shared resources, sustainability, and productivity. Results of some of this research can be found in the paper Problems, Solutions, and Success Factors in the openMDM User-Led Open Source Consortium in Communications of the Association for Information Systems.

There were a few talks about the legal implications of open source, which is a bit less relevant for the CHAOSS audience, but there was one talk by Wayne Wei Wang, Open-Source Commons Made in China: A Case Study of OpenAtom Foundation and Mulan-series Licenses, that I found interesting, partly because some of us have been working with the folks at openEuler, which is an OpenAtom project under a Mulan-series license. Wayne talked about some of the ways that open source is different in China due to Chinese state entrepreneurialism and the relationship between central planning and open source. This is based on Wayne’s research paper, China’s digital transformation: Data-empowered state capitalism and social governmentality.

I found Knut Blind’s talk, Open Source in the Context of Innovation, particularly interesting, since it covered various measures of innovation in open source. He shared the stage with a handful of others as they discussed how existing innovation research based on patents and papers can be compared to open source innovation by treating open source contributions (like commits) as analogous to patents and cited papers. These artifacts aren’t as dissimilar as they might seem: each follows a similar process from submission through review and finally into publication or release. They also talked about using the GitHub Innovation Graph to look at open source innovation across national economies. Finally, they discussed how dependencies can be used as an innovation signal, but noted that this approach runs into challenges when you try to compare projects. For example, JavaScript modules tend to be designed for integration into other projects, so they have many more dependencies than C/C++ projects, which are often designed as standalone applications.
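To make that comparability challenge concrete, here is a minimal Python sketch (my own illustration, not from the talk) that counts declared dependencies from two common manifest formats. The project paths are hypothetical, and raw counts like these say as much about ecosystem packaging conventions as they do about innovation.

```python
import json
from pathlib import Path


def count_js_dependencies(package_json: Path) -> int:
    """Count dependencies declared in a Node.js package.json manifest."""
    manifest = json.loads(package_json.read_text())
    return len(manifest.get("dependencies", {})) + len(manifest.get("devDependencies", {}))


def count_python_dependencies(requirements_txt: Path) -> int:
    """Count non-blank, non-comment lines in a requirements.txt file."""
    lines = requirements_txt.read_text().splitlines()
    return sum(1 for line in lines if line.strip() and not line.strip().startswith("#"))


# Hypothetical paths; counts like these are not directly comparable across
# ecosystems, since JavaScript packages routinely declare many more dependencies
# than a standalone C/C++ application, which may have no manifest at all.
# print(count_js_dependencies(Path("some-js-project/package.json")))
# print(count_python_dependencies(Path("some-python-project/requirements.txt")))
```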

Nataliya Wright’s talk, Open Source Software and Global Entrepreneurial Growth, looked at how contributing to open source can spur global entrepreneurial growth. The researchers found that contributing to open source predicts higher valuations and funding for IT ventures (note that some, but not all, of this is related to selection bias based on the types of companies and founders that contribute). While the talk was based on new research, some of their early-stage results can be found in the Research Policy paper, Open Source Software and Global Entrepreneurship.

This was a really interesting conference with many more talks than I could cover here, and I’m already looking forward to next year’s conference!

Artificial Intelligence: An Open Source Disruptor

By Matt Germonprez, Dawn Foster, and Sean Goggins

Corporations have increased their investments in open source because of its potential to share the weight of non-differentiating technology costs with other organizations that rely on the same core technologies, and consequently to innovate more quickly and increase organizational value. In many cases, the financial leverage gained through open source engagement is substantial, visible, and measurable. However, open source engagement is, to some extent, a cost each organization must assess. For organizations considering open source engagement, this means evaluating the ratio of increased value to the costs of engagement, a ratio that may very well be directly affected by AI.

Open source has benefited an untold number of industries, and it carries well-known, positive outcomes for the companies that engage with it. These include leveraged development, distribution of software maintenance costs, improved time to market, increased innovation, and talent acquisition. However, these positive outcomes, derived from the value leverage provided by open source, now have the potential to be found elsewhere, most notably through the use of artificial intelligence built on large language models.

It is becoming increasingly clear that AI will be an open source disruptor that alters how companies think about things like the provenance of source code, as seen in the Linux Foundation’s recent release of its Generative AI Policy. The LF policy highlights key areas of concern, including contributions, copyright, and licensing. Other key efforts to address open source and AI include the OSI’s deep dive on Defining Open Source AI and the AI Verify Foundation’s focus on building ethical and trustworthy AI. These initiatives are motivated by the need to address the key issues that AI raises for open source processes and to keep AI accessible to all. Each initiative rightfully assumes a future that includes AI and rightfully prepares its audience for the key issues that require attention.

AI is already emerging as a disruptive factor in the work of open source communities. Some open source communities are having issues with the volume of low-quality AI-generated code contributions. People often contribute to open source projects, particularly high-profile projects, to build their resumes and GitHub profiles; the contribution itself is a secondary goal in service of that primary goal. AI now provides an option for people to reduce the work needed to achieve the secondary goal in hopes of achieving the primary one. As a result, open source projects are seeing an increase in nonsense code contributions that create additional work for already overloaded project maintainers.

Within any company, AI has the capacity to change how engagements with open source projects are evaluated and approached. Known reasons for corporate engagement with open source projects include reducing the internal resources needed for software development and maintenance and improving product time to market. To obtain these positive outcomes, the costs of engaging with open source projects, by assigning employees to contribute and to become leaders, must be offset by the benefits. Open source program offices aim to lower the costs and amplify the benefits of these engagements. But what if AI, used to increase development speed, further lowers the costs of software development relative to the expense of engaging with open source communities, while retaining the benefits associated with developing software in the open? What if a company could achieve the same cost and time savings without working in public? What if conversations that would otherwise take place in open source projects and communities could instead take place as well-defined AI prompts? Should open source program offices be focusing on working with AI in addition to working with open source projects?

Questions that need more exploration are premised on how AI could alter the cost-benefit ratio of corporate software development as an alternative to engaging with open source projects, across three key areas:

  1. Community-level: Working in a Community
    1. Does AI increase open source community level noise? 
    2. Are AI developed contributions distinguishable from those developed by individuals? 
    3. Does AI reduce the volume of corporate engagement within open source communities? 
  2. Ecosystem-level: Working in an Ecosystem
    1. Does AI reduce the need for companies to perform ecosystem level monitoring? 
    2. Does AI reduce the need for companies to engage with open source communities? 
  3. Policy-level: Addressing Licensing and Security Concerns
    1. Does AI provide a source of legal exposure for communities and companies?
    2. Will AI be used to mask malicious code within communities and companies?

Underlying these questions is a certainty that AI will alter the dynamics of collaboration in open source engagement, and we suggest that this new reality be addressed directly. There is a strong case that AI will alter cost ratios within individual companies, as well as uncertainty about how these changes will shift, or possibly erode, the critical value presently derived from open source engagement. One core challenge will be identifying, with deliberateness, corporate approaches to AI within open source that affect communities, ecosystems, and policies. To date, corporate engagement with open source has recognized that a rising tide lifts all boats. Will AI change our views of the tide?