All Posts By CHAOSS Community

CHAOSS Weekly Newsletter

By | News

Things are rolling with this year’s Google Summer of Code. Make sure to check out (and comment on!) the work that the students are doing. You can track what they are up to on their blogs:

Parth Sharma: https://tinyurl.com/yxzjn4y8
Bingwen Ma: https://tinyurl.com/yytbj2jo
Aniruddha Karajgi: https://tinyurl.com/y62gx4uf
Nishchith Shetty: https://tinyurl.com/y4a5q4vp


Diversity and Inclusion on The New Stack

By | Blog Post, Related Publication

CHAOSS D&I WG on The New Stack

A contributed article on The New Stack details how the CHAOSS D&I Working Group got started and evolved. Georg Link and Sarah Conway take a close look at the working group's history and capture insights that are only known to those who were part of the evolution. The article outline is:

  • History of the CHAOSS project
  • Rise of the CHAOSS D&I Working Group
  • Creating D&I Metrics
  • What’s Next for CHAOSS D&I WG

The New Stack staff also interviewed Georg Link and Nicole Huesman about the CHAOSS D&I Working Group for their podcast. Questions they answer are:

  • How did you get involved with the CHAOSS project?
  • How did the CHAOSS project come about? Why was the Linux Foundation interested in creating a standard set of metrics for open source projects?
  • What were the initial metrics that people were interested in measuring? What were they trying to learn about their projects that fueled that initial interest?
  • What are the questions and metrics that you would use to quantify diversity and inclusion?
  • How is CHAOSS measuring results of diversity and inclusion? What are good metrics to use as indicators?
  • What is the plan for getting diversity and inclusion metrics out to individual projects and supporting them with using the metrics?
  • Why do people want metrics? What do they want to use them for? How does it improve software delivery?
  • How to get involved in the CHAOSS project?

Links:

Contributed article on The New Stack: https://thenewstack.io/how-chaoss-di-can-help-diversity-in-the-open-source-community/

Podcast on The New Stack: https://thenewstack.io/how-chaoss-measures-diversity-windows-gets-a-proper-terminal/

CHAOSS Weekly Newsletter

By | News

With respect to GrimoireLab, version 0.2.22 was released last week. This release is the first one that officially includes Graal, a generic repository analyzer, and is the first step toward making the platform handle the data produced by this tool.


CHAOSS Weekly Newsletter

By | News

CHAOSS is participating in the 2019 Grace Hopper Open Source Day (https://ghc.anitab.org/tag/osd/). Sean and Carter from the Augur team will lead a session on October 3, 2019. This is a great opportunity for CHAOSS to be involved in this fantastic event!


VMware Open Source

By | Blog Post

Tracking Open Source Contributions – My Contribution to CHAOSScon Europe 2019

By Alexandre Courouble

This blog post originally appeared on the VMware Open Source Blog on March 5, 2019.

Back in October, my colleague John Hawley and I reflected on our visit to last year’s U.S. CHAOSScon where we gave a talk on “The Pains and Tribulations of Finding Data.” At the end of that post, we mentioned learning more at the conference about GrimoireLab’s Perceval tool for tracking data from multiple open source projects on a single dashboard. That opportunity helped me develop the work that was the subject of a talk I gave at the recent CHAOSScon Europe 2019.

In my new presentation, I described a tool I’ve created to track the open source contributions of all the developers in VMware’s Open Source Program Office. The code itself isn’t open sourced at present, but its capabilities and outcomes will be of interest to any open source group hoping to better understand the open source contributions its members are collectively making. We’re also planning to use what we’ve learned to extend an existing open source metrics tool to make it more user centric.

The idea for our contribution tracker grew out of a pretty straightforward need. Every engineer in our team was already tracking their open source contributions, but they were doing it by laboriously copying and pasting the information from every project they’d worked on. This easily took up an hour or more of their time every week. Then, our manager had to spend even more time compiling everything he’d been sent into a legible and useful report.

The Tracker makes all of that work redundant. Currently, it works with GitHub to track:

  • pull requests merged
  • comments and reviews on pull requests
  • issues opened
  • issues commented on

It does this by calling the GitHub API with a query about each of these data points for every engineer on our team. The data is stored in an SQLite database, and a small Python app serves it to a web interface. Not everyone on our team works on GitHub-based projects, of course, which is where Perceval comes in. In those cases, we use Perceval to access the git repository of each non-GitHub-based project our team members are working on and then search that data for their emails and/or usernames.
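The post does not include the tracker’s source, but to make the GitHub side concrete, here is a minimal sketch of the kind of query-and-store loop described above. It assumes the GitHub search API and Python’s requests and sqlite3 modules; the logins, table layout, and function names are illustrative, not the actual tool’s.

import sqlite3
import requests

GITHUB_SEARCH = "https://api.github.com/search/issues"

def merged_pr_count(login, since, token=None):
    # Count merged pull requests authored by `login` since a date (YYYY-MM-DD),
    # using GitHub's issue search qualifiers.
    headers = {"Authorization": f"token {token}"} if token else {}
    query = f"type:pr author:{login} is:merged merged:>={since}"
    resp = requests.get(GITHUB_SEARCH, params={"q": query}, headers=headers)
    resp.raise_for_status()
    return resp.json()["total_count"]

def store(db_path, login, metric, value):
    # Persist one metric value per engineer in a small SQLite table.
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS contributions "
                "(login TEXT, metric TEXT, value INTEGER)")
    con.execute("INSERT INTO contributions VALUES (?, ?, ?)", (login, metric, value))
    con.commit()
    con.close()

for engineer in ["jane-doe", "john-roe"]:  # hypothetical GitHub logins
    store("tracker.db", engineer, "merged_prs", merged_pr_count(engineer, "2019-01-01"))

A small web app could then read this table and render the per-user and summary views described below.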

For non-GitHub projects, we can see:

  • Commits authored
  • Commits merged into the repository
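For those non-GitHub repositories, a rough sketch of the Perceval-based path might look like the following; Perceval’s git backend is a real GrimoireLab component, but the repository URL, local path, and email list here are placeholders, and the actual tracker’s matching logic may differ.

from perceval.backends.core.git import Git

TEAM_EMAILS = {"jane.doe@example.com", "john.roe@example.com"}  # placeholder addresses

# Perceval mirrors the repository at `gitpath` and yields one
# JSON-like item per commit.
repo = Git(uri="https://git.example.org/some-project.git",
           gitpath="/tmp/tracker-some-project")

authored = [item["data"]["commit"]
            for item in repo.fetch()
            if any(email in item["data"]["Author"] for email in TEAM_EMAILS)]

print(f"{len(authored)} commits authored by team members")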

The Contribution Tracker automatically aggregates this data and offers a view into it through the web interface. It actually does more than aggregate data by contributor. It offers:

  • A user-specific view showing pull requests, comments, reviews and issues over a particular time period.
  • A summary view that lists all team pull requests, comments, reviews and issues over a particular time period.
  • A projects and organization view showing all commits made by all team members on a specific project.
  • A charts view showing the evolution of open source contributions over time.

Crucially, the Tracker outputs data in the form of whole English sentences. Instead of just a number in a report, you see a set of sentences saying things like: “Alex C. committed [name] pull request” and so on, with links to every item worked on.
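As a trivial illustration (not the tracker’s actual code), rendering a stored record as one of these sentences is little more than string formatting with the link attached; the values below are made up.

def to_sentence(author, action, title, url):
    # Turn one contribution record into a readable sentence with a link.
    return f'{author} {action} "{title}" ({url})'

print(to_sentence("Alex C.", "merged pull request",
                  "Add charts view",
                  "https://github.com/example/repo/pull/42"))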

The tool has already accomplished its primary goal by saving everyone on our team a ton of time. Now individual engineers don’t need to do any tracking of their own and all our manager has to do is open the tool, select the outputs they want, then copy and paste them into an email to distribute as needed.

I wanted to share this story at CHAOSScon in case our experience was valuable to others. From the positive responses I received, it seems like it could well be. One concrete outcome is that, based on this work, we’re hoping to help extend the open source metrics tool Augur to offer user-based metrics. Augur comes out of Professor Sean Goggins’ lab at the University of Missouri and currently offers only project-centered metrics. Like GrimoireLab, Augur displays a series of metrics regarding a specific project hosted on GitHub. We found Augur to have a simpler user interface and a lower barrier to entry than GrimoireLab.

We’re also talking about adding the narrative capability developed for our internal tool to Augur – when you can output data in the form of legible English sentences, you make that data accessible to a wider range of potential users. In addition, I had a very positive conversation with the people who built Perceval, which helped me understand how we could optimize our tool to make maximal use of Perceval as we develop it further.

The larger context of this work, and of my presentation, is that for-profit corporations are increasingly looking to build open source teams. Unsurprisingly, they want their employees to collect metrics that demonstrate the value they are adding to the sector. If we can make it easier for these teams to track their achievements, we’re helping build support from entities that have a lot of resources available to move open source forward.

Stay tuned to the VMware Open Source Blog for future conference recaps and follow us on Twitter (@vmwopensource).

GrimoireLab – Graal

By | Blog Post

Welcome Graal

Currently, GrimoireLab can produce analytics with data extracted from more than 30 tools related to contributing to open source development, such as version control systems, issue trackers and forums. Despite the large set of metrics available in GrimoireLab, none of them relies on information extracted from source code, which prevents end users from benefiting from a wider spectrum of software development data.

Graal is a tool that allows conducting customizable and incremental analyses of source code by leveraging existing tools. It enhances Perceval (one of GrimoireLab’s key components) and produces output similar to Perceval’s to ease integration with GrimoireLab, thus complementing the latter’s analytics with source-code-related metrics.

Once installed, Graal can be used as a stand-alone program or Python library.
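For instance, used as a library, the CoCom backend (described in the next section) can be driven in a few lines. This sketch is adapted from Graal’s documentation; the repository URL and mirror directory are placeholders you would replace with your own.

from graal.backends.core.cocom import CoCom

# Repository to analyze and the local directory where Graal mirrors it.
repo_uri = "https://github.com/chaoss/grimoirelab-perceval"
repo_dir = "/tmp/graal-cocom"

# CoCom gathers code complexity data (cyclomatic complexity, LOC, ...).
cc = CoCom(uri=repo_uri, git_path=repo_dir)

# Like Perceval, Graal backends expose a fetch() generator that yields one
# JSON-like item per commit, enriched with the analysis results.
for commit in cc.fetch():
    print(commit["data"]["commit"])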

Backends

Several backends have already been developed. They leverage existing source code analysis tools, whose executions are triggered via system calls or through their Python interfaces.

Currently, the backends mostly target Python code; however, other backends can easily be developed to cover other programming languages.

The available backends are:

  • CoCom gathers data about code complexity (e.g., cyclomatic complexity, LOC) from projects written in popular programming languages such as C/C++, Java, Scala, JavaScript, Ruby and Python. It leverages Cloc and Lizard; the former is a Linux package used to count blank lines, comment lines and LOC, while the latter is a code complexity analyzer written in Python.

  • CoDep extracts package and class dependencies of a Python module and serializes them as JSON structures composed of edges and nodes, thus easing the bridge to front-end technologies for graph visualizations. It combines PyReverse, a reverse engineering tool able to generate UML-like diagrams, with NetworkX, a library to create, manipulate and study complex networks.

  • CoQua retrieves code quality insights, such as checks on code line length, well-formed variable names, unused imported modules and code clones. It uses PyLint, a code, bug and quality checker for Python.

  • CoVuln scans the code to identify security vulnerabilities such as potential SQL and Shell injections, hard-coded passwords and weak cryptographic key size. It relies on Bandit, a tool designed to find common security issues in Python code.

  • CoLic scans the code to extract license information. It currently supports Nomos and ScanCode.

Further reading

More details about how Graal works can be found at:

Contributing to the GMD Working Group

By | Blog Post

How we produce metrics in the GMD Working Group

The GMD Working Group is one of the CHAOSS working groups, tasked with defining useful metrics for analyzing software development projects from the point of view of GMD (growth-maturity-decline). It also works in the areas of risk and value. For all of them, we intend to follow the same process to produce metrics, similar to what the other CHAOSS working groups are doing. This post describes that process, which we have recently completed for the first metric (many others should follow during the next weeks).

The process is top-down, starting with the definition of the focus areas of interest. For each focus area, we define the goals we intend to reach, and then we follow GQM (goal-question-metric) to first derive questions which, when answered, should help us reach our goals, and then metrics that help to answer those questions. Finally, we explore how those metrics could be implemented, and produce reference implementations for specific data sources. Throughout the process we take into account use cases, which illustrate how metrics are used in the real world.

Currently, the working group is dealing with five focus areas: code development, community growth, issue resolution, risk, and value. We consider all of them relevant to improving our knowledge of FOSS (free, open source software) projects.

Goals for the code development focus area

For now, the most complete of these focus areas is code development, for which we have identified some goals: activity, efficiency and quality. For each of them we are in the process of identifying questions. For example, for activity we have identified the question "How many changes are happening to the code base, during a certain time period?", code-named Changes, which we think should help us learn about the activity of a project.

To help answer this question, we have identified some metrics, such as "Number of changes to the code base", code-named Code_Changes_No(Period), which tries to capture how many changes to the source code were made during the period of interest. We explain this in detail in the definition of the Code_Changes metric. This definition tries to be neutral with respect to the specific data source (in this case a source code management repository, such as git, Subversion, or Mercurial), but also includes specific sections for specific data sources.

Python notebook with Code_Changes implementation for git

Finally, to clarify the metric, and to provide a definition which is not ambiguous and can be checked for conformance, we also provide an implementation of it for a certain data source. In our case, we implemented it for git, as a Python notebook (check it in Binder). It includes documentation on the details of the implementation, and an actual implementation of the metric as a Python class. It also includes examples of how to use it with real repositories, and an exploration of some details of the specific data source that are relevant for implementations and for comparison between different implementations. Reference implementations are based on Perceval output, which is a collection of JSON documents, one per item obtained from the data source, as similar as possible to the data produced by the data source.
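As a rough illustration of the idea (this is our own simplification, not the working group's reference notebook), a period-bounded count of commits over Perceval's git output could be written as follows; the function name and date handling are illustrative only.

from datetime import datetime, timezone
from perceval.backends.core.git import Git

def code_changes(uri, gitpath, since, until):
    # Count commits whose author date falls in [since, until).
    repo = Git(uri=uri, gitpath=gitpath)
    count = 0
    for item in repo.fetch():
        # Perceval git items carry the raw commit under 'data';
        # 'AuthorDate' looks like 'Tue Feb 26 09:55:46 2019 +0100'.
        date = datetime.strptime(item["data"]["AuthorDate"],
                                 "%a %b %d %H:%M:%S %Y %z")
        if since <= date < until:
            count += 1
    return count

n = code_changes("https://github.com/chaoss/grimoirelab-perceval",
                 "/tmp/perceval-git",
                 since=datetime(2018, 1, 1, tzinfo=timezone.utc),
                 until=datetime(2019, 1, 1, tzinfo=timezone.utc))
print(n)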

When producing the first reference implementation, for Code_Changes, we completed the first full process, from focus area and goal to questions and metrics. Now, we intend to complete this process for the rest of the goals in all our focus areas. Do you want to join us on this journey? If so, you are welcome! We are ready to review your pull requests, and to work with you towards useful definitions and implementations of metrics that help us all to better understand FOSS projects.

Metrics With Greater Utility: The Community Manager Use Case

By | Blog Post

By Sean Goggins 

Introduction

Community managers take a variety of perspectives, depending on where their communities are in the lifecycle of growth, maturity, and decline. This is an evolving report of what we are learning from community managers, some of whom we are working with on live experiments with Augur (http://www.github.com/CHAOSS/augur), a prototype software tool from the CHAOSS project. At this point, we are paying particular attention to how community managers consume metrics and how the presentation of open source software health and sustainability metrics could make them more, and in some cases less, useful for doing their jobs.

Right now, based on Augur prototypes and follow-up discussions so far, we have the following observations that will inform our work both in the “Growth Maturity and Decline” working group and in Augur development. Here are a few things we have learned from prototyping Augur with community managers. These features in Augur are particularly valued:

  1. Allowing comparisons with projects within a defined universe of projects is essential
  2. Allowing community managers to periodically add and remove the repositories that they monitor
  3. Downloadable graphics
  4. Downloadable data (.csv or .json)
  5. Availability of a “Metrics API”, limiting the amount of software infrastructure the community manager needs to maintain for themselves. This is more valued by program managers overseeing larger portfolios right now, but we think it has potential to grow as awareness of the relatively light weight of this approach becomes more apparent. By apparent, we really mean “easy to use and understand”; right now it is easy for a programmer, but less so for a community manager without this background or current interest.

Date Summarized Comparison Metrics

With these advantages in mind, making the most of this opportunity to help community managers with useful metrics is going to include the availability of date-summarized comparison metrics. These types of metrics have two “filters” or “parameters” fed into them that are more abstractly defined in the Growth, Maturity, and Decline metrics on the CHAOSS project.

  1. Given a pool of repositories of interest for a community manager, rank them in ascending or descending order by a metric
  2. Over a specified time period or
  3. Over a specified periodicity (e.g., month) for a length of time (e.g., year).

For example, one open source program officer we talked with is interested in the following set of date summarized comparison metrics. Given a pool of repositories of interest to the program officer (dozens to hundreds of repositories):

  1. What ten repositories have the most commits this year (straight commits, and lines of code)?
  2. How many new projects were launched this year?
  3. What are the top ten new repositories in terms of commits this year (straight commits, and lines of code)?
  4. How many commits and lines of code were contributed by outside contributors this calendar year? Organizationally sponsored contributors?
  5. What organizations are the top five external contributors of commits, comments, and merges?
  6. What are the total number of repository watchers we have across all of our projects?
  7. Which repositories have the most stars? Of the ones new this year? Of all the projects? Which projects have the most new stars this year?
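To make the first of these questions concrete: once commit data sits in a database, it reduces to a ranking query. The sketch below runs against an invented SQLite table; the schema is for illustration only and is not Augur's actual data model.

import sqlite3

con = sqlite3.connect("metrics.db")

# Hypothetical table:
#   commits(repo TEXT, author TEXT, added INTEGER, removed INTEGER, committed_at TEXT)
top_ten = con.execute(
    """
    SELECT repo,
           COUNT(*)             AS commit_count,
           SUM(added + removed) AS lines_changed
    FROM commits
    WHERE committed_at >= '2019-01-01'
    GROUP BY repo
    ORDER BY commit_count DESC
    LIMIT 10
    """
).fetchall()

for repo, commit_count, lines_changed in top_ten:
    print(f"{repo}: {commit_count} commits, {lines_changed} lines changed")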

Open Ended Community Manager Questions to Support with Metrics

There are other, more open ended questions that may be useful to open source community managers:

  1. Is a repository active?
    1. Visual differentiation that examines issue and commit data
    2. Activity in the past 30 days
    3. Across all repositories, present the 50th percentile as a baseline and show repositories above and below that line
  2. Should we archive this repository?
    1. Enable an input from the manager after reviewing statistics
    2. Activity level, inactivity level and dependencies
    3. Mean/Median/Mode histogram for commits/repo
  3. Should we feature this repository in our top 10? (Probably a subjective decision based on some kind of composite scoring system that is likely specific to the needs of every community manager or program office.)
  4. Who are our top authors? (Some kind of aggregated contribution ranking by time period [year, month, week, day?]. nominally, I have a concern about these kinds of metrics being “gameable”, but if they are not visible to contributors themselves, there is less “gaming” opportunity.)
  5. What are our top repositories? (Probably a subjective decision based on some kind of composite scoring system that is likely specific to the needs of every community manager or program office.)
  6. Most active repositories by time period [year, month, week?]. Activity to be revealed through a mix of retention and maintainer activity primarily focusing on the latter. Number of issues and commits. Also the frequency of pull requests and the number of closed issues.
  7. Least active repositories by time period [Week? Month? Year?]. Bottom of scores calculated, as above.
  8. Who is our most active contributor (Some kind of aggregated contribution ranking by time period [year, month, week, day?]. nominally, I have a concern about these kinds of metrics being “gameable”, but if they are not visible to contributors themselves, there is less “gaming” opportunity.)
  9. What new contributors submitted their first new patches/issues this week? (Visualization Note: New contributors can be colored in visualizations and then additionally a graph can be made for number of new contributors)
  10. Which contributors became inactive? (Will need a mechanism for setting “inactive” thresholds.)
  11. Baseline level for the “average” repository in an organization and for each, individual organization repository.
  12. What projects outside of a community manager’s general view (GitHub organization or other boundary) do my repositories depend on or do my contributors also significantly contribute to?
  13. Build a summary report in 140 characters or less. For example, “Your total commits in this time period [month, week?] across the organization increased 12% over the last period. Your most active repositories remained the same. You have 8 new contributors, which is 1 below your mean for the past year. For more information, click here.”
  14. Once a metrics baseline is established, what can be done to move them? [^1]
  15. Are there optimal measures for some metrics?
    1. Pull request size?
    2. Ratio of maintainers to contributors?
    3. New contributor to consistent contributor ratio?
    4. New contributor to maintainer ratio?

Augur Specific Design Change Recommendations

Next is a list of Augur specific design changes suggested thus far, based on conversations with community managers.

  1. Showing all of the projects in a GitHub organization in a dashboard by default is generally useful.
  2. Make the lines more clear in the charts, especially when there are multiple lines in comparison
  3. How to zoom in and out is not intuitive. In the case of Google Finance, for example, a default, subset period was displayed when they used the “below the line mirrored line” interface that this is modeled after. That old model makes it fairly clear that the box below the line in Google Finance is there to adjust the range of dates. Alternatively, Google’s more updated way of representing time, providing users choices, and showing comparisons may be even more useful and engaging. In general, it’s important that the time zooming be made clearer.
    Figure 1: In one view, Google lets you see a 1 year window of a stock’s performance.

    Figure 2: In another view, you can choose a 3 month period. Comparing the two time periods also draws out the trend with red or green colors, depending on whether or not the index, in this case a stock’s price, has increased or decreased overall during the selected time period.

    Figure 3: Comparisons are similarly interesting in Google’s finance interface. You can simply add a number of stocks in much the same way our users want to add a number of different repositories.

  4. For the projects a community manager chooses to follow, go ahead and give them comparison check-boxes at the top of the page. I think, from a design point of view, we should limit comparisons, as discussed, to 7 or 8, simply due to the limits of human visual perception.
  5. The ability to adjust the viewing windows to a month summary level is desired.
  6. Right now, Augur does not make it clear that metrics are, by default, aggregated by week.
  7. New contributor response time. When a new contributor joins a project, what is the response time for their contribution?
  8. A graph comparing commits and commit comments on x and y axes between projects is desired. Same with issues and issue comments.
  9. In general, the last two years of data gets the most use. We should focus our default display on this range.

Data Source Trust Issues

  1. Greater transparency of metrics data origins will be helpful for understanding discrepancies between current understanding and what metrics show.
    1. We should include some detailed notes from Brian Warner about how Facade is counting lines of code, and possibly some instrumentation to enable those counts to be altered by user provided parameters.
    2. Outside contributor organization data. One community manager reported that their lines-of-code-by-organization data seems to look wrong. I did explain that these are mapped from a list of companies and emails we put together, and getting this right is something community managers will need some kind of mapping tool to do. GitDM is a tool that people sometimes use to create these maps, and Augur does follow a derivative of that work. It is probably the case that maintaining these affiliation lists is something that needs to be made easier for community managers, especially in cases where the set of organizations contributing to a project is diverse. (There is a substantial range among the community managers we spoke with: some are managing complex ecosystems involving mostly outside contributors, most are in the middle, and some have contributor lists highly skewed toward their own organization.)
  2. GHTorrent data, while excellent for prototyping, faces some limitations under the scrutiny of community managers. For example, when using the cloned repositories and then going back to issues, the issues data in GHTorrent does not “look right”. I think the graph API might offer some possibilities for us to store issue statistics we pull directly from GitHub and update periodically as an alternative to GHTorrent.
  3. When issues are moved from an older system, like Gerrit, into GitHub issues, in general, the statistics for the converted issues are dodgy, even through the GitHub API. We are likely to encounter this, and at some point may want to include Gerrit data in a common data structure with issues from GitHub and other sources.

New Metrics Suggested

  1. Add metric “number of clones”
  2. “Unique visitors” to a repository is a data point available from the GitHub API which is interesting.
  3. Include a metric that is a comparison of the ratio of new committers to total committers in a time period. Or, perhaps, simply those two metrics side by side (a sketch of one way to compute this ratio appears after this list). Seeing the number of new committers in a set of repositories can be a useful indication of momentum in one direction or another, though I hasten to add that this is not canonically the case.
  4. Some kind of representation of the ratio between commits and lines of code per commit
  5. Test coverage within a repository is something to consider measuring for safety critical systems software.
  6. Identifying the relationship between the DCO (Developer Certificate of Origin) and the CLA (Contributor License Agreement).
  7. There is a tension between risk and value that, as our metrics develop in those areas, we are well advised to keep in mind.
  8. The work that Matt Snell and Matt Germonprez at the University of Nebraska-Omaha are starting related to risk metrics is of great interest. Getting these metrics into Augur is something we should plan for as soon as reasonably possible.
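To sketch how the ratio in suggestion 3 above might be computed (with invented sample data; real inputs would come from Augur's or Perceval's commit records):

from datetime import date

# Invented sample data: (author, author_date) pairs for one repository.
commits = [
    ("alice", date(2018, 3, 2)),
    ("bob",   date(2019, 1, 15)),
    ("alice", date(2019, 2, 1)),
    ("carol", date(2019, 2, 20)),
]

period_start, period_end = date(2019, 1, 1), date(2019, 12, 31)

# A committer is "new" if their first commit ever falls inside the period.
first_commit = {}
for author, day in commits:
    first_commit[author] = min(day, first_commit.get(author, day))

active = {a for a, d in commits if period_start <= d <= period_end}
new = {a for a in active if first_commit[a] >= period_start}

print(f"{len(new)} new committers out of {len(active)} total "
      f"({len(new) / len(active):.0%}) in the period")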

Design Possibilities

Augur

For Augur, I think the interface changes that enable comparisons and provide more self-apparent ways to compress or expand the time window, as in the Google examples, are at the top of the list of things that will make Augur more useful for community managers. Feedback on these notes will be helpful. I think the new-committers-to-committers ratio is important, as well as enabling comparisons across projects in the bubble graphs. Transparency about data sources and their limitations, for both the API and the front end, which is above average but not complete, is important.

Growth Maturity and Decline Working Group

Many of the metrics of interest to community managers fall under the “Growth Maturity and Decline” working group. From a design perspective it appears that, possibly, the way that metrics are expressed and consumed by these stakeholders in their individual derivatives of the community manager use case is quite far removed from the detailed definition work occurring around specific metrics. Discussion around an example implementation like Augur is helping draw out some of this more “zoomed out” feedback. The design of system interfaces frequently includes the need to navigate between granular details and the overall user experience (Zemel et al. 2007; Barab et al. 2007). This is less of a focus in the development of software engineering metrics, though recent research is beginning to illustrate the criticality of visual design for interpreting analytic information (Gonzalez-Torres et al. 2016).

References

  • Barab, S, T Dodge, MK Thomas, C Jackson, and H Tuzun. 2007. Our designs and the social agendas they carry. Journal of the Learning Sciences 16 (2): 263-305.
  • Gonzalez-Torres, Antonio, Francisco J. Garcia-Penalvo, Roberto Theron-Sanchez, and Ricardo Colomo-Palacios. 2016. Knowledge discovery in software teams by means of evolutionary visual software analytics. Science of Computer Programming 121: 55-74. doi:10.1016/j.scico.2015.09.005. https://linkinghub.elsevier.com/retrieve/pii/S0167642315002658.
  • Zemel, Alan, Timothy Koschmann, Curtis LeBaron, and Paul Feltovich. 2007. What are we Missing? Usability’s Indexical Ground. Computer Supported Cooperative Work.

Acknowledgements

Many members of the CHAOSS community contributed to this report and analysis. I am happy to share names with permission from the contributors, but I have not requested permission as of the publication date.

[^1]: Once we are to this point, I think CHAOSS is kicking butt and taking names.

A PDF Version of this Post is Available Here.

Open Community Metrics and Privacy: MozFest’18 Recap

By | Blog Post

By: Raniere Silva, Software Sustainability Institute, and Georg Link, University of Nebraska at Omaha.

This post was originally published here and on Software Sustainability Institute blog .

Image by Raniere Silva.

TL;DR

Open communities lack a shared language to talk about metrics and share best practices. Metrics are aggregate information that summarise raw data into a single number, stripping away any context of data. Pedagogical metric displays are an idea for metrics that include an explanation and educates the user on how to interpret the metric. Metrics are inherently biased and can lead to discrimination. Many problems brought up during the MozFest session are worked on in the CHAOSS project.

Introduction

Mozilla Festival, or MozFest, is an annual event organized by Mozilla volunteers to bring together thinkers and champions of the open internet. The event is held at Ravensbourne University in Greenwich, London, United Kingdom, and hosts a safe space for attendees to talk about decentralization, digital inclusion, openness, privacy, security, web literacy and much more.

At MozFest this year, we organised a 90-minute conversation on the topic of open community metrics and privacy. We seeded the conversation by sharing past examples (e.g., “we are 1,000,000 Mozillians”) and asked our nine participants to discuss what it means to aggregate metrics for open communities and what impact this can have for (non-)members of these communities. The following sections elaborate on thoughts, ideas, and discussions from the session.

What are Open Community Metrics?

The MozFest conversation revealed that we are lacking a shared language for talking about open community metrics. Our session is indicative that this problem exists across open communities. Maybe it is because open community metrics have not been standardised, have not been around for long, or because every community has different ways of dealing with them. To resolve this, we discussed what an open community is and what metrics are.

We compared community to networks. A network consists of people who are connected to each other. In contrast, members of a community have a shared goal and purpose, often creating something together. Communities build a shared identity, and membership in the community is an alignment of one’s identity. Adding in the ‘open’, an open community is one that uses communication tools that display the conversations to anyone who wants to inspect them. An open community does not always welcome everyone as a member (inclusive governance) but it operates in a transparent way.

We contrasted metrics and data. Data is the raw information about who wrote a message to whom and what the content of the message was. The data needs to be enriched before it becomes information, such as how active a specific user is or how popular a certain topic is within the open community. Metrics are aggregates of that information and summarise it into a single number or graph. Metrics are the information you would expect to see on a dashboard that summarises aspects of your open community.

Open Community Data Example

The following example from a fictional open source project demonstrates how much information about participants in open communities is often available online. Such data can be aggregated by software such as Perceval, which retrieves and gathers data from different places where open source contributors interact, for example GitHub, mailing lists, Discourse, Slack, and Twitter. Due to the open nature of the community, much personal data is made freely available. For example, a Git commit retrieved by Perceval could be:

 

commit 944713e38290d8dd4a9ac7f02267ada72a0e5a10

Author: Jane Doe <jane.doe@mail.com>

Date:   Mon Oct 29 14:56:59 2018 +0000

   Adicionado arquivo README.md

 

The commit contains Jane’s full name and email address. Even if Jane was a very private person and, except for their name and email, no other information about them was available online, we can narrow the search for the real person based on the Portuguese commit message, timestamp, and timezone. With access to other databases, we could enhance Jane’s profile more and with a higher degree of accuracy and precision.

How can we protect Jane’s privacy when open source communities are collecting, processing, aggregating and analyzing data like the data we describe here? And how can open source communities stay true to their principles by making the data used in the analysis open (or FAIR – findability, accessibility, interoperability, and reusability) without violating Jane’s privacy?

Sharing how Metrics Should be Understood

Now that we have a better shared understanding of what metrics are, we can discuss the intricacies of understanding the meaning behind metrics. A problem with metrics is that they only provide a single number or graph with no background information on how the data was collected, how it was summarized, and what the intentions of the creators of a metric were. This can lead to misunderstandings. For example, when number-of-messages is used as a metric to gauge the popularity of a topic, then we need to understand how conversations unfold in an open community and in the tool used. A tool that has thumbs-up reactions or votes for messages provides a way to show interest in a topic and interact with messages, whereas the count of messages in a tool that lacks those features and only has text messages (e.g. mailing lists) will be inflated because it counts simple “+1” messages.

Another problem with metrics is that they are inherently biased. This runs counter to a popular belief that numbers are objective. At the foundation of metrics are data, which is biased because someone decided what data to collect and how to store it. The bias in data limits what information can be summarized by metrics. Furthermore, a metric reduces data to a single number or graph and as such removes detail and context. In this process, some data is weighted more than other data or not included at all. How data is summarized is determined by the metric creator, though what is done to the data is not obviously clear to a metric user.

These problems point to a need of bridging between metric creator and metric user. A solution we discussed at MozFest is to provide an explanation with every metric. The person or team that prepares a metric for an open community would be responsible for including a description to help with interpreting the metric. We might call this concept of a metric that includes an explanation and educates the user on its use a ‘pedagogical metric display’.

Metrics by Community vs. Metrics about Community

Open community metrics can be created by open communities about themselves, or by externals. Metrics created by people external to an open community have issues that stem from their external perspective. For example, they may not know the meaning behind interaction patterns of a community and as such may choose to summarize data that is not aligned with how the data came to be. Externals also might not be familiar with the goals and identity of an open community and thus create metrics that are misaligned. To resolve these issues, metrics might be better created by open communities. Pedagogical metric displays could bridge between internal metric creators and external metric users.

An interesting question is why metrics are created for open communities. A small open community typically does not have metrics and we only observe metrics being introduced in large and growing communities. An intuitive answer could be that in a small community it is easier to maintain a sense of how the community is doing but in a larger community it becomes infeasible to keep track of everything at the interaction or data level and thus people aggregate the data to metrics to maintain this sense of knowing how a community is doing. A follow-on question is whether communities use metrics to understand themselves better, or whether communities use metrics to drive their growth. For example, knowing and ensuring that new users who make their first post in a community forum receive friendly welcome replies can help a community thrive. Conversely, a daily notification to maintainers that sums up how many new posters did not receive a reply within 48 hours is in itself not helpful because it only provides a negative metric without actionable insights.

A question raised at MozFest is about creating and sharing metrics. By this, we mean the process of calculating a metric. For example, one open community might come up with a metric that serves the community well and other communities are interested in using the same metric for themselves. The immediate practical question is how metrics can be transferred between contexts and adapted for different tools that these communities might be using. An ideological question is whether the metric will actually be useful to the second community.

Who is (Dis)Advantaged by Open Community Metrics?

This section explores the implications of privacy in the context of open community metrics. First, anyone who participates in an open community can be tracked and profiled. A user may try to maintain a sense of privacy by using a pseudonym, a unique email address for each open community, and avoid revealing personal information. How successful these approaches are is subject to further study, but examples exist for failing to maintain privacy in online spaces.

Another problem with open community metrics is how they are being used. A funding agency or other benefactor might want to see open community metrics that demonstrate impact or outcomes. This seems like a simple case where an open community that is tracking metrics and can show how it is producing outcomes or making an impact has a leg up in the struggle for financial support. Producing metrics could become a matter of making an open community sustainable.

Other uses of metrics are also possible. An open community can use metrics to identify areas where community members are struggling. Identifying and fixing a bug in a community platform improves the experience. However, a problematic use of metrics could come from outside a community. What if a community of runners, who share their tracked routes, heart rates, best times, and who they ran with, was analysed by an insurance company to identify new ways to structure insurance policies for people who run regularly (and consequently to discriminate against parts of the community)?

We have heard that membership in some open communities is a differentiating factor for job applicants. A developer with an active GitHub or StackOverflow profile could have a higher chance of landing a new job. Conversely, a user who posts pictures of drink parties in an open community could be disadvantaged in the hiring process. Granted, these are less issues of open community metrics and more about the data of open communities being available to external people. However, an open community can aid such discriminating use cases by providing metrics at the individual person level (ever wonder who looks at the green dots on GitHub and for what purposes?). The goal of metrics on the level of the individual might be to incentivize members of a community to participate more and behave in certain ways (e.g., avoid negative brownie points). But those same metrics could become signals and be used in unexpected ways. Naturally, people will begin to game a metric when it is displayed.

With the assumption that open community metrics are used in decision making (e.g., hiring, insurance premiums, credit reporting, national citizen score, …) the people who do not participate in a community might be naturally disadvantaged. Some people may choose to avoid open communities and stay anonymous. However, some people are lacking internet access, do not speak (fluently) the dominant language in their community, or do not have the time besides job and family to participate. These people are disadvantaged for reasons outside of the open community and the use of metrics is unfair from the onset.

Conclusion

The MozFest session was an exchange. We know from the participants that each of them took away something from the conversation. The central discussion points are summarized here. As facilitators, we see MozFest as a mirror of the conversation that is happening in CHAOSS. With this blog post, we hope to document some of the discussion and provide a baseline from which we can move forward. Many of the problems brought up in the MozFest session and documented here are problems that the CHAOSS project is working on (I know, shameless plug for CHAOSS).

Thanks!

We are thankful to the attendees who helped us brainstorm about open community metrics and privacy.

Reflections on CHAOSScon NA 2018

By | Blog Post

By Alexandre Courouble and John Hawley

This blog post originally appeared on the VMware Open Source Blog on October 9, 2018.

Previously, we’ve explored the challenge of measuring progress in open source projects and looked forward to the recent CHAOSScon meeting, held right before the North American Open Source Summit (OSS). CHAOSS, for those who may not know, is the Community Health Analytics Open Source Software project. August’s CHAOSScon marked the first time that the project had held its own, independent pre-OSS event.

After attending the event, we thought it might be interesting to share our takeaways from the conference and reflect on where we stand with regard to the challenges that we outlined both in our previous posts and in our CHAOSScon talk, “The Pains and Tribulations of Finding Data.”

For those who weren’t able to attend CHAOSScon and would like to see our talk, it’s now available for viewing here. We started with an overview of current solutions for gaining visibility into open source data and then outlined what we view as the challenges currently standing in the way of creating solid progress metrics for open source development. We’ll go back to the talk at the end of the post, but first, our overall takeaways.

One thing we appreciated about having a dedicated CHAOSScon was that the event attracted a mix of longtime colleagues and collaborators as well as people new to the community. In particular, there was a strong presence from corporate open source teams. Engineers from Twitter, Comcast, Google and Bitergia shared how they have been tackling different kinds of open source data challenges. Hearing about their own trials and tribulations definitely seemed to validate our impression that we share a number of basic data measurement problems that are worth addressing as a community.

It was good, too, to see CHAOSS welcoming these corporate perspectives. Open source conferences often eschew that kind of engagement, but it is useful to hear how teams are solving problems for themselves out in the wild. Here’s hoping that this marks the start of a new trend.

A pair of workshops in the afternoon offered another useful takeaway. One was on “Establishing Metrics That Matter for Diversity & Inclusion” and the other was a report from the CHAOSS working group on Growth-Maturity-Decline metrics.

It was clear from the latter workshop that we now have a good number of quantifiable data points to establish where a project is on the growth-maturity-decline continuum. But diversity metrics present a much trickier challenge. The data there exists mostly in mailing lists and board discussions and is currently only really explored through surveys. But the issue provoked a really interesting discussion full of smart suggestions and we’re excited to see what new solutions the community will come up with in the future.

Turning to what we learned from our own panel, we were thrilled to be speaking in front of a similarly engaged audience. We opened with a shout out to the open source projects that have already created tooling around data acquisition and we were lucky enough to have maintainers from many of those projects in the room with us. It was good of our audience to indulge a presentation heavy on questions and light on answers. They seemed genuinely curious about the issues we were raising and interested in trying to figure out how to fundamentally address them – some even started working on potential solutions as we were speaking.

We didn’t arrive at any grand consensus on solutions, but it’s clear that there is active community interest in trying to at least understand the problem of open source metrics and how we might be able to solve it. That’s certainly inspiring us to keep working on the issue—after all, things will only get better as more ideas get discussed, researched, tried and retried. This is not something we expect to be magically fixed in a couple of steps, but we’re excited to keep reaching out to the colleagues we interacted with at the conference and see what develops.

Our final takeaway is a classic example of conference serendipity. We arrived there knowing about GrimoireLab, a tool for tracking data about multiple open source projects on a single dashboard—we even referenced it in our talk. But what we didn’t know is that it’s easy to create your own implementation of it. We attended a presentation where several groups shared how they had implemented GrimoireLab with success and we’re now implementing it internally ourselves to track the status of our open source projects. Talk about a win-win situation.

Stay tuned to the VMware Open Source Blog for future conference recaps and follow us on Twitter (@vmwopensource).