CHAOSS Weekly Newsletter

Weekly

REMEMBER: PUBLIC COMMENT NOW OPEN ON CHAOSS METRICS VERSION 1

https://chaoss.community/metrics-rc/ 

COMMUNITY

We are excited to announce the two keynotes for CHAOSScon NA in August.

Zaheda Bhorat

@zahedab

Head of Open Source Strategy

Amazon Web Services

and

Jana Gallus

@janagallus

Assistant Professor

UCLA

More information, including titles and the full schedule, will follow very shortly.

METRICS

Have a few minutes during a coffee break? Read and comment in the linked issue on one or two candidate metrics. You can check out the candidate metrics here: https://chaoss.community/metrics-rc/ 

SOFTWARE

Regarding Augur, the team has advanced its ‘first class’ data collection tool for metrics. The API covers all Evolution metrics and then some, and Risk and Value metrics are about a week away. Separating the front end accomplished two important goals:

  1. Making newcomer startup easier
  2. Recognizing that many open source projects want to use the data and present it using existing infrastructure.

Regarding GrimoireLab, the team is improving King Arthur, the job scheduler, to give better visibility into data collection tasks.

EVENTS

CHAOSScon:

August 20, 2019

MEETINGS

Evolution WG Meeting:

July 3, 2019

9:30am US Central

Value WG Meeting:

July 5, 2019

11:00am US Central

D&I WG Meeting:

July 8, 2019

9:30am US Central

CHAOSS Community Call:

July 9, 2019

11:00am US Central

Common WG Meeting:

July 11, 2019

10:00am US Central

Risk WG Meeting:

July 15, 2019

1:00pm US Central

All meetings at:

https://unomaha.zoom.us/j/720431288 

REPOSITORIES

Governance

Metrics

GrimoireLab

Augur

Cregit

Working Groups

Common Metrics WG

Diversity & Inclusion WG

Evolution WG

Risk WG

Value WG

RESOURCES

CHAOSS Community Minutes/Agenda:

https://tinyurl.com/y67nfrrw

Common Minutes/Agenda:

https://tinyurl.com/y43q2oth

D&I Minutes/Agenda/Repository:

https://tinyurl.com/yy3t75ry

Evolution Minutes/Agenda:

https://tinyurl.com/yyk4uaqe

Risk Minutes/Agenda:

https://tinyurl.com/y572r7qv

Value Minutes/Agenda:

https://tinyurl.com/y2bqn66v 

CHAOSS on YouTube:

https://tinyurl.com/yyppumke

CHAOSS on the Web:

https://chaoss.community/ 

CHAOSS on GitHub:

https://github.com/chaoss 

CHAOSS on Community Bridge:

https://tinyurl.com/y6bao886 

CHAOSS Weekly Newsletter

Weekly

PUBLIC COMMENT NOW OPEN ON CHAOSS METRICS VERSION 1

https://chaoss.community/metrics-rc/ 

METRICS

It’s official! CHAOSS Metric release candidates are open for public comment for the next four weeks. You can check out the candidate metrics here:

https://chaoss.community/metrics-rc/ 

If you have comments on any of the metrics, follow the comment link and post your issue there. It’s that simple. See a typo? Want to add content? Need to include a picture? Post it to the issue for that specific metric. A big thanks to everyone for their very hard work in moving the first version of the metrics forward.

SOFTWARE

Regarding GrimoireLab, version 0.2.25 was released last week. This release fixes minor bugs in how data is stored for the Gerrit and Jira backends, and improves how the data retention policy works for identities.

The team also refactored parts of the task scheduler in preparation for work on its new administrative interface.

Regarding Augur, there were a few frontend updates to the new dashboard: minor design adjustments, plus new repo and repo group list pages that call the new API endpoints and list the repos and repo groups they return.

COMMUNITY

CHAOSScon is coming on August 20th. We nearly have the schedule set and we’ll be sharing that with everyone in the very near future.

EVENTS

CHAOSScon:

August 20, 2019

MEETINGS

Common WG Meeting:

June 27, 2019

10:00am US Central

Value WG Meeting:

June 28, 2019

11:00am US Central

D&I WG Meeting:

July 1, 2019

9:30am US Central

Risk WG Meeting:

July 1, 2019

1:00pm US Central

CHAOSS Community Call:

July 2, 2019

11:00am US Central

Evolution WG Meeting:

July 3, 2019

9:30am US Central

All meetings at:

https://unomaha.zoom.us/j/720431288 

CHAOSS Weekly Newsletter

Weekly

SOFTWARE

Regarding Augur, the team ran into many small bugs while running the GitHub worker and encountering rare cases. They updated the worker to handle all of these cases, and it is now running nearly perfectly smoothly. They have also changed the way they create GitHub API requests to include user authorization. Other changes were made to the worker in order to query issues of all states (open and closed) and all of the available pages (by default, the GitHub API only returns the first page of issues).
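As a rough illustration of the worker changes described above, here is a minimal sketch of authorized, paginated issue collection. It is not Augur's code: the function names, the `get_json` hook, and the stop-on-empty-page strategy are assumptions made for this example. The `state`, `per_page`, and `page` query parameters and the `Authorization` header are part of the public GitHub REST API.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def issues_url(owner, repo, page, per_page=100):
    """Build the issues URL; state=all asks for open and closed issues."""
    query = urllib.parse.urlencode(
        {"state": "all", "per_page": per_page, "page": page}
    )
    return f"{API_ROOT}/repos/{owner}/{repo}/issues?{query}"


def http_get_json(url, token):
    """Fetch one page, sending the user's token (hypothetical helper)."""
    request = urllib.request.Request(
        url,
        headers={
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_issues(owner, repo, token, get_json=http_get_json):
    """Walk pages 1, 2, 3, ... until the API returns an empty page."""
    issues = []
    page = 1
    while True:
        batch = get_json(issues_url(owner, repo, page), token)
        if not batch:
            break
        issues.extend(batch)
        page += 1
    return issues
```

Passing `get_json` as a parameter keeps the paging logic testable without network access; a real worker would also need to handle rate limits and errors, which this sketch leaves out.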

COMMUNITY

Did you know that CHAOSS has a YouTube channel?

https://tinyurl.com/yyppumke

We record and provide transcriptions of all of our WG meetings. If you miss a meeting or would simply like to see what’s going on in the meetings, check out the videos. Also worth mentioning: all of these WG meetings are OPEN TO EVERYONE. If you have an interest in Diversity & Inclusion, Risk, Value, Evolution, or Common metrics, join in! Connection details are to the right.

METRICS

This should be the big push for WGs to release their candidate metrics in advance of CHAOSS Metrics Version 1. The timeline is:

  • Candidate metrics available for comment by June 21
  • Comment period closes July 26
  • CHAOSS Metrics Version 1 posted August 9

For the WGs, here is the regular reminder to check out the releasing.md file at:

https://tinyurl.com/y3r86q3w 

EVENTS

CHAOSScon:

August 20, 2019

MEETINGS

Evolution WG Meeting:

June 19, 2019

9:30am US Central

Common WG Meeting:

June 20, 2019

10:00am US Central

Value WG Meeting:

June 21, 2019

11:00am US Central

D&I WG Meeting:

June 24, 2019

9:30am US Central

Risk WG Meeting:

July 1, 2019

1:00pm US Central

CHAOSS Community Call:

June 25, 2019

11:00am US Central

All meetings at:

https://unomaha.zoom.us/j/720431288 

CHAOSS Weekly Newsletter

Weekly

COMMUNITY

Things are rolling with this year’s Google Summer of Code. Make sure to check out (and comment on!) the work that the students are doing. You can track what they are up to on their blogs:

Submissions are now closed for CHAOSScon. We’ll have more to come shortly regarding the schedule and keynote. Just letting everyone know that the event is currently at capacity (100 people). There is a waitlist that is being managed by the Linux Foundation. If you have questions on this, email Matt Germonprez at germonprez@gmail.com.

CHAOSScon will be on August 20, co-located with the Open Source Summit North America in San Diego, CA. We’re really looking forward to seeing everyone there.

METRICS

The Diversity & Inclusion WG has honed in on specific metrics in advance of CHAOSS Metrics Version 1. For example, three ‘ready for release’ metrics are associated with Event Diversity:

For folks in the other WGs, these metrics are in line with the releasing.md file at:

https://tinyurl.com/y3r86q3w 

The Evolution WG is experimenting with how metric detail pages can show where to find the metric in CHAOSS software. See pull request wg-evolution/163.

Finally, WGs should note that the timeline associated with CHAOSS Metrics Version 1 is:

  • Candidate metrics available for comment by June 21
  • Comment period closes July 26
  • CHAOSS Metrics Version 1 posted August 9

SOFTWARE

With respect to GrimoireLab, version 0.2.24 was released during the last week. This version fixes minor errors related to the mapping in Elasticsearch for Bugzilla and Slack data sources. It also improves the readability of log messages.

With respect to Augur, the team is off to the races, landing many planes with respect to tooling and metrics.

EVENTS

CHAOSScon:

August 20, 2019

MEETINGS

Common WG Meeting:

June 13, 2019

10:00am US Central

Value WG Meeting:

June 14, 2019

11:00am US Central

D&I WG Meeting:

June 17, 2019

9:30am US Central

Risk WG Meeting:

June 17, 2019

1:00pm US Central

CHAOSS Community Call:

June 18, 2019

11:00am US Central

Evolution WG Meeting:

June 19, 2019

9:30am US Central

All meetings at:

https://unomaha.zoom.us/j/720431288 

CHAOSS Weekly Newsletter

Weekly

CHAOSScon Submission Deadline Extended to June 7th

SOFTWARE

With respect to GrimoireLab, version 0.2.22 was launched last week. This release is the first one that officially includes Graal, a generic repository analyzer. This is the first step towards making the platform handle the data produced by this tool.

Graal fetches commits from Git repositories and provides mechanisms to plug in third-party tools or libraries focused on source code analysis. The latest release can gather data about code complexity, code quality, class dependencies, security vulnerabilities, and licenses.

Other features included in 0.2.22 are repository labels and identity blacklisting. Repository labels allow users to tag or categorize repositories with custom keywords using the project file. Identities can be blacklisted using the GrimoireLab identities file (until now this was only possible using SortingHat).

This release also includes minor bug fixes and some improvements in the documentation.

With respect to Augur, the team continued development on API endpoints for its new schema, and completed development of its worker for pulling data from GitHub.

With respect to cregit, the research publication showing that cregit brings a significant improvement over blame-per-line is available online:

https://tinyurl.com/yyrbrzh4. A copy of the article (for those without a subscription) is available here:

http://github.com/dmgerman/papers

Additionally, they are rolling out a new interface for directories, to complement the source code view. With this view, it is possible to summarize any attribute of the source code (while keeping a notion of its size).

https://cregit.linuxsources.org/code/5.1/

COMMUNITY

DEADLINE EXTENDED TO JUNE 7, 2019

CHAOSScon will be on August 20, co-located with the Open Source Summit North America in San Diego, CA. The CHAOSScon Call for Proposals is open. Submit your talk proposals now! We’re really looking forward to seeing everyone there.

Congratulations on the successful running of SoHeal 2019 (The International Workshop on Software Health) in Montreal. A number of CHAOSS members participated in the conference. Thanks to the organizing committee and all attendees. You can check out the program with slides and recordings here:

https://soheal.github.io/program.html 

METRICS

Working groups are advancing metrics for the first CHAOSS release ahead of OSSNA. Many of the working groups are doing hack-a-thons over the next few weeks. Follow along by checking out the tracking spreadsheet here: https://tinyurl.com/yxpj6kv4

Of particular interest, the Common WG is exploring metrics associated with organizational affiliation, geography, and responsiveness. Interested in these areas? Join their call this week on June 6th at 10am US Central.

For the working groups… don’t forget → look at the releasing.md file to help with framing the requirements for metrics release at:

https://tinyurl.com/y3r86q3w 

EVENTS

CHAOSScon CFP closes:

June 7, 2019

SUBMIT NOW!!

CHAOSScon:

August 20, 2019

MEETINGS

Evolution WG Meeting:

June 5, 2019

9:30am US Central

Common WG Meeting:

June 6, 2019

10:00am US Central

Value WG Meeting:

June 7, 2019

11:00am US Central

D&I WG Meeting:

June 10, 2019

9:30am US Central

CHAOSS Community Call:

June 11, 2019

11:00am US Central

Risk WG Meeting:

June 17, 2019

1:00pm US Central

All meetings at:

https://unomaha.zoom.us/j/720431288 

CHAOSS Weekly Newsletter May 28th, 2019

 Weekly Newsletter


COMMUNITY

CHAOSS is participating in the 2019 Grace Hopper Open Source Day (https://ghc.anitab.org/tag/osd/). Sean and Carter from the Augur team will lead a session on October 3, 2019. This is a great opportunity for CHAOSS to be involved in this fantastic event!

The next CHAOSScon will be on August 20, co-located with the Open Source Summit North America in San Diego, CA. The CHAOSScon Call for Proposals is open. Submit your talk proposals now!

Thanks to everyone who is helping to coordinate this event; it certainly couldn’t be done without you. We look forward to seeing everyone there.

METRICS

Over the next few weeks, the working groups are advancing metrics for the first CHAOSS release ahead of OSSNA. Of interest, the D&I working group is going to be doing a metrics hack-a-thon on June 3 and June 10 (9:30am US Central) to make their push for a candidate release by mid-June. If you want to follow along, check out the tracking spreadsheet here:

https://tinyurl.com/yxpj6kv4

Don’t forget → look at the releasing.md file to help with framing the requirements for metrics release at:

https://tinyurl.com/y3r86q3w 

EVENTS

CHAOSScon CFP closes:

May 31, 2019

SUBMIT NOW!!

CHAOSScon:

August 20, 2019

MEETINGS

Evolution WG Meeting:

May 29, 2019

9:30am US Central

Common WG Meeting:

May 30, 2019

10:00am US Central

Value WG Meeting:

May 31, 2019

11:00am US Central

D&I WG Meeting:

June 3, 2019

9:30am US Central

Risk WG Meeting:

June 3, 2019

1:00pm US Central

CHAOSS Community Call:

June 4, 2019

11:00am US Central

All meetings at:

https://unomaha.zoom.us/j/720431288 

SOFTWARE

With respect to Augur, the team launched a version using its new `Broker Architecture` for data collection. The new broker architecture includes a new database schema that models the entirety of the open-source ecosystem in a platform-agnostic way. Augur will continue to support existing `GHTorrent` API endpoints and user interfaces for backward compatibility.

The Augur team created three workers in the new architecture last week. The first retrieves and stores badge information from the Linux Foundation’s Badging API endpoint. The second is for the GitHub API: given a GitHub URL, it can retrieve and store information about contributors and issues. The data stored for issues, issue comments, and contributors is noted in the new Augur schema: https://tinyurl.com/y3sbqjtl

The third worker is the Facade worker. Here, the team substantially refactored Brian Warner’s Facade into component parts responsible for discrete tasks (8 components in total). The Facade worker also now puts data into the Augur schema. The refactored and updated version of Facade was tested against a set of 100 open source repositories, and the “worker” version produces analysis identical to the original version. Next up will be the release of a “TimeStamped” commit version of Facade.

Finally, the team created new broker and housekeeper components last week. The broker can now receive tasks, manage workers’ queues, and hand out tasks to suitable workers. The housekeeper can now routinely give the broker a task that is intended to be handed out to workers on a repeated schedule, and it can run multiple synchronous schedules.
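To make the division of labor between broker and housekeeper easier to picture, here is a minimal in-memory sketch. It is not Augur's implementation: the class layout, method names, task dictionaries, and worker names are all assumptions for illustration, and a real system would run these as separate processes with persistent queues.

```python
from collections import deque


class Broker:
    """Keeps one queue per worker and routes tasks by task type."""

    def __init__(self):
        self.queues = {}   # worker name -> deque of pending tasks
        self.handles = {}  # task type -> worker name

    def register_worker(self, name, task_types):
        self.queues[name] = deque()
        for task_type in task_types:
            self.handles[task_type] = name

    def submit(self, task):
        """Place a task on the queue of the worker that handles its type."""
        worker = self.handles.get(task["type"])
        if worker is None:
            raise ValueError(f"no worker handles {task['type']!r}")
        self.queues[worker].append(task)
        return worker

    def next_task(self, name):
        """Called by a worker to pull its next task, or None if idle."""
        queue = self.queues[name]
        return queue.popleft() if queue else None


class Housekeeper:
    """Re-submits recurring tasks to the broker on fixed intervals."""

    def __init__(self, broker):
        self.broker = broker
        self.schedules = []  # each entry: [interval, next_due, task]

    def schedule(self, task, every_seconds, start=0):
        self.schedules.append([every_seconds, start, task])

    def tick(self, now):
        """Submit every task whose schedule is due at time `now`."""
        submitted = 0
        for entry in self.schedules:
            interval, next_due, task = entry
            if now >= next_due:
                self.broker.submit(dict(task))
                entry[1] = now + interval
                submitted += 1
        return submitted
```

In this sketch the housekeeper never talks to workers directly; it only hands recurring tasks to the broker, which decides which queue they belong on.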

With respect to GrimoireLab, Perceval had three bugs fixed, for Slack, GitHub, and GitLab. The first bug was due to some Slack messages having timestamps with more than 6 decimals. The second bug affected the list of reviewers returned for GitHub Enterprise instances, and it was fixed by an external contribution. The last bug came from a recent change in the GitLab API, which for efficiency reasons doesn't include the attribute `last` in the pagination responses for endpoints (e.g., for issues and merge requests) with more than 10,000 results.
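One way a client can cope with a missing `last` link is to follow `next` links until none remains, rather than computing the page count up front. The sketch below illustrates that idea only; it is not Perceval's code, and `iter_pages` and its `fetch` callable are hypothetical names. It assumes the standard `Link` response header format (`<url>; rel="next"`).

```python
import re

LINK_PATTERN = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def parse_link_header(header):
    """Return a dict mapping rel names (next, first, last, ...) to URLs."""
    return {rel: url for url, rel in LINK_PATTERN.findall(header or "")}


def iter_pages(first_url, fetch):
    """Yield page bodies, following rel="next" until it disappears.

    `fetch` is a hypothetical callable: url -> (body, link_header).
    """
    url = first_url
    while url:
        body, link_header = fetch(url)
        yield body
        url = parse_link_header(link_header).get("next")
```

Because the loop only ever looks at `next`, it behaves the same whether or not the server reports a `last` page.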

In ELK, a bug that caused identities not to refresh in the Gerrit enricher was fixed, and the Jira and GitLab enrichers were improved. The former can now store data coming from more than one Jira server in the same index, while the latter includes milestone start and due dates.

Finally, Sigils updated panels such as the overview and the ones for GitHub and Jira to use the field `grimoire_creation_date` as the datetime field. The GitLab index pattern was also updated to handle the new information about milestones.

GrimoireLab – Graal

Welcome Graal

Currently, GrimoireLab can produce analytics with data extracted from more than 30 tools related to contributing to Open Source development, such as version control systems, issue trackers and forums. Despite the large set of metrics available in GrimoireLab, none of them relies on information extracted from source code, which keeps end users from benefiting from a wider spectrum of software development data.

Graal is a tool for conducting customizable and incremental analysis of source code by leveraging existing tools. It enhances Perceval (one of GrimoireLab’s key components) and produces output similar to Perceval’s to ease integration with GrimoireLab, thus complementing the analytics offered by the latter with source-code-related metrics.

Once installed, Graal can be used as a stand-alone program or Python library.

Backends

Several backends have already been developed. They leverage source code analysis tools, whose executions are triggered via system calls or their Python interfaces.

Currently, the backends mostly target Python code; however, other backends can easily be developed to cover other programming languages.

The available backends are:

  • CoCom gathers data about code complexity (e.g., cyclomatic complexity, LOC) from projects written in popular programming languages such as C/C++, Java, Scala, JavaScript, Ruby and Python. It leverages Cloc and Lizard; the former is a Linux package used to count blank lines, comment lines and LOC, while the latter is a code complexity analyzer written in Python.

  • CoDep extracts package and class dependencies of a Python module and serializes them as JSON structures, composed of edges and nodes, thus easing the bridging with front-end technologies for graph visualizations. It combines PyReverse, a reverse engineering tool able to generate UML-like diagrams, with NetworkX, a library to create, manipulate and study complex networks.

  • CoQua retrieves code quality insights, such as checks on line length, well-formed variable names, unused imported modules and code clones. It uses PyLint, a code, bug and quality checker for Python.

  • CoVuln scans the code to identify security vulnerabilities such as potential SQL and Shell injections, hard-coded passwords and weak cryptographic key size. It relies on Bandit, a tool designed to find common security issues in Python code.

  • CoLic scans the code to extract license information. It currently supports Nomos and ScanCode.

Further reading

More details about how Graal works can be found at:

Contributing to the GMD Working Group

How we produce metrics in the GMD Working Group

The GMD Working Group
is one of the CHAOSS working groups, tasked with defining
useful metrics relevant for the analysis of
software development projects from the point of view of
GMD (growth-maturity-decline). It also works
in the areas of risk and value. For all of them, we
intend to follow the same process to produce metrics,
similar to what other CHAOSS working groups are doing.
This post describes this process, which we have recently
completed for the first metric (many others should follow
during the next weeks).

The process is top down, starting with the definition of
the focus areas of interest. For each focus area,
we define the goals we intend to reach, and then
we follow GQM
(goal-question-metric) to first derive questions which,
when answered, should help to reach our goals, and then
metrics that help to answer those questions.
Finally, we explore how those metrics could be implemented,
and produce reference implementations for specific data sources.
Throughout the process we take into account
use cases,
which illustrate how metrics are used in the real world.

Currently, the working group is dealing with five
focus areas:
code development, community growth, issue resolution, risk, and value.
We believe that all of them are relevant to improving our
knowledge of FOSS (free, open source software) projects.

Goals for the code development focus area

For now, the most complete of these focus areas
is code development,
for which we have identified some goals: activity,
efficiency and quality. For each of them we're in the process
of identifying questions. For example, for activity
we have identified the question "How many changes are happening to the code base, during a certain time period?",
code-named Changes, which we think should help to learn about
the activity of a project.

To help answer this question,
we have identified some metrics, such as
"Number of changes to the code base", code-named
Code_Changes_No(Period), which tries to capture
how many changes to the source code were made during the
period of interest.
We explain this in detail in the
definition of the Code_Changes metric.
This definition tries to be neutral with respect to the
specific data source (in this case a source code management
repository, such as git, Subversion, or Mercurial),
but also includes specific sections for specific data sources.

Python notebook with Code_Changes implementation for git

Finally, to clarify the metric, and provide a definition which
is not ambiguous and can be checked for conformance,
we also provide an implementation of it for a certain data source.
In our case, we implemented
it for git, as a Python notebook
(check it in Binder).
It includes documentation on the details of the implementation,
and an actual implementation of the metric as a Python class.
It also includes examples of how to use it with real repositories,
and an exploration of some details of the specific data source,
relevant for implementations and comparison between different implementations.
Reference implementations are based on
Perceval output, which is a collection of JSON documents,
one per item obtained from the data source, as similar
as possible to the data produced by the data source.
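As a rough sketch of what such an implementation boils down to (this is not the notebook's code), the function below counts commits in a period from Perceval-style items. It assumes each item keeps the commit under a `data` key with an `AuthorDate` string in git's default layout, and the function name simply mirrors the metric's code name; merge commits, empty commits, and file filters are ignored here.

```python
from datetime import datetime, timezone

# Assumed Perceval git date layout, e.g. "Tue Aug 14 14:30:13 2012 -0300".
GIT_DATE_FORMAT = "%a %b %d %H:%M:%S %Y %z"


def code_changes_no(items, since, until, date_field="AuthorDate"):
    """Count commits whose date falls in the half-open period [since, until)."""
    count = 0
    for item in items:
        when = datetime.strptime(item["data"][date_field], GIT_DATE_FORMAT)
        if since <= when < until:
            count += 1
    return count


# Usage: commits authored during 2012, with timezone-aware period bounds.
period_start = datetime(2012, 1, 1, tzinfo=timezone.utc)
period_end = datetime(2013, 1, 1, tzinfo=timezone.utc)
```

Choices such as author date versus commit date, or whether the end of the period is inclusive, are exactly the kind of details a reference implementation has to pin down.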

When producing the first reference implementation for
Code_Changes, we completed the first full process,
from focus area and goal to questions and metrics.
Now, we intend to complete this process for the rest
of the goals in all our focus areas. Do you want to join us on this journey?
If so, you are welcome! We are ready to review your pull requests,
and work with you towards having useful definitions and
implementations of metrics that help us all to better understand
FOSS projects.

Metrics With Greater Utility: The Community Manager Use Case

By Sean Goggins 

Introduction

Community managers take a variety of perspectives, depending on where their communities are in the lifecycle of growth, maturity, and decline. This is an evolving report of what we are learning from community managers, some of whom we are working with on live experiments with a CHAOSS project prototyping software tool called Augur (http://www.github.com/CHAOSS/augur). At this point, we are paying particular attention to how community managers consume metrics and how the presentation of open source software health and sustainability metrics could make them more, and in some cases less, useful for doing their jobs.

Based on Augur prototypes and follow-up discussions so far, we have the following observations that will inform our work in both the “Growth Maturity and Decline” working group and Augur development. These are the Augur features that community managers particularly value:

  1. Allowing comparisons with projects within a defined universe of projects is essential
  2. Allow community managers to periodically add and remove repositories that they monitor from their repertoires
  3. Downloadable graphics
  4. Downloadable data (.csv or .json)
  5. Availability of a “Metrics API”, limiting the amount of software infrastructure the community manager needs to maintain for themselves. This is more valued by program managers overseeing larger portfolios right now, but we think it has potential to grow as awareness of the relatively light weight of this approach becomes more apparent. By apparent, we really mean “easy to use and understand”; right now it is easy for a programmer, but less so for a community manager without this background or current interest.

Date Summarized Comparison Metrics

With these advantages in mind, making the most of this opportunity to help community managers with useful metrics is going to include the availability of date summarized comparison metrics. These types of metrics have two “filters” or “parameters” fed into them that are more abstractly defined in the Growth, Maturity, and Decline metrics on the CHAOSS project.

  1. Given a pool of repositories of interest for a community manager, rank them in ascending or descending order by a metric
  2. Over a specified time period or
  3. Over a specified periodicity (e.g., month) for a length of time (e.g., year).

For example, one open source program officer we talked with is interested in the following set of date summarized comparison metrics. Given a pool of repositories of interest to the program officer (dozens to hundreds of repositories):

  1. What ten repositories have the most commits this year (straight commits, and lines of code)?
  2. How many new projects were launched this year?
  3. What are the top ten new repositories in terms of commits this year (straight commits, and lines of code)?
  4. How many commits and lines of code were contributed by outside contributors this calendar year? Organizationally sponsored contributors?
  5. What organizations are the top five external contributors of commits, comments, and merges?
  6. What is the total number of repository watchers we have across all of our projects?
  7. Which repositories have the most stars? Of the ones new this year? Of all the projects? Which projects have the most new stars this year?
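A date summarized comparison metric such as the first question above can be sketched in a few lines. This is only an illustration, not Augur's API: the input shape (a list of `(repository, date)` pairs, one per commit) and both function names are assumptions made for the example.

```python
from collections import Counter
from datetime import date


def rank_by_commits(commits, start, end, top=10):
    """Rank repositories by commit count within [start, end), descending."""
    counts = Counter(
        repo for repo, day in commits if start <= day < end
    )
    return counts.most_common(top)


def commits_per_month(commits, repo, year):
    """Summarize one repository's commits by month for a given year."""
    per_month = Counter(
        day.month for name, day in commits
        if name == repo and day.year == year
    )
    return [per_month.get(month, 0) for month in range(1, 13)]
```

The first function applies the "specified time period" filter and the second the "periodicity over a length of time" filter; swapping commit counts for lines of code, stars, or watchers would follow the same pattern.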

Open Ended Community Manager Questions to Support with Metrics

There are other, more open ended questions that may be useful to open source community managers:

  1. Is a repository active?
    1. Visual differentiation that examines issue and commit data
    2. Activity in the past 30 days
    3. Across all repositories, present the 50th percentile as a baseline and show repositories above and below that line
  2. Should we archive this repository?
    1. Enable an input from the manager after reviewing statistics
    2. Activity level, inactivity level and dependencies
    3. Mean/Median/Mode histogram for commits/repo
  3. Should we feature this repository in our top 10? (Probably a subjective decision based on some kind of composite scoring system that is likely specific to the needs of every community manager or program office.)
  4. Who are our top authors? (Some kind of aggregated contribution ranking by time period [year, month, week, day?]. Nominally, I have a concern about these kinds of metrics being “gameable”, but if they are not visible to contributors themselves, there is less “gaming” opportunity.)
  5. What are our top repositories? (Probably a subjective decision based on some kind of composite scoring system that is likely specific to the needs of every community manager or program office.)
  6. Most active repositories by time period [year, month, week?]. Activity to be revealed through a mix of retention and maintainer activity primarily focusing on the latter. Number of issues and commits. Also the frequency of pull requests and the number of closed issues.
  7. Least active repositories by time period [Week? Month? Year?]. Bottom of scores calculated, as above.
  8. Who is our most active contributor? (Some kind of aggregated contribution ranking by time period [year, month, week, day?]. Nominally, I have a concern about these kinds of metrics being “gameable”, but if they are not visible to contributors themselves, there is less “gaming” opportunity.)
  9. What new contributors submitted their first new patches/issues this week? (Visualization Note: New contributors can be colored in visualizations and then additionally a graph can be made for number of new contributors)
  10. Which contributors became inactive? (Will need a mechanism for setting “inactive” thresholds.)
  11. Baseline level for the “average” repository in an organization and for each, individual organization repository.
  12. What projects outside of a community manager’s general view (GitHub organization or other boundary) do my repositories depend on or do my contributors also significantly contribute to?
  13. Build a summary report in 140 characters or less. For example, “Your total commits in this time period [month, week?] across the organization increased 12% over the last period. Your most active repositories remained the same. You have 8 new contributors, which is 1 below your mean for the past year. For more information, click here.”
  14. Once a metrics baseline is established, what can be done to move them? [^1]
  15. Are there optimal measures for some metrics?
    1. Pull request size?
    2. Ratio of maintainers to contributors?
    3. New contributor to consistent contributor ratio?
    4. New contributor to maintainer ratio?
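
The 50th percentile baseline mentioned in the use cases above can be sketched in a few lines. This is only an illustration, not Augur's implementation: the function name `split_by_baseline` and the repository names and commit counts are hypothetical.

```python
from statistics import median


def split_by_baseline(commits_per_repo):
    """Return (baseline, above, below), using the median commit count as the baseline.

    Repositories exactly at the baseline appear in neither list.
    """
    baseline = median(commits_per_repo.values())
    above = sorted(repo for repo, n in commits_per_repo.items() if n > baseline)
    below = sorted(repo for repo, n in commits_per_repo.items() if n < baseline)
    return baseline, above, below


# Hypothetical commit counts for the past 30 days (illustration only).
commits_last_30_days = {
    "repo-a": 120,
    "repo-b": 4,
    "repo-c": 35,
    "repo-d": 0,
    "repo-e": 58,
}

baseline, above, below = split_by_baseline(commits_last_30_days)
print(f"baseline={baseline} above={above} below={below}")
```

The median is used rather than the mean so that one very active repository does not pull the baseline away from what a typical repository in the organization looks like.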

Augur Specific Design Change Recommendations

Next is a list of Augur specific design changes suggested thus far, based on conversations with community managers.

  1. Showing all of the projects in a GitHub organization in a dashboard by default is generally useful.
  2. Make the lines in the charts clearer, especially when multiple lines are being compared.
  3. How to zoom in and out is not intuitive. Google Finance, for example, displayed a default subset period when it used the “below the line mirrored line” interface that this is modeled after. That older model makes it fairly clear that the box below the line in Google Finance is for adjusting the range of dates. Alternatively, Google’s more recent way of representing time, giving users choices, and showing comparisons may be even more useful and engaging. In general, it’s important that time zooming be clearer.
    Figure 1: In one view, Google lets you see a 1 year window of a stock’s performance.

     

    Figure 2: In another view, you can choose a 3 month period. Comparing the two time periods also draws out the trend with red or green colors, depending on whether or not the index, in this case a stock’s price, has increased or decreased overall during the selected time period.

     

    Figure 3: Comparisons are similarly interesting in Google’s finance interface. You can simply add a number of stocks in much the same way our users want to add a number of different repositories.

  4. For the projects a community manager chooses to follow, go ahead and give them comparison check-boxes at the top of the page. I think from a design point of view, we should limit comparisons as discussed, to 7 or 8, simply due to the limits in human visual perception.
  5. The ability to adjust the viewing windows to a month summary level is desired.
  6. Right now, Augur does not make it clear that metrics are, by default, aggregated by week.
  7. New contributor response time. When a new contributor joins a project, what is the response time for their contribution?
  8. A graph **comparing** commits and commit comments on x and y axes **between projects** is desired. Same with Issue and Issue comments.
  9. In general, the last two years of data gets the most use. We should focus our default display on this range.
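
One possible reading of the “new contributor response time” item above is the delay between a contributor's very first contribution and the first response it receives. The sketch below assumes a hypothetical record shape (`author`, `opened_at`, `first_response_at`) and made-up sample data. It does not reflect Augur's data model.

```python
from datetime import datetime, timedelta


def first_contribution_response_times(contributions):
    """Map each author to the response time on their earliest contribution.

    Authors whose first contribution has not been answered yet map to None.
    """
    first = {}
    for item in contributions:
        seen = first.get(item["author"])
        if seen is None or item["opened_at"] < seen["opened_at"]:
            first[item["author"]] = item

    result = {}
    for author, item in first.items():
        responded = item["first_response_at"]
        result[author] = None if responded is None else responded - item["opened_at"]
    return result


# Hypothetical issue / pull request records (illustration only).
contributions = [
    {"author": "ana", "opened_at": datetime(2018, 9, 3, 9, 0),
     "first_response_at": datetime(2018, 9, 3, 15, 30)},
    {"author": "ana", "opened_at": datetime(2018, 9, 10, 9, 0),
     "first_response_at": datetime(2018, 9, 10, 10, 0)},
    {"author": "raj", "opened_at": datetime(2018, 9, 5, 12, 0),
     "first_response_at": None},
]

times = first_contribution_response_times(contributions)
for author, delay in sorted(times.items()):
    print(author, delay)
```

Keeping unanswered first contributions as `None`, rather than dropping them, makes it possible to report them separately, since they are arguably the most important cases for a community manager to see.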

Data Source Trust Issues

  1. Greater transparency of metrics data origins will be helpful for understanding discrepancies between current understanding and what metrics show.
    1. We should include some detailed notes from Brian Warner about how Facade is counting lines of code, and possibly some instrumentation to enable those counts to be altered by user provided parameters.
    2. Outside contributor organization data. One community manager reported that their lines of code by organization data looked wrong. I explained that these are mapped from a list of companies and emails we put together, and that getting this right will require some kind of mapping tool for community managers. GitDM is a tool that people sometimes use to create these maps, and Augur follows a derivative of that work. Maintaining these affiliation lists probably needs to be made easier for community managers, especially where the organizations contributing to a project are diverse. (There is a substantial range among the community managers we spoke with: some manage complex ecosystems involving mostly outside contributors, most are in the middle, and some have contributor lists highly skewed toward their own organization.)
  2. GHTorrent data, while excellent for prototyping, faces some limitations under the scrutiny of community managers. For example, when using the cloned repositories and then going back to *issues*, the issues data in GHTorrent does not “look right”. I think the graph API might offer some possibilities for us to store issue statistics we pull directly from GitHub and update periodically as an alternative to GHTorrent.
  3. When issues are moved from an older system, like Gerrit, into GitHub issues, in general, the statistics for the converted issues are dodgy, even through the GitHub API. We are likely to encounter this, and at some point may want to include Gerrit data in a common data structure with issues from GitHub and other sources.
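
To make the affiliation mapping problem above concrete, here is a minimal sketch of resolving author emails to organizations and totaling lines of code by organization. It does not use GitDM's file format or Augur's actual code. The function names, the domain map, the per-email overrides, and the sample commits are all assumptions made for illustration.

```python
from collections import Counter


def affiliation(email, domain_map, overrides=None):
    """Resolve an author email to an organization name, or 'Unknown'.

    Per-email overrides win over the domain lookup, which covers
    contributors who commit from a personal address.
    """
    email = email.strip().lower()
    if overrides and email in overrides:
        return overrides[email]
    domain = email.rpartition("@")[2]
    return domain_map.get(domain, "Unknown")


def lines_by_organization(commits, domain_map, overrides=None):
    """Sum lines added per organization from (email, lines_added) pairs."""
    totals = Counter()
    for email, lines_added in commits:
        totals[affiliation(email, domain_map, overrides)] += lines_added
    return dict(totals)


# Hypothetical mapping data and commits (illustration only).
domain_map = {"example.com": "Example Corp", "example.org": "Example Foundation"}
overrides = {"dev@personal.example": "Example Corp"}
commits = [
    ("a@example.com", 100),
    ("Dev@personal.example", 40),
    ("b@example.org", 10),
    ("c@elsewhere.example", 5),
]

totals = lines_by_organization(commits, domain_map, overrides)
print(totals)
```

Reporting an explicit "Unknown" bucket, instead of silently dropping unmapped emails, gives community managers a direct view of how much of their data still needs an affiliation entry.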

New Metrics Suggested

  1. Add metric “number of clones”
  2. “Unique visitors” to a repository is a data point available from the GitHub API which is interesting.
  3. Include a metric that is a comparison of the ratio of new committers and total committers in a time period. Or, perhaps simply those two metrics in alignment. Seeing the number of new committers in a set of repositories can be a useful indication of momentum in one direction or another; though I hasten to add that this is not canonically the case.
  4. Some kind of representation of the ratio between commits and lines of code per commit
  5. Test coverage within a repository is something to consider measuring for safety critical systems software.
  6. Identifying the relationship between the DCO (Developer Certificate of Origin) and the CLA (Contributor License Agreement).
  7. There is a tension between risk and value that, as our metrics develop in those areas, we are well advised to keep in mind.
  8. The work that Matt Snell and Matt Germonprez at the University of Nebraska-Omaha are starting related to risk metrics is of great interest. Getting these metrics into Augur is something we should plan for as soon as reasonably possible.
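
The ratio of new committers to total committers suggested above could be computed along these lines. This is a sketch under stated assumptions: a “new” committer is taken to be someone whose earliest known commit falls inside the period, and the function name and sample commit history are hypothetical.

```python
from datetime import date


def new_committer_ratio(commits, start, end):
    """Return (new_count, total_count, ratio) for the period [start, end].

    `commits` is an iterable of (author, commit_date) pairs covering the
    full known history, so earlier activity can be detected.
    """
    commits = list(commits)
    first_seen = {}
    for author, day in commits:
        if author not in first_seen or day < first_seen[author]:
            first_seen[author] = day

    active = {author for author, day in commits if start <= day <= end}
    new = {author for author in active if first_seen[author] >= start}
    ratio = len(new) / len(active) if active else 0.0
    return len(new), len(active), ratio


# Hypothetical commit history (illustration only).
history = [
    ("ana", date(2018, 7, 2)),
    ("ana", date(2018, 9, 4)),
    ("raj", date(2018, 9, 10)),
    ("li", date(2018, 9, 12)),
    ("li", date(2018, 9, 20)),
    ("sam", date(2018, 6, 1)),
]

new_count, total_count, ratio = new_committer_ratio(
    history, date(2018, 9, 1), date(2018, 9, 30)
)
print(f"{new_count} new of {total_count} committers ({ratio:.0%})")
```

Returning both counts alongside the ratio supports the alternative mentioned above of simply showing the two metrics in alignment.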

Design Possibilities

Augur

For Augur, I think the interface changes that enable comparisons and make it more self-evident how to compress or expand the time range, as per the Google examples, are at the top of the list of things that will make Augur more useful for community managers. Feedback on these notes will be helpful. I think the new committers to committers ratio is important, as is enabling comparisons across projects in the bubble graphs. Transparency of data sources and limitations of data sources for both the API and the front end, which are above average but not complete, are important.

Growth Maturity and Decline Working Group

Many of the metrics of interest to community managers fall under the “Growth Maturity and Decline” working group. From a design perspective it appears that, possibly, the way that metrics are expressed and consumed by these stakeholders in their individual derivatives of the community manager use case is quite far removed from the detailed definition work occurring around specific metrics. Discussion around an example implementation like Augur is helping draw out some of this more “zoomed out” feedback. The design of system interfaces frequently includes the need to navigate between granular details and the overall user experience [@zemel_what_2007; @barab_our_2007]. This is less of a focus in the development of software engineering metrics, though recent research is beginning to illustrate the criticality of visual design for interpreting analytic information [@gonzalez-torres_knowledge_2016].

References

  • Barab, S, T Dodge, MK Thomas, C Jackson, and H Tuzun. 2007. Our designs and the social agendas they carry. Journal of the Learning Sciences 16 (2): 263-305.
  • Gonzalez-Torres, Antonio, Francisco J. Garcia-Penalvo, Roberto Theron-Sanchez, and Ricardo Colomo-Palacios. 2016. Knowledge discovery in software teams by means of evolutionary visual software analytics. Science of Computer Programming 121: 55-74. doi:10.1016/j.scico.2015.09.005. https://linkinghub.elsevier.com/retrieve/pii/S0167642315002658.
  • Zemel, Alan, Timothy Koschmann, Curtis LeBaron, and Paul Feltovich. 2007. What are we Missing? Usability’s Indexical Ground. Computer Supported Cooperative Work.

Acknowledgements

Many members of the CHAOSS community contributed to this report and analysis. I am happy to share names with permission from the contributors, but I have not requested permission as of the publication date.

[^1]: Once we are to this point, I think CHAOSS is kicking butt and taking names.

A PDF Version of this Post is Available Here.

New GrimoireLab release: 18.09-02

By | News

We have a new release of GrimoireLab, 18.09-02, corresponding to grimoirelab-0.1.2 (the main Python package).

This release includes full support for Mattermost and GoogleHits, some improvements in the Kibiter UI and panels, some bug fixes, and minor new features.

The corresponding packages have been uploaded to pypi (so they’re installable with pip). I’ve tested most of the examples in the GrimoireLab Tutorial with this new release, and everything seems to work. Please report any problems you find.

As usual, this release of pypi packages was generated with docker containers, to ensure platform independence. You can install all the packages just with:

$ pip install grimoirelab

Remember that we now also have a new grimoirelab package that pulls in all the Python packages for the release. This makes installation easier, and traceability too: to find out the GrimoireLab release, just run

$ grimoirelab -v
GrimoireLab 0.1.2

The tag you get (0.1.2 in this case) corresponds to a certain release file (18.09-02 in this case), and specific commits and Python package versions.

We have also produced four Docker images available in DockerHub, all of them with the tags :18.09-02 and :latest. You can pull and run them straight away:

  • grimoirelab/factory: for creating the Python packages
  • grimoirelab/installed: with GrimoireLab installed
  • grimoirelab/full: grimoirelab/installed plus the services needed to produce a dashboard; by default, it produces a dashboard of the CHAOSS project.
  • grimoirelab/secured: grimoirelab/full plus access control and SSL for access to Kibiter

If you want to use or help to debug the containers, have a look at the docker directory in the chaoss/grimoirelab repository.

The list of new stuff is in the NEWS file (check all changes since 18.08-01, which was the latest release with packages in pypi).