CHAOSS Mentorship Alumni

The CHAOSS Community thanks all mentees and mentors who participated in Google Summer of Code, Google Season of Docs and Outreachy.

List of all Mentees

Student | Program & Year | Project | Mentors
Yash Prakash | GSoC 2021 | Automate Metrics Release and Process Improvement | Kevin Lumbard, Georg Link, Jaskirat Singh
Ritik Malik | GSoC 2021 | Automate Metrics Release and Process Improvement | Kevin Lumbard, Georg Link, Matt Germonprez, Jaskirat Singh
Dhruv Sachdev | GSoC 2021 | Develop a Shared Data Resource Focused on Dependencies, Risks and Vulnerabilities in Open Source Software | Sean Goggins, Vinod Ahuja
Anuj Lamoria | GSoC 2021 | Automatically identify Contributor Aliases | Sean Goggins
Rashmi K A | GSoC 2021 | Sorting Hat - Extend data model and user interface to capture better information about contributors | Venu Vardhan Reddy
Yeming Gu | GSoC 2021 | Yeming Gu's GSoC proposal for CHAOSS | Sean Goggins, Vinod Ahuja
Veerasamy Sevagen | Summer of Open Source Promotion 2021 | Expanding and restyling the GrimoireLab tutorial | Venu Vardhan
Jaskirat Singh | GSoD 2020 | Create a CHAOSS Community-wide Handbook | Georg J.P. Link, Armstrong Foundjem, Matt Germonprez
Xiaoya Xia | GSoD 2020 | Build documentation for CHAOSS D&I Badging project | Matt Snell, Aastha Bist
Ria Gupta | GSoC 2020 | Social Currency Metric System | Valerio Cosentino, Samantha Venia, Logan
Tianyi Zhou | GSoC 2020 | Large Social Network Analysis and Anomaly Detection with Augur | Sean Goggins, Jonahz, Gabe Heim
Sarit Adhikari | GSoC 2020 | Machine Learning for Anomaly Detection in Open Source Communities | Sean Goggins, Carter Landis, Gabe Heim
Abhinav Bajpai | GSoC 2020 | Implementing GitLab Data Collection Worker and Mapper to bind the responses of GitLab API, Github API & the Augur schema | Sean Goggins, Carter Landis
bistaastha | GSoC 2020 | Build Workflow process for CHAOSS D&I Badging Project | Matt Snell
vchrombie | GSoC 2020 | Creating Quality models using GrimoireLab and CHAOSS metrics | Valerio Cosentino, Aniruddha Karajgi
Saicharan Reddy | GSoC 2020 | Implementation of GitLab Data Collection Worker & Test Coverage Improvement | Elita Nelson, Sean Goggins, Jonahz
Akshara P | GSoC 2020 | Machine Learning for Anomaly Detection in Open Source Communities | Elita Nelson, Sean Goggins, Gabe Heim
Pratik Mishra | GSoC 2020 | Machine Learning for Anomaly Detection in Open Source Communities | Elita Nelson, Sean Goggins, Gabe Heim
Ore-Aruwaji Oloruntola | Outreachy 2020 | Build Workflow Process for CHAOSS Diversity & Inclusion Badging | Matt Germonprez, Matt Snell, Saleh Abdel Motaal
Parth Sharma | GSoC 2019 | Build CHAOSS Risk and Growth Maturity and Decline Metrics in Augur | Sean Goggins
Bingwen Ma | GSoC 2019 | Build CHAOSS Risk and Growth Maturity and Decline Metrics in Augur | Sean Goggins, Matt Germonprez
Aniruddha Karajgi | GSoC 2019 | Implementing CHAOSS Metrics with Perceval | Jesus Gonzalez-Barahona, valcos, Pranjal Aswani
Nishchith K Shetty | GSoC 2019 | Support of Source Code Related Metrics | Jesus Gonzalez-Barahona, valcos, Pranjal Aswani
Keanu Nichols | GSoC 2018 | Reporting of CHAOSS Metrics | Sean Goggins, Jesus Gonzalez-Barahona
Pranjal Aswani | GSoC 2018 | Reporting of CHAOSS Metrics: Refactoring the existing code and extending the capabilities of the Manuscripts Project | Valerio Cosentino, Jesus Gonzalez-Barahona

Mentees

Yash Prakash

Yash Prakash selected for GSoC 2021

Project Title

Automate Metrics Release and Process Improvement

Project Description

CHAOSS metrics have been defined to provide an in-depth view into the various features of an open-source project. The metrics are also a key input to help organizations strategically invest their resources.

The main aim of the project is to understand the metrics release process, propose process improvements and automate the release process of these metrics.

In addition to the original English version, the metrics are translated into different languages to help communities across the globe understand and benefit from them.

By the end of this project, report generation for the metrics and their translations will be fully automated.

Links

Ritik Malik

Ritik Malik selected for GSoC 2021

Project Title

Automate Metrics Release and Process Improvement

Project Description

Improving and fully automating the metric release process will not only save time but also help define a central structure for current and upcoming working groups (WGs) and metrics. Because CHAOSS is constantly evolving, the process should be scalable and flexible enough to be adjusted easily in the future. The quality and appearance of the generated PDF will be given equal priority.

Links

Dhruv Sachdev

Dhruv Sachdev selected for GSoC 2021

Project Title

Develop a Shared Data Resource Focused on Dependencies, Risks and Vulnerabilities in Open Source Software

Project Description

This project aims to develop a shared data resource that identifies the dependencies of open source software, using existing tools to analyze dependencies and map them as direct, transitive, or circular. It deals with code-level dependencies, not infrastructure-level dependencies such as the operating system or database. The project is implemented using Augur, a software suite for collecting and measuring structured data about free and open-source software (FOSS) communities.
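
The distinction between direct, transitive, and circular dependencies can be illustrated with a small sketch. This is not Augur code: the graph format (a dictionary mapping each package to the packages it depends on) and the function names are assumptions made for illustration only.

```python
from typing import Dict, List, Set


def reachable_from(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """Return every package reachable from `start` by following one or more edges."""
    seen: Set[str] = set()
    stack = list(graph.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def classify_dependencies(graph: Dict[str, List[str]], root: str) -> Dict[str, Set[str]]:
    """Classify the dependencies of `root` (hypothetical helper, not an Augur API)."""
    direct = set(graph.get(root, []))
    reachable = reachable_from(graph, root)
    # Transitive: reachable only through another dependency.
    transitive = reachable - direct - {root}
    # Circular: a package that can reach itself again.
    circular = {n for n in reachable | {root} if n in reachable_from(graph, n)}
    return {"direct": direct, "transitive": transitive, "circular": circular}


# Hypothetical dependency graph: "a" and "c" depend on each other.
example_graph = {"app": ["a", "b"], "a": ["c"], "b": [], "c": ["a"]}
result = classify_dependencies(example_graph, "app")
print(result)
```

Treating a package as transitive only when it is not also a direct dependency is a simplification; a real resource would likely record every path.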

Links

Anuj Lamoria

Anuj Lamoria selected for GSoC 2021

Project Title

Automatically identify Contributor Aliases

Project Description

The aim of this project is to generalize the core functionality currently within the Augur contributor worker, envisioned as the next phase of that worker, and make it available as a PyPI-distributable Python package. The main goal is to automatically identify contributor aliases (emails, platform user accounts) to increase the parsimony of statistics and metrics while enhancing privacy. I will focus on Augur and on developing useful risk-prediction analysis tools and visualization modules.

The main work in this project is as follows:

  • Construct an API-accessible graph database for identifying and mapping contributors who use multiple email addresses within a platform, and identifiers across platforms.
  • Implement methods to manage this information.
  • Integrate this information into clearer, more parsimonious CHAOSS metrics.
  • Automate the management of contributor changes over time.
  • Enable analysis at the project level that obscures or anonymizes individual developer identity.
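
As a rough illustration of alias identification, the sketch below groups identifiers that appear together in the same record. It uses an in-memory union-find structure rather than the graph database the project describes, and the class name, method names, and sample identifiers are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class AliasResolver:
    """Groups identifiers that co-occur in a record (illustrative only)."""

    def __init__(self) -> None:
        self.parent: Dict[str, str] = {}

    def find(self, ident: str) -> str:
        """Return the representative identifier for the group containing `ident`."""
        self.parent.setdefault(ident, ident)
        while self.parent[ident] != ident:
            # Path halving keeps later lookups short.
            self.parent[ident] = self.parent[self.parent[ident]]
            ident = self.parent[ident]
        return ident

    def union(self, a: str, b: str) -> None:
        """Merge the groups containing `a` and `b`."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def add_record(self, identifiers: Iterable[str]) -> None:
        """Treat all identifiers seen in one record as the same contributor."""
        ids = list(identifiers)
        if ids:
            self.find(ids[0])
        for other in ids[1:]:
            self.union(ids[0], other)

    def groups(self) -> List[Set[str]]:
        """Return the current alias groups."""
        clusters: Dict[str, Set[str]] = defaultdict(set)
        for ident in list(self.parent):
            clusters[self.find(ident)].add(ident)
        return list(clusters.values())


resolver = AliasResolver()
# Made-up records: each lists identifiers observed together, e.g. on one commit.
resolver.add_record(["alice@example.com", "github:alice"])
resolver.add_record(["alice@work.example", "github:alice"])
resolver.add_record(["bob@example.com"])
alias_groups = sorted(sorted(group) for group in resolver.groups())
print(alias_groups)
```

For project-level analysis that hides individual identity, each group could be reported under an opaque group ID instead of any of its member identifiers.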

Links

Rashmi K A

Rashmi K A selected for GSoC 2021

Project Title

Sorting Hat - Extend data model and user interface to capture better information about contributors

Project Description

GrimoireLab is an open-source toolset for software development analytics. It provides a set of tools to collect, analyze and visualize software development metrics from a variety of sources such as Git, Jira, Confluence, and Slack. To manage the identities of people across these different sources, GrimoireLab developed Sorting Hat, which manages the identities of people and related metadata.

As part of the metadata collected around identities, Sorting Hat stores organizational information such as the name and domains related to the organization. This project aims to add to this information by extending the existing Organization model to capture the internal structure of organizations such as departments, sub-organizations, and teams. This will help in annotating the identity information more meaningfully.

Links

Yeming Gu

Yeming Gu selected for GSoC 2021

Project Title

Yeming Gu's GSoC proposal for CHAOSS

Project Description

My project aims to define a new similarity metric based on the social coding semantics underlying open source trace data, to extend Augur's capabilities. A heterogeneous information network schema and network embedding techniques are introduced to capture latent similarity information between repositories. The project will produce new computational models that transform this information into a computable representation vector for every repository.

Links

Veerasamy Sevagen

Veerasamy Sevagen selected for Summer of Open Source Promotion 2021

Project Title

Expanding and restyling the GrimoireLab tutorial

Project Description

GrimoireLab is a powerful open-source platform that provides support for monitoring and in-depth analysis of software projects. It produces a rich set of metrics with data extracted from more than 30 tools related to contributing to open source development, such as version control systems, issue trackers and forums. These metrics are presented on dynamic web dashboards, which decision-makers can easily inspect to understand the evolution and health of their projects. The main entry point for learning about GrimoireLab is the tutorial, which provides a walkthrough of the platform and its components. Recently, the community has requested that its content be revamped and expanded to include additional information such as dashboard customization and management.

Links

Jaskirat Singh

Jaskirat Singh selected for GSoD 2020 under the Linux Foundation

Project Title

Create a CHAOSS Community-wide Handbook

Project Description

A community handbook is a document that defines the community’s key policies and procedures and outlines its mission, values, and workings. It gives newly joined members a clear introduction to how the community works. Currently, the CHAOSS community handbook is available in the GitHub repository and needs to be revamped and refactored with more information for newcomers and existing community members.

Links

Xiaoya Xia

Xiaoya Xia selected for GSoD 2020 under the Linux Foundation

Project Title

Build documentation for CHAOSS D&I Badging project

Project Description

This work builds documentation for a young CHAOSS project, the D&I Badging project. The project is a peer-review system that uses a badge as the final review result and the CHAOSS D&I metrics as the review reference. Applicants and reviewers are the two key roles, so there should be clear and detailed guidance showing them what to do, how to submit an application, and how to review it with a checklist on GitHub.

Links

Ria Gupta

Ria Gupta selected for GSoC 2020

Project Title

Social Currency Metric System

Project Description

Implementing the Social Currency Metric System (SCMS) will be a major milestone in providing a better, more holistic view of project health in the open-source community. By adding social currency to the metrics, we can quantitatively measure the value of community interactions to accurately gauge the “reputation” of a community.

Links

Tianyi Zhou

Tianyi Zhou selected for GSoC 2020

Project Title

Large Social Network Analysis and Anomaly Detection with Augur

Project Description

Augur is software that collects data for a given list of repositories and provides a variety of CHAOSS metrics on open source health and sustainability. Users can then empirically investigate and uncover useful insights for software engineering, such as understanding collaboration patterns.

At its current stage, Augur is unable to mine repositories to generate co-editing information about open source software development. This project aims to develop that functionality to enhance Augur. It will end with a new data worker and analysis tool for fine-grained co-editing networks, which opens up a massive new source of high-resolution data on human collaboration patterns.

I would then like to set up an Augur server to mine and monitor the open source ecosystem (10,000 or more repositories). The collaboration networks and social trace data of all contributors in the ecosystem will be evaluated using social network analysis. This will play a vital role in achieving Augur's goal of analyzing the health of open source organizations, as well as the CHAOSS project's goal regarding diversity and inclusion.

Links

Sarit Adhikari

Sarit Adhikari selected for GSoC 2020

Project Title

Machine Learning for Anomaly Detection in Open Source Communities

Project Description

Open-source software development is a collaborative effort that requires decentralized decision making from different developers and maintainers. To measure the progress of a project, it is important to quantify code changes over time. CHAOSS provides analytics and metrics to help open source communities measure the impact of the developer’s work on the project and the impact of the project on the community. Augur is a prototype implementation of the CHAOSS project's open source software metrics that systematically integrates data from several open-source repositories, issue trackers, mailing lists, etc. Anomaly detection is a common data science strategy for finding extreme data points (outliers) whose features differ vastly from those of normal data points. From an open-source software development perspective, it detects unusual surges and drops in development activities such as code commits and pull requests. This project aims to identify the different types of anomalies present in trace data, using several machine learning techniques, and deliver personalized notifications to the user.
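
To make the idea of surges and drops concrete, here is a minimal sketch that flags outliers in a series of activity counts using a z-score. This is a simple statistical baseline, not the machine learning techniques the project refers to; the function name, the threshold, and the sample data are assumptions.

```python
from statistics import mean, stdev
from typing import List, Sequence


def find_anomalies(counts: Sequence[float], threshold: float = 2.0) -> List[int]:
    """Return indices whose value lies more than `threshold` sample
    standard deviations from the mean (hypothetical helper)."""
    if len(counts) < 2:
        return []
    mu = mean(counts)
    sigma = stdev(counts)
    if sigma == 0:
        # A perfectly flat series has no outliers.
        return []
    return [i for i, c in enumerate(counts) if abs(c - mu) / sigma > threshold]


# Made-up weekly commit counts with one obvious surge at index 6.
weekly_commits = [10, 12, 11, 9, 10, 11, 60, 10, 12, 11]
anomalous_weeks = find_anomalies(weekly_commits)
print(anomalous_weeks)
```

A single extreme value inflates both the mean and the standard deviation, so a robust variant (for example, one based on the median) may be preferable for real trace data.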

Links

Abhinav Bajpai

Abhinav Bajpai selected for GSoC 2020

Project Title

Implementing GitLab Data Collection Worker and Mapper to bind the responses of GitLab API, Github API & the Augur schema

Project Description

The project aims to develop a GitLab collection worker closely tied to the GitHub collection worker, using mapper files to bind the intended attributes of their API responses to the Augur schema. The GitLab worker is responsible for progressively fetching data on issues, commits, merge requests, etc. from GitLab using the python-gitlab API, on which metrics can be generated. Additional modules, a Data Setter module and a Schema moderator, would be implemented as a common channel for both workers to push collected API responses into the Augur database or to change the Augur schema by editing the mapper files. The Data Setter module will also be responsible for implementing the duplicate management mechanism.
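
The mapper idea can be sketched as a per-platform lookup from a common column name to the field that carries it in each API response. The column names below are hypothetical and do not reflect the actual Augur schema; the structure of the mapper and the function name are also assumptions.

```python
from typing import Any, Dict

# Hypothetical mapper: common column name -> field name in each platform's
# issue payload. GitHub identifies issues by "number", GitLab by "iid".
ISSUE_MAPPERS: Dict[str, Dict[str, str]] = {
    "github": {
        "issue_number": "number",
        "issue_title": "title",
        "issue_state": "state",
        "created_at": "created_at",
    },
    "gitlab": {
        "issue_number": "iid",
        "issue_title": "title",
        "issue_state": "state",
        "created_at": "created_at",
    },
}


def map_issue(platform: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Translate one API issue payload into the common row format."""
    mapper = ISSUE_MAPPERS[platform]
    return {column: payload.get(field) for column, field in mapper.items()}


# Made-up payloads containing only the fields used above.
github_row = map_issue(
    "github",
    {"number": 42, "title": "Fix typo", "state": "open", "created_at": "2020-06-01T00:00:00Z"},
)
gitlab_row = map_issue(
    "gitlab",
    {"iid": 7, "title": "Add tests", "state": "opened", "created_at": "2020-06-02T00:00:00Z"},
)
print(github_row)
print(gitlab_row)
```

Keeping the mapping in data rather than code means a schema change only requires editing the mapper, which matches the role the description gives to mapper files.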

Links

bistaastha

bistaastha selected for GSoC 2020

Project Title

Build Workflow process for CHAOSS D&I Badging Project

Project Description

The CHAOSS Diversity and Inclusion Badging program aims to encourage projects and events to obtain badges for good diversity and inclusion practices. This project is about building a GitHub-based workflow for the CHAOSS D&I Badging program.

The project would extend the existing CHAOSS Badging work and implement an open peer review process. It would also focus on integrating GitHub-based workflow bots.

Links

vchrombie

vchrombie selected for GSoC 2020

Project Title

Creating Quality models using GrimoireLab and CHAOSS metrics

Project Description

GrimoireLab is a powerful open-source platform that provides support for monitoring and in-depth analysis of software projects. It produces a rich set of dashboards, which can be easily inspected by decision-makers to help them understand the evolution and health of their projects. Despite the large set of dashboards available in GrimoireLab, comparing projects between each other is not straightforward since it requires navigating and drilling down the data in different dashboards.

Prosoul is a web application that empowers decision-makers with the means to create and manage their own quality models, which are useful means to evaluate and compare software projects. This project idea is about supporting the definition of Quality Models using GrimoireLab data and Prosoul.

The main aim of the project is to design an approach for shaping GrimoireLab data into a format that Prosoul can easily consume, and to implement it on data obtained from a few data sources, such as Git, GitHub and mailing list repositories, to obtain simple quality models.

Links

Saicharan Reddy

Saicharan Reddy selected for GSoC 2020

Project Title

Implementation of GitLab Data Collection Worker & Test Coverage Improvement

Project Description

The primary goal of this project is to gather data on GitLab issues, commits, merge requests, and other entities and store it in Augur's unified data model. The project will use a task queue, a broker, and worker instances to process the information at scale. Metrics for sustainability and overall project health will be built on the information stored in the unified model. The project also aims to increase overall test coverage, so unit and integration tests for the data collection workers will be implemented to ensure data consistency.

Links

Akshara P

Akshara P selected for GSoC 2020

Project Title

Machine Learning for Anomaly Detection in Open Source Communities

Project Description

Augur is a Flask-based prototyping web stack for CHAOSS metrics. It provides structured data mined from various sources like git repositories, mailing lists and issue trackers using a plugin architecture incorporating other open-source metrics projects like Facade and FOSSology. Augur enables users to keep track of the activities happening across the repositories they care about and compare their performance. The main goals of this project are to detect anomalies in various metrics in the open-source community and notify community managers as early as possible, to provide API endpoints for the required metrics, and to build a customized dashboard that visualizes these metrics through charts. The completion of this project would result in a customized dashboard for every user, providing real-time statistics of the anomalous activities happening across their repositories.

Links

Pratik Mishra

Pratik Mishra selected for GSoC 2020

Project Title

Machine Learning for Anomaly Detection in Open Source Communities

Project Description

This project will play a vital role in achieving Augur's goal of analyzing the health of open source organisations. It will not only provide visualisation but also offer useful insights that help users find the reasons behind anomalous activities or anomalous periods.

Links

Ore-Aruwaji Oloruntola

Ore-Aruwaji Oloruntola was selected for Outreachy 2020

Project Title

Build Workflow Process for CHAOSS Diversity & Inclusion Badging

Links

Parth Sharma

Parth Sharma successfully completed GSoC 2019.

Project Title

Build CHAOSS Risk and Growth Maturity and Decline Metrics in Augur

Project Description

Augur is a fully functional prototyping web stack for CHAOSS metrics. It provides structured data mined from git repositories using a plugin architecture that incorporates other open source metrics projects like Facade and FOSSology. The main aim of this project is to extend Augur’s functionality by implementing the Risk and Growth-Maturity-Decline CHAOSS metrics and use cases, with a focus on the open source community manager use case. This will allow open source community managers to leverage these metrics to better manage their communities and projects.

Links

Bingwen Ma

Bingwen Ma successfully completed GSoC 2019.

Project Title

Build CHAOSS Risk and Growth Maturity and Decline Metrics in Augur

Project Description

The project aims to implement Risk metrics and other metrics within the Growth-Maturity-Decline CHAOSS metrics and use cases using Augur, focusing on what we have identified as the open source community manager use case.

Links

Aniruddha Karajgi

Aniruddha Karajgi successfully completed GSoC 2019.

Project Title

Implementing CHAOSS Metrics with Perceval

Project Description

The aim of this project is to create reference implementations and tests, primarily for the metrics defined by the Evolution Working Group, but also for the other working groups. This will be done by analyzing the data retrieved by Perceval from various sources using Jupyter notebooks, pandas and matplotlib.

Links

Nishchith K Shetty

Nishchith K Shetty successfully completed GSoC 2019.

Project Title

Support of Source Code Related Metrics

Project Description

Graal produces analysis related to code complexity, quality, dependencies, vulnerability and licensing, and the data it produces conforms to the format that GrimoireLab can process. I will mainly be focusing on:

  • Adding support for source code related metrics to GrimoireLab with the help of analysis data produced by Graal.
  • Adapting the GrimoireLab toolchain to be able to execute Graal and process the data produced by it.
  • Writing appropriate unit tests for additional backends, their corresponding supporting connectors, and methods.
  • Producing analytics related to proposed and calculated metrics* (described below).
  • Adding documentation related to additional features and improvements in existing ones.

Of the five backends provided by Graal, CoCom (Code Complexity) covers the vast majority of popular languages, while CoLic (Code License), supported by NOMOS and ScanCode, fetches license and copyright information from software development repositories and is language independent. Metrics added for these two backends during the GSoC period could be applied to a wide range of projects in the future.

Links

Keanu Nichols

Keanu Nichols successfully completed GSoC 2018.

Project Title

Reporting of CHAOSS Metrics

Project Description

The project involves writing Python code to query GrimoireLab Elasticsearch databases and obtain from them the metrics relevant for the report. Possible technologies for achieving this include the Python pandas library.

Links

Pranjal Aswani

Pranjal Aswani successfully completed GSoC 2018.

Project Title

Reporting of CHAOSS Metrics: Refactoring the existing code and extending the capabilities of the Manuscripts Project

Project Description

The Manuscripts project, which is a part of the Grimoire Toolset, helps us in analysing repositories and projects by creating a report based on predefined Metrics which give an overview of the project. The infrastructure of the current report generation system needs to be updated so that the users can spend less time on figuring out the hows and can focus on the functionality. The aim of this project is to extend the capabilities of the Manuscripts project so that it covers almost all the Metrics that can be calculated using the different data sources. At the end of this project, we will have a bigger and better reporting system.

Links