CHAOSS Mentorship Alumni
The CHAOSS Community thanks all mentees and mentors who participated in Google Summer of Code, Google Season of Docs and Outreachy.
List of all Mentees
Student | Program & Year | Project | Mentors |
---|---|---|---|
Yash Prakash | GSoC 2021 | Automate Metrics Release and Process Improvement | Kevin Lumbard, Georg Link, Jaskirat Singh |
Ritik Malik | GSoC 2021 | Automate Metrics Release and Process Improvement | Kevin Lumbard, Georg Link, Matt Germonprez, Jaskirat Singh |
Dhruv Sachdev | GSoC 2021 | Develop a Shared Data Resource Focused on Dependencies, Risks and Vulnerabilities in Open Source Software | Sean Goggins, Vinod Ahuja |
Anuj Lamoria | GSoC 2021 | Automatically identify Contributor Aliases | Sean Goggins |
Rashmi K A | GSoC 2021 | Sorting Hat - Extend data model and user interface to capture better information about contributors | Venu Vardhan Reddy |
Yeming Gu | GSoC 2021 | Yeming Gu's GSoC proposal for CHAOSS | Sean Goggins, Vinod Ahuja |
Veerasamy Sevagen | Summer of Open Source Promotion 2021 | Expanding and restyling the GrimoireLab tutorial | Venu Vardhan |
Jaskirat Singh | GSoD 2020 | Create a CHAOSS Community-wide Handbook | Georg J.P. Link, Armstrong Foundjem, Matt Germonprez |
Xiaoya Xia | GSoD 2020 | Build documentation for CHAOSS D&I Badging project | Matt Snell, Aastha Bist |
Ria Gupta | GSoC 2020 | Social Currency Metric System | Valerio Cosentino, Samantha Venia Logan |
Tianyi Zhou | GSoC 2020 | Large Social Network Analysis and Anomaly Detection with Augur | Sean Goggins, Jonahz, Gabe Heim |
Sarit Adhikari | GSoC 2020 | Machine Learning for Anomaly Detection in Open Source Communities | Sean Goggins, Carter Landis, Gabe Heim |
Abhinav Bajpai | GSoC 2020 | Implementing GitLab Data Collection Worker and Mapper to bind the responses of GitLab API, Github API & the Augur schema | Sean Goggins, Carter Landis |
bistaastha | GSoC 2020 | Build Workflow process for CHAOSS D&I Badging Project | Matt Snell |
vchrombie | GSoC 2020 | Creating Quality models using GrimoireLab and CHAOSS metrics | Valerio Cosentino, Aniruddha Karajgi |
Saicharan Reddy | GSoC 2020 | Implementation of GitLab Data Collection Worker & Test Coverage Improvement | Elita Nelson, Sean Goggins, Jonahz |
Akshara P | GSoC 2020 | Machine Learning for Anomaly Detection in Open Source Communities | Elita Nelson, Sean Goggins, Gabe Heim |
Pratik Mishra | GSoC 2020 | Machine Learning for Anomaly Detection in Open Source Communities | Elita Nelson, Sean Goggins, Gabe Heim |
Ore-Aruwaji Oloruntola | Outreachy 2020 | Build Workflow Process for CHAOSS Diversity & Inclusion Badging | Matt Germonprez, Matt Snell, Saleh Abdel Motaal |
Parth Sharma | GSoC 2019 | Build CHAOSS Risk and Growth Maturity and Decline Metrics in Augur | Sean Goggins |
Bingwen Ma | GSoC 2019 | Build CHAOSS Risk and Growth Maturity and Decline Metrics in Augur | Sean Goggins, Matt Germonprez |
Aniruddha Karajgi | GSoC 2019 | Implementing CHAOSS Metrics with Perceval | Jesus Gonzalez-Barahona, valcos, Pranjal Aswani |
Nishchith K Shetty | GSoC 2019 | Support of Source Code Related Metrics | Jesus Gonzalez-Barahona, valcos, Pranjal Aswani |
Keanu Nichols | GSoC 2018 | Reporting of CHAOSS Metrics | Sean Goggins, Jesus Gonzalez-Barahona |
Pranjal Aswani | GSoC 2018 | Reporting of CHAOSS Metrics: Refactoring the existing code and extending the capabilities of the Manuscripts Project | Valerio Cosentino, Jesus Gonzalez-Barahona |
Mentees
Yash Prakash
Yash Prakash selected for GSoC 2021
Project Title
Automate Metrics Release and Process Improvement
Project Description
CHAOSS metrics have been defined to provide an in-depth view into the various features of an open-source project. The metrics are also a key input to help organizations strategically invest their resources.
The main aim of the project is to understand the metrics release process, propose process improvements and automate the release process of these metrics.
In addition to the original English version of these metrics, these metrics are also translated into different languages to help communities across the globe understand and benefit from them.
By the end of this project, report generation for the metrics and their translations will be fully automated.
Links
Ritik Malik
Ritik Malik selected for GSoC 2021
Project Title
Automate Metrics Release and Process Improvement
Project Description
Improving the metrics release process and fully automating it will not only save time but also help us define a central structure for current and upcoming working groups (WGs) and metrics. Because CHAOSS keeps evolving, the process will be scalable and flexible enough to adjust easily in the future. The quality and appearance of the generated PDF will have equal priority.
Links
Dhruv Sachdev
Dhruv Sachdev selected for GSoC 2021
Project Title
Develop a Shared Data Resource Focused on Dependencies, Risks and Vulnerabilities in Open Source Software
Project Description
This project aims to develop a shared data resource that identifies the dependencies of open source software, using existing tools to analyze those dependencies and classify them as direct, transitive, or circular. It deals with code-level dependencies, not infrastructure-based dependencies such as the OS or database. The project is implemented using Augur, a software suite for collecting and measuring structured data about free and open-source software (FOSS) communities.
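The distinction between direct, transitive, and circular dependencies can be illustrated with a small graph walk. This is a minimal sketch, not Augur's implementation: the function name `classify_dependencies`, the dictionary-based graph, and the package names are all hypothetical.

```python
from collections import deque


def classify_dependencies(graph, root):
    """Return (direct, transitive, circular) for the package `root`.

    `graph` maps a package name to the list of packages it imports.
    A package that is both imported directly and reachable through
    another package is reported as direct only.
    """
    direct = set(graph.get(root, []))

    # Breadth-first walk to collect everything reachable from root.
    reachable = set()
    queue = deque(direct)
    while queue:
        pkg = queue.popleft()
        if pkg in reachable:
            continue
        reachable.add(pkg)
        queue.extend(graph.get(pkg, []))

    transitive = reachable - direct - {root}
    # If root can be reached from its own dependencies, there is a cycle.
    circular = root in reachable
    return direct, transitive, circular


# Hypothetical dependency graph used only for illustration.
example_graph = {"app": ["a", "b"], "a": ["c"], "c": ["a"], "b": []}
print(classify_dependencies(example_graph, "app"))
print(classify_dependencies(example_graph, "a"))
```

Here `app` depends on `c` only transitively, while `a` and `c` form a circular dependency.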
Links
Anuj Lamoria
Anuj Lamoria selected for GSoC 2021
Project Title
Automatically identify Contributor Aliases
Project Description
The aim of this project is to generalize the core functionality currently within the Augur contributor worker and make it available as a PyPI-distributable Python package, envisioned as the next phase of the Augur contributor worker. The main goal is to automatically identify contributor aliases (emails, platform user accounts) to increase the parsimony of statistics and metrics while enhancing privacy. I will focus on Augur and on developing useful risk-prediction analysis tools and visualization modules. The main work in this project is as follows:
- Construct an API-accessible graph database for identifying and mapping contributors who use multiple email addresses within a platform, and identifiers across platforms.
- Implement methods to manage this information.
- Integrate this information into clearer, more parsimonious CHAOSS metrics.
- Automate the management of contributor changes over time.
- Enable analysis at the project level that obscures or anonymizes individual developer identity.
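The core idea of alias identification, grouping identity records that share an email address or platform login, can be sketched with a union-find structure. This is an illustrative simplification held in memory rather than the graph database the project describes; `merge_aliases` and the record fields `email` and `login` are hypothetical names.

```python
def merge_aliases(identities):
    """Group identity records that share an email or a platform login.

    `identities` is a list of dicts with optional "email" and "login" keys.
    Returns a list of sets of record indexes, one set per inferred contributor.
    """
    parent = list(range(len(identities)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    first_seen = {}  # (field, normalized value) -> index of first record
    for idx, record in enumerate(identities):
        for field in ("email", "login"):
            value = record.get(field)
            if not value:
                continue
            key = (field, value.strip().lower())
            if key in first_seen:
                union(first_seen[key], idx)
            else:
                first_seen[key] = idx

    groups = {}
    for idx in range(len(identities)):
        groups.setdefault(find(idx), set()).add(idx)
    return list(groups.values())


# Made-up records: the first three belong to one person, the last to another.
records = [
    {"email": "a@x.org", "login": "alice"},
    {"email": "A@x.org"},
    {"login": "alice", "email": "alice@work.com"},
    {"email": "bob@y.org"},
]
print(merge_aliases(records))
```

Matching on exact normalized values is deliberately conservative. A real system would also need the change-over-time handling and anonymization listed above, which this sketch does not attempt.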
Links
Rashmi K A
Rashmi K A selected for GSoC 2021
Project Title
Sorting Hat - Extend data model and user interface to capture better information about contributors
Project Description
GrimoireLab is an open-source toolset for software development analytics. It provides a set of tools to collect, analyze, and visualize software development metrics from a variety of sources such as Git, Jira, Confluence, and Slack. To manage the identities of people across these different sources, GrimoireLab developed Sorting Hat, which manages those identities and their related metadata.
As part of the metadata collected around identities, Sorting Hat stores organizational information such as the name and domains related to the organization. This project aims to add to this information by extending the existing Organization model to capture the internal structure of organizations such as departments, sub-organizations, and teams. This will help in annotating the identity information more meaningfully.
Links
Yeming Gu
Yeming Gu selected for GSoC 2021
Project Title
Yeming Gu's GSoC proposal for CHAOSS
Project Description
My project aims to define a new similarity metric based on the social coding semantics underlying open source trace data, to extend Augur's capabilities. A heterogeneous information network schema and network embedding techniques are introduced to capture the latent similarity between repositories. The project will produce new computational models that transform this information into a computable representation vector for every repository.
Links
Veerasamy Sevagen
Veerasamy Sevagen selected for Summer of Open Source Promotion 2021
Project Title
Expanding and restyling the GrimoireLab tutorial
Project Description
GrimoireLab is a powerful open-source platform that provides support for monitoring and in-depth analysis of software projects. It produces a rich set of metrics with data extracted from more than 30 tools related to contributing to Open Source development such as version control systems, issue trackers and forums. These metrics are shown and exploited on Web dynamic dashboards, which can be easily inspected by decision-makers to help them understand the evolution and health of their projects. The main entry point to learn about GrimoireLab is the tutorial, which provides a walkthrough of the platform and its components. Recently, the community has requested to revamp and expand its content to include additional information such as dashboard customization and management.
Links
Jaskirat Singh
Jaskirat Singh selected for GSoD 2020 under the Linux Foundation
Project Title
Create a CHAOSS Community-wide Handbook
Project Description
A community handbook is a document that defines the community’s key policies and procedures and outlines its mission, values, and workings. It gives newly joined members a clear introduction to how the community works. Currently, the CHAOSS community handbook is available in a GitHub repository and needs to be revamped and refactored with more information for newcomers and existing community members.
Links
Xiaoya Xia
Xiaoya Xia selected for GSoD 2020 under the Linux Foundation
Project Title
Build documentation for CHAOSS D&I Badging project
Project Description
This work is about building documentation for a young CHAOSS project, the D&I Badging project. The project is a peer-review system that uses a badge as the final review result and the CHAOSS D&I metrics as review references. Applicants and reviewers are the two important roles, so there should be clear and detailed guidance showing them what to do, how to submit an application, and how to review with a checklist on GitHub.
Links
Ria Gupta
Ria Gupta selected for GSoC 2020
Project Title
Social Currency Metric System
Project Description
Implementing Social Currency Metric System (SCMS) will be a huge milestone in providing a better and holistic view of project health in the open-source community. By adding social currency in the metric, we can quantitatively measure the value of community interactions to accurately gauge “reputation” of a community.
Links
Tianyi Zhou
Tianyi Zhou selected for GSoC 2020
Project Title
Large Social Network Analysis and Anomaly Detection with Augur
Project Description
Augur is software that collects data for a given list of repositories and provides a variety of CHAOSS metrics on open source health and sustainability. Users can then empirically investigate and uncover useful insights for software engineering, such as understanding collaboration patterns.
At the current stage, Augur is unable to mine repositories to generate co-editing information about open source software development. This project idea is to develop such functions to enhance the Augur project. It will result in a new data worker and analysis tool for fine-grained co-editing networks, which opens up a massive new source of high-resolution data on human collaboration patterns.
I would then like to set up an Augur server to mine and monitor the open source ecosystem (with 10000+ repositories or more). The collaboration networks and social trace data of all contributors in the open source ecosystem will be evaluated using social network analysis. This will play a vital role in achieving Augur's goal of analyzing the health of open source organizations, as well as the CHAOSS project's goal regarding diversity and inclusion.
Links
Sarit Adhikari
Sarit Adhikari selected for GSoC 2020
Project Title
Machine Learning for Anomaly Detection in Open Source Communities
Project Description
Open-source software development is a collaborative effort that requires decentralized decision making from different developers and maintainers. To measure the progress of a project, it is important to quantify code changes over time. CHAOSS provides analytics and metrics to help open source communities measure the impact of a developer’s work on the project and the impact of the project on the community. Augur is a prototype implementation of the CHAOSS project's open source software metrics that systematically integrates data from several open-source repositories, issue trackers, mailing lists, etc. Anomaly detection is a common data science strategy for finding extreme data points (outliers) whose features differ vastly from other, normal data points. From an open-source software development perspective, it detects unusual surges and drops in development activities such as code commits and pull requests. This project aims to identify the different types of anomalies that are available from trace data and deliver a personalized notification to the user using several machine learning techniques.
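To make "unusual surges and drops" concrete, the sketch below flags outliers in a series of weekly commit counts using the modified z-score, which is based on the median absolute deviation. It is a simple statistical baseline assumed for illustration, not the machine learning approach the project itself developed; the function name, the threshold of 3.5, and the sample data are all illustrative.

```python
from statistics import median


def find_anomalies(counts, threshold=3.5):
    """Return indexes whose value deviates strongly from the median.

    Uses the modified z-score, 0.6745 * (x - median) / MAD, where MAD is
    the median absolute deviation. Both surges and drops are flagged.
    """
    med = median(counts)
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        return []  # no spread to compare against
    return [
        i for i, c in enumerate(counts)
        if abs(0.6745 * (c - med) / mad) > threshold
    ]


# Made-up weekly commit counts with one surge (90) and one drop (0).
weekly_commits = [12, 15, 11, 14, 13, 90, 12, 0, 14, 13]
print(find_anomalies(weekly_commits))  # prints [5, 7]
```

A median-based score is used here because a single large spike would inflate a mean and standard deviation enough to hide smaller anomalies.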
Links
Abhinav Bajpai
Abhinav Bajpai selected for GSoC 2020
Project Title
Implementing GitLab Data Collection Worker and Mapper to bind the responses of GitLab API, Github API & the Augur schema
Project Description
The project aims to develop a GitLab collection worker closely tied to the GitHub collection worker, using mapper files to bind the intended attributes of their API responses to the Augur schema. The GitLab worker is responsible for progressively fetching data related to issues, commits, merge requests, etc. from GitLab using the python-gitlab API, on which metrics can then be generated. Additional modules, a Data Setter module and a Schema Moderator, would be implemented to work as a common channel for both workers to push the collected API responses into the Augur database or to change the Augur schema by editing the mapper files. The Data Setter module will additionally be responsible for implementing the duplicate management mechanism.
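The mapper idea can be sketched as one table per platform that maps a shared target column to a path of keys in that platform's API response. The column names (`issue_number`, `reporter`, and so on) and the `apply_mapper` helper are hypothetical and do not reflect the actual Augur schema or mapper file format; the response keys follow the public GitHub and GitLab issue payloads.

```python
# Hypothetical mappers: target column -> path of keys in the API response.
GITHUB_ISSUE_MAP = {
    "issue_number": ("number",),
    "issue_title": ("title",),
    "issue_state": ("state",),
    "reporter": ("user", "login"),
}
GITLAB_ISSUE_MAP = {
    "issue_number": ("iid",),
    "issue_title": ("title",),
    "issue_state": ("state",),
    "reporter": ("author", "username"),
}


def apply_mapper(response, mapper):
    """Build one row with shared column names from a platform response."""
    row = {}
    for column, path in mapper.items():
        value = response
        for key in path:
            # Missing or non-dict intermediate values yield None.
            value = value.get(key) if isinstance(value, dict) else None
        row[column] = value
    return row


# Trimmed, made-up responses for illustration.
github_issue = {"number": 7, "title": "Bug", "state": "open",
                "user": {"login": "octocat"}}
gitlab_issue = {"iid": 3, "title": "Bug", "state": "opened",
                "author": {"username": "tanuki"}}

print(apply_mapper(github_issue, GITHUB_ISSUE_MAP))
print(apply_mapper(gitlab_issue, GITLAB_ISSUE_MAP))
```

Both calls produce rows with the same columns, so a single downstream insert routine can serve both workers. Value differences between platforms, such as `open` versus `opened`, would still need normalizing.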
Links
bistaastha
bistaastha selected for GSoC 2020
Project Title
Build Workflow process for CHAOSS D&I Badging Project
Project Description
CHAOSS Diversity and Inclusion Badging program aims to encourage projects and events to obtain badges for good diversity and inclusion practices. This project will be about building a GitHub based workflow for CHAOSS D&I Badging program.
This project would extend the existing CHAOSS Badging work and implement an Open Peer review process. Also, the project would focus on integrating GitHub based workflow bots.
Links
vchrombie
vchrombie selected for GSoC 2020
Project Title
Creating Quality models using GrimoireLab and CHAOSS metrics
Project Description
GrimoireLab is a powerful open-source platform that provides support for monitoring and in-depth analysis of software projects. It produces a rich set of dashboards, which can be easily inspected by decision-makers to help them understand the evolution and health of their projects. Despite the large set of dashboards available in GrimoireLab, comparing projects between each other is not straightforward since it requires navigating and drilling down the data in different dashboards.
Prosoul is a web application that empowers decision-makers with the means to create and manage their own quality models, which are useful means to evaluate and compare software projects. This project idea is about supporting the definition of Quality Models using GrimoireLab data and Prosoul.
The main aim of the project is to design an approach that shapes GrimoireLab data into a format Prosoul can easily consume, and to implement it on data obtained from a few data sources, such as Git, GitHub, and mailing list repositories, to obtain simple quality models.
Links
Saicharan Reddy
Saicharan Reddy selected for GSoC 2020
Project Title
Implementation of GitLab Data Collection Worker & Test Coverage Improvement
Project Description
The primary goal of this project is to gather data pertaining to GitLab issues, commits, and merge requests, among other entities, and store it in Augur's unified data model. The project will use a task queue, a broker, and worker instances to process the information at scale. Metrics for sustainability and overall project health will be built on the information stored in the unified model. This project also aims to increase the overall test coverage of the project, so unit and integration tests for the data collection workers will be implemented to ensure data consistency.
Links
Akshara P
Akshara P selected for GSoC 2020
Project Title
Machine Learning for Anomaly Detection in Open Source Communities
Project Description
Augur is a Flask-based prototyping web stack for CHAOSS metrics. It provides structured data mined from various sources such as Git repositories, mailing lists, and issue trackers, using a plugin architecture that incorporates other open-source metrics projects like Facade and FOSSology. Augur enables users to keep track of the activities happening across the repositories they care about and compare their performance. The main goals of this project are to detect anomalies in various metrics in the open-source community and notify community managers as early as possible, and to provide API endpoints for the required metrics along with a customized dashboard to visualize these metrics through charts. The completion of this project would result in a customized dashboard for every user, providing real-time statistics of the anomalous activities happening across their repositories.
Links
Pratik Mishra
Pratik Mishra selected for GSoC 2020
Project Title
Machine Learning for Anomaly Detection in Open Source Communities
Project Description
This project will play a vital role in achieving Augur's goal of analyzing open source organisation health. It will not only provide visualisation but also offer useful insights that help users find the reasons behind anomalous activities or anomalous periods.
Links
Ore-Aruwaji Oloruntola
Ore-Aruwaji Oloruntola was selected for Outreachy 2020
Project Title
Build Workflow Process for CHAOSS Diversity & Inclusion Badging
Links
Parth Sharma
Parth Sharma successfully completed GSoC 2019.
Project Title
Build CHAOSS Risk and Growth Maturity and Decline Metrics in Augur
Project Description
Augur is a fully functional prototyping web stack for CHAOSS metrics. It provides structured data mined from Git repositories using a plugin architecture that incorporates other open source metrics projects like Facade and FOSSology. The main aim of this project is to extend Augur’s functionality by implementing the Risk and Growth-Maturity-Decline CHAOSS metrics and use cases, with a focus on the open source community manager use case. This will allow open source community managers to use these metrics to better manage their communities and projects.
Links
Bingwen Ma
Bingwen Ma successfully completed GSoC 2019.
Project Title
Build CHAOSS Risk and Growth Maturity and Decline Metrics in Augur
Project Description
The project aims to implement Risk metrics and other metrics within the Growth-Maturity-Decline CHAOSS metrics and use cases using Augur, focusing on what we have identified as the open source community manager use case.
Links
Aniruddha Karajgi
Aniruddha Karajgi successfully completed GSoC 2019.
Project Title
Implementing CHAOSS Metrics with Perceval
Project Description
The aim of this project is to create reference implementations and tests, primarily for the metrics defined by the Evolution Working Group, but also for the other working groups. This will be done by analyzing the data retrieved by Perceval from various sources using Jupyter notebooks, pandas, and Matplotlib.
Links
Nishchith K Shetty
Nishchith K Shetty successfully completed GSoC 2019.
Project Title
Support of Source Code Related Metrics
Project Description
Graal produces analyses related to code complexity, quality, dependencies, vulnerabilities, and licensing, and the data it produces conforms to the format that GrimoireLab can process. I will mainly be focusing on:
- Adding support for source code related metrics to GrimoireLab with the help of analysis data produced by Graal.
- Adapting the GrimoireLab toolchain to be able to execute Graal and process the data produced by it.
- Writing appropriate unit tests for additional backends, their corresponding supporting connectors, and methods.
- Producing analytics related to proposed and calculated metrics.
- Adding documentation related to additional features and improvements in existing ones.
Out of all the five backends provided by Graal, CoCom (Code Complexity) covers a vast majority of the popular languages and CoLic (Code License) supported by NOMOS & ScanCode helps us fetch license & copyright related information from software development repositories and is language independent. Addition of metrics related to these two backends during GSoC period could be applied to a wide range of projects in the future.
Links
Keanu Nichols
Keanu Nichols successfully completed GSoC 2018.
Project Title
Reporting of CHAOSS Metrics
Project Description
Writing Python code to query GrimoireLab Elasticsearch databases and obtain from them the metrics relevant for the report. Possible technologies to achieve this aim include the Python library pandas.
Links
Pranjal Aswani
Pranjal Aswani successfully completed GSoC 2018.
Project Title
Reporting of CHAOSS Metrics: Refactoring the existing code and extending the capabilities of the Manuscripts Project
Project Description
The Manuscripts project, which is a part of the Grimoire Toolset, helps us in analysing repositories and projects by creating a report based on predefined Metrics which give an overview of the project. The infrastructure of the current report generation system needs to be updated so that the users can spend less time on figuring out the hows and can focus on the functionality. The aim of this project is to extend the capabilities of the Manuscripts project so that it covers almost all the Metrics that can be calculated using the different data sources. At the end of this project, we will have a bigger and better reporting system.