Code Contributions by Email Domain

Augmentable Software
Feb 14, 2021

by Patrick DeVivo

Ever wanted to know which organizations are contributing to an open-source codebase? An AskGit query may be able to help: show me the email domains of commit authors (excluding merge commits), ordered by the most frequently occurring domain.
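Here is a minimal sketch of what such a query could look like. It is not necessarily the exact query behind this post. It runs against an in-memory SQLite table that stands in for AskGit's `commits` table; the column names `author_email` and `parent_count` are assumptions about that schema, and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical stand-in for AskGit's `commits` table. Only the two columns
# the query needs are modeled, and their names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE commits (author_email TEXT, parent_count INTEGER)")
conn.executemany(
    "INSERT INTO commits VALUES (?, ?)",
    [
        ("alice@example.com", 1),
        ("bob@example.com", 1),
        ("carol@example.org", 1),
        ("alice@example.com", 2),  # merge commit (two parents), excluded
        ("dave@example.org", 0),   # root commit (no parents), counted
        ("erin@example.com", 1),
    ],
)

# Everything after the '@' is treated as the domain; merge commits are
# filtered out by requiring fewer than two parents.
DOMAIN_QUERY = """
SELECT
    substr(author_email, instr(author_email, '@') + 1) AS email_domain,
    count(*) AS commit_count
FROM commits
WHERE parent_count < 2
GROUP BY email_domain
ORDER BY commit_count DESC, email_domain
"""

domain_counts = conn.execute(DOMAIN_QUERY).fetchall()
for domain, commit_count in domain_counts:
    print(f"{domain}\t{commit_count}")
```

With the sample rows above, `example.com` comes out on top with three commits and `example.org` follows with two.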

For instance, for the Kubernetes repo:

Top email domains of contributors (by commit count) to the Kubernetes source code

Some further angles to pursue

  • Slice and dice by when commits were made (e.g. org 1 was more influential a year ago, but now it’s org 2)
  • Look beyond commit count: lines of code added or removed, files modified, types of files modified, and the actual content of contributions
  • Look at blame to see which lines of code contributed by an org are still in the codebase (perhaps as a measure of code churn by organization)
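As a rough sketch of the first idea, the domain count can be grouped by year as well. As before, this runs on an in-memory SQLite table rather than a real repository. The `author_when` column name and its timestamp format are assumptions about AskGit's schema, and the organizations and dates are invented.

```python
import sqlite3

# Hypothetical stand-in for AskGit's `commits` table, now with an author
# timestamp. Column names and the timestamp format are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commits (author_email TEXT, author_when TEXT, parent_count INTEGER)"
)
conn.executemany(
    "INSERT INTO commits VALUES (?, ?, ?)",
    [
        ("alice@org1.example", "2019-03-01 10:00:00", 1),
        ("bob@org1.example", "2019-07-15 09:30:00", 1),
        ("carol@org2.example", "2019-11-02 14:00:00", 1),
        ("carol@org2.example", "2020-01-20 08:00:00", 1),
        ("dan@org2.example", "2020-05-05 16:45:00", 1),
        ("alice@org1.example", "2020-06-10 12:00:00", 1),
        ("dan@org2.example", "2020-09-09 11:00:00", 2),  # merge commit, excluded
    ],
)

# Same domain extraction as before, with the commit year as an extra
# grouping key so each domain gets one row per year.
YEARLY_QUERY = """
SELECT
    strftime('%Y', author_when) AS year,
    substr(author_email, instr(author_email, '@') + 1) AS email_domain,
    count(*) AS commit_count
FROM commits
WHERE parent_count < 2
GROUP BY year, email_domain
ORDER BY year, commit_count DESC, email_domain
"""

yearly_counts = conn.execute(YEARLY_QUERY).fetchall()
for year, domain, commit_count in yearly_counts:
    print(f"{year}\t{domain}\t{commit_count}")
```

In this made-up data, `org1.example` leads in 2019 and `org2.example` leads in 2020, which is the kind of shift a time slice is meant to surface.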

Some limitations

  • Email domain may not always be a good proxy for “organization” (committers may use public email providers like Gmail or other personal addresses)
  • An organization may have committers using multiple email domains
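One possible way to soften both limitations is to post-process the results with a hand-maintained mapping. The sketch below is only an illustration: the provider list, the domain-to-organization mapping, and the helper names `org_for_email` and `count_by_org` are all hypothetical, and a real mapping would have to be curated per project.

```python
from collections import Counter

# Hypothetical, hand-maintained lookups. Neither is complete or authoritative.
PUBLIC_PROVIDERS = {"gmail.com", "outlook.com", "yahoo.com"}
DOMAIN_TO_ORG = {
    "example.com": "Example Corp",
    "eng.example.com": "Example Corp",  # same org, second domain
}


def org_for_email(email):
    """Map an author email to an organization label (best effort)."""
    domain = email.rsplit("@", 1)[-1].lower()
    if domain in PUBLIC_PROVIDERS:
        # The domain says nothing about the employer, so bucket it separately.
        return "(public email provider)"
    # Fall back to the raw domain when no mapping is known.
    return DOMAIN_TO_ORG.get(domain, domain)


def count_by_org(emails):
    """Count commits per organization label, given one email per commit."""
    return Counter(org_for_email(email) for email in emails)


org_counts = count_by_org(
    [
        "alice@example.com",
        "bob@eng.example.com",
        "carol@gmail.com",
        "dave@other.example",
    ]
)
for org, commit_count in org_counts.most_common():
    print(f"{org}\t{commit_count}")
```

This does not recover the employer behind a personal address. It only keeps those commits from being attributed to the email provider as if it were a contributing organization.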
