
The other day my supervisor asked me an interesting question: how much time had we spent developing a new feature? The feature was massive; we started working on it late last year, and it is only now about to come out of beta. Several things made this hard to answer: nobody was dedicated to it full time, it was worked on in fits and starts, and we don’t currently track our time at this level of detail. It turned into an interesting exercise.

When we did work on the feature, it was usually in one big block of time, so our thought was to use the commit timestamps from the git log to estimate when we started and stopped working on it each day. We knew that the majority of the code lived in three places: application/views/scripts/feature/, application/controllers/FeatureController.php, and application/models/Feature. We ran the following three commands to generate a list of commits that touched those three places. Note that they use tformat: rather than format:, so every line, including the last one, ends with a newline and the appended output of one command does not run into the first line of the next.

git log --date=iso --pretty=tformat:"%h%x09%an%x09%ad%x09%s" application/views/scripts/feature/ > feature.tsv
git log --date=iso --pretty=tformat:"%h%x09%an%x09%ad%x09%s" application/controllers/FeatureController.php >> feature.tsv
git log --date=iso --pretty=tformat:"%h%x09%an%x09%ad%x09%s" application/models/Feature >> feature.tsv

This gave us tab-separated output (abbreviated hash, author name, author date, and subject) that looked like the following.

...
e1f2a31 Scott Keck-Warren       2019-09-05 21:55:38 -0400       Fixed major bug
6df8a92 Scott Keck-Warren       2019-08-21 21:05:25 -0400       Added wizbang feature
11c5726 Scott Keck-Warren       2019-04-01 21:22:50 -0400       Added missing file
a3ad753 Scott Keck-Warren       2019-03-03 14:24:24 -0500       Rebuild interface
22f17b7 Scott Keck-Warren       2019-01-08 21:19:32 -0500       Update to text
a3ad753 Scott Keck-Warren       2019-03-03 14:24:24 -0500       Rebuild interface
...

The downside of running the git command three times is that some commits touched more than one of the three places, so those commits appear more than once. To fix this we sorted the file and ran it through the uniq command, which collapses adjacent identical lines into one. Plain uniq is what we want here; uniq -u would instead drop every line that has a duplicate, losing those commits entirely.

sort feature.tsv | uniq > feature.clean.tsv

less feature.clean.tsv
...
11c5726 Scott Keck-Warren       2019-04-01 21:22:50 -0400       Added missing file
22f17b7 Scott Keck-Warren       2019-01-08 21:19:32 -0500       Update to text
6df8a92 Scott Keck-Warren       2019-08-21 21:05:25 -0400       Added wizbang feature
a3ad753 Scott Keck-Warren       2019-03-03 14:24:24 -0500       Rebuild interface
e1f2a31 Scott Keck-Warren       2019-09-05 21:55:38 -0400       Fixed major bug
...

Notice that a3ad753 is now listed only once.
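
As an aside, git log accepts more than one path after `--`, so a single invocation can cover all three locations. Each commit is then printed once, which would make the sort and uniq step unnecessary. Here is a minimal sketch, assuming the same three paths and placeholders as above. The function name `feature_log` is hypothetical and only there for illustration.

```shell
# feature_log PATH...: print one tab-separated line per commit that touched
# any of the given paths. The helper name is hypothetical.
# tformat: is used so that the final line also ends with a newline.
feature_log() {
    git log --date=iso --pretty=tformat:"%h%x09%an%x09%ad%x09%s" -- "$@"
}

# Usage (run from the repository root):
#   feature_log application/views/scripts/feature/ \
#       application/controllers/FeatureController.php \
#       application/models/Feature > feature.tsv
```

Because the output comes from one log walk, it also stays in git's usual newest-first order instead of being re-sorted by hash.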

The next step was to import the data into Google Sheets so we could determine the time ranges. We could have used a database for this, but we thought Sheets would be faster because the data needed some massaging: it’s missing a lot of information.
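
For comparison, a rough version of the per-day grouping can also be sketched on the command line. This is only an illustration under assumptions, not the Sheets process itself. It treats the first and last commit of a day as the start and end of work, expects the tab-separated layout shown earlier (date in the third column), and uses a hypothetical function name, `daily_ranges`.

```shell
# daily_ranges FILE: for each day print the date, the earliest commit time,
# and the latest commit time, separated by tabs and sorted by date.
# Assumes column 3 looks like "2019-09-05 21:55:38 -0400" (--date=iso).
# The helper name is hypothetical.
daily_ranges() {
    awk -F '\t' '
        NF >= 3 {
            split($3, parts, " ")          # parts[1] = date, parts[2] = time
            day = parts[1]
            t = parts[2]
            # HH:MM:SS strings compare correctly as plain text
            if (!(day in first) || t < first[day]) first[day] = t
            if (!(day in last)  || t > last[day])  last[day]  = t
        }
        END {
            for (day in first) print day "\t" first[day] "\t" last[day]
        }
    ' "$1" | sort
}

# Usage:
#   daily_ranges feature.clean.tsv
```

A day with a single commit yields identical start and end times, and the time-zone offset is ignored, so the result is a lower bound that still needs the kind of manual adjustment described above.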