The illustrated guide to recovering lost commits with Git

Git is one hell of a powertool.

Like with any such tool, as soon as you get to know it enough, you start pushing the boundaries. Git gives you a lot of control over your repository:

The list goes on…

More traditional version control systems don’t give you as much power as Git by any stretch of the imagination. They are like taking a walk in the woods with your parents, at age 14.

You’re probably gonna see and do neat stuff, but you sure ain’t gonna get lost or anything.

Using Git on the other hand is more akin to being handed a cool motocross to go play alone in the woods… Also at age 14.

Insane motocross shit

We all know what’s bound to happen, right?

You’ll smash into a tree.

The source control equivalent to slamming into a tree is losing commits. Getting all of Git’s power and flexibility at once can be somewhat dangerous. You’ll find it so easy and helpful to branch and merge that you’ll start doing it way more often. On the other hand – especially in the beginning – you’ll misunderstand or plainly miss some important warnings, and make errors. Or you may just end up in weird merging situations you never thought of, and don’t necessarily understand. These situations can often result in losing commits or whole branches.

My goal with this article is to make sure you understand the situation you’re really in: you have temporarily lost commits or branches.

Disclaimer

This article assumes a basic knowledge of how git works, e.g. committing, branching and merging.

My first time

The first time I lost a commit was a good while ago. I can’t remember the details, but basically I got bit by the fact that under the covers, Git uses hard links liberally. Which means that copy / pasting your code directory as a recovery solution isn’t going to save your ass when you attempt a potentially damaging operation you don’t fully understand.

Note that compressing your code directory will, though.

So there I was, after attempting an operation I didn’t really understand. I knew I had failed what I attempted and I knew I had lost my last commit. Ironically, I still had Gitk open, displaying that very commit. As long as I didn’t refresh the Gitk view with F5 I could see the lost commit.

Here’s a fun fact: under OSX (not sure about Linux) you cannot select and copy text from Gitk’s interface, except for the SHA1 field [1]. I knew Git probably had a way to recover from that… But you know, I just wanted to get back to work and NOT search documentation and blog posts endlessly.

So I took screenshots, passed them real quick through GOCR, just to see how far it would get.

The result: GOCR doesn’t like the font Monaco :-)

How to (really) recover lost commits with Git

Recently I lost a commit again. This time however, Gitk was not up to date. I knew I’d just lost something I wouldn’t necessarily remember in its entirety. It was a commit an hour old, touching many files. And I have a crappy memory.

This time I had to do it the right way. I found out it’s really easy (once you figure it out), but I found no really clear explanation anywhere. So here goes.

Initial setup

If you wanna follow along – and I strongly recommend it – here are the few boring steps to create a dummy repo and bring it up to speed for the rest of this article. We’re going to beat the hell out of this repo and it’s going to be fun.

So just paste the following into a console:

mkdir recovery;cd recovery
git init
touch file
git add file
git commit -m "First commit"
echo "Hello World" > file
git add .
git commit -m "Greetings"
git branch cool_branch
git checkout cool_branch
echo "What up world?" > cool_file
git add .
git commit -m "Now that was cool"
git checkout master
echo "What does that mean?" >> file

Ok, let’s look at where we’re at:

gitk --all &

The --all option lets you see all branches at the same time, as well as your stashes.

Initial setup - Recovering git commits

We can see the cool_branch as well as some yet uncommitted changes over the master branch.

mathieu@ml recovery (master)$ ls -l
total 16
-rw-r--r--  1 mathieu  staff    15B  7 Jun 18:19 cool_file
-rw-r--r--  1 mathieu  staff    33B  7 Jun 18:19 file

Got my 2 files, I’m good to go.

Let’s make a mistake

Let’s say I decide I want to bring in these cool changes in master. I’ll do it with a rebase. I know there’s no big risk of conflicts so that’s a no-brainer.

mathieu@ml recovery (master)$ git rebase cool_branch
file: needs update

My ugly mug

Now if you look carefully you’ll notice I wasn’t paying attention when Git gave me a feeble complaint about ‘file’.

Everything’s well, so I think “Ok, I don’t need cool_branch anymore”.

mathieu@ml recovery (master)$ git branch -d cool_branch
error: The branch 'cool_branch' is not an ancestor of your current HEAD.
If you are sure you want to delete it, run 'git branch -D cool_branch'.

Huh? Whatever you say, Linus. Let’s get on with it.

mathieu@ml recovery (master)$ git branch -D cool_branch
Deleted branch cool_branch.

Ahh, it feels good to be a Git ninja. Now let’s see where we’re at and refresh Gitk with F5.

Gitk - oh shit moment

Oops, my cool commit is gone! That thing can’t be right. Let’s panic:

mathieu@ml recovery (master)$ ls
file

mathieu@ml recovery (master)$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#    modified:   file
#
no changes added to commit (use "git add" and/or "git commit -a")

mathieu@ml recovery (master)$ git diff
diff --git a/file b/file
index 557db03..f2a8bf3 100644
--- a/file
+++ b/file
@@ -1 +1,2 @@
 Hello World
+What does that mean?

Oh sh!t

So the ‘file: needs update’ message back there meant that the rebase didn’t happen, because I had pending changes.

Helpful.
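Here is a minimal sketch of how this trap could be sidestepped. It assumes a throwaway repo created just for the demonstration, and the directory, identity and file names are placeholders. As far as I know, `git diff --quiet` and `git diff --cached --quiet` exit with a non-zero status when there are unstaged or staged changes, so a script can check for pending work before a rebase is even attempted.

```shell
# Throwaway repo; directory, identity and file names are placeholders.
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
git config user.name "Demo"; git config user.email "demo@example.com"
echo "Hello World" > file
git add file; git commit -q -m "Greetings"

# Leave an uncommitted change lying around, like in the article.
echo "What does that mean?" >> file

# Both commands exit non-zero when something is pending.
if git diff --quiet && git diff --cached --quiet; then
    tree_state="clean"
else
    tree_state="dirty"
fi
echo "Working tree is $tree_state"
```

Running a check like this before `git rebase` (and before deleting the branch afterwards) would have made the "needs update" complaint a lot harder to miss.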

Recovering a lost commit

Since I don’t think my uncommitted work is complete, I’ll just stash it instead of committing it. Then I’ll hunt down my lost work.

mathieu@ml recovery (master)$ git stash save "Questioning the universe"
Saved working directory and index state "On master: Questioning the universe"
HEAD is now at 6da726f... Greetings

In the name of paranoia, let’s make sure this got in right:

In a moment of paranoia, we make sure the stash is saved correctly
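The same check can be done from the console. This is only a sketch in a throwaway repo (directory and identity are placeholders): `git stash list` shows the saved entries and `git stash show -p` prints the stashed change as a diff.

```shell
# Throwaway repo; directory, identity and file names are placeholders.
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"; git config user.email "demo@example.com"
echo "Hello World" > file
git add file; git commit -q -m "Greetings"

# Stash an unfinished change, as in the article.
echo "What does that mean?" >> file
git stash save -q "Questioning the universe"

git stash list                   # one line per stash entry
git stash show -p "stash@{0}"    # the stashed change, as a diff
saved=$(git stash list | wc -l)  # number of entries, for scripts
```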

Ok, let’s get on with our rescue mission:

mathieu@ml recovery (master)$ git fsck --lost-found
dangling commit 93b0c51cfea8c731aa385109b8e99d19b38a55be

That sounds right, exactly one commit in the lost and found.

Let’s just make sure:

mathieu@ml recovery (master)$ git show 93b0c51cfea8c731aa385109b8e99d19b38a55be | mate

We see in textmate that this is our lost commit

Bingo!
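With more than one dangling commit, opening each SHA-1 by hand gets old fast. Below is a sketch, in a throwaway repo with placeholder names, that prints a one-line summary for each dangling commit. One assumption to flag: on recent Git versions, fsck treats commits still referenced by a reflog as reachable, so the sketch passes `--no-reflogs` to have them reported too.

```shell
# Throwaway repo; directory, identity and file names are placeholders.
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"; git config user.email "demo@example.com"
echo "Hello World" > file
git add file; git commit -q -m "Greetings"

# Commit on a side branch, then throw the branch away.
git checkout -q -b cool_branch
echo "What up world?" > cool_file
git add cool_file; git commit -q -m "Now that was cool"
lost=$(git rev-parse HEAD)
git checkout -q master
git branch -q -D cool_branch

# Collect the SHA-1 of every dangling commit (LC_ALL=C keeps the output in English).
dangling=$(LC_ALL=C git fsck --no-reflogs --lost-found 2>/dev/null |
    awk '$1 == "dangling" && $2 == "commit" { print $3 }')

# One summary line per dangling commit: short SHA-1 and subject.
for sha in $dangling; do
    git log -1 --format='%h %s' "$sha"
done
```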

Different ways to recover the commit

There are a few different ways to recover that commit. Obviously we can just copy and paste that snippet, but in the case of a bigger commit, that approach will just amount to a lot of error-prone busywork.

I’ll reclaim my Git ninja status and try it a few different ways.
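Before replaying anything, note that the lost commit can simply be given a name again. This is not one of the approaches demonstrated below, just a sketch of an alternative: `git branch <name> <sha>` points a new branch at the commit without touching master. The repo and the branch name `rescued` are made up for the example.

```shell
# Throwaway repo; directory, identity and file names are placeholders.
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"; git config user.email "demo@example.com"
echo "Hello World" > file
git add file; git commit -q -m "Greetings"

git checkout -q -b cool_branch
echo "What up world?" > cool_file
git add cool_file; git commit -q -m "Now that was cool"
lost=$(git rev-parse HEAD)    # stands in for the SHA-1 reported by git fsck
git checkout -q master
git branch -q -D cool_branch

# Point a new branch (hypothetical name) at the lost commit.
# master and the working tree are left alone.
git branch rescued "$lost"
git log -1 --format='%h %s' rescued
```

Once the commit has a branch again, it shows up in Gitk and is no longer a candidate for garbage collection.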

Recover it with rebase

Let’s just replay this change on top of master:

mathieu@ml recovery (master)$ git rebase 93b0c51cfea8c731aa385109b8e99d19b38a55be
First, rewinding head to replay your work on top of it...
HEAD is now at 93b0c51... Now that was cool
Fast-forwarded master to 93b0c51cfea8c731aa385109b8e99d19b38a55be.

Commit recovered with rebase

Neat! Now I feel like a ninja worthy of the title again.

So let’s rewind one commit and try it another way.

mathieu@ml recovery (master)$ git reset --hard HEAD^
HEAD is now at 6da726f... Greetings

Rewinding to a state where we’ve lost our commit

Ok, the commit’s gone.

(Don’t tell anyone but my inner ninja is feeling queasy again.)

Recover it with merge

There are cases where rebase is not the right tool, for example when you expect to face a lot of conflicts. In those cases merge is a better solution:

mathieu@ml recovery (master)$ git merge 93b0c51cfea8c731aa385109b8e99d19b38a55be
Updating 6da726f..93b0c51
Fast forward
 cool_file |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 cool_file

Commit recovered with merge

Too easy… Rewind!

mathieu@ml recovery (master)$ git reset --hard HEAD^
HEAD is now at 6da726f... Greetings

Recover it with cherry-pick

If instead you had a few commits one after another but you just want to pick the last one, rebase and merge won’t do. They would bring the whole branch back into master. That’s a situation for cherry-pick.

mathieu@ml recovery (master)$ git cherry-pick 93b0c51cfea8c731aa385109b8e99d19b38a55be
Finished one cherry-pick.
Created commit f443703: Now that was cool
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 cool_file

Commit recovered with cherry-pick

Insane!

This only leaves one open question: WHO’S YOUR DADDY NOW, GIT?

Now that we’ve established the answer to that question, let’s get back to work!
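Back to the cherry-pick case for a second: the repo above only ever had one lost commit, so the "pick just the last one" scenario was never really shown. Here is a sketch of it in a throwaway repo, with made-up file names and commit messages. Two commits are lost, and only the change from the second one is brought into master.

```shell
# Throwaway repo; directory, identity and file names are placeholders.
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"; git config user.email "demo@example.com"
echo "Hello World" > file
git add file; git commit -q -m "Greetings"

# Two commits on a side branch, each adding its own file.
git checkout -q -b cool_branch
echo "one" > first_file;  git add first_file;  git commit -q -m "First lost commit"
echo "two" > second_file; git add second_file; git commit -q -m "Second lost commit"
last=$(git rev-parse HEAD)
git checkout -q master
git branch -q -D cool_branch

# Only the change introduced by the last commit is replayed on master.
git cherry-pick "$last" > /dev/null
ls
```

A merge or rebase against the same SHA-1 would have brought first_file along as well.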

Let’s make a second mistake

mathieu@ml recovery (master)$ git stash clear

Or was it Git stash apply?

Oops! Accidentally lost the stash

Oh jeez, there we go again…

mathieu@ml recovery (master)$ git fsck --lost-found
dangling commit 24e3752f7a73ae98b361ce1c260e1f285d653447
dangling commit 93b0c51cfea8c731aa385109b8e99d19b38a55be

Ok, we still see the one we lost earlier, 93b0c51… Let’s look at the other one.

mathieu@ml recovery (master)$ git show 24e3752f7a73ae98b361ce1c260e1f285d653447
commit 24e3752f7a73ae98b361ce1c260e1f285d653447
Merge: 6da726f... c90f079...
Author: Mathieu Martin <[email protected]>
Date:   Sat Jun 7 16:02:57 2008 -0400

    On master: Questioning the universe

diff --cc file
index 557db03,557db03..f2a8bf3
--- a/file
+++ b/file
@@@ -1,1 -1,1 +1,2 @@@
  Hello World
++What does that mean?

Spot on. Let’s try something wild, while we’re here.

mathieu@ml recovery (master)$ git checkout 24e3752f7a73ae98b361ce1c260e1f285d653447
Note: moving to "24e3752f7a73ae98b361ce1c260e1f285d653447" which isn't a local branch
If you want to create a new branch from this checkout, you may do so
(now or later) by using -b with the checkout command again. Example:
  git checkout -b <new_branch_name>
HEAD is now at 24e3752... On master: Questioning the universe

mathieu@ml recovery (24e3752...)$

As you may have noticed, so far my console has always indicated which branch I’m in [2]. But now I seem to be in some kind of twilight zone, which Gitk confirms.

Oops! Accidentally lost the stash

Let’s follow Git’s suggestion and make that a branch.

mathieu@ml recovery (24e3752...)$ git checkout -b recovery
Switched to a new branch "recovery"

mathieu@ml recovery (recovery)$

Stash recovered as a branch

Looks weird, like stashed items always do, but at least we have our commit.

After fiddling around with what’s been recovered from the stash, I recommend NOT keeping it as a commit.

If you try to replay the change in the recovery branch over master’s most recent commit, you lose the “Questioning the universe” commit. Probably because a stash is a weird kind of commit, or maybe because of a bug. I don’t know.

(Don’t follow this one in your console)

mathieu@ml recovery (recovery)$ git rebase master  #I said don't do this one
First, rewinding head to replay your work on top of it...
HEAD is now at 93b0c51... Now that was cool
Nothing to do.

Rebasing the recovered stash over master doesn’t work

If instead I check out master and then rebase its last change over the ‘recovery’ branch, it seems to work.

Recovered stash back in master

However since I just saw a commit disappear when rebasing the other way around, I get the feeling that this isn’t a normal commit and it may come back to haunt me later.
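A hedged aside on that feeling: the `Merge:` line in the `git show` output above suggests the stash entry is stored as a commit with two parents. As far as I know, a plain rebase leaves merge commits out by default, which would be consistent with the "Nothing to do" above, but take that as an assumption rather than a diagnosis. The sketch below, in a throwaway repo with placeholder names, simply counts the parents of a fresh stash.

```shell
# Throwaway repo; directory, identity and file names are placeholders.
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"; git config user.email "demo@example.com"
echo "Hello World" > file
git add file; git commit -q -m "Greetings"

echo "What does that mean?" >> file
git stash save -q "Questioning the universe"
stash=$(git rev-parse refs/stash)

# rev-list --parents prints the commit followed by its parents;
# count the words and subtract one for the commit itself.
parents=$(git rev-list --parents -n 1 "$stash" | wc -w)
parents=$((parents - 1))
echo "The stash commit has $parents parents"
```

An ordinary commit would report a single parent here.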

Recover it by applying a diff

Let’s just apply the diff to master. I’ll do as if it actually was a substantial commit, involving lots of modifications on lots of files, and apply it automatically with ‘git apply’.

First let’s visualize where we’re at, again:

Stash recovered as a branch

A diff against master is not what we want since master includes a new (very cool) commit.

Instead we just want to see the changes introduced by the current commit. To do this we can compare it with the common ancestor between the master and recovery branches. So let’s start by finding its ID.

Finding the ID of the common ancestor
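The common ancestor can also be asked for directly instead of being read off a graph. A minimal sketch, assuming a throwaway repo shaped like the one in this article (directory and identity are placeholders): `git merge-base` prints the SHA-1 of the best common ancestor of two commits.

```shell
# Throwaway repo; directory, identity and file names are placeholders.
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"; git config user.email "demo@example.com"
echo "Hello World" > file
git add file; git commit -q -m "Greetings"
ancestor=$(git rev-parse HEAD)    # the "Greetings" commit

# One commit on recovery, one on master, both starting from "Greetings".
git checkout -q -b recovery
echo "What does that mean?" >> file
git commit -q -a -m "Questioning the universe"
git checkout -q master
echo "What up world?" > cool_file
git add cool_file; git commit -q -m "Now that was cool"

base=$(git merge-base master recovery)
git log -1 --format='%h %s' "$base"
```

The value in `$base` can then be handed straight to `git diff`.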

mathieu@ml recovery (recovery)$ git diff 6da726f37683c83947d54314cd32ca1ee9d490e0
diff --git a/file b/file
index 557db03..f2a8bf3 100644
--- a/file
+++ b/file
@@ -1 +1,2 @@
 Hello World
+What does that mean?

Looks good. Now we throw that diff upstairs.

git diff 6da726f37683c83947d54314cd32ca1ee9d490e0 > ../recovery.diff

Then we apply it to our master branch.

mathieu@ml recovery (recovery)$ git checkout master
Switched to branch "master"

mathieu@ml recovery (master)$ git apply ../recovery.diff

And we finally confirm that everything’s under control.

mathieu@ml recovery (master)$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#    modified:   file
#
no changes added to commit (use "git add" and/or "git commit -a")

mathieu@ml recovery (master)$ git diff
diff --git a/file b/file
index 557db03..f2a8bf3 100644
--- a/file
+++ b/file
@@ -1 +1,2 @@
 Hello World
+What does that mean?

This change was first stashed rather than committed because I felt it was not complete. Applying it with Git apply only introduces it as an unstaged change, which works perfectly for this situation. Now I can keep banging at the code until I feel this actually deserves to be committed.
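For a bigger patch, it may be worth asking Git whether the diff applies at all before touching the working tree. A sketch, again in a throwaway repo with placeholder names: `git apply --check` only reports whether the patch would apply, and `git apply --stat` prints a summary instead of applying.

```shell
# Throwaway repo; directory, identity and file names are placeholders.
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"; git config user.email "demo@example.com"
echo "Hello World" > file
git add file; git commit -q -m "Greetings"

# A recovery branch carrying the change we want back on master.
git checkout -q -b recovery
echo "What does that mean?" >> file
git commit -q -a -m "Questioning the universe"
patch_file=$(mktemp)
git diff master recovery > "$patch_file"
git checkout -q master

# --check modifies nothing; it only exits non-zero if the patch would fail.
if git apply --check "$patch_file"; then
    git apply --stat "$patch_file"   # summary of what is about to change
    git apply "$patch_file"          # lands as an unstaged change
fi
git status --short
```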

mathieu@ml recovery (master)$ echo "I don't know" >> file

mathieu@ml recovery (master)$ git commit -a -m "Conversation of staggering depth"
Created commit 65a4794: Conversation of staggering depth
 1 files changed, 2 insertions(+), 0 deletions(-)

Cleaning up the crud

Ok, so now I still have this weird looking recovery branch.

Now we want to get rid of this weird recovery branch

Since it’s now useless we can get rid of it.

mathieu@ml recovery (master)$ git branch -d recovery
error: The branch 'recovery' is not an ancestor of your current HEAD.
If you are sure you want to delete it, run 'git branch -D recovery'.

Aha! This time everything’s committed correctly, so I know I can delete it for real. Git is complaining because that commit was not included through its normal merge or rebase commands. So it warns me that I may be about to lose something. However I know I got everything through the diff I made and re-applied.

mathieu@ml recovery (master)$ git branch -D recovery
Deleted branch recovery.

Now that I’m aware that commits can still be recovered even if they’re not in a branch anymore, I wonder about my repo’s size.

Repository size with a few dangling commits: 224kb

mathieu@ml recovery (master)$ git gc
Counting objects: 22, done.
Compressing objects: 100% (14/14), done.
Writing objects: 100% (22/22), done.
Total 22 (delta 7), reused 0 (delta 0)

mathieu@ml recovery (master)$ git prune

Repo size after cleaning up the crud: 152kb

Fair enough. I would expect the unused commits to be gone by now, but strangely enough:

mathieu@ml recovery (master)$ git fsck --lost-found
dangling commit 49ed65cdea22443af3f1fd400754fe1517421b24
dangling commit 4b1bf4792cba929e88114379d7d5e86a2dc9990f
dangling commit 6cdf88318109dede7bd3c1a75be76c7255708ded
dangling commit 715a6b2cfe797383216d0f9b04fe8f50e90e779f
dangling commit f443703e5060d9f3b4d97504bda5f97e5a0b31e8

If anyone finds out what that’s all about, please let me know!

Maybe Git’s just refusing to do any work unless it’s going to actually save a considerable amount of space? I have no idea.
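One possible explanation, offered as a guess and not as a diagnosis of the output above: recent Git versions keep objects that a reflog still points to, and by default `git gc` also spares unreachable objects that are only a few days old. The sketch below assumes a reasonably recent Git and a throwaway repo with placeholder names. It shows a lost commit surviving until the reflogs are expired and the repo is pruned with no grace period. Careful, this really deletes things, so don't paste it into a repo you care about.

```shell
# Throwaway repo; directory, identity and file names are placeholders.
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"; git config user.email "demo@example.com"
echo "Hello World" > file
git add file; git commit -q -m "Greetings"

git checkout -q -b cool_branch
echo "What up world?" > cool_file
git add cool_file; git commit -q -m "Now that was cool"
lost=$(git rev-parse HEAD)
git checkout -q master
git branch -q -D cool_branch

# The commit has no branch anymore but is still in the object database.
git cat-file -e "$lost" && echo "still there after deleting the branch"

# DESTRUCTIVE: forget all reflog entries, then prune with no grace period.
git reflog expire --expire=now --all
git gc --quiet --prune=now

if git cat-file -e "$lost" 2>/dev/null; then
    after="kept"
else
    after="pruned"
fi
echo "Lost commit was $after"
```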

Conclusion

Once you know how to recover from bad mistakes, you’ll find that Git is not only a very powerful tool, but also a very forgiving one. As opposed to a motocross.

The following commands will help you figure your way out of most bad situations:

  • git show
  • git fsck --lost-found
  • git diff

And these ones will actually get you out of these bad situations:

  • git rebase
  • git cherry-pick
  • git merge
  • git apply

As I think I demonstrated, Git gives you the ability to recover from most bad mistakes. The fact that any single commit can be cherry-picked, checked out, rebased or merged makes it really easy to recover from hairy situations.

The only case where you might actually lose information is when something has not been committed or stashed yet, which I think is perfectly reasonable.

So if you take only one thing away from this article, let it be this. Git is much safer than a motocross.

Footnotes

[1] At the time I didn’t know that just having the SHA1 id was enough to save me.

[2] See how to configure your console in the same manner and also get auto-completion for Git here.
