The illustrated guide to recovering lost commits with Git
Git is one hell of a power tool.
As with any such tool, as soon as you get to know it well enough, you start pushing the boundaries. Git gives you a lot of control over your repository:
- trivial branching and merging (even with long lived branches);
- rebasing as a cleaner alternative to merging;
- stashing aside your changes for a quick fix elsewhere;
- extracting logical, distinct commits from a multi-hours coding spree;
The list goes on…
More traditional version control systems don’t give you as much power as Git by any stretch of the imagination. They are like taking a walk in the woods with your parents, at age 14.
You’re probably gonna see and do neat stuff, but you sure ain’t gonna get lost or anything.
Using Git on the other hand is more akin to being handed a cool motocross to go play alone in the woods… Also at age 14.
We all know what’s bound to happen, right?
You’ll smash into a tree.
The source control equivalent to slamming into a tree is losing commits. Getting all of Git’s power and flexibility at once can be somewhat dangerous. You’ll find it so easy and helpful to branch and merge that you’ll start doing it way more often. On the other hand – especially in the beginning – you’ll misunderstand or plainly miss some important warnings, and make errors. Or you may just end up in weird merging situations you never thought of, and don’t necessarily understand. These situations can often result in losing commits or whole branches.
My goal with this article is to make sure you understand the situation you’re really in: you have temporarily lost commits or branches.
Disclaimer
This article assumes a basic knowledge of how Git works, e.g. committing, branching and merging.
My first time
The first time I lost a commit was a good while ago. I can’t remember the details, but basically I got bitten by the fact that, under the covers, Git uses hard links liberally. That means copying and pasting your code directory as a backup isn’t going to save your ass when you attempt a potentially damaging operation you don’t fully understand.
Note that compressing your code directory will, though.
So there I was, after attempting an operation I didn’t really understand. I knew I had failed what I attempted and I knew I had lost my last commit. Ironically, I still had Gitk open, displaying that very commit. As long as I didn’t refresh the Gitk view with F5 I could see the lost commit.
Here’s a fun fact: under OSX (not sure about Linux) you cannot select and copy text from Gitk’s interface, except for the SHA1 field [1]. I knew Git probably had a way to recover from that… But you know, I just wanted to get back to work and NOT search documentation and blog posts endlessly.
So I took screenshots, passed them real quick through GOCR, just to see how far it would get.
The result: GOCR doesn’t like the font Monaco :-)
How to (really) recover lost commits with Git
Recently I lost a commit again. This time however, Gitk was not up to date. I knew I’d just lost something I wouldn’t necessarily remember in its entirety. It was a commit an hour old, touching many files. And I have a crappy memory.
This time I had to do it the right way. I found out it’s really easy (once you figure it out), but I found no really clear explanation anywhere. So here goes.
Initial setup
If you wanna follow along – and I strongly recommend it – here are the few boring steps to create a dummy repo and bring it up to speed for the rest of this article. We’re going to beat the hell out of this repo and it’s going to be fun.
So just paste the following into a console:
mkdir recovery; cd recovery
git init
touch file
git add file
git commit -m "First commit"
echo "Hello World" > file
git add .
git commit -m "Greetings"
git branch cool_branch
git checkout cool_branch
echo "What up world?" > cool_file
git add .
git commit -m "Now that was cool"
git checkout master
echo "What does that mean?" >> file
Ok, let’s look at where we’re at:
gitk --all &
The --all option lets you see all branches at the same time, as well as your stashes.
We can see the cool_branch as well as some yet uncommitted changes over the master branch.
mathieu@ml recovery (master)$ ls -l
total 16
-rw-r--r-- 1 mathieu staff 15B 7 Jun 18:19 cool_file
-rw-r--r-- 1 mathieu staff 33B 7 Jun 18:19 file
Got my 2 files, I’m good to go.
Let’s make a mistake
Let’s say I decide I want to bring in these cool changes in master. I’ll do it with a rebase. I know there’s no big risk of conflicts so that’s a no-brainer.
mathieu@ml recovery (master)$ git rebase cool_branch
file: needs update
Now if you look carefully you’ll notice I wasn’t paying attention when Git gave me a feeble complaint about ‘file’.
Everything’s well, so I think “Ok, I don’t need cool_branch anymore”.
mathieu@ml recovery (master)$ git branch -d cool_branch
error: The branch 'cool_branch' is not an ancestor of your current HEAD.
If you are sure you want to delete it, run 'git branch -D cool_branch'.
Huh? Whatever you say, Linus. Let’s get on with it.
mathieu@ml recovery (master)$ git branch -D cool_branch
Deleted branch cool_branch.
Ahh, it feels good to be a Git ninja. Now let’s see where we’re at and refresh Gitk with F5.
Oops, my cool commit is gone! That thing can’t be right. Let’s panic:
mathieu@ml recovery (master)$ ls
file
mathieu@ml recovery (master)$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#   modified: file
#
no changes added to commit (use "git add" and/or "git commit -a")
mathieu@ml recovery (master)$ git diff
diff --git a/file b/file
index 557db03..f2a8bf3 100644
--- a/file
+++ b/file
@@ -1 +1,2 @@
 Hello World
+What does that mean?
Oh sh!t
So the ‘file: needs update’ message back there meant that the rebase didn’t happen, because I had pending changes.
Helpful.
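One way to avoid repeating this mistake is to check for pending changes before rebasing. Here is a minimal sketch, not anything Git ships with: `safe_rebase` is a hypothetical helper, and the temporary directory, the "Demo" identity and the branch names are placeholders that recreate the situation above in a throwaway repo.

```shell
# Throwaway repo; directory, identity and branch names are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # same branch name as in the article
git config user.name "Demo" && git config user.email "demo@example.com"
echo "Hello World" > file && git add file && git commit -q -m "Greetings"
git checkout -q -b cool_branch
echo "What up world?" > cool_file && git add cool_file
git commit -q -m "Now that was cool"
git checkout -q master
echo "What does that mean?" >> file       # uncommitted change, as above

# Hypothetical helper: rebase only when tracked files have no pending changes.
safe_rebase() {
  if git diff --quiet && git diff --cached --quiet; then
    git rebase -q "$1"
  else
    echo "Pending changes: commit or stash before rebasing onto $1" >&2
    return 1
  fi
}

if safe_rebase cool_branch; then
  result="rebased"
else
  result="refused"
fi
echo "$result"
```

With the uncommitted change to `file` in place, the helper refuses loudly instead of letting the complaint slip by.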
Recovering a lost commit
Since I don’t think my uncommitted work is complete, I’ll just stash it instead of committing it. Then I’ll hunt down my lost work.
mathieu@ml recovery (master)$ git stash save "Questioning the universe"
Saved working directory and index state "On master: Questioning the universe"
HEAD is now at 6da726f... Greetings
In the name of paranoia, let’s make sure this got in right:
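From the console, one way to check is to list the stashes and look at the patch held by the latest one. This is a sketch in a throwaway repo; the temporary directory and the "Demo" identity are placeholders.

```shell
# Throwaway repo; directory and identity are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo" && git config user.email "demo@example.com"
echo "Hello World" > file && git add file && git commit -q -m "Greetings"
echo "What does that mean?" >> file
git stash save "Questioning the universe" > /dev/null

# List every stash, then show the changes stored in the most recent one.
git stash list
git stash show -p 'stash@{0}'

stash_count=$(git stash list | wc -l | tr -d ' ')
```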
Ok, let’s get on with our rescue mission:
mathieu@ml recovery (master)$ git fsck --lost-found
dangling commit 93b0c51cfea8c731aa385109b8e99d19b38a55be
That sounds right, exactly one commit in the lost and found.
Let’s just make sure:
mathieu@ml recovery (master)$ git show 93b0c51cfea8c731aa385109b8e99d19b38a55be | mate
Bingo!
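With more than one dangling commit, running git show on each ID gets tedious. The sketch below prints a one-line summary per dangling commit. It runs in a throwaway repo with placeholder names. One assumption worth stating: current Git versions treat commits that are still referenced from the reflog as reachable, so the sketch passes `--no-reflogs` to make a freshly deleted branch tip show up.

```shell
# Throwaway repo; directory, identity and branch names are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo" && git config user.email "demo@example.com"
echo "Hello World" > file && git add file && git commit -q -m "Greetings"
git checkout -q -b cool_branch
echo "What up world?" > cool_file && git add cool_file
git commit -q -m "Now that was cool"
git checkout -q master
git branch -D cool_branch > /dev/null     # the "mistake"

# Collect the IDs of dangling commits, ignoring reflog references.
lost=$(git fsck --no-reflogs --lost-found 2>/dev/null |
  awk '/^dangling commit/ {print $3}')

# Print the abbreviated ID and subject line of each one.
for sha in $lost; do
  git log --no-walk --format='%h %s' "$sha"
done
```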
Different ways to recover the commit
There are a few different ways to recover that commit. Obviously we can just copy and paste that snippet, but in the case of a bigger commit, that approach will just amount to a lot of error-prone busywork.
I’ll reclaim my Git ninja status and try it a few different ways.
Recover it with rebase
Let’s just replay this change on top of master:
mathieu@ml recovery (master)$ git rebase 93b0c51cfea8c731aa385109b8e99d19b38a55be
First, rewinding head to replay your work on top of it...
HEAD is now at 93b0c51... Now that was cool
Fast-forwarded master to 93b0c51cfea8c731aa385109b8e99d19b38a55be.
Neat! Now I feel like a ninja worthy of the title again.
So let’s rewind one commit and try it another way.
mathieu@ml recovery (master)$ git reset --hard HEAD^
HEAD is now at 6da726f... Greetings
Ok, the commit’s gone.
(Don’t tell anyone but my inner ninja is feeling queasy again.)
Recover it with merge
There are cases where rebase is not powerful enough. For example when you expect to face a lot of conflicts. In this case merge is a better solution:
mathieu@ml recovery (master)$ git merge 93b0c51cfea8c731aa385109b8e99d19b38a55be
Updating 6da726f..93b0c51
Fast forward
 cool_file | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 cool_file
Too easy… Rewind!
mathieu@ml recovery (master)$ git reset --hard HEAD^
HEAD is now at 6da726f... Greetings
Recover it with cherry-pick
If instead you had a few commits one after another but you just want to pick the last one, rebase and merge won’t do. They would bring the whole branch back in master. That’s a situation for cherry-pick.
mathieu@ml recovery (master)$ git cherry-pick 93b0c51cfea8c731aa385109b8e99d19b38a55be
Finished one cherry-pick.
Created commit f443703: Now that was cool
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 cool_file
Insane!
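The dummy repo only ever loses a single commit, so the difference with rebase and merge doesn’t show. Here is a sketch of the several-commits case, in a throwaway repo where `lost_branch`, the file names `one` and `two`, and the "Demo" identity are all made up for illustration. Only the last commit of the deleted branch comes back.

```shell
# Throwaway repo; directory, identity, branch and file names are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo" && git config user.email "demo@example.com"
echo "Hello World" > file && git add file && git commit -q -m "Greetings"

# Two commits on a branch that is about to be deleted.
git checkout -q -b lost_branch
echo "first" > one && git add one && git commit -q -m "Step one"
echo "second" > two && git add two && git commit -q -m "Step two"
tip=$(git rev-parse HEAD)                 # ID of the last commit only
git checkout -q master
git branch -D lost_branch > /dev/null

# Bring back just the last commit; "Step one" stays out of master.
git cherry-pick "$tip" > /dev/null
ls
```

A rebase or merge against `$tip` would have pulled in "Step one" as well, since it is an ancestor of that commit.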
This only leaves one open question: WHO’S YOUR DADDY NOW, GIT?
Now that we’ve established the answer to that question, let’s get back to work!
Let’s make a second mistake
mathieu@ml recovery (master)$ git stash clear
Or was it git stash apply?
Oh jeez, there we go again…
mathieu@ml recovery (master)$ git fsck --lost-found
dangling commit 24e3752f7a73ae98b361ce1c260e1f285d653447
dangling commit 93b0c51cfea8c731aa385109b8e99d19b38a55be
Ok, we still see the one we lost earlier, 93b0c51… Let’s look at the other one.
mathieu@ml recovery (master)$ git show 24e3752f7a73ae98b361ce1c260e1f285d653447
commit 24e3752f7a73ae98b361ce1c260e1f285d653447
Merge: 6da726f... c90f079...
Author: Mathieu Martin <[email protected]>
Date: Sat Jun 7 16:02:57 2008 -0400

    On master: Questioning the universe

diff --cc file
index 557db03,557db03..f2a8bf3
--- a/file
+++ b/file
@@@ -1,1 -1,1 +1,2 @@@
  Hello World
++What does that mean?
Spot on. Let’s try something wild, while we’re here.
mathieu@ml recovery (master)$ git checkout 24e3752f7a73ae98b361ce1c260e1f285d653447
Note: moving to "24e3752f7a73ae98b361ce1c260e1f285d653447" which isn't a local branch
If you want to create a new branch from this checkout, you may do so
(now or later) by using -b with the checkout command again. Example:
  git checkout -b <new_branch_name>
HEAD is now at 24e3752... On master: Questioning the universe
mathieu@ml recovery (24e3752...)$
As you may have noticed, my console always indicates which branch I’m in, so far [2]. But now I seem to be in some kind of twilight zone, which Gitk confirms.
Let’s follow Git’s suggestion and make that a branch.
mathieu@ml recovery (24e3752...)$ git checkout -b recovery
Switched to a new branch "recovery"
mathieu@ml recovery (recovery)$
Looks weird, like stashed items always do, but at least we have our commit.
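If the detour through the twilight zone makes you nervous, a branch can also be pointed at a dangling commit directly, without checking it out first. This is a sketch in a throwaway repo; the branch name `rescued` and the other names are placeholders, and `--no-reflogs` is there because current Git versions otherwise count reflog entries as references.

```shell
# Throwaway repo; directory, identity and branch names are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo" && git config user.email "demo@example.com"
echo "Hello World" > file && git add file && git commit -q -m "Greetings"
git checkout -q -b cool_branch
echo "What up world?" > cool_file && git add cool_file
git commit -q -m "Now that was cool"
git checkout -q master
git branch -D cool_branch > /dev/null

# Find the dangling commit, then hang a new branch on it.
lost=$(git fsck --no-reflogs --lost-found 2>/dev/null |
  awk '/^dangling commit/ {print $3}')
git branch rescued "$lost"

# HEAD never left master; the commit is simply reachable again.
git log --oneline rescued
```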
After fiddling around with what’s been recovered from the stash, I recommend NOT keeping it as a commit.
If you try to replay the change in the recovery branch over master’s most recent commit, you lose the “Questioning the universe” commit. Probably because a stash is a weird kind of commit, or maybe because of a bug. I don’t know.
(Don’t follow this one in your console)
mathieu@ml recovery (recovery)$ git rebase master #I said don't do this one
First, rewinding head to replay your work on top of it...
HEAD is now at 93b0c51... Now that was cool
Nothing to do.
If instead I checkout master and then rebase its last change over the ‘recovery’ branch it seems to work.
However since I just saw a commit disappear when rebasing the other way around, I get the feeling that this isn’t a normal commit and it may come back to haunt me later.
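There is a likely explanation for the disappearing act. The git show output earlier has a "Merge:" line, because a stash is stored as a commit with two parents: the commit HEAD pointed at, and a second commit that records the index. Rebase leaves merge commits out by default when it replays a branch, which would account for the "Nothing to do". The sketch below only checks the two-parent part, in a throwaway repo with placeholder names.

```shell
# Throwaway repo; directory and identity are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo" && git config user.email "demo@example.com"
echo "Hello World" > file && git add file && git commit -q -m "Greetings"
echo "What does that mean?" >> file
git stash save "Questioning the universe" > /dev/null

# Print the stash commit followed by the IDs of its parents.
git rev-list --parents -n 1 'stash@{0}'

# Count the parents: every field after the first one is a parent ID.
parent_count=$(git rev-list --parents -n 1 'stash@{0}' | awk '{print NF - 1}')
echo "$parent_count"
```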
Recover it by applying a diff
Let’s just apply the diff to master. I’ll do as if it actually was a substantial commit, involving lots of modifications on lots of files, and apply it automatically with ‘git apply’.
First let’s visualize where we’re at, again:
A diff against master is not what we want since master includes a new (very cool) commit.
Instead we just want to see the changes introduced by the current commit. To do this we can compare it with the common ancestor between the master and recovery branches. So let’s start by finding its ID.
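One way to get that ID from the console is git merge-base. This is a sketch in a throwaway repo that mimics the shape of the history above; the temporary directory, the "Demo" identity and the commit contents are placeholders.

```shell
# Throwaway repo; directory, identity and commit contents are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo" && git config user.email "demo@example.com"
echo "Hello World" > file && git add file && git commit -q -m "Greetings"

# One commit on recovery, one on master, both on top of "Greetings".
git checkout -q -b recovery
echo "What does that mean?" >> file && git commit -q -a -m "Questioning"
git checkout -q master
echo "What up world?" > cool_file && git add cool_file
git commit -q -m "Now that was cool"

# The common ancestor of both branches is the "Greetings" commit.
base=$(git merge-base master recovery)
echo "$base"

# Changes that recovery introduces relative to that ancestor.
git diff "$base" recovery
```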
mathieu@ml recovery (recovery)$ git diff 6da726f37683c83947d54314cd32ca1ee9d490e0
diff --git a/file b/file
index 557db03..f2a8bf3 100644
--- a/file
+++ b/file
@@ -1 +1,2 @@
 Hello World
+What does that mean?
Looks good. Now we throw that diff upstairs.
git diff 6da726f37683c83947d54314cd32ca1ee9d490e0 > ../recovery.diff
Then apply it to our master branch.
mathieu@ml recovery (recovery)$ git checkout master
Switched to branch "master"
mathieu@ml recovery (master)$ git apply ../recovery.diff
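For a genuinely big diff, a dry run before applying is cheap insurance. The sketch below uses `git apply --check`, which only tests whether the patch would apply cleanly. It runs in a throwaway repo, and the patch location, directory and identity are placeholders.

```shell
# Throwaway repo; directory, identity and patch location are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo" && git config user.email "demo@example.com"
echo "Hello World" > file && git add file && git commit -q -m "Greetings"

# Save a change as a patch, then put the file back the way it was.
patch=$(mktemp)
echo "What does that mean?" >> file
git diff > "$patch"
git checkout -q -- file

# Dry run first; apply for real only if the check passes.
if git apply --check "$patch"; then
  git apply "$patch"
fi

# The change is back as an unstaged modification.
git status --short
```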
And we finally confirm that everything’s under control.
mathieu@ml recovery (master)$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#   modified: file
#
no changes added to commit (use "git add" and/or "git commit -a")
mathieu@ml recovery (master)$ git diff
diff --git a/file b/file
index 557db03..f2a8bf3 100644
--- a/file
+++ b/file
@@ -1 +1,2 @@
 Hello World
+What does that mean?
This change was first stashed rather than committed because I felt it was not complete. Applying it with git apply only introduces it as an unstaged change, which works perfectly for this situation. Now I can keep banging at the code until I feel this actually deserves to be committed.
mathieu@ml recovery (master)$ echo "I don't know" >> file
mathieu@ml recovery (master)$ git commit -a -m "Conversation of staggering depth"
Created commit 65a4794: Conversation of staggering depth
 1 files changed, 2 insertions(+), 0 deletions(-)
Cleaning up the crud
Ok, so now I still have this weird looking recovery branch.
Since it’s now useless we can get rid of it.
mathieu@ml recovery (master)$ git branch -d recovery
error: The branch 'recovery' is not an ancestor of your current HEAD.
If you are sure you want to delete it, run 'git branch -D recovery'.
Aha! This time everything’s committed correctly, so I know I can delete it for real. Git is complaining because that commit was not included through its normal merge or rebase commands. So it warns me that I may be about to lose something. However I know I got everything through the diff I made and re-applied.
mathieu@ml recovery (master)$ git branch -D recovery
Deleted branch recovery.
Now that I’m aware that commits can still be recovered even if they’re not in a branch anymore, I wonder about my repo’s size.
mathieu@ml recovery (master)$ git gc
Counting objects: 22, done.
Compressing objects: 100% (14/14), done.
Writing objects: 100% (22/22), done.
Total 22 (delta 7), reused 0 (delta 0)
mathieu@ml recovery (master)$ git prune
Fair enough. I would expect the unused commits to be gone by now, but strangely enough:
mathieu@ml recovery (master)$ git fsck --lost-found
dangling commit 49ed65cdea22443af3f1fd400754fe1517421b24
dangling commit 4b1bf4792cba929e88114379d7d5e86a2dc9990f
dangling commit 6cdf88318109dede7bd3c1a75be76c7255708ded
dangling commit 715a6b2cfe797383216d0f9b04fe8f50e90e779f
dangling commit f443703e5060d9f3b4d97504bda5f97e5a0b31e8
If anyone finds out what that’s all about, please let me know!
Maybe Git’s just refusing to do any work unless it’s going to actually save a considerable amount of space? I have no idea.
Conclusion
Once you know how to recover from bad mistakes, you’ll find that Git is not only a very powerful tool, but also a very forgiving one. As opposed to a motocross.
The following commands will help you figure your way out of most bad situations:
- git show
- git fsck --lost-found
- git diff
And these ones will actually get you out of these bad situations:
- git rebase
- git cherry-pick
- git merge
- git apply
As I think I demonstrated, Git gives you the ability to recover from most bad mistakes. The fact that any single commit can be cherry-picked, checked out, rebased or merged makes it really easy to recover from hairy situations.
The only case where you might actually lose information is when something has not been committed or stashed yet, which I think is perfectly reasonable.
So if you take only one thing away from this article, let it be this. Git is much safer than a motocross.
Footnotes
[1] At the time I didn’t know that just having the SHA1 id was enough to save me.
[2] See how to configure your console in the same manner and also get auto-completion for Git here.