Remove duplicate files from Google Drive

Early on, while I was trying to sync Google Drive to my PC, the drive filled up with duplicate files. I recently worked out a method to remove these duplicates.

First, we’ll build a program called grive, which syncs Google Drive to a local directory on a server. Start by installing the build dependencies:

apt-get install git cmake libgcrypt11-dev libjson0-dev libcurl4-openssl-dev libexpat1-dev libboost-filesystem-dev libboost-program-options-dev libboost-all-dev build-essential automake autoconf libtool pkg-config intltool libxml2-dev libgtk2.0-dev libnotify-dev libglib2.0-dev libevent-dev checkinstall

Install Qt4 (required by cmake for Boost):

apt-get install libqt4-core libqt4-dev libqt4-gui qt4-dev-tools

Clone and build yajl and grive, then authenticate grive from a new sync directory:

git clone https://github.com/lloyd/yajl yajl
cd yajl
./configure && cmake . && make && checkinstall
cd ..
git clone https://github.com/Grive/grive.git
cd ./grive
./configure && cmake . && make
cp grive/grive /usr/bin/
mkdir ~/Gdrive
cd ~/Gdrive/
grive -a

At this point, follow the instructions grive prints: open the generated link in a browser, authenticate with Google, and paste the resulting code back into the terminal.
Google Drive will then sync to ~/Gdrive for the first time.

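We will now install a program called fdupes, which can detect duplicate files recursively and remove them easily. In the command below, -r recurses into subdirectories, -d deletes duplicates, and -N keeps the first file in each set and deletes the rest without prompting.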

apt-get install fdupes
cd ~/Gdrive
fdupes -rdN ~/Gdrive
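
To cross-check what counts as a duplicate without relying on fdupes, the sketch below groups files by MD5 checksum using GNU find, md5sum, sort and uniq. It is a simplified stand-in, not how fdupes itself works internally, and the function name list_duplicates is only illustrative. It deletes nothing.

```shell
# List groups of files under a directory that share the same MD5 checksum.
# Read-only: this only prints candidates for review, it never deletes.
# Assumes GNU coreutils (uniq -w and --all-repeated) and ordinary file names
# (no newlines or backslashes, which md5sum escapes in its output).
list_duplicates() {
    dir="${1:-$HOME/Gdrive}"   # default matches the sync directory used above
    find "$dir" -type f -exec md5sum {} + \
        | sort \
        | uniq -w32 --all-repeated=separate
}
```

Running list_duplicates ~/Gdrive prints each group of identical files as a block of "checksum  path" lines, with a blank line between groups.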

Now sync the changes from the local directory to the remote by running grive from inside ~/Gdrive:

grive

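Before running fdupes with the -dN options, you may wish to do a dry run without them (fdupes -r ~/Gdrive). This only lists the sets of duplicates, so you can check that the correct files will be deleted.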
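
A dry run along those lines might look like the sketch below. The wrapper name preview_duplicates is only illustrative. It uses the fdupes -f (omit first) option, which hides the first file of each set; since -dN preserves the first file, the second listing should correspond to the files that would be removed, assuming fdupes orders the sets the same way on both runs.

```shell
# Preview duplicates without deleting anything.
preview_duplicates() {
    dir="${1:-$HOME/Gdrive}"   # default matches the sync directory used above

    echo "== Duplicate sets =="
    fdupes -r "$dir"           # every file in every duplicate set

    echo "== Files that -dN would be expected to remove =="
    fdupes -rf "$dir"          # same sets with the first (kept) file omitted
}
```

Call it as preview_duplicates ~/Gdrive and review both listings before running the real deletion.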


You are reading this post on Joel G Mathew’s tech blog. Joel's personal blog is the Eyrie, hosted here.