Some time back I put out a backup script written in bash. After using it successfully for a while, I decided to rewrite it in Python, for several reasons:
It had bugs. The bugs were known (it's not as if it's a million lines of code), but I couldn't get rid of them, simply because I could never understand the semantics of the language well enough to write exactly what I wanted.
On the other hand, Python has an elegant, readable syntax. Things behave as you would expect them to. Python also checks the code far more thoroughly than the shell does, so it raises errors when there are any, and you can catch them. Consequently, the Python code that I came up with is smaller and more readable.
My initial experience in using it is that it seems to run quicker too. But I can't vouch for that.
I managed the re-implementation in a couple of hours, as opposed to probably several days' worth of effort (with gaps) that went into the shell script. Part of the reason was, of course, that the design was absolutely clear in my mind, as I had already done the implementation once. So I don't regret having spent the effort on building my first prototype and throwing it away. But another part is (please don't read this sentence if you are a shell-script fan): Python is more modern and better than shell script.
Here are the main features:
- It will make the contents under destinationRoot identical to those under sourceRoot.
- It will do selective copying. That is, it will copy a file anywhere inside sourceRoot to the right place inside destinationRoot if and only if there is no corresponding copy in destinationRoot, or the destination copy is older than the source copy.
- It allows you to specify any subdirectory of sourceRoot that you want to back up. This is useful when you know that the changes since the last backup are localised to a particular location, and you don't want to waste time scanning other locations. For large data, this saves a lot of time.
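To make the selective-copy rule concrete, here is a rough sketch of how it could be written. This is an illustration only, not the script itself: the names needs_copy and selective_backup are hypothetical, Python 3 is assumed, and "older" is assumed to mean an earlier modification time. The sketch only adds or refreshes files; it never deletes anything from the destination.

```python
import os
import shutil


def needs_copy(src, dst):
    """Return True if dst is missing or older than src (by modification time)."""
    if not os.path.exists(dst):
        return True
    return os.path.getmtime(dst) < os.path.getmtime(src)


def selective_backup(source_root, destination_root, subdirectory=None):
    """Copy new or updated files from source_root into destination_root.

    If subdirectory is given, only that part of source_root is scanned.
    Returns the list of destination paths that were written.
    (Hypothetical helper, not the function from the original script.)
    """
    source_root = os.path.abspath(source_root)
    destination_root = os.path.abspath(destination_root)
    start = os.path.abspath(subdirectory) if subdirectory else source_root

    # Refuse to scan anything that lies outside source_root.
    if os.path.commonpath([source_root, start]) != source_root:
        raise ValueError("%s is not inside %s" % (start, source_root))

    copied = []
    for current_dir, _dirs, files in os.walk(start):
        # Mirror the directory's position relative to source_root.
        relative = os.path.relpath(current_dir, source_root)
        target_dir = os.path.normpath(os.path.join(destination_root, relative))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(current_dir, name)
            dst = os.path.join(target_dir, name)
            if needs_copy(src, dst):
                # copy2 keeps the modification time, so the file counts as
                # up to date on the next run.
                shutil.copy2(src, dst)
                copied.append(dst)
    return copied
```

Calling selective_backup('/home/sujitkc/my_work', '/home/sujitkc/work_backup') would scan the whole source tree; passing a third argument restricts the scan to that subdirectory, which is where the time saving on large data comes from.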
So, I invite you to use this new one in preference to the old one. It does everything that the shell script did, and it doesn't have the old bugs. So here goes the code:
The above code is a sort of library, which you use along with a driver script. An example follows:
#!/usr/bin/python
from backup import *
sourceRoot = '/home/sujitkc/my_work'
destinationRoot = '/home/sujitkc/work_backup'
backup(sourceRoot, destinationRoot)
We usually have multiple backup scenarios. For example, I have the following scenarios:
- I take an (approximately) daily backup of my work folder at the office.
- I take a fortnightly or monthly backup of my personal data at home.
- I take a monthly backup of my entire data.
In the above three cases, the only things that change are sourceRoot and destinationRoot. Therefore, I have a separate driver for each of the above scenarios: backup_work.py, backup_personal.py and backup_all.py respectively. In each of these drivers, I have set the sourceRoot and destinationRoot variables to appropriate values. Depending on my backup scenario, I just have to run the corresponding driver script. That's all!
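Since the scenarios differ only in their two roots, the same idea can also be written down as data. The following is a small sketch of that alternative, not something the drivers above actually do: the SCENARIOS table, the roots_for function and every path in it are made-up placeholders.

```python
# Hypothetical scenario table: all paths are placeholders for illustration.
SCENARIOS = {
    "work": ("/home/user/my_work", "/media/backup/work"),
    "personal": ("/home/user/personal", "/media/backup/personal"),
    "all": ("/home/user", "/media/backup/all"),
}


def roots_for(scenario):
    """Return the (sourceRoot, destinationRoot) pair for a named scenario."""
    try:
        return SCENARIOS[scenario]
    except KeyError:
        known = ", ".join(sorted(SCENARIOS))
        raise ValueError("unknown scenario %r; expected one of: %s"
                         % (scenario, known))


# A driver could then unpack the pair and pass it on, for example:
#   sourceRoot, destinationRoot = roots_for("work")
#   backup(sourceRoot, destinationRoot)
```

Separate driver files are simpler to run by name; a table like this is just a convenient way to see all the scenarios in one place.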
To use the script, all you need to do is the following:
- Save script 1 as backup.py somewhere and make it executable using:
chmod +x backup.py
- Save script 2 as, say, backup_work.py, at the same location and make it executable using:
chmod +x backup_work.py
- In script 2, change the values of the sourceRoot and destinationRoot variables to appropriate locations.
- You are all set! Now run the script as follows:
./backup_work.py your_backup_directory
If you are just starting off with this script and are wondering what to put as your_backup_directory, here's a clue: often, the your_backup_directory parameter has the same value as sourceRoot.
If you find the above script useful, I would highly appreciate a word of acknowledgement as a comment on this post. Feel free to get in touch if you find any problem with the code.
A Script for Backing Up