Why Amazon S3 backup doesn't work
I’ve been going through an inordinate number of machines lately. I’m down one Seasonic power supply and two motherboards (one Megabyte, one MSI), and the ThinkPad came back from Lenovo with the original problem solved but with half the screen on the fritz.
This isn’t new or unusual; most of the machines I’ve put together have only lasted a couple of years or so. Consumer hardware just isn’t built for reliability, and hard drives are the worst offenders. As a result, most of the information I really care about is stored off-site. The two biggest exceptions are the iTunes library and my financial data.
The standard solution is to have a backup hard drive. The problem here is that backup hard drives also go bad. I’ve run into this a couple of times with the LinkStation and the Western Digital external drive. There’s not much you can do to test this except restore from backup every once in a while and see if it actually worked (which is about as fun as it sounds).
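A slightly less painful way to do that check is to compare checksums between the master copy and the backup instead of restoring everything and eyeballing it. What follows is only a minimal sketch, assuming the backup is a plain directory tree mirroring the master; the function names `file_digest` and `compare_trees` are made up for illustration and aren't part of any backup tool.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def compare_trees(master, backup):
    """Return (missing, corrupted) lists of relative paths in the backup.

    Assumes the backup mirrors the master's directory layout (hypothetical setup).
    """
    master, backup = Path(master), Path(backup)
    missing, corrupted = [], []
    for src in sorted(master.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(master)
        dst = backup / rel
        if not dst.is_file():
            missing.append(rel.as_posix())
        elif file_digest(src) != file_digest(dst):
            corrupted.append(rel.as_posix())
    return missing, corrupted
```

Calling `compare_trees("/path/to/master", "/path/to/backup")` returns two lists of relative paths: files that never made it to the backup, and files whose contents differ. It only tells you the copies disagree; it can't tell you which side went bad.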
Another possible solution is to use off-site backup such as Mozy or Amazon S3. I read Zawodny’s experiment with it and started using s3sync. But.
1) Uploading 30GB of music took a week.
2) Most, if not all, of that data was corrupted to the point of unintelligibility.
I only found this out when I thought I was missing some data, of course. Downloading 30GB of data is also a pain, and frankly, I didn’t expect so many errors: every single track sounded like it was playing underwater at half speed.
So. It turns out the best solution is to make sure you have an inordinate number of machines, and have them synced off the master. I may not be able to rely on one machine, or on a backup hard drive, but I can have music (or any encrypted data) put onto available laptops – and I know that data is good, because I play it from there. This is a looser implementation of jwz’s backup strategy, but it’s good enough for me.
Thinking about this a bit more, it’s surprising that backup technology is as limited as it is. Creating an encrypted BitTorrent and sharing it amongst a pool would be an excellent way of ensuring redundancy and error correction (is this what Tor does?), but you may not even need to go that far; every time you do a backup, encrypt it, chop it up into yEnc blocks, and dump it onto Usenet. A thousand servers will pass it around and make your data retrievable for all time.
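The "chop it up into blocks" part is easy enough to sketch. This is a rough illustration only, assuming the backup has already been encrypted by some other tool; it leaves out the encryption and the yEnc encoding entirely, and `split_blocks` and `reassemble` are hypothetical names. Each block carries its own SHA-256 digest, so a bad block can at least be spotted and fetched again from another copy.

```python
import hashlib


def split_blocks(data, block_size=1 << 20):
    """Split bytes into fixed-size blocks, each paired with its SHA-256 digest."""
    blocks = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        blocks.append((hashlib.sha256(block).hexdigest(), block))
    return blocks


def reassemble(blocks):
    """Rejoin blocks, raising ValueError if any block fails its digest check."""
    out = bytearray()
    for index, (digest, block) in enumerate(blocks):
        if hashlib.sha256(block).hexdigest() != digest:
            raise ValueError(f"block {index} is corrupted")
        out.extend(block)
    return bytes(out)
```

Note that this detects corruption but doesn't correct it. The redundancy would have to come from having the same blocks available in more than one place.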