root/tags/1.05/doc/overview.txt

Revision 5, 1.5 kB (checked in by bradfitz, 4 years ago)

include original link

Originally posted to:
  <http://brad.livejournal.com/2205732.html>

There are lots of ways to store files on the net lately:

-- Amazon S3 is the most interesting,
-- Google's rumored GDrive is surely coming soon,
-- Apple has .Mac

I want to back up to them. And more than one. So first off, abstract
out net-wide storage.... my backup tool (wsbackup) isn't targeting
any one of them. They're all just providers.
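A rough sketch of what "they're all just providers" could look like,
in Python. This is not wsbackup's actual code: the names Provider,
MemoryProvider and store_everywhere are made up for illustration, and
MemoryProvider is only a stand-in for a real service.

```python
from abc import ABC, abstractmethod


class Provider(ABC):
    """Hypothetical interface: anything that can hold named blobs."""

    @abstractmethod
    def has(self, key):
        """Return True if a blob with this key is already stored."""

    @abstractmethod
    def store(self, key, data):
        """Store the bytes in data under key."""

    @abstractmethod
    def fetch(self, key):
        """Return the bytes stored under key."""


class MemoryProvider(Provider):
    """Stand-in for a real service; keeps blobs in a dict."""

    def __init__(self):
        self.blobs = {}

    def has(self, key):
        return key in self.blobs

    def store(self, key, data):
        self.blobs[key] = bytes(data)

    def fetch(self, key):
        return self.blobs[key]


def store_everywhere(providers, key, data):
    """Upload one blob to every provider that lacks it.

    Returns how many uploads were actually needed.
    """
    uploaded = 0
    for provider in providers:
        if not provider.has(key):
            provider.store(key, data)
            uploaded += 1
    return uploaded
```

The backup code then only ever talks to the Provider interface, so
adding another service would mean writing one more subclass.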

Also, I don't trust sending my data in cleartext, or having it stored
in cleartext, so public key encryption is a must. Then I can run
automated backups from many hosts (they only need the public key),
without much fear of keys being compromised.

I don't want people being able to do size analysis, and huge files are
a pain anyway, so big files are cut into chunks.
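Cutting a file into chunks could be as simple as the following Python
sketch. The name iter_chunks is made up, and the 20MB default is just
the "say, 20MB" figure from this note, not a fixed design decision.

```python
import io

# Assumed default, taken from the "say, 20MB" figure in this note.
CHUNK_SIZE = 20 * 1024 * 1024


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive pieces of at most chunk_size bytes.

    stream is any binary file-like object opened for reading.
    """
    while True:
        piece = stream.read(chunk_size)
        if not piece:
            break  # end of file
        yield piece
```

For example, list(iter_chunks(io.BytesIO(b"abcdefghij"), chunk_size=4))
gives three pieces, the last one shorter than the others.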

Files stored on Amazon/Google are of the form:

-- meta files: backup_rootname-yyyymmddnn.meta, an encrypted (YAML?)
   file mapping relative paths from the backup directory root to the
   stat() information, original SHA1, and array of chunk keys (SHA1s
   of encrypted chunks) that make up the file.

-- [sha1ofencryptedchunk].chunk -- content being a chunk of encrypted
   data of at most, say, 20MB.
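The naming scheme above might be sketched like this in Python. All
function names and the dict layout are invented for illustration, and
"nn" is assumed to be a two-digit sequence number within the day. The
sketch takes already-encrypted chunk bytes as input and does not
serialize or encrypt the meta file itself.

```python
import datetime
import hashlib
import os


def chunk_key(encrypted_chunk):
    """Chunk key: hex SHA1 of the encrypted chunk bytes."""
    return hashlib.sha1(encrypted_chunk).hexdigest()


def chunk_filename(encrypted_chunk):
    """Name under which the encrypted chunk is stored."""
    return chunk_key(encrypted_chunk) + ".chunk"


def meta_filename(backup_rootname, day, nn):
    """backup_rootname-yyyymmddnn.meta (nn assumed to be a daily counter)."""
    return f"{backup_rootname}-{day:%Y%m%d}{nn:02d}.meta"


def meta_entry(relpath, st: os.stat_result, original_sha1, encrypted_chunks):
    """One meta file record for a single backed-up file.

    relpath is relative to the backup directory root, original_sha1 is
    the digest of the unencrypted file, and encrypted_chunks is the
    ordered list of encrypted chunk bytes for that file.
    """
    return {
        "path": relpath,
        "stat": {
            "mode": st.st_mode,
            "size": st.st_size,
            "mtime": int(st.st_mtime),
        },
        "sha1": original_sha1,
        "chunks": [chunk_key(c) for c in encrypted_chunks],
    }
```

Because a chunk is named after its own encrypted content, checking
whether a provider already has it is just a lookup by name.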

Then every night different hosts/laptops recurse directory trees,
consult a stat() cache (on, say, inode number, mtime, size, whatever)
and do SHA1 calculations on changed files, look up the rest from the
cache, and build the metafile, upload any new chunks, encrypt the
metafile, upload the metafile.
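The stat() cache part of that nightly run could look roughly like the
Python sketch below. The cache here is a plain dict keyed by path and
compared on (inode, mtime, size); that layout and the names file_sha1,
cached_sha1 and scan are assumptions, and chunking, encryption and
upload are left out.

```python
import hashlib
import os


def file_sha1(path, bufsize=1 << 20):
    """Hex SHA1 of a file's contents, read in blocks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(bufsize), b""):
            digest.update(block)
    return digest.hexdigest()


def cached_sha1(path, cache):
    """Return (sha1, was_hashed) for path, reusing the cache if possible.

    cache maps path -> ((inode, mtime_ns, size), sha1 hex).
    """
    st = os.stat(path)
    signature = (st.st_ino, st.st_mtime_ns, st.st_size)
    hit = cache.get(path)
    if hit is not None and hit[0] == signature:
        return hit[1], False  # looks unchanged: reuse cached digest
    sha1 = file_sha1(path)
    cache[path] = (signature, sha1)
    return sha1, True


def scan(root, cache):
    """Walk root; return ({relative path: sha1}, [changed relative paths])."""
    digests, changed = {}, []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            sha1, was_hashed = cached_sha1(full, cache)
            rel = os.path.relpath(full, root)
            digests[rel] = sha1
            if was_hashed:
                changed.append(rel)
    return digests, changed
```

Only the paths returned in the second list would need re-chunking and
uploading; everything else comes straight from the cache.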

Result:

-- I can restore any host from any point in time, with Amazon/Google
   storing all my data, and only paying $0.15/GB-month.