annotate mercurial/streamclone.py @ 2612:ffb895f16925

add support for streaming clone. existing clone code uses pull to get changes from remote repo. is very slow, uses lots of memory and cpu. new clone code has server write file data straight to client, client writes file data straight to disk. memory and cpu used are very low, clone is much faster over lan. new client can still clone with pull, can still clone from older servers. new server can still serve older clients.
author Vadim Gelfer <vadim.gelfer@gmail.com>
date Fri, 14 Jul 2006 11:17:22 -0700
parents
children 5a5852a417b1
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
2612
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
1 # streamclone.py - streaming clone server support for mercurial
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
2 #
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
3 # Copyright 2006 Vadim Gelfer <vadim.gelfer@gmail.com>
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
4 #
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
5 # This software may be used and distributed according to the terms
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
6 # of the GNU General Public License, incorporated herein by reference.
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
7
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
8 from demandload import demandload
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
9 from i18n import gettext as _
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
10 demandload(globals(), "os stat util")
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
11
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
12 # if server supports streaming clone, it advertises "stream"
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
13 # capability with value that is version+flags of repo it is serving.
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
14 # client only streams if it can read that repo format.
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
15
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
16 def walkrepo(root):
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
17 '''iterate over metadata files in repository.
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
18 walk in natural (sorted) order.
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
19 yields 2-tuples: name of .d or .i file, size of file.'''
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
20
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
21 strip_count = len(root) + len(os.sep)
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
22 def walk(path, recurse):
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
23 ents = os.listdir(path)
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
24 ents.sort()
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
25 for e in ents:
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
26 pe = os.path.join(path, e)
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
27 st = os.lstat(pe)
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
28 if stat.S_ISDIR(st.st_mode):
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
29 if recurse:
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
30 for x in walk(pe, True):
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
31 yield x
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
32 else:
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
33 if not stat.S_ISREG(st.st_mode) or len(e) < 2:
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
34 continue
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
35 sfx = e[-2:]
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
36 if sfx in ('.d', '.i'):
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
37 yield pe[strip_count:], st.st_size
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
38 # write file data first
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
39 for x in walk(os.path.join(root, 'data'), True):
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
40 yield x
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
41 # write manifest before changelog
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
42 meta = list(walk(root, False))
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
43 meta.sort(reverse=True)
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
44 for x in meta:
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
45 yield x
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
46
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
47 # stream file format is simple.
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
48 #
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
49 # server writes out line that says how many files, how many total
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
50 # bytes. separator is ascii space, byte counts are strings.
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
51 #
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
52 # then for each file:
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
53 #
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
54 # server writes out line that says file name, how many bytes in
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
55 # file. separator is ascii nul, byte count is string.
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
56 #
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
57 # server writes out raw file data.
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
58
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
59 def stream_out(repo, fileobj):
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
60 '''stream out all metadata files in repository.
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
61 writes to file-like object, must support write() and optional flush().'''
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
62 # get consistent snapshot of repo. lock during scan so lock not
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
63 # needed while we stream, and commits can happen.
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
64 lock = repo.lock()
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
65 repo.ui.debug('scanning\n')
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
66 entries = []
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
67 total_bytes = 0
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
68 for name, size in walkrepo(repo.path):
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
69 entries.append((name, size))
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
70 total_bytes += size
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
71 lock.release()
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
72
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
73 repo.ui.debug('%d files, %d bytes to transfer\n' %
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
74 (len(entries), total_bytes))
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
75 fileobj.write('%d %d\n' % (len(entries), total_bytes))
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
76 for name, size in entries:
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
77 repo.ui.debug('sending %s (%d bytes)\n' % (name, size))
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
78 fileobj.write('%s\0%d\n' % (name, size))
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
79 for chunk in util.filechunkiter(repo.opener(name), limit=size):
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
80 fileobj.write(chunk)
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
81 flush = getattr(fileobj, 'flush', None)
ffb895f16925 add support for streaming clone.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents:
diff changeset
82 if flush: flush()