Switch the protocol from generators to stream-like objects.

This allows the pull side to control precisely how much data is
read, so that another encapsulation layer is not needed.

An HTTP client gets a response with a finite size, so it can simply
read to the end. An ssh client needs to keep the stream open, so it
must not read more data than was sent in the response. But due to the
streaming nature of the changegroup scheme, only the piece that is
parsing the data knows how far it is allowed to read.

This means the generator scheme isn't fine-grained enough. Instead we
need file-like objects with a read(x) method. This change switches
everything for push/pull over to file-like objects rather than
generators.
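
As a rough illustration of why read(x) matters, here is a minimal
sketch of a parser that pulls exactly as many bytes as the data itself
asks for and leaves the rest of the stream untouched. The framing is
an assumption made for the example (a 4-byte big-endian length that
counts the header itself, with a length of 4 or less ending the
group), and the names read_exactly, read_chunks and frame are
hypothetical, not the functions used in this change.

```python
import io
import struct


def read_exactly(stream, n):
    """Read exactly n bytes from a file-like object, looping on short reads."""
    parts = []
    while n > 0:
        piece = stream.read(n)
        if not piece:
            raise EOFError("stream ended with %d bytes still expected" % n)
        parts.append(piece)
        n -= len(piece)
    return b"".join(parts)


def read_chunks(stream):
    """Yield chunk payloads until the terminator, then stop reading.

    Assumed framing (illustrative only): each chunk starts with a 4-byte
    big-endian length that includes the header; length <= 4 ends the group.
    """
    while True:
        (length,) = struct.unpack(">l", read_exactly(stream, 4))
        if length <= 4:
            return
        yield read_exactly(stream, length - 4)


def frame(payload):
    """Build one chunk in the assumed framing (hypothetical helper)."""
    return struct.pack(">l", len(payload) + 4) + payload


# One response followed by unrelated bytes on the same open stream,
# as would happen on a long-lived ssh connection.
wire = io.BytesIO(frame(b"abc") + frame(b"de") + struct.pack(">l", 0) + b"NEXT")
chunks = list(read_chunks(wire))
rest = wire.read()  # bytes belonging to whatever comes next
```

Only the parser knows where the response ends, and because it asks the
stream for specific byte counts, the trailing bytes are still there for
the next reader. A generator handing out arbitrary-sized blocks could
not give that guarantee without an extra framing layer.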

                     Mercurial      git                      BK (*)
storage              revlog delta   compressed revisions     SCCS weave
storage naming       by filename    by revision hash         by filename
merge                file DAGs      changeset DAG            file DAGs?
consistency          SHA1           SHA1                     CRC
signable?            yes            yes                      no

retrieve file tip    O(1)           O(1)                     O(revs)
add rev              O(1)           O(1)                     O(revs)
find prev file rev   O(1)           O(changesets)            O(revs)
annotate file        O(revs)        O(changesets)            O(revs)
find file changeset  O(1)           O(changesets)            ?

checkout             O(files)       O(files)                 O(revs)?
commit               O(changes)     O(changes)               ?
                     6 patches/s    6 patches/s              slow
diff working dir     O(changes)     O(changes)               ?
                     < 1s           < 1s                     ?
tree diff revs       O(changes)     O(changes)               ?
                     < 1s           < 1s                     ?
hardlink clone       O(files)       O(revisions)             O(files)

find remote csets    O(log new)     rsync: O(revisions)      ?
                                    git-http: O(changesets)
pull remote csets    O(patch)       O(modified files)        O(patch)

repo growth          O(patch)       O(revisions)             O(patch)
kernel history       300M           3.5G?                    250M?
lines of code        2500           6500 (+ cogito)          ??

* I've never used BK, so these are just guesses