I have spotted the biggest bottleneck in "bdiff.c". It was actually
pretty easy to find once I recompiled the Python interpreter and
Mercurial for profiling.
In "bdiff.c", the function "equatelines" allocates the minimum hash table
size, which can lead to tons of collisions. I introduced an
"overcommit" factor of 16; that is, I allocate 16 times the minimum
amount of memory. Overcommitting by a factor of 128 does not improve
performance over the 16-times case.
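
To make the idea concrete, here is a minimal sketch of the sizing logic, not the actual patch. It assumes the minimum table size is the smallest power of two that can hold n entries plus one spare slot; the helper names "min_buckets" and "overcommitted_buckets" are hypothetical and do not come from "bdiff.c". Overflow for very large n is not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest power of two that is >= n + 1.
 * This stands in for the "minimum" hash table size. */
static size_t min_buckets(size_t n)
{
    size_t buckets = 1;

    while (buckets < n + 1)
        buckets *= 2;
    return buckets;
}

/* Hypothetical helper: scale the minimum size by an overcommit factor.
 * The factor must be a power of two so that the result is still a power
 * of two and (buckets - 1) can be used as a bit mask on the hash value. */
static size_t overcommitted_buckets(size_t n, size_t overcommit)
{
    assert(overcommit > 0 && (overcommit & (overcommit - 1)) == 0);
    return min_buckets(n) * overcommit;
}
```

With an overcommit factor of 16, an input of 5 lines would get 128 buckets instead of 8, so far fewer lines should land in the same bucket. The cost is 16 times the memory for the table.
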
#!/bin/sh
hg init a
echo a > a/a
hg --cwd a ci -Ama
hg clone a c
hg clone a b
echo b >> b/a
hg --cwd b ci -mb
echo % push should push to default when default-push not set
hg --cwd b push | sed 's/pushing to.*/pushing/'
echo % push should push to default-push when set
echo 'default-push = ../c' >> b/.hg/hgrc
hg --cwd b push | sed 's/pushing to.*/pushing/'