I have spotted the biggest bottleneck in "bdiff.c". It was actually
pretty easy to find once I recompiled the Python interpreter and
Mercurial with profiling enabled.
In "bdiff.c", the function "equatelines" allocates the minimum hash
table size, which can lead to tons of collisions. I introduced an
"overcommit" factor of 16; that is, I allocate 16 times more memory
than the minimum value. Overcommitting by a factor of 128 does not
improve performance over the 16-times case.
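
To make the idea concrete, here is a rough sketch of what I mean by
overcommitting. This is not the actual code from "bdiff.c": the names
"table_buckets", "mix32" and "count_collisions" are made up for
illustration, and I am assuming an open-addressing table with linear
probing whose minimum size is the smallest power of two greater than
the number of entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from bdiff.c: smallest power of two
 * strictly greater than n, multiplied by an overcommit factor.
 * Assumes n is small enough that the doubling does not overflow. */
static size_t table_buckets(size_t n, size_t overcommit)
{
    size_t buckets = 1;

    assert(overcommit > 0);
    while (buckets <= n)
        buckets *= 2;
    return buckets * overcommit;
}

/* Illustrative 32-bit integer mixer standing in for a line hash. */
static uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Insert n distinct keys with linear probing and return how many
 * occupied slots had to be skipped in total.
 * Returns (size_t)-1 if the table cannot be allocated. */
static size_t count_collisions(size_t n, size_t overcommit)
{
    size_t buckets = table_buckets(n, overcommit);
    size_t mask = buckets - 1; /* valid while overcommit is a power of two */
    size_t collisions = 0;
    unsigned char *used = calloc(buckets, 1);

    if (!used)
        return (size_t)-1;
    for (size_t i = 0; i < n; i++) {
        size_t slot = mix32((uint32_t)i) & mask;

        /* buckets > n, so a free slot always exists */
        while (used[slot]) {
            slot = (slot + 1) & mask;
            collisions++;
        }
        used[slot] = 1;
    }
    free(used);
    return collisions;
}
```

Comparing count_collisions(n, 1) against count_collisions(n, 16) for
some n gives a rough feel for the effect. The trade-off is memory:
with a factor of 16 the table for 1000 entries grows from 1024 to
16384 buckets.
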
#!/bin/sh
mkdir a
cd a
hg init
echo 123 > a
hg add a
hg commit -m "a" -u a -d "1000000 0"
cd ..
mkdir b
cd b
hg init
echo 321 > b
hg add b
hg commit -m "b" -u b -d "1000000 0"
hg pull ../a
hg pull -f ../a
hg heads