I have found the biggest bottleneck in "bdiff.c". It was actually
pretty easy to find once I recompiled the Python interpreter and
Mercurial for profiling.
In "bdiff.c", the function "equatelines" allocates the minimum hash table
size, which can lead to tons of collisions. I introduced an
"overcommit" factor of 16; that is, I allocate 16 times more memory
than the minimum value. Overcommitting by a factor of 128 does not
improve performance over the 16-times case.
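
To make the sizing idea concrete, here is a minimal sketch of how an overcommit factor could be applied. It is not the actual "bdiff.c" code: the helper names "min_buckets" and "overcommitted_buckets" are made up for illustration, and it assumes the minimum table size is the smallest power of two larger than the number of lines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from bdiff.c: returns the smallest
 * power of two strictly greater than nlines, assumed here to be the
 * "minimum" hash table size. */
static size_t min_buckets(size_t nlines)
{
    size_t buckets = 1;

    while (buckets < nlines + 1)
        buckets *= 2;
    return buckets;
}

/* Hypothetical helper: scales the minimum size by an overcommit factor.
 * The factor must be a power of two so that the result is still a power
 * of two and a "hash & (size - 1)" style bucket lookup keeps working. */
static size_t overcommitted_buckets(size_t nlines, size_t overcommit)
{
    assert(overcommit > 0 && (overcommit & (overcommit - 1)) == 0);
    return min_buckets(nlines) * overcommit;
}
```

Under these assumptions, 1000 lines need at least 1024 buckets, so the minimum table is almost full. With an overcommit factor of 16 the table has 16384 buckets and is only about 6% full, which leaves far less room for collisions.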
#!/bin/sh
# basic test for hg debugrebuildstate
hg init repo
cd repo
touch foo bar
hg ci -Am 'add foo bar'
touch baz
hg add baz
hg rm bar
echo '% state dump'
hg debugstate | cut -b 1-16,35- | sort
echo '% status'
hg st -A
hg debugrebuildstate
echo '% state dump'
hg debugstate | cut -b 1-16,35- | sort
echo '% status'
hg st -A