I have spotted the biggest bottleneck in "bdiff.c". Actually it was
pretty easy to find after I recompiled the python interpreter and
mercurial for profiling.
In "bdiff.c" function "equatelines" allocates the minimum hash table
size, which can lead to tons of collisions. I introduced an
"overcommit" factor of 16, this is, I allocate 16 times more memory
than the minimum value. Overcommiting 128 times does not improve the
performance over the 16-times case.
d40cc5aacc31ed673d9b5b24f98bee78c283062c 0.4f
1c590d34bf61e2ea12c71738e5a746cd74586157 0.4e
7eca4cfa8aad5fce9a04f7d8acadcd0452e2f34e 0.4d
b4d0c3786ad3e47beacf8412157326a32b6d25a4 0.4c
f40273b0ad7b3a6d3012fd37736d0611f41ecf54 0.5
0a28dfe59f8fab54a5118c5be4f40da34a53cdb7 0.5b
12e0fdbc57a0be78f0e817fd1d170a3615cd35da 0.6
4ccf3de52989b14c3d84e1097f59e39a992e00bd 0.6b
eac9c8efcd9bd8244e72fb6821f769f450457a32 0.6c
979c049974485125e1f9357f6bbe9c1b548a64c3 0.7
3a56574f329a368d645853e0f9e09472aee62349 0.8
6a03cff2b0f5d30281e6addefe96b993582f2eac 0.8.1
35fb62a3a673d5322f6274a44ba6456e5e4b3b37 0.9
2be3001847cb18a23c403439d9e7d0ace30804e9 0.9.1
36a957364b1b89c150f2d0e60a99befe0ee08bd3 0.9.2
27230c29bfec36d5540fbe1c976810aefecfd1d2 0.9.3
fb4b6d5fe100b0886f8bc3d6731ec0e5ed5c4694 0.9.4