comparison tests/test-addremove-similar @ 4135:6cb6cfe43c5d

Avoid some false positives for addremove -s

The original code uses the similarity score

    1 - len(diff(after, before)) / len(after)

The diff can be at most the size of the 'before' file, so any small 'before' file would be considered very similar. Removing an empty file would cause all files added in the same revision to be considered copies of the removed file.

This changes the metric to

    bytes_overlap(before, after) / len(before + after)

i.e. the actual percentage of bytes shared between the two files.
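The difference between the two metrics can be illustrated with a small sketch. This is not the Mercurial implementation: it substitutes Python's difflib for Mercurial's own diff code, and the helper names (matched_bytes, old_score, new_score) are hypothetical. It models the diff size as the bytes of 'before' that do not reappear in 'after', and it assumes the overlap is counted on both sides so that identical files score 1.0.

```python
from difflib import SequenceMatcher


def matched_bytes(before, after):
    """Total length of the lines of `before` that lie in blocks matching `after`."""
    a = before.splitlines(True)
    b = after.splitlines(True)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return sum(len(line)
               for block in matcher.get_matching_blocks()
               for line in a[block.a:block.a + block.size])


def old_score(before, after):
    # Approximation of 1 - len(diff(after, before)) / len(after).
    # Assumption: the diff is modelled as the bytes of `before` missing
    # from `after`, ignoring any delta header overhead.
    if not after:
        return 0.0
    diff_size = len(before) - matched_bytes(before, after)
    return 1.0 - diff_size / float(len(after))


def new_score(before, after):
    # Shared bytes as a fraction of the combined size of both files.
    # Assumption: matched bytes are counted once per file (factor 2).
    total = len(before) + len(after)
    if total == 0:
        return 0.0
    return 2.0 * matched_bytes(before, after) / total


# A removed empty file compared against an unrelated added file.
added = "".join("%d\n" % x for x in range(10, 100))
print(old_score("", added))  # 1.0: the empty file looks like a perfect match
print(new_score("", added))  # 0.0: no bytes are shared
```

Under these assumptions the old score reports the empty file as a full match for any added file, while the new score reports no similarity. This is the false positive the change is meant to avoid.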
author Erling Ellingsen <erlingalf@gmail.com>
date Sun, 18 Feb 2007 20:39:25 +0100
parents
children 736e49292809
4134:9dc64c8414ca 4135:6cb6cfe43c5d
#!/bin/sh

hg init rep; cd rep

touch empty-file
python -c 'for x in range(10000): print x' > large-file

hg addremove

hg commit -m A

rm large-file empty-file
python -c 'for x in range(10,10000): print x' > another-file

hg addremove -s50

hg commit -m B

cd ..

hg init rep2; cd rep2

python -c 'for x in range(10000): print x' > large-file
python -c 'for x in range(50): print x' > tiny-file

hg addremove

hg commit -m A

python -c 'for x in range(70): print x' > small-file
rm tiny-file
rm large-file

hg addremove -s50

hg commit -m B