tests/test-addremove-similar
author Erling Ellingsen <erlingalf@gmail.com>
Sun, 18 Feb 2007 20:39:25 +0100
changeset 4135 6cb6cfe43c5d
child 4471 736e49292809
permissions -rwxr-xr-x
Avoid some false positives for addremove -s

The original code uses the similarity score
    1 - len(diff(after, before)) / len(after)
The diff can be at most the size of the 'before' file, so any small 'before' file would be considered very similar. Removing an empty file would cause all files added in the same revision to be considered copies of the removed file.

This changes the metric to
    bytes_overlap(before, after) / len(before + after)
i.e. the actual percentage of bytes shared between the two files.
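The difference between the two scores can be illustrated with a small sketch. This is not the Mercurial implementation: `matched_bytes`, `old_score`, `new_score` and `numbers` are hypothetical names, Python's `difflib` stands in for Mercurial's own diff code, the size of the diff is approximated as the bytes of 'before' that have no match in 'after', and the overlap is counted once per file (the factor 2) so that identical files score 1.0, which is one reading of "percentage of bytes shared".

```python
import difflib


def matched_bytes(before: str, after: str) -> int:
    """Bytes of 'before' that sit in line-level matching blocks with 'after'."""
    a = before.splitlines(keepends=True)
    b = after.splitlines(keepends=True)
    sm = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return sum(
        len(line)
        for blk in sm.get_matching_blocks()
        for line in a[blk.a:blk.a + blk.size]
    )


def old_score(before: str, after: str) -> float:
    """Approximation of 1 - len(diff(after, before)) / len(after)."""
    if not after:
        return 0.0
    # Assumption: the diff carries only the bytes of 'before' missing in 'after'.
    diff_size = len(before) - matched_bytes(before, after)
    return 1.0 - diff_size / len(after)


def new_score(before: str, after: str) -> float:
    """Approximation of bytes_overlap(before, after) / len(before + after)."""
    total = len(before) + len(after)
    if total == 0:
        return 0.0
    # Assumption: shared bytes are counted in both files, hence the factor 2.
    return 2.0 * matched_bytes(before, after) / total


def numbers(start: int, stop: int) -> str:
    """Same content as the one-liners in the test: one integer per line."""
    return "".join(f"{x}\n" for x in range(start, stop))


empty_file = ""
large_file = numbers(0, 10000)
another_file = numbers(10, 10000)

# A removed empty file looks like a perfect match under the old score ...
print(old_score(empty_file, another_file))
# ... and like no match at all under the new one.
print(new_score(empty_file, another_file))
# The genuinely similar pair still scores well above a 50% threshold.
print(new_score(large_file, another_file))
```

Under these assumptions the empty file no longer attracts every added file, while large-file and another-file remain a clear rename candidate for `-s50`.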
#!/bin/sh
hg init rep; cd rep
touch empty-file
python -c 'for x in range(10000): print x' > large-file
hg addremove
hg commit -m A
rm large-file empty-file
python -c 'for x in range(10,10000): print x' > another-file
hg addremove -s50
hg commit -m B
cd ..
hg init rep2; cd rep2
python -c 'for x in range(10000): print x' > large-file
python -c 'for x in range(50): print x' > tiny-file
hg addremove
hg commit -m A
python -c 'for x in range(70): print x' > small-file
rm tiny-file
rm large-file
hg addremove -s50
hg commit -m B