convert: add a mode where mercurial_sink skips empty revisions.
The getchanges function of some converter_source classes can return
some false positives. I.e. they sometimes claim that a file "foo"
was changed in some revision, even though its contents are still the
same.
convert_svn is particularly bad, but I think this can also happen with
convert_cvs and, at least in theory, with mercurial_source.
For regular conversions this is not really a problem - as long as
getfile returns the right contents, we'll get a converted revision
with the right contents. But when we use --filemap, this could lead
to superfluous revisions being converted.
Instead of fixing every converter_source, I decided to change
mercurial_sink to work around this problem.
When --filemap is used, we're interested only in revisions that touch
some specific files. If a revision doesn't change any of these files,
then we're not interested in it (at least for revisions with a single
parent; merges are special).
For mercurial_sink, we abuse this property and rollback a commit if
the manifest text hasn't changed. This avoids duplicating the logic
from localrepo.filecommit to detect unchanged files.
adding changesets
adding manifests
adding file changes
added 2 changesets with 2 changes to 1 files
(run 'hg update' to get a working copy)
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
% should fail with encoding error
M a
? latin-1
? latin-1-tag
? utf-8
transaction abort!
rollback completed
abort: decoding near ' encoded: é': 'ascii' codec can't decode byte 0xe9 in position 20: ordinal not in range(128)!
% these should work
marked working directory as branch é
% ascii
changeset: 5:db5520b4645f
branch: ?
tag: tip
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin1 branch
changeset: 4:9cff3c980b58
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: Added tag ? for changeset 770b9b11621d
changeset: 3:770b9b11621d
tag: ?
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: utf-8 e' encoded: ?
changeset: 2:0572af48b948
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e' encoded: ?
changeset: 1:0e5b7e3f9c4a
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: koi8-r: ????? = u'\u0440\u0442\u0443\u0442\u044c'
changeset: 0:1e78a93102a3
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e': ? = u'\xe9'
% latin-1
changeset: 5:db5520b4645f
branch: é
tag: tip
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin1 branch
changeset: 4:9cff3c980b58
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: Added tag é for changeset 770b9b11621d
changeset: 3:770b9b11621d
tag: é
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: utf-8 e' encoded: é
changeset: 2:0572af48b948
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e' encoded: é
changeset: 1:0e5b7e3f9c4a
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: koi8-r: ÒÔÕÔØ = u'\u0440\u0442\u0443\u0442\u044c'
changeset: 0:1e78a93102a3
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e': é = u'\xe9'
% utf-8
changeset: 5:db5520b4645f
branch: é
tag: tip
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin1 branch
changeset: 4:9cff3c980b58
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: Added tag é for changeset 770b9b11621d
changeset: 3:770b9b11621d
tag: é
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: utf-8 e' encoded: é
changeset: 2:0572af48b948
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e' encoded: é
changeset: 1:0e5b7e3f9c4a
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: koi8-r: ÒÔÕÔØ = u'\u0440\u0442\u0443\u0442\u044c'
changeset: 0:1e78a93102a3
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e': é = u'\xe9'
% ascii
tip 5:db5520b4645f
? 3:770b9b11621d
% latin-1
tip 5:db5520b4645f
é 3:770b9b11621d
% utf-8
tip 5:db5520b4645f
é 3:770b9b11621d
% ascii
? 5:db5520b4645f
default 4:9cff3c980b58 (inactive)
% latin-1
é 5:db5520b4645f
default 4:9cff3c980b58 (inactive)
% utf-8
é 5:db5520b4645f
default 4:9cff3c980b58 (inactive)
% utf-8
changeset: 5:db5520b4645f
branch: é
tag: tip
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin1 branch
changeset: 4:9cff3c980b58
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: Added tag é for changeset 770b9b11621d
changeset: 3:770b9b11621d
tag: é
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: utf-8 e' encoded: é
changeset: 2:0572af48b948
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e' encoded: é
changeset: 1:0e5b7e3f9c4a
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: koi8-r: ртуть = u'\u0440\u0442\u0443\u0442\u044c'
changeset: 0:1e78a93102a3
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e': И = u'\xe9'
abort: unknown encoding: dolphin, please check your locale settings
abort: decoding near 'é': 'ascii' codec can't decode byte 0xe9 in position 0: ordinal not in range(128)!
abort: branch name not in UTF-8!