notes.txt
author Alexis S. L. Carvalho <alexis@cecm.usp.br>
Sun, 14 May 2006 18:22:12 -0300
changeset 2289 854954fd410a
parent 2227 4f072bb06e89
permissions -rw-r--r--
Fix revlog.parseindex parseindex could fail if read returns too little data in the right moment (e.g. when there's still leftover data from the previous iteration and read returns less than "s" bytes).
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     1
Some notes about Mercurial's design
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     2
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     3
Revlogs:
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     4
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     5
The fundamental storage type in Mercurial is a "revlog". A revlog is
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     6
the set of all revisions to a file. Each revision is either stored
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     7
compressed in its entirety or as a compressed binary delta against the
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     8
previous version. The decision of when to store a full version is made
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     9
based on how much data would be needed to reconstruct the file. This
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    10
lets us ensure that we never need to read huge amounts of data to
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    11
reconstruct a file, regardless of how many revisions of it we store.
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    12
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    13
In fact, we should always be able to do it with a single read,
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    14
provided we know when and where to read. This is where the index comes
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    15
in. Each revlog has an index containing a special hash (nodeid) of the
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    16
text, hashes for its parents, and where and how much of the revlog
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    17
data we need to read to reconstruct it. Thus, with one read of the
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    18
index and one read of the data, we can reconstruct any version in time
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    19
proportional to the file size.
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    20
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    21
Similarly, revlogs and their indices are append-only. This means that
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    22
adding a new version is also O(1) seeks.
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    23
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    24
Generally revlogs are used to represent revisions of files, but they
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    25
also are used to represent manifests and changesets.
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    26
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    27
Manifests:
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    28
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    29
A manifest is simply a list of all files in a given revision of a
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    30
project along with the nodeids of the corresponding file revisions. So
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    31
grabbing a given version of the project means simply looking up its
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    32
manifest and reconstruction all the file revisions pointed to by it.
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    33
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    34
Changesets:
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    35
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    36
A changeset is a list of all files changed in a check-in along with a
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    37
change description and some metadata like user and date. It also
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    38
contains a nodeid to the relevent revision of the manifest. Changesets
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    39
and manifests are one-to-one, but contain different data for
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    40
convenience.
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    41
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    42
Nodeids:
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    43
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    44
Nodeids are unique ids that are used to represent the contents of a
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    45
file AND its position in the project history. That is, if you change a
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    46
file and then change it back, the result will have a different nodeid
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    47
because it has different history. This is accomplished by including
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    48
the parents of a given revision's nodeids with the revision's text
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    49
when calculating the hash.
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    50
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    51
Graph merging:
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    52
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    53
Nodeids are implemented as they are to simplify merging. Merging a
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    54
pair of directed acyclic graphs (aka "the family tree" of the file
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    55
history) requires some method of determining if nodes in different
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    56
graphs correspond. Simply comparing the contents of the node (by
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    57
comparing text of given revisions or their hashes) can get confused by
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    58
identical revisions in the tree.
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    59
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    60
The nodeid approach makes it trivial - the hash uniquely describes a
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    61
revision's contents and its graph position relative to the root, so
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    62
merge is simply checking whether each nodeid in graph A is in the hash
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    63
table of graph B. If not, we pull them in, adding them sequentially to
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    64
the revlog.
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    65
260
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    66
Branching and merging:
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    67
260
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    68
Everything in Mercurial is potentially a branch and every user
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    69
effectively works in their own branch. When you do a checkout,
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    70
Mercurial remembers what the parent changeset was and uses it for the
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    71
next check in.
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    72
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    73
To do a merge of branches in Mercurial, you check out the heads of the
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    74
two branches into the same working directory which causes a merge to
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    75
be performed, and then check in the result once you're happy with it.
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    76
The resulting checkin will have two parents.
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    77
260
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    78
It decides when a merge is necessary by first determining if there are
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    79
any uncommitted changes in the working directory. This effectively
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    80
makes the working directory a branch off the checked in version it's
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    81
based on. Then it also determines if the working directory is a direct
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    82
ancestor or descendent of the second version we're attempting to
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    83
checkout. If neither is true, we simply replace the working directory
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    84
version with the new version. Otherwise we perform a merge between the
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    85
two versions.
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    86
260
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    87
Merging files and manifests:
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    88
260
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    89
We begin by comparing two versions manifests and deciding which files
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
    90
need to be added, deleted, and merged.
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    91
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    92
Then for each file, we perform a graph merge and resolve as above.
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    93
It's important to merge files using per-file DAGs rather than just
1308
2073e5a71008 Cleanup of tabs and trailing spaces.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 260
diff changeset
    94
changeset level DAGs as this diagram illustrates:
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    95
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    96
M   M1   M2
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    97
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    98
AB
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    99
 |`-------v     M2 clones M
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   100
aB       AB     file A is change in mainline
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   101
 |`---v  AB'    file B is changed in M2
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   102
 |   aB / |     M1 clones M
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   103
 |   ab/  |     M1 changes B
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   104
 |   ab'  |     M1 merges from M2, changes to B conflict
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   105
 |    |  A'B'   M2 changes A
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   106
  `---+--.|
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   107
      |  a'B'   M2 merges from mainline, changes to A conflict
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   108
      `--.|
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   109
         ???    depending on which ancestor we choose, we will have
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   110
	        to redo A hand-merge, B hand-merge, or both
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   111
                but if we look at the files independently, everything
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   112
		is fine
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   113
260
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   114
The result is a merged version in the working directory, waiting for
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   115
check-in.
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   116
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   117
Rollback:
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   118
47
45cc818f2445 Rollback is implemented
mpm@selenic.com
parents: 0
diff changeset
   119
When performing a commit or a merge, we order things so that the
45cc818f2445 Rollback is implemented
mpm@selenic.com
parents: 0
diff changeset
   120
changeset entry gets added last. We keep a transaction log of the name
45cc818f2445 Rollback is implemented
mpm@selenic.com
parents: 0
diff changeset
   121
of each file touched and its length prior to the transaction. On
45cc818f2445 Rollback is implemented
mpm@selenic.com
parents: 0
diff changeset
   122
abort, we simply truncate each file to its prior length. This is one
45cc818f2445 Rollback is implemented
mpm@selenic.com
parents: 0
diff changeset
   123
of the nice properties of the append-only structure of the revlogs.
2227
4f072bb06e89 deprecate undo command, replace with rollback command.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents: 1308
diff changeset
   124
We can also reuse this journal for "rollback".
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   125
260
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   126
Merging between repositories:
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   127
260
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   128
One of the key features of Mercurial is the ability to merge between
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   129
independent repositories in a decentralized fashion. Each repository
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   130
can act as a read-only server or a client. Clients operating by
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   131
pulling all branches that it hasn't seen from the server and adding
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   132
them into its graph. This is done in two steps: searching for new
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   133
"roots" and pulling a "changegroup"
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   134
260
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   135
Searching for new "roots" begins by finding all new heads and
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   136
searching backwards from those heads to the first unknown nodes in
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   137
their respective branches. These nodes are the 'roots' that are used
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   138
to calculate the 'changegroup': the set of all changesets starting at
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   139
those roots. Mercurial takes pains to make this search efficient in
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   140
both bandwidth and round-trips.
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
   141
260
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   142
Once the roots are found, the changegroup can be transferred as a
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   143
single streaming transfer. This is organized as an ordered set of
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   144
deltas for changesets, manifests, and files. Large chunks of deltas
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   145
can be directly added to the repository without unpacking so it's
d7ce76d82876 Some tweaking of notes.txt
mpm@selenic.com
parents: 59
diff changeset
   146
fairly fast.