61 revision's contents and its graph position relative to the root, so |
61 revision's contents and its graph position relative to the root, so |
62 merge is simply checking whether each nodeid in graph A is in the hash |
62 merge is simply checking whether each nodeid in graph A is in the hash |
63 table of graph B. If not, we pull them in, adding them sequentially to |
63 table of graph B. If not, we pull them in, adding them sequentially to |
64 the revlog. |
64 the revlog. |
65 |
65 |
66 Graph resolving: |
66 Branching and merging: |
67 |
67 |
68 Mercurial does branching by copying (or COWing) a repository and thus |
68 Everything in Mercurial is potentially a branch and every user |
69 keeps everything nice and linear within a repository. However, when a |
69 effectively works in their own branch. When you do a checkout, |
70 merge of repositories (a "pull") is done, we may often have two head |
70 Mercurial remembers what the parent changeset was and uses it for the |
71 revisions in a given graph. To keep things simple, Mercurial forces |
71 next check in. |
72 the head revisions to be merged. |
|
73 |
72 |
74 It first finds the closest common ancestor of the two heads. If one is |
73 To do a merge of branches in Mercurial, you check out the heads of the |
75 a child of the other, it becomes the new head. Otherwise, we call out |
74 two branches into the same working directory which causes a merge to |
76 to a user-specified 3-way merge tool. |
75 be performed, and then check in the result once you're happy with it. |
|
76 The resulting checkin will have two parents. |
77 |
77 |
78 Merging files, manifests, and changesets: |
78 It decides when a merge is necessary by first determining if there are |
|
79 any uncommitted changes in the working directory. This effectively |
|
80 makes the working directory a branch off the checked in version it's |
|
81 based on. Then it also determines if the working directory is a direct |
|
82 ancestor or descendent of the second version we're attempting to |
|
83 checkout. If neither is true, we simply replace the working directory |
|
84 version with the new version. Otherwise we perform a merge between the |
|
85 two versions. |
79 |
86 |
80 We begin by comparing changeset DAGs, pulling all nodes we don't have |
87 Merging files and manifests: |
81 in our DAG from the other repository. As we do so, we collect a list |
88 |
82 of changed files to merge. |
89 We begin by comparing two versions manifests and deciding which files |
|
90 need to be added, deleted, and merged. |
83 |
91 |
84 Then for each file, we perform a graph merge and resolve as above. |
92 Then for each file, we perform a graph merge and resolve as above. |
85 It's important to merge files using per-file DAGs rather than just |
93 It's important to merge files using per-file DAGs rather than just |
86 changeset level DAGs as this diagram illustrates: |
94 changeset level DAGs as this diagram illustrates: |
87 |
95 |
101 ??? depending on which ancestor we choose, we will have |
109 ??? depending on which ancestor we choose, we will have |
102 to redo A hand-merge, B hand-merge, or both |
110 to redo A hand-merge, B hand-merge, or both |
103 but if we look at the files independently, everything |
111 but if we look at the files independently, everything |
104 is fine |
112 is fine |
105 |
113 |
106 After we've merged files, we merge the manifest log DAG and resolve |
114 The result is a merged version in the working directory, waiting for |
107 additions and deletions. Then we are ready to resolve the changeset |
115 check-in. |
108 DAG - if our merge required any changes (the new head is not a |
|
109 decendent of our tip), we must create a new changeset describing all |
|
110 of the changes needed to merge it into the tip. |
|
111 |
|
112 Merge performance: |
|
113 |
|
114 The I/O operations for performing a merge are O(changed files), not |
|
115 O(total changes) and in many cases, we needn't even unpack the deltas |
|
116 to add them to our repository (though this optimization isn't |
|
117 necessary). |
|
118 |
116 |
119 Rollback: |
117 Rollback: |
120 |
118 |
121 When performing a commit or a merge, we order things so that the |
119 When performing a commit or a merge, we order things so that the |
122 changeset entry gets added last. We keep a transaction log of the name |
120 changeset entry gets added last. We keep a transaction log of the name |
123 of each file touched and its length prior to the transaction. On |
121 of each file touched and its length prior to the transaction. On |
124 abort, we simply truncate each file to its prior length. This is one |
122 abort, we simply truncate each file to its prior length. This is one |
125 of the nice properties of the append-only structure of the revlogs. |
123 of the nice properties of the append-only structure of the revlogs. |
|
124 We can also reuse this journal for "undo". |
126 |
125 |
127 Remote access: |
126 Merging between repositories: |
128 |
127 |
129 Mercurial currently supports pulling from "serverless" repositories. |
128 One of the key features of Mercurial is the ability to merge between |
130 Simply making the repo directory accessibly via the web and pointing |
129 independent repositories in a decentralized fashion. Each repository |
131 hg at it can accomplish a pull. This is relatively bandwidth efficient |
130 can act as a read-only server or a client. Clients operating by |
132 but no effort has been spent on pipelining, so it won't work |
131 pulling all branches that it hasn't seen from the server and adding |
133 especially well over LAN yet. |
132 them into its graph. This is done in two steps: searching for new |
|
133 "roots" and pulling a "changegroup" |
134 |
134 |
135 It's also quite amenable to rsync, if you don't mind keeping an intact |
135 Searching for new "roots" begins by finding all new heads and |
136 copy of the master around locally. |
136 searching backwards from those heads to the first unknown nodes in |
|
137 their respective branches. These nodes are the 'roots' that are used |
|
138 to calculate the 'changegroup': the set of all changesets starting at |
|
139 those roots. Mercurial takes pains to make this search efficient in |
|
140 both bandwidth and round-trips. |
137 |
141 |
138 Also note the append-only and ordering properties of the commit |
142 Once the roots are found, the changegroup can be transferred as a |
139 guarantee that readers will always see a repository in a consistent |
143 single streaming transfer. This is organized as an ordered set of |
140 state and no special locking is necessary. As there is generally only |
144 deltas for changesets, manifests, and files. Large chunks of deltas |
141 one writer to an hg repository, there is in fact no exclusion |
145 can be directly added to the repository without unpacking so it's |
142 implemented yet. |
146 fairly fast. |
143 |
|
144 |
|
145 Some comparisons to git: |
|
146 |
|
147 Most notably, Mercurial uses delta compression and repositories |
|
148 created with it will grow much more slowly over time. This also allows |
|
149 it to be much more bandwidth efficient. I expect repos sizes and sync |
|
150 speeds to be similar to or better than BK, given the use of binary diffs. |
|
151 |
|
152 Mercurial is roughly the same performance as git in some areas and is |
|
153 faster in others as it keeps around more metadata. One example is |
|
154 listing and retrieving past versions of a file, which it can do |
|
155 without reading all the changesets. This metadata will also allow it |
|
156 to perform better merges as described above. |
|
157 |
|
158 |
|