comparison doc/FAQ.txt @ 455:8d43dfdfb514

More FAQ updates

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

More FAQ updates

manifest hash: 98447c3da5aefcc6c4071d03d8014944cf4cbb79
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFCu3TCywK+sNU5EO8RArRjAJ0ZtMHztUL1cQw7FC0C3uJ0YIfKjwCfWfSe
JndrQxPs1QeCPK/RbfYiKjE=
=aMHP
-----END PGP SIGNATURE-----
author mpm@selenic.com
date Thu, 23 Jun 2005 18:49:38 -0800
parents df83b2c306ac
children d6ac88a738c4
comparison
454:58d57594b802 455:8d43dfdfb514
120 effectively the sum of all the contents of the repository for that 120 effectively the sum of all the contents of the repository for that
121 change, it is impossible in Mercurial to simultaneously commit and add 121 change, it is impossible in Mercurial to simultaneously commit and add
122 a tag. Thus tagging a revision must be done as a second step. 122 a tag. Thus tagging a revision must be done as a second step.
123 123
124 124
125 .Q. What if I want to just keep local tags?
126
127 You can add a section called "[tags]" to your .hg/hgrc which contains
128 a list of tag = changeset ID pairs. Unlike traditional tags, these are
129 only visible in the local repository, but otherwise act just like
130 normal tags.
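
As an illustration of the "[tags]" section described above, here is a
minimal sketch that reads such a section with Python's standard
configparser module. It assumes .hg/hgrc can be treated as an ordinary
INI file. The tag names and the helper read_local_tags are
hypothetical, and the changeset IDs are simply the two shown in this
comparison's header.

```python
import configparser

# Hypothetical .hg/hgrc fragment: the tag names are made up, and the
# changeset IDs are the two revisions named in this comparison.
HGRC_TEXT = """
[tags]
faq-before = 58d57594b802
faq-after = 8d43dfdfb514
"""


def read_local_tags(text):
    """Return a dict mapping tag name -> changeset ID from a [tags] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section("tags"):
        return {}
    return dict(parser.items("tags"))


local_tags = read_local_tags(HGRC_TEXT)
```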
131
132
125 .Q. How do tags work with multiple heads? 133 .Q. How do tags work with multiple heads?
126 134
127 The tags that are in effect at any given time are the tags specified 135 The tags that are in effect at any given time are the tags specified
128 in each head, with heads closer to the tip taking precedence. 136 in each head, with heads closer to the tip taking precedence. Local
137 tags override all other tags.
129 138
130 139
131 .Q. What are some best practices for distributed development with Mercurial? 140 .Q. What are some best practices for distributed development with Mercurial?
132 141
133 First, merge often! This makes merging easier for everyone and you 142 First, merge often! This makes merging easier for everyone and you
185 194
186 Mercurial is primarily developed for UNIX systems, so some UNIXisms 195 Mercurial is primarily developed for UNIX systems, so some UNIXisms
187 may be present in ports. 196 may be present in ports.
188 197
189 198
190 .Q. How does signing work? 199 .Q. How does Mercurial store its data?
191 200
192 Take a look at the hgeditor script for an example. The basic idea 201 The fundamental storage type in Mercurial is a "revlog". A revlog is
193 is to sign the manifest ID inside that changelog entry. The manifest 202 the set of all revisions of a named object. Each revision is either
194 ID is a recursive hash of all of the files in the system and their 203 stored compressed in its entirety or as a compressed binary delta
195 complete history, and thus signing the manifest hash signs the entire 204 against the previous version. The decision of when to store a full
196 project to that point. 205 version is made based on how much data would be needed to reconstruct
197 206 the file. This lets us ensure that we never need to read huge amounts
198 More precisely: each file hash is an SHA1 hash of the contents of that 207 of data to reconstruct an object, regardless of how many revisions of it
199 file and the hashes of its parent revisions. The manifest contains a 208 we store.
200 list of each file in the project along with its current file hash. 209
201 This manifest is hashed similarly to the file hashes, incorporating 210 In fact, we should always be able to do it with a single read,
202 the hashes of the parent revisions. 211 provided we know when and where to read. This is where the index comes
212 in. Each revlog has an index containing a special hash (nodeid) of the
213 text, hashes for its parents, and where and how much of the revlog
214 data we need to read to reconstruct it. Thus, with one read of the
215 index and one read of the data, we can reconstruct any version in time
216 proportional to the object size.
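
The storage scheme described above can be sketched as a toy,
in-memory revlog. This is an illustration under stated assumptions,
not Mercurial's actual code: the names ToyRevlog, make_delta and
apply_delta are hypothetical, the prefix/suffix delta stands in for
the real binary delta format, the "base" field in the index and the
"chain no larger than twice the text" rule are guesses at how the
full-versus-delta decision could be made, and a bytearray stands in
for the data file.

```python
import hashlib
import struct
import zlib

NULLID = b"\0" * 20  # assumed placeholder meaning "no parent"


def make_delta(old, new):
    """Toy delta: replace old[start:end] with a new middle section."""
    limit = min(len(old), len(new))
    start = 0
    while start < limit and old[start] == new[start]:
        start += 1
    tail = 0
    while (tail < limit - start
           and old[len(old) - 1 - tail] == new[len(new) - 1 - tail]):
        tail += 1
    end = len(old) - tail
    return struct.pack(">II", start, end) + new[start:len(new) - tail]


def apply_delta(old, delta):
    start, end = struct.unpack(">II", delta[:8])
    return old[:start] + delta[8:] + old[end:]


class ToyRevlog:
    def __init__(self):
        self.data = bytearray()  # append-only revision data
        self.index = []          # (offset, length, base, nodeid, p1, p2)

    def add(self, text, p1=NULLID, p2=NULLID):
        nodeid = hashlib.sha1(min(p1, p2) + max(p1, p2) + text).digest()
        rev = len(self.index)
        chunk, base = zlib.compress(text), rev  # default: full version
        if rev > 0:
            prev_offset, prev_length, prev_base = self.index[-1][:3]
            delta = zlib.compress(make_delta(self.read(rev - 1), text))
            chain_start = self.index[prev_base][0]
            chain_size = prev_offset + prev_length - chain_start + len(delta)
            # Assumed rule: keep using deltas only while the data needed
            # to rebuild this revision stays small relative to its size.
            if chain_size <= 2 * len(text):
                chunk, base = delta, prev_base
        # Both the index and the data only ever grow at the end.
        self.index.append((len(self.data), len(chunk), base, nodeid, p1, p2))
        self.data += chunk
        return rev

    def read(self, rev):
        offset, length, base = self.index[rev][:3]
        start = self.index[base][0]
        # One contiguous read covers the full version and all its deltas.
        span = bytes(self.data[start:offset + length])
        text = None
        for r in range(base, rev + 1):
            off, ln = self.index[r][:2]
            piece = zlib.decompress(span[off - start:off - start + ln])
            text = piece if r == base else apply_delta(text, piece)
        return text
```

Because each delta chain is stored contiguously after its full
version, read() needs only one index lookup and one slice of the data,
which is the property the answer above relies on.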
217
218 Similarly, revlogs and their indices are append-only. This means that
219 adding a new version is also O(1) seeks.
220
221 Revlogs are used to represent all revisions of files, manifests, and
222 changesets. Compression for typical objects with lots of revisions can
223 range from 100 to 1 for things like project makefiles to over 2000 to
224 1 for objects like the manifest.
225
226
227 .Q. How are manifests and changesets stored?
228
229 A manifest is simply a list of all files in a given revision of a
230 project along with the nodeids of the corresponding file revisions. So
231 grabbing a given version of the project means simply looking up its
232 manifest and reconstructing all the file revisions pointed to by it.
233
234 A changeset is a list of all files changed in a check-in along with a
235 change description and some metadata like user and date. It also
236 contains the nodeid of the relevant revision of the manifest.
237
238
239 .Q. How do Mercurial hashes get calculated?
240
241 Mercurial hashes both the contents of an object and the hash of its
242 parents to create an identifier that uniquely identifies an object's
243 contents and history. This greatly simplifies merging of histories
244 because it avoids graph cycles that can occur when an object is reverted
245 to an earlier state.
246
247 All file revisions have an associated hash value. These are listed in
248 the manifest of a given project revision, and the manifest hash is
249 listed in the changeset. The changeset hash is again a hash of the
250 changeset contents and its parents, so it uniquely identifies the
251 entire history of the project to that point.
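
A minimal sketch of this idea follows. The text above says only that
the contents and the parents' hashes are hashed together. The function
name nodeid, the all-zero NULLID placeholder for a missing parent, and
the choice to sort the two parents before hashing are assumptions made
for the example.

```python
import hashlib

NULLID = b"\0" * 20  # assumed stand-in for "no parent"


def nodeid(text, p1=NULLID, p2=NULLID):
    """Hash the parents' hashes together with the contents."""
    a, b = sorted((p1, p2))  # makes the result independent of parent order
    return hashlib.sha1(a + b + text).digest()


first = nodeid(b"hello\n")
second = nodeid(b"hello, world\n", p1=first)
# Same contents as `first`, but a different parent, so a different hash.
reverted = nodeid(b"hello\n", p1=second)
```

Here "reverted" has the same contents as "first" but a different
identifier, so reverting an object adds a new node to the history
rather than pointing back at an old one.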
252
253
254 .Q. What checks are there on repository integrity?
255
256 Every time a revlog object is retrieved, it is checked against its
257 hash for integrity. It is also incidentally double-checked by the
258 Adler32 checksum used by the underlying zlib compression.
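
The two checks on retrieval can be sketched as follows. This is an
illustration only: store and retrieve are hypothetical helpers, and
for brevity the hash covers just the contents rather than the contents
plus parent hashes. zlib.decompress itself rejects data whose Adler-32
trailer does not match.

```python
import hashlib
import zlib


def store(text):
    """Return (hash, compressed bytes) for a piece of revision text."""
    return hashlib.sha1(text).digest(), zlib.compress(text)


def retrieve(expected_hash, blob):
    """Decompress and re-check the hash; raise ValueError on any mismatch."""
    try:
        # zlib verifies its Adler-32 checksum while decompressing.
        text = zlib.decompress(blob)
    except zlib.error as exc:
        raise ValueError("corrupt compressed data: %s" % exc)
    if hashlib.sha1(text).digest() != expected_hash:
        raise ValueError("hash mismatch")
    return text
```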
259
260 Running 'hg verify' decompresses and reconstitutes each revision of
261 each object in the repository and cross-checks all of the index
262 metadata with those contents.
263
264 But this alone is not enough to ensure that someone hasn't tampered
265 with a repository. For that, you need cryptographic signing.
266
267
268 .Q. How does signing work with Mercurial?
269
270 Take a look at the hgeditor script for an example. The basic idea is
271 to use GPG to sign the manifest ID inside that changelog entry. The
272 manifest ID is a recursive hash of all of the files in the system and
273 their complete history, and thus signing the manifest hash signs the
274 entire project contents.
203 275
204 276
205 .Q. What about hash collisions? What about weaknesses in SHA1? 277 .Q. What about hash collisions? What about weaknesses in SHA1?
206 278
207 The SHA1 hashes are large enough that the odds of accidental hash collision 279 The SHA1 hashes are large enough that the odds of accidental hash collision
211 becomes a realistic concern. 283 becomes a realistic concern.
212 284
213 Collisions with the "short hashes" are not a concern as they're always 285 Collisions with the "short hashes" are not a concern as they're always
214 checked for ambiguity and are still long enough that they're not 286 checked for ambiguity and are still long enough that they're not
215 likely to happen for reasonably-sized projects (< 1M changes). 287 likely to happen for reasonably-sized projects (< 1M changes).
288
289
290