doc/FAQ.txt
changeset 455 8d43dfdfb514
parent 449 df83b2c306ac
child 456 d6ac88a738c4
454:58d57594b802 455:8d43dfdfb514
   120 effectively the sum of all the contents of the repository for that
   120 effectively the sum of all the contents of the repository for that
   121 change, it is impossible in Mercurial to simultaneously commit and add
   121 change, it is impossible in Mercurial to simultaneously commit and add
   122 a tag. Thus tagging a revision must be done as a second step.
   122 a tag. Thus tagging a revision must be done as a second step.
   123 
   123 
   124 
   124 
       
   125 .Q. What if I want to just keep local tags?
       
   126 
       
   127 You can add a section called "[tags]" to your .hg/hgrc which contains
       
   128 a list of tag = changeset ID pairs. Unlike traditional tags, these are
       
   129 only visible in the local repository, but otherwise act just like
       
   130 normal tags.
       
   131 
       
   132 
   125 .Q. How do tags work with multiple heads?
   133 .Q. How do tags work with multiple heads?
   126 
   134 
   127 The tags that are in effect at any given time are the tags specified
   135 The tags that are in effect at any given time are the tags specified
   128 in each head, with heads closer to the tip taking precedence.
   136 in each head, with heads closer to the tip taking precedence. Local
       
   137 tags override all other tags.
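The precedence rule can be pictured as a series of dictionary updates. This is only a minimal sketch, not Mercurial's own code: the function name effective_tags and the convention that head_tags is ordered from the head farthest from the tip to the head closest to it are assumptions made for illustration.

```python
def effective_tags(head_tags, local_tags):
    """Merge tag tables into the set of tags currently in effect.

    head_tags:  list of {tag: changeset_id} dicts, one per head, ordered
                from the head farthest from tip to the head closest to tip
                (an assumed ordering for this sketch).
    local_tags: {tag: changeset_id} dict of local-only tags.
    """
    merged = {}
    for tags in head_tags:
        # Later (closer to tip) heads overwrite earlier ones.
        merged.update(tags)
    # Local tags override everything that came from the heads.
    merged.update(local_tags)
    return merged
```

With two heads that disagree on a tag, the entry from the head closer to the tip wins unless a local tag of the same name exists.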
   129 
   138 
   130 
   139 
   131 .Q. What are some best practices for distributed development with Mercurial?
   140 .Q. What are some best practices for distributed development with Mercurial?
   132 
   141 
   133 First, merge often! This makes merging easier for everyone and you
   142 First, merge often! This makes merging easier for everyone and you
   185 
   194 
   186 Mercurial is primarily developed for UNIX systems, so some UNIXisms
   195 Mercurial is primarily developed for UNIX systems, so some UNIXisms
   187 may be present in ports.
   196 may be present in ports.
   188 
   197 
   189 
   198 
   190 .Q. How does signing work?
   199 .Q. How does Mercurial store its data?
   191 
   200 
   192 Take a look at the hgeditor script for an example. The basic idea
   201 The fundamental storage type in Mercurial is a "revlog". A revlog is
   193 is to sign the manifest ID inside that changelog entry. The manifest
   202 the set of all revisions of a named object. Each revision is either
   194 ID is a recursive hash of all of the files in the system and their
   203 stored compressed in its entirety or as a compressed binary delta
   195 complete history, and thus signing the manifest hash signs the entire
   204 against the previous version. The decision of when to store a full
   196 project to that point.
   205 version is made based on how much data would be needed to reconstruct
   197 
   206 the file. This lets us ensure that we never need to read huge amounts
   198 More precisely: each file hash is an SHA1 hash of the contents of that
   207 of data to reconstruct an object, regardless of how many revisions of it
   199 file and the hashes of its parent revisions. The manifest contains a
   208 we store.
   200 list of each file in the project along with its current file hash.
   209 
   201 This manifest is hashed similarly to the file hashes, incorporating
   210 In fact, we should always be able to do it with a single read,
   202 the hashes of the parent revisions.
   211 provided we know when and where to read. This is where the index comes
       
   212 in. Each revlog has an index containing a special hash (nodeid) of the
       
   213 text, hashes for its parents, and where and how much of the revlog
       
   214 data we need to read to reconstruct it. Thus, with one read of the
       
   215 index and one read of the data, we can reconstruct any version in time
       
   216 proportional to the object size.
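The following toy model may help make the idea concrete. It is a sketch under several assumptions, not the real on-disk format: the class ToyRevlog, the prefix/suffix delta, the chunk header layout, and the rule "store a full version when the span to read would exceed twice the text length" are all invented for illustration. What it does share with the description above is that data and index are append-only and that any revision is rebuilt from one index lookup and one contiguous read.

```python
import struct
import zlib


def make_delta(old: bytes, new: bytes) -> bytes:
    """Toy delta: keep the common prefix and suffix, store only the middle."""
    limit = min(len(old), len(new))
    p = 0
    while p < limit and old[p] == new[p]:
        p += 1
    s = 0
    while s < limit - p and old[len(old) - 1 - s] == new[len(new) - 1 - s]:
        s += 1
    return struct.pack(">II", p, s) + new[p:len(new) - s]


def apply_delta(old: bytes, delta: bytes) -> bytes:
    p, s = struct.unpack(">II", delta[:8])
    return old[:p] + delta[8:] + old[len(old) - s:]


class ToyRevlog:
    def __init__(self, factor: int = 2):
        self.data = bytearray()  # append-only chunk storage
        self.index = []          # append-only: (chain_start, end) per revision
        self.factor = factor     # assumed threshold for starting a new chain

    @staticmethod
    def _chunk(kind: bytes, payload: bytes) -> bytes:
        body = kind + zlib.compress(payload)
        return struct.pack(">I", len(body)) + body

    def add(self, text: bytes) -> int:
        start = len(self.data)
        chain_start, chunk = start, self._chunk(b"f", text)  # full version
        if self.index:
            prev_start, _ = self.index[-1]
            prev_text = self.get(len(self.index) - 1)
            delta = self._chunk(b"d", make_delta(prev_text, text))
            # Use a delta only if the whole chain stays cheap to read.
            if start + len(delta) - prev_start <= self.factor * len(text):
                chain_start, chunk = prev_start, delta
        self.data += chunk
        self.index.append((chain_start, len(self.data)))
        return len(self.index) - 1

    def get(self, rev: int) -> bytes:
        start, end = self.index[rev]       # one index lookup
        span = bytes(self.data[start:end])  # one contiguous data read
        text, pos = b"", 0
        while pos < len(span):
            (n,) = struct.unpack(">I", span[pos:pos + 4])
            kind = span[pos + 4:pos + 5]
            payload = zlib.decompress(span[pos + 5:pos + 4 + n])
            text = payload if kind == b"f" else apply_delta(text, payload)
            pos += 4 + n
        return text
```

Because every chain starts with a full version and deltas are only ever appended, the bytes needed for a revision always form one contiguous slice of the data.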
       
   217 
       
   218 Similarly, revlogs and their indices are append-only. This means that
       
   219 adding a new version is also O(1) seeks.
       
   220 
       
   221 Revlogs are used to represent all revisions of files, manifests, and
       
   222 changesets. Compression for typical objects with lots of revisions can
       
   223 range from 100 to 1 for things like project makefiles to over 2000 to
       
   224 1 for objects like the manifest.
       
   225 
       
   226 
       
   227 .Q. How are manifests and changesets stored?
       
   228 
       
   229 A manifest is simply a list of all files in a given revision of a
       
   230 project along with the nodeids of the corresponding file revisions. So
       
   231 grabbing a given version of the project means simply looking up its
       
   232 manifest and reconstructing all the file revisions pointed to by it.
       
   233 
       
   234 A changeset is a list of all files changed in a check-in along with a
       
   235 change description and some metadata like user and date. It also
       
   236 contains a nodeid to the relevant revision of the manifest.
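As a rough picture of how these pieces point at each other, here is a small sketch. The Changeset class, its field names, and the in-memory dictionaries standing in for the manifest and file revlogs are all hypothetical; only the relationships (changeset to manifest, manifest to file revisions) come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Changeset:
    manifest_node: str                 # nodeid of the manifest revision
    user: str
    date: str
    description: str
    files: list = field(default_factory=list)  # files changed in this check-in


def checkout(changeset, manifests, filelogs):
    """Return {path: contents} for the project revision of a changeset.

    manifests: {manifest_node: {path: file_node}}   (hypothetical store)
    filelogs:  {path: {file_node: contents}}        (hypothetical store)
    """
    manifest = manifests[changeset.manifest_node]
    # Look up every file revision the manifest points to.
    return {path: filelogs[path][node] for path, node in manifest.items()}
```

Note that the manifest lists every file in the project revision, while the changeset only lists the files that changed.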
       
   237 
       
   238 
       
   239 .Q. How do Mercurial hashes get calculated?
       
   240 
       
   241 Mercurial hashes both the contents of an object and the hash of its
       
   242 parents to create an identifier that uniquely identifies an object's
       
   243 contents and history. This greatly simplifies merging of histories
       
   244 because it avoids graph cycles that can occur when an object is reverted
       
   245 to an earlier state.
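A short sketch shows why reverting does not create a cycle. It assumes SHA1, an all-zero ID standing in for a missing parent, and that the two parent hashes are sorted and fed to the hash before the text; the function name node_id is made up for this example, and the exact byte layout should be treated as an assumption.

```python
import hashlib

NULL_ID = b"\0" * 20  # assumed placeholder for "no parent"


def node_id(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
    """Hash the parents' ids together with the contents."""
    a, b = sorted([p1, p2])  # assumed: parent order does not matter
    h = hashlib.sha1(a)
    h.update(b)
    h.update(text)
    return h.digest()


v1 = node_id(b"hello\n")
v2 = node_id(b"hello world\n", v1)
v3 = node_id(b"hello\n", v2)  # same contents as v1, different history
```

Here v3 has the same contents as v1 but a different identifier, because its parent differs. The history stays a chain v1, v2, v3 instead of looping back to v1.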
       
   246 
       
   247 All file revisions have an associated hash value. These are listed in
       
   248 the manifest of a given project revision, and the manifest hash is
       
   249 listed in the changeset. The changeset hash is again a hash of the
       
   250 changeset contents and its parents, so it uniquely identifies the
       
   251 entire history of the project to that point.
       
   252 
       
   253 
       
   254 .Q. What checks are there on repository integrity?
       
   255 
       
   256 Every time a revlog object is retrieved, it is checked against its
       
   257 hash for integrity. It is also incidentally double-checked by the
       
   258 Adler32 checksum used by the underlying zlib compression.
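The two layers of checking can be illustrated with Python's zlib and hashlib modules. This is a simplified sketch: the store and retrieve helpers are hypothetical, and the hash here covers only the contents (parent hashes are left out to keep the example short).

```python
import hashlib
import zlib


def store(text: bytes):
    """Return (expected_id, compressed_blob) for some contents."""
    return hashlib.sha1(text).digest(), zlib.compress(text)


def retrieve(expected_id: bytes, blob: bytes) -> bytes:
    # zlib verifies its own Adler-32 checksum and raises zlib.error
    # if the compressed stream was damaged.
    text = zlib.decompress(blob)
    # Independent check: the contents must match the recorded hash.
    if hashlib.sha1(text).digest() != expected_id:
        raise ValueError("hash mismatch: object is corrupt")
    return text
```

A damaged blob is normally caught by zlib first; the hash comparison catches the case where the data decompresses cleanly but is not what the index says it should be.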
       
   259 
       
   260 Running 'hg verify' decompresses and reconstitutes each revision of
       
   261 each object in the repository and cross-checks all of the index
       
   262 metadata with those contents.
       
   263 
       
   264 But this alone is not enough to ensure that someone hasn't tampered
       
   265 with a repository. For that, you need cryptographic signing.
       
   266 
       
   267 
       
   268 .Q. How does signing work with Mercurial?
       
   269 
       
   270 Take a look at the hgeditor script for an example. The basic idea is
       
   271 to use GPG to sign the manifest ID inside the changelog entry. The
       
   272 manifest ID is a recursive hash of all of the files in the system and
       
   273 their complete history, and thus signing the manifest hash signs the
       
   274 entire project contents.
   203 
   275 
   204 
   276 
   205 .Q. What about hash collisions? What about weaknesses in SHA1?
   277 .Q. What about hash collisions? What about weaknesses in SHA1?
   206 
   278 
   207 The SHA1 hashes are large enough that the odds of accidental hash collision
   279 The SHA1 hashes are large enough that the odds of accidental hash collision
   211 becomes a realistic concern.
   283 becomes a realistic concern.
   212 
   284 
   213 Collisions with the "short hashes" are not a concern as they're always
   285 Collisions with the "short hashes" are not a concern as they're always
   214 checked for ambiguity and are still long enough that they're not
   286 checked for ambiguity and are still long enough that they're not
   215 likely to happen for reasonably-sized projects (< 1M changes).
   287 likely to happen for reasonably-sized projects (< 1M changes).
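The ambiguity check for short hashes amounts to a prefix lookup that refuses to guess. A minimal sketch follows; the function name resolve and the use of hex strings for IDs are assumptions for illustration.

```python
def resolve(prefix: str, known_ids):
    """Return the single full id starting with prefix, or raise LookupError."""
    matches = [node for node in known_ids if node.startswith(prefix)]
    if not matches:
        raise LookupError("unknown revision: " + prefix)
    if len(matches) > 1:
        # Never pick one silently; ask the user for a longer prefix.
        raise LookupError("ambiguous identifier: " + prefix)
    return matches[0]
```

If two identifiers happen to share a short prefix, the lookup fails loudly and a longer prefix resolves it.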
       
   288 
       
   289 
       
   290