dirstate walking optimizations
The repo walking code introduces a number of calls to dirstate.map.copy(),
significantly slowing down the walk on large trees. When a list of
files is passed to the walking code, we should only look at map entries
relevant to the file list passed in.
dirstate.filterfiles() is added to return a subset of the dirstate map.
The subset includes the files passed in, and if one of the files requested
is actually a directory, it also includes any files inside that directory tree.
This brings the time for hg diff Makefile down from 1.7s to 0.3s on
a Linux kernel repo.
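The subsetting described above can be sketched roughly as follows. This is
an illustration only, not the code from this patch: it is a standalone
function rather than a dirstate method, the map is assumed to be a plain dict
keyed by repo-relative paths with "/" separators, and the sorted-keys plus
bisect lookup is just one possible way to find the entries under a directory.

```python
import bisect


def filterfiles(dirstate_map, files):
    """Return the entries of dirstate_map named by files, or located under them.

    Hypothetical standalone sketch; dirstate_map is assumed to be a dict
    keyed by repo-relative paths.
    """
    subset = {}
    # Sorted keys let us jump straight to the first entry under a directory.
    # A real implementation would presumably avoid re-sorting on every call.
    names = sorted(dirstate_map)
    for name in files:
        if name in dirstate_map:
            # Exact match: the request names a tracked file.
            subset[name] = dirstate_map[name]
            continue
        # Otherwise treat the request as a directory and collect its tree.
        prefix = name.rstrip("/") + "/"
        i = bisect.bisect_left(names, prefix)
        while i < len(names) and names[i].startswith(prefix):
            subset[names[i]] = dirstate_map[names[i]]
            i += 1
    return subset
```

Only the requested entries are copied, so a request for a single file such as
Makefile touches one key instead of duplicating the whole map.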
Also, the diff command was unconditionally calling makewalk, leading
to an extra pass through repo.changes. This patch avoids the call
to makewalk when commands.diff isn't given a list of patterns, cutting
the time for hg diff (with no args) in half.
Index: mine/mercurial/hg.py
===================================================================
# transaction.py - simple journalling scheme for mercurial
#
# This transaction scheme is intended to gracefully handle program
# errors and interruptions. More serious failures like system crashes
# can be recovered with an fsck-like tool. As the whole repository is
# effectively log-structured, this should amount to simply truncating
# anything that isn't referenced in the changelog.
#
# Copyright 2005 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms
# of the GNU General Public License, incorporated herein by reference.
import os
import util

class transaction:
    def __init__(self, report, opener, journal, after = None):
        self.journal = None

        # abort here if the journal already exists
        if os.path.exists(journal):
            raise "journal already exists - run hg recover"

        self.report = report
        self.opener = opener
        self.after = after
        self.entries = []
        self.map = {}
        self.journal = journal

        self.file = open(self.journal, "w")

    def __del__(self):
        if self.journal:
            if self.entries: self.abort()
            self.file.close()
            try: os.unlink(self.journal)
            except: pass

    def add(self, file, offset):
        if file in self.map: return
        self.entries.append((file, offset))
        self.map[file] = 1
        # add enough data to the journal to do the truncate
        self.file.write("%s\0%d\n" % (file, offset))
        self.file.flush()

    def close(self):
        self.file.close()
        self.entries = []
        if self.after:
            self.after()
        else:
            os.unlink(self.journal)
        self.journal = None

    def abort(self):
        if not self.entries: return

        self.report("transaction abort!\n")

        for f, o in self.entries:
            try:
                self.opener(f, "a").truncate(o)
            except:
                self.report("failed to truncate %s\n" % f)

        self.entries = []

        self.report("rollback completed\n")

def rollback(opener, file):
    for l in open(file).readlines():
        f, o = l.split('\0')
        opener(f, "a").truncate(int(o))
    os.unlink(file)
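
The journal-and-truncate scheme used by this module can be exercised in
isolation. The following is a minimal, self-contained sketch in current
Python, not part of the module: journal_append, rollback_journal and demo are
hypothetical names, and files are opened directly instead of through an
opener. It records a file's length before an append, simulates an interrupted
write, then truncates the file back to the journaled offset.

```python
import os
import tempfile


def journal_append(journal, path, offset):
    """Record the pre-write length of path so it can be truncated later."""
    journal.write("%s\0%d\n" % (path, offset))
    journal.flush()


def rollback_journal(journal_path):
    """Truncate every journaled file back to its recorded offset."""
    with open(journal_path) as journal:
        for line in journal:
            path, offset = line.split("\0")
            with open(path, "ab") as f:
                f.truncate(int(offset))
    os.unlink(journal_path)


def demo():
    workdir = tempfile.mkdtemp()
    data = os.path.join(workdir, "data.i")
    journal_path = os.path.join(workdir, "journal")

    # State that was committed before the transaction started.
    with open(data, "w") as f:
        f.write("committed")

    with open(journal_path, "w") as journal:
        # Journal the current length first, then append to the file.
        journal_append(journal, data, os.path.getsize(data))
        with open(data, "a") as f:
            f.write(" + interrupted write")

    # Recovery: undo everything written after the journaled offset.
    rollback_journal(journal_path)
    with open(data) as f:
        return f.read(), os.path.exists(journal_path)
```

Because files are only ever appended to, undoing a transaction needs no more
than the original length of each touched file, which is why one
"name\0offset" line per file is enough.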