[Svnmerge] svnmerge between multiple repositories, same repo path

dustin at zmanda.com dustin at zmanda.com
Thu Apr 12 17:03:47 PDT 2007


OK, folks, some new patches for you.  The first three here are a series
that implements the functionality I was talking about earlier: users can
use a variety of 'location identifiers' in the revision-tracking
Subversion properties.  
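
For readers unfamiliar with the property layout, here is a rough sketch (not part of the patches) of how a revision-tracking property value breaks down: whitespace-separated entries, each split on its last colon so that identifiers containing colons, such as URLs, stay in one piece. The function name and the sample value are made up for illustration, and the code uses modern Python rather than the older style of svnmerge.py.

```python
def parse_revlist_prop(propvalue):
    """Map each location identifier to its revision-list string."""
    result = {}
    # Entries are separated by any whitespace.
    for entry in propvalue.split():
        # Split on the LAST colon only, so identifiers such as URLs
        # (which contain colons themselves) stay in one piece.
        locid, revs = entry.rsplit(":", 1)
        result[locid] = revs
    return result


# Hypothetical property value mixing a path identifier and a URL identifier.
sample = "/trunk:1-10,15 http://svn.example.com/repo/branches/b1:3-7"
parsed = parse_revlist_prop(sample)
```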

I hope call-it-locid is an easy patch to accept, as it only changes
identifiers and comments. cleanup-abstraction should be fairly
straightforward: it makes code changes, but doesn't change
functionality.  Finally, multiple_locid_fmts is a fairly large change
that will probably generate some controversy.

I'll redraft these after some comments.  For one thing, I need to make
some corresponding changes to the unit tests and README.

call-it-locid.patch
 fix comments, variable names for urls, directories, and
 "repository-relative paths" to be more explicit and call the
 repo-relative paths "location identifiers"

cleanup-abstraction.patch
 clean up the locid abstraction, removing a few assumptions about
 the form of location identifiers; depends on previous patch

multiple_locid_fmts.patch
 Add support for three types of identifiers for locations in the
 subversion properties: 
   - path (the existing repo-relative path)
   - uuid (uuid://XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/repo/relative/path)
   - url
 'svnmerge init' has a new flag, --location-type, allowing the user to
 specify which kind of location to use.  After that, the format will be
 retained.
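
As a minimal sketch of what the three formats look like for one location (an illustration, not the code in the patch): the helper name, the UUID, and the repository root below are invented, and building the url form as "repository root + repo-relative path" is an assumption.

```python
def format_locid(fmt, repo_relative_path, uuid=None, repo_root=None):
    """Render one location in the requested identifier format."""
    if fmt == "path":
        # The existing form: just the repo-relative path.
        return repo_relative_path
    if fmt == "uuid":
        if uuid is None:
            raise ValueError("uuid format needs the repository UUID")
        return "uuid://%s%s" % (uuid, repo_relative_path)
    if fmt == "url":
        if repo_root is None:
            raise ValueError("url format needs the repository root URL")
        # Assumption: the url form is the repository root plus the path.
        return repo_root.rstrip("/") + repo_relative_path
    raise ValueError("unknown location type %r" % fmt)


# Hypothetical repository details, for illustration only.
UUID = "01234567-89ab-cdef-0123-456789abcdef"
ROOT = "http://svn.example.com/repo"

examples = [format_locid(f, "/branches/b1", uuid=UUID, repo_root=ROOT)
            for f in ("path", "uuid", "url")]
```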

I've also included two fairly independent patches for bugs I ran across
in this process:

get_latest_rev_of_source.patch
 analyze_source_revs() gets the latest revision of the *branch*
 repository, then proceeds to use that value against the *source*
 repository; it should get the latest revision of the *source*.
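
 A toy sketch of why this matters once branch and source can live in
 different repositories. Everything here (clamp_end_rev, latest_rev_of,
 the URLs, the revision numbers) is a hypothetical stand-in, not code
 from svnmerge.py:

```python
def clamp_end_rev(requested_end, source_url, latest_rev_of):
    """Limit the end of a revision range to what exists in the SOURCE
    repository, looked up via the caller-supplied latest_rev_of()."""
    source_head = latest_rev_of(source_url)
    if requested_end is None or requested_end > source_head:
        return source_head
    return requested_end


# Two made-up repositories with different head revisions.
SOURCE = "http://a.example.com/repo/trunk"
BRANCH = "http://b.example.com/repo/branches/b1"
heads = {SOURCE: 120, BRANCH: 45}

# Asking the source repository gives 120; asking the branch repository
# (the bug described above) would wrongly cap the range at 45.
end = clamp_end_rev(None, SOURCE, heads.get)
```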

detect_bad_url.patch
 Detect and error out on invalid URLs.

Dustin

-- 
        Dustin J. Mitchell
        Storage Software Engineer, Zmanda, Inc.
        http://www.zmanda.com/
-------------- next part --------------
fix comments, variable names for urls, directories, and
"repository-relative paths" to be more explicit and call the
repo-relative paths "location identifiers"

Index: svnmerge.py
===================================================================
--- svnmerge.py	(revision 24557)
+++ svnmerge.py	(working copy)
@@ -294,8 +294,8 @@
         revision_re = re.compile(r"^r(\d+)")
 
         # Look for changes which contain merge tracking information
-        repos_path = target_to_repos_relative_path(url)
-        srcdir_change_re = re.compile(r"\s*M\s+%s\s+$" % re.escape(repos_path))
+        repos_locid = target_to_locid(url)
+        srcdir_change_re = re.compile(r"\s*M\s+%s\s+$" % re.escape(repos_locid))
 
         # Setup the log options (--quiet, so we don't show log messages)
         log_opts = '--quiet -r%s:%s "%s"' % (begin, end, url)
@@ -574,25 +574,33 @@
         revs.update(rs._revs)
         return RevisionSet(revs)
 
-def merge_props_to_revision_set(merge_props, path):
+# Identifiers for branches:
+# A branch is identified in three ways within this source:
+# - as a working copy (variable name usually includes 'dir')
+# - as a fully qualified URL
+# - as a location identifier (an opaque string indicating a particular path
+#   in a particular repository; variable name includes 'locid')
+# A "target" is generally user-specified, and may be a working copy or
+# a URL.
+
+def merge_props_to_revision_set(merge_props, locid):
     """A converter which returns a RevisionSet instance containing the
     revisions from PATH as known to BRANCH_PROPS.  BRANCH_PROPS is a
-    dictionary of path -> revision set branch integration information
+    dictionary of locid -> revision set branch integration information
     (as returned by get_merge_props())."""
-    if not merge_props.has_key(path):
-        error('no integration info available for repository path "%s"' % path)
-    return RevisionSet(merge_props[path])
+    if not merge_props.has_key(locid):
+        error('no integration info available for location "%s"' % locid)
+    return RevisionSet(merge_props[locid])
 
 def dict_from_revlist_prop(propvalue):
     """Given a property value as a string containing per-source revision
-    lists, return a dictionary whose key is a relative path to a source
-    (in the repository), and whose value is the revisions for that
-    source."""
+    lists, return a dictionary whose key is a source location identifier
+    and whose value is the revisions for that source."""
     prop = {}
 
     # Multiple sources are separated by any whitespace.
     for L in propvalue.split():
-        # We use rsplit to play safe and allow colons in paths.
+        # We use rsplit to play safe and allow colons in locids.
         source, revs = rsplit(L.strip(), ":", 1)
         prop[source] = revs
     return prop
@@ -600,9 +608,8 @@
 def get_revlist_prop(url_or_dir, propname, rev=None):
     """Given a repository URL or working copy path and a property
     name, extract the values of the property which store per-source
-    revision lists and return a dictionary whose key is a relative
-    path to a source (in the repository), and whose value is the
-    revisions for that source."""
+    revision lists and return a dictionary whose key is a source location
+    identifier, and whose value is the revisions for that source."""
 
     # Note that propget does not return an error if the property does
     # not exist, it simply does not output anything. So we do not need
@@ -622,10 +629,10 @@
     """Extract the blocked revisions."""
     return get_revlist_prop(dir, opts["block-prop"])
 
-def get_blocked_revs(dir, source_path):
+def get_blocked_revs(dir, source_locid):
     p = get_block_props(dir)
-    if p.has_key(source_path):
-        return RevisionSet(p[source_path])
+    if p.has_key(source_locid):
+        return RevisionSet(p[source_locid])
     return RevisionSet("")
 
 def format_merge_props(props, sep=" "):
@@ -672,12 +679,12 @@
 def set_block_props(dir, props):
     set_props(dir, opts["block-prop"], props)
 
-def set_blocked_revs(dir, source_path, revs):
+def set_blocked_revs(dir, source_locid, revs):
     props = get_block_props(dir)
     if revs:
-        props[source_path] = str(revs)
-    elif props.has_key(source_path):
-        del props[source_path]
+        props[source_locid] = str(revs)
+    elif props.has_key(source_locid):
+        del props[source_locid]
     set_block_props(dir, props)
 
 def is_url(url):
@@ -690,43 +697,43 @@
            os.path.isdir(os.path.join(dir, "_svn"))
 
 _cache_svninfo = {}
-def get_svninfo(path):
-    """Extract the subversion information for a path (through 'svn info').
+def get_svninfo(target):
+    """Extract the subversion information for a target (through 'svn info').
     This function uses an internal cache to let clients query information
     many times."""
     global _cache_svninfo
-    if _cache_svninfo.has_key(path):
-        return _cache_svninfo[path]
+    if _cache_svninfo.has_key(target):
+        return _cache_svninfo[target]
     info = {}
-    for L in launchsvn('info "%s"' % path):
+    for L in launchsvn('info "%s"' % target):
         L = L.strip()
         if not L:
             continue
         key, value = L.split(": ", 1)
         info[key] = value.strip()
-    _cache_svninfo[path] = info
+    _cache_svninfo[target] = info
     return info
 
-def target_to_url(dir):
+def target_to_url(target):
     """Convert working copy path or repos URL to a repos URL."""
-    if is_wc(dir):
-        info = get_svninfo(dir)
+    if is_wc(target):
+        info = get_svninfo(target)
         return info["URL"]
-    return dir
+    return target
 
-def get_repo_root(dir):
+def get_repo_root(target):
     """Compute the root repos URL given a working-copy path, or a URL."""
     # Try using "svn info WCDIR". This works only on SVN clients >= 1.3
-    if not is_url(dir):
+    if not is_url(target):
         try:
-            info = get_svninfo(dir)
+            info = get_svninfo(target)
             return info["Repository Root"]
         except KeyError:
             pass
-        url = target_to_url(dir)
+        url = target_to_url(target)
         assert url[-1] != '/'
     else:
-        url = dir
+        url = target
 
     # Try using "svn info URL". This works only on SVN clients >= 1.2
     try:
@@ -748,21 +755,21 @@
 
     assert False, "svn repos root not found"
 
-def target_to_repos_relative_path(target):
+def target_to_locid(target):
     """Convert a target (either a working copy path or an URL) into a
-    repository-relative path."""
+    location identifier."""
     root = get_repo_root(target)
     url = target_to_url(target)
     assert root[-1] != "/"
     assert url[:len(root)] == root, "url=%r, root=%r" % (url, root)
     return url[len(root):]
 
-def get_copyfrom(dir):
+def get_copyfrom(target):
     """Get copyfrom info for a given target (it represents the directory from
     where it was branched). NOTE: repos root has no copyfrom info. In this case
     None is returned."""
-    repos_path = target_to_repos_relative_path(dir)
-    out = launchsvn('log -v --xml --stop-on-copy "%s"' % dir,
+    repos_path = target_to_locid(target)
+    out = launchsvn('log -v --xml --stop-on-copy "%s"' % target,
                     split_lines=False)
     out = out.replace("\n", " ")
     try:
@@ -838,20 +845,20 @@
     messages.append('')
     return longest_sep.join(messages)
 
-def get_default_source(branch_dir, branch_props):
-    """Return the default source for branch_dir (given its branch_props).
+def get_default_source(branch_target, branch_props):
+    """Return the default source for branch_target (given its branch_props).
     Error out if there is ambiguity."""
     if not branch_props:
         error("no integration info available")
 
     props = branch_props.copy()
-    directory = target_to_repos_relative_path(branch_dir)
+    locid = target_to_locid(branch_target)
 
     # To make bidirectional merges easier, find the target's
     # repository local path so it can be removed from the list of
     # possible integration sources.
-    if props.has_key(directory):
-        del props[directory]
+    if props.has_key(locid):
+        del props[locid]
 
     if len(props) > 1:
         err_msg = "multiple sources found. "
@@ -863,9 +870,9 @@
 
     return props.keys()[0]
 
-def check_old_prop_version(branch_dir, props):
-    """Check if props (of branch_dir) are svnmerge properties in old format,
-    and emit an error if so."""
+def check_old_prop_version(branch_target, branch_props):
+    """Check if branch_props (of branch_target) are svnmerge properties in 
+    old format, and emit an error if so."""
 
     # Previous svnmerge versions allowed trailing /'s in the repository
     # local path.  Newer versions of svnmerge will trim trailing /'s
@@ -874,7 +881,7 @@
     # the user to change them now.
     fixed = {}
     changed = False
-    for source, revs in props.items():
+    for source, revs in branch_props.items():
         src = rstrip(source, "/")
         fixed[src] = revs
         if src != source:
@@ -884,13 +891,13 @@
         err_msg = "old property values detected; an upgrade is required.\n\n"
         err_msg += "Please execute and commit these changes to upgrade:\n\n"
         err_msg += 'svn propset "%s" "%s" "%s"' % \
-                   (opts["prop"], format_merge_props(fixed), branch_dir)
+                   (opts["prop"], format_merge_props(fixed), branch_target)
         error(err_msg)
 
-def analyze_revs(target_dir, url, begin=1, end=None,
+def analyze_revs(target_locid, url, begin=1, end=None,
                  find_reflected=False):
     """For the source of the merges in the source URL being merged into
-    target_dir, analyze the revisions in the interval begin-end (which
+    target_locid, analyze the revisions in the interval begin-end (which
     defaults to 1-HEAD), to find out which revisions are changes in
     the url, which are changes elsewhere (so-called 'phantom'
     revisions), and optionally which are reflected changes (to avoid
@@ -923,7 +930,7 @@
     phantom_revs = RevisionSet("%s-%s" % (begin, end)) - revs
 
     if find_reflected:
-        reflected_revs = logs[url].merge_metadata().changed_revs(target_dir)
+        reflected_revs = logs[url].merge_metadata().changed_revs(target_locid)
     else:
         reflected_revs = []
 
@@ -931,11 +938,11 @@
 
     return revs, phantom_revs, reflected_revs
 
-def analyze_source_revs(branch_dir, source_url, **kwargs):
+def analyze_source_revs(branch_target, source_url, **kwargs):
     """For the given branch and source, extract the real and phantom
     source revisions."""
-    branch_url = target_to_url(branch_dir)
-    target_dir = target_to_repos_relative_path(branch_dir)
+    branch_url = target_to_url(branch_target)
+    branch_locid = target_to_locid(branch_target)
 
     # Extract the latest repository revision from the URL of the branch
     # directory (which is already cached at this point).
@@ -957,7 +964,7 @@
         if end_rev > revs[-1]:
             end_rev = revs[-1]
 
-    return analyze_revs(target_dir, source_url, base, end_rev, **kwargs)
+    return analyze_revs(branch_locid, source_url, base, end_rev, **kwargs)
 
 def minimal_merge_intervals(revs, phantom_revs):
     """Produce the smallest number of intervals suitable for merging. revs
@@ -1023,12 +1030,13 @@
     # the version data obtained from it.
     if not opts["revision"]:
         cf_source, cf_rev = get_copyfrom(opts["source-url"])
-        branch_path = target_to_repos_relative_path(branch_dir)
+        branch_locid = target_to_locid(branch_dir)
 
-        # If the branch_path is the source path of "source",
+        # If the branch_locid is the source path of "source",
         # then "source" was branched from the current working tree
         # and we can use the revisions determined by get_copyfrom
-        if branch_path == cf_source:
+        # (XXX assumes locid is a repository-relative-path)
+        if branch_locid == cf_source:
             report('the source "%s" is a branch of "%s"' %
                    (opts["source-url"], branch_dir))
             opts["revision"] = "1-" + cf_rev
@@ -1041,12 +1049,12 @@
            (branch_dir, revs, opts["source-url"]))
 
     revs = str(revs)
-    # If the source-path already has an entry in the svnmerge-integrated
+    # If the source-locid already has an entry in the svnmerge-integrated
     # property, simply error out.
-    if not opts["force"] and branch_props.has_key(opts["source-path"]):
-        error('%s has already been initialized at %s\n'
-              'Use --force to re-initialize' % (opts["source-path"], branch_dir))
-    branch_props[opts["source-path"]] = revs
+    if not opts["force"] and branch_props.has_key(opts["source-locid"]):
+        error('Location %s has already been initialized at %s\n'
+              'Use --force to re-initialize' % (opts["source-locid"], branch_dir))
+    branch_props[opts["source-locid"]] = revs
 
     # Set property
     set_merge_props(branch_dir, branch_props)
@@ -1069,7 +1077,7 @@
     if reflected_revs:
         report('skipping reflected revisions: %s' % reflected_revs)
 
-    blocked_revs = get_blocked_revs(branch_dir, opts["source-path"])
+    blocked_revs = get_blocked_revs(branch_dir, opts["source-locid"])
     avail_revs = source_revs - opts["merged-revs"] - blocked_revs - reflected_revs
 
     # Compose the set of revisions to show
@@ -1097,7 +1105,7 @@
     # Extract the integration info for the branch_dir
     branch_props = get_merge_props(branch_dir)
     check_old_prop_version(branch_dir, branch_props)
-    revs = merge_props_to_revision_set(branch_props, opts["source-path"])
+    revs = merge_props_to_revision_set(branch_props, opts["source-locid"])
 
     # Lookup the oldest revision on the branch path.
     oldest_src_rev = get_created_rev(opts["source-url"])
@@ -1129,7 +1137,7 @@
     else:
         revs = source_revs
 
-    blocked_revs = get_blocked_revs(branch_dir, opts["source-path"])
+    blocked_revs = get_blocked_revs(branch_dir, opts["source-locid"])
     merged_revs = opts["merged-revs"]
 
     # Show what we're doing
@@ -1200,7 +1208,7 @@
 
     # Update the set of merged revisions.
     merged_revs = merged_revs | revs | reflected_revs | phantom_revs
-    branch_props[opts["source-path"]] = str(merged_revs)
+    branch_props[opts["source-locid"]] = str(merged_revs)
     set_merge_props(branch_dir, branch_props)
 
 def action_block(branch_dir, branch_props):
@@ -1220,9 +1228,9 @@
         error('no available revisions to block')
 
     # Change blocked information
-    blocked_revs = get_blocked_revs(branch_dir, opts["source-path"])
+    blocked_revs = get_blocked_revs(branch_dir, opts["source-locid"])
     blocked_revs = blocked_revs | revs_to_block
-    set_blocked_revs(branch_dir, opts["source-path"], blocked_revs)
+    set_blocked_revs(branch_dir, opts["source-locid"], blocked_revs)
 
     # Write out commit message if desired
     if opts["commit-file"]:
@@ -1241,7 +1249,7 @@
     # Check branch directory is ready for being modified
     check_dir_clean(branch_dir)
 
-    blocked_revs = get_blocked_revs(branch_dir, opts["source-path"])
+    blocked_revs = get_blocked_revs(branch_dir, opts["source-locid"])
     revs_to_unblock = blocked_revs
 
     # Limit to revisions specified by -r (if any)
@@ -1253,7 +1261,7 @@
 
     # Change blocked information
     blocked_revs = blocked_revs - revs_to_unblock
-    set_blocked_revs(branch_dir, opts["source-path"], blocked_revs)
+    set_blocked_revs(branch_dir, opts["source-locid"], blocked_revs)
 
     # Write out commit message if desired
     if opts["commit-file"]:
@@ -1279,9 +1287,9 @@
     # Extract the integration info for the branch_dir
     branch_props = get_merge_props(branch_dir)
     check_old_prop_version(branch_dir, branch_props)
-    # Get the list of all revisions already merged into this source-path.
+    # Get the list of all revisions already merged into this source-locid.
     merged_revs = merge_props_to_revision_set(branch_props,
-                                              opts["source-path"])
+                                              opts["source-locid"])
 
     # At which revision was the src created?
     oldest_src_rev = get_created_rev(opts["source-url"])
@@ -1299,7 +1307,7 @@
     # merge source, error out.
     if revs & src_pre_exist_range:
         err_str  = "Specified revision range falls out of the rollback range.\n"
-        err_str += "%s was created at r%d" % (opts["source-path"],
+        err_str += "%s was created at r%d" % (opts["source-locid"],
                                               oldest_src_rev)
         error(err_str)
 
@@ -1342,7 +1350,7 @@
 
     # Update the set of merged revisions.
     merged_revs = merged_revs - revs 
-    branch_props[opts["source-path"]] = str(merged_revs)
+    branch_props[opts["source-locid"]] = str(merged_revs)
     set_merge_props(branch_dir, branch_props)
 
 def action_uninit(branch_dir, branch_props):
@@ -1350,19 +1358,19 @@
     # Check branch directory is ready for being modified
     check_dir_clean(branch_dir)
 
-    # If the source-path does not have an entry in the svnmerge-integrated
+    # If the source-locid does not have an entry in the svnmerge-integrated
     # property, simply error out.
-    if not branch_props.has_key(opts["source-path"]):
-        error('"%s" does not contain merge tracking information for "%s"' \
-                % (opts["source-path"], branch_dir))
+    if not branch_props.has_key(opts["source-locid"]):
+        error('Location "%s" does not contain merge tracking information for "%s"' \
+                % (opts["source-locid"], branch_dir))
 
-    del branch_props[opts["source-path"]]
+    del branch_props[opts["source-locid"]]
 
     # Set merge property with the selected source deleted
     set_merge_props(branch_dir, branch_props)
 
     # Set blocked revisions for the selected source to None
-    set_blocked_revs(branch_dir, opts["source-path"], None)
+    set_blocked_revs(branch_dir, opts["source-locid"], None)
 
     # Write out commit message if desired
     if opts["commit-file"]:
@@ -1903,14 +1911,15 @@
             if not cf_source:
                 error('no copyfrom info available. '
                       'Explicit source argument (-S/--source) required.')
-            opts["source-path"] = cf_source
+            opts["source-locid"] = cf_source
             if not opts["revision"]:
                 opts["revision"] = "1-" + cf_rev
         else:
-            opts["source-path"] = get_default_source(branch_dir, branch_props)
+            opts["source-locid"] = get_default_source(branch_dir, branch_props)
 
-        assert opts["source-path"][0] == '/'
-        opts["source-url"] = get_repo_root(branch_dir) + opts["source-path"]
+        # (XXX assumes locid is a repository-relative-path)
+        assert opts["source-locid"][0] == '/'
+        opts["source-url"] = get_repo_root(branch_dir) + opts["source-locid"]
     else:
         # The source was given as a command line argument and is stored in
         # SOURCE.  Ensure that the specified source does not end in a /,
@@ -1919,25 +1928,25 @@
         # trailing /'s.
         source = rstrip(source, "/")
         if not is_wc(source) and not is_url(source):
-            # Check if it is a substring of a repo-relative URL recorded
+            # Check if it is a substring of a locid recorded
             # within the branch properties.
             found = []
-            for repos_path in branch_props.keys():
-                if repos_path.find(source) > 0:
-                    found.append(repos_path)
+            for locid in branch_props.keys():
+                if locid.find(source) > 0:
+                    found.append(locid)
             if len(found) == 1:
+                # (XXX assumes locid is a repository-relative-path)
                 source = get_repo_root(branch_dir) + found[0]
             else:
-                error('"%s" is neither a valid URL (or an unambiguous '
-                      'substring), nor a working directory' % source)
+                error('"%s" is neither a valid URL, nor an unambiguous '
+                      'substring of a location, nor a working directory' % source)
 
-        source_path = target_to_repos_relative_path(source)
+        source_locid = target_to_locid(source)
         if str(cmd) == "init" and \
-               source_path == target_to_repos_relative_path("."):
-            error("cannot init integration source '%s'\nIt must "
-                  "differ from the repository-relative path of the current "
-                  "directory." % source_path)
-        opts["source-path"] = source_path
+               source_locid == target_to_locid("."):
+            error("cannot init integration source location '%s'\nIts location identifer must "
+                  "differ from the location identifier of the current directory." % source_locid)
+        opts["source-locid"] = source_locid
         opts["source-url"] = target_to_url(source)
 
     # Sanity check source_url
@@ -1951,7 +1960,7 @@
     # Get previously merged revisions (except when command is init)
     if str(cmd) != "init":
         opts["merged-revs"] = merge_props_to_revision_set(branch_props,
-                                                          opts["source-path"])
+                                                          opts["source-locid"])
 
     # Perform the action
     cmd(branch_dir, branch_props)
-------------- next part --------------
clean up the locid abstraction, removing a few assumptions about the
form of location identifiers

depends on
  call-it-locid.patch

Index: svnmerge.py
===================================================================
--- svnmerge.py.orig	2007-04-12 16:34:49.335878750 -0500
+++ svnmerge.py	2007-04-12 16:58:56.290307750 -0500
@@ -295,7 +295,7 @@
 
         # Look for changes which contain merge tracking information
         repos_locid = target_to_locid(url)
-        srcdir_change_re = re.compile(r"\s*M\s+%s\s+$" % re.escape(repos_locid))
+        srcdir_change_re = re.compile(r"\s*M\s+%s\s+$" % re.escape(locid_path(repos_locid)))
 
         # Setup the log options (--quiet, so we don't show log messages)
         log_opts = '--quiet -r%s:%s "%s"' % (begin, end, url)
@@ -696,6 +696,9 @@
     return os.path.isdir(os.path.join(dir, ".svn")) or \
            os.path.isdir(os.path.join(dir, "_svn"))
 
+def is_locid(locid):
+    return locid and locid[0] == '/'
+
 _cache_svninfo = {}
 def get_svninfo(target):
     """Extract the subversion information for a target (through 'svn info').
@@ -764,11 +767,34 @@
     assert url[:len(root)] == root, "url=%r, root=%r" % (url, root)
     return url[len(root):]
 
+def locid_to_url(locid, *targets):
+    """Convert a locid into a URL.  If this is not possible, error out.  Extra
+    arguments are any targets the caller knows about, which may be repositories
+    containing the locid."""
+    if not targets:
+        error("Cannot determine URL for location '%s'; Explicit source "
+              "argument (-S/--source) required." % locid)
+
+    # append locid (a path within the repository) to the repository root of
+    # the first target found
+    return get_repo_root(targets[0]) + locid
+
+def equivalent_locids(locid1, locid2, *targets):
+    """Check the equivalency of two locid's.  Extra arguments are any targets
+    the caller knows about, which will be used to qualify any ambiguity in the
+    locids"""
+    # for repo-relative paths, mere equivalence suffices
+    return locid1 == locid2
+
+def locid_path(locid):
+    """Get the repository-relative path from a location identifier."""
+    return locid
+
 def get_copyfrom(target):
-    """Get copyfrom info for a given target (it represents the directory from
-    where it was branched). NOTE: repos root has no copyfrom info. In this case
-    None is returned."""
-    repos_path = target_to_locid(target)
+    """Get copyfrom info for a given target (it represents the
+    repository-relative path from where it was branched). NOTE:
+    repos root has no copyfrom info. In this case None is returned."""
+    repos_path = locid_path(target_to_locid(target))
     out = launchsvn('log -v --xml --stop-on-copy "%s"' % target,
                     split_lines=False)
     out = out.replace("\n", " ")
@@ -1031,12 +1057,17 @@
     if not opts["revision"]:
         cf_source, cf_rev = get_copyfrom(opts["source-url"])
         branch_locid = target_to_locid(branch_dir)
+        if cf_source:
+            cf_url = get_repo_root(opts["source-url"]) + cf_source
+            cf_locid = target_to_locid(cf_url)
+            report("'%s' was branched from location '%s'" %
+                   (opts["source-url"], cf_locid))
+        else:
+            cf_locid = None
 
-        # If the branch_locid is the source path of "source",
-        # then "source" was branched from the current working tree
-        # and we can use the revisions determined by get_copyfrom
-        # (XXX assumes locid is a repository-relative-path)
-        if branch_locid == cf_source:
+        # If the source-url was copied from branch_locid
+        # then we can use the revisions determined by get_copyfrom
+        if equivalent_locids(branch_locid, cf_locid, branch_dir):
             report('the source "%s" is a branch of "%s"' %
                    (opts["source-url"], branch_dir))
             opts["revision"] = "1-" + cf_rev
@@ -1911,15 +1942,17 @@
             if not cf_source:
                 error('no copyfrom info available. '
                       'Explicit source argument (-S/--source) required.')
-            opts["source-locid"] = cf_source
+            opts["source-url"] = get_repo_root(opts["source-url"]) + cf_source
+            opts["source-locid"] = target_to_locid(opts["source-url"])
+
             if not opts["revision"]:
                 opts["revision"] = "1-" + cf_rev
         else:
             opts["source-locid"] = get_default_source(branch_dir, branch_props)
+            opts["source-url"] = locid_to_url(opts["source-locid"], branch_dir)
 
-        # (XXX assumes locid is a repository-relative-path)
-        assert opts["source-locid"][0] == '/'
-        opts["source-url"] = get_repo_root(branch_dir) + opts["source-locid"]
+        assert is_locid(opts["source-locid"])
+        assert is_url(opts["source-url"])
     else:
         # The source was given as a command line argument and is stored in
         # SOURCE.  Ensure that the specified source does not end in a /,
@@ -1935,15 +1968,16 @@
                 if locid.find(source) > 0:
                     found.append(locid)
             if len(found) == 1:
-                # (XXX assumes locid is a repository-relative-path)
-                source = get_repo_root(branch_dir) + found[0]
+                source_locid = found[0]
+                source = locid_to_url(source_locid, branch_dir)
             else:
                 error('"%s" is neither a valid URL, nor an unambiguous '
                       'substring of a location, nor a working directory' % source)
+        else:
+            source_locid = target_to_locid(source)
 
-        source_locid = target_to_locid(source)
         if str(cmd) == "init" and \
-               source_locid == target_to_locid("."):
+               equivalent_locids(source_locid, target_to_locid("."), "."):
             error("cannot init integration source location '%s'\nIts location identifer must "
                   "differ from the location identifier of the current directory." % source_locid)
         opts["source-locid"] = source_locid
-------------- next part --------------
Add support for three types of identifiers for locations in the subversion
properties:
 - path (the existing repo-relative path)
 - uuid (uuid://XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/repo/relative/path)
 - url
'svnmerge init' has a new flag, --location-type, allowing the user to specify
which kind of location to use.  After that, the format will be retained.

depends on
  call-it-locid.patch
  cleanup-abstraction.patch

Index: svnmerge.py
===================================================================
--- svnmerge.py	2007-04-12 18:22:51.416983500 -0500
+++ svnmerge.py	2007-04-12 18:40:02.709435250 -0500
@@ -574,12 +574,59 @@
         revs.update(rs._revs)
         return RevisionSet(revs)
 
+class LocationIdentifier:
+    """Abstraction for a location identifier, so that we can start talking
+    about it before we know the form that it takes in the properties (its
+    external_form).  Objects are referenced in the global variable 'locobjs',
+    keyed by all known forms."""
+    def __init__(self, repo_relative_path, uuid=None, url=None, external_form=None):
+        self.repo_relative_path = repo_relative_path
+        self.uuid = uuid
+        self.url = url
+        self.external_form = external_form
+
+    def __str__(self):
+        """Return a printable string representation"""
+        if self.external_form:
+            return self.external_form
+        if self.uuid:
+            return self.format('uuid')
+        if self.url:
+            return self.format('url')
+        return self.format('path')
+
+    def format(self, fmt):
+        if fmt == 'path':
+            return self.repo_relative_path
+        elif fmt == 'uuid':
+            return "uuid://%s%s" % (self.uuid, self.repo_relative_path)
+        elif fmt == 'url':
+            return self.url
+        else:
+            error("Unknown location type '%s'" % fmt)
+
+    def match_substring(self, str):
+        """Test whether str is a substring of any representation of this
+        LocationIdentifier."""
+        if self.repo_relative_path.find(str) >= 0:
+            return True
+
+        if self.uuid:
+            if ("uuid://%s%s" % (self.uuid, self.repo_relative_path)).find(str) >= 0:
+                return True
+
+        if self.url:
+            if self.url.find(str) >= 0:
+                return True
+
+        return False
+
 # Identifiers for branches:
 # A branch is identified in three ways within this source:
 # - as a working copy (variable name usually includes 'dir')
 # - as a fully qualified URL
-# - as a location identifier (an opaque string indicating a particular path
-#   in a particular repository; variable name includes 'locid')
+# - as a location identifier (a LocationIdentifier indicating a particular
+#   path in a particular repository; variable name includes 'locid')
 # A "target" is generally user-specified, and may be a working copy or
 # a URL.
 
@@ -601,8 +648,35 @@
     # Multiple sources are separated by any whitespace.
     for L in propvalue.split():
         # We use rsplit to play safe and allow colons in locids.
-        source, revs = rsplit(L.strip(), ":", 1)
-        prop[source] = revs
+        locid_str, revs = rsplit(L.strip(), ":", 1)
+
+        # convert locid_str to a LocationIdentifier
+        if not locobjs.has_key(locid_str):
+            if is_url(locid_str):
+                # we can determine every form; locid_hint knows how to do that
+                locid_hint(locid_str)
+            elif locid_str[:7] == 'uuid://':
+                mo = re.match('uuid://([^/]*)(.*)', locid_str)
+                if not mo:
+                    error("Invalid location identifier '%s'" % locid_str)
+                uuid, repo_relative_path = mo.groups()
+                locid = LocationIdentifier(repo_relative_path, uuid=uuid)
+                # we can cache this by uuid:// locid and by repo-relative path
+                locobjs[locid_str] = locobjs[repo_relative_path] = locid
+            elif locid_str and locid_str[0] == '/':
+                # strip any trailing slashes
+                locid_str = locid_str.rstrip('/')
+                locid = LocationIdentifier(repo_relative_path=locid_str)
+                # we can only cache this by repo-relative path
+                locobjs[locid_str] = locid
+            else:
+                error("Invalid location identifier '%s'" % locid_str)
+        locid = locobjs[locid_str]
+
+        # cache the "external" form we saw
+        locid.external_form = locid_str
+
+        prop[locid] = revs
     return prop
 
 def get_revlist_prop(url_or_dir, propname, rev=None):
@@ -643,7 +717,7 @@
     props.sort()
     L = []
     for h, r in props:
-        L.append(h + ":" + r)
+        L.append("%s:%s" % (h, r))
     return sep.join(L)
 
 def _run_propset(dir, prop, value):
@@ -689,7 +763,7 @@
 
 def is_url(url):
     """Check if url is a valid url."""
-    return re.search(r"^[a-zA-Z][-+\.\w]*://", url) is not None
+    return re.search(r"^[a-zA-Z][-+\.\w]*://", url) is not None and url[:4] != 'uuid'
 
 def is_wc(dir):
     """Check if a directory is a working copy."""
@@ -761,37 +835,63 @@
 
     assert False, "svn repos root not found"
 
-def target_to_locid(target):
-    """Convert a target (either a working copy path or an URL) into a
-    location identifier."""
-    root = get_repo_root(target)
+# a global cache of LocationIdentifier instances, keyed by all locids by
+# which they might be known.  This dictionary is primed by locid_hint(),
+# and further adjusted as queries against it are performed.
+locobjs = {}
+
+def locid_hint(target):
+    """Cache some information about target, as it may be referenced by
+    repo-relative path in subversion properties; the cache can help to
+    expand such a relative path to a full location identifier."""
+    if locobjs.has_key(target): return
+    if not is_url(target) and not is_wc(target): return
+
     url = target_to_url(target)
+
+    root = get_repo_root(url)
     assert root[-1] != "/"
     assert url[:len(root)] == root, "url=%r, root=%r" % (url, root)
-    return url[len(root):]
+    repo_relative_path = url[len(root):]
+
+    uuid = get_svninfo(target)['Repository UUID']
+    uuid_locid = 'uuid://%s%s' % (uuid, repo_relative_path)
 
-def locid_to_url(locid, *targets):
-    """Convert a locid into a URL.  If this is not possible, error out.  Extra
-    arguments are any targets the caller knows about, which may be repositories
-    containing the locid."""
-    if not targets:
-        error("Cannot determine URL for location '%s'; Explicit source "
-            + "argument (-S/--source) required.")
-
-    # append locid (a path within the repository) to the repostitory root of
-    # the first target found
-    return get_repo_root(targets[0]) + locid
-
-def equivalent_locids(locid1, locid2, *targets):
-    """Check the equivalency of two locid's.  Extra arguments are any targets
-    the caller knows about, which will be used to qualify any ambiguity in the
-    locids"""
-    # for repo-relative paths, mere equivalence suffices
-    if locid1 == locid2: return True
+    locobj = locobjs.get(url) or \
+             locobjs.get(uuid_locid) or \
+             locobjs.get(repo_relative_path)
+    if not locobj:
+        locobj = LocationIdentifier(repo_relative_path, uuid=uuid, url=url)
+
+    locobjs[target] = locobj
+    locobjs[url] = locobj
+    locobjs[uuid_locid] = locobj
+    if not locobjs.has_key(repo_relative_path):
+        locobjs[repo_relative_path] = locobj
+
+def target_to_locid(target):
+    """Convert a target (either a working copy path or an URL) into a
+    location identifier."""
+    # prime the cache first if we don't know about this target yet
+    if not locobjs.has_key(target):
+        locid_hint(target)
+    return locobjs[target]
+
+def locid_to_url(locid):
+    """Convert a locid into a URL.  If this is not possible, error out."""
+    if locid.url:
+        return locid.url
+    else:
+        error("Cannot determine URL for '%s'; " % locid +
+              "Explicit source argument (-S/--source) required.\n")
+
+def equivalent_locids(locid1, locid2):
+    """Check the equivalency of two locid's."""
+    return locid1 is locid2
 
 def locid_path(locid):
     """Get the repository-relative path from a location identifier."""
-    return locid
+    return locid.repo_relative_path
 
 def get_copyfrom(target):
     """Get copyfrom info for a given target (it represents the
@@ -894,35 +994,11 @@
         err_msg += "Explicit source argument (-S/--source) required.\n"
         err_msg += "The merge sources available are:"
         for prop in props:
-          err_msg += "\n  " + prop
+          err_msg += "\n  " + str(prop)
         error(err_msg)
 
     return props.keys()[0]
 
-def check_old_prop_version(branch_target, branch_props):
-    """Check if branch_props (of branch_target) are svnmerge properties in 
-    old format, and emit an error if so."""
-
-    # Previous svnmerge versions allowed trailing /'s in the repository
-    # local path.  Newer versions of svnmerge will trim trailing /'s
-    # appearing in the command line, so if there are any properties with
-    # trailing /'s, they will not be properly matched later on, so require
-    # the user to change them now.
-    fixed = {}
-    changed = False
-    for source, revs in branch_props.items():
-        src = rstrip(source, "/")
-        fixed[src] = revs
-        if src != source:
-            changed = True
-
-    if changed:
-        err_msg = "old property values detected; an upgrade is required.\n\n"
-        err_msg += "Please execute and commit these changes to upgrade:\n\n"
-        err_msg += 'svn propset "%s" "%s" "%s"' % \
-                   (opts["prop"], format_merge_props(fixed), branch_target)
-        error(err_msg)
-
 def analyze_revs(target_locid, url, begin=1, end=None,
                  find_reflected=False):
     """For the source of the merges in the source URL being merged into
@@ -1070,7 +1146,7 @@
 
         # If the source-url was coped from branch_locid
         # then we can use the revisions determined by get_copyfrom
-        if equivalent_locids(branch_locid, cf_locid, branch_dir):
+        if equivalent_locids(branch_locid, cf_locid):
             report('the source "%s" is a branch of "%s"' %
                    (opts["source-url"], branch_dir))
             opts["revision"] = "1-" + cf_rev
@@ -1082,13 +1158,18 @@
     report('marking "%s" as already containing revisions "%s" of "%s"' %
            (branch_dir, revs, opts["source-url"]))
 
-    revs = str(revs)
     # If the source-locid already has an entry in the svnmerge-integrated
     # property, simply error out.
-    if not opts["force"] and branch_props.has_key(opts["source-locid"]):
+    source_locid = opts['source-locid']
+    if not opts["force"] and branch_props.has_key(source_locid):
         error('Location %s has already been initialized at %s\n'
-              'Use --force to re-initialize' % (opts["source-locid"], branch_dir))
-    branch_props[opts["source-locid"]] = revs
+              'Use --force to re-initialize' % (source_locid, branch_dir))
+
+    # set the locid's external_form based on the user's options
+    source_locid.external_form = source_locid.format(opts['location-type'])
+
+    revs = str(revs)
+    branch_props[source_locid] = revs
 
     # Set property
     set_merge_props(branch_dir, branch_props)
@@ -1138,7 +1219,6 @@
     creation revision."""
     # Extract the integration info for the branch_dir
     branch_props = get_merge_props(branch_dir)
-    check_old_prop_version(branch_dir, branch_props)
     revs = merge_props_to_revision_set(branch_props, opts["source-locid"])
 
     # Lookup the oldest revision on the branch path.
@@ -1320,7 +1400,6 @@
 
     # Extract the integration info for the branch_dir
     branch_props = get_merge_props(branch_dir)
-    check_old_prop_version(branch_dir, branch_props)
     # Get the list of all revisions already merged into this source-locid.
     merged_revs = merge_props_to_revision_set(branch_props,
                                               opts["source-locid"])
@@ -1742,9 +1821,9 @@
     OptionArg("-S", "--source", "--head",
               default=None,
               help="specify a merge source for this branch.  It can be either "
-                   "a path, a full URL, or an unambiguous substring of one "
-                   "of the paths for which merge tracking was already "
-                   "initialized.  Needed only to disambiguate in case of "
+                   "a working directory path, a full URL, or an unambiguous "
+                   "substring of one of the locations for which merge tracking was "
+                   "already initialized.  Needed only to disambiguate in case of "
                    "multiple merge sources"),
 ]
 
@@ -1764,6 +1843,12 @@
     the branch point (unless you teach it with --revision).""" % NAME,
     [
         "-f", "-r", # import common opts
+        OptionArg("-L", "--location-type",
+               dest="location-type",
+               default="path",
+               help="Use this type of location identifier in the new " +
+                    "Subversion properties; 'uuid', 'url', or 'path' " +
+                    "(default)"),
     ]),
 
     "avail": (action_avail,
@@ -1933,9 +2018,12 @@
     if not is_wc(branch_dir):
         error('"%s" is not a subversion working directory' % branch_dir)
 
+    # give out some hints as to potential locids
+    locid_hint(branch_dir)
+    if source: locid_hint(source)
+
     # Extract the integration info for the branch_dir
     branch_props = get_merge_props(branch_dir)
-    check_old_prop_version(branch_dir, branch_props)
 
     # Calculate source_url and source_path
     report("calculate source path for the branch")
@@ -1952,7 +2040,7 @@
                 opts["revision"] = "1-" + cf_rev
         else:
             opts["source-locid"] = get_default_source(branch_dir, branch_props)
-            opts["source-url"] = locid_to_url(opts["source-locid"], branch_dir)
+            opts["source-url"] = locid_to_url(opts["source-locid"])
 
         assert is_locid(opts["source-locid"])
         assert is_url(opts["source-url"])
@@ -1968,11 +2056,11 @@
             # within the branch properties.
             found = []
             for locid in branch_props.keys():
-                if locid.find(source) > 0:
+                if locid.match_substring(source):
                     found.append(locid)
             if len(found) == 1:
                 source_locid = found[0]
-                source = locid_to_url(source_locid, branch_dir)
+                source = locid_to_url(source_locid)
             else:
                 error('"%s" is neither a valid URL, nor an unambiguous '
                       'substring of a location, nor a working directory' % source)
@@ -1980,8 +2068,8 @@
             source_locid = target_to_locid(source)
 
         if str(cmd) == "init" and \
-               equivalent_locids(source_locid, target_to_locid("."), "."):
-            error("cannot init integration source location '%s'\nIts location identifer must "
+               equivalent_locids(source_locid, target_to_locid(branch_dir)):
+            error("cannot init integration source location '%s'\nIts location identifier does not "
                   "differ from the location identifier of the current directory." % source_locid)
         opts["source-locid"] = source_locid
         opts["source-url"] = target_to_url(source)
-------------- next part --------------
Detect and error out on invalid URLs

(does not depend on other patches in this collection)
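
A minimal standalone sketch of the check this patch adds, assuming (as the
patch does) that 'svn info' reports the literal value '(Not a valid URL)' for
a bad target.  The function name parse_svn_info is hypothetical, and it raises
ValueError where svnmerge.py would call its own error() helper.  Written for
Python 3.

```python
def parse_svn_info(output, target):
    """Parse 'Key: value' lines; raise if svn flagged the URL as invalid."""
    info = {}
    for line in output.splitlines():
        # Skip blank lines and anything that is not a 'Key: value' pair.
        if ": " not in line:
            continue
        key, value = line.split(": ", 1)
        value = value.strip()
        if value == "(Not a valid URL)":
            raise ValueError("Not a valid URL: %s" % target)
        info[key] = value
    return info
```

Failing at parse time means a bad URL is reported once, with the offending
target named, before any value is cached or used.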

Index: svnmerge.py
===================================================================
--- svnmerge.py	2007-04-12 18:19:05.026835000 -0500
+++ svnmerge.py	2007-04-12 18:22:51.416983500 -0500
@@ -713,7 +713,10 @@
         if not L:
             continue
         key, value = L.split(": ", 1)
-        info[key] = value.strip()
+        value = value.strip()
+        if value == '(Not a valid URL)':
+            error("Not a valid URL: %s" % target)
+        info[key] = value
     _cache_svninfo[target] = info
     return info
 
-------------- next part --------------
analyze_source_revs() gets the latest revision of the *branch*
repository, then proceeds to use that value against the *source*
repository; it should get the latest revision of the *source*

(does not depend on other patches in this collection)
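
To show why this matters once the branch and the source live in different
repositories, here is a toy sketch.  The URLs and revision numbers are made
up, and get_latest_rev below is a stand-in that looks up canned values rather
than asking Subversion.  Written for Python 3.

```python
# Hypothetical head revisions for two independent repositories.
LATEST_REV = {
    "http://svn.example.com/branch-repo/branches/b1": 50,
    "http://svn.example.com/source-repo/trunk": 120,
}

def get_latest_rev(url):
    """Stand-in for svnmerge's get_latest_rev(); returns a canned value."""
    return LATEST_REV[url]

def revs_to_analyze(source_url, base=1):
    """Range of source revisions to scan, bounded by the source's own head."""
    return range(base, get_latest_rev(source_url) + 1)
```

With the branch URL used as the bound instead, the scan in this example would
stop at revision 50 and source revisions 51-120 would never be considered.
Within a single repository both URLs share one head revision, which is
presumably why the mistake went unnoticed.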

Index: svnmerge.py
===================================================================
--- svnmerge.py.orig	2007-04-12 16:34:56.064299250 -0500
+++ svnmerge.py	2007-04-12 16:35:12.965355500 -0500
@@ -972,7 +972,7 @@
 
     # Extract the latest repository revision from the URL of the branch
     # directory (which is already cached at this point).
-    end_rev = get_latest_rev(branch_url)
+    end_rev = get_latest_rev(source_url)
 
     # Calculate the base of analysis. If there is a "1-XX" interval in the
     # merged_revs, we do not need to check those.

