[Svnmerge] [PATCH] performance fix for some platforms

lsuvkne at onemodel.org lsuvkne at onemodel.org
Sat Jun 30 14:10:22 PDT 2007


(The attachment in my last, 1-line message was included by mistake and
can be disregarded.)

Dustin,
I forgot to thank you earlier for the thoughtful review and comments of
the long patch that I sent in of a week or more ago.  I've used almost
all your suggestions, and will try to send it in, in chunks now. Here's
the one for the performance fix.  

You probably know some of this already but I'll repeat it for
completeness.  The problem this patch addresses is that some OSes (some
*nix flavors, but not others, and not windows) have larger numbers of
file descriptors, and the older popen() code loops through trying to
close all of them when you execute a subprocess. This makes the launch()
command very slow.  The newer subprocess code fixes the issue, so this
patch uses that for versions of python that support it.  This makes a
huge difference on my box:  1-4 minutes vs. hours to run the tests; the
svnmerge.py code itself naturally runs faster as well.

The change to svnmerge_test.py is because launch() no longer accepts
empty commands.  Maybe it would be better to throw an exception in
launch if there are some, but for now I at least quit passing the empty
commands from multilaunch() that come because of the line break at the
beginning & ending of those parameter strings.

> Similarly, your conversion to the subprocess module probably deserves
> its own patch, although that could face an uphill battle ‑‑   lots of
> folks still use Python 2.3, since lots of RedHat systems are still out
> there with 2.3.4.  Presenting it as a separate pach might allow us to
> develop a wrapper that can fall back to popen..

OK.  I made it conditional on version of python, so both old and new
logic are there and the same versions of python supported.
 
I would like to have actually run it mysself with different versions of
python but every old version I download for windows or linux seems to
have some kind of file corruption and won't extract or install. I made
sure that the "if" condition at least worked as far as the 2.4 goes, and
the old code the same so with a good inspection I think it will be fine.

What do you think?

Thanks,

Luke 
-------------- next part --------------
--- contrib/client-side/svnmerge/svnmerge.py-be4PerfFix	2007-06-29 15:20:30.000000000 -0600
+++ contrib/client-side/svnmerge/svnmerge.py	2007-06-30 14:44:28.000000000 -0600
@@ -57,7 +57,7 @@
 # TODO:
 #  - Add "svnmerge avail -R": show logs in reverse order
 
-import sys, os, getopt, re, types, popen2, tempfile
+import sys, os, getopt, re, types, subprocess, tempfile, time, popen2
 from bisect import bisect
 
 NAME = "svnmerge"
@@ -205,30 +205,95 @@
     """Launch a sub-process. Return its output (both stdout and stderr),
     optionally split by lines (if split_lines is True). Raise a LaunchError
     exception if the exit code of the process is non-zero (failure)."""
-    if os.name not in ['nt', 'os2']:
-        p = popen2.Popen4(cmd)
-        p.tochild.close()
-        if split_lines:
-            out = p.fromchild.readlines()
-        else:
-            out = p.fromchild.read()
-        ret = p.wait()
-        if ret == 0:
-            ret = None
+    
+    if not (hasattr(sys, "version_info") and sys.version_info >= (2, 4)):
+        # support versions of python before 2.4 (slower on some systems)
+        if os.name not in ['nt', 'os2']:
+            p = popen2.Popen4(cmd)
+            p.tochild.close()
+            if split_lines:
+                out = p.fromchild.readlines()
+            else:
+                out = p.fromchild.read()
+            ret = p.wait()
+            if ret == 0:
+                ret = None
+            else:
+                ret >>= 8
         else:
-            ret >>= 8
+            i,k = os.popen4(cmd)
+            i.close()
+            if split_lines:
+                out = k.readlines()
+            else:
+                out = k.read()
+            ret = k.close()
+    
+        if ret is None:
+            return out
+        raise LaunchError(ret, cmd, out)
     else:
-        i,k = os.popen4(cmd)
-        i.close()
-        if split_lines:
-            out = k.readlines()
+        # Requiring python 2.4 or higher, on some platforms we get 
+        # much faster performance from the subprocess module (python
+        # doesn't try to close an exhorbitant number of file descriptors)
+        
+        # todo: The next two little sections, handling argument quoting, could
+        # probably go away if we change all callers of this method to pass in
+        # the parameter "cmd" as a Python list of arguments instead of as a
+        # string.
+        ''' Need to keep quoted arguments together as single parameters, so can't 
+        use plain cmd.split(). Is there a better way than this?'''
+        args = re.split("(\"[^\"]*\")|(\'[^\']*\')|\s",cmd)
+        while None in args:
+            args.remove(None)
+        while '' in args:
+            args.remove('')
+        
+        ''' (We need to remove double quoting from args that are quoted twice--it 
+        didn't matter when we used the older subprocess handler (popen2) but now 
+        parameters that were previously quoted (probably because they might 
+        contain spaces or odd chars), get re-quoted when we convert args to 
+        strings via the above call to "re.split()", thus we have double quotes 
+        that have to be removed or it breaks.)
+        '''
+        counter = 0
+        for arg in args:
+            if (arg[0]=='"' or arg[0]=="'") and (arg[-1]=='"' or arg[-1]=="'"):
+                args[counter] = arg[1:-1]
+            counter = counter + 1
+        
+        
+        stdout = ""
+        stderr = ""
+        try:
+            if os.name == 'nt':
+                p = subprocess.Popen(args, executable=os.environ['COMSPEC'], \
+                                     stdout=subprocess.PIPE, \
+                                     stderr=subprocess.PIPE, shell=True)
+            else:
+                p = subprocess.Popen(args, executable=args[0], \
+                                     stdout=subprocess.PIPE, \
+                                     stderr=subprocess.PIPE)
+            stdoutAndErr = p.communicate()
+            stdout = stdoutAndErr[0]
+            stderr = stdoutAndErr[1]
+        except OSError, inst:
+            # Using 1 as failure code; should get actual number somehow? For 
+            # examples see svnmerge_test.py's TestCase_launch.test_failure and 
+            # TestCase_launch.test_failurecode.
+            raise LaunchError(1, cmd, stdout + " " + stderr + ": " + str(inst)) 
+        
+        if p.returncode == 0 and len(stderr) == 0:
+            if split_lines:
+                # Setting keepends=True for compatibility with previous logic 
+                # (where file.readlines() preserves newlines)
+                # and the test TestCase_launch.test_basic, whose purpose in 
+                # checking for newlines I don't yet know.
+                return stdout.splitlines(True)
+            else:
+                return stdout
         else:
-            out = k.read()
-        ret = k.close()
-
-    if ret is None:
-        return out
-    raise LaunchError(ret, cmd, out)
+            raise LaunchError(p.returncode, cmd, stdout + stderr)
 
 def launchsvn(s, show=False, pretend=False, **kwargs):
     """Launch SVN and grab its output."""
-------------- next part --------------
--- contrib/client-side/svnmerge/svnmerge_test.py-be4PerfFix	2007-06-30 14:47:10.000000000 -0600
+++ contrib/client-side/svnmerge/svnmerge_test.py	2007-06-30 14:48:15.000000000 -0600
@@ -456,7 +456,8 @@
     def multilaunch(self, cmds):
         for cmd in cmds.split("\n"):
             cmd = cmd.strip()
-            svnmerge.launch(cmd % self.command_dict())
+            if len(cmd) > 0:
+                svnmerge.launch(cmd % self.command_dict())
 
     def revert(self):
         self.multilaunch("svn revert -R .")


More information about the Svnmerge mailing list