[Svnmerge] [RFC] Pipes as iterators / multiple pipes

Wed May 17 12:12:08 PDT 2006

Madan U Sreenivasan wrote:

>> This can have a couple of benefits:
>>
>> - It provides a way to parallelize operations and thus gaining speed
>> benefits when the SVN connection time is the bottleneck. For instance,
>> the attached patch constructs the commit log message by running numerous
>> parallel "svn log" processes to get the error message, and the gain in
>> speed is enormous.
>
> I didn't understand this. Could you explain in a little more detail,
> please?
>
> Especially the part about running numerous parallel 'svn log'
> commands. My immediate worry is about the load it will add on the server.
> Some servers might misinterpret this as a DoS attack too.

Take a look at the new construct_merged_log_message. It uses a FIFO queue to
allocate pipes to external "svn log" processes, one for each revision log.
When the FIFO is full (default: 8 elements), it extracts the oldest process
and blocks reading from its pipe. By the time it gets to the next process,
that one has almost always already completed, and its pipe is already full
of data to read. The basic idea is easy to check even with a shell and a
simple script: with a remote SVN server, launching 8 parallel "svn log"
commands takes almost exactly 8 times less than running 8 serialized
"svn log" commands. Each "svn log" process is not limited by the bandwidth
used to transfer the log contents (of course!); it's limited by long
handshaking times, which require several round-trips.
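
For illustration, the queueing scheme can be sketched roughly as follows.
This is a minimal sketch and not the code from the patch: the names
run_pipelined and _drain, the max_procs parameter, and the way the commands
are passed in are all assumptions made up for this example.

```python
import subprocess
from collections import deque


def _drain(proc):
    # Block until the child exits and return everything it wrote to stdout.
    # (Exit codes are ignored here to keep the sketch short.)
    out, _ = proc.communicate()
    return out


def run_pipelined(commands, max_procs=8):
    """Yield the output of each command, in the order given, while keeping
    at most max_procs child processes alive at the same time."""
    pending = deque()  # FIFO of running processes, oldest on the left
    for cmd in commands:
        if len(pending) >= max_procs:
            # FIFO is full: read from the oldest process before starting
            # a new one. The newer ones keep running in the meantime.
            yield _drain(pending.popleft())
        pending.append(
            subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        )
    # No more commands to start: collect whatever is still running.
    while pending:
        yield _drain(pending.popleft())
```

With svn, each command could be something like
["svn", "log", "-r", str(rev), url], one per revision, where rev and url
stand in for whatever the caller has at hand. Results come back in the same
order the commands were given, so the log message can be assembled as the
outputs arrive.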

The size of the FIFO is currently configured at 8 elements, which means that
you can run 8 svn log commands at a time. I am open to concerns about
flooding/DOS, but I guess that 8 SVN (or SSH) connections from the same IP
are pretty normal (think 8 programmers accessing the same SVN server from
behind a NAT). I was planning to make this configurable, of course, in case
there were specific limits. Given that almost half of the time of my average
merge was wasted constructing the log message (while now it takes 8 times
less), it is a good improvement for me.

> variable names : i,j,L can be replaced with more meaningful names

I don't generally like long names for local variables. I believe they don't
buy much, as functions should already be small enough not to use many
variables, and a name that is too long is hard on my eyes: it takes longer
to read and makes the code "more verbose", in that it obscures what is being
done. Of course, it's just a very personal taste.

Anyway, I might have used some random letters for very local variables
here and there in this patch. Of course I plan to rename them to the
appropriate letter later :)

Thanks!
-- 
Giovanni Bajo