[Svnmerge] request for review: #2863 - analyze_source_revs fix

Giovanni Bajo rasky at develer.com
Sun Aug 5 17:15:48 PDT 2007


On dom, 2007-08-05 at 18:58 -0500, Dustin J. Mitchell wrote:

> > It is perfectly correct if you run merges within a single repository,
> > which is the only "documented" way to do merges. I appreciate your
> > efforts to fully support merges across multiple repositories, but I am
> > a little concerned if that slows down operations within a single
> > repository (which, I am sure, is by far the most common type of merge
> > that svnmerge.py is used for).
> > 
> > >  Is there another way to do this?  I think using get_svninfo() on
> > >  them to compare UUIDs or repos-roots would cause an extra fetch
> > >  anyway.
> > 
> > Not necessarily: svn info of a local path does not fetch anything.
> > Pardon me if I haven't looked at the code for a long time, but *on
> > paper* I can't see why a remote fetch should be necessary here. 
> 
> I think that, before we can address this specific case, we need to
> address the following question: does svnmerge support inter-repository
> merges?  If no, then I should withdraw this patch series.  If yes, then
> we have some bugs to fix.  I don't think we can have any ambiguity here.
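
For context, the UUID comparison mentioned earlier in the thread could look something like the sketch below. This is purely illustrative -- the helper names are hypothetical, not svnmerge.py's actual internals -- but it shows why the check can be cheap: `svn info` on a working-copy path reads only local metadata, so no remote fetch is needed.

```python
# Hypothetical sketch: decide whether two merge endpoints live in the
# same repository by comparing "Repository UUID" fields from `svn info`.
# For a working-copy path, `svn info` reads local metadata only.
import subprocess

def parse_uuid(info_output):
    """Extract the 'Repository UUID' field from `svn info` output."""
    for line in info_output.splitlines():
        if line.startswith("Repository UUID:"):
            return line.split(":", 1)[1].strip()
    return None

def same_repository(target_a, target_b):
    """True if both targets report the same repository UUID."""
    def uuid_of(target):
        out = subprocess.check_output(["svn", "info", target], text=True)
        return parse_uuid(out)
    return uuid_of(target_a) == uuid_of(target_b)
```
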

As far as I remember, when I started svnmerge.py (and before that, when
Archie started svnmerge.sh), svn itself could *not* do any
inter-repository merges. So inter-repository merging was never part of
the original design of svnmerge.py.

I think that at some point "svn merge" started to support this. I don't
even know when it happened -- I must have missed the news, since the
first mention of it that I can remember is in connection with your work.

But anyway, I don't think you should withdraw your patches. On the
contrary: I see your patches as a very nice improvement as they add a
new feature.

> Inter-repository merges work, modulo this bug and the inability to merge
> between projects with equal repos-relative paths (which this patch
> series intends to fix).
> 

And that is great! It is exactly why I think you shouldn't withdraw
your patches: with only a few small adjustments, we can gain an
important new feature.

> > > analyze_revs has some unused logic to handle the case of an unknown
> > > end_rev.  Could we make that an option for speed-conscious users?
> > 
> > I don't understand what you are suggesting here... can you elaborate
> > please?
> 
> The code in question determines the end_rev to supply to analyze_revs.
> It's the only code to call analyze_revs, and it always supplies end_rev.
> But analyze_revs contains conditional code to deal with the case where
> end_rev is unknown.  My point was that end_rev is not strictly required,
> so one possibility is to drop it entirely.  Other possibilities:
>  - use the target HEAD on single-repository merges and the source head
>    on multi-repository merges, either with an explicit test in
>    analyze_source_revs or by improving the caching in get_latest_rev()

This is what I was suggesting, in fact.
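
Concretely, improving the caching in get_latest_rev() could be as simple as memoizing the result per repository URL for the duration of one run. A minimal sketch, with a hypothetical fetch callback standing in for the real remote lookup (the names below are illustrative, not svnmerge.py's actual API):

```python
# Hypothetical sketch: cache the latest (HEAD) revision per repository
# URL, so repeated lookups within one svnmerge run hit the network at
# most once per repository.
_latest_rev_cache = {}

def get_latest_rev(repo_url, fetch):
    """Return the HEAD revision for repo_url, fetching at most once.

    `fetch` is a callable doing the real (slow) remote lookup, e.g. one
    that shells out to `svn info` and parses the Revision field.
    """
    if repo_url not in _latest_rev_cache:
        _latest_rev_cache[repo_url] = fetch(repo_url)
    return _latest_rev_cache[repo_url]
```

With this in place, the single-repository case pays for one remote lookup total, and the multi-repository case pays for one per repository.
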

>  - offer a --fast argument that disables remote accesses that are not
>    strictly necessary (such as this one)

I don't like implementation details leaking into the user interface.
Sometimes they are necessary: --bidi is (or at least used to be) one of
those, and I would love for it to be removed eventually.

> > > One I've been thinking about is caching immutable information such
> > > as RevisionLog information -- I find that the biggest time sink for
> > > my merges (which are over svn+ssh://, ugh) comes from downloading
> > > the same logs and revision diffs on every merge.
> > 
> > Yes, but I wouldn't put that within svnmerge.py. Caching part of the
> > remote repository in some local space is probably a project of its
> > own; I had played with such a toy before. I had designed a tool called
> > "lsvn" which would basically have the same UI as "svn" (forwarding
> > every command), but cache many things locally (not file
> > contents/diffs, but logs and revprops). After that, you can simply
> > tell svnmerge.py to run "lsvn" instead of "svn" and be done with it.
> > In fact, I guess many users of "svn" would be happy with "lsvn"
> > independently of svnmerge.py.
> 
> I'm not entirely convinced: svnmerge is already caching information
> *within* a run of svnmerge.

... but only because the underlying SCM is slow. If SVN were a
distributed SCM with fast responses to all commands, svnmerge.py could
drop all of its internal caches.

I understand that your suggestion of pickling svnmerge's caches looks
like a natural evolution of the current design. In fact, it is; I
believe I even suggested it myself some years ago on this list. I'm not
totally opposed to this design. I just believe that moving all the
caching into a separate layer, such as an external lsvn program --
which would by itself be a useful tool irrespective of svnmerge -- is a
better design because it is more modular.
-- 
Giovanni Bajo
