[Svnmerge] [PATCH] Eliminate spurious svnmerge-integrated property conflicts

Mon Dec 11 11:19:35 PST 2006

On Tue Dec 5 07:48:26 PST 2006, Raman Gupta wrote:

> If keeping transitive merge data is important to people, perhaps what
> is needed is a discussion of the transitive use cases people wish to
> support, and then a specific implementation to achieve the
> requirements, perhaps enabled by use of a "--transitive" flag.

This whole "transitive merge info" business is kind of a pet project  
of mine, that I've been thinking about and working on for well over  
ten years, so if I get too wordy, feel free to slap me down ;-)  Let  
me start (and maybe even stop, at least as far as this message is  
concerned) with some concrete examples, real-world situations I've  
actually been in.

Limited-distribution product: In this situation, we had an internal  
product with a small number of customers--half a dozen.  The product  
was of substantial size: 100k lines of C, a client/server  
configuration with two or three distinct clients.  (I will use "client"
to mean "process that depends on the server," and "customer" to mean
"team within the company that uses the product.")
Each customer had their own server, and the customers were not  
software people.  The product was successful (i.e., people used it  
every day), and it evolved quite rapidly (new releases to some  
customer or other every week or so), and each customer had unique  
requirements.  We ended up boxed into maintaining parallel versions  
of what was only roughly the same product, sharing as much code as we  
could, generalizing the idiosyncratic requirements whenever possible,  
parameterizing and modularizing and configuring and otherwise  
limiting the variations as much as we could, but still we ended up  
with half a dozen parallel versions of something that was maybe 80%  
common and 20% custom.

The need for transitive merge tracking arises like this: when a
problem is discovered in the common code, the
particular customer who first encounters it is pretty much random.   
At any moment in time, it may or may not be convenient for other  
customers to receive the fix.  Worse than that: some customers would  
actually reject some fixes, because the fixes violated one of their
idiosyncratic requirements, or at any rate needed to be adapted to
fit their idiosyncrasies.  One concrete example of how messy that
could get: we had a "fairness doctrine" in the server, that ensured  
that multiple concurrent clients didn't interfere in ways that  
starved one or another unfairly.  But one customer felt "fair" meant  
"equal time for all," while another wanted "in case of a tie, one  
designated client always wins."  A bug fix that improved the "one  
always wins" behavior might make the latter happy, while explicitly
violating the former's idea of fairness.

Key properties of this situation that forced us into wanting
"transitive merge info":
  - several branches of development
  - proceeding in parallel without any expectation of eventual
    convergence
  - changes might originate in any of the versions
  - changes might propagate to the other versions in any arbitrary order
  - sometimes changes propagate identically, sometimes with tweaks,
    and sometimes we needed to record that a change was positively
    rejected and should never be applied
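
To make those properties concrete, here is a minimal sketch of the
bookkeeping they seem to call for.  It is not how svnmerge stores its
merge info; the Branch and Disposition names, the "fix-17" change id,
and the per-branch ledger are all hypothetical, chosen only for
illustration.  The assumption is that each branch keeps one record per
change it has ever ruled on, no matter which branch the change arrived
from.

```python
from enum import Enum


class Disposition(Enum):
    ORIGINATED = "originated"  # the change was first made on this branch
    MERGED = "merged"          # applied identically
    TWEAKED = "tweaked"        # applied with local modifications
    REJECTED = "rejected"      # positively rejected; never offer again


class Branch:
    """Hypothetical branch that remembers its ruling on every change."""

    def __init__(self, name):
        self.name = name
        self.ledger = {}  # change id -> Disposition

    def originate(self, change_id):
        self.ledger[change_id] = Disposition.ORIGINATED

    def record(self, change_id, disposition):
        self.ledger[change_id] = disposition

    def carries(self):
        """Changes actually present in this branch's code."""
        return {c for c, d in self.ledger.items()
                if d is not Disposition.REJECTED}

    def pending_from(self, source):
        """Changes the source carries that this branch has never ruled on."""
        return sorted(source.carries() - set(self.ledger))


a, b, c, d = (Branch(n) for n in ("a", "b", "c", "d"))
a.originate("fix-17")                     # the fix first appears on a
b.record("fix-17", Disposition.TWEAKED)   # b takes it, with tweaks
c.record("fix-17", Disposition.REJECTED)  # c refuses it outright

first_offer = d.pending_from(b)           # d merges from b, not from a
for change in first_offer:
    d.record(change, Disposition.MERGED)
```

Because the ledger is keyed by the change rather than by the branch it
came from, c is not offered fix-17 again when it later merges from b,
and d has nothing left to pick up from a after taking the fix by way
of b.  That is the "transitive" part.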

Database system: I once worked for a major relational database system  
vendor.  A peculiarity of the database business is that customers  
become extraordinarily wedded to the version of the DB they're  
using.  Major releases are allowed/expected to be incompatible, which  
means both that a dump/load of the data is/may be needed, and also  
that your APIs may change incompatibly.  A corollary to this is that  
some customers don't want to upgrade to a new version ... often  
customers with a lot of money to spend.  But they still want change in
their product, not only bug fixes but even major enhancements of  
various sorts.  At the time I was there, we had four major versions  
in circulation.  The "center of gravity" (number of customers using  
it, or put another way number of dollars coming in for support  
contracts) tended to lag about 1.5 major versions behind the main  
release stream, so we had two or three versions under active, major  
enhancement and support, as well as one or two on "life support."   
Each version had some unique property to distinguish it: physical  
storage format, or highly parallel code, or object-relational  
support, or data replication, and the unique features involved a lot  
of unique code, of course, but there was also a huge majority of the  
code that was shared (and some of the support work on older versions  
consisted of back-porting stuff originally created for newer  
versions).  So, once again, we had essentially eternal parallel
development, with a rich mixture of unique stuff, shared stuff,  
weirdly semi-shared/semi-tweaked stuff, and some notable mixture of  
categorically-not-to-be-shared stuff.

Linux: Things in the Linux world are, perhaps, a bit more  
"cathedralesque" these days than they once were, but if you think  
back, say, to the days "pre-Red Hat," the model went something like  
this: while there was certainly always a very clear definition of
"the most official Linux in the world" (answer: whatever Linus has),
the community dynamics included a rich exchange of work at
somewhat-less-than-Linus levels of officialdom.  There was ample room for
hobbyists with no more interest than to build their own single site,  
and willing to take a useful feature from some other hobbyist,  
acknowledging that it might have some flaws or limitations that were  
actually keeping it from the One True Linux, "but what the heck, it  
works for me."  Maybe that's still true, maybe the commercially-viable,
"I don't WANT to understand it, I just want someone to yell at if it
fails" style is a pure addition.  Anyway, particularly in
the hobbyist arena, we once again see those now-familiar indicia:  
multiple branches, no real expectation that they'll converge to one,  
on-going change freely interchanged among them, and a distinct  
requirement to accept some changes, modify others, and explicitly  
reject yet others.

So those are my cases.  As you can see, they really reduce down to  
one abstract case, which is handy because it means we only need one  
implementation to address them all!

-==-
Jack Repenning
Director, Software Product Architecture
CollabNet, Inc.
8000 Marina Boulevard, Suite 600
Brisbane, California 94005
office: +1 650.228.2562
mobile: +1 408.835.8090
raindance: 844.7461
aim: jackrepenning
skype: jrepenning