[Svnmerge] correct parsing of xml output

Giovanni Bajo rasky at develer.com
Mon Jan 23 15:35:16 PST 2006


Mattias Engdegård <mattias at virtutech.se> wrote:

> get_copyfrom() does not parse the output of svn log --xml correctly.
> Regexes are greedy, so .* leads to mismatches (and incorrect
> behaviour).
> We were just bitten by this bug.

Ah yes, I fixed this a few days ago as well, but for some reason I haven't
committed the fix yet. Will do in the next few days.

> --- svnmerge.py (revision 18197)
> +++ svnmerge.py (working copy)
> @@ -479,7 +479,7 @@
>      out = launchsvn('log -v --xml --stop-on-copy "%s"' % dir, split_lines=False)
>      out = out.replace("\n", " ")
>      try:
> -        m = re.search(r'(<path .*action="A".*>%s</path>)' % rlpath, out)
> +        m = re.search(r'<path ([^>]*action="A"[^>]*)>%s</path>' % rlpath, out)
>          head = re.search(r'copyfrom-path="([^"]*)"', m.group(1)).group(1)
>          rev = re.search(r'copyfrom-rev="([^"]*)"', m.group(1)).group(1)
>          return head,rev

Yeah, I had a similar fix.
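
To make the failure mode concrete, here is a minimal sketch comparing the two
patterns from the patch. The log output and the paths in it are made up for
illustration, and copyfrom() is a hypothetical helper, not a function from
svnmerge.py. It is written for a current Python rather than the 2.x of the
thread.

```python
import re

# Made-up 'svn log -v --xml' output, already flattened onto one line the
# way get_copyfrom() does it. Two added paths appear in the same entry.
out = ('<log><logentry revision="11"><paths>'
       '<path copyfrom-path="/trunk/a" copyfrom-rev="5" action="A">'
       '/branches/other</path>'
       '<path copyfrom-path="/trunk" copyfrom-rev="10" action="A">'
       '/branches/foo</path>'
       '</paths></logentry></log>')
rlpath = "/branches/foo"

OLD = r'(<path .*action="A".*>%s</path>)'        # pattern before the patch
NEW = r'<path ([^>]*action="A"[^>]*)>%s</path>'  # pattern from the patch


def copyfrom(pattern, text, path):
    """Hypothetical helper: apply one pattern, then pull out the attributes."""
    m = re.search(pattern % path, text)
    head = re.search(r'copyfrom-path="([^"]*)"', m.group(1)).group(1)
    rev = re.search(r'copyfrom-rev="([^"]*)"', m.group(1)).group(1)
    return head, rev


# The greedy '.*' lets the match start at the first <path> element, so the
# attributes are read from the wrong element.
old_result = copyfrom(OLD, out, rlpath)

# '[^>]*' cannot cross a tag boundary, so only the element whose text is
# rlpath can match.
new_result = copyfrom(NEW, out, rlpath)

print(old_result, new_result)
```

Neither pattern escapes rlpath, so a path containing regex metacharacters
could still mismatch; wrapping it in re.escape() would be one way to guard
against that.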

> Proper XML parsing would be best, but I suppose you have some reason
> for not doing this (compatibility with old Python versions perhaps).


No, basic PyXML usage would be fine since I'm not interested in anything
pre-2.0. I'm more familiar with ElementTree, but that's not even standard in 2.4,
so it's out of the question. I should code something with SAX or PullDOM (the
current code reads everything into memory, but that's a quirk I'd like to fix;
I'd rather pull data from the stream and parse as it goes). I don't have much
spare time at the moment, but patches are welcome.
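
A rough sketch of what the PullDOM approach could look like, using
xml.dom.pulldom from the standard library. find_copyfrom() and the sample
data are hypothetical, and the code targets a current Python; it is meant
only to show the idea of pulling events from a stream and stopping at the
first match, not to stand in for the actual fix.

```python
import io
from xml.dom import pulldom


def find_copyfrom(stream, rlpath):
    """Hypothetical helper: scan 'svn log -v --xml' output incrementally.

    Returns (copyfrom-path, copyfrom-rev) for the first added <path>
    element whose text equals rlpath, or None if there is none.
    """
    events = pulldom.parse(stream)
    for event, node in events:
        if event != pulldom.START_ELEMENT or node.tagName != "path":
            continue
        # Build a DOM subtree for this one element only, so that its
        # text content becomes available.
        events.expandNode(node)
        text = "".join(child.data for child in node.childNodes
                       if child.nodeType == child.TEXT_NODE)
        if (text.strip() == rlpath
                and node.getAttribute("action") == "A"
                and node.hasAttribute("copyfrom-path")):
            return (node.getAttribute("copyfrom-path"),
                    node.getAttribute("copyfrom-rev"))
    return None


# Made-up sample input; in real use the stream would be the output of the
# svn process rather than an in-memory buffer.
sample = b"""<?xml version="1.0"?>
<log>
<logentry revision="11">
<paths>
<path copyfrom-path="/trunk/a" copyfrom-rev="5" action="A">/branches/other</path>
<path copyfrom-path="/trunk" copyfrom-rev="10" action="A">/branches/foo</path>
</paths>
</logentry>
</log>
"""

result = find_copyfrom(io.BytesIO(sample), "/branches/foo")
print(result)
```

Because the loop returns as soon as it finds the element, the rest of the
log never has to be read or held in memory.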

Thanks!

Giovanni Bajo