[Ocaml-pxp-users] PXP: Saving memory

Tue Apr 18 03:17:36 PDT 2006

We have a problematic XML document (a daily report) which runs to
something like 600-700 MB in size.  It is proving hard to parse this
document with PXP, because our machine runs out of memory and starts
thrashing.  In future this document will only grow in size.

We're looking for options to reduce the amount of memory used.  One
option would be to somehow parse the document incrementally, but I
don't think this is possible with PXP.

Another option would be to use "pools".  However, the documentation is
very thin on how to use these.  The parsing function we are using,
parse_wfdocument_entity, doesn't seem to accept a pool argument, and
in any case it isn't obvious how to create a pool in the first place.

The document isn't very complicated - it's just a simple list of
<row> elements.

Can someone give me suggestions?

Rich.

PS. One thing we found when parsing this is that #find_all_elements
isn't tail recursive, so it crashes on even fairly modest
documents.
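
To illustrate the tail recursion point, here is a minimal sketch of a
traversal that keeps its pending nodes in an explicit list instead of
on the call stack.  The `node` type and this `find_all_elements`
function are hypothetical stand-ins written for this example; they are
not PXP's node class or its actual implementation.

```ocaml
(* Hypothetical stand-in for an XML tree; this is NOT PXP's node type. *)
type node =
  | Element of string * node list   (* element name, children *)
  | Text of string

(* Collect every element named [name], in document order.
   Pending nodes are held in an ordinary list on the heap and [loop]
   is tail recursive, so neither the depth nor the width of the tree
   consumes system stack. *)
let find_all_elements name root =
  let rec loop acc = function
    | [] -> List.rev acc
    | Text _ :: rest -> loop acc rest
    | (Element (n, children) as e) :: rest ->
        let acc = if n = name then e :: acc else acc in
        (* Put the children in front of the remaining work, preserving
           their order.  List.rev and List.rev_append are both tail
           recursive. *)
        loop acc (List.rev_append (List.rev children) rest)
  in
  loop [] [root]

(* Usage: a flat document with many <row> children, similar in shape
   to the report described above (contents are made up). *)
let () =
  let rows =
    List.init 1_000_000 (fun i -> Element ("row", [Text (string_of_int i)]))
  in
  let doc = Element ("report", rows) in
  Printf.printf "%d rows found\n"
    (List.length (find_all_elements "row" doc))
```

This only addresses stack usage; the whole tree is still held in
memory, so it does nothing for the memory problem itself.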

-- 
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com