[Ocaml-pxp-users] Handling undeclared entities

Gerd Stolpmann gerd at gerd-stolpmann.de
Thu Jul 16 12:45:21 PDT 2009


On Thursday, 16.07.2009, at 09:43 -0700, Dario Teixeira wrote:
> Hi,
> 
> I am using PXP to parse a small HTML-like markup.  I would like to allow
> the use of common HTML entities in the source text (such as &euro;), but I
> don't want to include a list of *all* of them in the DTD (note that these
> are eventually checked for validity somewhere else; I just don't need this
> task to be performed also by PXP).
> 
> Now, the PXP manual mentions several times that entities are automatically
> converted into regular #PCDATA, and there doesn't seem to be a way of passing
> them unmodified to the processing code.  Therefore, if they are not declared
> in the DTD I get a parsing error.
> 
> One solution I can think of is to preprocess the source file, using regexps
> to replace entity references by a special node.  Something like this:
> "the symbol is &euro;" -> "the symbol is <entity>euro</entity>".
> 
> This solution is of course way too kludgy and error-prone.  Is there a better
> alternative within PXP?

Sure. The entity declarations only need to be added to the DTD object at
the right moment. The parsing functions have a callback for exactly that
purpose, ~transform_dtd, e.g.

parse_document_entity
  ~transform_dtd:(fun dtd ->
                    (* Declare &euro; as an internal general entity *)
                    let e = Pxp_dtd.Entity.create_internal_entity
                              ~name:"euro" ~value:"&#x20ac;" dtd in
                    dtd # add_gen_entity e false;
                    dtd  (* the callback has to return the DTD *)
                 )
  config src spec
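
If more than one entity is needed, the same callback can loop over a small
table. The following is only a sketch under assumptions: the table
html_entities, the helper char_ref and the function declare_all are
hypothetical names, not part of PXP, and the ~register argument stands in
for whatever adds a single declaration to the DTD.

```ocaml
(* Hypothetical table: entity name -> Unicode code point. Extend as needed. *)
let html_entities = [
  ("euro", 0x20ac);
  ("nbsp", 0x00a0);
  ("copy", 0x00a9);
]

(* Replacement text as a numeric character reference, e.g. "&#x20ac;". *)
let char_ref code = Printf.sprintf "&#x%x;" code

(* Call [register] once per table entry. [register] is a placeholder for
   the code that adds one entity declaration to the DTD. *)
let declare_all ~register entities =
  List.iter
    (fun (name, code) -> register ~name ~value:(char_ref code))
    entities

(* Inside ~transform_dtd one might then write (untested, assumes PXP):
     declare_all html_entities ~register:(fun ~name ~value ->
       dtd # add_gen_entity
         (Pxp_dtd.Entity.create_internal_entity ~name ~value dtd) false);
     dtd
*)
```

Keeping the table separate from the PXP call means the list of names can be
checked or extended without touching the parser setup.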

Gerd

> 
> Thanks!
> Best regards,
> Dario Teixeira
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd at gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



