[Ocaml-pxp-users] Help needed with PXP api

Gerd Stolpmann gerd at gerd-stolpmann.de
Tue Mar 8 02:43:37 PST 2005


On Monday, 07.03.2005, at 21:57 +0100, Alain Frisch wrote:
> Hello ocaml-pxp-users list,
> 
> I'm struggling with PXP API to load external entities. Basically, I have 
> a function of type (string -> string) which downloads non local URLs. 
> I'd like to configure PXP to use this function (and access the local 
> file system for URLs without an URL scheme), and figure out how to 
> combine base and relative URLs. What should I do ?  (I guess this is a 
> two-liner, but I can't figure out.)

It takes a bit more than two lines.

The best way to hook in your function is the class
Pxp_reader.resolve_to_url_obj_channel (assuming you work with URLs and
want relative URLs to work; without that it is a bit simpler):

class resolve_to_url_obj_channel :
  ?close:(Netchannels.in_obj_channel -> unit) ->
  url_of_id:(resolver_id -> Neturl.url) ->
  base_url_of_id:(resolver_id -> Neturl.url) ->
  channel_of_url:(resolver_id -> Neturl.url -> accepted_id) ->
  unit ->
    resolver;;

The class takes three function arguments:

- url_of_id: Creates the URL from the URL string. It looks at the URL
  scheme and decides whether to accept it (by just returning the 
  Neturl.url) or to reject it (by raising Not_competent).

  An example: Accept only "http" URLs (with pseudo code):
  ~url_of_id:(fun id ->
    match id.rid_system with
      Some s ->
        if "s begins with URL scheme" then (
          if "s begins with http:" then
            "Create absolute URL from s"
          else
            raise Not_competent   (* This URL is not for us *)
        ) else (
          (* ==> s is a relative URL *)
          match id.rid_system_base with
            Some base ->
               (* Check whether the base URL is http: *)
               if "base begins with http:" then
                 "Create relative URL from s (ignoring base)"
               else
                 raise Not_competent   (* Again not for us *)
          | None ->
               raise Not_competent
        )
    | None -> raise Not_competent
  )

- base_url_of_id: Returns the rid_system_base component as URL.
  Raises Not_competent or Malformed_URL if something goes wrong.
  Normally just parses the rid_system_base.

- channel_of_url: Actually opens the entity. An accepted_id is a triple

  type accepted_id =
      Netchannels.in_obj_channel * encoding option * resolver_id option

  where the first component is the object channel accessing the entity,
  the second component may be used to override the encoding (None means
  to autodetect). The third component is usually None (in principle, one
  can give the open entity a new name here).

  Example:
  ~channel_of_url:(fun id url ->
    let data = alain's_wonder_function ("get string representation of url") in
    let ch = new Netchannels.input_string data in
    (ch, None, None)
  )

  Note that url is now always an absolute URL.
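
The accept/reject decision in the url_of_id pseudo code above can be
tried out without PXP. The following is only a sketch under stated
assumptions: url_scheme, decision and decide are hypothetical names
(not part of PXP or Neturl), and the scheme test follows the RFC 3986
grammar. In a real url_of_id, the Reject case corresponds to raising
Not_competent, and the two accepting cases to building the Neturl.url.

```ocaml
(* Return [Some scheme] (lowercased) if [s] starts with "scheme:", where
   the scheme is a letter followed by letters, digits, '+', '-' or '.'.
   Return [None] otherwise, i.e. for relative references. *)
let url_scheme (s : string) : string option =
  let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') in
  let is_scheme_char c =
    is_alpha c || (c >= '0' && c <= '9') || c = '+' || c = '-' || c = '.'
  in
  match String.index_opt s ':' with
  | None | Some 0 -> None
  | Some i ->
      let ok = ref (is_alpha s.[0]) in
      for k = 1 to i - 1 do
        if not (is_scheme_char s.[k]) then ok := false
      done;
      if !ok then Some (String.lowercase_ascii (String.sub s 0 i)) else None

(* Hypothetical stand-in for the three outcomes of the pseudo code. *)
type decision = Accept_absolute | Accept_relative | Reject

(* [system_id] plays the role of rid_system, [base] of rid_system_base. *)
let decide ~(system_id : string) ~(base : string option) : decision =
  match url_scheme system_id with
  | Some "http" -> Accept_absolute
  | Some _ -> Reject                      (* some other scheme: not for us *)
  | None ->
      (* No scheme, so system_id is relative: look at the base URL. *)
      (match base with
       | Some b when url_scheme b = Some "http" -> Accept_relative
       | _ -> Reject)
```

One caveat of this simple test: a Windows path such as "C:\dir\file.xml"
is taken for a URL with scheme "c", so such paths would have to be
converted to file URLs beforehand.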

Now, when you have this, you can combine this custom reader with a
standard file reader:

let r =
  new combine
    [ new resolve_to_url_obj_channel ...;
      new resolve_as_file()
    ]
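
To illustrate the idea behind combining resolvers: each resolver either
accepts an ID or raises Not_competent, and the combination tries them in
order until one accepts. The following toy model shows only this
principle. It is not PXP's implementation; the resolver type, the local
Not_competent exception and the two example resolvers are hypothetical
and have nothing to do with the real Pxp_reader classes.

```ocaml
(* Local stand-in for the exception a resolver raises to decline an ID. *)
exception Not_competent

(* Toy resolver: maps an ID string to the contents of the entity. *)
type resolver = string -> string

(* Try the resolvers in list order; the first one that does not raise
   Not_competent wins. If all decline, the combination declines too. *)
let combine (rs : resolver list) : resolver = fun id ->
  let rec try_all = function
    | [] -> raise Not_competent
    | r :: rest -> (try r id with Not_competent -> try_all rest)
  in
  try_all rs

let starts_with prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* Two hypothetical resolvers that only pretend to fetch something. *)
let http_resolver : resolver = fun id ->
  if starts_with "http:" id then "fetched " ^ id else raise Not_competent

let file_resolver : resolver = fun id ->
  if starts_with "file:" id then "read " ^ id else raise Not_competent

let r = combine [ http_resolver; file_resolver ]
```

Because the first competent resolver wins, the order of the list matters
as soon as two resolvers would accept the same ID.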

Finally, use

XExtID(xid, Some default_base, r)

to open the first XML file. xid must identify this file. default_base is
the base URL assumed when no other base is known, e.g. "file:<cwd>", so
that URLs without a URL scheme refer to files.

As you see, the PXP machinery to open entities in a custom way is quite
complicated. The advantage of doing it this way is that it is possible
to plug various entity sources together. After the right resolver class
is defined (which is complicated), it is very easy to use it and to
combine it with other resolvers.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd at gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------




