afs_xml_split - AFS - Reference Guides

AFS Filters Description

Product: AFS
Platform: 7.12
Category: Reference Guides
Language: English

Split an XML document according to an XPath

The filter is declared with the afs_xml_split type. It is in the antidot-paf package. It is a processor filter.

The XML Split filter specifications are described in the following table:

| Parameter name | Mandatory | Type | Default | Description |
|---|---|---|---|---|
| split_XPath | Yes | string | N/A | The XPath on which documents are split. |
| doc_urn_XPath | No | string | N/A | The XPath locating the URI of each created document. If unset, arbitrary URIs of the form urn:afs:<UUID> are used. Note that the prefix "urn:afs:" is always added. |
| urn_namespace_identifier | No | string | N/A | Sets a specific URN namespace identifier, replacing 'afs' with the given value in both the auto-generated pattern "urn:afs:<UUID>" and the user-defined pattern "urn:afs:<value from doc_urn_XPath parameter>". |
| file_mode | No | boolean | false | Set to true to process very large files. The following limitations apply: XSD/DTD validation is not available; only "FILE" URIs (and not urn:afs:... and so on) can be processed; layer content is not read, only the disk content associated with URIs is. |
| inherit_tags | No | boolean | false | If set, the project-specific tags xId, yId, zId and language of the input document are copied to each output document. |
| input_layer | No | layer | CONTENTS | Input layer. |
| keep_hierarchy | No | boolean | false | When set to true, the original node hierarchy is kept. Otherwise, the node matched by split_XPath is considered the document root. |
| keep_blank_text_node | No | boolean | false | When set to true, all XML text nodes are kept as is, even if they contain only blank characters. |
| nsmap | No | map | N/A | The namespace map used to interpret XPaths. |
| max | No | integer | Infinite | The split produces no more than max new documents for each input file. Intended for development purposes. |
| output_layer | No | layer | CONTENTS | Output layer. |
| root_node_name | No | string | N/A | A document whose root tag is not this tag is passed to the next filter. This is useful for chaining several afs_xml_split filters. |
| validate_XPath | No | string | N/A | If this parameter is set, only documents for which the XPath evaluates to true are created. |
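
The URN patterns described for doc_urn_XPath and urn_namespace_identifier can be illustrated with a short Python sketch. This is not the filter's implementation: the function build_urn and its arguments are hypothetical names, and falling back to a random UUID when the path matches nothing is an assumption made for the example (the reference only specifies the UUID pattern for the case where doc_urn_XPath is unset).

```python
import uuid
import xml.etree.ElementTree as ET


def build_urn(element, doc_urn_xpath=None, namespace_identifier="afs"):
    """Build a URN for one split-off element (illustrative sketch only)."""
    value = None
    if doc_urn_xpath is not None:
        # findtext returns the text of the first match, or None if nothing matches.
        value = element.findtext(doc_urn_xpath)
    if not value:
        # Assumption: use a random UUID, mirroring the urn:afs:<UUID> pattern.
        value = str(uuid.uuid4())
    return f"urn:{namespace_identifier}:{value.strip()}"


product = ET.fromstring("<product><sku>A-42</sku><name>Lamp</name></product>")

print(build_urn(product, doc_urn_xpath="sku"))  # urn:afs:A-42
print(build_urn(product, doc_urn_xpath="sku", namespace_identifier="shop"))  # urn:shop:A-42
print(build_urn(product))  # urn:afs:<random UUID>
```

Note that ElementTree only supports a limited subset of XPath, so the paths accepted by this sketch are narrower than those the filter accepts.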

The XML split filter divides an XML file into smaller documentary units, each stored as an individual document. This is useful when the documents to process are provided in a single file, for example a catalog of products in a store: since the documents to process are the products, the catalog must be split to create one document per product.
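
The catalog case can be sketched in Python to show what such a split produces. This is a minimal illustration using the standard library, not the filter itself: the sample catalog, the function split_xml and its max_docs argument (loosely mirroring the max parameter) are hypothetical. Each matched element becomes the root of its own output document, which corresponds to the default behavior where keep_hierarchy is false.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: a single file holding several products.
CATALOG = """\
<catalog>
  <product id="p1"><name>Lamp</name><price>30</price></product>
  <product id="p2"><name>Chair</name><price>85</price></product>
  <product id="p3"><name>Desk</name><price>240</price></product>
</catalog>
"""


def split_xml(xml_text, split_xpath, max_docs=None):
    """Return one serialized XML string per element matched by split_xpath."""
    root = ET.fromstring(xml_text)
    documents = []
    for element in root.findall(split_xpath):
        if max_docs is not None and len(documents) >= max_docs:
            break
        # Drop the whitespace that followed the element in the source file,
        # otherwise tostring() would append it to the output.
        element.tail = None
        documents.append(ET.tostring(element, encoding="unicode"))
    return documents


docs = split_xml(CATALOG, "./product")
for doc in docs:
    print(doc)
```

Calling split_xml(CATALOG, "./product", max_docs=2) returns only the first two products, which can be convenient while developing a pipeline.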
Note: For more information about UUID, see http://en.wikipedia.org/wiki/Universally_unique_identifier
Note: Using the file_mode parameter is advised with large files (for example, if processing fails because the file is too large).
When the file_mode parameter is set to true, the filter processes documents directly from disk, and the following limitations apply:
  • XSD/DTD validation is no longer available.
  • Only "FILE" URIs (and not urn:afs:xxx and so on) can be processed.
  • Layer content is not read; only the disk content associated with the URI is.
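
The general idea of handling a large file incrementally, rather than loading it entirely into memory, can be sketched with Python's streaming parser. How file_mode is actually implemented is not documented here, so this only illustrates the technique; the function iter_split and the sample data are hypothetical, and the sketch assumes the split elements are not nested inside each other.

```python
import io
import xml.etree.ElementTree as ET


def iter_split(source, tag):
    """Yield serialized <tag> elements one at a time from a path or file object."""
    for _event, element in ET.iterparse(source, events=("end",)):
        if element.tag == tag:
            element.tail = None
            yield ET.tostring(element, encoding="unicode")
            # Free the children and text of the element that was just emitted.
            element.clear()


# A small in-memory stand-in for a large file on disk.
source = io.BytesIO(
    b"<catalog>"
    b"<product><name>Lamp</name></product>"
    b"<product><name>Chair</name></product>"
    b"</catalog>"
)

for doc in iter_split(source, "product"):
    print(doc)
```

Because documents are yielded as soon as their closing tag is read, only one product needs to be fully held in memory at a time. The emptied elements remain attached to the root, so a production version would also need to detach them.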