afs:title and afs:contents - AFS

AFS Integration Guide

Product: AFS
AFS_Version: 7.9
Category: Reference Guide

afs:title and afs:contents store the title and contents of the document respectively. The parts of the document used to build these items are specified during the PaF phase.

These elements carry additional markup that identifies the parts of the text where the query keywords matched, so that those parts can be highlighted. For example, when the query is foo, every occurrence of foo in the title and the contents is highlighted.

Highlighting is fully compatible with linguistic resources used while indexing the documents, for example:

  • When a thesaurus states that foobar is an alternate form of foo, a query for foobar also highlights every occurrence of foo, and conversely.
  • When a stemming dictionary states that foos and foo have the same root form, a query for foo also highlights every occurrence of foos, and conversely.
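The two rules above can be pictured with a toy model. The sketch below is not the AFS matching engine: the THESAURUS and STEMS tables, the function names, and the bracket notation used to mark a hit are all hypothetical, chosen only to mirror the foo, foos and foobar examples.

```python
import re

# Hypothetical linguistic resources mirroring the two examples above.
THESAURUS = [{"foo", "foobar"}]   # terms declared as alternates of each other
STEMS = {"foos": "foo"}           # inflected form -> root form


def root(word: str) -> str:
    """Return the root form of a word, ignoring case."""
    lowered = word.lower()
    return STEMS.get(lowered, lowered)


def equivalents(term: str) -> set:
    """Root forms that a query term is allowed to highlight."""
    base = root(term)
    result = {base}
    for group in THESAURUS:
        if base in group:
            result |= group
    return result


def highlight(text: str, query: str) -> str:
    """Wrap every word matching the query in square brackets."""
    targets = set()
    for term in query.split():
        targets |= equivalents(term)

    def mark(found: re.Match) -> str:
        word = found.group(0)
        return f"[{word}]" if root(word) in targets else word

    return re.sub(r"\w+", mark, text)


print(highlight("foo, foos and foobar", "foobar"))
```

In this model a query for foobar marks foo through the thesaurus entry and foos through the stemming entry, which is the symmetric behaviour the two items describe.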

Hits are highlighted by splitting the text into a list of afs:text, afs:match and afs:trunc child elements:

  • Portions of text that do not contain matches are included in an afs:text element.
  • Portions of text containing a match are included in an afs:match element.
  • When the text is too long, its non-significant parts are removed so that a short, significant excerpt can be displayed. An afs:trunc element is inserted wherever a truncation takes place.
  • When the content of the document spans multiple pages, each afs:match element in the highlighted snippets carries a page attribute. This attribute contains the number or identifier of the page that contains the matching word or expression.
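As an illustration of the page attribute described in the last item, the following sketch groups matched terms by page. Only the element names and the page attribute come from this guide. The sample fragment, its page values, the matches_by_page helper and the namespace URI are invented for the example; the real URI is the one declared in the AFS reply.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Placeholder URI, not the real AFS namespace.
AFS_NS = "urn:example:afs"


def matches_by_page(fragment: str) -> dict:
    """Map each page identifier to the list of terms matched on that page."""
    # Wrap the fragment in a root element that declares the afs prefix,
    # so that it can be parsed on its own.
    root = ET.fromstring(f'<root xmlns:afs="{AFS_NS}">{fragment}</root>')
    pages = defaultdict(list)
    for match in root.iter(f"{{{AFS_NS}}}match"):
        # Matches without a page attribute are grouped under None.
        pages[match.get("page")].append(match.text or "")
    return dict(pages)


# Hypothetical multi-page snippet.
sample = (
    '<afs:text>The terms </afs:text>'
    '<afs:match page="1">foobar</afs:match>'
    '<afs:trunc/>'
    '<afs:match page="3">foo</afs:match>'
    '<afs:text>, bar, and baz</afs:text>'
)
print(matches_by_page(sample))
```

A client could use such a mapping to link each highlighted hit to the page it comes from.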

For example, the following text:

The terms foobar, foo, bar, and baz are common placeholder names (also referred to as metasyntactic variables) used in computer programming or computer-related documentation.[1] They are commonly used to name variables or functions whose purpose is unimportant and serve only to demonstrate a concept. The terms can be used to represent any part of a complicated system or idea, including the data, variables, functions, and commands. The words themselves have no meaning in this usage, and are merely logical representations, much like the letters x and y are used in algebra. Foobar is often used alone; foo, bar, and baz are usually used in that order, when multiple entities are needed.

can be highlighted as follows when searching for "foo foobar":

<afs:text>The terms </afs:text>
<afs:match>foobar</afs:match>
<afs:text>, </afs:text>
<afs:match>foo</afs:match>
<afs:text>, bar, and baz are common placeholder names</afs:text>
<afs:trunc/>
<afs:match>Foobar</afs:match>
<afs:text> is often used alone; </afs:text>
<afs:match>foo</afs:match>
<afs:text>, bar, and baz</afs:text>
<afs:trunc/>

and, as a result, is displayed as:

The terms foobar, foo, bar, and baz are common placeholder names ... Foobar is often used alone; foo, bar, and baz ...

Indentation levels do not appear in the XML feed, and are displayed in the example to enhance readability of the XML data.
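A client can turn such a list of child elements into a displayable excerpt. The following is a minimal sketch, assuming the elements have been extracted from the reply as a string. The render_highlight function, the use of bold tags for matches, the ellipsis string and the namespace URI are choices made for the example, not part of AFS.

```python
import xml.etree.ElementTree as ET
from html import escape

# Placeholder URI, not the real AFS namespace.
AFS_NS = "urn:example:afs"


def render_highlight(fragment: str, ellipsis: str = " ... ") -> str:
    """Render afs:text, afs:match and afs:trunc children as an HTML excerpt."""
    # Wrap the fragment in a root element that declares the afs prefix,
    # so that it can be parsed on its own.
    root = ET.fromstring(f'<root xmlns:afs="{AFS_NS}">{fragment}</root>')
    parts = []
    for child in root:
        name = child.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
        if name == "text":
            parts.append(escape(child.text or ""))
        elif name == "match":
            parts.append(f"<b>{escape(child.text or '')}</b>")
        elif name == "trunc":
            parts.append(ellipsis)
    return "".join(parts).strip()


# The example from this page, written without line breaks.
sample = (
    "<afs:text>The terms </afs:text>"
    "<afs:match>foobar</afs:match>"
    "<afs:text>, </afs:text>"
    "<afs:match>foo</afs:match>"
    "<afs:text>, bar, and baz are common placeholder names</afs:text>"
    "<afs:trunc/>"
    "<afs:match>Foobar</afs:match>"
    "<afs:text> is often used alone; </afs:text>"
    "<afs:match>foo</afs:match>"
    "<afs:text>, bar, and baz</afs:text>"
    "<afs:trunc/>"
)
print(render_highlight(sample))
```

The text of each element is escaped before it is wrapped, so characters such as & or < in the indexed document cannot break the surrounding HTML.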