afs_lang_detect - AFS - Reference Guides

AFS Filters Description

Product: AFS
AFS version: 7.11
Category: Reference Guides
Language: English

Detect language of input documents

The filter is declared with the afs_lang_detect type. It is a processor filter provided by the antidot-paf package.

The parameters of the language detector filter are described in the following table:

| Parameter name | Mandatory | Type | Default | Description |
|---|---|---|---|---|
| xpaths | No | list | //* | The XPaths used on XML documents to extract the content to be analyzed for language detection. |
| nsmap | No | map | Empty map | Namespaces used to interpret the given XPaths. |
| jpaths | No | list | $..* | The JPaths used on JSON documents to extract the content to be analyzed for language detection. |
| input_layer | No | layer | CONTENTS | Input layer used to detect the language of the document. |
| threshold | No | float | 0 | The detection process assigns a score to each language, and the document is given the language with the best score. If that best score is lower than the configured threshold, no language is set on the document. |
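The threshold behaviour described above can be sketched in a few lines of Python. This is only an illustration of the selection rule, not the filter's actual implementation: the function name `pick_language`, the language codes, and the score values are hypothetical, and the score scale used by AFS is not specified here.

```python
from typing import Mapping, Optional


def pick_language(scores: Mapping[str, float], threshold: float = 0.0) -> Optional[str]:
    """Return the best-scoring language, or None if its score is below the threshold.

    `scores` maps a language code to a detection score (hypothetical values).
    """
    if not scores:
        # Nothing was scored, so no language can be set.
        return None
    # The language with the best score is the candidate for the document.
    best_lang = max(scores, key=scores.get)
    # A best score lower than the threshold means no language is set.
    if scores[best_lang] < threshold:
        return None
    return best_lang


# Hypothetical scores for one document.
scores = {"en": 0.82, "fr": 0.11, "de": 0.07}
print(pick_language(scores))                 # prints: en
print(pick_language(scores, threshold=0.9))  # prints: None
```

With the default threshold of 0, a language is set whenever at least one language has a non-negative score; raising the threshold leaves low-confidence documents without a language.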

The language detector filter detects the language of various document types. This is useful when a subsequent switch in the PaF pipeline dispatches the processed documents according to their language.
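To illustrate how the xpaths and nsmap parameters relate to each other, the following sketch extracts the text selected by namespace-aware XPaths from an XML document, using Python's standard library. It is not how the filter itself is implemented: the helper `extract_text`, the sample document, and the `dc` prefix are assumptions made for the example. Note also that `xml.etree.ElementTree` supports only a subset of XPath and requires relative paths such as `.//dc:title`.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Mapping, Optional


def extract_text(xml_source: str,
                 xpaths: Iterable[str],
                 nsmap: Optional[Mapping[str, str]] = None) -> str:
    """Concatenate the text of all elements matched by the given XPaths."""
    root = ET.fromstring(xml_source)
    chunks = []
    for xpath in xpaths:
        # nsmap resolves the prefixes used in the XPath (for example "dc").
        for element in root.findall(xpath, nsmap):
            text = (element.text or "").strip()
            if text:
                chunks.append(text)
    return " ".join(chunks)


# Hypothetical sample document with a namespaced title and description.
doc = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Bonjour le monde</dc:title>
  <dc:description>Un court texte en français.</dc:description>
  <id>42</id>
</record>"""

nsmap = {"dc": "http://purl.org/dc/elements/1.1/"}
print(extract_text(doc, [".//dc:title", ".//dc:description"], nsmap))
# prints: Bonjour le monde Un court texte en français.
```

Restricting the XPaths to text-bearing elements, as in this sketch, keeps identifiers and other non-linguistic values (here the `id` element) out of the content analyzed for language detection.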