afs_sitemap_crawl - AFS - Reference Guides

AFS Filters Description

Product: AFS
AFS_Version: 7.12
Category: Reference Guides
Language: English

This connector crawls a sitemap. It takes the sitemap URL and generates one document for each URI found in the sitemap.

The filter is declared with the afs_sitemap_crawl type. It is in the antidot-paf-misc package. It is a generator filter.

The Sitemap Crawl filter specifications are described in the following table:

Parameter name   Mandatory   Type     Default             Description
--------------   ---------   ------   -----------------   ------------------------------------------------
sitemap_url      Yes         string   N/A                 Sitemap URL
output_layer     No          layer    CONTENTS            Layer filled for each output document
user_agent       No          string   Python-urllib/3.2   Controls the user-agent provided in HTTP request

Attention: This filter must be the first filter in a Pipe. This filter will never process input documents.
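
The behaviour described above can be illustrated with a minimal Python sketch. It is not the filter's actual implementation, only an approximation under stated assumptions: the sitemap follows the standard sitemaps.org XML schema, each `<loc>` entry becomes one output document, and a document is modelled as a plain dictionary. The function names (`fetch_sitemap`, `iter_uris`, `generate_documents`) and the dictionary keys are hypothetical. Only the parameter names `sitemap_url`, `output_layer` and `user_agent`, with their defaults, come from the table.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (assumed to be the format crawled).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def fetch_sitemap(sitemap_url, user_agent="Python-urllib/3.2"):
    """Download the sitemap, sending user_agent as the User-Agent header."""
    request = urllib.request.Request(sitemap_url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        return response.read()


def iter_uris(sitemap_xml):
    """Yield the URI held by every <loc> element of a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    for loc in root.iter(SITEMAP_NS + "loc"):
        if loc.text and loc.text.strip():
            yield loc.text.strip()


def generate_documents(sitemap_xml, output_layer="CONTENTS"):
    """Build one (hypothetical) document record per URI found in the sitemap."""
    return [{"uri": uri, "layer": output_layer} for uri in iter_uris(sitemap_xml)]
```

A typical call would be `generate_documents(fetch_sitemap("http://example.com/sitemap.xml"))`, where the URL is a placeholder. Because the sketch builds documents from the sitemap alone and takes no input documents, it mirrors the role of a generator placed first in a Pipe. Whether the real filter also follows sitemap index files or downloads the content of each URI is not stated here, so the sketch does neither.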