Crawl Perimeter - AIF

Product: AIF
Category: Technical Notes
Language: English
Audience: public

The perimeter is the subset of the web to be crawled. It is defined through a dedicated structure of settings files.

In the conf/ directory, create a perimeter/ directory. It contains an http directory, an https directory, or both, depending on the kind of sites to crawl.

The http (or https) directory then contains one sub-directory per site. In each site directory, a conf.xml file can be used to override the general settings for that site.
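
The layout described above can be sketched with a short Python script. This is only an illustration, not part of AIF: the function names `site_dir` and `create_perimeter` are hypothetical, and naming each site directory after the site's host name is an assumption, since this note does not state the naming rule. The script creates the directories only; the content of any conf.xml override is left to you.

```python
from pathlib import Path
from urllib.parse import urlparse


def site_dir(conf_root: Path, url: str) -> Path:
    """Return the per-site directory for `url` under conf_root/perimeter/."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme in {url!r}")
    if not parts.hostname:
        raise ValueError(f"no host name in {url!r}")
    # Assumption: the site directory is named after the host name.
    return conf_root / "perimeter" / parts.scheme / parts.hostname


def create_perimeter(conf_root: Path, urls: list[str]) -> list[Path]:
    """Create one site directory per URL and return the created paths."""
    created = []
    for url in urls:
        directory = site_dir(conf_root, url)
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created


if __name__ == "__main__":
    # Hypothetical sites, used only to show the resulting layout.
    for path in create_perimeter(Path("conf"), [
        "http://www.example.com/",
        "https://shop.example.org/catalog",
    ]):
        # An optional conf.xml placed here would override the general settings.
        print(path / "conf.xml")
```

Run from the directory that holds conf/, this would create conf/perimeter/http/www.example.com/ and conf/perimeter/https/shop.example.org/, each ready to receive an optional conf.xml.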

[Figure: structure_crawl, the perimeter directory structure]