A reasonable estimate of the data volume is necessary to size the server architecture. The architecture may vary with the size of each indexed content item: indexing web pages of a few kilobytes is quite different from indexing a Document Management System (DMS) server holding heavy PDFs.
As a guideline, consider the following points:
- For 100,000 web pages, about 10 GB of back-end space is needed on average to store the crawled pages and to generate the indexes, and 1 GB is needed at the front-end level to store the generated indexes and the snippets.
- For 100,000 structured items (such as eCommerce catalog entries or database items), about 5 GB of back-end space is used on average for indexing, and 500 MB for the index front-ends.
- For logs, about 1 GB is needed on average per 5 million queries.
- A latest-generation two-processor server can index more than 1 million documents per day (depending on the type and complexity of the applied processes).
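The figures above can be turned into a rough estimate for a given volume. The following is a minimal sketch, not an Antidot tool: the function name `estimate_storage_gb` and its parameters are hypothetical, and it assumes that storage scales linearly with the number of items and queries, which the guidelines do not guarantee.

```python
# Average figures taken from the guidelines above.
# Assumption: storage scales linearly with volume.
GB_BACK_PER_100K_WEB_PAGES = 10.0
GB_FRONT_PER_100K_WEB_PAGES = 1.0
GB_BACK_PER_100K_STRUCTURED_ITEMS = 5.0
GB_FRONT_PER_100K_STRUCTURED_ITEMS = 0.5
GB_LOGS_PER_5M_QUERIES = 1.0


def estimate_storage_gb(web_pages=0, structured_items=0, queries=0):
    """Return a rough storage estimate in GB (hypothetical helper)."""
    page_units = web_pages / 100_000
    item_units = structured_items / 100_000
    query_units = queries / 5_000_000

    back_end = (page_units * GB_BACK_PER_100K_WEB_PAGES
                + item_units * GB_BACK_PER_100K_STRUCTURED_ITEMS)
    front_end = (page_units * GB_FRONT_PER_100K_WEB_PAGES
                 + item_units * GB_FRONT_PER_100K_STRUCTURED_ITEMS)
    logs = query_units * GB_LOGS_PER_5M_QUERIES

    return {"back_end": back_end, "front_end": front_end, "logs": logs}


if __name__ == "__main__":
    # Example: 250,000 web pages, 100,000 catalog entries, 10 million queries.
    print(estimate_storage_gb(250_000, 100_000, 10_000_000))
```

Since the source figures are averages, the result is best treated as a starting point and adjusted for the actual size of the indexed items.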
To help users size their hardware for AFS v7.9, Antidot recommends the following:
- At least 2 physical CPU cores (recommended: 4)
- 4 GB minimum of memory
- An SSD
- 100 Mbps of network bandwidth
- An Ethernet card
- Data size: 10 GB max
- Topics: 25,000
- Sessions: 7,000 per day
- System: RedHat7
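The hardware minimums in the list above (CPU cores, memory, SSD, network bandwidth) can be checked against a candidate server description before installation. This is an illustrative sketch only: the `check_server` function and the keys of the `spec` dictionary are hypothetical names, and the values must be supplied by the administrator rather than detected automatically.

```python
# Minimums taken from the recommendations above.
# The dictionary keys are hypothetical names chosen for this sketch.
MINIMUMS = {
    "cpu_cores": 2,       # physical cores (4 recommended)
    "memory_gb": 4,
    "network_mbps": 100,
}


def check_server(spec):
    """Return a list of problems; an empty list means the minimums are met."""
    problems = []
    for key, minimum in MINIMUMS.items():
        value = spec.get(key, 0)
        if value < minimum:
            problems.append(f"{key}: {value} is below the minimum of {minimum}")
    if not spec.get("ssd", False):
        problems.append("ssd: an SSD is recommended")
    return problems


if __name__ == "__main__":
    candidate = {"cpu_cores": 4, "memory_gb": 8, "network_mbps": 1000, "ssd": True}
    print(check_server(candidate) or "Server meets the minimum recommendations")
```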
To guarantee high availability on the client side (access to the web portal, search engine, and content server), it is recommended to have two Front-End servers and a MongoDB cluster.
The other Antidot components (Back Office, data processing, PDF Server, and Scheduler) cannot be duplicated.
The following diagram shows the recommended architecture for this specific use case:
The following table lists the server recommendations induced by this use case:
The PDF Server can be installed on any server of the Fluid Topics installation.
By contrast, the following diagram shows a one-front architecture, where all databases are located on the servers hosting the application: