Generate Clusters - Fluid Topics - 3.7 - Technical Notes

Develop Connectors with the Fluid Topics API


Technical documentation often contains chunks of information that authors reuse, for example across product versions. As a result, different documents can be nearly identical. Rather than listing similar documents one after the other, Fluid Topics improves the search experience by grouping them into a cluster. The cluster is then displayed as a single search result.

By default, Fluid Topics identifies similar documents by scanning the base_id of each document. Documents or topics with the same base_id are clustered together.
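The grouping rule can be pictured with a minimal sketch in plain Python. It does not use the Fluid Topics library and is not a description of its internals: the `Record` class and the `group_by_base_id` function are hypothetical names introduced only to illustrate that items sharing a base_id end up in the same cluster.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a document or topic; not a Fluid Topics class."""
    record_id: str
    base_id: str


def group_by_base_id(records):
    """Return a dict mapping each base_id to the IDs of the records sharing it."""
    clusters = defaultdict(list)
    for record in records:
        clusters[record.base_id].append(record.record_id)
    return dict(clusters)


records = [
    Record("product-X-1.0", "product-X"),
    Record("product-X-2.0", "product-X"),
    Record("product-Y-1.0", "product-Y"),
]
clusters = group_by_base_id(records)
```

In this sketch, the two "product-X" records land in one cluster, while "product-Y-1.0" stays on its own because no other record shares its base_id.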

The following example creates a cluster containing two documents about the same product. A selector on the Search page lets end users choose the document that seems most relevant.

from fluidtopics.connector import Topic, StructuredDocument, Metadata

version_1 = Metadata("product_version", ["1.0"])
version_2 = Metadata("product_version", ["2.0"])

# Both versions share these base_ids, which is what places them in one cluster
base_id_intro = "product-X-intro"
base_id_product_x = "product-X"

# Book about "Product X" in version 1.0

product_X_intro_v1 = Topic.create(
    topic_id="product-X-intro-version-1.0",
    title="My Topic Title",
    body="Introduction of Product X ... version 1.0 ...",
    metadata=[version_1],
    base_id=base_id_intro
)

product_X_v1 = StructuredDocument.create(
    document_id="product-X-1.0",
    title="Product X - version 1.0",
    locale="en-US",
    toc=[product_X_intro_v1],
    metadata=[version_1],
    base_id=base_id_product_x
)

# Book about "Product X" in version 2.0
# Almost the same as version 1.0, with some new features.
# Most content is the same: only the version number changes

product_X_intro_v2 = Topic.create(
    topic_id="product-X-intro-version-2.0",
    title="My Topic Title 2",
    body="Introduction of Product X ... version 2.0 ...",
    metadata=[version_2],
    base_id=base_id_intro
)

product_X_v2 = StructuredDocument.create(
    document_id="product-X-2.0",
    title="Product X - version 2.0",
    locale="en-US",
    toc=[product_X_intro_v2],
    metadata=[version_2],
    base_id=base_id_product_x
)

The following screenshots illustrate how clusters are displayed in a list of search results depending on which version the user selects:

[Screenshot: cluster_book_1.0]

[Screenshot: cluster_topic_1.0]

[Screenshot: cluster_book_2.0]

[Screenshot: cluster_topic_2.0]
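The effect of the version selector can be approximated with a short sketch. This is an illustration only, under stated assumptions: the dictionaries, the `pick_displayed_member` function, and the fallback to the first member are all hypothetical, and the actual selection logic in Fluid Topics may differ.

```python
def pick_displayed_member(cluster, selected_version):
    """Return the cluster member whose product_version contains the selection.

    Falls back to the first member when nothing matches (an assumption made
    for this sketch, not documented Fluid Topics behavior).
    """
    for member in cluster:
        if selected_version in member["product_version"]:
            return member
    return cluster[0]


# Hypothetical plain-dict view of the two books from the example above
cluster = [
    {"id": "product-X-1.0", "title": "Product X - version 1.0", "product_version": ["1.0"]},
    {"id": "product-X-2.0", "title": "Product X - version 2.0", "product_version": ["2.0"]},
]
shown = pick_displayed_member(cluster, "2.0")
```

Here `shown` is the version 2.0 book, mirroring what the user sees in the search results after choosing 2.0 in the selector.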