Split a Document into Topics - Fluid Topics - Technical Notes

Microsoft Word Connector Configuration Guide


The Microsoft Word connector transforms the content of a Word document into HTML. The resulting HTML file is then split into topics at each heading tag (h1, h2, etc.).

The transformation automatically converts Microsoft Word's built-in heading styles into the corresponding heading tags (h1, h2, etc.).

For documents using the standard heading styles (such as 'heading 1' and 'heading 2' in English, or 'titre 1' and 'titre 2' in French), the connector automatically splits the document into topics at each of these headings.
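To make the splitting behaviour concrete, here is a minimal sketch of cutting an HTML string into topics at each heading tag. It is not the connector's implementation; the class name `TopicSplitter`, the function `split_into_topics`, and the topic dictionary layout are all hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class TopicSplitter(HTMLParser):
    """Illustrative only: start a new topic at every heading tag."""

    def __init__(self):
        super().__init__()
        self.topics = []          # each topic: {"level", "title", "body"}
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            # A heading opens a new topic; its level comes from the tag name.
            self.topics.append({"level": int(tag[1]), "title": "", "body": ""})
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        if not self.topics:
            return  # in this sketch, text before the first heading is ignored
        key = "title" if self._in_heading else "body"
        self.topics[-1][key] += data


def split_into_topics(html):
    """Return the list of topics found in an HTML string."""
    parser = TopicSplitter()
    parser.feed(html)
    parser.close()
    for topic in parser.topics:
        topic["title"] = topic["title"].strip()
        topic["body"] = topic["body"].strip()
    return parser.topics
```

With this sketch, HTML that contains no heading tags yields an empty topic list, which mirrors why a document whose headings are not recognized cannot be split.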

For documents that use custom styles to structure sections and sub-sections, the connector needs to know how to map these styles to the appropriate HTML heading tags. Otherwise, the document will not be split at all. A style map file must therefore be added alongside the .docx files in the archive. A style map is a YAML file with HTML tags as keys and a list of Word styles as values.
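Before writing a style map, it can help to list the paragraph styles a document actually uses. The following is a rough sketch that reads a .docx file directly as a zip archive and counts paragraph styles by display name. The function name `paragraph_style_counts` is hypothetical, the sketch assumes the file contains both `word/document.xml` and `word/styles.xml`, and whether the connector matches on exactly these names is an assumption to verify against your own documents.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML main namespace used in .docx files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def paragraph_style_counts(docx_path):
    """Count how many paragraphs use each explicitly assigned paragraph style."""
    with zipfile.ZipFile(docx_path) as docx:
        styles_root = ET.fromstring(docx.read("word/styles.xml"))
        body_root = ET.fromstring(docx.read("word/document.xml"))

    # document.xml refers to styles by ID; styles.xml maps each ID to a name.
    names = {}
    for style in styles_root.findall("w:style", NS):
        name = style.find("w:name", NS)
        if name is not None:
            names[style.get(f"{{{W}}}styleId")] = name.get(f"{{{W}}}val")

    counts = Counter()
    # Paragraphs without an explicit w:pStyle use the default style and are skipped.
    for p_style in body_root.iterfind(".//w:p/w:pPr/w:pStyle", NS):
        style_id = p_style.get(f"{{{W}}}val")
        counts[names.get(style_id, style_id)] += 1
    return counts
```

Styles that appear often and are not built-in headings are the likely candidates for entries in the style map.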

For example, if the document uses the styles chapter, section and sub-section, the style map will be:

    h1:
        - chapter
    h2:
        - section
    h3:
        - sub-section

The style map must be named stylemap.yml.
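As a rough sketch of the packaging step, the script below writes a style map and bundles it with the .docx files. It assumes that chapter, section, and sub-section map to h1, h2, and h3 in that order, that the archive is a zip file, and that stylemap.yml sits at the archive root next to the documents. The function names `render_style_map` and `build_archive` are hypothetical.

```python
import zipfile
from pathlib import Path

# Assumed mapping: HTML heading tag -> list of Word style names.
STYLE_MAP = {
    "h1": ["chapter"],
    "h2": ["section"],
    "h3": ["sub-section"],
}


def render_style_map(style_map):
    """Render the mapping as YAML text: one key per tag, one list item per style."""
    lines = []
    for tag, styles in style_map.items():
        lines.append(f"{tag}:")
        for style in styles:
            # Plain names only; names with YAML special characters would need quoting.
            lines.append(f"    - {style}")
    return "\n".join(lines) + "\n"


def build_archive(docx_paths, archive_path, style_map=STYLE_MAP):
    """Write the .docx files and stylemap.yml at the root of a zip archive."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in docx_paths:
            archive.write(path, arcname=Path(path).name)
        archive.writestr("stylemap.yml", render_style_map(style_map))
    return archive_path
```

Because each value is a list, several Word styles can point to the same heading tag by adding more entries under that key.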