afs_filesystem_load - AFS - Reference Guides

AFS Filters Description

Product: AFS
Platform: 7.12
Category: Reference Guides
Language: English

The afs_filesystem_load filter indexes the content of a remote filesystem accessed over NFS or Samba.

The filter is declared with the afs_filesystem_load type. It is in the antidot-paf-misc package. It is a processor filter.

The Filesystem Load filter specifications are described in the following table:

Parameter name           Mandatory  Type       Default    Description
output_layer             No         layer      CONTENTS   Layer filled with each document.
protocol                 Yes        string     N/A        Filesystem protocol. Valid values are: nfs (Network File System), smb (Samba File System).
host                     Yes        string     N/A        Remote host fully qualified name or IP address.
user                     No         string     N/A        Remote user, leave empty for $AFS7_USER.
password                 No         string     N/A        Remote user password.
workgroup                No         string     WORKGROUP  If applicable, remote user workgroup.
root_directory           Yes        string     N/A        Remote root directory or share name.
mount_point              No         directory  N/A        If applicable, local mount point.
mount_options            No         string     N/A        If applicable, mount options.
user_group_mapping       No         string     static     The method used to map uid/gid values to user and group names. For Samba only the default method 'static' is available. For NFSv3/v4 the method 'idmapd' can be used to resolve dynamically the uid/gid values through calls to the idmapd daemon.
user_ids_to_names        No         map        N/A        Statically map uids or sids to user names.
group_ids_to_names       No         map        N/A        Statically map gids or sids to group names.
exclude                  No         list       N/A        List of patterns determining paths for files or directories to ignore. This list takes precedence over the list of included patterns. Patterns can include * for wildcard. Patterns case sensitivity depends on the type of filesystem.
include                  No         list       N/A        List of patterns determining paths for files or directories to load. Leave empty to load all files and directories. Patterns can include * for wildcard. Patterns case sensitivity depends on the type of filesystem.
skip_non_readable_files  No         boolean    true       When true, the filter ignores non-readable files. If set to false, then these files are created and their status is set to KO.

The filter is configured for one type of filesystem (NFS or Samba) using the protocol parameter. It processes all document URIs starting with the corresponding prefix (nfs:// for an NFS filesystem, smb:// for a Samba filesystem) and passes URIs that do not match on to the next PaF filter unchanged.
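The prefix-based dispatch described above can be pictured with a minimal Python sketch. It is an illustration only, not the filter's implementation: the function name handles_uri and the example host names are hypothetical, and the sketch assumes the prefix comparison is a plain case-sensitive string match.

```python
# Prefixes taken from the paragraph above; everything else is hypothetical.
URI_PREFIXES = {"nfs": "nfs://", "smb": "smb://"}


def handles_uri(protocol: str, uri: str) -> bool:
    """Return True when a filter configured for `protocol` would process `uri`."""
    if protocol not in URI_PREFIXES:
        raise ValueError(f"protocol must be 'nfs' or 'smb', got {protocol!r}")
    # Assumption: a simple, case-sensitive prefix test.
    return uri.startswith(URI_PREFIXES[protocol])


if __name__ == "__main__":
    # Hypothetical URIs, as seen by a filter configured with protocol = nfs.
    for uri in ("nfs://filer01/export/docs", "smb://filer02/share", "http://example.org/"):
        action = "process" if handles_uri("nfs", uri) else "pass to next filter"
        print(f"{uri}: {action}")
```

A pipeline that needs both NFS and Samba sources would therefore declare the filter twice, once per protocol value.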

This filter supports Secured PaF Scheduler Mode. See Secured Operating Mode for more information.

File and directory permissions are taken into account and stored in the reply database. Queries to this secured database rely on the afs:group and afs:user parameters.

Load NFS filesystem

When protocol is nfs, the following parameters are also mandatory: root_directory and mount_point. The include and exclude filter parameters (see details below) are case-sensitive when using NFS. All files and directories whose names start with a "." are automatically excluded.

Attention: When using NFS, the user running the PaF must have sufficient rights to run the sudo mount and sudo umount commands. These rights are configured in the /etc/sudoers file. The remote filesystem (NFS server) must stay accessible during the whole PaF execution.

Load SAMBA filesystem

When protocol is smb, the following parameters are also mandatory: root_directory (share name), user and password. The include and exclude filter parameters (see details below) are case-insensitive when using Samba, and use the "/" character as the directory name separator instead of the "\" character.

Note: Only a SAMBA server running on a Windows filesystem is supported.

Attention: When using SAMBA, the user running the PaF must have sufficient rights to access the share and navigate through the directory structure (i.e. read all files and their attributes). The remote filesystem (Samba server) must stay accessible during the whole PaF execution.
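The protocol-specific requirements from the NFS and SAMBA sections can be summarized as a small pre-flight check. The sketch below is hypothetical: the helper missing_parameters and the dictionary representation of the filter parameters are assumptions made for illustration, not part of the PaF API. Only the parameter names and the mandatory sets come from this page.

```python
# Mandatory parameters per protocol, as stated in the table and the two
# sections above. "host" is mandatory for both protocols.
REQUIRED_BY_PROTOCOL = {
    "nfs": ("host", "root_directory", "mount_point"),
    "smb": ("host", "root_directory", "user", "password"),
}


def missing_parameters(params: dict) -> list:
    """Return the mandatory parameter names that are absent or empty."""
    protocol = params.get("protocol")
    if protocol not in REQUIRED_BY_PROTOCOL:
        raise ValueError(f"protocol must be 'nfs' or 'smb', got {protocol!r}")
    return [name for name in REQUIRED_BY_PROTOCOL[protocol] if not params.get(name)]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    nfs_params = {
        "protocol": "nfs",
        "host": "filer01.example.org",
        "root_directory": "/export/docs",
    }
    print(missing_parameters(nfs_params))  # mount_point is reported as missing
```

Running such a check before launching the PaF makes a missing mount_point or password visible early instead of at mount or connection time.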

Users and Groups IDs Mapping

By default, the PaF process uses the mapping information configured in the user_ids_to_names and group_ids_to_names filter parameters to save user and group names.

If this information is not provided, the PaF process only has UID (user ID) and GID (group ID) information when using NFS, or SID (security ID) information when using SAMBA. In that case, default names are used: uu100 for UID 100, ug101 for GID 101, and raw SID values.

Note: When using Samba, some well-known SID mappings are predefined, e.g. "S-1-5-32-544" is mapped to the "ADMINISTRATORS" group and "S-1-5-18" is mapped to the "NT_SYSTEM" user (see the whole list of standard SIDs at http://support.microsoft.com/kb/243330, or increase the log level to display the list in the filter logs). These users and groups may have privileges to read many files in the filesystem. A predefined mapping can be overridden by defining a new name for the SID in the filter configuration, and a SID can be discarded by assigning it an empty name in the filter configuration.

Mapping errors are reported in the PaF process logs.
Tip: A mapping modification is taken into account the next time the file is indexed.
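The static mapping rules in this section can be sketched as a lookup with fallbacks. This is an illustrative reading of the text, not the filter's code: the function resolve_name is hypothetical, only the two predefined SIDs quoted above are listed, and returning None for a discarded SID is an assumption about how "discarded" could be represented.

```python
from typing import Mapping, Optional

# The two predefined SID mappings quoted in the note above (not the full list).
PREDEFINED_SIDS = {
    "S-1-5-32-544": "ADMINISTRATORS",
    "S-1-5-18": "NT_SYSTEM",
}


def resolve_name(identifier, kind: str, configured: Mapping[str, str]) -> Optional[str]:
    """Resolve a uid, gid or sid to a name under the 'static' method.

    `configured` stands in for user_ids_to_names or group_ids_to_names.
    Returns None when the configuration discards the identifier.
    """
    if kind not in ("uid", "gid", "sid"):
        raise ValueError(f"unknown identifier kind: {kind!r}")
    key = str(identifier)
    if key in configured:
        # An empty configured name discards the identifier (assumed: None).
        return configured[key] or None
    if kind == "sid":
        # Predefined mapping if any, otherwise the raw SID value.
        return PREDEFINED_SIDS.get(key, key)
    # Default names: uu<UID> for users, ug<GID> for groups.
    return ("uu" if kind == "uid" else "ug") + key


if __name__ == "__main__":
    print(resolve_name(100, "uid", {}))                        # uu100
    print(resolve_name(101, "gid", {"101": "staff"}))          # staff
    print(resolve_name("S-1-5-18", "sid", {}))                 # NT_SYSTEM
    print(resolve_name("S-1-5-18", "sid", {"S-1-5-18": ""}))   # None
```

The configured map is consulted first, which is what allows it to override or discard a predefined SID.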

Directories and files includes / excludes

By default, the afs_filesystem_load filter explores the whole directory tree below the input document's URI. The include and exclude parameters of the filter restrict which files and directories are loaded. The entries in these lists support Unix shell-style wildcards, which are not the same as regular expressions. The special characters used in shell-style wildcards are:

Pattern       Meaning
*             matches everything
?             matches any single character
[seq]         matches any character in seq
[!seq]        matches any character not in seq

For a literal match, wrap the meta-characters in brackets. For example, '[?]' matches the character '?'.

Note that the pattern must match the whole path, including parent directories. For example, to match a file "foo.txt" in any directory, the pattern should be "*/foo.txt".

All exclude patterns are evaluated first and any matching URI will be skipped. If at least one include entry is defined then only the paths matching the include pattern(s) will be loaded and other paths will be skipped.
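The evaluation order above can be approximated with Python's fnmatch module, which implements the same four wildcard forms. The sketch is an assumption-laden illustration: should_load is a hypothetical helper, the paths are invented, and the filter's own matcher may differ in details such as which part of the URI is compared against the pattern.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def should_load(path: str,
                include: Iterable[str] = (),
                exclude: Iterable[str] = (),
                case_sensitive: bool = True) -> bool:
    """Apply exclude patterns first, then include patterns, to a whole path.

    case_sensitive=True mirrors the NFS behaviour, False the Samba behaviour.
    """
    def matches(pattern: str) -> bool:
        if case_sensitive:
            return fnmatchcase(path, pattern)
        # Case-insensitive comparison: fold both sides to lower case.
        return fnmatchcase(path.lower(), pattern.lower())

    if any(matches(p) for p in exclude):
        return False            # exclude takes precedence
    include = list(include)
    if include:
        return any(matches(p) for p in include)
    return True                 # empty include list loads everything


if __name__ == "__main__":
    # The pattern must match the whole path, parent directories included.
    print(should_load("docs/foo.txt", include=["*/foo.txt"]))   # True
    print(should_load("docs/foo.txt", include=["foo.txt"]))     # False
    # Exclude wins even when an include pattern also matches.
    print(should_load("docs/tmp/a.txt", include=["*.txt"], exclude=["*/tmp/*"]))  # False
    # Samba-style case-insensitive matching.
    print(should_load("Docs/Report.PDF", include=["*.pdf"], case_sensitive=False))  # True
```

Because "*" matches everything, including the "/" separator, a pattern such as "*.txt" selects text files at any depth.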