Extraction parameters
The parameters in the <Extraction> element calibrate the size of the document batches that TEXTML Server indexes. Starting at the oldest unindexed document, TEXTML Server takes a batch of documents, parses them, and writes all the values to the different indexes in parallel (according to the IndexationThreadCount value).
Extraction parameters should not be changed unless IXIASOFT Customer Support expressly recommends it.
Parameter name | Default value | Description |
---|---|---|
StopUpdatePeriod | 1 | Specifies the interval, in seconds, at which the indexing and deindexing tasks will check for pending read operations and permit interruption. |
MaximumUpdateSize | 67108864 | Specifies the maximum size (in bytes) of parsed documents that an indexing batch can contain. |
MaximumSourceSize | 10485760 | Specifies the maximum size (in bytes) of content that an indexing batch can contain. |
OccAllocManagerBlockSize | 4194304 | Specifies the size (in bytes) of the blocks used to minimize memory allocation requests to the OS when indexing and deindexing. |
ClusterIndexDocuments | (not specified) | Specifies the maximum number of documents that an indexing batch can contain. When not specified, the maximum is 500 documents. |
ClusterDeindexDocuments | (not specified) | Specifies the maximum number of documents that a deindexing batch can contain. When not specified, the maximum is 500 documents. |
IgnoreInvalidCharacters | False | |
LongXPathEvaluationTreshold | 10 | Specifies the length of time (in seconds) that an XPath evaluation can take before it triggers a warning. |
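
To illustrate how the batch limits in the table might interact, the following Python sketch groups documents, oldest first, into batches that respect a document cap, a source-size cap, and a parsed-size cap. It is only a simplified model: the default values are taken from the table, but the `Doc` class, the `make_batches` function, and the rule that an oversized document is placed alone in its own batch are assumptions made for illustration and do not describe the actual TEXTML Server implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List

# Default values copied from the parameter table; the logic below is assumed.
MAXIMUM_SOURCE_SIZE = 10_485_760   # bytes of content per indexing batch
MAXIMUM_UPDATE_SIZE = 67_108_864   # bytes of parsed documents per indexing batch
CLUSTER_INDEX_DOCUMENTS = 500      # cap used when the parameter is not specified


@dataclass
class Doc:
    """Hypothetical stand-in for an unindexed document."""
    name: str
    source_size: int   # size of the raw content, in bytes
    parsed_size: int   # size of the parsed form, in bytes


def make_batches(
    docs: Iterable[Doc],
    max_docs: int = CLUSTER_INDEX_DOCUMENTS,
    max_source: int = MAXIMUM_SOURCE_SIZE,
    max_update: int = MAXIMUM_UPDATE_SIZE,
) -> Iterator[List[Doc]]:
    """Yield batches in input order (assumed oldest first)."""
    batch: List[Doc] = []
    source_total = 0
    parsed_total = 0
    for doc in docs:
        # Would adding this document exceed any of the three limits?
        over_limit = (
            len(batch) + 1 > max_docs
            or source_total + doc.source_size > max_source
            or parsed_total + doc.parsed_size > max_update
        )
        # Close the current batch first; a single oversized document
        # still gets a batch of its own (assumption of this sketch).
        if batch and over_limit:
            yield batch
            batch, source_total, parsed_total = [], 0, 0
        batch.append(doc)
        source_total += doc.source_size
        parsed_total += doc.parsed_size
    if batch:
        yield batch


# Five documents of 4,000,000 source bytes each: only two fit under the
# 10,485,760-byte source limit, so the batches hold 2, 2, and 1 documents.
docs = [Doc(f"doc{i}", 4_000_000, 6_000_000) for i in range(5)]
print([len(b) for b in make_batches(docs)])  # [2, 2, 1]
```

In this model, whichever limit is reached first closes the batch, which is why large documents produce batches far smaller than the 500-document cap.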