Content too large to process even though it's smaller than the maximum conversion size

Article: 100044849
Last Published: 2020-02-26
Product(s): Enterprise Vault

Problem

Before the content of an item can be indexed, it must be converted to HTML or text. Emails with file attachments that are smaller than the maximum conversion size may still fail to convert because the converted content is too large to process.

Error Message

Log Name:      Veritas Enterprise Vault Converters
Source:        Enterprise Vault Converters
Date:          2/1/2019 3:40:58 PM
Event ID:      28993
Task Category: Storage Archive
Level:         Warning
Keywords:      Classic
User:          N/A
Computer:      EV-EXCH.EV.Local
Description:
Unable to convert item content
 
Reason: The converted content is too large to process      (0xc0041bd0)
Supplementary Info:  
 
Item: SA:-0
Subject: FlushedMail_U10.xlsx
Attachment:  
Type: xlsx
Vault ID: 1CB415D2C77840B42A10E7BB08B4EEAFF1110000ev-exch.EV.Local
SaveSet ID: 201902019053576~201901251409330000~Z~D116826D9C2D410C3A8340F1A7FFCEE1
Attachment ID: 1
 
This item will be archived without a preview being available to the web application and the content will not be indexed. It is not possible to search on the content but the item can be restored as normal

Cause

Enterprise Vault (EV) uses a converter application to convert an item to HTML or text and to extract parts of a file so that the content can be crawled for indexing. The combined size of all files extracted and converted gives the total conversion size. The event above is logged when this total conversion size is larger than the EV maximum conversion size setting.
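
The check described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Enterprise Vault code: the function name, the sample part sizes, and the assumption that 1 MB means 1024 * 1024 bytes are all hypothetical. The 30 MB default is the value given in the Solution section.

```python
MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes

def exceeds_conversion_limit(part_sizes_bytes, max_conversion_mb=30):
    """Return True when the combined size of all extracted and
    converted parts is larger than the maximum conversion size."""
    total = sum(part_sizes_bytes)
    return total > max_conversion_mb * MB

# Hypothetical item whose extracted parts are 12 MB, 10 MB and 9 MB.
parts = [12 * MB, 10 * MB, 9 * MB]
print(exceeds_conversion_limit(parts))      # 31 MB total against the 30 MB default
print(exceeds_conversion_limit(parts, 64))  # same item against a 64 MB limit
```

Each part here is smaller than the limit, yet the combined total is what gets compared, which is why a small original file can still trigger the warning.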

Zip Compressed File Types

Certain file formats, such as XLSX, are compressed and must be extracted to expand all of their content before they can be converted. To see the difference, save an XLSX file as an XLS file, which is its uncompressed format; the resulting file is larger. All Office documents with the additional X in the file extension use zip compression, so they can be renamed with a .zip extension and then decompressed with any zip decompression tool.
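
As a quick alternative to renaming and extracting by hand, the following Python sketch reads the zip directory of such a file and reports how much larger its content is once expanded. It uses only the standard `zipfile` module; the function name `zip_expansion` and the example file name are hypothetical. Note that the uncompressed size is only a rough indicator and is not the conversion size that Enterprise Vault calculates, since the converted HTML or text can differ in size from the extracted parts.

```python
import zipfile

def zip_expansion(path):
    """Return (compressed_bytes, uncompressed_bytes), summed over
    every member of a zip-based file such as XLSX or DOCX."""
    with zipfile.ZipFile(path) as zf:
        infos = zf.infolist()
        compressed = sum(info.compress_size for info in infos)
        uncompressed = sum(info.file_size for info in infos)
    return compressed, uncompressed

# Example usage (hypothetical file name):
#   compressed, uncompressed = zip_expansion("FlushedMail_U10.xlsx")
#   print(f"{compressed} bytes compressed, {uncompressed} bytes uncompressed")
```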

Other File Types

Other file types, such as PDF, cannot simply be renamed with a .zip extension and decompressed. For such files, use a tool that performs the conversion and also extracts parts of the file, such as images, to their original state.

Solution

Increase the maximum converted file size from the default (30 MB) to a larger value that does not exceed the maximum limit (1024 MB) in the Site Properties. Note that this value applies to the size of the converted file, not to the size of the original file.

Console Root > Enterprise Vault > Directory > Site > Site Properties > Advanced > Content Conversion > Maximum conversion size

Minimum = 1 MB, Maximum = 1024 MB

Additional testing can help determine how much to increase the value. Enterprise Vault uses the Oracle Outside In Technology content converters to perform conversions, and these can be downloaded to compare the converted sizes of specific file types. See the Related Articles section for a step-by-step guide on how to do that.

NOTES:

  1. Not all content can be converted to HTML or text, because of its size, file type, number of files, etc.  Some conversion warnings and errors are expected in the Enterprise Vault Converters event log. Adjust the settings as needed.
  2. Items that cannot be converted receive an index attribute named comr.  This attribute is searchable and can be used in Enterprise Vault Search or Discovery Accelerator to identify items with non-converted content.
