Useful Web Content Extraction Filter

This filter accepts the same parameters as the Content Extraction Filter (HTML, PDF, Word, Excel, PowerPoint, XML, EML, and Text). It applies several heuristics to automatically extract the useful content of the field value, discarding navigation menus, images, and other decorative elements commonly found in Web documents.

This filter uses the Content Extraction Filter internally, so the Content Extraction Filter does not need to be included when the Useful Web Content Extraction Filter is used.
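
This manual does not say which heuristics the filter applies. As an illustration only, the sketch below shows one common approach to this kind of extraction: keep text blocks that contain enough words and drop blocks made up mostly of link text, as menus usually are. It is not the filter's actual implementation. Every name in it (`BlockCollector`, `extract_useful_text`, `min_words`, `max_link_density`) and both threshold values are assumptions made for the example.

```python
from html.parser import HTMLParser

# Hypothetical tag sets chosen for this sketch; a real filter may differ.
BLOCK_TAGS = {"p", "div", "li", "td", "ul", "ol", "nav", "section",
              "article", "body", "h1", "h2", "h3"}
SKIP_TAGS = {"script", "style"}


class BlockCollector(HTMLParser):
    """Splits a page into text blocks and counts link characters per block."""

    def __init__(self):
        super().__init__()
        self.blocks = []        # list of (text, link_chars)
        self._text = []
        self._link_chars = 0
        self._in_link = 0
        self._skip = 0

    def flush(self):
        # Close the current block, normalizing whitespace.
        text = " ".join("".join(self._text).split())
        if text:
            self.blocks.append((text, self._link_chars))
        self._text = []
        self._link_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag == "a":
            self._in_link += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip:
            self._skip -= 1
        elif tag == "a" and self._in_link:
            self._in_link -= 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self._skip:
            return              # ignore script and style contents
        self._text.append(data)
        if self._in_link:
            self._link_chars += len(data.strip())


def extract_useful_text(html, min_words=5, max_link_density=0.5):
    """Keep blocks with enough words and a low share of link text."""
    collector = BlockCollector()
    collector.feed(html)
    collector.close()
    collector.flush()
    kept = [
        text for text, link_chars in collector.blocks
        if len(text.split()) >= min_words
        and link_chars / len(text) <= max_link_density
    ]
    return "\n".join(kept)


SAMPLE = """<html><body>
<ul><li><a href="/">Home</a></li><li><a href="/news">News</a></li></ul>
<script>var x = 'script text that should never appear in the output';</script>
<p>The filter keeps paragraphs that contain real sentences and few links.</p>
<div><a href="/about">About us and our company history page</a></div>
</body></html>"""

if __name__ == "__main__":
    print(extract_useful_text(SAMPLE))
```

On the sample page, the two menu entries are dropped for being too short, the link-only block is dropped for its link density, and the script contents are ignored, which leaves only the paragraph. The two thresholds trade recall for precision: stricter values remove more menus but may also discard short legitimate paragraphs.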
