Summary Generation Filter

This filter reads the value of the field named in the Input field parameter (which must be a text field) and stores the result in the field named in the Output field parameter. It automatically generates a summary of the input field's content. Its behavior depends on the type of content being processed:

  • For content expressed in RSS format, the summary is the value of the “description” field.

  • For all other documents, the summary is generated automatically by applying various heuristics.
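
The two cases above can be illustrated with a minimal Python sketch. This is not the filter's actual implementation: the function names (summary_filter, looks_like_rss, fallback_summary), the max_chars limit, and the leading-sentences heuristic are all assumptions made for illustration, since the real heuristics are not documented here. The sketch also assumes that the RSS “description” is the first description element found in the feed.

```python
import re
import xml.etree.ElementTree as ET


def looks_like_rss(text):
    """Return the parsed root element if text is an RSS document, else None."""
    try:
        root = ET.fromstring(text.strip())
    except ET.ParseError:
        return None
    return root if root.tag == "rss" else None


def fallback_summary(text, max_chars=200):
    """Placeholder heuristic (an assumption, not the product's algorithm):
    collapse whitespace and keep the leading sentences that fit in max_chars."""
    text = " ".join(text.split())
    sentences = re.split(r"(?<=[.!?])\s+", text)
    summary = ""
    for sentence in sentences:
        candidate = (summary + " " + sentence).strip()
        if len(candidate) > max_chars:
            break
        summary = candidate
    # If even the first sentence is too long, fall back to a hard cut.
    return summary or text[:max_chars]


def summary_filter(record, input_field="content", output_field="summary"):
    """Return a copy of record with a summary stored in output_field."""
    text = record[input_field]
    root = looks_like_rss(text)
    # RSS case: use the first "description" element found in the feed.
    description = root.find(".//description") if root is not None else None
    if description is not None and description.text:
        summary = description.text.strip()
    else:
        # Every other document: apply the heuristic fallback.
        summary = fallback_summary(text)
    result = dict(record)  # leave the input tuple untouched
    result[output_field] = summary
    return result
```

For example, calling summary_filter({"content": "First sentence. Second sentence."}) returns a copy of the tuple with an added “summary” field, while a tuple whose “content” holds an RSS feed gets the feed's description as its summary.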

Although this filter can be applied to the tuples returned by any type of job, it is designed primarily for ARN jobs. For this reason, the default Input field is “content”, which holds the content of the document obtained by the crawler, and the default Output field is “summary”.
