French business daily – Intelligent article auto-tagging system

A European business daily used an external semantic tagging system to add value to its articles while cutting authors’ routine workloads.

Background

The daily is one of Europe’s leading sources of financial and business information and debate. Since 2009 it has used a multi-channel Eidosmedia platform to produce its print edition, Web portals and e-paper, and to manage part of its editorial archives.

The challenge

For a business publication whose news content has significant reference value (even after publication), the retrievability of items through search and reference mechanisms is an important part of the service it offers its paying subscribers.

The ease and speed with which search mechanisms can locate an article are significantly increased by tagging the item with the right metadata. This classification is normally done by the author of the article, typically by choosing terms from a list and entering names. Busy journalists often skip the step or do it badly.

What was needed was a process that would create richer and more comprehensive semantic metadata without over-burdening the authoring staff.

The solution

The solution adopted is based on Luxid, an expert system for the semantic analysis of unstructured data, developed by the French company Temis.

When an author releases a story for publication, the XML story file is first ‘cleaned up’ and converted into a format that Luxid’s analytical tools can read. It is then sent to the Luxid remote platform where two kinds of web-based analyses are undertaken.
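The case study does not describe the story schema or the format Luxid expects, so the following is only a minimal sketch of the ‘clean-up’ idea, assuming the goal is to reduce the XML to plain text. The element names (`story`, `headline`, `body`) and the function `story_to_plain_text` are hypothetical.

```python
import xml.etree.ElementTree as ET


def story_to_plain_text(xml_source: str) -> str:
    """Flatten a story XML document into whitespace-normalised plain text.

    Assumption: the analysis tools only need the text, not the markup.
    """
    root = ET.fromstring(xml_source)
    # itertext() yields every text fragment in document order, ignoring tags.
    fragments = (fragment.strip() for fragment in root.itertext())
    # Drop empty fragments (indentation between elements) and join the rest.
    return " ".join(fragment for fragment in fragments if fragment)


# Hypothetical story file; the real schema is not given in the case study.
sample = """<story>
  <headline>Bank raises rates</headline>
  <body><p>The central bank <b>raised</b> rates on Tuesday.</p></body>
</story>"""

print(story_to_plain_text(sample))
```

Joining fragments with a single space keeps the sketch short, but it would also split a word that has inline markup in the middle of it; a production converter would need to handle that case.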

The first analyses the content to generate ‘descriptors’ (people, locations, categories etc.) that are used to enrich the metadata, including tagging with IPTC topic NewsCodes. The second identifies other documents within the editorial corpus, kept in an external archive, that exhibit ‘similarity’ with the analyzed text.
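How Luxid measures ‘similarity’ is not stated. To make the second analysis concrete, here is a sketch that swaps in a simple, well-known stand-in: cosine similarity over word counts. It is not Luxid’s method, and the function names and the archive layout (a dictionary of document id to text) are assumptions for illustration.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def term_counts(text: str) -> Counter:
    """Count lower-cased word tokens; \\w+ also matches accented letters."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the two texts' word-count vectors."""
    va, vb = term_counts(a), term_counts(b)
    dot = sum(va[term] * vb[term] for term in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(count * count for count in va.values()))
    norm_b = math.sqrt(sum(count * count for count in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(text: str, archive: Dict[str, str], top_n: int = 3) -> List[Tuple[str, float]]:
    """Return the top_n (document id, score) pairs, best match first."""
    scored = [(doc_id, cosine_similarity(text, body)) for doc_id, body in archive.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Calling `most_similar(new_story_text, archive)` would give the short list of related items to attach to the story. A real system would normally weight terms and use linguistic analysis rather than raw counts.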

The results are then returned to the editorial workflow and the author is invited to review the tagging, unchecking the boxes next to terms that do not apply. The author may also add locations to the geodata using an auto-completion mechanism fed by a link to an external database. At the end of the operation, the corrected metadata is appended to the news item and the item is archived.
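The review step can be sketched as a small merge: start from the suggested descriptors, drop whatever the author unchecked, and add any extra locations. The data shapes and the name `apply_review` are hypothetical; the case study does not describe how the metadata is actually stored.

```python
from typing import Dict, List, Set, Tuple


def apply_review(
    suggested: Dict[str, List[str]],
    rejected: Set[Tuple[str, str]],
    added_locations: List[str],
) -> Dict[str, List[str]]:
    """Return the metadata that remains after the author's review.

    suggested        descriptors proposed by the analysis, keyed by category
    rejected         (category, term) pairs whose boxes the author unchecked
    added_locations  places the author added by hand
    """
    # Keep only the terms the author left checked; the input is not modified.
    reviewed = {
        category: [term for term in terms if (category, term) not in rejected]
        for category, terms in suggested.items()
    }
    # Add the author's locations, skipping any that are already present.
    locations = reviewed.setdefault("locations", [])
    for place in added_locations:
        if place not in locations:
            locations.append(place)
    return reviewed
```

Because the author only removes or adds individual terms, the default outcome when nothing is touched is the full set of suggestions, which matches the "review and confirm" model described here.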

Outcomes

The auto-tagging solution significantly improved the quality and relevance of the metadata associated with the published articles, while cutting the time needed to complete the task.

Authors were far more willing to review and confirm metadata than to originate it.

The paper’s archived content immediately became more accessible, and the solution provided a sound basis for the advanced search and retrieval functions planned for the future.
