Europe Media Monitor

EVENT INFORMATION EXTRACTION RESOURCES

The rapid proliferation of textual information in digital form has led various security-related organisations to acknowledge the benefits of automatically extracting structured information on critical events from open sources for early warning, situation monitoring and risk assessment. The JRC multilingual event-extraction system was developed to extract structured information on crises, violent events, man-made and natural disasters, outbreaks of infectious disease and socio-political events from online news stories collected in near real time from the Internet by the Europe Media Monitor. The extracted event information includes the event type and location, as well as additional event-specific aspects such as perpetrators, victims, numbers of injured and displaced people, targeted infrastructure and weapons used.
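As a rough illustration of what such an event template might look like as a data structure, here is a minimal Python sketch. The class name, field names and sample values are hypothetical and chosen only to mirror the aspects listed above; they are not the actual schema of the released corpus, which is documented in the list of event types and attributes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EventTemplate:
    """Hypothetical record for one extracted event (illustrative field names)."""
    event_type: str
    location: str
    perpetrators: List[str] = field(default_factory=list)
    victims: List[str] = field(default_factory=list)
    injured_count: Optional[int] = None
    displaced_count: Optional[int] = None
    targeted_infrastructure: List[str] = field(default_factory=list)
    weapons_used: List[str] = field(default_factory=list)

    def filled_slots(self) -> List[str]:
        """Return the names of the slots that carry a value."""
        empty = (None, [], "")
        return [name for name, value in vars(self).items() if value not in empty]


# Made-up sample values, not taken from the corpus.
example = EventTemplate(
    event_type="armed_conflict",
    location="Exampleville",
    victims=["civilians"],
    injured_count=12,
)
print(example.filled_slots())
```

Optional slots default to empty values because, in general, a news story does not mention every aspect of an event.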


This is a web page dedicated to the preliminary release of a corpus of structured information on security-related events automatically extracted from online news over a period of 10 years, part of which has been manually curated.

For further details, please refer to our paper "On the Creation of Security-related Event Corpus", presented at the Events and Stories in the News workshop co-located with the ACL 2017 conference.

The event data was freshly extracted from an archive of stories in English going back 10 years. The JRC's big data pilot project platform, JEODPP, was used; its architecture is described in the paper "Towards a JRC Earth Observation Data and Processing Platform". The project website is https://cidportal.jrc.ec.europa.eu

Currently the following data is available:

Release 1.0 (1 August 2017)

  1. Corpus of human-moderated event templates in six languages [TSV] [JSON].
  2. Corpus of automatically extracted event templates for English (filtered - 14 event types reported in the paper)
  3. Corpus of automatically extracted event templates for English (unfiltered - all event types)
  4. Full list of event types and attributes [PDF].
  5. Description of how to access news clusters (stories) from which the events were extracted [PDF].