The EU-ADR corpus: Annotated drugs, diseases, targets, and their relationships

  • Erik M. van Mulligen*
  • Annie Fourrier-Reglat
  • David Gurwitz
  • Mariam Molokhia
  • Ainhoa Nieto
  • Gianluca Trifiro
  • Jan A. Kors
  • Laura I. Furlong
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

118 Scopus citations

Abstract

Corpora with specific entities and relationships annotated are essential to train and evaluate text-mining systems that are developed to extract specific structured information from a large corpus. In this paper, we describe an approach where a named-entity recognition system produces a first annotation and annotators revise this annotation using a web-based interface. The agreement figures achieved show that the inter-annotator agreement is much better than the agreement with the system-provided annotations. The corpus has been annotated for drugs, disorders, genes, and their inter-relationships. For each of the drug-disorder, drug-target, and target-disorder relations, three experts have annotated a set of 100 abstracts. These annotated relationships will be used to train and evaluate text-mining software to capture these relationships in texts.

Original language: English
Pages (from-to): 879-884
Number of pages: 6
Journal: Journal of Biomedical Informatics
Volume: 45
Issue number: 5
State: Published - Oct 2012

Funding

Funders (with funder number where given)
European Union Community
EU-ADR
European Commission
Innovative Medicines Initiative
Seventh Framework Programme: 115002, 215847
Instituto de Salud Carlos III: CP10/00524

    Keywords

    • Adverse drug reactions
    • Corpus development
    • Machine learning
    • Text mining
