Cascaded data mining methods for text understanding, with medical case study

Roni Romano*, Lior Rokach, Oded Maimon

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations


Large volumes of electronically stored text, such as clinical narrative reports, often need to be searched to find relevant information for clinical and research purposes. The context of negation, that is, a negative finding, is of special importance, since many of the most frequently described findings are negative. Hence, when searching free-text narratives for patients with a certain medical condition, many of the retrieved documents will be irrelevant if negation is not taken into account. We present a new cascaded pattern learning method for automatic identification of negative context in clinical narrative reports. By studying the training corpora, the classification errors, and the patterns selected by the classifier, we noticed that it is possible to create a more powerful ensemble structure than the one obtained from a general-purpose ensemble method (such as AdaBoost). We compare the new algorithm to previous methods proposed for the same task on similar medical narratives and show its advantages: improved accuracy compared to other machine learning methods, and much greater speed than manual knowledge engineering techniques while matching their accuracy.

Original language: English
Title of host publication: Proceedings - ICDM Workshops 2006 - 6th IEEE International Conference on Data Mining - Workshops
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 5
ISBN (Print): 0769527027, 9780769527024
State: Published - 2006

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

