A linguistic approach to semantic extraction from text

Abel Browarnik, Oded Maimon

Research output: Contribution to journal › Article › peer-review


Ontology learning from text is the process of distilling knowledge, both implicit and explicit. Machines acquire knowledge either through human intervention or through an automatic approach without human involvement, i.e. unsupervised ontology learning, which depends on unsupervised, automatic understanding of text. Text understanding relies on either Machine Learning or a Linguistics-based approach. Both approaches require that a semantic representation of the text be obtained. This paper describes the context of ontology learning, emphasizing the extraction of semantic content. We review the possible approaches and propose a heuristics-based linguistic model for the automatic extraction of semantic content. The model examines the structure of the English sentence and corpus-based facts showing that sentence length is bounded. This leads to the conclusion that finite state automata can be used to heuristically detect clause boundaries within sentences. We show a clause-semantics retrieval example that could not be solved using other currently available methods. The semantics of the whole sentence can be obtained by combining the semantics of each constituent clause, based on the sentence structure found. A further paper will present the complete automaton for clause boundary detection, together with detailed results and a comparison to other available approaches.
Original language: English
Pages (from-to): 9-29
Number of pages: 21
Journal: Revista Electronica de Linguistica Aplicada
Issue number: 1
State: Published - 2011

