Managing the Acronym/Expansion Identification Process for Text-Mining Applications |
Received: October 14, 2008; Revised: December 20, 2008 |
Mathieu Roche, Violaine Prince. Managing the Acronym/Expansion Identification Process for Text-Mining Applications. International Journal of Software and Informatics, 2008, 2(2):163-179 |
Mathieu Roche, Violaine Prince |
Abstract: This paper presents an approach for extracting acronym/definition pairs from textual data (corpora) and for disambiguating these definitions (or expansions). Both steps of our overall process for acquiring and managing acronyms are described in detail. The first step uses markers such as brackets to identify expansion candidates; aligning the letters then makes it possible to select the acronym/definition pairs. The second step determines the relevant expansion of an acronym in a given context. Our method is based on statistical measures (Mutual Information, Cubic Mutual Information, Dice Measure) and the results provided by search engines. The paper presents an evaluation of the overall process on real data (general and specialized domains). |
Keywords: Web-mining; text-mining; natural language processing; BioNLP; named entities recognition; acronym; quality measures |
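The two steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The letter alignment is simplified here to matching word initials, the measures are the common count-based forms (with the corpus-size constant dropped, which does not affect ranking), and the function names and the `(n_x, n_y, n_xy)` hit-count layout are assumptions introduced for illustration. The paper's exact formulations may differ.

```python
import math
import re


def find_acronym_pairs(text):
    """Step 1 (simplified): use brackets as markers, then keep a candidate
    only if the initials of the preceding words spell the acronym."""
    pairs = []
    for match in re.finditer(r"\(([A-Z]{2,})\)", text):
        acronym = match.group(1)
        # Words that appear before the opening bracket.
        words = re.findall(r"[A-Za-z]+", text[:match.start()])
        candidate = words[-len(acronym):]
        initials = "".join(word[0].upper() for word in candidate)
        if initials == acronym:
            pairs.append((acronym, " ".join(candidate)))
    return pairs


def dice(n_x, n_y, n_xy):
    """Dice measure from counts: 2 * n(x, y) / (n(x) + n(y))."""
    total = n_x + n_y
    return 2 * n_xy / total if total else 0.0


def mutual_information(n_x, n_y, n_xy, cubic=False):
    """Count-based Mutual Information, log2(n(x, y) / (n(x) * n(y))).
    With cubic=True the joint count is cubed (Cubic Mutual Information)."""
    if min(n_x, n_y, n_xy) <= 0:
        return float("-inf")
    joint = n_xy ** 3 if cubic else n_xy
    return math.log2(joint / (n_x * n_y))


def best_expansion(hit_counts, measure=dice):
    """Step 2 (simplified): pick the expansion whose hit counts score
    highest. hit_counts maps expansion -> (n_expansion, n_context, n_both);
    in practice these would come from search-engine queries."""
    return max(hit_counts, key=lambda exp: measure(*hit_counts[exp]))


sample = "We use natural language processing (NLP) and support vector machines (SVM)."
print(find_acronym_pairs(sample))
# [('NLP', 'natural language processing'), ('SVM', 'support vector machines')]
```

The hit counts passed to `best_expansion` are left to the caller, so the sketch does not depend on any particular search-engine API. Any measure with the same three-count signature can be swapped in through the `measure` argument.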