Towards Knowledge Acquisition from Semi-Structured Content |
Received: August 02, 2007 Revised: October 16, 2008 |
Xi Bai, Jigui Sun, Haiyan Che, Lian Shi. Towards Knowledge Acquisition from Semi-Structured Content. International Journal of Software and Informatics, 2008, 2(2): 233-248 |
Xi Bai, Jigui Sun, Haiyan Che, Lian Shi |
Fund: This work is supported by the NNSFC under Grant No. 60496321 and the European Commission (EASTWEB: Building an Integrated Leading Euro-Asian Higher Education and Research Community in the Field of the Semantic WEB) under Grant No. 111084 |
|
Abstract: Researchers have developed a rich family of generic Information Extraction (IE) techniques. This paper proposes WebKER, a system that automatically extracts knowledge from semi-structured content on Web pages using wrappers and domain ontologies. During extraction, wrappers are learned through suffix arrays. Domain ontologies then automatically align the raw data extracted by the wrappers, and knowledge is generated by describing the data with Resource Description Framework (RDF) statements. After a merging process, the newly generated knowledge is finally added to the Knowledge Base (KB), where users can query it regardless of where the resources originated. A prototype of WebKER is implemented. This paper also gives a performance evaluation of the system and a comparison between querying information in the KB and querying information in a traditional database, which indicates the superiority of our system. In addition, the evaluation of the outstanding wrapper and the method for merging knowledge are presented. |
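As a rough illustration of how a suffix array can expose the repeated markup pattern that a wrapper relies on, the sketch below finds the longest repeated run of tokens in a tokenized page. It is not the WebKER algorithm: the token list, the function names (`build_suffix_array`, `common_prefix_len`, `longest_repeated_pattern`) and the naive sort-based construction are all assumptions made for this example.

```python
def build_suffix_array(tokens):
    # Naive construction (assumption for brevity): sort the start
    # positions of all suffixes by the token sequences they begin.
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def common_prefix_len(a, b):
    # Number of leading tokens shared by two token sequences.
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_repeated_pattern(tokens):
    # The longest repeated run of tokens is always a common prefix of
    # two suffixes that are adjacent in the suffix array.
    sa = build_suffix_array(tokens)
    best = []
    for prev, cur in zip(sa, sa[1:]):
        k = common_prefix_len(tokens[prev:], tokens[cur:])
        if k > len(best):
            best = tokens[prev:prev + k]
    return best


# Hypothetical tokenized page: two list items sharing one template,
# with text content abstracted to the placeholder token "TEXT".
page = ["<li>", "<b>", "TEXT", "</b>", "</li>",
        "<li>", "<b>", "TEXT", "</b>", "</li>"]
pattern = longest_repeated_pattern(page)
print(pattern)
```

The repeated pattern recovered here plays the role of a candidate template for the data records on the page; a practical system would use a faster suffix array construction and additional filtering of candidate patterns.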
keywords: information extraction, knowledge base, domain ontologies, pattern discovery, suffix array, knowledge merging |