SEMANTIC ANNOTATION TO SUPPORT AUTOMATIC TAXONOMY CLASSIFICATION
This paper presents a new taxonomy classification method that generates classification criteria from a small number of important sentences identified through semantic annotation. Rhetorical Structure Theory (RST) is used to derive the annotations, which identify the parts of a text that are more important for understanding its content. Extracting salient sentences is a major issue in text summarisation. Statistical analysis is commonly used for this purpose, but for subject-matter texts, linguistically motivated natural language processing techniques such as semantic annotation are preferred. An experiment testing the method on documents collected from industry showed that classification accuracy can be improved by up to 16%.
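
The general idea of deriving classification criteria from annotated salient sentences can be illustrated with a minimal sketch. It is not the method evaluated in the paper: it assumes each sentence has already been labelled with an RST-style role ("nucleus" for central content, "satellite" for supporting content), treats nucleus sentences as the salient ones, and uses simple term overlap as a stand-in for the classification criteria. All function names, category names, and example sentences are hypothetical.

```python
import re
from collections import Counter


def salient_sentences(doc):
    """Keep only the sentences annotated as nuclei (assumed to be the salient ones)."""
    return [text for text, role in doc if role == "nucleus"]


def term_counts(sentences):
    """Count lowercased alphabetic tokens across the given sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(re.findall(r"[a-z]+", sentence.lower()))
    return counts


def build_criteria(training):
    """Map each category to term counts drawn only from its salient sentences."""
    criteria = {}
    for category, docs in training.items():
        counts = Counter()
        for doc in docs:
            counts.update(term_counts(salient_sentences(doc)))
        criteria[category] = counts
    return criteria


def classify(doc, criteria):
    """Return the category whose criteria share the most terms with the document."""
    terms = term_counts(salient_sentences(doc))

    def score(category):
        # Overlap score: shared term occurrences between document and category.
        return sum(min(n, criteria[category][t]) for t, n in terms.items())

    # Sorting makes tie-breaking deterministic (alphabetically first category wins).
    return max(sorted(criteria), key=score)


# Hypothetical annotated documents: (sentence, RST-style role) pairs.
training = {
    "maintenance": [[
        ("The pump bearing failed after overheating.", "nucleus"),
        ("This was reported on Monday.", "satellite"),
    ]],
    "design": [[
        ("The bracket geometry was revised to reduce weight.", "nucleus"),
        ("The review took place on Monday.", "satellite"),
    ]],
}
criteria = build_criteria(training)

new_doc = [
    ("The bearing failed again.", "nucleus"),
    ("The bracket was nearby.", "satellite"),
]
predicted = classify(new_doc, criteria)
```

In this sketch the satellite sentence mentioning "bracket" is ignored, so only the nucleus contributes to the decision. A real system would need an RST parser or manual annotation to supply the roles, and a more robust classifier than raw term overlap.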