Methods of Extracting and Representing Year References from an Encyclopedia for Chronological-table-generating Text Searching
Abstract: A textual information retrieval method, called the thematic chronological-table search method, has been developed. In this method, an index is generated by extracting and collecting year references from a text collection. When the user inputs search words, this index and a statement-by-statement full-text index are used to search for year references and the search words, and the result items are sorted by year and displayed. Each result item contains a year reference, the sentence that contains the year, and a hyperlink to the original text. This paper explains the information extraction method used in thematic chronological-table searching. The method has been applied to a Japanese encyclopedia, and an evaluation shows that the precision of extraction is higher than 99% in most cases. An efficient and less error-prone data representation for year expressions, which may contain several units such as century, year, month, and day, is also explained.
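As a rough illustration of the flow summarized in the abstract (extract year references, index them sentence by sentence, match search words, sort by year), the following minimal Python sketch may help. It is not the extraction method of this paper: the regular expression `YEAR_RE`, the `Entry` record, the functions `build_index` and `search`, and the English sample texts are all assumptions made for illustration, whereas the actual method targets year expressions in a Japanese encyclopedia. The multi-unit year representation (century, month, day) is not modeled here.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern: a 3- or 4-digit number, optionally followed by BC or AD.
YEAR_RE = re.compile(r"\b(\d{3,4})(?:\s*(BC|AD))?\b")


@dataclass(frozen=True)
class Entry:
    year: int      # negative values stand for BC years
    sentence: str  # the sentence that contains the year reference
    source: str    # stands in for the hyperlink to the original text


def build_index(articles):
    """Collect one entry per year reference found in each sentence.

    `articles` maps a source identifier to its text (assumed layout).
    """
    index = []
    for source, text in articles.items():
        # Naive sentence splitting on ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            for m in YEAR_RE.finditer(sentence):
                year = int(m.group(1))
                if m.group(2) == "BC":
                    year = -year
                index.append(Entry(year, sentence, source))
    return index


def search(index, words):
    """Return entries whose sentence contains every search word, sorted by year."""
    hits = [
        e for e in index
        if all(w.lower() in e.sentence.lower() for w in words)
    ]
    return sorted(hits, key=lambda e: e.year)


# Invented sample data, for illustration only.
articles = {
    "Edo": "The capital moved to Edo in 1868. Edo castle was built in 1457.",
    "Kyoto": "Kyoto became the capital in 794.",
}
index = build_index(articles)
years = [e.year for e in search(index, ["capital"])]  # [794, 1868]
```

Each returned `Entry` carries the three parts of a result item named in the abstract: the year, the containing sentence, and a pointer back to the source text. Sentences without a year reference never enter the index, so they cannot appear in the chronological table.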
Introduction to this research theme: Axis-Specified Search (Thematic Search)