University of Tasmania

Effectiveness of Methods for Syntactic and Semantic Recognition of Numeral Strings: Tradeoffs Between Number of Features and Length of Word N-Grams

conference contribution
posted on 2023-05-26, 09:21; authored by Minchin, K; Wilson, WH; Kang, BH
This paper describes and compares methods based on word N-grams (specifically trigrams and pentagrams), together with five features, for recognising the syntactic and semantic categories of numeral strings representing money, number, date, etc., in text. The system employs three interpretation processes: word N-gram construction with a tokeniser; rule-based processing of numeral strings; and N-gram-based classification. We extracted numeral strings from 1,111 online newspaper articles. For numeral string interpretation, we set aside 112 (10%) of the 1,111 articles as unseen test data (1,278 numeral strings) and used the remaining 999 articles, containing 11,525 numeral strings, to extract N-gram-based constraints for disambiguating the meanings of numeral strings. The word trigram method achieved 83.8% precision, 81.2% recall, and an F-measure of 82.5%. The word pentagram method achieved 86.6% precision, 82.9% recall, and an F-measure of 84.7%.
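As a rough illustration of the first process (word N-gram construction around numeral strings) and of how the reported F-measures follow from precision and recall, here is a minimal Python sketch. It is not the authors' implementation. The whitespace tokeniser, the "contains a digit" test for numeral strings, the centring of the numeral in each window, the `<pad>` symbol, and the function names `tokenise`, `numeral_ngrams`, and `f_measure` are all assumptions made for this example. The F-measure is taken to be the usual harmonic mean of precision and recall, which is consistent with the figures quoted in the abstract.

```python
import re

# Assumption: a token counts as a numeral string if it contains a digit.
NUMERAL = re.compile(r"\d")


def tokenise(text):
    """Split on whitespace; a simple stand-in for the paper's tokeniser."""
    return text.split()


def numeral_ngrams(tokens, n, pad="<pad>"):
    """Return one word n-gram (n odd) centred on each numeral token."""
    if n % 2 == 0:
        raise ValueError("n must be odd so the numeral sits in the centre")
    half = n // 2
    # Pad both ends so numerals near a boundary still get a full window.
    padded = [pad] * half + list(tokens) + [pad] * half
    return [
        tuple(padded[i:i + n])
        for i, tok in enumerate(tokens)
        if NUMERAL.search(tok)
    ]


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    tokens = tokenise("The company paid $20 million in 2007 .")
    print(numeral_ngrams(tokens, 3))  # trigram contexts
    print(numeral_ngrams(tokens, 5))  # pentagram contexts
    print(round(f_measure(83.8, 81.2), 1))  # trigram figures from the abstract
    print(round(f_measure(86.6, 82.9), 1))  # pentagram figures from the abstract
```

In this sketch a pentagram gives the classifier two words of context on each side of the numeral rather than one, which is the kind of tradeoff the title refers to: longer windows carry more disambiguating context but match previously seen contexts less often.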

History

Publication title: Effectiveness of Methods for Syntactic and Semantic Recognition of Numeral Strings: Tradeoffs Between Number of Features and Length of Word N-Grams
Volume: 4830
Issue: 1
Publication status: Published
Event title: 20th Australian Joint Conference on Artificial Intelligence
Event venue: Brisbane, Australia
Date of event (start date): 2007-12-02
Date of event (end date): 2007-12-06
Repository status: Open
