  <eprint xmlns="http://eprints.org/ep2/data/2.0">
    <eprintid>178</eprintid>
    <rev_number>4</rev_number>
    <eprint_status>archive</eprint_status>
    <userid>4</userid>
    <dir>disk0/00/00/01/78</dir>
    <datestamp>2005-05-18</datestamp>
    <lastmod>2008-07-18 09:39:04</lastmod>
    <status_changed>2008-07-16 15:42:06</status_changed>
    <type>conference_item</type>
    <metadata_visibility>show</metadata_visibility>
    <creators>
      <item>
        <name>
          <family>Taib</family>
          <given>SM</given>
        </name>
        <id></id>
      </item>
      <item>
        <name>
          <family>Yeom</family>
          <given>SJ</given>
        </name>
        <id></id>
      </item>
      <item>
        <name>
          <family>Kang</family>
          <given>BH</given>
        </name>
        <id></id>
      </item>
    </creators>
    <title>Elimination of Redundant Information for Web Data Mining</title>
    <ispublished>pub</ispublished>
    <subjects>
      <item>280100</item>
    </subjects>
    <full_text_status>public</full_text_status>
    <monograph_type>NULL</monograph_type>
    <pres_type>paper</pres_type>
    <keywords>Web Monitoring, Web information management, Ripple Down Rules, RDR, MCRDR</keywords>
    <abstract>Billions of Web pages are now written in HTML or other
markup languages. Compared with traditional text-based
documents, they share few uniform structures and vary
widely in authoring style. Users, however, usually focus
on the particular section of a page that presents the
information most relevant to their interests. Web document
classification therefore needs to group and filter pages
based on their content and relevant information. Many
studies on Web mining report on mining Web structure and
extracting information from Web content. However, they
have focused on detecting tables that convey specific data,
not tables that are used as a mechanism for structuring
the layout of Web pages. Case modeling of tables can be
constructed based on structure abstraction. Furthermore,
Ripple Down Rules (RDR) is used to implement
knowledge organization and construction, because it
supports simple rule maintenance based on cases and
local validation.</abstract>
    <date>2005</date>
    <date_type>published</date_type>
    <volume>1</volume>
    <publisher>IEEE</publisher>
    <pagerange>200-205</pagerange>
    <event_title>International Conference on Information Technology</event_title>
    <event_location>Las Vegas, USA</event_location>
    <event_dates>4-6 April 2005</event_dates>
    <event_type>conference</event_type>
    <institution>University of Tasmania</institution>
    <thesis_type>UNSPECIFIED</thesis_type>
    <refereed>TRUE</refereed>
    <editors>
      <item>
        <name>
          <family>Selvaraj</family>
          <given>Henry</given>
        </name>
        <id></id>
      </item>
      <item>
        <name>
          <family>Srimani</family>
          <given>Pradip K.</given>
        </name>
        <id></id>
      </item>
    </editors>
    <documents>
      <document xmlns="http://eprints.org/ep2/data/2.0">
        <docid>429</docid>
        <rev_number>1</rev_number>
        <eprintid>178</eprintid>
        <pos>1</pos>
        <format>application/pdf</format>
        <language>en</language>
        <security>public</security>
        <license>cc_utas</license>
        <main>PID51344.pdf</main>
        <files>
          <file>
            <filename>PID51344.pdf</filename>
            <filesize>176938</filesize>
            <url>http://eprints.utas.edu.au/178/1/PID51344.pdf</url>
          </file>
        </files>
      </document>
    </documents>
  </eprint>
