Noise Elimination from the Web Documents by Using URL paths and Information Redundancy

Kang, BH and Kim, YS (2006) Noise Elimination from the Web Documents by Using URL paths and Information Redundancy. In: The 2006 International Conference on Information & Knowledge Engineering, 26-29 Jun, Las Vegas, US.

Available under University of Tasmania Standard License.


Noise data in Web documents significantly affects the performance of Web information management systems. Many researchers have proposed noise elimination methods based on document structure. In this paper, we propose a different approach that eliminates redundant information in Web documents that share the same URL path. We propose a redundant word/phrase filtering method for single or multiple tokenizations. We conducted two experiments to examine the efficiency and effectiveness of our filtering approaches. Experimental results show that our approach performs well on both criteria.
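The idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grouping key (host plus directory part of the URL), the whitespace word tokenization, the document-frequency rule, and the names `url_path_key`, `filter_redundant_tokens`, `threshold`, and `min_group_size` are all assumptions made for illustration. It covers only single-word tokens, not the phrase (multiple-token) case the abstract mentions.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse
import posixpath


def url_path_key(url):
    """Hypothetical grouping key: host plus the directory part of the URL path."""
    parts = urlparse(url)
    return parts.netloc + posixpath.dirname(parts.path)


def filter_redundant_tokens(pages, threshold=0.8, min_group_size=2):
    """Drop words that recur across pages sharing a URL path.

    pages: dict mapping URL -> page text.
    Returns a dict mapping URL -> list of kept (lowercased) word tokens.
    The threshold rule is an assumption, not taken from the paper.
    """
    # Group page URLs by their shared URL path.
    groups = defaultdict(list)
    for url in pages:
        groups[url_path_key(url)].append(url)

    result = {}
    for urls in groups.values():
        tokens = {u: pages[u].lower().split() for u in urls}
        if len(urls) < min_group_size:
            # Too few pages to judge redundancy; keep every token.
            result.update(tokens)
            continue
        # Count in how many pages of the group each word occurs.
        doc_freq = Counter()
        for toks in tokens.values():
            doc_freq.update(set(toks))
        # Words present in at least `threshold` of the group's pages
        # are treated as noise (e.g. navigation text) and removed.
        cutoff = threshold * len(urls)
        for u, toks in tokens.items():
            result[u] = [t for t in toks if doc_freq[t] < cutoff]
    return result
```

For example, given three pages under the same `/news` path that all begin with the words "Home Contact", the sketch would drop those two words from each page and keep the page-specific text, while a lone page under a different path would be left unchanged.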

Item Type: Conference or Workshop Item (Paper)
Keywords: MCRDR Filtering Web
Date Deposited: 08 Feb 2007
Last Modified: 18 Nov 2014 03:13