Seminars - Abstract

Department of Computer Science & Computer Engineering

Topic:   Clustering Sentences Based on Semantic Similarity Computation
Speaker:   Khaled Abdalgader
Date:   01-06-2009
Time:   3:00 PM
Venue:   SEMS meeting room, Bundoora
Abstract:   Traditional text clustering methods treat text as a bag of words, with similarity between two texts measured on the basis of word co-occurrence between those texts. While this approach is suitable for clustering large fragments of text (e.g., documents), it performs poorly when clustering smaller text fragments such as sentences. This is because two sentences may be semantically very similar while containing no common words. This talk will describe a new algorithm for sentence clustering that is based on the notion of semantic vectors. These vectors represent sentences using semantic information derived from a lexical knowledge base constructed to model common human knowledge about words in natural language. Results of applying the algorithm to a variety of documents show that the sentence clusters found by the algorithm are more in accord with the clusters identified by humans than are those of the clustering approaches based only on word co-occurrence.
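To illustrate the problem the abstract describes, here is a minimal Python sketch. It is not the speaker's algorithm: `TOY_LEXICON`, `concept_vector`, and the two example sentences are hypothetical, and the hand-written dictionary only stands in for a real lexical knowledge base. It shows how two sentences with no words in common score zero under bag-of-words cosine similarity, yet score well above zero once each word is mapped to a shared concept.

```python
import math
from collections import Counter

# Hypothetical toy lexicon mapping words to concept labels. It stands in
# for a real lexical knowledge base and is NOT the one used in the talk.
TOY_LEXICON = {
    "car": "vehicle", "automobile": "vehicle",
    "quick": "fast", "rapid": "fast",
    "purchased": "buy", "bought": "buy",
}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def bag_of_words(sentence):
    """Represent a sentence by its raw word counts."""
    return Counter(sentence.lower().split())


def concept_vector(sentence):
    """Represent a sentence by concept counts; unknown words map to themselves."""
    return Counter(TOY_LEXICON.get(w, w) for w in sentence.lower().split())


s1 = "she purchased quick automobile"
s2 = "he bought rapid car"

bow_similarity = cosine(bag_of_words(s1), bag_of_words(s2))
concept_similarity = cosine(concept_vector(s1), concept_vector(s2))

print("bag of words:", bow_similarity)
print("concept-based:", concept_similarity)
```

The word-count vectors share no dimensions, so their cosine is zero, while the concept vectors overlap on three of four dimensions. A clustering method fed the second kind of similarity can group such sentences together; how the talk's semantic vectors are actually built is not specified in the abstract.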
Last Updated: 14 October, 2009