I am currently working on WSD systems in the all-words disambiguation setting. SenseLearner [1] participated in the Senseval-3 (2004) English all-words task and achieved an average accuracy of 64.6%. It claims to be a minimally supervised sense tagger that attempts to disambiguate all content words in a text using the senses from WordNet. It uses two resources: WordNet itself and the SemCor corpus.
Semantic Language Model
The paper observes that the single noun/verb/adjective immediately before or after an ambiguous word is, most of the time, sufficient to disambiguate the sense of that word. They develop a Semantic Language Model on this hypothesis. The problem with this model is that it cannot disambiguate words it has not seen in the training corpus, and hence it requires another module for handling unseen words. Four feature models are used (a small extraction sketch follows the list):
Noun Model: The first noun, verb, or adjective before the target noun, within a window of at most five words to the left, and its POS.
Verb Model: The first word before and the first word after the target verb, and its POS.
Adjective Model 1: The first noun after the target adjective, within a window of at most five words.
Adjective Model 2: The first word before and the first word after the target adjective, and its POS.
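To make the feature models concrete, here is a minimal sketch of the noun-model extraction, assuming the sentence arrives as a list of (token, POS) pairs with Penn Treebank tags; the function name and input format are my illustration, not the paper's code.

    def noun_model_features(tagged, target_idx, window=5):
        # First noun, verb, or adjective before the target noun, within a
        # window of at most five words to the left, plus its POS tag.
        lo = max(target_idx - window, 0)
        for i in range(target_idx - 1, lo - 1, -1):
            token, pos = tagged[i]
            if pos.startswith(("NN", "VB", "JJ")):  # Penn Treebank content tags
                return (token.lower(), pos)
        return (None, None)  # nothing found: the model abstains

    tagged = [("He", "PRP"), ("sat", "VBD"), ("by", "IN"),
              ("the", "DT"), ("river", "NN"), ("bank", "NN")]
    print(noun_model_features(tagged, target_idx=5))  # ('river', 'NN')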
The label of each such feature vector consists of the target word and its sense, written as word#sense. For learning, they use TiMBL memory-based learning [2]. During sense prediction, each content word is labeled with a predicted word and sense. If the predicted word matches the word being disambiguated, we assign it the predicted sense; otherwise we pass it on to the next module.
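A minimal sketch of that hand-off; predict and fallback are hypothetical stand-ins for the TiMBL call and the second module, not the paper's actual interfaces.

    def make_label(word, sense):
        # Training label format described above, e.g. "bank#1"
        return f"{word}#{sense}"

    def disambiguate(target, features, predict, fallback):
        predicted = predict(features)           # stand-in for the TiMBL call
        pred_word, _, pred_sense = predicted.partition("#")
        if pred_word == target:                 # the model has seen this word
            return pred_sense
        return fallback(target, features)       # hand off to module 2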
Semantic Generalization using Syntactic Dependencies and a Conceptual Network
Since the first module does not work on unseen words, we need a way to generalize. For this, the authors propose another memory-based learning algorithm.
Training Phase
1. Find all the dependencies in the sentence using a dependency parser (the authors use the Link parser). Add POS and sense information to each dependency pair where applicable.
2. For each noun and verb in a dependency pair, obtain the hypernym tree of the word. Build a vector consisting of the words themselves, their POS, their WordNet sense, and a reference to all the hypernym synsets in WordNet (see the sketch after this list).
3. For each dependency pair, generate positive feature vectors for the senses that appear in the training set, and negative feature vectors for the remaining possible senses.
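As a sketch of step 2, here is how the hypernym-based vector for one member of a dependency pair could be built with NLTK's WordNet interface; NLTK is my substitution for illustration (the paper predates it), but the underlying WordNet data is the same.

    from nltk.corpus import wordnet as wn

    def hypernym_features(word, pos, sense_num):
        # Vector: the word, its POS, the chosen sense number, and all
        # hypernym synsets of that sense (per step 2 above).
        synset = wn.synset(f"{word}.{pos}.{sense_num:02d}")
        hypernyms = {s.name() for path in synset.hypernym_paths() for s in path}
        return [word, pos, sense_num] + sorted(hypernyms)

    # hypernym_features("bank", "n", 1) includes 'slope.n.01',
    # 'object.n.01', 'entity.n.01', and so on up the hypernym tree.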
Testing Phase
1. As in training, extract a feature vector from each dependency pair; then enumerate all possible feature vectors using all combinations of candidate senses (sketched after this list).
2. Label each candidate feature vector as positive or negative using TiMBL.
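A sketch of the enumeration in step 1, again using NLTK's WordNet as a stand-in for the paper's tooling: we build one candidate vector per combination of senses, and the classifier (classify below is hypothetical) labels each one.

    from itertools import product
    from nltk.corpus import wordnet as wn

    def candidate_vectors(head, head_pos, dep, dep_pos):
        # One candidate feature vector per combination of possible senses.
        for h_syn, d_syn in product(wn.synsets(head, head_pos),
                                    wn.synsets(dep, dep_pos)):
            hyps = {s.name() for syn in (h_syn, d_syn)
                    for path in syn.hypernym_paths() for s in path}
            yield [head, head_pos, h_syn.name(),
                   dep, dep_pos, d_syn.name()] + sorted(hyps)

    # for vec in candidate_vectors("bank", "n", "erode", "v"):
    #     label = classify(vec)   # positive/negative, per step 2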
Discussion and Conclusions
1. The memory-based learning algorithm is an interesting choice of classifier; note, though, that TiMBL is essentially a feature-weighted variant of k-NN rather than a departure from it.
2. Also, the feature set in the first model is minimal, providing insight into which features are actually useful. One could experiment with these feature sets and obtain other results.
3. Using dependencies to understand semantic structure has always been interesting and complicated. The paper shows a simple way it can be done.
References:
[1] R. Mihalcea and E. Faruque. 2004. SenseLearner: Minimally supervised word sense disambiguation for all words in open text. In Proceedings of ACL/SIGLEX Senseval-3, Barcelona, Spain, July.
[2] W. Daelemans, J. Zavrel, K. van der Sloot, and A. van den Bosch. 2001. TiMBL: Tilburg Memory-Based Learner.