WordNet::Similarity::vector_pairs is a Perl module for computing semantic relatedness of word senses using second order co-occurrence vectors of glosses of the word senses.
use WordNet::QueryData;
use WordNet::Similarity::vector_pairs;

my $wn = WordNet::QueryData->new();
my $vector_pairs = WordNet::Similarity::vector_pairs->new($wn);
my $value = $vector_pairs->getRelatedness("car#n#1", "bus#n#2");
my ($error, $errorString) = $vector_pairs->getError();
die "$errorString\n" if($error);
print "car (sense 1) bus (sense 2) = $value\n";
Schütze (1998) creates what he calls context vectors (second order co-occurrence vectors) of pieces of text for the purpose of Word Sense Discrimination. This idea is adopted by Patwardhan and Pedersen to represent the word senses by second-order co-occurrence vectors of their dictionary (WordNet) definitions. The relatedness of two senses is then computed as the cosine of their representative gloss vectors.
A concept is represented by its own gloss, as well as the glosses of the neighboring senses as specified in the vector-relation.dat file. Each gloss is converted into a second order vector by replacing the words in the gloss with co-occurrence vectors for those words. The overall measure of relatedness between two concepts is determined by taking the pairwise cosines between these expanded glosses. If vector-relation.dat consists of:
  example-example
  glos-glos
  hypo-hypo
then three pairwise cosine measurements are made to determine the relatedness of concepts A and B. The examples found in the glosses of A and B are expanded and measured, then the glosses themselves are expanded and measured, and then the hyponyms of A and B are expanded and measured. Then, the values of these three pairwise measures are summed to create the overall relatedness score.
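The summation described above can be sketched in plain Perl. This is only an illustration under stated assumptions, not the module's actual implementation: the word vectors, the concept texts, the relation pair names, and the helper names gloss_vector, cosine, and relatedness are all hypothetical, and the sketch builds each second order vector by summing the word vectors, which leaves the cosine unchanged compared with averaging them.

```perl
use strict;
use warnings;

# Hypothetical first order co-occurrence vectors (dimension => made-up count).
my %word_vector = (
    vehicle   => { road => 1, wheel => 1 },
    engine    => { fuel => 1, wheel => 1 },
    passenger => { road => 1, seat  => 1 },
    driver    => { road => 1 },
    ticket    => { seat => 1 },
);

# Build a second order vector: add up the vectors of the words in the text.
sub gloss_vector {
    my ($text) = @_;
    my %sum;
    for my $word (split ' ', lc $text) {
        my $vec = $word_vector{$word} or next;    # skip words with no vector
        $sum{$_} += $vec->{$_} for keys %$vec;
    }
    return \%sum;
}

# Euclidean length of a sparse vector stored as a hash reference.
sub norm {
    my ($vec) = @_;
    my $sum = 0;
    $sum += $_ ** 2 for values %$vec;
    return sqrt($sum);
}

# Cosine of two sparse vectors; defined as 0 when either vector is empty.
sub cosine {
    my ($u, $v) = @_;
    my $dot = 0;
    $dot += $u->{$_} * ($v->{$_} // 0) for keys %$u;
    my $len = norm($u) * norm($v);
    return $len ? $dot / $len : 0;
}

# Sum one cosine per relation pair, e.g. "glos-glos" compares the "glos"
# text of concept A with the "glos" text of concept B.
sub relatedness {
    my ($concept_a, $concept_b, @pairs) = @_;
    my $score = 0;
    for my $pair (@pairs) {
        my ($rel_a, $rel_b) = split /-/, $pair;
        $score += cosine(gloss_vector($concept_a->{$rel_a} // ''),
                         gloss_vector($concept_b->{$rel_b} // ''));
    }
    return $score;
}

# Hypothetical texts reached through each relation for concepts A and B.
my %concept_a = (example => 'driver vehicle', glos => 'vehicle engine',    hypo => 'engine');
my %concept_b = (example => 'driver ticket',  glos => 'vehicle passenger', hypo => 'passenger');

my $score = relatedness(\%concept_a, \%concept_b,
                        qw(example-example glos-glos hypo-hypo));
printf "relatedness = %.4f\n", $score;
```

Because each pairwise cosine of non-negative vectors lies between 0 and 1, a sum over three relation pairs lies between 0 and 3, so scores obtained with different relation files are not directly comparable.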
Overrides the initialize method in the parent class (GlossFinder.pm). This method essentially initializes the measure for use.
Parameters: $file -- configuration file.
This method is internally called to determine the extra options specified by this measure (apart from the default options specified in the WordNet::Similarity base class).
Computes the relatedness of two word senses using the Vector Algorithm.
Parameters: two word senses in "word#pos#sense" format.
Returns: Unless a problem occurs, the return value is the relatedness score, which is greater than or equal to 0. If an error occurs, the error level is set to a non-zero value and an error string is created (see the description of getError()).