RANLP’2009 Workshop: A Knowledge-Rich Approach to Measuring the Similarity between Bulgarian and Russian Words
Today I presented a paper on measuring modified orthographic similarity between Bulgarian and Russian words at the workshop “Multilingual Resources, Technologies and Evaluation for Central and Eastern European Languages”, held in conjunction with the RANLP’2009 conference. The paper is titled “A Knowledge-Rich Approach to Measuring the Similarity between Bulgarian and Russian Words” and is a small part of my PhD thesis.
We propose a novel knowledge-rich approach to measuring the similarity between a pair of words. The algorithm is tailored to Bulgarian and Russian and takes into account the orthographic and phonetic correspondences between the two Slavic languages: it combines lemmatization, hand-crafted transformation rules, and weighted Levenshtein distance. The experimental results show an 11-point interpolated average precision of 90.58%, which represents a significant improvement over two classic rival approaches.
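To give a feel for the weighted Levenshtein component mentioned above, here is a minimal sketch in Python. It is not the implementation from the paper: the function names, the single substitution weight for the letter pair ъ/о, and the normalization by the longer word's length are all assumptions made for illustration, and the lemmatization and transformation-rule steps are left out entirely.

```python
def weighted_levenshtein(source, target, substitution_cost=None,
                         insert_cost=1.0, delete_cost=1.0):
    """Edit distance in which each letter substitution may have its own cost.

    substitution_cost maps (source_letter, target_letter) pairs to a cost;
    pairs that are not listed fall back to the default cost of 1.0.
    """
    substitution_cost = substitution_cost or {}
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0.0] * cols for _ in range(rows)]

    # Turning a prefix into the empty string (or the reverse) costs
    # one deletion (or insertion) per letter.
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + delete_cost
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + insert_cost

    for i in range(1, rows):
        for j in range(1, cols):
            a, b = source[i - 1], target[j - 1]
            sub = 0.0 if a == b else substitution_cost.get((a, b), 1.0)
            dist[i][j] = min(dist[i - 1][j] + delete_cost,   # delete a
                             dist[i][j - 1] + insert_cost,   # insert b
                             dist[i - 1][j - 1] + sub)       # substitute a -> b
    return dist[-1][-1]


def similarity(source, target, substitution_cost=None):
    """Map the distance to a score where 1.0 means identical words.

    Dividing by the longer word's length is an assumed normalization.
    """
    longest = max(len(source), len(target))
    if longest == 0:
        return 1.0
    distance = weighted_levenshtein(source, target, substitution_cost)
    return 1.0 - distance / longest


# Hypothetical weight: treat Bulgarian "ъ" -> Russian "о" as a cheap substitution.
ILLUSTRATIVE_WEIGHTS = {("ъ", "о"): 0.5}

print(weighted_levenshtein("сън", "сон"))                        # classic cost
print(weighted_levenshtein("сън", "сон", ILLUSTRATIVE_WEIGHTS))  # weighted cost
print(similarity("сън", "сон", ILLUSTRATIVE_WEIGHTS))
```

With the illustrative weight, the Bulgarian "сън" and the Russian "сон" end up closer to each other than under the classic unit-cost distance, which is the general effect that letter-specific weights are meant to achieve.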
Download the article: RANLP2009-Workshop-Nakov-Paskaleva-Nakov-MMEDR-Similarity-Bulgarian-Russian-Words.pdf
Download the presentation: RANLP-2009-Workshop-Nakov-Paskaleva-Nakov-MMEDR-Similarity-Bulgarian-Russian.ppt