Arabic Text Categorization: A Comparative Study of Different Representation Modes
Abstract: The quantity of information accessible on the Internet is phenomenal, and its categorization remains one of the most important problems. Much current work rightly focuses on English, since it is the dominant language of the Web. However, the Web is becoming more multilingual every day, so a need arises for other languages, and it is especially pressing for Arabic. Our research addresses the categorization of Arabic texts; its originality lies in the use of a conceptual representation of the text. For this we use Arabic WordNet as a lexical and semantic resource. To assess its effect, we include it in a comparative study with the other usual modes of representation (bag of words and N-grams), and we use the K-Nearest Neighbors (K-NN) learning scheme with different similarity measures.
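To make the compared setup concrete, the following is a minimal sketch of two of the representation modes (bag of words and character N-grams) fed to a K-NN classifier with interchangeable similarity measures. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy training texts, the choice of cosine and Jaccard as the similarity measures, and the trigram size are all hypothetical, and the concept-based representation built on Arabic WordNet is not shown.

```python
from collections import Counter
import math

def bag_of_words(text):
    """Represent a text as whitespace-token counts (no stemming; an assumption)."""
    return Counter(text.split())

def char_ngrams(text, n=3):
    """Represent a text as counts of character n-grams (n=3 is an assumed default)."""
    s = text.replace(" ", "_")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def jaccard(a, b):
    """Jaccard similarity between the feature sets of two vectors."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def knn_classify(doc, training, represent, similarity, k=3):
    """Label `doc` by majority vote among its k most similar training texts."""
    query = represent(doc)
    scored = sorted(
        ((similarity(query, represent(text)), label) for text, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy corpus: two sport texts and two economy texts.
TRAINING = [
    ("كرة القدم مباراة فريق", "sport"),
    ("فريق لاعب مباراة هدف", "sport"),
    ("بنك سوق اقتصاد", "economy"),
    ("سوق اسهم بنك استثمار", "economy"),
]

QUERY = "مباراة فريق هدف"
label_bow = knn_classify(QUERY, TRAINING, bag_of_words, cosine, k=3)
label_ngram = knn_classify(QUERY, TRAINING, char_ngrams, jaccard, k=3)
```

Because the representation and the similarity measure are passed in as functions, the same classifier can be rerun for each combination, which mirrors the shape of a comparative study. A concept-based mode would plug in as another `represent` function that maps words to concept identifiers before counting.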