Researchers at the Indian Institute of Technology (IIT) Roorkee have developed an efficient method for sentiment analysis of Sanskrit text. According to the institute, the proposed technique achieves 87.50 per cent accuracy for machine translation and 92.83 per cent accuracy for sentiment classification.
“Sanskrit is one of the world’s most ancient languages; however, natural language processing tasks such as machine translation and sentiment analysis have not been explored for it to the full potential because of the unavailability of sufficient labeled data,” says IIT Roorkee. The researchers propose a method comprising models for machine translation, translation evaluation, and sentiment analysis.
The machine translations serve as a cross-lingual mapping between the source language (Sanskrit) and the target language (English). The resulting English translations are mature and natural enough to be comparable with original English sentences, the IIT added.
The team behind the research comprises Prof Balasubramanian Raman of the department of computer science and engineering, his PhD student Puneet Kumar, and MSc student Kshitij Pathania of the department of mathematics. The work has been published as a research paper in Applied Intelligence, a reputed peer-reviewed journal.
Elaborating on the sentiment analysis model, Prof Balasubramanian Raman, Department of Computer Science, IIT Roorkee, said, “We have trained our model to predict sentiment scores in the range of positive, neutral, or negative. And the model uses statistics, natural language processing, and machine learning to determine the sentiment with over 90 per cent accuracy.”
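To make the idea of three-class sentiment classification concrete, here is a minimal sketch of a statistical classifier that labels English sentences as positive, neutral, or negative. It is not the IIT Roorkee model: it assumes a simple multinomial Naive Bayes approach with add-one smoothing, and the sentences in TOY_DATA are invented placeholders, not drawn from the Valmiki Ramayana dataset. All function and variable names are hypothetical.

```python
import math
from collections import Counter

LABELS = ("positive", "neutral", "negative")

# Invented English sentences standing in for translated text (assumption).
TOY_DATA = [
    ("the king rejoiced at the happy news", "positive"),
    ("the people praised the noble prince", "positive"),
    ("the army crossed the river at dawn", "neutral"),
    ("the sage walked toward the forest", "neutral"),
    ("the queen wept in bitter grief", "negative"),
    ("the city mourned the cruel loss", "negative"),
]


def tokenize(sentence):
    """Lowercase the sentence and split it on whitespace."""
    return sentence.lower().split()


def train(examples):
    """Count words per label and sentences per label."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    vocab = set()
    for sentence, label in examples:
        tokens = tokenize(sentence)
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(model, sentence):
    """Return the label with the highest smoothed log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in LABELS:
        # Smoothed log prior for the label.
        score = math.log((doc_counts[label] + 1) / (total_docs + len(LABELS)))
        total_words = sum(word_counts[label].values())
        for token in tokenize(sentence):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothed word likelihood.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the given label."""
    correct = sum(predict(model, s) == y for s, y in examples)
    return correct / len(examples)


MODEL = train(TOY_DATA)
print(predict(MODEL, "the prince rejoiced"))
```

An accuracy figure such as those quoted in this article is the kind of number accuracy() returns when applied to held-out labeled sentences; on six toy sentences the value carries no real meaning.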
The dataset used for this research was taken from the Valmiki Ramayana website developed and maintained by researchers at IIT Kanpur. In future work, the researchers plan to exploit the morphological properties of Sanskrit for better classification, using only root words with their respective suffixes and prefixes. They also plan to evaluate whether the morphological richness of Sanskrit is retained when translating to English. In addition, they aim to obtain a model that discerns the context of words in multiple languages and provides word embeddings of lower dimensionality.