The project is among the pioneers of Natural Language Processing for the Roman Urdu script. Its Roman Urdu Dictionary, launched on the Android market by the software house Balytix, provides a thorough list of more than 8000 Roman Urdu words and their translations. It handles word variations and standardizes the lexicon. Users can also add their own words.
Natural language processing of Roman Urdu is a less-researched area of Urdu, with wide applications for digital users in South Asia and potentially for other languages written in the English alphabet. Using an improved linear combination of LCS and Soundex, we develop two new algorithms for clustering word variations within the Urdu language context. The research in this project outperforms existing techniques in the same field.
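To make the idea of a linear combination of LCS and Soundex concrete, here is a minimal Python sketch. It is not the project's actual algorithm. It assumes that LCS means longest common subsequence, uses the standard American Soundex code rather than any Urdu-specific variant, and picks an arbitrary weight `alpha = 0.5`. The function names `lcs_length`, `soundex`, and `similarity` are hypothetical.

```python
# Illustrative sketch only: weights, normalization, and the use of
# standard American Soundex are assumptions, not the project's method.

_SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def soundex(word: str) -> str:
    """Standard four-character American Soundex code ('' for empty input)."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    digits = []
    prev_code = _SOUNDEX_CODES.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev_code:
            digits.append(code)
        prev_code = code  # vowels reset prev_code, so repeats are kept
    return (word[0].upper() + "".join(digits) + "000")[:4]


def similarity(a: str, b: str, alpha: float = 0.5) -> float:
    """Weighted sum of a spelling score (LCS) and a phonetic score (Soundex)."""
    a, b = a.lower(), b.lower()
    if not a or not b:
        return 0.0
    # Spelling score: LCS length normalized by the longer word.
    lcs_score = lcs_length(a, b) / max(len(a), len(b))
    # Phonetic score: fraction of matching positions in the Soundex codes.
    matches = sum(x == y for x, y in zip(soundex(a), soundex(b)))
    soundex_score = matches / 4
    return alpha * lcs_score + (1 - alpha) * soundex_score


if __name__ == "__main__":
    print(similarity("zindagi", "zindagee"))  # 0.875
    print(similarity("zindagi", "kitab"))     # much lower
```

Under these assumptions, spelling variants such as "zindagi" and "zindagee" score close to 1, while unrelated words score near 0. A clustering step could then group words whose pairwise score exceeds a chosen threshold.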
For more information, refer to Ahsan Nabi Khan.