An Integrated WordNet of Seven Languages
Indradhanush is a mission mode project to be executed by a consortium of nine academic institutions:
- Goa University, Goa
- IIT Bombay, Mumbai
- Indian Statistical Institute, Kolkata
- University of Kashmir, Srinagar
- University of Hyderabad, Hyderabad
- Punjabi University, Patiala
- Thapar University, Patiala
- Dharmsinh Desai University, Nadiad
- Jawaharlal Nehru University, New Delhi
The final deliverable of the project, due at the end of two years, will be the Integrated WordNet, consisting of 20,000 synsets for Bengali, Gujarati, Kashmiri, Konkani, Oriya, Punjabi and Urdu, linked with the Hindi and English WordNets. With it, users will be able to:
- Look up words in their own language to obtain lexico-semantic relations such as synonymy, hypernymy and meronymy.
- Query for cross-lingual lexical information (e.g., Punjabi to Gujarati, Gujarati to Urdu).
- Design and implement complex natural language applications such as machine translation and cross-lingual search.
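The lookups listed above can be illustrated with a minimal sketch. It assumes that linked WordNets share a synset identifier across languages, so that a cross-lingual query reduces to finding the synset of a word and reading off its members in another language. The identifiers (`S1`, `S2`), the toy entries, and the function names are hypothetical illustrations, not the project's actual data format or API.

```python
from typing import Dict, List

# Toy data (assumed for illustration): synset ID -> language code -> member words.
SYNSETS: Dict[str, Dict[str, List[str]]] = {
    "S1": {
        "en": ["water"],
        "hi": ["पानी", "जल"],
        "pa": ["ਪਾਣੀ"],
        "gu": ["પાણી"],
        "ur": ["پانی"],
    },
    "S2": {"en": ["liquid"], "hi": ["द्रव"]},
}

# Relations are stored once, between synset IDs, and so apply to every language.
HYPERNYMS: Dict[str, List[str]] = {"S1": ["S2"]}


def find_synsets(word: str, lang: str) -> List[str]:
    """Return the IDs of all synsets that contain `word` in language `lang`."""
    return [sid for sid, members in SYNSETS.items() if word in members.get(lang, [])]


def cross_lingual_lookup(word: str, source: str, target: str) -> List[str]:
    """Return target-language words that share a synset with `word`."""
    result: List[str] = []
    for sid in find_synsets(word, source):
        for candidate in SYNSETS[sid].get(target, []):
            if candidate not in result:
                result.append(candidate)
    return result


def hypernyms(word: str, lang: str) -> List[str]:
    """Return same-language words from the hypernym synsets of `word`."""
    result: List[str] = []
    for sid in find_synsets(word, lang):
        for parent in HYPERNYMS.get(sid, []):
            result.extend(SYNSETS[parent].get(lang, []))
    return result


if __name__ == "__main__":
    punjabi_word = SYNSETS["S1"]["pa"][0]
    print(cross_lingual_lookup(punjabi_word, "pa", "gu"))  # Punjabi to Gujarati
    print(hypernyms("water", "en"))
```

Because relations are attached to synset identifiers rather than to words, a relation recorded once becomes available in every language that has an entry for the synsets involved.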
The WordNets of these seven languages (Bengali, Gujarati, Kashmiri, Konkani, Oriya, Punjabi and Urdu) will be linked with the Hindi and English WordNets and with one another. This will yield an extremely rich and useful lexical base, which will facilitate:
- Automatic bi-lingual dictionary construction
- Machine translation between these languages and English
- Machine translation between these languages and Hindi
- Machine translation amongst these languages themselves
- Cross-lingual information retrieval, with the query in one of these languages and the documents in English, Hindi or another of these languages
All these tasks require NLP, the engineering of large systems, high-quality linguistics and lexicography, and a sophisticated user interface.
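As an illustration of the first item, automatic bilingual dictionary construction, here is a minimal sketch. It assumes synsets aligned across languages by a shared identifier, pairs every source-language word with the target-language words in the same synset, and skips synsets that are missing in either language. The data, the identifiers, and the function name `build_bilingual_dictionary` are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict
from typing import Dict, List

# Toy aligned synsets (assumed for illustration): ID -> language code -> words.
ALIGNED_SYNSETS: Dict[str, Dict[str, List[str]]] = {
    "S1": {"en": ["water"], "hi": ["पानी", "जल"], "bn": ["জল"]},
    "S2": {"en": ["house", "home"], "hi": ["घर"]},  # no Bengali entry yet
}


def build_bilingual_dictionary(
    synsets: Dict[str, Dict[str, List[str]]], source: str, target: str
) -> Dict[str, List[str]]:
    """Map each source-language word to the target-language words that
    occur in the same synset. Synsets lacking either language are skipped."""
    dictionary: Dict[str, List[str]] = defaultdict(list)
    for members in synsets.values():
        source_words = members.get(source, [])
        target_words = members.get(target, [])
        for s in source_words:
            for t in target_words:
                if t not in dictionary[s]:
                    dictionary[s].append(t)
    return dict(dictionary)


if __name__ == "__main__":
    for word, translations in build_bilingual_dictionary(
        ALIGNED_SYNSETS, "en", "hi"
    ).items():
        print(word, "->", ", ".join(translations))
```

The same function serves any language pair, so a single set of aligned synsets yields a dictionary for every pair of languages it covers.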
Constitution of the Consortium:
|Goa University, Taleigao Plateau, Goa||Consortium Leader||Dr. Jyoti D. Pawar|
|IIT Bombay, Mumbai||Co-Consortium Leader||Dr. Pushpak Bhattacharyya|
|Indian Statistical Institute, Kolkata||Consortium Member||Prof. Probal Dasgupta|
|Dharmsinh Desai University, Nadiad||Consortium Member||Prof. C. K. Bhensdadia|
|University of Kashmir, Srinagar||Consortium Member||Dr. Aadil Amin Kak|
|University of Hyderabad, Hyderabad||Consortium Member||Prof. Panchanan Mohanty|
|Punjabi University, Patiala||Consortium Member||Dr. Suman Preet|
|Thapar University, Patiala||Consortium Member||Dr. R. K. Sharma|
|Jawaharlal Nehru University, New Delhi||Consortium Member||Dr. Rizwanur Rahman|
The First National Level Workshop on Indradhanush, an Integrated WordNet for Bengali, Gujarati, Kashmiri, Konkani, Oriya, Punjabi and Urdu, was held at Dharmsinh Desai University, Nadiad, from 1 to 3 October 2010.