Indian Language Resources–Text Processing Subcommittee Report

A lot of research has been done on NLP tasks and standards, both internationally and for Indian languages (ILs). Much software has been built around these tasks and is widely used in products. However, different research and product groups have often created different standards for the same problem, which causes difficulties in sharing data, representing information, and so on. In this report, the authors investigate the NLP text processing tasks for which standardization is required and then explore the standards available for ILs or internationally. For this study, they categorize the tasks primarily by their input and output. They also conduct a few case studies based on downstream applications.

