Multi-modal Language
Citizen Service Broker for Bangalore One
Project Principal Investigator(s): G. Srinivasaraghavan
Building end-to-end task-completion dialog systems is an active area of research in machine learning and natural language processing. Microsoft has recently launched a task-completion challenge (https://github.com/xiul-msr/e2e_dialog_challenge/blob/master/microsoft-dialogue-challenge-slt2018.pdf) with datasets for three specific tasks: movie-ticket booking, restaurant reservation and taxi ordering. Such systems are expected to be at the cutting edge of research involving deep reinforcement learning and natural language understanding and generation.
This project aims to push the boundary of this technology while making it directly relevant to a much larger public good. The proposed aim is to build dialog systems that facilitate citizen interactions with government and public offices for specific tasks. For instance, this might result in replacing or enhancing some of the services currently provided by Bangalore One with a virtual, AI-based broker. We plan to develop the techniques through participation in the above challenge while simultaneously working towards extending them to citizen services. This would involve decoupling four components: Process Knowledge, Generic Common Knowledge, a Language Model, and a dialog agent able to carry out a dialog that completes a task in conformance with a given process and in a particular language, with the process and the language ideally being plug-and-play. Current systems are far from realizing this; our attempt will be to close the gap as much as possible. Apart from the obvious gains in the efficient delivery of citizen services, it is conceivable that this can be extended into full-fledged voice-to-voice dialog systems for such tasks by integrating appropriate ASR/ASG technologies.
The evaluation of the dialog system will be based on metrics similar to the ones proposed for the Microsoft challenge:
1) Success Rate: the fraction of dialogs that finish successfully, i.e. in which the user's task is completed.
2) Average Turns: the average number of turns per dialog.
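The two metrics above can be illustrated with a minimal sketch. It assumes each evaluated dialog is logged with a turn count and a success flag; the `Dialog` record, the function names and the sample values are hypothetical and are not taken from the Microsoft challenge toolkit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dialog:
    """Hypothetical log record for one evaluated dialog."""
    turns: int      # number of turns exchanged in the dialog
    success: bool   # True if the user's task was completed


def success_rate(dialogs: List[Dialog]) -> float:
    """Fraction of dialogs that finished successfully."""
    if not dialogs:
        raise ValueError("need at least one dialog")
    return sum(d.success for d in dialogs) / len(dialogs)


def average_turns(dialogs: List[Dialog]) -> float:
    """Mean number of turns per dialog."""
    if not dialogs:
        raise ValueError("need at least one dialog")
    return sum(d.turns for d in dialogs) / len(dialogs)


# Made-up sample data, for illustration only.
sample = [
    Dialog(turns=6, success=True),
    Dialog(turns=10, success=False),
    Dialog(turns=8, success=True),
    Dialog(turns=12, success=True),
]

print(success_rate(sample))   # 0.75
print(average_turns(sample))  # 9.0
```

A higher success rate together with a lower average number of turns would indicate a dialog agent that completes tasks both reliably and efficiently.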