Prominent industry leaders and experts on Tuesday stressed the transformative role of local Large Language Models (LLMs) in shaping the future of artificial intelligence (AI) in India.
Speaking at the Global Partnership on Artificial Intelligence (GPAI) Summit 2023 in New Delhi, Vijay Shekhar Sharma, founder and CEO of Paytm, spoke about the potential of LLMs to revolutionise education, particularly in a linguistically diverse country like India. Sharma said, "The AI bots and especially the local language AI bots, which will be built in India, will be one of the phenomenal leapfrogs that we will have in education."
The Paytm CEO focused on the education vertical in his keynote, highlighting the language barrier in India, which has more than 25 primary languages and where English is not the first language.
The GPAI summit brought together industry leaders, policymakers and experts to discuss the pivotal role of AI, with one of the discussions focused on the development and applications of LLMs.
Hurdles For Local LLMs
Aakrit Vaish, co-founder and CEO at Haptik, discussed the transformative impact of AI, particularly in sectors such as healthcare, education and financial services. He said that AI brings about a significant paradigm shift, even more so than the internet did, by addressing issues of accessibility, ease of use and functionality in these sectors.
One specific area of focus for Jio-backed Haptik has been the development of local language conversational AI. Vaish highlighted the challenge of building LLMs for languages with limited resources, where obtaining datasets is the crucial starting point. He noted that while English-based chatbots have been prevalent, cracking local languages, especially low-resource ones, has been a considerable challenge.
"Everyone knows English-based chatbots have been working and (have) existed for a while, but most of us have not been able to crack local languages, and particularly … what we call low resource languages. There is so many languages unique to our country," said Vaish.
Vaish also touched upon the source of data for training Natural Language Processing (NLP) models, mentioning that popular language models often rely on datasets from sources like Wikipedia. However, for languages less represented on platforms like Wikipedia, obtaining sufficient data becomes a hurdle.
Reflecting on the current state of AI development, Vaish expressed concern about the gap between conversation and action. He quipped that AI products need to be developed at a faster pace, rather than attention resting solely on AI conferences and discussions. While acknowledging the importance of responsible development, including ethical considerations, he encouraged more rapid deployment of AI technologies to address real-world challenges.
Meanwhile, Narayanan Vaidyanathan, Head of Policy Development at the Association of Chartered Certified Accountants (ACCA), brought a financial perspective to the discussion. Vaidyanathan said, "We very much come at this from a very simple practical point of view," highlighting the potential of LLMs and AI in areas like finance, consolidation of financial statements and anomaly detection in audit processes.
According to a report from Accenture, the integration of AI is projected to raise India's annual growth rate by 1.3 percentage points by 2035. This optimistic outlook banks on a collaborative scenario in which intelligent machines and humans work together to address the nation's most formidable challenges.
In practical terms, the Accenture forecast translates to an additional contribution of USD 957 billion, equivalent to 15 per cent of the present gross value added (a metric closely aligned with GDP), to India's economy in 2035, compared with a hypothetical scenario in which AI is not integrated.
Google's Policy Concern
David Weller, Senior Director of Emerging Tech at Google, stressed the importance of infrastructure, innovation and data access for LLMs at a GPAI panel discussion. Weller said, "We are partnering with IISc on a project across all 700 plus districts in India to help collect open source audio sets of different Indic languages so that we and others can train on them."
However, Weller also raised concerns about the regulatory landscape and the need for a more innovation-driven approach. "Policy's super important there," he remarked, emphasising that regulatory support on data access and privacy is needed to facilitate the development of LLMs.