Why in the News?
The Department for Promotion of Industry and Internal Trade (DPIIT) recently published a working paper on AI and copyright issues. The paper has drawn attention as a good start towards resolving the major conflict between AI firms and content producers over the use of data for model training.
Background and Driving Factors of LLM Growth
The rapid progress of Large Language Models (LLMs) in terms of quality, depth of information, and sophistication of reasoning is primarily fuelled by two major factors.
- Technological Advancements: Iterative advancements in applied machine learning techniques, which continuously improve the performance of LLMs, are key drivers.
- Data Access: Growing access to text, data, and multimedia used in training these models is the second factor.
Central Conflict
- AI Firms’ Argument: AI hyperscalers and developers have argued that information on the Internet should be freely usable for such training.
- Content Producers’ Argument: Content producers note that reproduction and similar acts of syndication by any non-AI entity would require a licence fee and the producer’s consent, indicating a need for compensation. This debate pits the interests of AI developers against content producers (news, entertainment, and book publishing).
- Current Litigation Status: Lawsuits pitting AI firms against the publishers of the content they train models on are ongoing, with no uniform judicial thinking having materialised on the subject.
DPIIT Working Paper’s Solution
The DPIIT working paper’s proposal is considered a welcome step toward a solution where content providers are remunerated, without simultaneously creating a system that could put India’s AI ecosystem at a disadvantage.
Proposed Mandatory Licensing Framework
The paper’s solution is centred on a mandatory licensing framework.
- Data Scraping Access: AI data scraping tools would be permitted to scrape public information from the Internet without restriction.
- Non-Profit Collection Body: A non-profit copyright society-like body would be tasked with the collection of payments.
- Basis for Payment: Payments would be collected from AI developers based on the revenue accrued to them through the benefits of training their models on Indian content producers’ data.
Underlying Rationale for Mandatory Licensing
- Practical Shrewdness: The drafting committee reasoned that it is not practical for every content producer to opt out of data scraping and then enforce that choice when AI developers ignore it.
- Data Processing as a Right: The framework is also driven by a belief in data processing as a right for AI models, as they do not usually reproduce content they are trained on, but rather synthesise fresh outputs.
Issues in Implementation
Issues have been raised regarding the methodology for determining compensation within the proposed remuneration system.
- Royalty Calculation Challenge: It is unclear how the royalty amount for each rights holder would be calculated and apportioned.
- Remuneration Disparity: Small publishers who invest significant effort in their work may chafe at receiving as much remuneration per piece as a big media house putting out hundreds of articles a day.
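The working paper does not specify a royalty formula, so any allocation scheme is speculative. As a purely illustrative sketch of one way a central collection body might apportion a royalty pool, the following pro-rata split weights each producer’s scraped works by a per-work effort weight; all names, weights, and figures here are hypothetical:

```python
# Illustrative sketch only: the DPIIT paper does not specify a royalty
# formula. All producer names, counts, and weights are hypothetical.

def allocate_royalties(pool: float, producers: dict) -> dict:
    """Split a collected royalty pool among content producers.

    'producers' maps a producer name to a dict with:
      - 'works': number of works scraped for model training
      - 'weight': a per-work effort/investment weight, so that a small
        publisher's in-depth piece can count for more than a high-volume
        outlet's routine article
    """
    shares = {name: p["works"] * p["weight"] for name, p in producers.items()}
    total = sum(shares.values())
    return {name: round(pool * s / total, 2) for name, s in shares.items()}

payouts = allocate_royalties(
    pool=1_000_000.0,
    producers={
        "big_media_house": {"works": 300, "weight": 1.0},   # high volume
        "small_publisher": {"works": 10, "weight": 12.0},   # high effort per piece
    },
)
```

Even with such a weighting, choosing the weights themselves is exactly the disputed point the paper leaves open.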
Way Forward
- The mandatory licensing mechanism should be operationalized so that data scraping by AI tools is permitted while remuneration for the use of content is institutionalized through a central collection body.
- A non-profit copyright society-like body should be constituted and empowered to collect payments from AI developers on the basis of the revenue benefits derived from the use of Indian content.
- The royalty determination framework should be designed with sensitivity to differences in scale and investment across content producers, so that small publishers are protected from disproportionate disadvantage.
- Provisional regulatory action should be taken without awaiting final judicial determinations, so that market distortions caused by the absence of a licence regime are avoided.
- Government support should be extended to the working paper initiative to enable collaborative policy development involving industry, publishers, and civil society.
- Dispute-resolution mechanisms should be foreseen to handle disagreements over royalty allocation, the scope of permissible scraping, and the enforcement of collection, with options for a specialist tribunal or expedited judicial review.
- Monitoring and impact assessment should be institutionalized so that concentration of remuneration with large players, or chilling effects on independent content creation, can be detected and corrected.
- Stakeholder consultations should be continued to incorporate the dissenting views of the tech industry while protecting the rights and incentives of content producers.
Conclusion
The DPIIT working paper provides a timely and practical approach to integrating AI innovation with the necessity of protecting copyright holders’ interests. By proposing a mandatory licensing framework and a structured compensation mechanism, it takes a crucial step toward the legal and commercial certainty that is vital for both the Indian AI ecosystem and the sustainability of the content production industry.
Artificial Intelligence (AI)
Artificial Intelligence (AI) is a transformative field of computer science dedicated to developing computer systems capable of performing tasks that historically required human intelligence. It is an umbrella term encompassing various technologies and methods that empower machines to learn, reason, perceive, solve problems, and make decisions in a human-like manner. The current era of AI is characterized by explosive growth driven by data, compute power, and algorithmic breakthroughs.

What AI Truly Is
AI is the theory and development of computer systems that can simulate the complex processes of the human mind. The fundamental goal is to create systems that can achieve objectives by taking rational actions based on the information they perceive from their environment.
- Learning: The ability to acquire information and the rules for using that information, primarily through exposure to vast datasets.
- Reasoning and Problem-Solving: Utilizing logic, algorithms, and models to reach conclusions, make predictions, and find optimal solutions to complex problems.
- Perception: The capacity to interpret input from the world, such as visual information (Computer Vision) and auditory data (Speech Recognition).
- Knowledge Representation: How the information and relationships between entities are stored, indexed, and retrieved for effective use by the AI system.
Key Sub-Disciplines and Enabling Technologies
The power of modern AI is built upon several interconnected sub-fields that leverage advanced computational techniques.
| Sub-Discipline | Core Functionality and Definition | Key Techniques | Typical Applications |
| --- | --- | --- | --- |
| Machine Learning (ML) | The fundamental process allowing systems to automatically learn from data and improve performance over time without being explicitly programmed for every possible task. It operates by finding patterns and making predictions. | Supervised Learning, Unsupervised Learning, Reinforcement Learning (RL), Regression, Classification. | Recommendation engines, Predictive analytics, Spam filters, Fraud detection. |
| Deep Learning (DL) | A subset of ML that uses complex, multi-layered Artificial Neural Networks (ANNs). It is adept at learning hierarchical features and representations directly from raw, unstructured data. | Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Transformers, Backpropagation. | High-accuracy image recognition, Voice assistants (speech processing), Foundation Models (like GPT). |
| Natural Language Processing (NLP) | Focuses on the interaction between computers and human languages. It enables machines to understand, interpret, and generate human language (both text and speech). | Natural Language Understanding (NLU), Natural Language Generation (NLG), Tokenization, Sentiment Analysis, Text Summarization. | Chatbots and conversational AI, Machine translation (e.g., Google Translate), Automated content writing, Search engines. |
| Computer Vision (CV) | Aims to give machines the ability to “see” and interpret visual information from the environment, enabling them to analyze and understand digital images and videos. | Object Detection and Tracking, Image Segmentation, Facial Recognition, 3D Reconstruction. | Self-driving cars (identifying obstacles/signs), Quality control in manufacturing, Medical image analysis (X-rays, MRIs). |
| Generative AI (GenAI) | A cutting-edge class of models focused on creating novel, synthetic content (data, text, images, code, video) that is realistic and often indistinguishable from human-created work. | Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Transformer Architectures (used in LLMs). | Artistic image generation (Text-to-Image), Code generation (Copilot), Personalized storytelling, Deepfake creation (a risk). |
| Robotics | The engineering branch concerned with the design, construction, operation, and use of physical agents (robots). AI provides the intelligent control systems necessary for these agents to sense, reason, and interact autonomously with the real world. | Motion Planning Algorithms, Sensor Fusion, Reinforcement Learning (for control), Simultaneous Localization and Mapping (SLAM). | Industrial automation (assembly, welding), Autonomous delivery drones, Service robots (hospital assistance), Space exploration rovers. |
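To make the Machine Learning row concrete, here is a minimal supervised-learning sketch: the “model” simply stores labelled examples and predicts the label of the nearest training point (a 1-nearest-neighbour rule, chosen for brevity, not as a production technique). The data and labels are invented toy values:

```python
# Minimal illustration of supervised learning (classification): the model
# "learns" from labelled examples and predicts the label of the nearest
# training point. All data points and labels are invented toy values.
import math

# Labelled training data: (feature vector, label)
train = [((1.0, 1.0), "spam"), ((1.2, 0.8), "spam"),
         ((5.0, 5.2), "ham"), ((4.8, 5.0), "ham")]

def predict(point):
    # 1-nearest-neighbour: return the label of the closest training example
    nearest = min(train, key=lambda ex: math.dist(point, ex[0]))
    return nearest[1]

print(predict((1.1, 0.9)))  # near the "spam" cluster
print(predict((5.1, 4.9)))  # near the "ham" cluster
```

Real systems replace this lookup with learned parameters (regression weights, tree splits, neural network layers), but the pattern is the same: generalize from labelled data to unseen inputs.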
Historical Evolution of Artificial Intelligence
The evolution of AI is a fascinating journey marked by conceptual breakthroughs and technological constraints, often described in terms of “AI Summers” (periods of optimism and funding) and “AI Winters” (periods of reduced funding and disillusionment).
| Era | Key Milestones and Concepts |
| --- | --- |
| 1950s | Turing Test (Alan Turing, 1950), The Dartmouth Workshop (1956) where the term “Artificial Intelligence” was coined by John McCarthy. Focus on Symbolic AI (logic and problem-solving). |
| 1960s – 1970s | Development of Expert Systems (knowledge-based programs for decision-making). The first AI Winter due to the difficulty of scaling early systems and limitations of computing power. |
| 1980s – 1990s | Revival with focus on Machine Learning techniques like backpropagation and Neural Networks. AI systems begin to be applied commercially (e.g., in logistics). |
| 2000s – Present | Massive data availability (“Big Data”), GPU computing power, and algorithmic breakthroughs lead to the current AI Summer. The rise of Deep Learning (e.g., AlexNet, AlphaGo) and Generative AI (e.g., ChatGPT, Midjourney). |
Constitutional References to AI
India’s Constitution provides indirect safeguards for AI integration, emphasizing human rights over technological overrides. Chief Justice Sharad Bobde stressed that AI in the judiciary must align with constitutional rights, guided by ethics and human oversight.
- Article 19(1)(g) and 19(6): Balance right to profession with reasonable restrictions against job displacement from AI automation.
- Article 21: Protects privacy and dignity, relevant to AI surveillance and data use.
- Supreme Court initiatives: AI committee formulates ethical guidelines for legal processes.
Significance Across Major Sectors: AI’s Transformative Impact
Artificial Intelligence (AI)’s ability to automate complex tasks, analyze massive datasets, and generate highly accurate predictions is fundamentally redefining sector productivity, efficiency, and innovation on a global scale.
- Healthcare and Life Sciences: AI is a powerful tool for accelerating drug discovery by simulating molecular interactions and predicting compound efficacy.
- It enables personalized treatment planning tailored to individual genetic and clinical data and achieves high-accuracy medical image analysis (e.g., detecting early signs of dementia from EEG signals or identifying tumors in radiology scans), dramatically improving diagnostic speed and precision.
- Finance and Banking: AI is essential for ensuring security and optimizing capital allocation.
- It drives real-time fraud detection by identifying subtle anomalies in transactional data streams, facilitates high-frequency algorithmic trading for optimal market execution, offers personalized customer service through AI-powered chatbots, and performs sophisticated risk assessment for lending and insurance underwriting.
- Manufacturing and Internet of Things (IoT): AI maximizes operational efficiency and reduces waste.
- It is critical for optimizing supply chain logistics (predicting demand and managing inventory), implementing predictive maintenance for machinery (forecasting failure before it occurs), and improving quality control through computer vision systems that detect defects instantly.
- Education and EdTech: AI is transforming learning through personalized learning paths that adapt content difficulty and pace to individual student needs.
- It automates administrative tasks like grading and feedback, and provides powerful data analytics to teachers and institutions to identify and intervene with at-risk students promptly.
- Retail and E-commerce: AI underpins the modern consumer experience. It powers hyper-personalized recommendation engines, optimizes pricing and promotions based on real-time demand elasticity, manages inventory forecasting with high accuracy, and utilizes computer vision for loss prevention in physical stores.
- Energy and Utilities: AI plays a crucial role in managing complex energy grids and promoting sustainability.
- It optimizes smart grid management by predicting energy demand and supply fluctuations, improves the efficiency of renewable resources (e.g., maximizing wind turbine placement/operation), and detects infrastructure failures through sensory data analysis.
- Transportation and Logistics: AI is foundational to the future of mobility. It is vital for autonomous driving systems (perception, decision-making, and control), optimizing freight routes to reduce delivery times and fuel costs, and enhancing traffic flow management in smart cities through real-time analysis of vehicle and pedestrian movement.
- Media and Entertainment: AI is profoundly impacting content creation and distribution.
- It is used for content recommendation (e.g., streaming services), generating synthetic media (Generative AI for images, music, and voiceovers), optimizing advertising placement, and automating the subtitling and localization of content for global audiences.
- Public Sector and Governance: AI is being leveraged to improve government efficiency and citizen services.
- Applications include sophisticated predictive policing models (focused on resource allocation), streamlining bureaucratic processes (e.g., processing permits and claims), and improving the efficiency of public service delivery through data-driven policy analysis.
Latest Developments: The Generative AI Leap
The most notable recent development is the rise of Generative AI and Foundation Models, which have completely changed the landscape of AI capabilities.
- Foundation Models (e.g., LLMs): These are extremely large Deep Learning models (like GPT-5, Gemini 2.5, Claude 3) trained on massive, broad, and unlabeled datasets. They serve as a foundation that can be easily adapted (tuned) for a wide array of downstream tasks, demonstrating emergent reasoning and problem-solving abilities.
- Multimodal AI Systems: Modern AI models are increasingly multimodal, meaning they can seamlessly process and generate content across different data types—for example, accepting a text prompt and generating a photorealistic video, or interpreting both an image and an accompanying text description to answer a complex question.
- Agentic AI: A significant trend where AI models are evolving into autonomous agents. These agents can:
- Plan and Reason: Break down a complex, high-level goal into a series of smaller, executable steps.
- Execute Actions: Utilize external tools, surf the web, or interact with other software systems (APIs) to achieve the goal without continuous human intervention.
- Self-Correct: Evaluate the results of their actions and automatically adjust the remaining plan.
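The plan, execute, self-correct cycle above can be sketched schematically. The planner, tool registry, and checker below are toy stand-ins, not a real LLM or agent framework; in practice the planning and checking steps would be model calls and the tools would be web or API actions:

```python
# Schematic sketch of the agentic loop described above (plan -> execute ->
# self-correct). The planner, tools, and checker are toy stand-ins.

def run_agent(goal, plan, tools, check, max_steps=5):
    """Generic agent loop: plan steps, run tools, re-plan until the check passes."""
    steps = plan(goal, feedback=None)              # Plan: decompose the goal
    for _ in range(max_steps):
        results = [tools[name](arg) for name, arg in steps]  # Execute actions
        ok, feedback = check(goal, results)                  # Evaluate outcome
        if ok:
            return results
        steps = plan(goal, feedback)                         # Self-correct: revise plan
    raise RuntimeError("goal not achieved within the step budget")

# Toy example: reach a numeric target by repeatedly doubling a value.
tools = {"double": lambda x: x * 2}

def plan(goal, feedback):
    start = feedback if feedback is not None else 1
    return [("double", start)]

def check(goal, results):
    value = results[-1]
    return (value >= goal, value)   # feedback = how far the agent got

result = run_agent(16, plan, tools, check)
```

The essential property is the feedback edge: the checker’s output flows back into the planner, which is what distinguishes an agent from a single one-shot model call.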
Challenges, Constraints, and Ethical Frameworks
The rapid advancement of AI presents critical challenges that must be addressed to ensure responsible and equitable deployment.
Technical and Operational Constraints
- Computational and Data Demands: Training cutting-edge models requires massive, expensive computational resources (GPUs/TPUs) and immense volumes of high-quality, clean, and representative data, creating a barrier for smaller players.
- Model Explainability (Black Box Problem): Many Deep Learning models lack transparency in their decision-making process, making it difficult to understand or debug why a specific decision was reached. This is a severe constraint in high-stakes fields like judicial or medical applications.
- Robustness and Generalization: Current AI models can be vulnerable to minor, adversarial changes in input data and often struggle to generalize their learning effectively to entirely new, unseen situations (lacking common sense reasoning).
Societal and Ethical Challenges
- Bias and Discrimination: AI systems inevitably inherit and can amplify biases present in their training data (e.g., historical gender or racial disparities), leading to discriminatory outcomes in hiring, lending, or criminal justice.
- Accountability and Misuse: The difficulty in assigning responsibility when an AI system causes harm (e.g., in autonomous vehicles or medical diagnosis) complicates legal accountability. The potential for malicious use, such as generating highly realistic deepfakes for misinformation campaigns, is a growing global security concern.
- Job Displacement and Inequality: Automation driven by AI is expected to displace routine jobs, necessitating urgent, large-scale re-skilling efforts to prevent the widening of economic inequality. (~47% of jobs automatable – Oxford study)
Intellectual Property (IP) and Copyright Issues
The rapid rise of Generative AI has precipitated a global crisis in Intellectual Property law, primarily centered on two issues: the input (training data) and the output (AI-generated content).
Key Developments:
- Almost every major AI model (GPT, Claude, Gemini, Llama, Grok, Midjourney, etc.) was trained on massive web-scraped datasets containing books, news articles, images, music, and code protected by copyright.
- No explicit permission was obtained from rights holders in most cases.
- Landmark ongoing lawsuits (2023–2025):
- The New York Times vs. OpenAI & Microsoft (alleging verbatim reproduction)
- Authors Guild vs. OpenAI and Anthropic
- Getty Images vs. Stability AI
- Universal Music, Sony vs. Suno & Udio (AI music generators)
- Programmers vs. GitHub Copilot (alleged code laundering)
Current Legal Positions:
- USA: Fair use defense under scrutiny; cases pending in federal courts
- EU: AI Act requires transparency on training data; prohibits commercial use of scraped personal data
- Japan: Explicitly allows use of copyrighted data for AI training (2024 law)
- India: No clear law; falls under Copyright Act 1957 Section 52 (fair dealing) — ambiguous
Global and Indian Best Practices and Government Policies
Governments are shifting from a purely promotional stance to one that emphasizes both innovation and regulation to manage the technology’s risks effectively.
Global Best Practices and Regulatory Approaches
- European Union (EU) – The AI Act: The EU has pioneered a risk-based regulatory framework, categorizing AI systems into four tiers (Unacceptable Risk, High Risk, Limited Risk, Minimal Risk) and imposing strict transparency and safety obligations, especially for High-Risk AI systems (e.g., in employment, critical infrastructure).
- United States (US): The approach is generally less prescriptive, focusing on Executive Orders to mandate safety and security standards for powerful frontier models and leveraging existing laws (e.g., civil rights, consumer protection) to address AI harms.
- OECD AI Principles: A non-binding global standard promoting AI that is inclusive, transparent, accountable, and safe, guiding the development of national strategies.
India’s Policy and Governance Framework
- IndiaAI Mission: The government has pledged significant investment to boost core AI research and development, following the philosophy of “AI for All”—focusing on leveraging AI for social impact and inclusive growth across sectors like healthcare, education, and agriculture.
- Regulatory Stance: India’s approach is currently moving towards a sector-specific, “light-touch” regulation, favoring innovation over rigid, overarching laws, though this is evolving.
- The government utilizes existing laws like the Digital Personal Data Protection Act (DPDP Act), 2023, and the Information Technology Act to address AI-related risks.
- India’s IP/Copyright Policy (DPIIT Proposal): A government-constituted panel has recently proposed a hybrid model for the use of copyrighted works in AI training.
- This model suggests a mandatory licensing regime where rights holders would receive statutory royalty payments through a centralized, government-designated non-profit entity, but they would not have the option to withhold their works from being used for AI training.
- This aims to ensure compensation while simplifying access to data for AI developers.
Notable Indian Achievements
- Bhashini – Real-time translation across 22 Indian languages
- Kisan e-Mitra – AI chatbot for farmers
- Sarvam AI, Krutrim (India’s first AI unicorn), CoRover, Gnani.ai
- Development of OpenHathi (first Hindi-English LLM)
Way Forward
The upcoming years represent the decisive window for shaping the future of Artificial Intelligence. The following recommendations are framed as clear, actionable, and prioritized strategies at global and India-specific levels.
A. Global-Level Strategic Recommendations
- Establish an International AI Governance Framework
- Create a United Nations-affiliated International AI Agency (similar to IAEA for nuclear) with binding authority on frontier model safety and dual-use risks.
- Mandate third-party red-teaming and public safety reports for all models exceeding 10²⁶ FLOPs before deployment.
- Enforce Mandatory Transparency on Training Data and Copyright
- Require every foundation model developer to publish a Model Card + Dataset Card disclosing sources and licensing status of training data.
- Develop a Global AI Copyright Registry where creators can opt-in or opt-out their works from AI training (with enforceable penalties for non-compliance).
- Introduce an internationally accepted Text and Data Mining (TDM) Exception with a statutory licensing and remuneration mechanism for creators.
- Ban Lethal Autonomous Weapon Systems (LAWS)
- Fast-track a Global Treaty on Meaningful Human Control over weapon systems (building on the 2023 REAIM Summit commitments).
- Create an AI Safety Verification Standard
- Standardize dangerous capability evaluations (biological, cyber, and self-proliferation risks), led by institutions like the UK AI Safety Institute and the US NIST.
- Launch a Global AI Talent & Compute Equity Fund
- Pool public and private resources to provide developing nations access to sovereign compute and open-weight frontier models on non-commercial terms.
B. India-Specific Strategic Recommendations
- Fully Operationalize the IndiaAI Mission by 2027
- Achieve 1 lakh GPUs of sovereign public compute capacity through phased public-private partnerships.
- Launch the India Datasets Platform with high-quality, consented, and annotated datasets in all 22 scheduled languages by 2026.
- Develop Sovereign Indian Foundation Models
- Release fully open-source multilingual LLMs (at least 70B–405B parameters) trained exclusively on licensed or public-domain Indian data by 2028.
- Prioritize BhashaGPT series covering Hindi, Tamil, Telugu, Bengali, and other regional languages with cultural grounding.
- Enact a Balanced AI Copyright Law by 2026
- Introduce a progressive Text and Data Mining exception under the Copyright Act that:
- Permits non-commercial research use freely
- Mandates collective licensing and revenue sharing with creators for commercial training
- Establishes an Indian AI Content Registry for opt-out and remuneration
- Build a World-Class AI Talent Pipeline
- Scale FutureSkills Prime and YuvaAI programs to train one million certified AI professionals by 2030.
- Establish 10 National AI Research Universities in collaboration with IITs, IISc, and international partners.
- Position India as the Global Leader in Ethical & Inclusive AI
- Host the Global South AI Summit annually starting 2026.
- Lead the creation of an Inclusive AI Charter focusing on multilingualism, affordability, and protection of marginalized communities.
- Implement Risk-Based AI Regulation by 2027
- Adopt a light-touch, innovation-friendly version of the EU AI Act tailored for India.
- Create an independent AI Safety Board of India under MeitY with statutory powers for high-risk system oversight.
- Leverage AI for National Development Goals
- Deploy AI-powered solutions at scale in healthcare (Ayushman Bharat), education (DIKSHA 2.0), agriculture (Kisan-eMitra), and justice delivery (SUPACE evolution) by 2030.
Conclusion
- Artificial Intelligence is not merely an incremental technology; it is a fundamental shift in human capability.
- While the current status is characterized by breakthroughs in Generative AI and Agentic systems and significant market growth, sustained progress depends entirely on navigating the profound challenges of algorithmic bias, transparency, and the revolutionary overhaul of Intellectual Property law.
- By establishing proactive, balanced governance—as exemplified by India’s focused policy and the EU’s risk-based framework—the global community can successfully ensure that the power of AI is leveraged responsibly for the benefit of all humanity.