India's Copyright Law and AI Training Conflict
Claims that training artificial intelligence models on copyrighted material constitutes 'fair dealing' are increasingly contested. The argument, often imported from U.S. 'fair use' discourse, faces significant hurdles under India's Copyright Act, 1957. The Act requires a strict, purpose-specific inquiry: a use must first fall within one of the explicitly enumerated exceptions before its fairness is even considered.
The Purpose Threshold
Unlike the United States' broad 'fair use' doctrine, India's 'fair dealing' is a narrow defence limited to specific purposes: private or personal use, including research; criticism or review; and the reporting of current events and current affairs. Indian courts have consistently held these categories to be exhaustive. A use must satisfy this initial purpose test before any examination of fairness, such as the extent of copying or commercial intent, can occur. Because AI training relies on extensive, systematic copying of protected works, it often fails this primary threshold.
AI Training as Reproduction
Generative AI development necessitates converting vast amounts of copyrighted content (text, images, music) into numerical data. This copying is deliberate and technically essential to building such models. The works are used not for their expressive content but as statistical inputs for pattern recognition. Under the Copyright Act, the exclusive right to reproduce a work in any material form rests with the copyright owner: Section 14(a)(i) confers it for literary, dramatic and musical works, and Section 14(c) makes parallel provision for artistic works. Without a clear statutory exception, this unauthorised copying is prima facie infringing.
Strained Interpretation of 'Research'
Arguments favouring AI training often point to the 'research' limb of Section 52(1)(a). However, Indian jurisprudence traditionally treats research as human-centred study of a work's content or ideas. AI training reverses this, using works as mere inputs to optimise prediction engines. Interpreting 'research' broadly enough to cover industrial-scale, automated data ingestion for commercial products would effectively create a text and data mining (TDM) exception, a step Parliament has not taken. The EU (Articles 3 and 4 of Directive (EU) 2019/790) and Japan (Article 30-4 of its Copyright Act) enacted specific TDM provisions only after extensive deliberation, which highlights India's legislative gap.
Lack of Expressive Purpose
The other 'fair dealing' purposes (criticism, review, and reporting) are inherently expressive: they involve commentary for public consumption and align with free speech values. The copying involved in AI training is non-expressive; it does not engage with a work's meaning for the purpose of commentary but treats the work as raw material for a functional tool. The concept of 'transformative use', sometimes invoked in these discussions, relates in India primarily to copyright subsistence (originality); it is not an independent determinant of fairness under exceptions such as fair dealing.
Legal Uncertainty and Future Steps
The uncertainty surrounding AI training and copyright is now being tested in court. Asian News International initiated proceedings against OpenAI before the Delhi High Court in November 2024, challenging the use of copyrighted content for AI training and contesting the availability of fair dealing. Policy discussions are also underway on potential legislative action, such as licensing mechanisms or a tailored TDM exception. These efforts acknowledge that existing provisions may not adequately address industrial-scale machine learning, and they reinforce the point that fair dealing cannot be presumed to cover AI training by default.
Stretching current legal provisions beyond their textual limits risks undermining legal certainty and established legislative processes. Promoting AI innovation while protecting creators' rights will likely require targeted legislative reform rather than judicial reinterpretation.