HCBA Lawyer Magazine No. 34, Issue 5 | Page 52

OpenAI’s Fair Use Defense for Use of Copyrighted Material
Solo & Small Firm Section Chairs: David Carter – Carter Injury Law, PA & Dawn Myers – Myers Law, P.A.
OpenAI’s arguments and legal support for why ChatGPT’s use of copyrighted material is fair use.

OpenAI is pushing the boundaries of machine learning and natural language processing in the rapidly evolving landscape of artificial intelligence (AI). Central to its innovation is the use of extensive datasets of copyrighted material. Whether this practice constitutes copyright infringement or can be defended as fair use is the subject of a contentious legal debate.

According to OpenAI, its ChatGPT large language model (LLM) does not store copies of the information from which its models learn. Instead, the process involves training these models on large datasets sourced from the internet and other publicly available texts. Next, the model analyzes the collected data to learn patterns, structures, and nuances of language. Once trained, the model can generate text based on the learned patterns. The output is not a reproduction of specific texts from the training data but rather is generated anew each time based on the input prompt and the model’s training.
OpenAI argues its use of copyrighted content is fair use under United States copyright law, 17 U.S.C. § 107. First, OpenAI argues that its models transform the copyrighted material in a significant way. The training process involves learning patterns from large datasets of copyrighted material, but the output (like the text generated by ChatGPT) is not a direct copy of any specific source. Instead, it’s a new, unique piece of content that reflects a synthesis of information and patterns learned from numerous sources. Second, OpenAI argues that while the training process involves large datasets, no single work is predominantly or exclusively used. The model does not rely on substantial portions of individual copyrighted works but rather on broad patterns learned from massive, diverse corpora. Third, OpenAI argues that its models do not replace or diminish the market for the original works. The outputs from models like ChatGPT are not substitutes for reading the original texts. Instead, they are often used for purposes like education or generating new content, which can be different from the purposes of the original works.
Established precedents support OpenAI’s fair use defense. Courts have found that the use of copyrighted materials by technology innovators in transformative ways can be fair use. In fact, courts have found fair use where defendants copied copyrighted information to reverse-engineer software to learn functional requirements for compatibility purposes or to create a new product. See Sega Enterprises Ltd. v. Accolade, Inc., 977 F.2d 1510 (9th Cir. 1992) (video game development); Sony Computer Ent., Inc. v. Connectix Corp., 203 F.3d 596 (9th Cir. 2000) (video game emulators); Google LLC v. Oracle Am., Inc., 141 S. Ct. 1183 (2021) (interfaces for the Android operating system).
LLMs and other AI technologies are powerful and useful. At the same time, copyright stakeholders are concerned that these LLMs are using their copyrighted material without permission and that they are therefore not being properly compensated. Several cases are currently pending that are likely to address OpenAI’s fair use defense. It will be interesting to see how these cases play out.
Author: Derek Fahey - The Plus IP Firm
Join the Solo & Small Firm Section today in your Member Portal at hillsbar.com.
50 MAY - JUN 2024 | HCBA LAWYER