Bringing Creativity, Agility, and Efficiency with Generative AI in Industries 24th Edition | Page 39

Responsible Generative AI
gains. The unmanaged proliferation of AI applications may be adversely affected by a lack of properly trained AI experts. There is a risk that the technology will be misunderstood and misused if hiring and employment practices take shortcuts.
As many commercial organizations, governmental agencies, and militaries jump on the AI bandwagon, there is a clear shortage of AI talent. The demand for AI experts cuts across several industries, not just tech companies. Many companies have yet to adopt AI because there's a shortage of experts with the required skills in the field. Training the labor force on AI has become a focus for many consulting companies. Announcements like “Ernst & Young plans on training its 400,000 employees how to use AI in the workplace” 32 and “Indian IT Giant TCS is spending $1 Billion to train entire staff in AI” are ample evidence that more trained personnel are needed to meet the surge in demand for GenAI adoption.
The resources required to create reasonably large GenAI systems are very high. This means the technology could be controlled by a few powerful organizations with sufficient funds, human resources, and access to the right enabling technology. The syndrome of “Power breeds power” could lead to a few technology giants monopolizing this field, with the unfair influence and hazards that come from insufficient competition. Google’s dominance in the “search” domain and Amazon’s clout in the eCommerce industry are modern-day examples. Monopoly, as in any field, poses at least three risks: killing innovation, price manipulation, and unfair socio-economic influence if the technology is owned or managed by rogue actors.
7.4 DATA REQUIREMENTS
Large Language Models are hungry not only for power but also for large and diverse amounts of data. Very often, LLMs are trained on publicly available internet data, which includes digitized books, news articles, blogs, social media posts, YouTube videos, and so on. The performance of these LLMs depends on how large and diverse the dataset is and on how effectively the model can use it during training.
However, human-created data has its limitations, such as cost, accessibility, privacy, and bias. OpenAI and many other leading companies have started making formal contracts with enterprises in the news, finance, medical, transportation, and supply chain sectors to get access to enterprise data that augments the public data. They have also turned to "synthetic" data 33, that is, data generated by algorithms or models, to train the LLMs.
In summary, the ever-growing hunger for data poses another concern because of the resources needed to collect, store, and process massive amounts of data.
32 https://www.businessinsider.com/ey-ernst-young-consulting-invests-ai-strategy-training-model-tools-2023-9
33 https://www.reddit.com/r/ChatGPT/comments/153wfxk/microsoft_and_openai_test_synthetic_data_to_train/
34 March 2024