Intelligent CIO APAC Issue 18 - Page 77

inevitably means there is variance in the accuracy and quality of data.


Which data set should the AI/ML project use? That’s the wrong question to ask. The right question to ask is ‘How do we ensure there is just one set of this data, shared across all lines of business?’
Ask the question, find the answer, then implement it and repeat for all silos. This will help the current AI/ML project immensely, should create cost-efficiencies, and should support future AI/ML and other projects going forward. It will also benefit other data management processes such as backup, restore and archive.
Derek Cowan, Director of System Engineering APAC at Cohesity
Even in the era of Digital Transformation, Fortune 500 companies take weeks or, most often, months to deliver clean data to their teams, often mandating a carefully co-ordinated effort across multiple teams. Further, this has necessitated ingenious, albeit often insufficient, workarounds such as synthetic data sets or subsets of data.
be useful. This requires several decisions to be made, including which data sets are most important, how far back to go, whether to work on a subset for proof of concept and bring in more data sets later, and whether some poorly managed data can be cleaned enough for use – or not.
Making the right decisions will be vitally important, and this is another area where that trusted third-party view of the AI/ML specialist will be extremely valuable.
Dismantle data silos
It is possible an entirely new data policy will be needed going forward to keep the AI/ML system fed with the right quality data. Not only might new data need to be gathered, but new working practices might also be needed. This means there could be significant implications across the whole organization. For example, it might be important to do a ‘once-and-for-all’ purge of data silos that can often hold data-related projects back.
Recent research has found that many IT teams are spending 40% of their time managing and maintaining data infrastructure, and only 32% of data available to enterprises is put to work, while the remaining 68% goes unleveraged.
In too many organizations there are still different lines of business capturing the same data for their own use. This is cost-inefficient, causes data silos, which lead to data fragmentation and governance issues, and
With Cohesity, there is an answer. Users can instantly provision clones of backup data, files, objects, or entire views and present those clones to support a variety of use cases. Cohesity’s zero-cost clones are extremely efficient and can be instantly created without having to move data.
No AI or ML project has a chance of being successful if there is not an accompanying data strategy.
This is in stark contrast to the inefficiency of the traditional DevTest paradigm, in which full copies of data are created between infrastructure silos. It marks a dramatic shift towards modernization.
By decoupling data from the underlying infrastructure in this way, we enable organizations to automate data delivery and provide data mobility. Zero-cost clones can be spun up in minutes rather than weeks. As a result, customers have reduced their service level agreements (SLAs) for data delivery, accelerated application delivery and migration, and greatly simplified their data preparation.
With the right data flowing in, AI and ML projects can provide dashboards of insights that can be used by the organization in the transformational ways it envisions. Focusing on the data from the start of an AI or ML project can help an organization land on the right side of Gartner’s 50%. However, this focus must occur from the outset.