Intelligent CIO Africa Issue 77 | Page 65

IT PEOPLE NOT IMMERSED IN AI OR DATA SCIENCE SHOULD NOT BE MAKING DECISIONS ON WHAT STORAGE AND DATA SYSTEMS THE DATA SCIENTISTS NEED.
INDUSTRY WATCH

Enterprises are fast embracing more data-centric business models and, as a result, demand is growing for big data and analytics workloads that use artificial intelligence (AI), machine learning (ML) and deep learning (DL).

We know that good data equates to better business insights. Still, according to a recent IDC whitepaper, Storage Infrastructure Considerations for Artificial Intelligence Deep Learning Training Workloads in Enterprises, outdated storage architectures can ultimately pose challenges in efficiently scaling large AI-driven workloads.
But to understand the strains traditional storage is under, we need to know how AI plays a role in digital transformation and storage decisions. What we know about ML is that it needs to review data inputs, identify patterns and similarities in the data, and then take what it has learnt and propose a decision. DL, on the other hand, is more complex: it uses multi-layered neural networks that can act on their learnings and make decisions without a human.

Both subsets of AI, ML and DL, have different use cases. ML will alert you to problems, whereas DL is an actual learning system used in applications such as natural language processing, virtual assistants and autonomous vehicle systems. The one thing they share is their hunger for data. As IDC accurately determines, AI workloads generally perform better when they leverage larger data sets for training purposes.

IDC states that over 88% of enterprises purchase newer, more modernised storage infrastructure designs for these types of applications.

Data pipelines are most effectively managed when the ingest, transformation, training, inferencing, production and archiving stages are handled in a consolidated processing framework on a single storage system. But the pressures of latency, high data concurrency and multi-tenant management can put such a system at risk. Why? Because it needs to support cloud-native capabilities and integration, as well as petabytes of data.

Because of this, organisations are taking the option of software-defined, scale-out storage infrastructures that support extensive hybrid cloud integration a lot more seriously.
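The consolidated-pipeline idea can be sketched in a few lines of code. This is a minimal illustration only: the SingleStore class and each stage function are hypothetical stand-ins for the ingest, transformation, training, inferencing and archiving stages the article names, not any real product's API.

```python
# Hypothetical sketch: all pipeline stages share one consolidated store,
# rather than copying data between separate storage systems per stage.

class SingleStore:
    """Stand-in for a single consolidated storage system."""
    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data[key]

def ingest(store, raw_records):
    # Stage 1: land raw data in the shared store.
    store.put("raw", raw_records)

def transform(store):
    # Stage 2: clean/normalise in place (here: trim and lowercase).
    store.put("clean", [r.strip().lower() for r in store.get("raw")])

def train(store):
    # Stage 3: placeholder "model" -- a simple frequency count.
    model = {}
    for record in store.get("clean"):
        model[record] = model.get(record, 0) + 1
    store.put("model", model)

def infer(store, query):
    # Stage 4: serve answers from the trained model.
    return store.get("model").get(query, 0)

def archive(store):
    # Stage 5: retain the original raw data alongside everything else.
    store.put("archive", store.get("raw"))

store = SingleStore()
ingest(store, ["  Cat ", "dog", "CAT"])
transform(store)
train(store)
print(infer(store, "cat"))  # -> 2 after normalisation
archive(store)
```

The point of the sketch is that every stage reads from and writes to the same system, which is exactly why latency, concurrency and multi-tenancy pressure concentrate on it.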
IDC also states that most enterprises are experiencing data growth rates of 30% to 40% per year and will soon be managing multi-petabyte storage environments, if they are not already. This led to its subsequent finding that roughly 70% of organisations pursuing digital transformation will modernise their storage in the next two years to support performance, availability, scalability and security.
What we are also learning is that AI applications are not static. The data models created by data scientists are constantly evolving; therefore, the training workflow applied to the data being used must continuously be refined and optimised, or the models will fail. This means that older data cannot always be archived: it may still be needed to support the model's evolution.
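That retention decision can be expressed as a simple rule: retire older training data only once dropping it no longer hurts the evolving model. The function and the evaluation scores below are invented for illustration; real pipelines would base this on actual validation metrics.

```python
# Hypothetical retention check: archive (retire from training) older data
# only when the model scores about as well without it.

def should_archive(score_with_old_data, score_without_old_data, tolerance=0.01):
    """Return True when dropping the older data does not reduce model
    quality by more than `tolerance`."""
    return score_without_old_data >= score_with_old_data - tolerance

print(should_archive(0.91, 0.92))  # True: the old data no longer helps
print(should_archive(0.91, 0.80))  # False: the old data still supports the model
```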
Within this, the AI data pipeline needs different capabilities from the IT infrastructure in which it operates to process data-intensive workloads effectively. IDC shows in its AI DL Training Infrastructure Survey that 62% of enterprises running AI workloads were running them on high-density clusters, which the survey defined as scale-out IT infrastructure leveraging some form of accelerated compute.