BUSINESS + TECH
THE ABILITY TO CO-OPT RATHER THAN COERCE, INFLUENCING THROUGH CULTURE AND INNOVATION RATHER THAN FORCE, HAS GRADUALLY BECOME A CORNERSTONE IN ALL AREAS OF GLOBAL COMPETITION, AND ARTIFICIAL INTELLIGENCE (AI) INNOVATIONS ARE DEFINING THIS SHIFT. BURSTING ONTO THE TECH SCENE AND SHOCKING FINANCIAL MARKETS IN ITS WAKE, DEEPSEEK, A CUTTING-EDGE CHINESE LANGUAGE MODEL, HAS RAPIDLY EMERGED AS A LEADER IN THE WORLDWIDE RACE FOR TECHNOLOGICAL DOMINANCE, DIRECTLY CHALLENGING KEY PLAYERS LIKE THE US-BASED OPENAI MODEL CHATGPT AND PROMPTING A WAKE-UP CALL ACROSS THE ENTIRE INDUSTRY. POLLY HUMPHRIS DELVES INTO THE DIFFERENCES BETWEEN THE TWO, EXPLORING HOW THEY COMPARE AND WHAT THIS MEANS.
DeepSeek and ChatGPT are advanced AI language models – or ‘chatbots’ – that process and generate human-like text. While they share many similarities, they differ in development, architecture, training data, cost-efficiency, performance and innovations. Crucially, DeepSeek became the most downloaded free app in the US just a week after it was launched in January this year, so how does it compare to its much more established but much more expensive US rival, ChatGPT?
DeepSeek is an advanced open-source AI language model that aims to process vast amounts of data and generate accurate, high-quality language outputs within specific domains such as education, coding and research. It uses Natural Language Processing (NLP) – a machine learning technology that gives computers the ability to interpret, manipulate and understand human language – to understand and generate human-like text effectively.
DEEPSEEK’S KEY FEATURES:
Architecture: DeepSeek uses a design called Mixture of Experts (MoE), which means the model has different ‘experts’ (smaller sections within the larger system) that work together to process information efficiently. It has 671 billion total parameters – the building blocks of AI, which help it understand and generate language – with 37 billion active at any time to handle specific tasks.
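For readers who want a feel for how the ‘experts’ idea works, here is a toy Python sketch of MoE-style routing. Every detail in it – the expert count, the random gating network, the top-2 selection – is purely illustrative and not DeepSeek’s actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Mixture of Experts: 8 small "experts", but only the top 2
# (ranked by gating score) do any work for a given input -- mirroring
# how only a fraction of DeepSeek's parameters are active per token.
NUM_EXPERTS, ACTIVE, DIM = 8, 2, 4

experts = [rng.normal(size=(DIM, DIM)) for _ in range(NUM_EXPERTS)]  # expert weights
gate = rng.normal(size=(DIM, NUM_EXPERTS))                           # gating network

def moe_forward(x):
    scores = x @ gate                       # one relevance score per expert
    top = np.argsort(scores)[-ACTIVE:]      # indices of the top-2 experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                # softmax over the chosen experts only
    # The unselected experts are skipped entirely -- that is the efficiency win.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.normal(size=DIM))
print(out.shape)  # a vector the same size as the input
```

The design point is that capacity (all eight experts) and cost per input (only two of them) are decoupled.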
Training Data : DeepSeek was trained on 14.8 trillion pieces of information called tokens . Tokens are parts of text , like words or fragments of words , that the model processes to understand and generate language . This large dataset helps it deliver accurate results .
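As a rough illustration of what a token is, this Python snippet splits a sentence into word- and punctuation-level pieces. Real models use learned sub-word vocabularies (such as byte-pair encoding), so this naive splitter is only a sketch of the idea:

```python
import re

def toy_tokenise(text):
    # Split into runs of word characters, keeping each punctuation
    # mark as its own token -- a crude stand-in for real tokenisation.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenise("DeepSeek was trained on 14.8 trillion tokens.")
print(tokens)
# ['DeepSeek', 'was', 'trained', 'on', '14', '.', '8', 'trillion', 'tokens', '.']
```

Note that even "14.8" becomes three tokens here; real tokenisers make similar, if smarter, fragment-level choices, which is why token counts exceed word counts.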
Cost-Efficiency: One of DeepSeek’s main claims is that it’s extremely resource- and cost-efficient. It completed its training with just 2.78 million hours of computing time, costing £4.5 million to train – over 89 times cheaper than OpenAI’s rumoured £402.8 million budget for the original model.
Performance : DeepSeek produces results similar to some of the best AI models , such as GPT-4 , and excels at understanding context , reasoning through information and generating detailed , high-quality text .
Innovations: DeepSeek includes unique features like a load-balancing method that keeps its performance smooth without needing extra adjustments. It also uses a multi-token prediction approach, which allows it to predict several pieces of information at once, making its responses faster and more accurate.
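The multi-token idea can be sketched in a few lines of Python. The ‘model’ below is a hypothetical stand-in that returns canned tokens; the point is only the loop shape – several tokens generated per step rather than one:

```python
# Sketch of multi-token prediction. A conventional model emits one token
# per step; a multi-token head proposes several at once, so the same
# output needs fewer steps. The "model" here is a canned lookup.
def predict_next(tokens, k):
    # hypothetical stand-in for a model head proposing k tokens at once
    canned = ["a", "b", "c", "d", "e", "f"]
    start = len(tokens)
    return canned[start:start + k]

sequence = []
steps = 0
while len(sequence) < 6:
    sequence.extend(predict_next(sequence, k=3))  # 3 tokens per step, not 1
    steps += 1

print(sequence, steps)  # six tokens generated in two steps instead of six
```

Fewer generation steps for the same output is where the speed claim comes from.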
ChatGPT is an AI language model created by the research organisation OpenAI to generate human-like text and understand context. As with DeepSeek, it uses NLP to respond accurately and help with various professional tasks and personal use cases. Built on the Generative Pre-trained Transformer (GPT) framework, it processes large amounts of data to answer questions, providing detailed responses and effectively supporting both professional and personal projects.
CHATGPT’S KEY FEATURES:
Architecture: ChatGPT’s first incarnation, GPT-3, contained around 175 billion parameters. Its upgrade, GPT-4, which uses more advanced architecture, is estimated to contain around 1 trillion parameters, allowing the model to learn more complex patterns and nuances, enhancing its language understanding and text generation capabilities.
Training Data: ChatGPT was trained on a broad dataset including text from the internet, books and Wikipedia, which enables it to tackle complex queries and provide detailed responses on various topics. GPT-4’s dataset is significantly larger than GPT-3’s, allowing the model to understand language and context more effectively.
Cost-Efficiency: ChatGPT’s training and deployment require significant computational resources. OpenAI trained the model using a supercomputing infrastructure provided by Microsoft Azure, and while OpenAI has not disclosed exact training costs, it’s estimated that training GPT-4 involved millions of GPU hours (units of measurement used to quantify the computational resources consumed during deep learning tasks), resulting in substantial additional operational expenses.
Performance : ChatGPT generates responses that are both coherent and context-relevant , making it usable for tasks like
www.insidekent.co.uk • 183