
What is DeepSeek, and how is it different from other AIs?
DeepSeek is a Chinese artificial intelligence (AI) company founded by entrepreneur Liang Wenfeng in July 2023. The company specialises in developing large language models (LLMs), yet it is best known for its eponymous chatbot. Released on January 10th, 2025 for iOS and Android phones alongside the DeepSeek-R1 model it was based on, the chatbot’s sudden success has shaken up the industry to such an extent that it may never be the same again.
How is this virtual agent different from others?
On previously used benchmark tests, DeepSeek’s chatbot performs on par with its contemporaries. Where it differs is in how it was developed, with the Chinese company using lower-tier technology than the industry norm.
Take the graphics processing units (GPUs) that were used and how they were used. GPUs are a crucial component in any AI, as they allow for the complex calculations that train models more efficiently. Almost all of the top AI companies in the U.S. used top-tier Nvidia GPUs to create their chatbots, namely the A100 or H100. However, Chinese companies were barred from using these chips by the U.S. government, which imposed export restrictions on the country, citing national security concerns.
This restriction led to the creation of the H800, a China-export version of the H100 with reduced capabilities that is weaker yet more widely available and affordable. It was this chip that DeepSeek used to develop its R1 model. Yet the company managed to turn the limitation to its advantage, extracting more performance out of the less powerful hardware with customised instructions.
How much more efficient is the Chinese chatbot compared to others?
DeepSeek’s decision to work with what it was given and innovate on it, along with other choices that let its AI model learn more efficiently and reduce its computing needs, optimised the company’s operations considerably. According to the company’s research paper, this allowed its chatbot to be trained to the level of industry leaders like Meta’s Llama 3. Directly comparing the two, DeepSeek used 11 times less computing power to train its LLM (2.8 million GPU hours to Llama 3’s 30.8 million), which would later be the basis for its R1 model, on an eighth of the GPUs (2,048 H800s to Llama 3’s 16,384 H100s).
It is this difference in approach to AI development that allowed for DeepSeek’s success. The result is a unique end product, not because it offers its users anything revolutionary, but because it offers developers something revolutionary. Through a cost-effective, less resource-intensive and innovative approach, the Chinese company created a chatbot comparable to its competition without the capital that most of its competitors have.
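The two ratios quoted above can be checked with some quick arithmetic. The short Python sketch below uses only the figures given in this article; the variable and function names are illustrative, and it assumes, as the comparison itself does, that an hour on an H800 and an hour on an H100 can be counted like for like.

```python
# Figures as quoted in the article (GPU hours in millions, GPU counts).
DEEPSEEK_GPU_HOURS_M = 2.8    # DeepSeek's LLM, trained on H800s
LLAMA3_GPU_HOURS_M = 30.8     # Meta's Llama 3, trained on H100s
DEEPSEEK_GPUS = 2_048         # H800 chips
LLAMA3_GPUS = 16_384          # H100 chips


def times_larger(larger: float, smaller: float) -> float:
    """Return how many times bigger `larger` is than `smaller`."""
    return larger / smaller


# Assumption: H800 hours and H100 hours are treated as equivalent units.
compute_ratio = times_larger(LLAMA3_GPU_HOURS_M, DEEPSEEK_GPU_HOURS_M)
gpu_ratio = times_larger(LLAMA3_GPUS, DEEPSEEK_GPUS)

print(f"Llama 3 used about {compute_ratio:.0f} times the GPU hours")
print(f"Llama 3 used {gpu_ratio:.0f} times as many GPUs")
```

Because the H800 is the weaker chip, counting its hours as equal to H100 hours is a simplification rather than a precise measure of computing work done.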
What are the future implications of DeepSeek’s success on the AI Industry?
Within two weeks of its release, DeepSeek saw its eponymous chatbot climb up the charts, surpassing ChatGPT as the most downloaded free app on the Apple iOS App Store. This surge to the top coincided with Nvidia shedding nearly $600 billion in market value, or 17 percent of its stock price, the largest one-day loss for a single company in U.S. stock market history. These two events demonstrate how quickly both AI users and companies have noticed DeepSeek’s impact, with market analyst Ivan Feinseth describing the Chinese company’s creation as “the first shot at what is emerging as a global AI space race.”
However, for all of DeepSeek’s innovation on the developmental end, the chatbot has drawn criticism over its handling of topics censored in China and of requests for instructions on self-harm and dangerous activities. Concerns that the chatbot sends data back to ByteDance (TikTok’s parent company) in China have also been raised, after South Korea’s data protection regulator confirmed communication between the two parties and subsequently banned new downloads of the app. Other countries and several U.S. government branches have since followed suit, showcasing both the desire of these institutions to protect their security and the challenges Chinese chatbots face in gaining global use.
Despite these legitimate concerns and criticisms, DeepSeek has shown through its chatbot that it is possible to develop an AI model without the amount of capital or resources required in the past. This watershed moment has opened the floodgates, effectively democratising AI development for all who are willing to invest in it. Additionally, by making its code open-source, DeepSeek has allowed its influence to be amplified by others who seek to improve on its work. With its innovative approach to development, the Chinese company has been the catalyst for a change in the industry, where a new wave of AI models made by small companies, startups, and individual developers will appear on the market sooner rather than later.