In the ever-changing global market of artificial intelligence (AI), DeepSeek, a previously little-known Chinese startup, has recently emerged as a serious challenger to established tech players by offering AI solutions at the cutting edge of the market in both technological design and affordability. Read on to learn how this company, founded by AI expert Liang Wenfeng in 2023, and its phenomenal climb are reshaping the global AI scene.
Origins and Foundation
DeepSeek originated as a research initiative within High-Flyer, a Chinese quantitative hedge fund established by Liang Wenfeng in 2015.
High-Flyer, which uses AI to inform its trading decisions, ventured into broader AI research and spun off a dedicated AI lab in 2023. That lab then went its separate way to become an independent company, known since as DeepSeek.
Key facts about the company's origins and background:
- DeepSeek is a Chinese AI startup backed by High-Flyer Capital Management, a quantitative hedge fund that integrates AI into trading strategies.
- Liang Wenfeng, an AI enthusiast, co-founded High-Flyer in 2015 and launched it as a hedge fund in 2019.
- DeepSeek was initially an AI research lab within High-Flyer before becoming an independent company in 2023.
- The company has built its own data center clusters for model training.
Technological Innovations
DeepSeek’s technological advancements have been nothing short of groundbreaking. Key milestones in the company’s development include:
- The company launched its first AI models in November 2023, including DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat.
- DeepSeek-V2, released in early 2024, gained attention for its efficiency and affordability, outperforming competitors and forcing major Chinese AI firms like ByteDance and Alibaba to lower prices.
- DeepSeek-V3, introduced in December 2024, is a 671-billion-parameter Mixture-of-Experts (MoE) model, trained in two months for $5.58 million—significantly less than the budgets of Meta and OpenAI.
- Internal testing suggests that DeepSeek-V3 outperforms both open-source models (like Meta’s Llama) and API-restricted models (like OpenAI’s GPT-4o).
- The company also released DeepSeek R1, a “reasoning” AI model in January 2025, which competes with OpenAI’s o1 reasoning model and is designed to self-check for errors, improving accuracy in fields like math, physics, and science.
In December 2024, the company demonstrated DeepSeek-V3, a Mixture-of-Experts (MoE) language model with 671 billion total parameters, of which 37 billion are activated per token. The model was trained on 14.8 trillion tokens using novel architectural techniques such as Multi-head Latent Attention (MLA) and DeepSeekMoE. Remarkably, training was completed within two months at a cost of $5.58 million, a fraction of what industry giants like OpenAI and Meta Platforms spend to train their own models.
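To see why an MoE model can have 671 billion parameters but activate only 37 billion per token, consider how a router selects a small subset of "expert" sub-networks for each input. The sketch below is a deliberately simplified, illustrative top-k gating layer in plain NumPy; the dimensions, router design, and expert structure are toy assumptions for explanation and are not DeepSeek's actual MLA/DeepSeekMoE implementation.

```python
import numpy as np

# Toy Mixture-of-Experts layer: a router scores all experts per token,
# but only the top_k experts actually run, so only a fraction of the
# total parameters is active for any given token (illustrative only).
rng = np.random.default_rng(0)

d_model, n_experts, top_k = 16, 8, 2

# Each "expert" is a simple weight matrix; the router is a linear layer.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_layer(x):
    """Route token vector x to its top_k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-top_k:]       # indices of the top-k experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                    # softmax over the selected experts
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

token = rng.standard_normal(d_model)
out = moe_layer(token)

# Only top_k of the n_experts weight matrices were used for this token:
active = top_k * d_model * d_model + d_model * n_experts
total = n_experts * d_model * d_model + d_model * n_experts
print(f"active params: {active} of {total}")
```

At DeepSeek-V3's scale the same principle means roughly 37B of 671B parameters do work per token, which is a large part of why training and inference are comparatively cheap.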
The efficiency of DeepSeek’s models is particularly remarkable given the constraints posed by U.S. export bans on advanced hardware. DeepSeek trained its models on Nvidia H800 chips, a less powerful version of the H100 chips that U.S. companies have access to. This engineering resourcefulness highlights DeepSeek’s resilience to external pressure and its commitment to innovation.
The company actively recruits AI PhDs from top Chinese universities and also hires individuals without computer science backgrounds to diversify its AI training data. It also operates with extreme cost-efficiency, continuously working to reduce the compute required to train its models.
Market Disruption and Global Impact in January 2025
The release of DeepSeek’s advanced models sent shockwaves through the global tech industry. Within weeks, the success of DeepSeek’s AI models led to significant market capitalization losses for major tech companies, including Nvidia, Tesla, Google, Amazon, and Microsoft; Nvidia’s stock alone dropped roughly 18%.
This development has also called into question the traditional assumption that only big tech companies with access to enormous financial resources can dominate the AI field.
DeepSeek’s success has also raised questions about the balance of power between countries in AI development. The firm’s ability to create powerful models at a reduced cost suggests that smaller startups can challenge competitors deeply entrenched in the AI market.
Although DeepSeek is technologically advanced, it has also been criticized for how its AI models handle controversial topics. For example, the company’s chatbot has produced contentious outputs on issues such as human rights and Taiwan, illustrating how Chinese government views can be reflected in its responses. These concerns have fueled debate around AI ethics and the need for objective presentation of information.
In addition, U.S. and European regulators are already moving to investigate DeepSeek over national security risks, data privacy issues, and potential intellectual property infringements. This legal scrutiny highlights the delicate balance between technological development and regulatory regimes in the world of AI.
DeepSeek’s business model remains unclear, as it prices services well below market rates and offers many products for free. While not fully open-source, DeepSeek’s models are available under permissive licenses, allowing developers to use them commercially.
Conclusion
DeepSeek’s rapid rise in the field of AI shows how fresh ideas and efficient use of resources can challenge the status quo. As the company continues to hone its models toward peak performance, it also aims to stay at the center of the debate on the future of AI, technological ethics, and international competition.
If you are interested in this topic, we suggest you check our articles:
- Generative AI: Deep Dive on What It Is and How it Can Be Used
- OpenAI Strives for the Most Human-Like Voice
- Avoiding LLM’s “hallucinations” could now be possible
Sources: ArXiv.org, TheVerge, Technology.org, Intelligencer, The Scottish Sun, Computerworld, SCMP, NY Times