OpenAI’s Upcoming Open-Source AI Model: The First ‘Truly-Open’ Language Model?

2025-04-28

OpenAI plans to release its first truly open language model, one of its most important releases since GPT-2.

Sam Altman, CEO at OpenAI. Image credit: Steve Jurvetson via Flickr, CC BY 2.0 license

Breaking New Ground in Open-Source AI

This could prove a serious contender among existing large language models. Departing from its closed-source strategy, OpenAI is preparing to release a new open-source AI reasoning model in early summer. It is set to become the company’s first truly open language model and one of its most significant releases since GPT-2.

Led by Aidan Clark, OpenAI’s VP of research, the development is still in early stages, but the company has ambitious goals. Sources familiar with the project reveal that OpenAI aims to create a best-in-class model that will outperform other open reasoning models on key benchmarks.

“[I personally think we need to] figure out a different open source strategy,” Sam Altman, OpenAI’s CEO, stated during a Reddit Q&A in January. “Not everyone at OpenAI shares this view, and it’s also not our current highest priority […] We will produce better models [going forward], but we will maintain less of a lead than we did in previous years.”

Technical Specifications and User Accessibility

The forthcoming model is being designed as a “text in, text out” system with reasoning capabilities similar to OpenAI’s o-series models. What makes this particularly noteworthy is that it’s being engineered to run on high-end consumer hardware, potentially democratizing access to sophisticated AI capabilities.

A distinctive feature may allow developers to toggle the model’s “reasoning” functionality on or off, similar to capabilities recently introduced by Anthropic and other AI labs. This flexibility could make the model more versatile for various applications and computational environments.

If successful, OpenAI might expand its open-source offerings to include smaller, more specialized models in the future.

Permissive Licensing: Learning from Competitors’ Missteps

In contrast to some competitors, OpenAI is exploring a highly permissive license with minimal usage or commercial restrictions. This approach appears to be informed by criticism of Meta’s Llama and Google’s Gemma, whose licenses some developers say impose burdensome requirements on users.

This permissive strategy could accelerate adoption and innovation around OpenAI’s model, especially among independent developers and smaller organizations that lack the resources to navigate complex licensing requirements.

Safety Remains a Priority

Despite the shift toward openness, OpenAI emphasizes that safety testing will remain rigorous. According to Altman, the model will undergo evaluation using the company’s preparedness framework, with additional precautions taken specifically because the model is expected to be modified after release.

“[B]efore release, we will evaluate this model according [to] our preparedness framework, like we would for any other model,” Altman said in a post on X. “[A]nd we will do extra work given that we know this model will be modified post-release.”

The company plans to release a comprehensive model card—a technical report detailing the results of internal and external benchmarking and safety testing. This transparency is particularly noteworthy given previous criticism from AI ethicists who have accused OpenAI of rushing safety testing and failing to release model cards for other models.

The Competitive Strides in Open-Source AI

OpenAI’s move comes amid increasing pressure from competitors embracing open approaches to AI development. Meta’s Llama family has garnered over one billion downloads since its release, while Chinese AI lab DeepSeek has quickly built a substantial global user base and attracted significant investment.

The success of these open models demonstrates the powerful network effects and community engagement that open-source strategies can generate, potentially explaining OpenAI’s strategic recalibration.

Regional Innovation: Neurotechnology’s Lithuanian LLM

The open-source AI landscape continues to diversify regionally as well. In August 2024, Neurotechnology, a provider of deep learning-based solutions and biometric technologies, released the first open-source large language model customized for the Lithuanian language.

Built upon the transformer-based Llama 2 architecture in its 7-billion and 13-billion parameter versions, this model was pretrained on more than 14 billion Lithuanian language tokens. The company used NVIDIA H100 graphics processing units to accelerate the training process.

“We are proud to contribute our LLM to the open-source community,” said Artūras Nakvosas, Technical Lead of the Natural Language Processing department at Neurotechnology. “By making it publicly available, we aim to encourage others to use it and expand the development of AI applications in Lithuanian.”

Internal benchmarking indicates that Neurotechnology’s model outperforms the default Llama 2 in multiple areas, making it a solid foundation for Lithuanian language AI applications. The model and accompanying datasets are available on the Hugging Face platform, with research papers accessible via the arXiv archive.

This initiative represents an important step toward advancing natural language processing technologies in the Baltic region, with Neurotechnology planning to expand its research across Baltic, Scandinavian, and Eastern European languages.

The Shifting Paradigm in AI Development

OpenAI’s forthcoming open-source model potentially signals a broader shift in the AI industry. As Altman himself acknowledged, the company has been “on the wrong side of history” regarding open-sourcing its technologies.

This recognition comes as more organizations discover that open models can accelerate innovation through community contributions while still maintaining competitive advantages through implementation expertise, specialized datasets, and complementary closed-source offerings.

For developers and organizations relying on AI technologies, this evolution toward more open models suggests a future where access to powerful foundation models becomes more democratic, while competition increasingly centers on fine-tuning, application development, and specialized domain expertise.

What’s In The Future?

As OpenAI prepares its open-source offering, the AI community awaits details about the model’s specific capabilities, hardware requirements, and licensing terms. The early summer release timeline suggests we won’t have to wait long.

If OpenAI succeeds in creating a truly best-in-class open reasoning model with minimal usage restrictions, it could significantly influence the trajectory of AI development and democratize access to sophisticated AI capabilities. This would mark a substantial shift for a company that has primarily pursued a closed-source, API-driven business model until now.

The outcome of this initiative may ultimately determine whether OpenAI can maintain its influential position in the AI market as the industry continues its rapid evolution toward more open, accessible, and distributed innovation.

Sources: TechCrunch, Neurotechnology

Written by Alius Noreika
