
Researchers created an open rival to OpenAI's o1 "reasoning" model for under $50

AI researchers at Stanford and the University of Washington were able to train an AI "reasoning" model for under $50 in cloud compute credits, according to a research paper released last Friday.

The model, known as s1, performs similarly to cutting-edge reasoning models, such as OpenAI's o1 and DeepSeek's R1, on tests measuring math and coding abilities. The s1 model is available on GitHub, along with the data and code used to train it.

The team behind s1 said they started with an off-the-shelf base model, then fine-tuned it through distillation, a process for extracting the "reasoning" capabilities from another AI model by training on its answers.

The researchers said s1 is distilled from one of Google's reasoning models, Gemini 2.0 Flash Thinking. Distillation is the same approach Berkeley researchers used to create an AI reasoning model for around $450 last month.

To some, the idea that a few researchers without millions of dollars behind them can still innovate in the AI field is exciting. But s1 raises real questions about the commoditization of AI models.

Where's the moat if someone can closely replicate a multimillion-dollar model with relative pocket change?

Unsurprisingly, the big AI labs aren't happy. OpenAI has accused DeepSeek of improperly harvesting data from its API for the purposes of distillation.

The researchers behind s1 were looking for the simplest approach to achieve strong reasoning performance and "test-time scaling," or allowing an AI model to think more before it answers a question. These were some of the breakthroughs in OpenAI's o1, which DeepSeek and other AI labs have tried to replicate through various techniques.

The s1 paper suggests that reasoning models can be distilled with a relatively small dataset using a process called supervised fine-tuning (SFT), in which an AI model is explicitly instructed to mimic certain behaviors in a dataset.

SFT tends to be cheaper than the large-scale reinforcement learning method DeepSeek employed to train R1, its competitor to OpenAI's o1 model.
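As a concrete illustration of the SFT setup, here is a minimal sketch in plain Python of how one distilled record — a question, the teacher model's reasoning trace, and its final answer — might be flattened into a single training target. The tag names are hypothetical, not the format the s1 authors actually used:

```python
def format_sft_example(question: str, reasoning: str, answer: str) -> str:
    """Flatten one distilled record into a supervised fine-tuning target.

    The student model is trained to reproduce the teacher's reasoning
    trace followed by the final answer. The tags below are illustrative
    placeholders, not the s1 authors' actual format.
    """
    return (
        f"<|question|>{question}\n"
        f"<|thinking|>{reasoning}\n"
        f"<|answer|>{answer}"
    )

example = format_sft_example(
    question="What is 12 * 13?",
    reasoning="12 * 13 = 12 * 10 + 12 * 3 = 120 + 36 = 156.",
    answer="156",
)
```

Because the student only has to imitate text the teacher already produced, no reward signal or rollout machinery is needed — which is why SFT is so much cheaper than reinforcement learning.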

Google offers free access to Gemini 2.0 Flash Thinking, albeit with daily rate limits, via its Google AI Studio platform.

Google's terms, however, prohibit reverse-engineering its models to develop services that compete with the company's own AI offerings. We've reached out to Google for comment.

s1 is based on a small, off-the-shelf AI model from Qwen, Alibaba's AI lab, which is available to download for free. To train s1, the researchers created a dataset of just 1,000 carefully curated questions, paired with answers to those questions as well as the "thinking" process behind each answer, drawn from Google's Gemini 2.0 Flash Thinking.
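To picture what such a dataset might contain, here is a toy sketch of one distilled record plus a trivial curation step that keeps only records with a reasoning trace, capped at 1,000. The field names are hypothetical, and real curation is far more careful than this filter:

```python
# One hypothetical record: a question, the teacher model's reasoning
# trace, and the teacher's final answer.
record = {
    "question": "How many positive divisors does 360 have?",
    "teacher_reasoning": "360 = 2^3 * 3^2 * 5, so the count is "
                         "(3+1) * (2+1) * (1+1) = 24.",
    "teacher_answer": "24",
}

def curate(records, limit=1000):
    """Toy stand-in for careful curation: drop records that lack a
    reasoning trace and keep at most `limit` of the rest."""
    kept = [r for r in records if r.get("teacher_reasoning")]
    return kept[:limit]

dataset = curate([record])
```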

Training s1 took less than 30 minutes using 16 Nvidia H100 GPUs, after which it achieved strong performance on certain AI benchmarks, according to the researchers. Niklas Muennighoff, a Stanford researcher who worked on the project, told TechCrunch that the necessary compute could be rented today for about $20.
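The roughly $20 figure is easy to sanity-check with back-of-the-envelope arithmetic, assuming an illustrative cloud rate of about $2.50 per H100 GPU-hour (actual rates vary by provider):

```python
# Back-of-the-envelope check of the ~$20 training-cost figure.
gpus = 16
hours = 0.5                 # "less than 30 minutes"
rate_per_gpu_hour = 2.50    # assumed illustrative H100 rental rate
cost = gpus * hours * rate_per_gpu_hour
```

At that assumed rate the run comes to $20.00, consistent with the researchers' estimate.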

The researchers used a neat trick to get s1 to double-check its work and extend its "thinking" time: they told it to wait. Adding the word "wait" during s1's reasoning helped the model arrive at slightly more accurate answers, per the paper.
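The trick can be sketched as a decoding loop that suppresses the model's end-of-thinking marker until a minimum thinking budget is met, appending "Wait" instead so the model keeps reasoning. Everything here — the marker string, the stub model standing in for a real language model — is hypothetical:

```python
END_THINK = "<|end_think|>"  # hypothetical end-of-thinking marker

def budget_forced_generate(step_fn, min_think_tokens=8, max_tokens=64):
    """Toy sketch of extending thinking time: when the model tries to
    stop too early, suppress the stop marker and append 'Wait' so it
    keeps reasoning. `step_fn` maps the tokens so far to the next
    token (a stand-in for a real language model's decode step)."""
    tokens = []
    while len(tokens) < max_tokens:
        nxt = step_fn(tokens)
        if nxt == END_THINK:
            if len(tokens) < min_think_tokens:
                tokens.append("Wait")  # force more thinking
                continue
            break  # budget met; allow the model to stop
        tokens.append(nxt)
    return tokens

def stub(tokens):
    """Stub model that tries to stop after three tokens, and emits one
    more reasoning step whenever it was just told to 'Wait'."""
    if len(tokens) >= 3 and tokens[-1] != "Wait":
        return END_THINK
    return "step"

out = budget_forced_generate(stub)
```

In a real decoder the same idea would be implemented by intercepting the stop token during sampling; the loop above only demonstrates the control flow.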

In 2025, Meta, Google, and Microsoft plan to invest hundreds of billions of dollars in AI infrastructure, which will partly go toward training next-generation AI models.

That level of investment may still be necessary to push the envelope of AI innovation. Distillation has proven to be a good method for cheaply re-creating an AI model's capabilities, but it doesn't create AI models vastly better than what's available today.
