AI lab DeepSeek has released an open version of DeepSeek-R1, its so-called reasoning model, which it claims performs as well as OpenAI’s o1 on certain AI benchmarks.
R1 is available from the AI dev platform Hugging Face under an MIT license, which means it can be used commercially without restrictions. According to DeepSeek, R1 beats o1 on the AIME, MATH-500 and SWE-bench Verified benchmarks. AIME employs other models to evaluate a model’s performance, while MATH-500 is a collection of word problems. SWE-bench Verified, meanwhile, focuses on programming tasks.
Because it is a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer (usually seconds to minutes longer) to arrive at solutions than a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science and math.
R1 contains 671 billion parameters, DeepSeek revealed in a technical report. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer.
Indeed, 671 billion parameters is massive, but DeepSeek also released “distilled” versions of R1 ranging in size from 1.5 billion to 70 billion parameters. The smallest can run on a laptop. The full R1 requires beefier hardware, but it is available through DeepSeek’s API at prices 90%-95% cheaper than OpenAI’s o1.
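To give a rough sense of why the smallest distilled model fits on a laptop while the full R1 does not, the parameter counts above can be turned into a back-of-the-envelope estimate of the memory needed just to hold the weights. The sketch below is illustrative only: the bytes-per-parameter figures are assumed storage precisions, not numbers reported by DeepSeek, and real deployments also need memory for activations and caches.

```python
# Back-of-the-envelope estimate of weight memory from parameter count.
# Parameter counts come from the article; the precisions below are
# ASSUMPTIONS for illustration, not figures reported by DeepSeek.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

MODEL_SIZES = {
    "full R1": 671e9,
    "largest distilled": 70e9,
    "smallest distilled": 1.5e9,
}

# Hypothetical storage precisions (bytes per parameter).
PRECISIONS = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}

def memory_table() -> dict:
    """Map each model name to its estimated weight memory per precision."""
    return {
        name: {label: weight_memory_gb(n, b) for label, b in PRECISIONS.items()}
        for name, n in MODEL_SIZES.items()
    }

if __name__ == "__main__":
    for name, row in memory_table().items():
        cells = ", ".join(f"{label}: {gb:,.1f} GB" for label, gb in row.items())
        print(f"{name}: {cells}")
```

Under these assumptions, the 1.5-billion-parameter model needs only a few gigabytes for its weights, while the 671-billion-parameter model needs hundreds of gigabytes even at low precision.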
Clem Delangue, the CEO of Hugging Face, said in a post on X on Monday that developers on the platform have created more than 500 “derivative” models of R1, which have racked up 2.5 million downloads combined – five times the number of downloads the official R1 has received.
There is a downside to R1. Being a Chinese model, it is subject to benchmarking by China’s internet regulator to ensure that its responses “embody core socialist values.” R1 will not answer questions about Tiananmen Square, for example, or Taiwan’s independence.
Many Chinese AI systems, including other reasoning models, decline to respond to topics that might raise the ire of regulators in the country, such as speculation about the Xi Jinping regime.
R1 arrives days after the outgoing Biden administration proposed harsher export rules and restrictions on AI technologies for Chinese ventures. Companies in China were already barred from buying advanced AI chips, but if the new rules go into effect as written, companies will face stricter caps on both the semiconductor technology and the models needed to bootstrap sophisticated AI systems.
In a policy document last week, OpenAI urged the U.S. government to support the development of American AI, lest Chinese models match or surpass it in capability. In an interview with The Information, OpenAI vice president Chris Lehane singled out High Flyer Capital Management, DeepSeek’s corporate parent, as an organization of particular concern.
So far, at least three Chinese labs – DeepSeek, Alibaba, and Kimi, which is owned by unicorn Moonshot AI – have produced models that they claim rival o1. (Of note, DeepSeek was the first; it announced a preview of R1 in late November.) In a post on X, Dean Ball, an AI researcher at George Mason University, said the trend suggests that Chinese AI labs will continue to be “fast followers.”
“The impressive performance of DeepSeek’s distilled models (…) means that very capable reasoners will continue to proliferate widely and be runnable on local hardware,” Ball wrote, “far from the eyes of any top-down control regime.”
This story was originally published on January 20 and was updated on January 27 with more information.