expert reaction to new AI chatbot DeepSeek

Scientists comment on DeepSeek, a new AI chatbot.

 

Dr Luo Mai, Lecturer in Data Centric Systems at the University of Edinburgh, said:

“The DeepSeek team has achieved a significant milestone by releasing an open-source, highly capable model. At a high level, the model uses a sparse mixture-of-experts (MoE) architecture: to process each input it activates only a small subset of its neurons (the basic components of an AI model) rather than all of them, making it more efficient than fully activated counterparts. Additionally, it incorporates test-time compute, similar to OpenAI's o1-style reasoning, enabling it to tackle challenging reasoning tasks. DeepSeek is the first to fully open-source a model combining these techniques, and it is offered at significantly lower prices than closed-source models. This effort gives researchers and students worldwide the opportunity to explore new avenues of AI research, particularly in settings with limited compute resources.”

“Looking ahead, I expect further developments in the AI system stack to improve the design of sparse MoE architectures. Combining sparsity with test-time compute techniques could amplify their individual benefits, shaping the direction of AI software and hardware design for years to come, while also encouraging greater diversity in the market and reducing the environmental impact of AI. Together, these innovations have the potential to significantly improve AI efficiency.”
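
To illustrate the sparse MoE idea Dr Mai describes, here is a minimal, self-contained PyTorch sketch. It is not DeepSeek's actual architecture; the layer sizes, expert count and top-k value are invented for illustration. A small gating network scores the experts for each token and only the top-k experts run, so most of the model's parameters stay idle on any given input:

```python
# Toy sparse mixture-of-experts (MoE) layer: a router picks top-k experts
# per token, so only a fraction of the parameters is used for each input.
# Illustrative only; this is NOT DeepSeek's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySparseMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)  # router: scores each expert
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        weights, idx = self.gate(x).topk(self.top_k, dim=-1)  # keep top-k experts
        weights = F.softmax(weights, dim=-1)  # normalise the kept scores
        out = torch.zeros_like(x)
        for t in range(x.size(0)):  # run only the selected experts per token
            for k in range(self.top_k):
                out[t] += weights[t, k] * self.experts[int(idx[t, k])](x[t])
        return out

moe = ToySparseMoE()
print(moe(torch.randn(4, 64)).shape)  # torch.Size([4, 64]); 2 of 8 experts ran per token
```

Production MoE implementations batch tokens per expert and add load-balancing terms; the per-token loop here just keeps the routing logic readable.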

 

Prof Anthony G Cohn, FREng, FLSW, Professor of Automated Reasoning, University of Leeds, and Foundation Models Lead at the Alan Turing Institute, said:

“That another Large Language Model (LLM) has been released is not particularly newsworthy – that has been happening very frequently ever since ChatGPT’s release in November 2022. What has generated interest is that this appears to be the most competitive model from outside the USA, and that it has apparently been trained much more cheaply, though the true costs have not been independently confirmed. The present cost of using it is also very low, though that is scheduled to increase nearly fourfold on Feb 8th, and experiments still need to be conducted to see whether its cost of inference is lower than its competitors’ – a cost at least partially determined by the number of tokens generated during its “chain-of-thought” computations, which may dramatically affect the actual and relative cost of different models.
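
The point about chain-of-thought tokens and inference cost can be made concrete with a back-of-the-envelope sketch; all prices and token counts below are hypothetical placeholders, not DeepSeek's or any vendor's actual figures:

```python
# Hypothetical illustration: per-query cost scales with the tokens generated,
# so a model that emits long chains of thought can cost more per query even
# at a lower per-token price. All numbers below are made up.
def cost_per_query(output_tokens: int, usd_per_million_tokens: float) -> float:
    return output_tokens * usd_per_million_tokens / 1_000_000

print(cost_per_query(8_000, 0.5))  # 0.004 USD: cheap tokens, long chain of thought
print(cost_per_query(1_000, 2.0))  # 0.002 USD: pricier tokens, shorter answer
```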

“Additional excitement has been generated by the fact that it is released as an “open-weight” model – i.e. the model can be downloaded and run on one’s own (sufficiently powerful) hardware, rather than having to run on the creator’s servers, as is the case with, for example, OpenAI’s GPT models. This also makes the internal computations of the LLM potentially more open to introspection, which could help with explainability, a very desirable property of an AI system. It should be noted, however, that the benchmark results reported by DeepSeek are from an internal model different to the one released publicly on the Hugging Face platform.
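
For readers who want to try an open-weight model on their own hardware, below is a minimal sketch using the Hugging Face transformers library (with torch and accelerate installed). The model ID is an assumption to be verified on the Hub, and the full-size DeepSeek models need far more memory than a typical workstation provides:

```python
# Minimal local-inference sketch for an open-weight model via Hugging Face
# transformers. The model ID below is an assumed small distilled checkpoint;
# verify it on the Hub before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumption: check the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Briefly explain mixture-of-experts models.", return_tensors="pt")
outputs = model.generate(**inputs.to(model.device), max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```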

“A further reason for the excitement is that this has been done in China, which has been denied access to the latest NVIDIA hardware – hardware presumed to be essential for achieving state-of-the-art performance. However, necessity is said to be the mother of invention, and this lack of the latest hardware seems to have driven creativity in exploiting previous-generation hardware more efficiently – which will no doubt in turn drive Western LLM builders to look for similar improvements in their own computations rather than relying mainly on yet more compute power and yet more data.

“It is important to note that there is no evidence that DeepSeek’s performance on less than state-of-the-art hardware is actually getting us any closer to the holy grail of Artificial General Intelligence (AGI); LLMs are still, by their very nature, subject to the problems of hallucination, unreliability, and lack of meta-cognition – i.e. not knowing what they do and don’t know.

“Moreover, the challenge of enabling commonsense reasoning in LLMs – for example, reasoning about space, time, and theory of mind – is still an unsolved problem, although LLMs do appear to have improved their performance in this regard over time. Preliminary experiments I have conducted suggest that DeepSeek is still not as good as OpenAI’s o1 for some kinds of spatial reasoning.

“All fielded commercial LLMs have some kind of “guard rails” to stop the generation of illegal or potentially harmful material; DeepSeek seems no different and, not surprisingly, it is unable to generate responses which violate Chinese government policies and restrictions. Moreover, the DeepSeek model has been trained from scratch on data which has not been released – it is thus unknown what hidden biases may be latent in the model (as is also the case in almost every other model).

“Finally, I note that the DeepSeek models are still language only, rather than multi-modal – they cannot take speech, image or video inputs, or generate them.  No doubt the future will see such a release, though the computational demands of handling multi-modal data are much greater than when just handling language.”

 

Professor Harin Sellahewa, Professor of Computing, and Dean of Faculty of Computing, Law and Psychology, University of Buckingham, said:

“DeepSeek is to AI what ARM is to computer processors.

“ARM processors were originally designed in the mid-1980s by Steve Furber and Sophie Wilson at Acorn Computers, the company behind the BBC Micro. Today, nearly 99% of smartphones use ARM processors due to their efficiency, reduced heat generation and lower cost compared with rival processors. Some of the reasons for ARM’s success? A small budget, a small team and brilliant minds.

“DeepSeek founder Liang Wenfeng did not have several hundred million pounds to invest in developing the DeepSeek LLM, the AI brain of DeepSeek, at least not that we know of. Restrictions on the sale of powerful computing chips to China meant the DeepSeek team had to find clever and innovative ways to train AI models using limited computational resources. DeepSeek also had the advantage of learning from its predecessors, such as OpenAI’s GPT series, which dates back to 2018 when GPT-1 was introduced.

“Just like the design of ARM processors, restrictions on access to powerful computing chips and a limited budget necessitated the innovation in AI that resulted in DeepSeek. The instant popularity of DeepSeek is due to several factors. It costs a fraction of what it costs to use more established generative AI tools such as OpenAI’s ChatGPT, Google’s Gemini or Anthropic’s Claude. DeepSeek is open source, which means third-party developers have the flexibility to use it to build other applications. What’s more, DeepSeek’s performance in terms of accuracy and computational efficiency is on par with – and sometimes better than – that of its competitors.

“Whilst high performance and low cost are a huge advantage over the likes of ChatGPT, there are questions about DeepSeek’s data collection and privacy policy.

“DeepSeek’s Privacy Policy states they collect user-provided information such as date of birth (where applicable), username, email address and/or telephone number, and password. Moreover, automatically collected data includes keystroke patterns or rhythms, which can be used as a biometric to identify individuals.
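
To illustrate why keystroke rhythm can act as a biometric, the toy sketch below derives the two standard timing features, hold (dwell) time and flight time, from hypothetical press/release timestamps; real systems build a per-user profile from many such samples and match new typing against it:

```python
# Toy keystroke-dynamics features from made-up (key, press_ms, release_ms)
# events: hold time = how long a key is held down; flight time = gap between
# releasing one key and pressing the next. These rhythms are fairly stable
# per person, which is what makes them usable as a biometric identifier.
events = [("d", 0, 95), ("e", 140, 230), ("e", 300, 370), ("p", 430, 540)]

hold_times = [release - press for _, press, release in events]
flight_times = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]

print(hold_times)    # [95, 90, 70, 110]  (milliseconds)
print(flight_times)  # [45, 70, 60]
```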

“The concern is not necessarily the collection of user-provided or automatically collected data per se, because other generative AI applications collect similar data. It is DeepSeek’s statement of its legal obligations and rights, which includes the requirement to “comply with applicable law, legal process or government requests, as consistent with internationally recognised standards”, that concerns me the most. Given that information collected by DeepSeek is stored on servers located in the People’s Republic of China, the personal data of UK users might not be protected by the UK General Data Protection Regulation (GDPR).”

 

Dr Lukasz Piwek, Senior Lecturer in Data Science and member of the Institute for Digital Security and Behaviour, University of Bath, said:

“China’s release of the DeepSeek AI model marks a major disruption in the global AI race. It proves that cutting-edge AI capabilities can be achieved at a fraction of the typical budget – around $6 million compared to the tens of billions spent by U.S. firms. DeepSeek’s R1 model, which is also open-source, was trained with approximately 2,000 specialized Nvidia chips over 55 days, despite strict embargoes on China’s access to advanced AI hardware from the U.S. Reports suggest the development relied on a mix of stockpiled advanced chips paired with more cost-effective, less sophisticated hardware to reduce costs significantly. By contrast, similar U.S. AI models often utilize upwards of 16,000 chips.
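
The chip figures quoted above imply a rough compute gap, sketched below; the 90-day duration assumed for the larger run is purely illustrative, since training times for comparable U.S. models are not public:

```python
# Rough GPU-days comparison using the figures in the quote above.
# The 90-day duration for the larger run is an assumption for illustration.
deepseek_gpu_days = 2_000 * 55     # reported: ~2,000 chips for 55 days
larger_run_gpu_days = 16_000 * 90  # assumed duration for a 16,000-chip run
print(larger_run_gpu_days / deepseek_gpu_days)  # ~13x more GPU-days
```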

“This achievement also highlights the differences in regulatory environments. China’s relatively flexible regulatory approach to advanced technology enables rapid innovation but raises concerns about data privacy, potential misuse, and ethical implications, particularly for an open-source model like DeepSeek. In contrast, U.S. firms operate within stricter frameworks that emphasize oversight and governance, which may limit speed but provide additional safeguards.

“Coming on the heels of the recently announced $500 billion U.S. “Stargate Project” – a collaboration between OpenAI, SoftBank, and Oracle to invest in AI infrastructure over the next four years – DeepSeek underscores a stark contrast in strategies. It challenges assumptions about the massive spending required for AI innovation and serves as a wake-up call for the industry, signalling a broader shift in global competition and collaboration in AI development.”

 

Prof Neil Lawrence, DeepMind Professor of Machine Learning at the Department of Computer Science and Technology, University of Cambridge, said:

“I think the progress is unsurprising, and I think it’s just the tip of the iceberg in terms of the type of innovation we can expect in these models. History shows that big firms struggle to innovate as they scale, and what we’ve seen from many of these big firms is a substitution of compute investment for the intellectual hard work. I’ve been suggesting that this has made the conditions ideal for a “Dreadnought moment”, where current technology is rapidly rendered redundant by new thinking. I don’t think DeepSeek is it, because the innovations deployed are relatively incremental, but it shows that we’re still in the age of the Newcomen engine, that there’s plenty of space for budding James Watts to emerge, and that they are less likely to come from established players.”

 

Comment provided by the SMC pilot for Ireland:

Dr Deepak Padmanabhan, Senior Lecturer, School of Electronics, Electrical Engineering and Computer Science, Queen’s University Belfast, said:

“DeepSeek is causing massive disruption in financial markets. Mainstream narratives contrast the technology with ChatGPT and illustrate the differences in technological aspects. The far longer-reaching effect, however, would not be technological but political, for it could disrupt the paradigms entrenched in the tech industry in substantive ways. There could be several aspects:

“Open-Source Software: DeepSeek’s code to train AI models is open source. This means that anybody can download the code and use it to develop their own AI, a significant step towards the democratisation of AI. The open-source availability of code for an AI that competes well with contemporary commercial models is a significant change. Anyone who downloads and runs the code to develop their own AI would still need access to large datasets and tremendous computational power – but this is nevertheless a massive step forward.

“Computational Power: AI has been noted to pose massive computational requirements over the past decade, leading to corporate dominance in AI research [ https://www.science.org/doi/10.1126/science.ade2420 ]. With massive compute requirements lending themselves to monopolisation of the space, big tech and the government funding landscape (which is in turn influenced by big tech) have shown limited interest in prioritising AI research aimed at reducing computational requirements. DeepSeek’s models have been noted to require far lower computational resources than today’s commercial models. This could ignite new interest in reducing the computational requirements of future AI, with positive effects for the environment.

“No Plans for Commercialisation: It has been highlighted that DeepSeek has no plans for commercialisation [ https://www.chinatalk.media/p/deepseek-ceo-interview-with-chinas ]. This makes it a very interesting development, marking a moment when a player with qualitatively different ideas enters a commercially dominated space. This runs against prevailing trends – OpenAI, for instance, has recently been moving to a fully commercial model from a partly non-profit one. It will be interesting to see how commercial players respond to this challenge.

“In other words, the entry of DeepSeek could potentially hasten a paradigm shift in AI and pose a real challenge to commercial dominance in the sector. It may be a little too far to see this as a pathway towards taking AI into public hands, but that’s the direction of travel that DeepSeek brings to the table.

“Cheaper AI, Pervasive AI: One of the potential first effects would be cheaper consumer AI, and a fall in the profit margins within the tech sector. But it could also accelerate disruption by making AI pervasive, bringing more sectors and more jobs under threat.

“Cautious Optimism: It may be tempting to hope that open-source AI will lead to effects similar to those seen in the 1990s, when the dominance of Microsoft’s Windows was challenged very effectively by open-source Linux. Yet AI is not just software and computational resources – there is data too – so there are further hurdles to overcome. We can view this development with optimism, but we must be cautious. For example, the ethos of the open-source movement was diluted when corporate players entered the system in a substantive way, leading to what has been called ‘corporate dominance in open source ecosystems’ [ https://dl.acm.org/doi/10.1145/3540250.3549117 ]. To develop, sustain and strengthen an open-source ethos within AI would require many more developments in the same direction as DeepSeek.”

Declared interests

Prof Neil Lawrence: No conflicts.

Dr Padmanabhan: None

Dr Lukasz Piwek: No conflicts of interest. 

Professor Harin Sellahewa: No conflicts.

Prof Anthony G Cohn: No conflicts.

Dr Luo Mai: receives funding from EPSRC and recently ARIA to look at the scalability of AI: https://www.aria.org.uk/opportunity-spaces/nature-computes-better/scaling-compute/

 
