Hugging Face平台惊现100多个恶意AI/ML模型

admin 2024年3月6日07:52:29评论12 views字数 5202阅读17分20秒阅读模式
Hugging Face平台惊现100多个恶意AI/ML模型

As many as 100 malicious artificial intelligence (AI)/machine learning (ML) models have been discovered in the Hugging Face platform.

在Hugging Face平台上发现了多达100个恶意人工智能(AI)/机器学习(ML)模型。

These include instances where loading a pickle file leads to code execution, software supply chain security firm JFrog said.


"The model's payload grants the attacker a shell on the compromised machine, enabling them to gain full control over victims' machines through what is commonly referred to as a 'backdoor,'" senior security researcher David Cohen said.

高级安全研究员David Cohen表示:“该模型的有效载荷使攻击者在受损计算机上获得shell,从而使他们能够通过常被称为‘后门’的方式完全控制受害者的计算机。”

"This silent infiltration could potentially grant access to critical internal systems and pave the way for large-scale data breaches or even corporate espionage, impacting not just individual users but potentially entire organizations across the globe, all while leaving victims utterly unaware of their compromised state."


Specifically, the rogue model initiates a reverse shell connection to 210.117.212[.]93, an IP address that belongs to the Korea Research Environment Open Network (KREONET). Other repositories bearing the same payload have been observed connecting to other IP addresses.


In one case, the authors of the model urged users not to download it, raising the possibility that the publication may be the work of researchers or AI practitioners.


"However, a fundamental principle in security research is refraining from publishing real working exploits or malicious code," JFrog said. "This principle was breached when the malicious code attempted to connect back to a genuine IP address."


Hugging Face平台惊现100多个恶意AI/ML模型

The findings once again underscore the threat lurking within open-source repositories, which could be poisoned for nefarious activities.


From Supply Chain Risks to Zero-click Worms


They also come as researchers have devised efficient ways to generate prompts that can be used to elicit harmful responses from large-language models (LLMs) using a technique called beam search-based adversarial attack (BEAST).


In a related development, security researchers have developed what's known as a generative AI worm called Morris II that's capable of stealing data and spreading malware through multiple systems.

在相关发展中,安全研究人员开发了一种名为Morris II的生成式AI蠕虫,能够通过多个系统窃取数据和传播恶意软件。

Morris II, a twist on one of the oldest computer worms, leverages adversarial self-replicating prompts encoded into inputs such as images and text that, when processed by GenAI models, can trigger them to "replicate the input as output (replication) and engage in malicious activities (payload)," security researchers Stav Cohen, Ron Bitton, and Ben Nassi said.

Morris II,对最古老的计算机蠕虫之一的改编,利用对抗性自我复制提示编码到输入中,比如图像和文本,当由GenAI模型处理时,可以触发它们“将输入复制为输出(复制)并参与恶意活动(有效载荷)”,安全研究人员Stav Cohen,Ron Bitton和Ben Nassi说。

Even more troublingly, the models can be weaponized to deliver malicious inputs to new applications by exploiting the connectivity within the generative AI ecosystem.


Hugging Face平台惊现100多个恶意AI/ML模型

The attack technique, dubbed ComPromptMized, shares similarities with traditional approaches like buffer overflows and SQL injections owing to the fact that it embeds the code inside a query and data into regions known to hold executable code.


ComPromptMized impacts applications whose execution flow is reliant on the output of a generative AI service as well as those that use retrieval augmented generation (RAG), which combines text generation models with an information retrieval component to enrich query responses.


The study is not the first, nor will it be the last, to explore the idea of prompt injection as a way to attack LLMs and trick them into performing unintended actions.


Previously, academics have demonstrated attacks that use images and audio recordings to inject invisible "adversarial perturbations" into multi-modal LLMs that cause the model to output attacker-chosen text or instructions.


"The attacker may lure the victim to a webpage with an interesting image or send an email with an audio clip," Nassi, along with Eugene Bagdasaryan, Tsung-Yin Hsieh, and Vitaly Shmatikov, said in a paper published late last year.

“攻击者可以通过一个有趣的图像引诱受害者访问网页或发送一个带有音频剪辑的电子邮件。”Nassi和Eugene Bagdasaryan、Tsung-Yin Hsieh和Vitaly Shmatikov在去年晚些时候发表的一篇论文中表示。

"When the victim directly inputs the image or the clip into an isolated LLM and asks questions about it, the model will be steered by attacker-injected prompts."


Early last year, a group of researchers at Germany's CISPA Helmholtz Center for Information Security at Saarland University and Sequire Technology also uncovered how an attacker could exploit LLM models by strategically injecting hidden prompts into data (i.e., indirect prompt injection) that the model would likely retrieve when responding to user input.

德国CISPA Helmholtz信息安全研究中心和Saarland大学以及Sequire Technology的一组研究人员还发现了一种攻击者可以利用隐藏提示策略将隐藏提示注入到数据中(即间接提示注入),模型在回应用户输入时可能检索到这些提示。



原文始发于微信公众号(知机安全):Hugging Face平台惊现100多个恶意AI/ML模型

  • 左青龙
  • 微信扫一扫
  • weinxin
  • 右白虎
  • 微信扫一扫
  • weinxin
  • 本文由 发表于 2024年3月6日07:52:29
  • 转载请保留本文链接(CN-SEC中文网:感谢原作者辛苦付出):
                   Hugging Face平台惊现100多个恶意AI/ML模型


匿名网友 填写信息