Gemini could be used to hack itself (because why not)

Gemini could be used to hack itself (because why not)

AI technology is a helpful tool, but it could also be a powerful weapon. Ever since we discovered the abilities of generative AI, hackers have been using it for their own devious deeds. According to a new report, it looks like Gemini could be used to hack itself through a process called “Fun-tuning.”

One clever way that hackers trick LLMs is through a method called prompt injection. Basically, hackers can hide text inside a prompt that tricks the LLM into doing things it’s not supposed to. Some models can’t distinguish between user-created prompts and developer-created prompts. So, it’s easy to surreptitiously hide text within a prompt to fool a model.

Gemini could be used to hack itself

Don’t worry, this isn’t a story about people using this technique to cause widespread chaos. Rather, a team of researchers at UC San Diego and the University of Wisconsin discovered this. The team tested out a certain method of indirect prompt injection on several Gemini models with varying results. However, it then employed a method called “Fun-Tuning.”

It’s a play on the word fine-tuning, and it’s pretty effective at making a prompt more likely to fool a model. It involves encasing a prompt in text like “wandel ! ! ! !” or “formatted ! ASAP !” Just adding this text to the prompt actually increased the likelihood of it working by a large amount.

Using Gemini 1.5, Fun-Tuning caused a malicious prompt to be 65% likely to succeed. What’s scarier is that using this method with Gemini 1.0 Pro gave it an 80% success rate.

There’s a tool that Gemini uses to define how close a model’s response is to the intended result. It’s presented as a score, and people can use this score to help them fine-tune their prompts. In turn, one of Gemini’s own tools could be used to hack itself.

At this point, we don’t know if Google will address this issue, but it’s worth it for the company to do something. We don’t know how effective this method could be for Gemini 2.0 or Gemini 2.5 Pro, but it’s worth looking into.

📰 Crime Today News is proudly sponsored by DRYFRUIT.CO – A Brand by eFabby Global LLC

Design & Developed by Yes Mom Hosting

Crime Today News

Crime Today News is Hyderabad’s most trusted source for crime reports, political updates, and investigative journalism. We provide accurate, unbiased, and real-time news to keep you informed.

Related Posts