AI Tools Lose Safety Awareness Over Time

A recent Cisco report finds that AI chatbots apply their safety rules less consistently as a conversation gets longer, increasing the risk of harmful or inappropriate responses. The researchers found that a few simple follow-up prompts can break through most AI guardrails.

Cisco Tests Chatbots Across Multiple Companies

Cisco analyzed large language models from OpenAI, Mistral, Meta, Google, Alibaba, DeepSeek, and Microsoft. The team conducted 499 conversations using “multi-turn attacks,” in which a user asks an AI chatbot a series of questions designed to bypass its safety filters. Each conversation included five to ten exchanges.

The researchers tracked how many prompts caused chatbots to reveal unsafe or illegal details, including private corporate data or misinformation. On average, chatbots gave out malicious information in 64 percent of multi-question conversations but in only 13 percent of single-question ones. Attacks succeeded 93 percent of the time against Mistral’s Large Instruct model, compared with about 26 percent against Google’s Gemma.
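To make the metric concrete, here is a minimal sketch of how an attack success rate of this kind could be computed. It assumes each conversation is recorded with its number of turns and a flag saying whether any reply was judged unsafe. The `Conversation` record, the `attack_success_rate` function, and the sample entries are all hypothetical illustrations, not Cisco's method or data.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    turns: int    # number of prompts the tester sent
    unsafe: bool  # True if any reply was judged unsafe


def attack_success_rate(conversations):
    """Share of conversations in which at least one reply was judged unsafe."""
    if not conversations:
        return 0.0
    return sum(c.unsafe for c in conversations) / len(conversations)


# Made-up records for illustration only; these are not Cisco's data.
sample = [
    Conversation(turns=1, unsafe=False),
    Conversation(turns=1, unsafe=False),
    Conversation(turns=6, unsafe=True),
    Conversation(turns=8, unsafe=True),
    Conversation(turns=5, unsafe=False),
]

# Split the records the same way the report does: one question vs. several.
single = [c for c in sample if c.turns == 1]
multi = [c for c in sample if c.turns > 1]

print(f"single-turn: {attack_success_rate(single):.0%}")
print(f"multi-turn:  {attack_success_rate(multi):.0%}")
```

Applied to the reported averages, the same comparison gives 64 / 13 ≈ 4.9, so multi-question conversations produced unsafe answers roughly five times as often as single-question ones.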

Open Models Shift Safety Responsibility

Cisco warned that multi-turn attacks could spread harmful content or let hackers steal confidential information. The study observed that AI systems often fail to apply safety guidelines consistently in longer chats, allowing attackers to refine their requests and bypass controls.

Mistral, like Meta, Google, OpenAI, and Microsoft, offers open-weight models, which give the public access to the models' safety parameters. Cisco reported that these open systems typically include fewer built-in safety features, leaving users responsible for maintaining protection when they customize the models.

Cisco added that Google, Meta, Microsoft, and OpenAI say they have strengthened their defenses against malicious fine-tuning. Despite these assurances, AI firms still face criticism for weak safety systems that make criminal misuse easier. In one case, Anthropic confirmed that criminals exploited its Claude model to conduct large-scale data theft and extortion, demanding ransoms exceeding $500,000 (€433,000).
