Mindgard researchers uncovered critical vulnerabilities in Microsoft’s Azure AI Content Safety service, allowing attackers to bypass its safeguards and unleash harmful AI-generated content.
A UK-based cybersecurity-for-AI startup, Mindgard, discovered two critical security vulnerabilities in Microsoft’s Azure AI Content Safety Service in February 2024. These flaws, as per their research shared with Hackread.com, could allow attackers to bypass the service’s safety guardrails.
The vulnerabilities were responsibly disclosed to Microsoft in March 2024, and by October 2024, the company deployed “stronger mitigations” to reduce their impact. However, the details of it have only been shared by Mindgard now.
Understanding the Vulnerabilities
Azure AI Content Safety is a Microsoft Azure cloud-based service that helps developers create safety and security guardrails for AI applications by detecting and managing inappropriate content. It uses advanced techniques to filter harmful content, including hate speech and explicit/objectionable material. Azure OpenAI uses a Large Language Model (LLM) with Prompt Shield and AI Text Moderation guardrails to validate inputs and AI-generated content.
However, two security vulnerabilities were discovered within these guardrails, which protect AI models against jailbreaks and prompt injection. As per the research, attackers could circumvent both the AI Text Moderation and Prompt Shield guardrails and inject harmful content into the system, manipulate the model’s responses, or even compromise sensitive information.
Attack Techniques
According to Mindgard’s report, its researchers employed two primary attack techniques to bypass the guardrails including Character injection and Adversarial Machine Learning (AML).
Character injection:
It is a technique where text is manipulated by injecting or replacing characters with specific symbols or sequences. This can be done through diacritics, homoglyphs, numerical replacement, space injection, and zero-width characters. These subtle changes can deceive the model into misclassifying the content, allowing attackers to manipulate the model’s interpretation and disrupt the analysis. The goal is to deceive the guardrail into misclassifying the content.
Adversarial Machine Learning (AML):
AML involves manipulating input data through certain techniques to mislead the model’s predictions. These techniques include perturbation techniques, word substitution, misspelling, and other manipulations. By carefully selecting and perturbing words, attackers can cause the model to misinterpret the input’s intent.
Possible Consequences
The two techniques effectively bypassed AI text moderation safeguards, reducing detection accuracy by up to 100% and 58.49%, respectively. The exploitation of these vulnerabilities could lead to societal harm as it “can result in harmful or inappropriate input reaching the LLM, causing the model to generate responses that violate its ethical, safety, and security guidelines,” researchers wrote in their blog post shared exclusively with Hackread.com.
Moreover, it allows malicious actors to inject harmful content into AI-generated outputs, manipulate model behaviour, expose sensitive data, and exploit vulnerabilities to gain unauthorized access to sensitive information or systems.
“By exploiting the vulnerability to launch broader attacks, this could compromise the integrity and reputation of LLM-based systems and the applications that rely on them for data processing and decision-making,” researchers noted.
It is crucial for organizations to stay updated with the latest security patches and to implement additional security measures to protect their AI applications from such attacks.
- How to build a website with WordPress and what are the best plugins to use Building a website with WordPress is an excellent choice due to its versatility, ease of use, and a vast array of plugins that enhance functionality. Here’s a comprehensive guide to building a WordPress website, along with recommendations for the best plugins
- Top WordPress Plugins for Managing Ads and Monetizing Your Website Effectively: Why is Ads Management Important for Website Monetization? Strategic ad placement throughout the website enables publishers to maximize ad revenue while ensuring a positive user experience. The positioning of ads is critical in capturing users’ attention without being intrusive or disruptive. By understanding user behavior and preferences, publishers can make informed decisions regarding ad placement to ensure that the ads are relevant and engaging.
- Top Directory Plugins for WordPress to Create Professional Listings and Directories: If you are interested in establishing professional listings and directories on your WordPress website, the following information will be of value to you. This article will present the top directory plugins available for WordPress, which include GeoDirectory, Business Directory Plugin, Sabai Directory, Connections Business Directory, and Advanced Classifieds & Directory Pro.
- The Most Important Stages and Plugins for WordPress Website Development: Developing a WordPress website requires careful planning, execution, and optimisation to ensure it is functional, user-friendly, and effective. The process can be broken into key stages, and each stage benefits from specific plugins to enhance functionality and performance. Here’s a detailed guide to the most important stages of WordPress website development and the essential plugins for each stage.
- .org vs .com: A Top Guide to the Differences in Domain Extension
When you set up a website for a business or a non-profit organisation, you might think the most important part of the address is the actual name. But the domain extension (the bit that comes after the dot) is just as important for telling people what your site is all about. - The Best WordPress Plugins for Image Optimization to Improve Load Times and SEO. The pivotal element lies in image optimization. This discourse delves into the significance of image optimization for websites and its impact on load times. Furthermore, we will delve into the advantages of leveraging WordPress plugins for image optimization, such as streamlined optimization processes, enhanced SEO, expedited load times, and an enriched user experience.
- What is a data center or Internet data center? The term “data center” has become very common due to the role it plays in many of our daily activities. Most of the data we receive and send through our mobile phones, tablets and computers ends up stored in these data centers — which many people refer to as “the Cloud”, in a more generic way.
RELATED TOPICS
- Mirai botnet exploiting Azure OMIGOD vulnerabilities
- Microsoft AI Researchers Expose 38TB of Top Sensitive Data
- Phishing Attacks Bypass Microsoft 365 Email Safety Warnings
- Researchers access primary keys of Azure’s Cosmos DB Users
- Data Security: Congress Bans Staff Use of Microsoft’s AI Copilot
- New LLMjacking Attack Lets Hackers Hijack AI Models for Profit