
MIT team develops a text-based system that tricks AI models
Adversarial examples have fooled AI systems before: Google's image-recognition AI was tricked into mistaking a turtle for a gun, and Jigsaw's toxic-comment-scoring AI was fooled into rating a sentence as positive simply because it contained words like "love."
Now MIT researchers have developed a new system, called TextFooler, that can trick AI models that use natural language processing (NLP), like the ones behind Siri and Alexa.
TextFooler attacks an NLP model by first identifying the words that carry the heaviest weight in that model's predictions, then swapping them out and checking how the model handles the altered input on tasks like text classification and entailment.
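The idea of ranking words by their influence on a model's output, then substituting the most influential ones, can be illustrated with a toy sketch. Everything here is hypothetical: the keyword-based classifier, the `SYNONYMS` table, and the deletion-based scoring are stand-ins for illustration, not the researchers' actual method or code.

```python
# Toy sketch of a TextFooler-style attack (hypothetical, not MIT's code).
POSITIVE = {"love", "great", "wonderful", "fantastic"}

def classify(words):
    """Toy sentiment classifier: positive if any positive keyword appears."""
    score = sum(1 for w in words if w in POSITIVE)
    return ("positive", score) if score > 0 else ("negative", score)

def word_importance(words):
    """Rank words by how much deleting each one changes the model's score."""
    _, base = classify(words)
    ranked = []
    for i, w in enumerate(words):
        _, score = classify(words[:i] + words[i + 1:])
        ranked.append((base - score, i, w))  # bigger drop = more important
    return sorted(ranked, reverse=True)

# Hypothetical synonym table; a real attack would draw on word embeddings.
SYNONYMS = {"love": "adore", "great": "decent"}

def attack(sentence):
    """Swap the most important words for synonyms until the label flips."""
    words = sentence.lower().split()
    original_label, _ = classify(words)
    for _, i, w in word_importance(words):
        if w in SYNONYMS:
            words[i] = SYNONYMS[w]
            if classify(words)[0] != original_label:
                break
    return " ".join(words), classify(words)[0]

adv, label = attack("I love this movie")
print(adv, label)  # replacing "love" flips the toy model's prediction
```

The sketch mirrors the key insight: a substitution that preserves the meaning for a human reader ("adore" for "love") can still flip the prediction of a model that relies too heavily on a few high-weight words.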
The team successfully fooled three existing models, including BERT, the popular open-source language model developed by Google.
Text-based models are used in areas such as email spam filtering, hate-speech flagging, and detecting "sensitive" political speech.
The researchers conclude that language-based AI models will need a lot more training before they can reliably tackle complex tasks like moderating online forums.