
When a prompt contradicts itself, can a large model notice? Shanghai Jiao Tong University's latest research investigates

2024-08-16


Contributed by Wang Dequan's research group at Shanghai Jiao Tong University
Quantum Bit | Public Account QbitAI

Professor Wang Dequan's research team at Shanghai Jiao Tong University raised this question in their latest research.

Imagine this scenario: a kindergarten kid holds a picture of a tiger and asks you: "This kitten is very cute, is it a female cat?" How would you answer?

You probably would not answer "yes" or "no" directly, but would first point out the contradiction in the question: the picture shows a tiger, not a cat.



But there has been little systematic research on how large models would respond.

An AI model that cannot detect instruction conflicts will produce answers to "questions that should not have answers," and no matter which side of the conflict those answers lean toward, the results can be harmful, with implications for AI safety and superalignment.

In this latest study, the team proposed a multimodal benchmark, Self-Contradictory Instructions (SCI), and designed an innovative automatic dataset creation framework named AutoCreate.

The team found that multimodal large models perform poorly at detecting contradictory user instructions, so they proposed Cognitive Awakening Prompting (CAP), which injects cognitive ability from the outside and thereby improves contradiction detection.

The paper will be published at the 18th European Conference on Computer Vision (ECCV) in October this year.



Can large models detect conflicting instructions?

At present, multimodal large models have made great progress in scientific research and application. They can process a variety of data types including text and images, showing abilities similar to human cognition.

The team attributes these models' success to extensive research and development work that enables them to follow human instructions closely, even somewhat "obediently."

In addition, these models are particularly capable with long contexts. Multimodal large models such as Claude 3 and Gemini 1.5 Pro have demonstrated strong capabilities here: the Claude 3 series provides a 200K-token context window, and Gemini 1.5 Pro has a standard context window of 128K tokens, reaching up to 1M tokens during its private preview phase.

These advances have enabled large multimodal models to excel in handling complex tasks and meet the needs of humans for prolonged interactions.

However, as multimodal interactions deepen and context length increases, the problem of contradictory user instructions becomes more and more prominent.

As shown in the figure below, when users (such as children or language beginners) use these models, they are often unaware of the potential multimodal conflicts.



At the same time, as the number of conversation turns increases and the context window expands, it becomes difficult for users to remember all the details, leading to inconsistencies between instructions.

In addition, as the number of modalities increases, conflicts between modalities may also occur. If these models lack self-awareness and the ability to identify contradictions, their performance suffers.

To address these challenges, the research team proposed a multimodal benchmark, Self-Contradictory Instructions (SCI), to evaluate how well multimodal large models detect conflicting instructions.

SCI includes 20,000 conflicting instructions and 8 tasks, evenly distributed across two paradigms: language-language and vision-language.

In the upper part of the figure, the language-language paradigm involves conflicts between context and instructions, such as conflicting rule designs, conflicting object attributes, exclusive instructions, and forbidden words.



In the lower part of the figure, the vision-language paradigm covers multimodal conflicts, such as OCR text recognition conflicts, figure conflicts, geometric conflicts, and semantic conflicts. Among the eight tasks, only the semantic conflict task relies on an external dataset (ImageNet).

To give a concrete example: when constructing semantic conflicts, the researchers first generate text corresponding to an image, then replace the key semantic information in that text with similar but different semantics.

In the picture below, the image contains an ostrich. Based on the image semantics "ostrich," the authors add the question "Does the picture depict the ostrich's size?"

Then the key semantic term in the question text, "ostrich," is replaced with "kiwi." In this way, a pair of self-contradictory multimodal instructions is constructed.
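To make the swap concrete, here is a minimal Python sketch of the replacement step just described. The question template, the replacement vocabulary, and the helper name make_semantic_conflict are illustrative assumptions, not the authors' actual construction code.

```python
import random

# Hypothetical pool of "similar but different" concepts used for the swap;
# this table is only for illustration, not the paper's actual vocabulary.
REPLACEMENTS = {
    "ostrich": ["kiwi", "emu", "cassowary"],
    "tiger": ["cat", "leopard", "lion"],
}

def make_semantic_conflict(image_label: str) -> dict:
    """Build a question whose key semantics contradict the image content."""
    # Step 1: write a question grounded in the true image semantics.
    question = f"Does the picture depict the {image_label}'s size?"
    # Step 2: replace the key concept with a similar but different one,
    # so the question now contradicts what the image shows.
    substitute = random.choice(REPLACEMENTS[image_label])
    return {
        "image_label": image_label,
        "conflicting_question": question.replace(image_label, substitute),
    }

print(make_semantic_conflict("ostrich"))
# e.g. {'image_label': 'ostrich', 'conflicting_question': "Does the picture depict the kiwi's size?"}
```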



For the SCI construction process, the authors designed an innovative automatic dataset creation framework, AutoCreate.

It builds a multimodal loop driven by programs and large language models to automate dataset creation.

AutoCreate starts from a small amount of task-related seed data and maintains a seed pool. Each cycle includes two branches: a language branch (left) and a visual branch (right). Each branch consists of a generator and a decorator.



Finally, a cleaner removes data that does not meet the standards. After passing a quality check by human experts, the remaining data is fed back into the seed pool for the next round.

AutoCreate greatly improves the construction speed and content breadth of the SCI dataset.
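Read as pseudocode, one AutoCreate cycle for a single branch could look like the Python sketch below. The function names (autocreate_round, generate, decorate, is_valid) are placeholders of my own; in the actual framework these roles are filled by programs and a large language model, and the human quality check happens outside the code.

```python
from typing import Callable, Dict, List

def autocreate_round(
    seed_pool: List[Dict],
    generate: Callable[[Dict], Dict],   # generator: turns a seed into a candidate instruction
    decorate: Callable[[Dict], Dict],   # decorator: enriches the candidate (e.g. injects the conflict)
    is_valid: Callable[[Dict], bool],   # cleaner: drops candidates that fail the standard
) -> List[Dict]:
    """One cycle of one branch: seed pool -> generator -> decorator -> cleaner -> back to pool."""
    candidates = [decorate(generate(seed)) for seed in seed_pool]
    survivors = [c for c in candidates if is_valid(c)]
    # After a human quality check (not modeled here), survivors re-enter the
    # seed pool so the next round can build on them.
    seed_pool.extend(survivors)
    return survivors

# Toy usage with trivial placeholder callables.
pool = [{"concept": "ostrich"}]
new_data = autocreate_round(
    pool,
    generate=lambda s: {"question": f"Describe the {s['concept']}."},
    decorate=lambda c: {**c, "conflict": c["question"].replace("ostrich", "kiwi")},
    is_valid=lambda c: c["question"] != c["conflict"],
)
print(new_data)
```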

How to improve the ability to detect contradictions?

Using the SCI dataset, the researchers comprehensively evaluated the performance of large models in handling contradictory instructions.

Experimental results show that current large models often show certain shortcomings when faced with contradictory instructions.

They can process information and knowledge, but lack the ability to evaluate whether instructions are reasonable, which the research team calls "cognitive" ability.

This deficiency stems from a lack of self-awareness to recognize inconsistencies in instructions.

Therefore, the researchers proposed a simple plug-in prompting method called Cognitive Awakening Prompting (CAP).

CAP adds a simple reminder to the input, injecting cognitive capability from the outside and improving large models' contradiction detection with little negative side effect.
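In practice this is just a prefix on the model input. The snippet below is a minimal sketch of the idea; the reminder wording and the helper name apply_cap are assumptions for illustration, not the exact prompt from the paper.

```python
# Illustrative reminder text; the paper's actual CAP wording may differ.
CAP_REMINDER = (
    "Before answering, check whether the instruction contradicts itself or the "
    "provided image/context. If it does, point out the contradiction instead of "
    "answering directly."
)

def apply_cap(user_instruction: str) -> str:
    """Prepend the cognitive-awakening reminder to the user's instruction."""
    return f"{CAP_REMINDER}\n\n{user_instruction}"

# Example: the kitten/tiger question from the opening scenario.
print(apply_cap("This kitten is very cute, is it a female cat?"))
```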

This finding suggests that current multimodal large models require more self-awareness and cognitive abilities to better handle complex instruction conflicts.



For more details, interested friends can check out the original paper.

About the Author

The first author of the paper is Gao Jin, a doctoral student at Shanghai Jiao Tong University.

His research interests include computer vision, multimodal large models, and AI-enabled life sciences.



The corresponding author is Wang Dequan, a tenure-track assistant professor and doctoral supervisor at Shanghai Jiao Tong University. He received his undergraduate degree from Fudan University and his Ph.D. from the University of California, Berkeley, where he studied under Professor Trevor Darrell.

His research work has been published in top international conferences such as CVPR, ICCV, ECCV, ICLR, ICML, ICRA, and IROS. In the past five years, his papers have been cited more than 10,000 times on Google Scholar, with an H-index of 20.

Paper link: https://arxiv.org/abs/2408.01091
Project link: https://selfcontradiction.github.io/