Awesome-Agent-Hallucinations

Welcome to the Awesome-Agent-Hallucinations repository! This repository is a curated library of works on hallucinations in Large Language Model (LLM)-based agents. For a detailed introduction, please refer to our survey paper:

LLM-based Agents Suffer from Hallucinations: A Survey of Taxonomy, Methods, and Directions

🤩An overview of agent goal completion.
  • Within the loop, the LLM-based agent carries out external behaviors, including reasoning, execution, perception, and memorization, guided by its internal belief state.
  • Throughout this process, the environment evolves dynamically in response to the agent's decisions, while task allocation within the LLM-based multi-agent system further helps fulfill user requirements. A minimal code sketch of this loop is given below.
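To make the loop concrete, here is a minimal sketch in Python. It is illustrative only: the `llm` and `env` objects and their methods (`reason`, `execute`, `perceive`, `goal_reached`) are assumed placeholders, not an API from any paper listed here.

    from dataclasses import dataclass, field

    @dataclass
    class Belief:
        """The agent's internal belief state: its working view of the task."""
        goal: str
        observations: list = field(default_factory=list)

    def agent_loop(llm, env, goal: str, max_steps: int = 10) -> Belief:
        """Run one reason-execute-perceive-memorize loop until the goal is met.
        `llm` and `env` are hypothetical stand-ins for a language model and
        an environment interface."""
        belief = Belief(goal=goal)
        for _ in range(max_steps):
            action = llm.reason(belief)        # reasoning: plan the next action from beliefs
            env.execute(action)                # execution: act on the environment (e.g., a tool call)
            obs = env.perceive()               # perception: observe the environment's new state
            belief.observations.append(obs)    # memorization: fold the observation into beliefs
            if env.goal_reached(goal):         # the environment evolves after each action
                break
        return belief

Each survey category below corresponds to a failure point in this loop: hallucinations can arise in reasoning, execution (tool use), perception, memorization, or inter-agent communication.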

🌟1 Reasoning Hallucinations

  • LLM+MAP: Bimanual Robot Task Planning Using Large Language Models and Planning Domain Definition Language arXiv Code

  • Large language models for planning: A comprehensive and systematic survey arXiv

  • Plan-then-execute: An empirical study of user trust and team performance when using llm agents as a daily assistant CHI

  • Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents CSUR Code

  • Divide and conquer in multi-agent planning AAAI

  • HM-RAG: Hierarchical multi-agent multimodal retrieval augmented generation arXiv Code

  • AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges arXiv

  • REMAC: Self-Reflective and Self-Evolving Multi-Agent Collaboration for Long-Horizon Robot Manipulation arXiv Code

  • MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems arXiv Code

  • Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph ICRA Code

  • Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions ICLR

  • Why Does ChatGPT Fall Short in Providing Truthful Answers? ICBINB

  • Do LLMs Understand Ambiguity in Text? A Case Study in Open-world Question Answering IEEE BigData

  • We're Afraid Language Models Aren't Modeling Ambiguity EMNLP Code

  • Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training ICLR Code

  • Goal Awareness for Conversational AI: Proactivity, Non-collaborativity, and Beyond ACL

  • Active Task Disambiguation with LLMs ICLR

  • Conversational health agents: A personalized LLM-powered agent framework JAMIA Code

  • OPERA: Alleviating hallucination in multi-modal large language models via over-trust penalty and retrospection-allocation CVPR Code

  • Navigating the overkill in large language models ACL

  • Mitigating Hallucinations in Multimodal Spatial Relations through Constraint-Aware Prompting NAACL

  • AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents IJCAI Code

  • Agents of Change: Self-Evolving LLM Agents for Strategic Planning

  • Know Where You’re Uncertain When Planning with Multimodal Foundation Models: A Formal Framework

  • BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering

  • RAG-KG-IL: A multi-agent hybrid framework for reducing hallucinations and enhancing LLM reasoning through RAG and incremental knowledge graph learning integration

  • KnowAgent: Knowledge-augmented planning for LLM-based agents

  • Text-image alignment for diffusion-based perception CVPR

  • Vision-language model-based physical reasoning for robot liquid perception IROS

  • VECSR: Virtually Embodied Common Sense Reasoning System

  • Web agents with world models: Learning and leveraging environment dynamics in web navigation

  • RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects

  • Large language models as commonsense knowledge for large-scale task planning NeurIPS

  • Focus on your question! Interpreting and mitigating toxic CoT problems in commonsense reasoning

  • Knowledge Verification to Nip Hallucination in the Bud EMNLP

  • Neural-symbolic methods for knowledge graph reasoning: A survey ACM

  • Contrastive chain-of-thought prompting

  • Can LLM be a good path planner based on prompt engineering? Mitigating the hallucination for path planning

  • Insight-V: Exploring long-chain visual reasoning with multimodal large language models

  • Do I know this entity? Knowledge awareness and hallucinations in language models

  • KnowMap: Efficient Knowledge-Driven Task Adaptation for LLMs

  • Robotouille: An Asynchronous Planning Benchmark for LLM Agents

  • Review of Case-Based Reasoning for LLM Agents: Theoretical Foundations, Architectural Components, and Cognitive Integration

  • Uncertainty of thoughts: Uncertainty-aware planning enhances information seeking in large language models

  • Language models, agent models, and world models: The LAW for machine reasoning and planning

  • Towards reasoning in large language models: A survey

  • Agent Capability Negotiation and Binding Protocol (ACNBP)

  • Enhance Mobile Agents Thinking Process Via Iterative Preference Learning

  • TARGA: Targeted Synthetic Data Generation for Practical Reasoning over Structured Data

  • Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models

  • LISA: Reasoning segmentation via large language model

  • WorldEval: World Model as Real-World Robot Policies Evaluator

  • Introspective Planning: Aligning Robots’ Uncertainty with Inherent Task Ambiguity

  • A Survey: Learning Embodied Intelligence from Physical Simulators and World Models

  • The Flan Collection: Designing data and methods for effective instruction tuning

  • Mars-PO: Multi-Agent Reasoning System Preference Optimization

  • Reasoning on graphs: Faithful and interpretable large language model reasoning

🌟2 Execution Hallucinations

  • Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks ACL models
  • APILOT: Navigating Large Language Models to Generate Secure Code by Sidestepping Outdated API Pitfalls CoRR
  • ToolACE: Winning the points of LLM function calling arXiv models datasets
  • ALGO: Synthesizing Algorithmic Programs with Generated Oracle Verifiers NeurIPS Code
  • ACEBench: Who Wins the Match Point in Tool Learning?
  • ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
  • Learning Evolving Tools for Large Language Models
  • Tool Unlearning for Tool-Augmented LLMs
  • Enhancing Tool Learning in Large Language Models with Hierarchical Error Checklists
  • ToolRegistry: A Protocol-Agnostic Tool Management Library for Function-Calling LLMs
  • ReTool: Reinforcement learning for strategic tool use in LLMs
  • GeckOpt: LLM system efficiency via intent-based tool selection
  • RAG-MCP: Mitigating prompt bloat in LLM tool selection via retrieval-augmented generation
  • Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum
  • CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing
  • RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing
  • NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models
  • GenTool: Enhancing Tool Generalization in Language Models through Zero-to-One and Weak-to-Strong Simulation
  • Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models
  • An LLM compiler for parallel function calling
  • ControlLLM: Augment Language Models with Tools by Searching on Graphs
  • OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
  • Graph RAG-Tool Fusion
  • ScaleMCP: Dynamic and Auto-Synchronizing Model Context Protocol Tools for LLM Agents

🌟3 Perception Hallucinations

  • Comparative Analysis of AI Agent Architectures for Entity Relationship Classification
  • Ferret: Refer and ground anything anywhere at any granularity
  • Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion
  • Re-Aligning Language to Visual Objects with an Agentic Workflow
  • Mitigating hallucination in visual language models with visual supervision
  • Multi-News+: Cost-efficient dataset cleansing via LLM-based data annotation
  • Editing Factual Knowledge in Language Models
  • Seeing is believing: Mitigating hallucination in large vision-language models via CLIP-guided decoding
  • EMMA: Efficient Visual Alignment in Multi-Modal LLMs
  • Asynchronous LLM Function Calling
  • HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models
  • Detecting and preventing hallucinations in large vision language models
  • ContextualLVLM-Agent: A Holistic Framework for Multi-Turn Visually-Grounded Dialogue and Complex Instruction Following
  • G-retriever: Retrieval-augmented generation for textual graph understanding and question answering
  • Causal-LLaVA: Causal Disentanglement for Mitigating Hallucination in Multimodal Large Language Models
  • VCoder: Versatile vision encoders for multimodal large language models
  • Hydra: An Agentic Reasoning Approach for Enhancing Adversarial Robustness and Mitigating Hallucinations in Vision-Language Models
  • Text-image alignment for diffusion-based perception
  • Vision-language model-based physical reasoning for robot liquid perception
  • Large language model agent for fake news detection
  • Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph

🌟4 Memorization Hallucinations

  • Knowledge unlearning for llms: Tasks, methods, and challenges CoRR
  • MIRIX: Multi-Agent Memory System for LLM-Based Agents
  • What You Want to Forget: Efficient Unlearning for LLMs
  • Improving factuality with explicit working memory
  • Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations
  • Flashattention: Fast and memory-efficient exact attention with io-awareness
  • Who's Harry Potter? Approximate unlearning in LLMs
  • AlphaEdit: Null-space constrained model editing for language models
  • Does fine-tuning LLMs on new knowledge encourage hallucinations?
  • Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue
  • Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions
  • Memory sandbox: Transparent and interactive memory management for conversational agents
  • Editing models with task arithmetic
  • AIstorian lets AI be a historian: A KG-powered multi-agent system for accurate biography generation
  • PMET: Precise Model Editing in a Transformer
  • Refine Knowledge of Large Language Models via Adaptive Contrastive Learning
  • MemOS: A memory OS for AI system
  • Unveiling the pitfalls of knowledge editing for large language models

🌟5 Communication Hallucinations

  • Autonomous Agents for Collaborative Task under Information Asymmetry NeurIPS Code
  • LLMs Working in Harmony: A Survey on the Technological Aspects of Building Effective LLM-Based Multi Agent Systems arXiv
  • Cautiously-optimistic knowledge sharing for cooperative multi-agent reinforcement learning AAAI
  • Challenges in Human-Agent Communication arXiv
  • Preventing Rogue Agents Improves Multi-Agent Collaboration
  • Why Do Multi-Agent LLM Systems Fail?
  • Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training
  • MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents
  • AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration
  • ChatEval: Towards better LLM-based evaluators through multi-agent debate
  • AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
  • Internet of agents: Weaving a web of heterogeneous agents for collaborative intelligence
  • Optima: Optimizing effectiveness and efficiency for LLM-based multi-agent system
  • Goal Awareness for Conversational AI: Proactivity, Non-collaborativity, and Beyond
  • Enhancing Multi-Agent Consensus through Third-Party LLM Integration: Analyzing Uncertainty and Mitigating Hallucinations in Large Language Models
  • A survey of agent interoperability protocols: Model Context Protocol (MCP), Agent Communication Protocol (ACP), Agent-to-Agent Protocol (A2A), and Agent Network Protocol (ANP)
  • Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems
  • Embodied LLM agents learn to cooperate in organized teams
  • MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework
  • CP-AgentNet: Autonomous and Explainable Communication Protocol Design Using Generative Agents
  • CAMEL: Communicative agents for "mind" exploration of large language model society
  • Theory of mind for multi-agent collaboration via large language models
  • Encouraging divergent thinking in large language models through multi-agent debate
  • A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration
  • Merge, ensemble, and cooperate! A survey on collaborative strategies in the era of large language models

🌟6 Mitigating Hallucinations

  • Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
  • Towards Mitigating LLM Hallucination via Self Reflection
  • Hallucination augmented contrastive learning for multimodal large language model
  • Tug-of-war between knowledge: Exploring and resolving knowledge conflicts in retrieval-augmented language models
  • Cutting off the head ends the conflict: A mechanism for interpreting and mitigating knowledge conflicts in language models
  • Simulation Agent: A Framework for Integrating Simulation and Large Language Models for Enhanced Decision-Making
  • Contrastive representation learning: A framework and review
  • Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision
  • Mitigating object hallucinations in large vision-language models through visual contrastive decoding
  • Banishing LLM hallucinations requires rethinking generalization
  • Treble counterfactual VLMs: A causal approach to hallucination
  • When hindsight is not 20/20: Testing limits on reflective thinking in large language models
  • Contrastive modality-disentangled learning for multimodal recommendation
  • Generative Causality-driven Network for Graph Multi-task Learning
  • Advances and challenges in foundation agents: From brain-inspired intelligence to evolutionary, collaborative, and safe systems
  • Mitigating hallucination in large multi-modal models via robust instruction tuning
  • Reducing hallucinations in large vision-language models via latent space steering
  • Visual-RFT: Visual reinforcement fine-tuning
  • AgentAuditor: Human-level safety and security evaluation for LLM agents

🌟7 Detecting Hallucinations

  • LettuceDetect: A hallucination detection framework for RAG applications
  • Enhancing Mathematical Reasoning in Large Language Models with Self-Consistency-Based Hallucination Detection
  • SEVADE: Self-Evolving Multi-Agent Analysis with Decoupled Evaluation for Hallucination-Resistant Irony Detection
