Welcome to the Awesome-Agent-Hallucinations repository! This repository is a curated library of works on hallucinations in Large Language Model (LLM)-based agents. For a detailed introduction, please refer to our survey paper:
✨LLM-based Agents Suffer from Hallucinations: A Survey of Taxonomy, Methods, and Directions✨
🤩An overview of agent goal completion.
- Within the loop, the LLM-based agent carries out external behaviors including reasoning, execution, perception, and memorization, guided by its internal belief state.
- Throughout this process, the environment dynamically evolves in response to the agent's decisions, while task allocation within the LLM-based multi-agent system further enhances the fulfillment of user requirements.
- LLM+Map: Bimanual Robot Task Planning Using Large Language Models And Planning Domain Definition Language
- Large language models for planning: A comprehensive and systematic survey
- Plan-then-execute: An empirical study of user trust and team performance when using llm agents as a daily assistant
- Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents
- Hm-rag: Hierarchical multi-agent multimodal retrieval augmented generation
- AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges
- REMAC: Self-Reflective and Self-Evolving Multi-Agent Collaboration for Long-Horizon Robot Manipulation
- MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems
- Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph
- Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions
- Do LLMs Understand Ambiguity in Text? A Case Study in Open-world Question Answering
- Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training
- Goal Awareness for Conversational AI: Proactivity, Non-collaborativity, and Beyond
- Conversational health agents: A personalized llm-powered agent framework
- Opera: Alleviating hallucination in multi-modal large language models via over-trust penalty and retrospection-allocation
- Mitigating Hallucinations in Multimodal Spatial Relations through Constraint-Aware Prompting
- AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents
- Agents of Change: Self-Evolving LLM Agents for Strategic Planning
- Know Where You're Uncertain When Planning with Multimodal Foundation Models: A Formal Framework
- Large language models for planning: A comprehensive and systematic survey
- BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering
- Rag-kg-il: A multi-agent hybrid framework for reducing hallucinations and enhancing llm reasoning through rag and incremental knowledge graph learning integration
- Large language models as commonsense knowledge for large-scale task planning
- Knowagent: Knowledge-augmented planning for llm-based agents
- Vision-language model-based physical reasoning for robot liquid perception
- VECSR: Virtually Embodied Common Sense Reasoning System
- Web agents with world models: Learning and leveraging environment dynamics in web navigation
- RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects
- Large language models as commonsense knowledge for large-scale task planning
- Focus on your question! interpreting and mitigating toxic cot problems in commonsense reasoning
- Neural-symbolic methods for knowledge graph reasoning: A survey
- Contrastive chain-of-thought prompting
- Can llm be a good path planner based on prompt engineering? mitigating the hallucination for path planning
- Insight-v: Exploring long-chain visual reasoning with multimodal large language models
- Do i know this entity? knowledge awareness and hallucinations in language models
- KnowMap: Efficient Knowledge-Driven Task Adaptation for LLMs
- Robotouille: An Asynchronous Planning Benchmark for LLM Agents
- Review of Case-Based Reasoning for LLM Agents: Theoretical Foundations, Architectural Components, and Cognitive Integration
- Uncertainty of thoughts: Uncertainty-aware planning enhances information seeking in large language models
- Language models, agent models, and world models: The law for machine reasoning and planning
- Towards reasoning in large language models: A survey
- Agent Capability Negotiation and Binding Protocol (ACNBP)
- Enhance Mobile Agents Thinking Process Via Iterative Preference Learning
- TARGA: Targeted Synthetic Data Generation for Practical Reasoning over Structured Data
- Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models
- Lisa: Reasoning segmentation via large language model
- Focus on your question! interpreting and mitigating toxic cot problems in commonsense reasoning
- WorldEval: World Model as Real-World Robot Policies Evaluator
- Introspective Planning: Aligning Robots' Uncertainty with Inherent Task Ambiguity
- We're Afraid Language Models Aren't Modeling Ambiguity
- A Survey: Learning Embodied Intelligence from Physical Simulators and World Models
- The flan collection: Designing data and methods for effective instruction tuning
- Mars-PO: Multi-Agent Reasoning System Preference Optimization
- Reasoning on graphs: Faithful and interpretable large language model reasoning
- Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks
- APILOT: Navigating Large Language Models to Generate Secure Code by Sidestepping Outdated API Pitfalls
- Toolace: Winning the points of llm function calling
- ALGO: Synthesizing Algorithmic Programs with Generated Oracle Verifiers
- ACEBench: Who Wins the Match Point in Tool Learning?
- ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
- Learning Evolving Tools for Large Language Models
- Tool Unlearning for Tool-Augmented LLMs
- Enhancing Tool Learning in Large Language Models with Hierarchical Error Checklists
- ToolRegistry: A Protocol-Agnostic Tool Management Library for Function-Calling LLMs
- Retool: Reinforcement learning for strategic tool use in llms
- Geckopt: Llm system efficiency via intent-based tool selection
- Rag-mcp: Mitigating prompt bloat in llm tool selection via retrieval-augmented generation
- Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum
- CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing
- RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing
- NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models
- GenTool: Enhancing Tool Generalization in Language Models through Zero-to-One and Weak-to-Strong Simulation
- Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models
- An llm compiler for parallel function calling
- ControlLLM: Augment Language Models with Tools by Searching on Graphs
- OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
- Graph RAG-Tool Fusion
- ScaleMCP: Dynamic and Auto-Synchronizing Model Context Protocol Tools for LLM Agents
- Comparative Analysis of AI Agent Architectures for Entity Relationship Classification
- Ferret: Refer and ground anything anywhere at any granularity
- Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion
- Re-Aligning Language to Visual Objects with an Agentic Workflow
- Mitigating hallucination in visual language models with visual supervision
- Multi-news+: Cost-efficient dataset cleansing via llm-based data annotation
- Editing Factual Knowledge in Language Models
- Seeing is believing: Mitigating hallucination in large vision-language models via clip-guided decoding
- EMMA: Efficient Visual Alignment in Multi-Modal LLMs
- Asynchronous LLM Function Calling
- HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models
- Detecting and preventing hallucinations in large vision language models
- ContextualLVLM-Agent: A Holistic Framework for Multi-Turn Visually-Grounded Dialogue and Complex Instruction Following
- G-retriever: Retrieval-augmented generation for textual graph understanding and question answering
- Causal-LLaVA: Causal Disentanglement for Mitigating Hallucination in Multimodal Large Language Models
- Vcoder: Versatile vision encoders for multimodal large language models
- Hydra: An Agentic Reasoning Approach for Enhancing Adversarial Robustness and Mitigating Hallucinations in Vision-Language Models
- Text-image alignment for diffusion-based perception
- Vision-language model-based physical reasoning for robot liquid perception
- Large language model agent for fake news detection
- Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph
- Knowledge unlearning for llms: Tasks, methods, and challenges
- MIRIX: Multi-Agent Memory System for LLM-Based Agents
- What You Want to Forget: Efficient Unlearning for LLMs
- Improving factuality with explicit working memory
- Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations
- Flashattention: Fast and memory-efficient exact attention with io-awareness
- Who’s harry potter? approximate unlearning for LLMs
- Alphaedit: Null-space constrained model editing for language models
- Does fine-tuning LLMs on new knowledge encourage hallucinations?
- Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue
- Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions
- Memory sandbox: Transparent and interactive memory management for conversational agents
- Editing models with task arithmetic
- AIstorian lets AI be a historian: A KG-powered multi-agent system for accurate biography generation
- PMET: Precise Model Editing in a Transformer
- Refine Knowledge of Large Language Models via Adaptive Contrastive Learning
- MemOS: A Memory OS for AI System
- Unveiling the pitfalls of knowledge editing for large language models
- Autonomous Agents for Collaborative Task under Information Asymmetry
- LLMs Working in Harmony: A Survey on the Technological Aspects of Building Effective LLM-Based Multi Agent Systems
- Cautiously-optimistic knowledge sharing for cooperative multi-agent reinforcement learning
- Challenges in Human-Agent Communication
- Preventing Rogue Agents Improves Multi-Agent Collaboration
- Why Do Multi-Agent LLM Systems Fail?
- Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training
- MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents
- AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration
- Chateval: Towards better llm-based evaluators through multi-agent debate
- AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
- Internet of agents: Weaving a web of heterogeneous agents for collaborative intelligence
- Optima: Optimizing effectiveness and efficiency for llm-based multi-agent system
- Goal Awareness for Conversational AI: Proactivity, Non-collaborativity, and Beyond
- Enhancing Multi-Agent Consensus through Third-Party LLM Integration: Analyzing Uncertainty and Mitigating Hallucinations in Large Language Models
- A survey of agent interoperability protocols: Model Context Protocol (MCP), Agent Communication Protocol (ACP), Agent-to-Agent Protocol (A2A), and Agent Network Protocol (ANP)
- Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems
- Embodied llm agents learn to cooperate in organized teams
- MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework
- CP-AgentNet: Autonomous and Explainable Communication Protocol Design Using Generative Agents
- Camel: Communicative agents for "mind" exploration of large language model society
- Theory of mind for multi-agent collaboration via large language models
- Encouraging divergent thinking in large language models through multi-agent debate
- A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration
- Merge, ensemble, and cooperate! a survey on collaborative strategies in the era of large language models
- Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
- Towards Mitigating LLM Hallucination via Self Reflection
- Hallucination augmented contrastive learning for multimodal large language model
- Tug-of-war between knowledge: Exploring and resolving knowledge conflicts in retrieval-augmented language models
- Cutting off the head ends the conflict: A mechanism for interpreting and mitigating knowledge conflicts in language models
- Simulation Agent: A Framework for Integrating Simulation and Large Language Models for Enhanced Decision-Making
- Contrastive representation learning: A framework and review
- Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision
- Mitigating object hallucinations in large vision-language models through visual contrastive decoding
- Banishing LLM hallucinations requires rethinking generalization
- Treble counterfactual vlms: A causal approach to hallucination
- When hindsight is not 20/20: Testing limits on reflective thinking in large language models
- Contrastive modality-disentangled learning for multimodal recommendation
- Generative Causality-driven Network for Graph Multi-task Learning
- Advances and challenges in foundation agents: From brain-inspired intelligence to evolutionary, collaborative, and safe systems
- Mitigating hallucination in large multi-modal models via robust instruction tuning
- Reducing hallucinations in large vision-language models via latent space steering
- Visual-rft: Visual reinforcement fine-tuning
- Agentauditor: Human-level safety and security evaluation for llm agents
- Lettucedetect: A hallucination detection framework for rag applications
- Enhancing Mathematical Reasoning in Large Language Models with Self-Consistency-Based Hallucination Detection
- SEVADE: Self-Evolving Multi-Agent Analysis with Decoupled Evaluation for Hallucination-Resistant Irony Detection
