Researchers from Beihang University and Meituan Inc., Beijing, China, have developed a novel framework, Knowledge-Driven Chain-of-Thought (KD-CoT), designed to improve the reasoning performance of Large Language Models (LLMs) on knowledge-intensive Knowledge Base Question Answering (KBQA) tasks. Despite the advances brought by Chain-of-Thought (CoT) prompting, LLMs still produce hallucinations and unfaithful intermediate reasoning steps, particularly on knowledge-intensive tasks like KBQA. The KD-CoT framework addresses these challenges by verifying and modifying the reasoning traces in CoT through interaction with external knowledge, thereby curbing hallucinations and error propagation.
The KD-CoT framework encourages LLMs to generate verbal reasoning traces and supports dynamic reasoning by verifying and adjusting intermediate reasoning steps against external knowledge. In addition, the researchers constructed a KBQA CoT collection that serves both as demonstrations for in-context learning (ICL) and as training data for a robust retriever. Experimental results on the WebQSP and ComplexWebQuestions datasets demonstrated the effectiveness of the KD-CoT framework, achieving significant improvements in Hit@1 scores over the vanilla ICL method.
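The verify-and-adjust idea can be illustrated with a short sketch. This is a hypothetical toy, not the authors' implementation: `llm_propose_step` and `external_qa` are stubs standing in for the LLM and the external knowledge source, and the hard-coded answers (including the deliberate hallucination) are invented for illustration.

```python
# Hypothetical sketch of a KD-CoT-style verify-and-adjust loop: the LLM
# proposes each intermediate reasoning step, and the step's answer is
# checked against external knowledge before the chain continues, so an
# early hallucination cannot propagate to later steps.

def llm_propose_step(question, trace):
    """Stub for an LLM call that emits the next (sub-question, answer) step."""
    steps = {
        0: ("Who directed Inception?", "Steven Spielberg"),  # hallucinated
        1: ("What year was Inception released?", "2010"),
    }
    return steps.get(len(trace))  # None once the chain is complete

def external_qa(sub_question):
    """Stub for an external QA system backed by a knowledge base."""
    kb = {
        "Who directed Inception?": "Christopher Nolan",
        "What year was Inception released?": "2010",
    }
    return kb.get(sub_question)

def kd_cot(question, max_steps=5):
    trace = []
    for _ in range(max_steps):
        proposal = llm_propose_step(question, trace)
        if proposal is None:
            break
        sub_q, llm_answer = proposal
        verified = external_qa(sub_q)
        # Prefer the externally verified answer when one is available.
        trace.append((sub_q, verified if verified is not None else llm_answer))
    return trace

trace = kd_cot("Who directed Inception, and when was it released?")
```

Here the hallucinated "Steven Spielberg" in the first step is replaced by the verified answer before the second step runs, which is the core intuition behind interleaving reasoning with external verification.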
The researchers also highlighted the current limitations of LLMs, which include hallucinations or lack of knowledge while solving knowledge-intensive tasks, leading to erroneous subsequent reasoning steps and incorrect final answers. They noted that while previous works have enhanced the retriever’s capability to retrieve external knowledge or improved the reader’s ability to extract answers from the retrieved knowledge, few have leveraged the understanding and reasoning capabilities of LLMs to address complex multi-hop KBQA tasks or investigated the problem of model hallucination for encyclopedic knowledge.
The KD-CoT framework addresses these gaps by utilizing a QA system to access external knowledge and provide high-quality answers to LLMs for solving knowledge-intensive KBQA tasks. The main contributions of the research include the creation of a KBQA CoT collection by prompting LLMs, the proposal of a retriever-reader-verifier QA system that accesses external knowledge and interacts with LLMs, and the introduction of the KD-CoT framework to improve the reasoning performance of large language models. The researchers concluded by emphasizing that KD-CoT delivers superior performance with interpretable inference steps on knowledge-intensive KBQA tasks. They also noted the potential utility of the CoT collection for CoT fine-tuning and few-shot in-context learning, as well as the substantial improvement in Hit scores for retrieved knowledge achieved by their training approach for developing a robust retriever.
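The retriever-reader-verifier division of labor can be sketched as follows. This is a minimal illustrative mock-up, not the paper's code: the toy passages, the overlap-based retriever, the name-extracting reader, and the substring-based verifier are all simplifying assumptions standing in for the trained components.

```python
# Hypothetical sketch of a retriever-reader-verifier QA system: the
# retriever ranks passages for a sub-question, the reader extracts a
# candidate answer from the top passage, and the verifier decides whether
# the LLM's own answer is supported by the evidence or should be replaced.

PASSAGES = [
    "Christopher Nolan directed the 2010 film Inception.",
    "Inception stars Leonardo DiCaprio as Dom Cobb.",
]

def retrieve(question, passages, k=1):
    """Toy retriever: rank passages by token overlap with the question."""
    q_tokens = set(question.lower().split())
    ranked = sorted(passages,
                    key=lambda p: len(q_tokens & set(p.lower().split())),
                    reverse=True)
    return ranked[:k]

def read(passage):
    """Toy reader: extract the first two-word capitalized span as the answer."""
    words = passage.replace(".", "").split()
    for i in range(len(words) - 1):
        if words[i][0].isupper() and words[i + 1][0].isupper():
            return f"{words[i]} {words[i + 1]}"
    return None

def verify(llm_answer, passage):
    """Toy verifier: accept the LLM answer only if the evidence mentions it."""
    return llm_answer.lower() in passage.lower()

def answer(question, llm_candidate):
    passage = retrieve(question, PASSAGES)[0]
    if verify(llm_candidate, passage):
        return llm_candidate
    return read(passage)
```

In this sketch, an unsupported candidate answer is overridden by what the reader extracts from the retrieved evidence, which mirrors how such a system can supply high-quality answers back to the LLM.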
To learn more, check out the paper.