THINK: Tackling API Hallucinations in LLMs via Injecting Knowledge
Large language models (LLMs) have made significant strides in code generation but often struggle with API hallucination, especially for third-party libraries. Existing approaches attempt to enhance LLMs by incorporating API documentation. However, they face three main challenges: the introduction of irrelevant information that distracts the model; reliance solely on documentation, which results in discrepancies between API descriptions and practical usage; and the absence of comprehensive post-processing mechanisms for errors. To address these challenges, we propose THINK, a knowledge injection method that leverages a custom API knowledge database and operates in two phases: pre-execution retrieval enhancement and post-execution optimization. The former filters out irrelevant information and integrates multiple knowledge sources, while the latter identifies seven API error types and applies three heuristic correction strategies. To evaluate the effectiveness of our method, we manually construct a benchmark by collecting and filtering complex API-related tasks from GitHub. The experimental results demonstrate that our method significantly improves the correctness of API usage by LLMs, reducing the error rate of generated programs from 61.18% to 16.64% for GPT-3.5 and from 41.49% to 5.58% for GPT-4o across tasks involving different libraries.
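At a high level, the two-phase pipeline described above can be pictured with the following minimal sketch. It is an illustrative assumption rather than THINK's actual implementation: the function names (retrieve_api_knowledge, classify_api_error, solve), the toy keyword retrieval, and the handful of error types shown are hypothetical placeholders for the paper's retrieval enhancement, seven error types, and heuristic corrections.

    # Illustrative sketch of a two-phase knowledge-injection loop (hypothetical names).
    # Phase 1 (pre-execution): retrieve relevant API knowledge and inject it into the prompt.
    # Phase 2 (post-execution): run the generated code, classify any API error, and retry with a hint.

    from dataclasses import dataclass

    @dataclass
    class APIEntry:
        name: str           # fully qualified API name, e.g. "pandas.DataFrame.merge"
        signature: str      # signature taken from documentation
        usage_example: str  # usage snippet mined from real projects

    def retrieve_api_knowledge(task: str, knowledge_base: list[APIEntry], top_k: int = 3) -> list[APIEntry]:
        """Toy retrieval: keep entries whose name tokens overlap with the task description."""
        scored = [(sum(tok in task.lower() for tok in e.name.lower().split(".")), e) for e in knowledge_base]
        return [e for score, e in sorted(scored, key=lambda p: -p[0])[:top_k] if score > 0]

    def build_prompt(task: str, entries: list[APIEntry]) -> str:
        knowledge = "\n".join(f"{e.name}{e.signature}\nExample:\n{e.usage_example}" for e in entries)
        return f"Relevant API knowledge:\n{knowledge}\n\nTask:\n{task}\nWrite Python code."

    def classify_api_error(error: Exception) -> str:
        """Map a runtime exception to a coarse API error type (only a few shown here)."""
        if isinstance(error, AttributeError):
            return "nonexistent_api"
        if isinstance(error, TypeError):
            return "wrong_parameters"
        if isinstance(error, ImportError):
            return "missing_dependency"
        return "other"

    def solve(task: str, knowledge_base: list[APIEntry], generate, execute, max_rounds: int = 3) -> str:
        """generate(prompt) -> code; execute(code) raises on failure. Both are supplied by the caller."""
        prompt = build_prompt(task, retrieve_api_knowledge(task, knowledge_base))
        code = ""
        for _ in range(max_rounds):
            code = generate(prompt)
            try:
                execute(code)
                return code
            except Exception as err:  # post-execution optimization: feed the error type back
                prompt += f"\n\nThe previous code failed ({classify_api_error(err)}): {err}\nPlease fix it."
        return code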