Review of Reference Generation Methods in Large Language Models
Abstract. Large Language Models (LLMs) are now central to a wide range of applications, from academic writing and legal analysis to scientific research. Yet one obstacle that has consistently limited their broader adoption is the generation of accurate and verifiable citations. Hallucinated or inaccurate citations erode trust, making reliable citation generation essential. This survey covers notable approaches for improving citation generation in LLMs, including Retrieval-Augmented Generation (RAG), prompt engineering, instruction tuning, and incorporation of external knowledge. We also cover emerging approaches such as multimodal citation generation, which draws on structured data and visual information to improve accuracy. A survey of evaluation metrics, benchmark datasets, and ethical concerns, including bias, risks of misinformation, and transparency, identifies current limitations and possible areas of improvement.