NeurIPS “Hallucinated Citations” Are a Wake-Up Call — What the Official LLM Policy Actually Requires (and a 5-Minute Compliance Routine)
Executive Summary
1. What NeurIPS allows: NeurIPS explicitly permits AI/LLM use in paper preparation and does not require disclosure when LLMs are used only for grammar, spelling, or general language editing.
2. When disclosure is required: If an LLM is an important, original, or non-standard part of the research methodology, its use must be described in the paper.
3. Where the line is drawn: NeurIPS states authors are fully responsible for all content, including references, and lists “using references generated by an LLM without due diligence” as an example of a policy violation.
4. Practical takeaway: Compliance depends less on whether AI was used and more on whether authors can demonstrate reasonable verification, especially for citations.
1. What happened (facts vs. claims)

1.1 GPTZero’s reported findings
In January 2026, GPTZero reported that it scanned 4,841 NeurIPS 2025 accepted papers and identified 100 confirmed hallucinated citations across at least 51 papers (the post uses multiple count phrasings). The investigation includes a publicly shared table, and GPTZero states that flagged entries were verified by a human expert.
1.2 Public context from NeurIPS
NeurIPS program chairs reported 21,575 submissions to the 2025 main track and 5,290 accepted papers (~24.52% acceptance), illustrating the scale at which peer review operates.
Separately, NeurIPS maintains an official Policy on the Use of Large Language Models, which governs how AI tools may be used and what responsibilities remain with authors.
1.3 Important caveats
Independent coverage has emphasized that 100 citations represent a very small fraction of the total citations across thousands of papers, and that citation errors—while serious—do not automatically invalidate an entire study. The policy relevance lies in preventability, not scale.
2. What is a “hallucinated citation”?
These terms are often conflated, so it is worth being precise.
● Hallucinated citation: A citation that cannot be verified as an existing paper, or for which the bibliographic information (authors, title, venue, year) does not match a real, traceable paper.
● Inaccurate or mismatched citation: There is a paper with that citation, but important metadata is wrong (e.g. wrong year, missing authors, wrong venue, broken DOI/URL).
● “Vibe citing” (informal): A citation that may point to a real paper but does not clearly support the claim it is attached to, or that blends bibliographic details from multiple papers.
The NeurIPS policy is not about these labels; it is about author accountability and verification.
3. NeurIPS’ official policy on AI-assisted writing (what is actually allowed)
3.1 Is AI use allowed?
Yes. NeurIPS states that authors are welcome to use any tools, including LLMs, to prepare high-quality papers.
3.2 When is disclosure not required?
NeurIPS is explicit (LLM Policy):
If you use tools (including LLMs) for editing purposes (e.g., checking grammar), you do not need to declare it in your manuscript.
This includes:
● Grammar and spelling checks
● Clarity and language polishing
● Formatting or minor writing assistance
3.3 When must AI use be disclosed?
Disclosure is required only when LLMs are an important, original, or non-standard component of the research methodology. In such cases, the use should be described in the experimental setup or an equivalent section.
Examples include:
● Using an LLM as part of the algorithm or inference pipeline
● Generating training data with an LLM
● Using an LLM to produce experimental outputs that affect conclusions
3.4 Who is responsible for errors?
NeurIPS is unambiguous (LLM Policy):
Authors are responsible for the entire content of the paper, including all text, figures, and references.
The policy explicitly lists the following as a violation example (LLM Policy):
Using references generated by an LLM without conducting due diligence to verify correctness, existence, and appropriateness in context.
4. A practical 5-minute citation check routine
This routine is designed to remove avoidable risk under time pressure.
Step 1: Triage
Highlight references that:
● Lack a DOI or stable URL
● Lead to dead or unrelated links
● Have unusually generic titles/authors
● Are weakly connected to the claim they support
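As a rough illustration of this triage step, here is a minimal Python sketch. The `Reference` fields and the heuristics (no DOI or URL, short generic titles) are our own illustrative assumptions, not an official NeurIPS checklist:

```python
# Hypothetical triage sketch: flag high-risk references for manual checking.
# Fields and heuristics below are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class Reference:
    title: str
    doi: str = ""   # empty string = no DOI available
    url: str = ""   # empty string = no stable URL

GENERIC_TITLE_WORDS = {"survey", "overview", "introduction", "study"}

def triage(refs):
    """Return (reference, reasons) pairs worth a manual existence check."""
    flagged = []
    for ref in refs:
        reasons = []
        if not ref.doi and not ref.url:
            reasons.append("no DOI or stable URL")
        title_words = {w.strip(".,:").lower() for w in ref.title.split()}
        if title_words & GENERIC_TITLE_WORDS and len(title_words) <= 6:
            reasons.append("unusually generic title")
        if reasons:
            flagged.append((ref, reasons))
    return flagged

refs = [
    Reference(title="A Survey of Deep Learning"),  # generic title, no DOI
    Reference(title="Attention Is All You Need",
              doi="10.48550/arXiv.1706.03762"),
]
for ref, reasons in triage(refs):
    print(ref.title, "->", "; ".join(reasons))
```

The point of the heuristics is to shrink the set of references that need a careful manual pass, not to render a verdict.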
Step 2: Existence check
For each highlighted item:
1. Search the exact title
2. Locate a stable publisher, proceedings, or archive page
3. If you cannot reliably find it, treat it as high risk
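The existence check can be partially automated. The sketch below queries the public Crossref REST API (api.crossref.org); the helper names and the exact-match-after-normalization rule are our own illustrative choices, and a non-match should be read as “treat as high risk,” not “definitely fabricated”:

```python
# Sketch of an automated existence check against the public Crossref REST API.
# Function names and the matching rule are illustrative assumptions.
import json
import urllib.parse
import urllib.request

def normalize(title: str) -> str:
    """Lowercase and strip punctuation so titles compare loosely."""
    return "".join(c for c in title.lower() if c.isalnum() or c.isspace()).strip()

def crossref_lookup(title: str, rows: int = 3):
    """Return candidate (title, DOI) pairs from Crossref for a cited title."""
    query = urllib.parse.urlencode({"query.bibliographic": title, "rows": rows})
    url = f"https://api.crossref.org/works?{query}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        items = json.load(resp)["message"]["items"]
    return [(item["title"][0], item["DOI"]) for item in items if item.get("title")]

def looks_real(cited_title: str, candidates) -> bool:
    """True if any candidate title matches the cited title after normalization."""
    target = normalize(cited_title)
    return any(normalize(found) == target for found, _doi in candidates)

# Example (network required):
# candidates = crossref_lookup("Attention Is All You Need")
# print(looks_real("Attention Is All You Need", candidates))
```

Because `crossref_lookup` makes a network call, run it sparingly and confirm every ambiguous case by hand.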
Step 3: Metadata alignment
Confirm that:
● Title, authors, year, and venue match the located record
● DOI or URL resolves to the intended work
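A minimal sketch of the metadata-alignment check, assuming simple flat records for the cited entry and the located record; the field names and the 0.9 similarity threshold are illustrative assumptions:

```python
# Metadata-alignment sketch: compare the bibliography entry against the
# located record. Field names and the 0.9 threshold are illustrative.
from difflib import SequenceMatcher

def similar(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def metadata_mismatches(cited: dict, located: dict, threshold: float = 0.9):
    """Return the field names where the two records disagree."""
    mismatches = []
    for field in ("title", "authors", "venue"):
        if similar(cited.get(field, ""), located.get(field, "")) < threshold:
            mismatches.append(field)
    if cited.get("year") != located.get("year"):
        mismatches.append("year")
    return mismatches

cited = {"title": "Attention Is All You Need", "authors": "Vaswani et al.",
         "venue": "NeurIPS", "year": 2017}
located = {"title": "Attention is all you need", "authors": "Vaswani et al.",
           "venue": "NIPS", "year": 2017}
print(metadata_mismatches(cited, located))
```

Here “NIPS” vs. “NeurIPS” is flagged even though it is the same venue under its older name, which is the right bias: a tool should over-flag and let the author decide.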
Step 4: Context fit
For each reference, write one sentence explaining why it supports the claim it is attached to, e.g., “This paper is cited here because it demonstrates X under Y conditions, which directly supports the claim.”
If you cannot do this confidently, revise or replace the citation.
5. Tool-assisted triage (optional)
Citation-checking tools can help flag dead links or suspicious metadata as a first pass, but they do not replace author verification. Used correctly, they simply help prioritize where human review is most needed, consistent with NeurIPS’ emphasis on author responsibility.
6. Compliance in one paragraph (for authors)
NeurIPS welcomes AI-assisted writing and does not require disclosure when LLMs are used only for language editing. Disclosure is required only if AI is an integral part of the research methodology. Authors remain fully accountable for ensuring that every reference is correct, exists, and is contextually relevant, regardless of any tool that was used. A lack of due diligence, especially where a citation is AI-generated, would be a policy violation.
FAQ — NeurIPS “Hallucinated Citations” & AI Writing Policy
1. What does “hallucinated citation” actually mean in practice?
A hallucinated citation is a reference that cannot be verified as a real, traceable work, or whose bibliographic details (authors, title, venue, year, DOI) do not match any existing publication. This is distinct from minor formatting errors. NeurIPS policy focuses on whether authors verified the existence and correctness of cited works, not on the label used to describe the error.
2. Are hallucinated citations the same as simple citation mistakes?
No. A simple citation mistake usually involves a typo, missing field, or broken link to a real paper. A hallucinated citation involves fabricated or substantially incorrect bibliographic information that prevents the source from being reliably located. Both are problematic, but hallucinated citations raise stronger concerns about due diligence.
3. Does NeurIPS allow authors to use AI tools like ChatGPT when writing papers?
Yes. NeurIPS explicitly allows authors to use AI and large language models when preparing papers. The policy does not ban AI-assisted writing. Instead, it defines when disclosure is required and emphasizes that authors remain responsible for all content.
4. Do I need to disclose AI use if I only used it for grammar or language editing?
No. According to the NeurIPS 2025 LLM Policy, if AI tools are used only for editing purposes—such as grammar checks, spelling correction, or general language polishing—disclosure in the manuscript is not required.
5. When is AI usage required to be disclosed under NeurIPS policy?
Disclosure is required only when an LLM is an important, original, or non-standard component of the research methodology. In such cases, the use of the LLM must be described in the experimental setup or an equivalent section of the paper.
6. If AI generated a citation, is that automatically a policy violation?
No. The policy does not prohibit AI-generated citations. However, NeurIPS explicitly lists “using references generated by an LLM without conducting due diligence to verify correctness, existence, and appropriateness in context” as an example of a violation. The issue is not generation, but failure to verify.
7. Who is responsible if a paper contains a hallucinated citation?
The authors are fully responsible. NeurIPS states that authors are responsible for the entire content of the paper, including text, figures, and references, regardless of what tools were used in preparation. AI tools cannot be listed as authors and do not carry responsibility.
8. Does the presence of a hallucinated citation invalidate an entire paper?
Not necessarily. Independent commentary has emphasized that citation errors represent a small fraction of total citations and do not automatically invalidate research results. However, such errors are preventable and may trigger closer scrutiny or policy action if they indicate a lack of due diligence.
9. How does NeurIPS typically expect authors to demonstrate “due diligence”?
NeurIPS does not mandate a specific workflow. In practice, due diligence means that authors can reasonably demonstrate that cited works exist, that their metadata is accurate, and that each reference supports the claim it is attached to. A short, systematic verification process is generally sufficient.
10. Is checking every citation manually required?
There is no explicit requirement to manually check every citation, but authors are expected to take reasonable steps to verify their references. Under time pressure, prioritizing high-risk citations (recent papers, AI-generated references, missing DOIs) is considered a practical and defensible approach.
11. Can citation-checking tools replace manual verification?
No. Tools can assist with triage—such as flagging dead links or inconsistent metadata—but they do not replace author responsibility. NeurIPS policy places accountability on authors, not on tools.
12. Is NeurIPS’ approach stricter than other major conferences?
NeurIPS’ policy is consistent with a broader trend across major computer science conferences: AI-assisted writing is generally allowed, disclosure is required only when AI affects methodology, and authors retain full responsibility for accuracy and integrity. What differs across venues is wording, not the core principle of accountability.
13. What is the safest compliance mindset for authors using AI?
The safest approach is to treat AI as an assistant, not an authority. Use AI freely for drafting and language polishing if permitted, but personally verify all factual claims and references. Under NeurIPS policy, compliance is demonstrated through verification, not avoidance of tools.
14. What is the minimum I should do before submitting to stay compliant?
At minimum, authors should verify that each cited work exists, that its bibliographic information is accurate, and that it genuinely supports the statement it accompanies. A short, documented citation check before submission removes one of the most common and avoidable compliance risks.

