Citation Presence¶
Heuristic · Knowledge · Multi-Turn
At a Glance¶
**Score Range:** 0.0 or 1.0 (binary pass/fail)

**Default Threshold:** 0.5 (pass/fail cutoff)

**Required Inputs:** `actual_output` (optional: `conversation`)
What It Measures
Citation Presence evaluates whether AI responses include properly formatted citations: URLs, DOIs, or academic references. It supports both single-turn responses and multi-turn conversations.
| Score | Interpretation |
|---|---|
| 1.0 | Citations present (in at least one message) |
| 0.0 | No citations found |
**Use when:**

- Requiring sourced responses
- Building research assistants
- Enforcing citation policies
- Validating knowledge retrieval

**Avoid when:**

- Citations aren't required
- Checking citation accuracy (use Faithfulness)
- Creative/generative tasks
- Simple Q&A without sources
Citation Presence vs Faithfulness
Citation Presence checks: "Are citations included?" Faithfulness checks: "Is the content accurate to the source?"
Use Citation Presence for format compliance; use Faithfulness for content verification.
How It Works
The metric extracts citations using regex patterns and evaluates based on the configured mode.
Step-by-Step Process¶
```mermaid
flowchart TD
    subgraph INPUT["Input"]
        A[Response Text]
        B[Mode Setting]
    end
    subgraph EXTRACT["Step 1: Extract Citations"]
        C[Run citation patterns]
        D1["HTTP/HTTPS URLs"]
        D2["DOI references"]
        D3["Academic citations"]
    end
    subgraph EVALUATE["Step 2: Mode-Based Evaluation"]
        E{Mode?}
        F["any_citation: Any URL/DOI found?"]
        G["resource_section: Section with citations?"]
    end
    subgraph OUTPUT["Result"]
        H["1.0 = Pass"]
        I["0.0 = Fail"]
    end
    A & B --> C
    C --> D1 & D2 & D3
    D1 & D2 & D3 --> E
    E -->|any_citation| F
    E -->|resource_section| G
    F & G -->|Yes| H
    F & G -->|No| I
    style INPUT stroke:#f59e0b,stroke-width:2px
    style EXTRACT stroke:#3b82f6,stroke-width:2px
    style EVALUATE stroke:#8b5cf6,stroke-width:2px
    style OUTPUT stroke:#10b981,stroke-width:2px
```
| Format | Pattern | Example |
|---|---|---|
| HTTP/HTTPS URLs | `https?://...` | `https://docs.python.org/3/` |
| WWW URLs | `www.domain.com` | `www.wikipedia.org` |
| DOI References | `doi:10.xxxx/...` | `doi:10.1000/xyz123` |
| Academic | `(Author, Year)` | `(Smith et al., 2023)` |
**`any_citation` (default):** Pass if any citation appears anywhere in the response.

**`resource_section`:** Pass only if citations appear in a dedicated Resources/References section.
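As a sketch, the extraction step might look like the following. These regexes are illustrative only; the library's actual patterns are not documented here and may differ:

```python
import re

# Illustrative patterns only; the library's actual regexes may differ.
CITATION_PATTERNS = [
    re.compile(r"https?://\S+"),                             # HTTP/HTTPS URLs
    re.compile(r"\bwww\.[\w-]+(?:\.[\w-]+)+\S*"),            # bare www URLs
    re.compile(r"\bdoi:\s*10\.\d{4,9}/\S+", re.IGNORECASE),  # DOI references
    re.compile(r"\([A-Z][\w-]+(?: et al\.)?,\s*\d{4}\)"),    # (Author, Year)
]

def extract_citations(text: str) -> list[str]:
    """Return all citation-like substrings found in text."""
    found: list[str] = []
    for pattern in CITATION_PATTERNS:
        found.extend(pattern.findall(text))
    return found
```

Under `any_citation`, a non-empty result from a step like this is enough to pass; `resource_section` additionally constrains where the matches occur.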
Configuration¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `mode` | `str` | `any_citation` | Evaluation mode: `any_citation` or `resource_section` |
| `strict` | `bool` | `False` | If `True`, validates URLs are live |
| `use_semantic_search` | `bool` | `False` | Use embeddings for fallback detection |
| `embed_model` | `EmbeddingRunnable` | `None` | Embedding model (required if semantic search enabled) |
| `resource_similarity_threshold` | `float` | `0.8` | Threshold for semantic matching |
| `custom_resource_phrases` | `List[str]` | `None` | Custom phrases to identify resource sections |
Strict Mode
When `strict=True`, the metric validates that URLs are live by making HEAD requests. This ensures citations point to actual resources but adds latency.
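A liveness check along those lines might look like this sketch; the library's actual strict-mode behavior, timeouts, and error handling are not documented here:

```python
import urllib.error
import urllib.request

def url_is_live(url: str, timeout: float = 5.0) -> bool:
    """Best-effort liveness check via an HTTP HEAD request (illustrative sketch)."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # HTTPError (4xx/5xx) is a URLError subclass, so it falls
            # through to the except branch below.
            return response.status < 400
    except (urllib.error.URLError, ValueError):
        return False
```

Because each check is a network round trip, validating many citations this way can add noticeable latency, which is why strict mode is off by default.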
Code Examples¶
```python
from axion.metrics import CitationPresence
from axion._core.schema import DatasetItem  # import path assumed

metric = CitationPresence()

item = DatasetItem(
    actual_output="Python is a great programming language for beginners.",
)

result = await metric.execute(item)
print(result.score)  # 0.0 - no citations
print(result.explanation)
# "Mode: any_citation. FAILURE: No assistant message satisfied the citation requirement."
```
```python
from axion.metrics import CitationPresence

# Require citations in a dedicated section
metric = CitationPresence(mode='resource_section')

item = DatasetItem(
    actual_output="""
    Python is versatile and beginner-friendly.

    For More Information:
    - https://docs.python.org/3/
    - https://realpython.com/
    """,
)

result = await metric.execute(item)
print(result.score)  # 1.0 - resource section with citations
```
```python
from axion.metrics import CitationPresence
from axion._core.schema import Conversation, HumanMessage, AIMessage

metric = CitationPresence()

item = DatasetItem(
    actual_output="",  # Will check conversation instead
    conversation=Conversation(messages=[
        HumanMessage(content="What is Python?"),
        AIMessage(content="Python is a programming language."),
        HumanMessage(content="Where can I learn more?"),
        AIMessage(content="Check out https://python.org and https://realpython.com"),
    ]),
)

result = await metric.execute(item)
print(result.score)  # 1.0 - citation in second AI message
print(result.signals.messages_with_citations)  # [3] (index of 2nd AI message)
```
Metric Diagnostics¶
Every evaluation is fully interpretable. Access detailed diagnostic results via result.signals.
```python
result = await metric.execute(item)

print(result.pretty())  # Human-readable summary
result.signals          # Full diagnostic breakdown
```
CitationPresenceResult Structure

```python
CitationPresenceResult(
    {
        "passes_presence_check": True,
        "total_assistant_messages": 2,
        "messages_with_citations": [3]  # 0-indexed message positions
    }
)
```
Signal Fields¶
| Field | Type | Description |
|---|---|---|
| `passes_presence_check` | `bool` | Whether the citation requirement was met |
| `total_assistant_messages` | `int` | Number of AI messages evaluated |
| `messages_with_citations` | `List[int]` | Indices of messages with valid citations |
Example Scenarios¶
✅ Scenario 1: URL Citation (Score: 1.0)
HTTP URL Found
Output:
"Machine learning is a subset of AI. See https://scikit-learn.org for tutorials."
Citations Detected: https://scikit-learn.org
Final Score: 1.0
✅ Scenario 2: Academic Citation (Score: 1.0)
Author-Year Format
Output:
"Attention mechanisms transformed NLP (Vaswani et al., 2017)."
Citations Detected: (Vaswani et al., 2017)
Final Score: 1.0
❌ Scenario 3: No Citations (Score: 0.0)
Missing Citations
Output:
"Deep learning uses neural networks with multiple layers to process data."
Citations Detected: None
Final Score: 0.0
⚠️ Scenario 4: Resource Section Required
Wrong Mode
Mode: resource_section
Output:
"Python documentation is at https://python.org which explains everything."
Analysis: URL exists but not in a resource section.
Final Score: 0.0
**Fix:** Switch to `any_citation` mode or add a dedicated Resources section.
Why It Matters¶
Ensure AI outputs provide proper attribution to sources.
Enforce citation standards for academic or research applications.
Verify responses meet organizational citation requirements.
Quick Reference¶
TL;DR
Citation Presence = Does the response include citations?
- Use it when: Requiring sourced responses or research assistants
- Score interpretation: 1.0 = citations found, 0.0 = none
- Key config: `mode` determines where citations must appear
API Reference

Related Metrics

Faithfulness · Contextual Relevancy · Answer Relevancy