Search API Reference

Search and retrieval integrations for augmenting evaluation datasets.

from axion.search import GoogleRetriever, TavilyRetriever, YouRetriever

GoogleRetriever

Google search integration, backed by SerpAPI, for web search and retrieval.

TavilyRetriever

Tavily AI search API for research-optimized web retrieval with AI summaries.

YouRetriever

You.com search API integration for AI-focused web search results.

GoogleRetriever

axion.search.GoogleRetriever

GoogleRetriever(api_key: Optional[str] = None, num_web_results: int = 5, crawl_pages: bool = False, max_crawl_tokens: Optional[int] = 10000, **kwargs)

Bases: BaseRetriever

Retriever that uses SerpAPI to perform Google searches and format the results into nodes.

Initialize the GoogleRetriever.

Parameters:

api_key (Optional[str], default: None ) –

SerpAPI key used for authenticating requests. Defaults to the value of the 'SERPAPI_KEY' environment variable if not provided.
num_web_results (int, default: 5 ) –

Number of top search results to return (maximum 20).
crawl_pages (bool, default: False ) –

Whether to fetch and clean full page content from URLs in the search results.
max_crawl_tokens (Optional[int], default: 10000 ) –

Maximum number of tokens to crawl per page (if crawling is enabled).
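
The constraints above (the 20-result cap and the SERPAPI_KEY fallback) can be checked before the retriever is constructed. A minimal sketch, assuming a hypothetical helper `google_retriever_kwargs` that is not part of axion; the lower bound of 1 on `num_web_results` is also an assumption:

```python
import os
from typing import Any, Dict, Optional


def google_retriever_kwargs(
    num_web_results: int = 5,
    crawl_pages: bool = False,
    max_crawl_tokens: Optional[int] = 10000,
    api_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate and assemble keyword arguments (hypothetical helper, not in axion)."""
    # The reference documents a maximum of 20; the minimum of 1 is assumed here.
    if not 1 <= num_web_results <= 20:
        raise ValueError("num_web_results must be between 1 and 20")
    # Mirror the documented fallback: use SERPAPI_KEY when no key is passed.
    key = api_key or os.environ.get("SERPAPI_KEY")
    if not key:
        raise ValueError("pass api_key or set the SERPAPI_KEY environment variable")
    return {
        "api_key": key,
        "num_web_results": num_web_results,
        "crawl_pages": crawl_pages,
        "max_crawl_tokens": max_crawl_tokens,
    }
```

The returned dictionary is meant to be unpacked into the constructor, for example `GoogleRetriever(**google_retriever_kwargs(num_web_results=10))`.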

retrieve `async`

retrieve(query: str) -> SearchResults

Perform a search query and return results.

Parameters:

query (str) –

Query to search.

Returns:

SearchResults –

A list of nodes with associated scores.
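
Because `retrieve` is a coroutine, it has to be awaited inside an event loop. The sketch below shows one way to run several queries concurrently. `EchoRetriever` is a hypothetical stand-in so the example runs without a network connection or API key; any object with the documented `retrieve(query)` coroutine, such as a `GoogleRetriever` instance, could be passed instead. The return value is treated as opaque, since the fields of `SearchResults` are not described here.

```python
import asyncio
from typing import Any, List, Protocol


class Retriever(Protocol):
    """Structural type for anything exposing the documented async retrieve()."""

    async def retrieve(self, query: str) -> Any: ...


class EchoRetriever:
    """Hypothetical stand-in retriever so the sketch runs offline."""

    async def retrieve(self, query: str) -> List[str]:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [f"result for {query!r}"]


async def retrieve_all(retriever: Retriever, queries: List[str]) -> List[Any]:
    """Run several queries concurrently; results come back in input order."""
    return await asyncio.gather(*(retriever.retrieve(q) for q in queries))


if __name__ == "__main__":
    results = asyncio.run(retrieve_all(EchoRetriever(), ["axion", "retrieval"]))
    print(results)
```

`asyncio.gather` preserves the order of its inputs, so each result lines up with the query at the same index.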

TavilyRetriever

axion.search.TavilyRetriever

TavilyRetriever(api_key: Optional[str] = None, endpoint: Literal['search', 'extract', 'crawl'] = 'search', search_depth: Literal['basic', 'advanced'] = 'basic', topic: Optional[str] = 'general', max_results: Optional[int] = 5, crawl_pages: bool = False, max_crawl_tokens: Optional[int] = 10000, days: Optional[int] = None, include_answer: bool = False, include_raw_content: bool = False, include_images: bool = False, include_image_descriptions: bool = False, include_domains: Optional[List[str]] = None, exclude_domains: Optional[List[str]] = None, extract_depth: Literal['basic', 'advanced'] = 'basic', max_depth: Optional[int] = 1, max_breadth: Optional[int] = 20, limit: Optional[int] = 50, instructions: Optional[str] = None, select_paths: Optional[List[str]] = None, select_domains: Optional[List[str]] = None, exclude_paths: Optional[List[str]] = None, allow_external: bool = False, categories: Optional[List[str]] = None, **kwargs)

Bases: BaseRetriever

Retriever for Tavily's Search, Extract, and Crawl APIs.

Initialize the TavilyRetriever with specified parameters.

retrieve `async`

retrieve(query: str) -> SearchResults

Retrieve results using the Tavily Search API. This method handles only the 'search' endpoint; for 'extract' and 'crawl', call the extract and crawl methods directly.

extract_url_text

extract_url_text(url: str) -> str

Extract content from a single URL using the Extract API.
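
Unlike the other methods on this class, `extract_url_text` is not marked `async`. Assuming that means it is an ordinary blocking call, one way to use it from asynchronous code without stalling the event loop is to run it in a worker thread. `FakeTavily` below is a hypothetical stand-in with the same method name, used only so the sketch runs offline:

```python
import asyncio
from typing import Any


class FakeTavily:
    """Hypothetical stand-in exposing a synchronous extract_url_text()."""

    def extract_url_text(self, url: str) -> str:
        return f"text from {url}"


async def extract_text_off_loop(retriever: Any, url: str) -> str:
    """Run the synchronous extract_url_text in a worker thread (Python 3.9+)."""
    return await asyncio.to_thread(retriever.extract_url_text, url)


if __name__ == "__main__":
    print(asyncio.run(extract_text_off_loop(FakeTavily(), "https://example.com")))
```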

extract `async`

extract(url: str) -> SearchResults

Extract content from a URL using the Extract API.

crawl `async`

crawl(url: str) -> SearchResults

Crawl content from a URL using the Crawl API.
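
Since `retrieve` covers only the 'search' endpoint, code that is driven by an endpoint name needs to pick the matching method itself. A minimal sketch of that dispatch, assuming a hypothetical helper `run_endpoint` and a hypothetical stand-in class `FakeTavily` that mimics the three documented coroutines:

```python
import asyncio
from typing import Any, Tuple


class FakeTavily:
    """Hypothetical stand-in with the three documented async methods."""

    async def retrieve(self, query: str) -> Tuple[str, str]:
        return ("search", query)

    async def extract(self, url: str) -> Tuple[str, str]:
        return ("extract", url)

    async def crawl(self, url: str) -> Tuple[str, str]:
        return ("crawl", url)


async def run_endpoint(retriever: Any, endpoint: str, target: str) -> Any:
    """Call the method that matches the endpoint name (hypothetical helper)."""
    methods = {
        "search": retriever.retrieve,  # target is a query string
        "extract": retriever.extract,  # target is a URL
        "crawl": retriever.crawl,      # target is a URL
    }
    if endpoint not in methods:
        raise ValueError(f"unknown endpoint: {endpoint!r}")
    return await methods[endpoint](target)


if __name__ == "__main__":
    print(asyncio.run(run_endpoint(FakeTavily(), "crawl", "https://example.com")))
```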

YouRetriever

axion.search.YouRetriever

YouRetriever(api_key: Optional[str] = None, endpoint: Literal['search', 'news'] = 'search', num_web_results: Optional[int] = 5, crawl_pages: bool = False, max_crawl_tokens: Optional[int] = 10000, safesearch: Optional[Literal['off', 'moderate', 'strict']] = None, country: Optional[str] = None, search_lang: Optional[str] = None, ui_lang: Optional[str] = None, spellcheck: Optional[bool] = None, **kwargs)

Bases: BaseRetriever

Retriever for You.com's Search and News APIs.

Initialize the YouRetriever.

Parameters:

api_key (Optional[str], default: None ) –

You.com API key. If not provided, the retriever falls back to the YDC_API_KEY environment variable.
callback_manager (Optional[CallbackManager]) –

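Optional manager for handling callback events during retrieval. This parameter does not appear in the constructor signature above, so it is presumably accepted through **kwargs.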
endpoint (Literal['search', 'news'], default: 'search' ) –

The You.com API endpoint to query: "search" for web results or "news" for news-specific content.
num_web_results (Optional[int], default: 5 ) –

Maximum number of search results to return. Must not exceed 20.
crawl_pages (bool, default: False ) –

Whether to crawl and extract the content of the linked pages from the search results.
max_crawl_tokens (Optional[int], default: 10000 ) –

Maximum number of tokens to retrieve per page when crawling. If None, a default internal value is used.
safesearch (Optional[Literal['off', 'moderate', 'strict']], default: None ) –

Safe search filtering level. Defaults to "moderate" if not specified.
country (Optional[str], default: None ) –

Country code for geo-specific search behavior (e.g., "US" for United States).
search_lang (Optional[str], default: None ) –

Language code to use for the search query (e.g., "en" for English).
ui_lang (Optional[str], default: None ) –

Language code for the UI/localized response (e.g., "en").
spellcheck (Optional[bool], default: None ) –

Whether to enable spell check for the query. Defaults to True if unspecified.
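
Several of these options accept only a fixed set of values, and most default to None so that the service-side default applies. A minimal sketch of validating them up front, assuming a hypothetical helper `you_retriever_kwargs` that is not part of axion:

```python
from typing import Any, Dict, Optional

_ENDPOINTS = {"search", "news"}
_SAFESEARCH = {"off", "moderate", "strict"}


def you_retriever_kwargs(
    endpoint: str = "search",
    num_web_results: Optional[int] = 5,
    safesearch: Optional[str] = None,
    country: Optional[str] = None,
    search_lang: Optional[str] = None,
    ui_lang: Optional[str] = None,
    spellcheck: Optional[bool] = None,
) -> Dict[str, Any]:
    """Validate options and drop unset ones (hypothetical helper, not in axion)."""
    if endpoint not in _ENDPOINTS:
        raise ValueError(f"endpoint must be one of {sorted(_ENDPOINTS)}")
    if num_web_results is not None and num_web_results > 20:
        raise ValueError("num_web_results must not exceed 20")
    if safesearch is not None and safesearch not in _SAFESEARCH:
        raise ValueError(f"safesearch must be one of {sorted(_SAFESEARCH)}")
    options = {
        "endpoint": endpoint,
        "num_web_results": num_web_results,
        "safesearch": safesearch,
        "country": country,
        "search_lang": search_lang,
        "ui_lang": ui_lang,
        "spellcheck": spellcheck,
    }
    # Leave out unset options so the documented defaults apply.
    return {name: value for name, value in options.items() if value is not None}
```

The result can be unpacked into the constructor, for example `YouRetriever(**you_retriever_kwargs(endpoint="news", country="US"))`.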

retrieve `async`

retrieve(query: str) -> SearchResults

Perform a search query and return results using You.com API.

Parameters:

query (str) –

Query to search.

Returns:

SearchResults –

A list of nodes with associated scores.
