toolregistry.hub.websearch.websearch_searxng module
- class toolregistry.hub.websearch.websearch_searxng.WebSearchSearxng(searxng_base_url: str, proxy: str | None = None)
Bases: WebSearchGeneral
WebSearchSearxng provides a unified interface for performing web searches and processing results through a SearxNG instance. It handles search queries, result filtering, and content extraction.
Features:
- Performs web searches using a SearxNG instance
- Filters results by relevance score threshold
- Extracts and cleans webpage content using multiple methods (BeautifulSoup/Jina Reader)
- Fetches results in parallel
- Automatically removes emoji and normalizes text
Examples
>>> from toolregistry.hub.websearch.websearch_searxng import WebSearchSearxng
>>> searcher = WebSearchSearxng("http://localhost:8080")
>>> results = searcher.search("python web scraping", number_results=3)
>>> for result in results:
...     print(result["title"])
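The threshold filtering named in the feature list can be pictured with a minimal sketch. This is not the library's implementation: `filter_by_score` is a hypothetical helper, and it assumes each raw result is a dictionary carrying a numeric `score` field, which the reference above does not confirm.

```python
from typing import Any, Dict, List


def filter_by_score(
    raw_results: List[Dict[str, Any]],
    threshold: float = 0.2,
    number_results: int = 5,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: keep results scoring at or above the threshold."""
    # Assumption: every raw result exposes a numeric "score"; a missing
    # score is treated as 0.0 and therefore dropped for any positive threshold.
    kept = [r for r in raw_results if float(r.get("score", 0.0)) >= threshold]
    # Cap the list at the requested maximum, preserving the original order.
    return kept[:number_results]


# Made-up sample data, for illustration only.
raw = [
    {"title": "A", "score": 0.9},
    {"title": "B", "score": 0.1},
    {"title": "C", "score": 0.4},
]
top = filter_by_score(raw, threshold=0.2, number_results=5)
print([r["title"] for r in top])
```

The default values mirror the documented defaults of `search` (`threshold=0.2`, `number_results=5`); how the class actually scores and orders results may differ.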
- __init__(searxng_base_url: str, proxy: str | None = None)
Initialize WebSearchSearxng with configuration parameters.
- Parameters:
searxng_base_url (str) – Base URL for the SearxNG instance (e.g. “http://localhost:8080”).
proxy (Optional[str]) – Proxy URL for HTTP requests. Defaults to None.
- search(query: str, number_results: int = 5, threshold: float = 0.2, timeout: float | None = None) → List[Dict[str, str]]
Perform search and return results.
- Parameters:
query (str) – The search query. Boolean operators like AND, OR, NOT can be used if needed.
number_results (int, optional) – The maximum number of results to return. Defaults to 5.
threshold (float, optional) – Minimum relevance score a result must reach to be kept, in the range [0, 1.0]. Defaults to 0.2.
timeout (float, optional) – Request timeout in seconds. Defaults to TIMEOUT_DEFAULT (10). Usually not needed.
- Returns:
A list of enriched search results. Each dictionary contains:
- ‘title’: The title of the search result.
- ‘url’: The URL of the search result.
- ‘content’: The content of the search result.
- ‘excerpt’: The excerpt of the search result.
- Return type:
List[Dict[str, str]]
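As a rough illustration of consuming the documented return value, the sketch below formats a list of result dictionaries using only the four keys listed above. `format_results` and `FakeSearcher` are hypothetical names introduced here, and the sample data is made up; `FakeSearcher` merely stands in for a real `WebSearchSearxng` so the snippet runs without a SearxNG instance.

```python
from typing import Dict, List, Optional


def format_results(results: List[Dict[str, str]], excerpt_chars: int = 80) -> List[str]:
    """Hypothetical helper: render each result as a single numbered line."""
    lines = []
    for i, result in enumerate(results, start=1):
        # Only the documented keys are read; missing keys fall back to "".
        excerpt = result.get("excerpt", "")[:excerpt_chars]
        lines.append(f"{i}. {result.get('title', '')} <{result.get('url', '')}>: {excerpt}")
    return lines


class FakeSearcher:
    """Hypothetical stand-in that mimics the documented search() signature."""

    def search(
        self,
        query: str,
        number_results: int = 5,
        threshold: float = 0.2,
        timeout: Optional[float] = None,
    ) -> List[Dict[str, str]]:
        # Made-up sample data shaped like the documented return value.
        sample = [
            {
                "title": "Example page",
                "url": "https://example.com/a",
                "content": "Full page text.",
                "excerpt": "Short summary.",
            },
            {
                "title": "Another page",
                "url": "https://example.com/b",
                "content": "More page text.",
                "excerpt": "Another summary.",
            },
        ]
        return sample[:number_results]


# With a running instance, this would instead be something like:
#   searcher = WebSearchSearxng("http://localhost:8080")
searcher = FakeSearcher()
lines = format_results(searcher.search("python web scraping", number_results=1))
for line in lines:
    print(line)
```

Reading keys with `.get` keeps the formatter tolerant of a result that lacks one of the fields, which is a defensive choice rather than something the reference requires.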