toolregistry.hub.websearch.websearch_searxng module

class toolregistry.hub.websearch.websearch_searxng.WebSearchSearxng(searxng_base_url: str, proxy: str | None = None)

Bases: WebSearchGeneral

WebSearchSearxng provides a unified interface for performing web searches and processing results through a SearxNG instance. It handles search queries, result filtering, and content extraction.

Features:
  • Performs web searches using a SearxNG instance

  • Filters results by relevance score threshold

  • Extracts and cleans webpage content using multiple methods (BeautifulSoup/Jina Reader)

  • Fetches results in parallel

  • Removes emoji and normalizes text automatically
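The parallel fetching and text cleanup listed above can be pictured with a minimal, self-contained sketch. It is not the library's implementation: the names `clean_text` and `fetch_all`, the emoji code-point ranges, the use of NFKC normalization, and the thread-pool approach are all assumptions made for illustration.

```python
import re
import unicodedata
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Assumption: a coarse set of emoji ranges; the library's own pattern may differ.
_EMOJI_RE = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, supplemental symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F"                 # variation selector-16
    "]+"
)


def clean_text(text: str) -> str:
    """Normalize Unicode, drop emoji, and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = _EMOJI_RE.sub("", text)
    return re.sub(r"\s+", " ", text).strip()


def fetch_all(
    urls: List[str],
    fetch: Callable[[str], str],  # hypothetical fetcher: URL in, page text out
    max_workers: int = 4,
) -> Dict[str, str]:
    """Fetch every URL concurrently and return {url: cleaned text}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs the calls in parallel but yields results in input order.
        pages = list(pool.map(fetch, urls))
    return {url: clean_text(page) for url, page in zip(urls, pages)}
```

Passing the fetcher in as a parameter keeps the sketch independent of any particular extraction backend, so a BeautifulSoup-based or Jina Reader-based function could be swapped in.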

Examples

>>> from toolregistry.hub.websearch.websearch_searxng import WebSearchSearxng
>>> searcher = WebSearchSearxng("http://localhost:8080")
>>> results = searcher.search("python web scraping", number_results=3)
>>> for result in results:
...     print(result["title"])

__init__(searxng_base_url: str, proxy: str | None = None)

Initialize WebSearchSearxng with configuration parameters.

Parameters:
  • searxng_base_url (str) – Base URL for the SearxNG instance (e.g. "http://localhost:8080").

  • proxy (Optional[str]) – Proxy URL for HTTP requests. Defaults to None.

search(query: str, number_results: int = 5, threshold: float = 0.2, timeout: float | None = None) → List[Dict[str, str]]

Perform search and return results.

Parameters:
  • query (str) – The search query. Boolean operators like AND, OR, NOT can be used if needed.

  • number_results (int, optional) – The maximum number of results to return. Defaults to 5.

  • threshold (float, optional) – Minimum score threshold for results [0-1.0]. Defaults to 0.2.

  • timeout (float, optional) – Request timeout in seconds. When omitted (None), TIMEOUT_DEFAULT (10) applies. Usually not needed.
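How number_results and threshold might interact can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the function name `select_results`, the "score" key on each raw result, the inclusive comparison, and the sort by descending score are all hypothetical.

```python
from typing import Any, Dict, List


def select_results(
    raw: List[Dict[str, Any]],
    number_results: int = 5,
    threshold: float = 0.2,
) -> List[Dict[str, Any]]:
    """Keep results scoring at least `threshold`, best first, capped at `number_results`."""
    # Assumption: each raw result carries a numeric "score"; a missing score counts as 0.
    kept = [r for r in raw if float(r.get("score", 0.0)) >= threshold]
    kept.sort(key=lambda r: float(r.get("score", 0.0)), reverse=True)
    return kept[:number_results]
```

Under these assumptions, raising threshold can return fewer than number_results items, because filtering happens before the cap is applied.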

Returns:

A list of enriched search results. Each dictionary contains:
  • 'title' – The title of the search result.

  • 'url' – The URL of the search result.

  • 'content' – The content of the search result.

  • 'excerpt' – The excerpt of the search result.

Return type:

List[Dict[str, str]]
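A short sketch of consuming the returned list, relying only on the four documented keys. The helper name `format_results`, the `max_chars` parameter, and the choice to prefer 'excerpt' over 'content' are assumptions for illustration.

```python
from typing import Dict, List


def format_results(results: List[Dict[str, str]], max_chars: int = 80) -> str:
    """Render search results as a numbered plain-text list."""
    lines = []
    for i, r in enumerate(results, start=1):
        # Prefer the excerpt; fall back to the full content when it is empty.
        summary = r.get("excerpt") or r.get("content", "")
        if len(summary) > max_chars:
            summary = summary[:max_chars].rstrip() + "..."
        lines.append(f"{i}. {r.get('title', '')} <{r.get('url', '')}>\n   {summary}")
    return "\n".join(lines)
```

With a live instance, the value returned by search() could be passed straight to this helper, for example `print(format_results(searcher.search("python web scraping")))`.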