Knowledge Query Tools

This section documents tools for information retrieval and knowledge querying in DeepCritical.

Overview

Knowledge Query tools provide capabilities for retrieving information from various knowledge sources, including web search, databases, and structured knowledge bases.

Available Tools

Web Search Tools

WebSearchTool

Performs web searches and retrieves relevant information.

Location: DeepResearch.src.tools.websearch_tools.WebSearchTool

Capabilities:

  - Multi-engine search (Google, DuckDuckGo, Bing)
  - Content extraction and summarization
  - Relevance filtering
  - Result ranking and deduplication

Usage:

from DeepResearch.src.tools.websearch_tools import WebSearchTool

tool = WebSearchTool()
result = await tool.run({
    "query": "machine learning applications",
    "num_results": 10,
    "engines": ["google", "duckduckgo"]
})

Parameters:

  - query: Search query string
  - num_results: Number of results to return (default: 10)
  - engines: List of search engines to use
  - max_age_days: Maximum age of results in days
  - language: Language for search results
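
The optional parameters are passed in the same run payload. A minimal sketch, reusing the tool instance from the usage example above; the value formats for max_age_days and language are assumptions based on the parameter list:

result = await tool.run({
    "query": "machine learning applications",
    "num_results": 15,
    "engines": ["google", "bing"],
    "max_age_days": 365,  # assumed: only return results from the last year
    "language": "en"      # assumed: language code for results
})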

ChunkedSearchTool

Performs chunked searches for large query sets.

Location: DeepResearch.src.tools.websearch_tools.ChunkedSearchTool

Capabilities:

  - Large-scale search operations
  - Query chunking and parallel processing
  - Result aggregation and deduplication
  - Memory-efficient processing

Usage:

from DeepResearch.src.tools.websearch_tools import ChunkedSearchTool

tool = ChunkedSearchTool()
result = await tool.run({
    "queries": ["query1", "query2", "query3"],
    "chunk_size": 5,
    "max_concurrent": 3
})

Database Query Tools

DatabaseQueryTool

Executes queries against structured databases.

Location: DeepResearch.src.tools.database_tools.DatabaseQueryTool

Capabilities:

  - SQL query execution
  - Result formatting and validation
  - Connection management
  - Query optimization

Supported Databases:

  - PostgreSQL
  - MySQL
  - SQLite
  - Neo4j (graph database)

Usage:

from DeepResearch.src.tools.database_tools import DatabaseQueryTool

tool = DatabaseQueryTool()
result = await tool.run({
    "connection_string": "postgresql://user:pass@localhost/db",
    "query": "SELECT * FROM research_data WHERE topic = %s",
    "parameters": ["machine_learning"],
    "max_rows": 1000
})

Knowledge Base Tools

KnowledgeBaseQueryTool

Queries structured knowledge bases and ontologies.

Location: DeepResearch.src.tools.knowledge_base_tools.KnowledgeBaseQueryTool

Capabilities:

  - Ontology querying (GO, MeSH, etc.)
  - Semantic search
  - Relationship traversal
  - Knowledge graph navigation

Usage:

from DeepResearch.src.tools.knowledge_base_tools import KnowledgeBaseQueryTool

tool = KnowledgeBaseQueryTool()
result = await tool.run({
    "ontology": "GO",
    "query_type": "term_search",
    "search_term": "protein kinase activity",
    "max_results": 50
})

Document Search Tools

DocumentSearchTool

Searches through document collections and corpora.

Location: DeepResearch.src.tools.document_tools.DocumentSearchTool

Capabilities:

  - Full-text search across documents
  - Metadata filtering
  - Relevance ranking
  - Multi-format support (PDF, DOC, TXT)

Usage:

from DeepResearch.src.tools.document_tools import DocumentSearchTool

tool = DocumentSearchTool()
result = await tool.run({
    "collection": "research_papers",
    "query": "deep learning protein structure",
    "filters": {
        "year": {"gte": 2020},
        "journal": "Nature"
    },
    "max_results": 20
})

Tool Integration

Agent Integration

Knowledge Query tools can be invoked through DeepCritical agents such as SearchAgent:

from DeepResearch.agents import SearchAgent

agent = SearchAgent()
result = await agent.execute(
    "Find recent papers on CRISPR gene editing",
    dependencies=AgentDependencies()
)

Workflow Integration

Tools can be used in research workflows:

from DeepResearch.app import main

result = await main(
    question="What are the latest developments in quantum computing?",
    flows={"deepsearch": {"enabled": True}},
    tool_config={
        "web_search": {
            "engines": ["google", "arxiv"],
            "max_results": 50
        }
    }
)

Configuration

Tool Configuration

Configure Knowledge Query tools in configs/tools/knowledge_query.yaml:

knowledge_query:
  web_search:
    default_engines: ["google", "duckduckgo"]
    max_results: 20
    cache_results: true
    cache_ttl_hours: 24

  database:
    connection_pool_size: 10
    query_timeout_seconds: 30
    enable_query_logging: true

  knowledge_base:
    supported_ontologies: ["GO", "MeSH", "ChEBI"]
    default_endpoint: "https://api.geneontology.org"
    cache_enabled: true
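
If the configuration is loaded with OmegaConf (an assumption here, suggested by the configs/ layout rather than stated on this page), the file can be inspected programmatically:

from omegaconf import OmegaConf

# Load the tool configuration file referenced above
cfg = OmegaConf.load("configs/tools/knowledge_query.yaml")

# Settings are available via attribute access
print(cfg.knowledge_query.web_search.default_engines)
print(cfg.knowledge_query.web_search.cache_ttl_hours)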

Performance Tuning

performance:
  search:
    max_concurrent_requests: 5
    request_timeout_seconds: 10
    retry_attempts: 3

  database:
    connection_pool_size: 20
    statement_cache_size: 100
    query_optimization: true

  caching:
    enabled: true
    ttl_seconds: 3600
    max_cache_size_mb: 512

Best Practices

Search Optimization

  1. Query Formulation: Use specific, well-formed queries
  2. Result Filtering: Apply relevance filters to reduce noise
  3. Source Diversity: Use multiple search engines/sources (see the sketch below)
  4. Caching: Enable caching for frequently accessed data
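
The sketch below illustrates points 1 and 3, plus simple deduplication: one specific, well-formed query is run against two engines and duplicate URLs are dropped. The shape of result.data (a list of dicts with a "url" key) is an assumption for illustration, not a documented contract.

from DeepResearch.src.tools.websearch_tools import WebSearchTool

async def diversified_search(query: str) -> list[dict]:
    tool = WebSearchTool()
    # Source diversity: query two engines in a single call
    result = await tool.run({
        "query": query,
        "num_results": 20,
        "engines": ["google", "duckduckgo"]
    })
    # Reduce noise: drop duplicate URLs (result.data shape is assumed)
    seen, unique = set(), []
    for hit in result.data:
        if hit["url"] not in seen:
            seen.add(hit["url"])
            unique.append(hit)
    return unique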

Database Queries

  1. Parameterized Queries: Always use parameterized queries (see the sketch below)
  2. Index Usage: Ensure proper database indexing
  3. Connection Pooling: Use connection pooling for efficiency
  4. Query Limits: Set reasonable result limits
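
As a concrete illustration of points 1 and 4, the sketch below contrasts string interpolation with the parameterized form shown in the DatabaseQueryTool usage example. Here db_tool stands for a DatabaseQueryTool instance and user_topic for untrusted input; both are placeholders.

# Unsafe: interpolating input directly into SQL invites injection
unsafe_query = f"SELECT * FROM research_data WHERE topic = '{user_topic}'"

# Preferred: parameterized query with an explicit row limit
result = await db_tool.run({
    "query": "SELECT * FROM research_data WHERE topic = %s",
    "parameters": [user_topic],
    "max_rows": 500
})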

Knowledge Base Queries

  1. Ontology Awareness: Understand ontology structure and relationships
  2. Semantic Matching: Use semantic search capabilities
  3. Result Validation: Validate ontology term mappings (see the sketch below)
  4. Version Handling: Handle ontology version changes
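
A lightweight validation step (point 3) might look like the following sketch. kb_tool stands for a KnowledgeBaseQueryTool instance, and the result shape (a list of term records with an "id" field) is an assumption for illustration.

result = await kb_tool.run({
    "ontology": "GO",
    "query_type": "term_search",
    "search_term": "protein kinase activity",
    "max_results": 10
})

# Keep only well-formed GO identifiers before downstream use (field name assumed)
terms = [t for t in result.data if t.get("id", "").startswith("GO:")]
if not terms:
    raise ValueError("No valid GO terms returned for 'protein kinase activity'")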

Error Handling

Common Errors

Search Failures:

try:
    result = await web_search_tool.run({"query": "complex query"})
except SearchTimeoutError:
    # Handle timeout
    result = await web_search_tool.run({
        "query": "complex query",
        "timeout": 60
    })

Database Connection Issues:

try:
    result = await db_tool.run({"query": "SELECT * FROM data"})
except ConnectionError:
    # Retry with different connection
    result = await db_tool.run({
        "query": "SELECT * FROM data",
        "connection_string": backup_connection
    })

Knowledge Base Unavailability:

try:
    result = await kb_tool.run({"ontology": "GO", "term": "kinase"})
except OntologyUnavailableError:
    # Fallback to alternative source
    result = await kb_tool.run({
        "ontology": "GO",
        "term": "kinase",
        "fallback_source": "local_cache"
    })

Monitoring and Metrics

Tool Metrics

Knowledge Query tools expose usage and performance metrics:

# Get tool metrics
metrics = tool.get_metrics()

print(f"Total queries: {metrics['total_queries']}")
print(f"Success rate: {metrics['success_rate']:.2%}")
print(f"Average response time: {metrics['avg_response_time']:.2f}s")
print(f"Cache hit rate: {metrics['cache_hit_rate']:.2%}")

Performance Monitoring

# Enable performance monitoring
tool.enable_monitoring()

# Get performance report
report = tool.get_performance_report()
for query_type, stats in report.items():
    print(f"{query_type}: {stats['count']} queries, "
          f"{stats['avg_time']:.2f}s avg time")

Security Considerations

Input Validation

All Knowledge Query tools validate inputs:

# Automatic input validation
result = await tool.run({
    "query": user_input,  # Automatically validated
    "max_results": 100    # Range checked
})

Output Sanitization

Results are sanitized to prevent injection:

# Safe result handling
if result.success:
    safe_data = result.get_sanitized_data()
    # Use safe_data for further processing

Access Control

Configure access controls for sensitive data sources:

access_control:
  database:
    allowed_queries: ["SELECT", "SHOW"]
    blocked_tables: ["sensitive_data"]
  knowledge_base:
    allowed_ontologies: ["GO", "MeSH"]
    require_authentication: true
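
A minimal sketch of how the allowed_queries and blocked_tables settings could be enforced before a statement reaches DatabaseQueryTool; this guard is illustrative and not part of the documented API.

def is_query_allowed(sql: str, allowed: list[str], blocked_tables: list[str]) -> bool:
    """Illustrative guard: check the statement verb and referenced tables."""
    verb = sql.strip().split()[0].upper()
    if verb not in allowed:
        return False
    lowered = sql.lower()
    return not any(table.lower() in lowered for table in blocked_tables)

is_query_allowed("SELECT * FROM research_data", ["SELECT", "SHOW"], ["sensitive_data"])   # True
is_query_allowed("SELECT * FROM sensitive_data", ["SELECT", "SHOW"], ["sensitive_data"])  # False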