
Data Types API

This page documents DeepCritical's data type system: the dataclasses, Pydantic models, type definitions, and data validation schemas used across agents, tools, and workflows.

Core Data Types

Agent Framework Types

AgentRunResponse

Response structure from agent execution.

@dataclass
class AgentRunResponse:
    """Response from agent execution."""

    messages: List[ChatMessage]
    """List of messages in the conversation."""

    data: Optional[Dict[str, Any]] = None
    """Optional structured data from agent execution."""

    metadata: Optional[Dict[str, Any]] = None
    """Optional metadata about the execution."""

    success: bool = True
    """Whether the agent execution was successful."""

    error: Optional[str] = None
    """Error message if execution failed."""

    execution_time: float = 0.0
    """Time taken for execution in seconds."""

ChatMessage

Message format for agent communication.

@dataclass
class ChatMessage:
    """A message in an agent conversation."""

    role: Role
    """The role of the message sender."""

    contents: List[Content]
    """The content of the message."""

    metadata: Optional[Dict[str, Any]] = None
    """Optional metadata about the message."""

Role

Enumeration of message roles.

class Role(Enum):
    """Message role enumeration."""

    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"
    TOOL = "tool"

Content Types

Base classes for message content.

@dataclass
class Content:
    """Base class for message content."""
    pass

@dataclass
class TextContent(Content):
    """Text content for messages."""

    text: str
    """The text content."""

@dataclass
class ImageContent(Content):
    """Image content for messages."""

    url: str
    """URL of the image."""

    alt_text: Optional[str] = None
    """Alternative text for the image."""

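As a minimal sketch of how these types compose, the snippet below rebuilds trimmed local copies of Role, the content classes, ChatMessage, and AgentRunResponse (so it runs without DeepCritical installed) and assembles a short conversation. The helper `response_text` is hypothetical and is not part of the documented API; the message texts and URL are illustrative values only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List, Optional


class Role(Enum):
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"
    TOOL = "tool"


@dataclass
class Content:
    """Base class for message content."""


@dataclass
class TextContent(Content):
    text: str


@dataclass
class ImageContent(Content):
    url: str
    alt_text: Optional[str] = None


@dataclass
class ChatMessage:
    role: Role
    contents: List[Content]
    metadata: Optional[Dict[str, Any]] = None


@dataclass
class AgentRunResponse:
    messages: List[ChatMessage]
    data: Optional[Dict[str, Any]] = None
    metadata: Optional[Dict[str, Any]] = None
    success: bool = True
    error: Optional[str] = None
    execution_time: float = 0.0


def response_text(response: AgentRunResponse) -> str:
    """Hypothetical helper: join the text parts of all assistant messages."""
    parts = [
        content.text
        for message in response.messages
        if message.role is Role.ASSISTANT
        for content in message.contents
        if isinstance(content, TextContent)  # skip images and other content
    ]
    return "\n".join(parts)


response = AgentRunResponse(
    messages=[
        ChatMessage(role=Role.USER, contents=[TextContent("What is TP53?")]),
        ChatMessage(
            role=Role.ASSISTANT,
            contents=[
                TextContent("TP53 is a tumor suppressor gene."),
                ImageContent(url="https://example.org/tp53.png", alt_text="TP53"),
            ],
        ),
    ],
    execution_time=0.8,
)
print(response_text(response))
```

Because `contents` is a list of `Content` subclasses, consumers are expected to check the concrete type (as `response_text` does) before reading type-specific fields such as `text` or `url`.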
Research Types

ResearchState

Main state object for research workflows.

@dataclass
class ResearchState:
    """Main state for research workflow execution."""

    question: str
    """The research question being addressed."""

    plan: List[str] = field(default_factory=list)
    """List of planned research steps."""

    agent_results: Dict[str, Any] = field(default_factory=dict)
    """Results from agent executions."""

    tool_outputs: Dict[str, Any] = field(default_factory=dict)
    """Outputs from tool executions."""

    execution_history: ExecutionHistory = field(default_factory=lambda: ExecutionHistory())
    """History of workflow execution."""

    config: Optional[DictConfig] = None
    """Hydra configuration object."""

    metadata: Dict[str, Any] = field(default_factory=dict)
    """Additional metadata."""

    status: ExecutionStatus = ExecutionStatus.PENDING
    """Current execution status."""

ResearchOutcome

Result structure for research execution.

@dataclass
class ResearchOutcome:
    """Outcome of research execution."""

    success: bool
    """Whether the research was successful."""

    data: Optional[Dict[str, Any]] = None
    """Main research data and results."""

    metadata: Optional[Dict[str, Any]] = None
    """Metadata about the research execution."""

    error: Optional[str] = None
    """Error message if research failed."""

    execution_time: float = 0.0
    """Total execution time in seconds."""

    agent_results: Dict[str, AgentResult] = field(default_factory=dict)
    """Results from individual agents."""

    tool_outputs: Dict[str, Any] = field(default_factory=dict)
    """Outputs from tools used."""

ExecutionHistory

Tracking of workflow execution steps.

@dataclass
class ExecutionHistory:
    """History of workflow execution steps."""

    entries: List[ExecutionHistoryEntry] = field(default_factory=list)
    """List of execution history entries."""

    total_time: float = 0.0
    """Total execution time."""

    start_time: Optional[datetime] = None
    """When execution started."""

    end_time: Optional[datetime] = None
    """When execution ended."""

    def add_entry(self, entry: ExecutionHistoryEntry) -> None:
        """Add an entry to the history."""
        self.entries.append(entry)
        if entry.execution_time:
            self.total_time += entry.execution_time

    def get_entries_by_type(self, entry_type: str) -> List[ExecutionHistoryEntry]:
        """Get entries filtered by type."""
        return [e for e in self.entries if e.entry_type == entry_type]

    def get_successful_entries(self) -> List[ExecutionHistoryEntry]:
        """Get entries that were successful."""
        return [e for e in self.entries if e.success]

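The following sketch shows how the history methods above behave in practice. `ExecutionHistoryEntry` is not defined on this page, so the version here is a stand-in that assumes only the three attributes the methods read (`entry_type`, `success`, `execution_time`); the real class may carry more fields.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class ExecutionHistoryEntry:
    """Stand-in entry type; field names are assumed from the methods below."""

    entry_type: str
    success: bool
    execution_time: float = 0.0


@dataclass
class ExecutionHistory:
    entries: List[ExecutionHistoryEntry] = field(default_factory=list)
    total_time: float = 0.0
    start_time: Optional[datetime] = None
    end_time: Optional[datetime] = None

    def add_entry(self, entry: ExecutionHistoryEntry) -> None:
        """Append an entry and accumulate its execution time."""
        self.entries.append(entry)
        if entry.execution_time:
            self.total_time += entry.execution_time

    def get_entries_by_type(self, entry_type: str) -> List[ExecutionHistoryEntry]:
        return [e for e in self.entries if e.entry_type == entry_type]

    def get_successful_entries(self) -> List[ExecutionHistoryEntry]:
        return [e for e in self.entries if e.success]


history = ExecutionHistory(start_time=datetime.now())
history.add_entry(ExecutionHistoryEntry("agent", success=True, execution_time=1.5))
history.add_entry(ExecutionHistoryEntry("tool", success=True, execution_time=0.25))
history.add_entry(ExecutionHistoryEntry("tool", success=False))  # failed, no time recorded
history.end_time = datetime.now()

print(history.total_time)                        # accumulated from the entries
print(len(history.get_entries_by_type("tool")))  # both tool entries
print(len(history.get_successful_entries()))     # the two successful entries
```

Note that `total_time` is the sum of per-entry times, not the wall-clock difference between `start_time` and `end_time`; the two can differ if steps overlap or if time is spent between steps.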
Agent Types

AgentResult

Result structure from agent execution.

@dataclass
class AgentResult:
    """Result from agent execution."""

    success: bool
    """Whether the agent execution was successful."""

    data: Optional[Any] = None
    """Main result data."""

    metadata: Optional[Dict[str, Any]] = None
    """Metadata about the execution."""

    error: Optional[str] = None
    """Error message if execution failed."""

    execution_time: float = 0.0
    """Time taken for execution."""

    agent_type: AgentType = AgentType.UNKNOWN
    """Type of agent that produced this result."""

AgentDependencies

Configuration and dependencies for agent execution.

@dataclass
class AgentDependencies:
    """Dependencies and configuration for agent execution."""

    model_name: str = "anthropic:claude-sonnet-4-0"
    """Name of the LLM model to use."""

    api_keys: Dict[str, str] = field(default_factory=dict)
    """API keys for external services."""

    config: Dict[str, Any] = field(default_factory=dict)
    """Additional configuration parameters."""

    tools: List[str] = field(default_factory=list)
    """List of tool names to make available."""

    context: Optional[Dict[str, Any]] = None
    """Additional context for agent execution."""

    timeout: float = 60.0
    """Timeout for agent execution in seconds."""

Tool Types

ToolSpec

Specification for tool metadata and interface.

@dataclass
class ToolSpec:
    """Specification for a tool's interface and metadata."""

    name: str
    """Unique name of the tool."""

    description: str
    """Human-readable description of the tool."""

    category: str = "general"
    """Category this tool belongs to."""

    inputs: Dict[str, str] = field(default_factory=dict)
    """Input parameter specifications."""

    outputs: Dict[str, str] = field(default_factory=dict)
    """Output specifications."""

    metadata: Dict[str, Any] = field(default_factory=dict)
    """Additional metadata."""

    version: str = "1.0.0"
    """Version of the tool specification."""

    author: Optional[str] = None
    """Author of the tool."""

    license: Optional[str] = None
    """License for the tool."""

ExecutionResult

Result structure from tool execution.

@dataclass
class ExecutionResult:
    """Result from tool execution."""

    success: bool
    """Whether the tool execution was successful."""

    data: Optional[Any] = None
    """Main result data."""

    metadata: Optional[Dict[str, Any]] = None
    """Metadata about the execution."""

    execution_time: float = 0.0
    """Time taken for execution."""

    error: Optional[str] = None
    """Error message if execution failed."""

    error_type: Optional[str] = None
    """Type of error that occurred."""

    citations: List[Dict[str, Any]] = field(default_factory=list)
    """Source citations for the result."""

ToolRequest

Request structure for tool execution.

@dataclass
class ToolRequest:
    """Request to execute a tool."""

    tool_name: str
    """Name of the tool to execute."""

    parameters: Dict[str, Any] = field(default_factory=dict)
    """Parameters to pass to the tool."""

    metadata: Dict[str, Any] = field(default_factory=dict)
    """Additional metadata for the request."""

    timeout: Optional[float] = None
    """Timeout for tool execution."""

    priority: int = 0
    """Priority of the request (higher numbers = higher priority)."""

ToolResponse

Response structure from tool execution.

@dataclass
class ToolResponse:
    """Response from tool execution."""

    success: bool
    """Whether the tool execution was successful."""

    data: Optional[Any] = None
    """Result data from the tool."""

    metadata: Dict[str, Any] = field(default_factory=dict)
    """Metadata about the execution."""

    citations: List[Dict[str, Any]] = field(default_factory=list)
    """Source citations."""

    execution_time: float = 0.0
    """Time taken for execution."""

    error: Optional[str] = None
    """Error message if execution failed."""

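A sketch of how ToolRequest and ToolResponse might be used together, assuming a simple in-process dispatcher. The functions `order_by_priority` and `run_request`, and the `tools` registry of plain callables, are hypothetical helpers for illustration and are not part of the documented API. The dataclasses are trimmed local copies so the example runs on its own.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class ToolRequest:
    tool_name: str
    parameters: Dict[str, Any] = field(default_factory=dict)
    metadata: Dict[str, Any] = field(default_factory=dict)
    timeout: Optional[float] = None
    priority: int = 0


@dataclass
class ToolResponse:
    success: bool
    data: Optional[Any] = None
    metadata: Dict[str, Any] = field(default_factory=dict)
    citations: List[Dict[str, Any]] = field(default_factory=list)
    execution_time: float = 0.0
    error: Optional[str] = None


def order_by_priority(requests: List[ToolRequest]) -> List[ToolRequest]:
    """Higher priority first; requests with equal priority keep their order."""
    return sorted(requests, key=lambda r: r.priority, reverse=True)


def run_request(
    request: ToolRequest, tools: Dict[str, Callable[..., Any]]
) -> ToolResponse:
    """Hypothetical dispatcher: call the named tool and wrap the outcome."""
    tool = tools.get(request.tool_name)
    if tool is None:
        return ToolResponse(success=False, error=f"Unknown tool: {request.tool_name}")
    started = time.perf_counter()
    try:
        data = tool(**request.parameters)
    except Exception as exc:  # report the failure instead of raising
        return ToolResponse(
            success=False,
            error=str(exc),
            execution_time=time.perf_counter() - started,
        )
    return ToolResponse(
        success=True, data=data, execution_time=time.perf_counter() - started
    )


tools = {"add": lambda a, b: a + b}
requests = [
    ToolRequest("add", {"a": 1, "b": 2}, priority=1),
    ToolRequest("missing", priority=5),
    ToolRequest("add", {"a": 10, "b": 20}, priority=3),
]
ordered = order_by_priority(requests)
responses = [run_request(r, tools) for r in ordered]
for req, resp in zip(ordered, responses):
    print(req.tool_name, req.priority, resp.success, resp.data, resp.error)
```

Returning a failed ToolResponse rather than raising keeps the caller's control flow uniform: every request yields a response whose `success` and `error` fields can be inspected.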
Bioinformatics Types

GOAnnotation

Gene Ontology annotation data structure.

@dataclass
class GOAnnotation:
    """Gene Ontology annotation."""

    gene_id: str
    """Gene identifier."""

    go_id: str
    """GO term identifier."""

    go_term: str
    """GO term description."""

    evidence_code: str
    """Evidence code for the annotation."""

    aspect: str
    """GO aspect (P, F, or C)."""

    source: str = "GO"
    """Source of the annotation."""

    confidence_score: Optional[float] = None
    """Confidence score for the annotation."""

PubMedPaper

PubMed paper data structure.

@dataclass
class PubMedPaper:
    """PubMed paper information."""

    pmid: str
    """PubMed ID."""

    title: str
    """Paper title."""

    abstract: Optional[str] = None
    """Paper abstract."""

    authors: List[str] = field(default_factory=list)
    """List of authors."""

    journal: Optional[str] = None
    """Journal name."""

    publication_date: Optional[str] = None
    """Publication date."""

    doi: Optional[str] = None
    """Digital Object Identifier."""

    keywords: List[str] = field(default_factory=list)
    """Paper keywords."""

    relevance_score: Optional[float] = None
    """Relevance score for the query."""

FusedDataset

Fused dataset from multiple bioinformatics sources.

@dataclass
class FusedDataset:
    """Fused dataset from multiple bioinformatics sources."""

    gene_id: str
    """Primary gene identifier."""

    annotations: List[GOAnnotation] = field(default_factory=list)
    """GO annotations."""

    publications: List[PubMedPaper] = field(default_factory=list)
    """Related publications."""

    expression_data: Dict[str, Any] = field(default_factory=dict)
    """Expression data from various sources."""

    quality_score: float = 0.0
    """Overall quality score for the fused data."""

    sources_used: List[str] = field(default_factory=list)
    """List of data sources used."""

    fusion_metadata: Dict[str, Any] = field(default_factory=dict)
    """Metadata about the fusion process."""

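To illustrate how the three bioinformatics types relate, here is a minimal sketch that collects annotations and papers for one gene into a FusedDataset. The `fuse_gene_data` function is a hypothetical helper, not the project's fusion logic; it leaves `quality_score` at its default because this page does not say how that score is computed. The dataclasses are trimmed local copies, and the sample records (including the placeholder PMID) are illustrative values only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class GOAnnotation:
    gene_id: str
    go_id: str
    go_term: str
    evidence_code: str
    aspect: str  # P, F, or C
    source: str = "GO"
    confidence_score: Optional[float] = None


@dataclass
class PubMedPaper:
    pmid: str
    title: str
    abstract: Optional[str] = None
    authors: List[str] = field(default_factory=list)


@dataclass
class FusedDataset:
    gene_id: str
    annotations: List[GOAnnotation] = field(default_factory=list)
    publications: List[PubMedPaper] = field(default_factory=list)
    expression_data: Dict[str, Any] = field(default_factory=dict)
    quality_score: float = 0.0
    sources_used: List[str] = field(default_factory=list)
    fusion_metadata: Dict[str, Any] = field(default_factory=dict)


def fuse_gene_data(
    gene_id: str,
    annotations: List[GOAnnotation],
    publications: List[PubMedPaper],
) -> FusedDataset:
    """Hypothetical helper: gather the records that belong to one gene."""
    gene_annotations = [a for a in annotations if a.gene_id == gene_id]
    # Record each distinct annotation source, plus PubMed if papers are present
    sources = sorted({a.source for a in gene_annotations})
    if publications:
        sources.append("PubMed")
    return FusedDataset(
        gene_id=gene_id,
        annotations=gene_annotations,
        publications=list(publications),
        sources_used=sources,
        fusion_metadata={
            "annotation_count": len(gene_annotations),
            "publication_count": len(publications),
        },
    )


annotations = [
    GOAnnotation("TP53", "GO:0006915", "apoptotic process", "IDA", "P"),
    GOAnnotation("BRCA1", "GO:0006281", "DNA repair", "IDA", "P"),
]
papers = [PubMedPaper(pmid="00000000", title="Placeholder title")]

fused = fuse_gene_data("TP53", annotations, papers)
print(fused.sources_used, fused.fusion_metadata)
```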
Code Execution Types

CodeExecutionWorkflowState

Bases: BaseModel

State for the code execution workflow.

Attributes:

Name Type Description
code_block CodeBlock | None
code_type str | None
detected_code_type str | None
enable_improvement bool
error_analysis dict[str, Any] | None
errors list[str]
execution_error str | None
execution_executor str | None
execution_exit_code int
execution_output str | None
execution_success bool
execution_time float
final_response AgentRunResponse | None
force_code_type bool
generated_code str | None
generation_time float
improved_code str | None
improvement_attempts int
improvement_history list[dict[str, Any]]
improvement_time float
jupyter_config dict[str, Any]
max_improvement_attempts int
max_retries int
model_config
status ExecutionStatus
timeout float
total_time float
use_docker bool
use_jupyter bool
user_query str

Attributes

code_block class-attribute instance-attribute

code_block: CodeBlock | None = Field(
    None, description="Generated code block"
)

code_type class-attribute instance-attribute

code_type: str | None = Field(
    None,
    description="Type of code to generate (bash/python/auto)",
)

detected_code_type class-attribute instance-attribute

detected_code_type: str | None = Field(
    None, description="Auto-detected code type"
)

enable_improvement class-attribute instance-attribute

enable_improvement: bool = Field(
    True,
    description="Enable automatic code improvement on errors",
)

error_analysis class-attribute instance-attribute

error_analysis: dict[str, Any] | None = Field(
    None, description="Error analysis results"
)

errors class-attribute instance-attribute

errors: list[str] = Field(
    default_factory=list,
    description="Any errors encountered",
)

execution_error class-attribute instance-attribute

execution_error: str | None = Field(
    None, description="Execution error message"
)

execution_executor class-attribute instance-attribute

execution_executor: str | None = Field(
    None, description="Executor used"
)

execution_exit_code class-attribute instance-attribute

execution_exit_code: int = Field(
    0, description="Execution exit code"
)

execution_output class-attribute instance-attribute

execution_output: str | None = Field(
    None, description="Execution output"
)

execution_success class-attribute instance-attribute

execution_success: bool = Field(
    False, description="Whether execution succeeded"
)

execution_time class-attribute instance-attribute

execution_time: float = Field(
    0.0, description="Code execution time"
)

final_response class-attribute instance-attribute

final_response: AgentRunResponse | None = Field(
    None, description="Final response to user"
)

force_code_type class-attribute instance-attribute

force_code_type: bool = Field(
    False,
    description="Whether to force the specified code type",
)

generated_code class-attribute instance-attribute

generated_code: str | None = Field(
    None, description="Generated code content"
)

generation_time class-attribute instance-attribute

generation_time: float = Field(
    0.0, description="Code generation time"
)

improved_code class-attribute instance-attribute

improved_code: str | None = Field(
    None, description="Improved code after error analysis"
)

improvement_attempts class-attribute instance-attribute

improvement_attempts: int = Field(
    0, description="Number of improvement attempts made"
)

improvement_history class-attribute instance-attribute

improvement_history: list[dict[str, Any]] = Field(
    default_factory=list,
    description="History of improvements",
)

improvement_time class-attribute instance-attribute

improvement_time: float = Field(
    0.0, description="Code improvement time"
)

jupyter_config class-attribute instance-attribute

jupyter_config: dict[str, Any] = Field(
    default_factory=dict,
    description="Jupyter configuration",
)

max_improvement_attempts class-attribute instance-attribute

max_improvement_attempts: int = Field(
    3, description="Maximum improvement attempts allowed"
)

max_retries class-attribute instance-attribute

max_retries: int = Field(
    3, description="Maximum execution retries"
)

model_config class-attribute instance-attribute

model_config = ConfigDict(json_schema_extra={})

status class-attribute instance-attribute

status: ExecutionStatus = Field(
    ExecutionStatus.PENDING, description="Workflow status"
)

timeout class-attribute instance-attribute

timeout: float = Field(
    60.0, description="Execution timeout"
)

total_time class-attribute instance-attribute

total_time: float = Field(
    0.0, description="Total processing time"
)

use_docker class-attribute instance-attribute

use_docker: bool = Field(
    True, description="Use Docker for execution"
)

use_jupyter class-attribute instance-attribute

use_jupyter: bool = Field(
    False, description="Use Jupyter for execution"
)

user_query class-attribute instance-attribute

user_query: str = Field(
    ...,
    description="Natural language description of desired operation",
)

CodeBlock

Bases: BaseModel

A class that represents a code block for execution.

Attributes:

Name Type Description
code str
language str

Attributes

code class-attribute instance-attribute

code: str = Field(description='The code to execute.')

language class-attribute instance-attribute

language: str = Field(
    description="The language of the code."
)

CodeResult

Bases: BaseModel

A class that represents the result of a code execution.

Attributes:

Name Type Description
exit_code int
output str

Attributes

exit_code class-attribute instance-attribute

exit_code: int = Field(
    description="The exit code of the code execution."
)

output class-attribute instance-attribute

output: str = Field(
    description="The output of the code execution."
)

CodeExecutionConfig

CodeExecutor

Bases: Protocol

A code executor class that executes code blocks and returns the result.

Methods:

Name Description
execute_code_blocks

Execute code blocks and return the result.

restart

Restart the code executor.

Attributes:

Name Type Description
code_extractor CodeExtractor

The code extractor used by this code executor.

Attributes

code_extractor property

code_extractor: CodeExtractor

The code extractor used by this code executor.

Functions

execute_code_blocks

execute_code_blocks(
    code_blocks: list[CodeBlock],
) -> CodeResult

Execute code blocks and return the result.

This method should be implemented by the code executor.

Parameters:

Name Type Description Default
code_blocks List[CodeBlock]

The code blocks to execute.

required

Returns:

Name Type Description
CodeResult CodeResult

The result of the code execution.

Source code in DeepResearch/src/datatypes/coding_base.py
def execute_code_blocks(self, code_blocks: list[CodeBlock]) -> CodeResult:
    """Execute code blocks and return the result.

    This method should be implemented by the code executor.

    Args:
        code_blocks (List[CodeBlock]): The code blocks to execute.

    Returns:
        CodeResult: The result of the code execution.
    """
    ...  # pragma: no cover

restart

restart() -> None

Restart the code executor.

This method should be implemented by the code executor.

This method is called when the agent is reset.

Source code in DeepResearch/src/datatypes/coding_base.py
def restart(self) -> None:
    """Restart the code executor.

    This method should be implemented by the code executor.

    This method is called when the agent is reset.
    """
    ...  # pragma: no cover

CodeExtractor

Bases: Protocol

A code extractor class that extracts code blocks from a message.

Methods:

Name Description
extract_code_blocks

Extract code blocks from a message.

Functions

extract_code_blocks

extract_code_blocks(
    message: str
    | list[
        UserMessageTextContentPart
        | UserMessageImageContentPart
    ]
    | None,
) -> list[CodeBlock]

Extract code blocks from a message.

Parameters:

Name Type Description Default
message str | list[UserMessageTextContentPart | UserMessageImageContentPart] | None

The message to extract code blocks from.

required

Returns:

Type Description
list[CodeBlock]

List[CodeBlock]: The extracted code blocks.

Source code in DeepResearch/src/datatypes/coding_base.py
def extract_code_blocks(
    self,
    message: str
    | list[UserMessageTextContentPart | UserMessageImageContentPart]
    | None,
) -> list[CodeBlock]:
    """Extract code blocks from a message.

    Args:
        message (str): The message to extract code blocks from.

    Returns:
        List[CodeBlock]: The extracted code blocks.
    """
    ...  # pragma: no cover

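Because CodeExecutor and CodeExtractor are protocols, any class with matching methods satisfies them without inheriting from anything. The sketch below shows one possible pair of implementations under simplifying assumptions: CodeBlock and CodeResult are replaced by plain dataclass stand-ins (the real ones are Pydantic models), the message is restricted to `str | None`, and `MarkdownCodeExtractor` and `LocalPythonExecutor` are hypothetical names that do not exist in DeepCritical.

```python
import re
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional, Protocol

FENCE = "`" * 3  # built indirectly so the literal fence never appears in this code


@dataclass
class CodeBlock:
    """Stand-in for the Pydantic CodeBlock model."""
    code: str
    language: str


@dataclass
class CodeResult:
    """Stand-in for the Pydantic CodeResult model."""
    exit_code: int
    output: str


class CodeExtractor(Protocol):
    def extract_code_blocks(self, message: Optional[str]) -> List[CodeBlock]: ...


class CodeExecutor(Protocol):
    @property
    def code_extractor(self) -> CodeExtractor: ...
    def execute_code_blocks(self, code_blocks: List[CodeBlock]) -> CodeResult: ...
    def restart(self) -> None: ...


class MarkdownCodeExtractor:
    """Hypothetical extractor: pulls fenced blocks out of Markdown text."""

    _pattern = re.compile(FENCE + r"(\w*)\n(.*?)" + FENCE, re.DOTALL)

    def extract_code_blocks(self, message: Optional[str]) -> List[CodeBlock]:
        if not message:
            return []
        return [
            CodeBlock(code=code, language=lang or "unknown")
            for lang, code in self._pattern.findall(message)
        ]


class LocalPythonExecutor:
    """Hypothetical executor: runs Python blocks in a subprocess, unsandboxed."""

    def __init__(self, timeout: float = 60.0) -> None:
        self._extractor = MarkdownCodeExtractor()
        self._timeout = timeout

    @property
    def code_extractor(self) -> CodeExtractor:
        return self._extractor

    def execute_code_blocks(self, code_blocks: List[CodeBlock]) -> CodeResult:
        outputs: List[str] = []
        for block in code_blocks:
            if block.language != "python":
                return CodeResult(1, f"unsupported language: {block.language}")
            try:
                done = subprocess.run(
                    [sys.executable, "-c", block.code],
                    capture_output=True, text=True, timeout=self._timeout,
                )
            except subprocess.TimeoutExpired:
                return CodeResult(124, "".join(outputs) + "execution timed out")
            outputs.append(done.stdout + done.stderr)
            if done.returncode != 0:  # stop at the first failing block
                return CodeResult(done.returncode, "".join(outputs))
        return CodeResult(0, "".join(outputs))

    def restart(self) -> None:
        """Nothing to reset: each block runs in a fresh process."""


message = f"Please run this:\n{FENCE}python\nprint(6 * 7)\n{FENCE}\n"
executor: CodeExecutor = LocalPythonExecutor(timeout=30.0)
blocks = executor.code_extractor.extract_code_blocks(message)
result = executor.execute_code_blocks(blocks)
print(result.exit_code, result.output.strip())
```

This executor runs code directly on the host, which is only reasonable for trusted input. The `use_docker` field on CodeExecutionWorkflowState suggests that the real workflow isolates execution in a container.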
Validation and Error Types

ValidationResult

Result from data validation.

@dataclass
class ValidationResult:
    """Result from data validation."""

    valid: bool
    """Whether the data is valid."""

    errors: List[str] = field(default_factory=list)
    """List of validation errors."""

    warnings: List[str] = field(default_factory=list)
    """List of validation warnings."""

    metadata: Dict[str, Any] = field(default_factory=dict)
    """Additional validation metadata."""

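As an illustration of how a check might populate ValidationResult, the sketch below validates a Gene Ontology annotation. The function `validate_go_annotation` and its specific rules are assumptions made for this example: the aspect values P, F, and C come from the GOAnnotation docstring on this page, and the `GO:` plus seven digits pattern is the standard GO identifier format. GOAnnotation is a trimmed local copy.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class GOAnnotation:
    gene_id: str
    go_id: str
    go_term: str
    evidence_code: str
    aspect: str
    source: str = "GO"
    confidence_score: Optional[float] = None


@dataclass
class ValidationResult:
    valid: bool
    errors: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)


GO_ID_PATTERN = re.compile(r"GO:\d{7}")
VALID_ASPECTS = {"P", "F", "C"}


def validate_go_annotation(annotation: GOAnnotation) -> ValidationResult:
    """Hypothetical validator: errors block use, warnings only inform."""
    errors: List[str] = []
    warnings: List[str] = []

    if not annotation.gene_id:
        errors.append("gene_id must not be empty")
    if not GO_ID_PATTERN.fullmatch(annotation.go_id):
        errors.append(f"go_id {annotation.go_id!r} is not of the form GO:0000000")
    if annotation.aspect not in VALID_ASPECTS:
        errors.append(f"aspect {annotation.aspect!r} must be one of P, F, or C")
    if annotation.confidence_score is None:
        warnings.append("confidence_score is not set")

    return ValidationResult(
        valid=not errors,
        errors=errors,
        warnings=warnings,
        metadata={"checked_type": "GOAnnotation"},
    )


good = validate_go_annotation(
    GOAnnotation("TP53", "GO:0006915", "apoptotic process", "IDA", "P")
)
bad = validate_go_annotation(
    GOAnnotation("TP53", "0006915", "apoptotic process", "IDA", "X")
)
print(good.valid, good.warnings)
print(bad.valid, bad.errors)
```

Keeping warnings separate from errors lets a caller accept data that is usable but incomplete, while still surfacing what is missing.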
ErrorInfo

Structured error information.

@dataclass
class ErrorInfo:
    """Structured error information."""

    error_type: str
    """Type of error."""

    message: str
    """Error message."""

    details: Optional[Dict[str, Any]] = None
    """Additional error details."""

    stack_trace: Optional[str] = None
    """Stack trace if available."""

    timestamp: datetime = field(default_factory=datetime.now)
    """When the error occurred."""

    context: Optional[Dict[str, Any]] = None
    """Context information about the error."""

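A common way to fill in ErrorInfo is to build it from a caught exception. The sketch below assumes a hypothetical helper, `error_info_from_exception`, which is not part of the documented API; ErrorInfo itself is copied from the definition above so the example runs standalone.

```python
import traceback
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class ErrorInfo:
    error_type: str
    message: str
    details: Optional[Dict[str, Any]] = None
    stack_trace: Optional[str] = None
    timestamp: datetime = field(default_factory=datetime.now)
    context: Optional[Dict[str, Any]] = None


def error_info_from_exception(
    exc: BaseException, context: Optional[Dict[str, Any]] = None
) -> ErrorInfo:
    """Hypothetical helper: capture an exception as structured error data."""
    return ErrorInfo(
        error_type=type(exc).__name__,
        message=str(exc),
        # Render the traceback attached to the exception object
        stack_trace="".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
        context=context,
    )


try:
    1 / 0
except ZeroDivisionError as exc:
    info = error_info_from_exception(exc, context={"step": "demo"})

print(info.error_type, "-", info.message)
```

The `timestamp` default uses `default_factory`, so each ErrorInfo records the time it was created rather than the time the class was defined.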
Type Validation

Pydantic Models

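Types that need field-level validation can be defined as Pydantic models. The example below uses the Pydantic v2 validator API:

from typing import List

from pydantic import BaseModel, Field, field_validator

# Assumes ExecutionStatus is exported alongside the other data types
from deepresearch.datatypes import ExecutionStatus

class ValidatedResearchState(BaseModel):
    """Validated research state using Pydantic."""

    question: str = Field(..., min_length=1, max_length=1000)
    plan: List[str] = Field(default_factory=list)
    status: ExecutionStatus = ExecutionStatus.PENDING

    @field_validator('question')
    @classmethod
    def validate_question(cls, v: str) -> str:
        # Reject whitespace-only questions, which still pass min_length
        if not v.strip():
            raise ValueError('Question cannot be empty')
        return v.strip()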

Type Guards

Type guards for runtime type checking:

from typing import Any, TypeGuard  # TypeGuard requires Python 3.10+

# Assumes ToolResponse is exported alongside AgentResult
from deepresearch.datatypes import AgentResult, ToolResponse

def is_agent_result(obj: Any) -> TypeGuard[AgentResult]:
    """Type guard for AgentResult."""
    # Check the actual class: a dict with a 'success' key is not an AgentResult
    return isinstance(obj, AgentResult) and isinstance(obj.success, bool)

def is_tool_response(obj: Any) -> TypeGuard[ToolResponse]:
    """Type guard for ToolResponse."""
    return isinstance(obj, ToolResponse) and isinstance(obj.success, bool)

Serialization

JSON Serialization

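The dataclass types can be serialized to JSON by converting them to dictionaries first:

import json
from dataclasses import asdict
from enum import Enum

# Assumes AgentType is exported alongside AgentResult
from deepresearch.datatypes import AgentResult, AgentType

# Create a result
result = AgentResult(
    success=True,
    data={"answer": "42"},
    execution_time=1.5,
)

# Serialize to JSON; enum members such as agent_type are written by value
json_str = json.dumps(
    asdict(result),
    default=lambda o: o.value if isinstance(o, Enum) else str(o),
)
print(json_str)

# Deserialize from JSON, converting agent_type back to its enum member
result_dict = json.loads(json_str)
result_dict["agent_type"] = AgentType(result_dict["agent_type"])
restored_result = AgentResult(**result_dict)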

YAML Serialization

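The same approach works for YAML. Enum members and nested dataclasses must be converted to plain data before dumping and rebuilt after loading:

from dataclasses import asdict

import yaml

# Assumes ExecutionHistory and ExecutionStatus are exported alongside ResearchState
from deepresearch.datatypes import ExecutionHistory, ExecutionStatus, ResearchState

# Serialize to YAML
state = ResearchState(question="Test question")
state_dict = asdict(state)
state_dict["status"] = state.status.value  # store the enum by value
yaml_str = yaml.safe_dump(state_dict)

# Deserialize from YAML and rebuild the nested types
state_dict = yaml.safe_load(yaml_str)
state_dict["status"] = ExecutionStatus(state_dict["status"])
state_dict["execution_history"] = ExecutionHistory(**state_dict["execution_history"])
restored_state = ResearchState(**state_dict)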

Data Validation

Schema Validation

from deepresearch.datatypes import AgentResult
from deepresearch.datatypes.validation import DataValidator

validator = DataValidator()

# Validate an agent result against its expected type
result = AgentResult(success=True, data="test")
validation = validator.validate(result, AgentResult)

if validation.valid:
    print("Data is valid")
else:
    for error in validation.errors:
        print(f"Validation error: {error}")

Cross-Field Validation

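from pydantic import BaseModel, Field, model_validator

class ValidatedToolSpec(BaseModel):
    """Tool specification with cross-field validation (Pydantic v2)."""

    name: str
    description: str
    inputs: dict[str, str] = Field(default_factory=dict)
    outputs: dict[str, str] = Field(default_factory=dict)

    @model_validator(mode="after")
    def validate_inputs_outputs(self):
        # Runs after field validation, so both fields are already populated
        if not self.inputs and not self.outputs:
            raise ValueError("Tool must have either inputs or outputs")

        return self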

Best Practices

  1. Use Type Hints: Always use proper type hints for better IDE support and validation
  2. Validate Input: Validate all input data using Pydantic models
  3. Handle Errors: Use structured error types for better error handling
  4. Document Types: Provide comprehensive docstrings for all data types
  5. Test Serialization: Ensure all types can be properly serialized/deserialized
  6. Version Compatibility: Consider backward compatibility when changing data types