MCP Servers Guide

Complete guide to integrating Model Context Protocol (MCP) servers with Perpendicularity.



🎯 Overview

Model Context Protocol (MCP) is a standard for connecting AI applications to external tools and data sources. Perpendicularity uses MCP to access domain-specific capabilities:

  • GenomicOps-MCP - Genomic analysis tools (UCSC, liftOver, etc.)
  • TxGemma-MCP - Therapeutics evaluation tools (drug data, toxicity, etc.)
  • Custom servers - Your own domain-specific tools

Benefits:

  • Standardized interface - Works with any MCP-compliant server
  • Automatic discovery - Tools are discovered dynamically
  • Type-safe execution - Schema validation for tool calls
  • Multiple servers - Connect to many servers simultaneously

📡 MCP Protocol

Protocol Basics

MCP defines:

  • Tool schema - JSON schema for tool inputs/outputs
  • Tool invocation - Standard request/response format
  • Transport layer - Streamable HTTP, SSE, or stdio

Communication Flow:

1. Connection
   Perpendicularity → MCP Server: Connect
   
2. Tool Discovery
   Perpendicularity → MCP Server: list_tools()
   MCP Server → Perpendicularity: [tool1, tool2, ...]
   
3. Tool Execution
   Perpendicularity → MCP Server: call_tool(name, arguments)
   MCP Server → Perpendicularity: {result}
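
On the wire, MCP messages are JSON-RPC 2.0 requests. The `list_tools()` and `call_tool()` steps above correspond to the protocol methods `tools/list` and `tools/call`. The sketch below builds those request envelopes; the helper names (`make_request`, `list_tools_request`, `call_tool_request`) are illustrative and not part of Perpendicularity or the MCP SDK.

```python
import json
from itertools import count
from typing import Optional

# JSON-RPC request ids only need to be unique within one session
_ids = count(1)


def make_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def list_tools_request() -> dict:
    """Tool discovery: ask the server for its tool list."""
    return make_request("tools/list")


def call_tool_request(name: str, arguments: dict) -> dict:
    """Tool execution: invoke one tool by name with its arguments."""
    return make_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    print(json.dumps(list_tools_request()))
    print(json.dumps(call_tool_request("list_assemblies", {"species_name": "human"})))
```

In practice an MCP client library sends these messages for you, after an initial `initialize` handshake during the connection step.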

Tool Schema Example

{
  "name": "evaluate_drug_toxicity",
  "description": "Evaluate toxicity profile of a drug compound",
  "inputSchema": {
    "type": "object",
    "properties": {
      "smiles": {
        "type": "string",
        "description": "SMILES notation of the compound"
      },
      "model_type": {
        "type": "string",
        "enum": ["acute", "chronic", "organ"],
        "description": "Type of toxicity model to use"
      }
    },
    "required": ["smiles"]
  }
}

The LLM sees this schema and can issue a call such as:

evaluate_drug_toxicity({"smiles": "CC(=O)OC1=CC=CC=C1C(=O)O", "model_type": "acute"})
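
To illustrate what schema validation of a tool call involves, here is a deliberately small checker that covers only `required`, primitive `type`, and `enum`. It is a sketch, not how Perpendicularity validates calls; a full JSON Schema validator handles far more. `validate_arguments` is a hypothetical helper name.

```python
# Input schema copied from the evaluate_drug_toxicity example above
TOXICITY_SCHEMA = {
    "type": "object",
    "properties": {
        "smiles": {"type": "string"},
        "model_type": {"type": "string", "enum": ["acute", "chronic", "organ"]},
    },
    "required": ["smiles"],
}

# JSON Schema primitive types mapped to Python types
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_arguments(schema: dict, arguments: dict) -> list:
    """Return a list of human-readable problems (empty if the call is valid)."""
    errors = []
    for name in schema.get("required", []):
        if name not in arguments:
            errors.append(f"missing required argument: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name not in arguments:
            continue
        value = arguments[name]
        expected = _JSON_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it for numeric types
        bool_as_number = isinstance(value, bool) and spec.get("type") in ("integer", "number")
        if expected is not None and (not isinstance(value, expected) or bool_as_number):
            errors.append(f"{name}: expected {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: must be one of {spec['enum']}")
    return errors


if __name__ == "__main__":
    call = {"smiles": "CC(=O)OC1=CC=CC=C1C(=O)O", "model_type": "acute"}
    print(validate_arguments(TOXICITY_SCHEMA, call))
```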

🧬 GenomicOps-MCP

GenomicOps-MCP provides genomic analysis tools via UCSC Genome Browser APIs.

Features

  • Species/Assembly Listing - List available genomes
  • Coordinate Conversion - liftOver between assemblies
  • Feature Retrieval - Get genes/features in regions
  • Track Access - Query UCSC genome browser tracks

Configuration in Perpendicularity

# config/agent_config.yaml

mcp_servers:
  genomic_ops:
    url: "http://localhost:8000/mcp"
    transport: "streamable-http"
    timeout: 120  # Genomic operations can be slow

Example Available Tools

list_species

Description: List all available species from UCSC.

No parameters required.

Example usage:

Agent: "What species are available for genomic analysis?"
Tool call: list_species()
Result: ["Human (Homo sapiens)", "Mouse (Mus musculus)", ...]

list_assemblies

Description: Get assemblies for a given species.

Parameters:

  • species_name (string) - Species name (exact or fuzzy match)

Example usage:

Agent: "What human genome assemblies are available?"
Tool call: list_assemblies({"species_name": "human"})
Result: ["hg38", "hg19", "hg18", ...]

get_overlapping_features

Description: Get genomic features overlapping a region.

Parameters:

  • assembly (string) - Genome assembly (e.g., "hg38")
  • region (string) - Genomic region (e.g., "chr1:1000-2000")
  • track (string, optional) - UCSC track name (default: "knownGene")

Example usage:

Agent: "What genes are in chr15:61857240-61862199 in hg38?"
Tool call: get_overlapping_features({
  "assembly": "hg38",
  "region": "chr15:61857240-61862199",
  "track": "knownGene"
})
Result: [
  {
    "name": "GENE1",
    "chrom": "chr15",
    "start": 61857240,
    "end": 61862199,
    ...
  }
]
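
The `region` parameter uses the `chrom:start-end` form. A client-side helper like the one below can catch malformed regions before a tool call is made. This is a sketch: `parse_region` and `Region` are hypothetical names, and whether the server itself accepts thousands separators is not stated in this guide, so the helper strips them.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


# chrom, a colon, then start-end; digits may include thousands separators
_REGION_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d[\d,]*)-(?P<end>\d[\d,]*)$")


def parse_region(region: str) -> Region:
    """Parse 'chr1:1000-2000' into (chrom, start, end), raising on bad input."""
    match = _REGION_RE.match(region.strip())
    if match is None:
        raise ValueError(f"Invalid region {region!r}; expected 'chrom:start-end'")
    start = int(match["start"].replace(",", ""))
    end = int(match["end"].replace(",", ""))
    if start > end:
        raise ValueError(f"Invalid region {region!r}; start is greater than end")
    return Region(match["chrom"], start, end)


if __name__ == "__main__":
    print(parse_region("chr15:61857240-61862199"))
```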

Example Workflow

User: "For mouse genomic locus mm39 chr15:61857240-61862199, find the genes and convert to human coordinates"

Agent reasoning:

Step 1: Get genes in mouse region
  Tool: get_overlapping_features({
    "assembly": "mm39",
    "region": "chr15:61857240-61862199"
  })
  Result: [Zcchc11, ...]

Step 2: Convert coordinates to human
  Tool: lift_over_coordinates({
    "from_asm": "mm39",
    "to_asm": "hg38",
    "region": "chr15:61857240-61862199"
  })
  Result: chr7:123456-128456

Step 3: Get genes in human region
  Tool: get_overlapping_features({
    "assembly": "hg38",
    "region": "chr7:123456-128456"
  })
  Result: [ZCCHC11, ...]

💊 TxGemma-MCP

TxGemma-MCP provides therapeutics evaluation tools using fine-tuned Gemma models.

Features

  • Drug Information - Retrieve drug properties
  • Toxicity Prediction - Evaluate compound safety
  • Molecular Analysis - SMILES-based compound analysis
  • Literature Search - Find relevant publications

Configuration in Perpendicularity

# config/agent_config.yaml

mcp_servers:
  txgemma:
    url: "http://localhost:8001/mcp"
    transport: "streamable-http"
    timeout: 180  # AI inference can be slow

Example Available Tools

evaluate_drug_toxicity

Description: Evaluate toxicity profile of a compound.

Parameters:

  • smiles (string) - SMILES notation of compound
  • model_type (string, optional) - Toxicity model type

Example usage:

Agent: "Evaluate toxicity of aspirin"
Tool call: evaluate_drug_toxicity({
  "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O"
})
Result: {
  "hepatotoxicity": 0.45,
  "nephrotoxicity": 0.12,
  "cardiotoxicity": 0.08,
  "overall_safety": "moderate"
}

Example Workflow

User: "Evaluate the safety of aspirin and compare to ibuprofen"

Agent reasoning:

Step 1: Get aspirin SMILES and evaluate
  Tool: evaluate_drug_toxicity({
    "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O"
  })
  Result: {hepatotoxicity: 0.45, ...}

Step 2: Get ibuprofen SMILES and evaluate
  Tool: evaluate_drug_toxicity({
    "smiles": "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O"
  })
  Result: {hepatotoxicity: 0.52, ...}

Step 3: Search literature for comparative studies
  Tool: search_drug_literature({
    "query": "aspirin ibuprofen safety comparison"
  })
  Result: [relevant papers...]

Step 4: Synthesize results
  "Based on toxicity models and literature, aspirin shows
   slightly lower hepatotoxicity (0.45 vs 0.52) but both
   are generally safe at therapeutic doses..."

⚙️ Configuration

Basic Configuration

# config/agent_config.yaml

mcp_servers:
  server_name:
    url: "http://server:port/mcp"
    transport: "streamable-http"
    timeout: 120

Advanced Configuration

mcp_servers:
  genomic_ops:
    url: "http://localhost:8000/mcp"
    transport: "streamable-http"
    timeout: 120
    headers:
      Authorization: "Bearer ${GENOMIC_OPS_TOKEN}"
      X-API-Version: "v1"
  
  txgemma:
    url: "https://secure-server.com/mcp"
    transport: "streamable-http"
    timeout: 180
    headers:
      Authorization: "Bearer ${TXGEMMA_TOKEN}"
    
  local_tool:
    command: ["python", "-m", "my_mcp_server"]
    transport: "stdio"
    env:
      DATABASE_URL: "postgresql://..."
      API_KEY: "${MY_API_KEY}"

Environment Variables

# Set tokens/keys
export GENOMIC_OPS_TOKEN="token123"
export TXGEMMA_TOKEN="token456"

# Reference in config with ${VAR_NAME}
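
How Perpendicularity resolves `${VAR_NAME}` references internally is not shown in this guide. The following sketch illustrates one common approach: walk the parsed config and substitute each reference from the environment, failing loudly when a variable is missing. `expand_env` is a hypothetical helper.

```python
import os
import re
from typing import Any, Mapping

# Matches ${VAR_NAME} references
_VAR_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any, env: Mapping[str, str] = os.environ) -> Any:
    """Recursively replace ${VAR} references in strings, dicts, and lists."""
    if isinstance(value, str):
        def replace(match: "re.Match") -> str:
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _VAR_RE.sub(replace, value)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # numbers, booleans, None pass through unchanged


if __name__ == "__main__":
    server = {"headers": {"Authorization": "Bearer ${GENOMIC_OPS_TOKEN}"}, "timeout": 120}
    print(expand_env(server, {"GENOMIC_OPS_TOKEN": "token123"}))
```

Raising on a missing variable, rather than substituting an empty string, surfaces a misconfigured token at startup instead of as a later 401 response.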

Multiple Servers

Connect to multiple MCP servers simultaneously:

mcp_servers:
  genomic_ops:
    url: "http://server1:8000/mcp"
  
  txgemma:
    url: "http://server2:8001/mcp"
  
  custom_tools:
    url: "http://server3:8002/mcp"

All tools from all servers are available to the agent!
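
With several servers connected, two of them may expose a tool with the same name. This guide does not say how Perpendicularity handles that case, so the sketch below shows one possible policy: keep unique names as-is and prefix colliding names with the server key. `merge_tools` and the `server__tool` naming are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def merge_tools(tools_by_server: Dict[str, List[str]]) -> Dict[str, Tuple[str, str]]:
    """Map each agent-visible tool name to (server, original tool name)."""
    # First pass: record which servers expose each tool name
    owners: Dict[str, List[str]] = {}
    for server, tools in tools_by_server.items():
        for tool in tools:
            owners.setdefault(tool, []).append(server)

    # Second pass: prefix only the names that collide
    registry: Dict[str, Tuple[str, str]] = {}
    for tool, servers in owners.items():
        for server in servers:
            visible = tool if len(servers) == 1 else f"{server}__{tool}"
            registry[visible] = (server, tool)
    return registry


if __name__ == "__main__":
    merged = merge_tools({
        "genomic_ops": ["list_species", "search"],
        "txgemma": ["evaluate_drug_toxicity", "search"],
    })
    for visible, target in sorted(merged.items()):
        print(visible, "->", target)
```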


🔨 Custom MCP Servers

Creating a Custom Server

1. Install MCP SDK:

pip install mcp

2. Implement Server:

# my_mcp_server.py

from mcp.server import Server
from mcp.types import Tool, TextContent

app = Server("my-custom-server")

@app.list_tools()
async def list_tools() -> list[Tool]:
    """Return available tools."""
    return [
        Tool(
            name="my_custom_tool",
            description="Does something useful",
            inputSchema={
                "type": "object",
                "properties": {
                    "param1": {
                        "type": "string",
                        "description": "First parameter"
                    },
                    "param2": {
                        "type": "integer",
                        "description": "Second parameter"
                    }
                },
                "required": ["param1"]
            }
        )
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    """Execute a tool."""
    if name == "my_custom_tool":
        param1 = arguments["param1"]
        param2 = arguments.get("param2", 0)
        
        # Your custom logic here
        result = f"Processed {param1} with {param2}"
        
        return [TextContent(type="text", text=result)]
    
    raise ValueError(f"Unknown tool: {name}")

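# Run server over streamable HTTP.
# Sketch based on the MCP Python SDK's StreamableHTTPSessionManager;
# check the import path against your installed SDK version.
if __name__ == "__main__":
    import contextlib

    import uvicorn
    from mcp.server.streamable_http_manager import StreamableHTTPSessionManager
    from starlette.applications import Starlette
    from starlette.routing import Mount

    # Stateless mode: each request is handled without a persistent session
    session_manager = StreamableHTTPSessionManager(app=app, stateless=True)

    async def handle_mcp(scope, receive, send):
        await session_manager.handle_request(scope, receive, send)

    @contextlib.asynccontextmanager
    async def lifespan(_starlette_app):
        # The session manager must be running while requests are served
        async with session_manager.run():
            yield

    # Serve the MCP endpoint at /mcp
    starlette_app = Starlette(
        routes=[Mount("/mcp", app=handle_mcp)],
        lifespan=lifespan,
    )
    uvicorn.run(starlette_app, host="0.0.0.0", port=8003)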

3. Configure in Perpendicularity:

# config/agent_config.yaml

mcp_servers:
  my_server:
    url: "http://localhost:8003/mcp"
    transport: "streamable-http"

4. Test:

# Start your server
python my_mcp_server.py

# Use in Perpendicularity
perpendicularity ask "Use my_custom_tool with param1='test'"

Tool Design Best Practices

1. Clear Descriptions:

Tool(
    name="calculate_similarity",
    description="Calculate molecular similarity between two compounds using Tanimoto coefficient",  # Specific!
    ...
)

2. Structured Input Schemas:

inputSchema={
    "type": "object",
    "properties": {
        "smiles1": {
            "type": "string",
            "description": "SMILES notation of first compound"
        },
        "smiles2": {
            "type": "string",
            "description": "SMILES notation of second compound"
        },
        "method": {
            "type": "string",
            "enum": ["tanimoto", "dice", "cosine"],
            "description": "Similarity metric to use"
        }
    },
    "required": ["smiles1", "smiles2"]
}

3. Informative Errors:

# is_valid_smiles is a placeholder for your own validation helper
if not is_valid_smiles(smiles):
    raise ValueError(
        f"Invalid SMILES: {smiles}. "
        f"SMILES must be a valid molecular structure notation."
    )

4. Structured Outputs:

import json

# Return structured data, serialized as JSON text
result = {
    "similarity": 0.85,
    "method": "tanimoto",
    "compounds": {
        "compound1": {"name": "...", "smiles": "..."},
        "compound2": {"name": "...", "smiles": "..."}
    }
}
return [TextContent(type="text", text=json.dumps(result, indent=2))]
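
As a concrete instance of the structured output above, the sketch below computes a Tanimoto coefficient and packages it as JSON text. Fingerprints are represented as sets of "on" bit indices; in a real tool they would come from a cheminformatics toolkit. The function names and the choice to return 0.0 for two empty fingerprints are assumptions.

```python
import json
from typing import Set


def tanimoto(bits_a: Set[int], bits_b: Set[int]) -> float:
    """Tanimoto coefficient: |A and B| / |A or B| over fingerprint on-bits."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / len(union)


def similarity_result(bits_a: Set[int], bits_b: Set[int]) -> str:
    """Serialize the score in the structured form a tool would return."""
    result = {
        "similarity": round(tanimoto(bits_a, bits_b), 4),
        "method": "tanimoto",
    }
    return json.dumps(result, indent=2)


if __name__ == "__main__":
    # Two shared bits out of four distinct bits
    print(similarity_result({1, 2, 3}, {2, 3, 4}))
```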

🔄 Transport Types

streamable-http (Recommended)

HTTP transport in which the client POSTs requests to a single endpoint and the server can stream responses back.

Pros:

  • ✅ Bidirectional communication
  • ✅ Works through firewalls
  • ✅ Standard HTTP/HTTPS
  • ✅ Easy to deploy

Configuration:

server:
  url: "http://server:8000/mcp"
  transport: "streamable-http"

SSE (Server-Sent Events)

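Streaming from server to client over SSE; client requests are sent as separate HTTP POSTs. The MCP specification has deprecated this transport in favor of streamable HTTP.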

Pros:

  • ✅ Simple protocol
  • ✅ Browser-compatible
  • ✅ Works through proxies

Cons:

  • ❌ Stream is server-to-client only (client requests need a separate POST endpoint)
  • ❌ Less efficient than streamable-http

Configuration:

server:
  url: "http://server:8000/sse"
  transport: "sse"

stdio (Standard I/O)

Local processes communicating via stdin/stdout.

Pros:

  • ✅ No network needed
  • ✅ Low latency (no network round trip)
  • ✅ Secure (local only)

Cons:

  • ❌ Local only
  • ❌ Process management needed

Configuration:

server:
  command: ["python", "-m", "my_server"]
  transport: "stdio"
  env:
    VAR: "value"

🛠️ Tool Management

Tool Discovery

Automatic on connection:

# When agent connects
async def connect(self):
    self.tool_manager = MCPToolManager(mcp_config)
    await self.tool_manager.connect()
    
    # Tools are now available
    print(f"Discovered {len(self.tool_manager.tools)} tools")
    for tool in self.tool_manager.tools:
        print(f"  - {tool.name}: {tool.description}")

Tool Execution

Agent calls tools automatically:

# Agent sees tool in context
# LLM generates: evaluate_drug_toxicity({"smiles": "CC(=O)O..."})

# Tool manager executes
result = await tool_manager.execute_tool(
    "evaluate_drug_toxicity",
    {"smiles": "CC(=O)O..."}
)

# Result returned to agent

Tool Filtering (Future)

# Filter tools by pattern
mcp_servers:
  genomic_ops:
    url: "..."
    enabled_tools:
      - "list_*"        # Allow tools whose names start with list_
      - "get_*"         # Also allow tools whose names start with get_
    
  txgemma:
    url: "..."
    disabled_tools:
      - "admin_*"       # Exclude admin tools
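
Since tool filtering is a planned feature, its exact semantics are not defined yet. The sketch below shows one plausible interpretation using shell-style patterns from the standard library: a tool is kept if it matches any `enabled_tools` pattern (when that list is present) and matches no `disabled_tools` pattern. `filter_tools` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional


def filter_tools(
    names: Iterable[str],
    enabled: Optional[List[str]] = None,
    disabled: Optional[List[str]] = None,
) -> List[str]:
    """Apply allow-list patterns first, then deny-list patterns."""
    kept = []
    for name in names:
        # With an allow-list, a tool must match at least one pattern
        if enabled is not None and not any(fnmatchcase(name, p) for p in enabled):
            continue
        # A deny-list match always excludes the tool
        if disabled and any(fnmatchcase(name, p) for p in disabled):
            continue
        kept.append(name)
    return kept


if __name__ == "__main__":
    tools = ["list_species", "get_overlapping_features", "lift_over_coordinates"]
    print(filter_tools(tools, enabled=["list_*", "get_*"]))
```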

🐛 Troubleshooting

Connection Failed

Problem: Failed to connect to MCP server

Solutions:

# 1. Check server is running
# Any HTTP response, even 405 or 406, means the server is reachable
curl -i http://server:8000/mcp

# 2. Check URL in config
# Verify http:// vs https://
# Verify port number
# Verify /mcp endpoint

# 3. Check firewall
ping server
telnet server 8000

# 4. Check server logs
# Server should show connection attempt

# 5. Test with debug logging
perpendicularity ask "test" --debug | grep -i mcp
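
If `telnet` is not installed, the same reachability check can be scripted with the standard library. This sketch only confirms that a TCP connection to the host and port succeeds; it does not verify that an MCP server is answering there. `can_connect` is a hypothetical helper.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures
        return False


if __name__ == "__main__":
    print("localhost:8000 reachable:", can_connect("localhost", 8000, timeout=1.0))
```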

No Tools Discovered

Problem: Tool list is empty after connection

Solutions:

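# 1. Verify the server answers tools/list (MCP uses JSON-RPC 2.0)
curl http://server:8000/mcp -X POST \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'

# Should return the tool list. Servers that track sessions may
# require an "initialize" request first and reject this call.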

# 2. Check server logs for errors

# 3. Verify transport type matches
# streamable-http is most common

# 4. Test server separately
# Ensure server works before connecting

Tool Execution Timeout

Problem: Tool execution timeout after 180 seconds

Solutions:

# Increase timeout
mcp_servers:
  slow_server:
    url: "..."
    timeout: 300  # 5 minutes
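
For reference, a per-call timeout of this kind is commonly enforced with `asyncio.wait_for`. The sketch below shows the pattern; it is an assumption about how such a limit can be applied, not a description of Perpendicularity's internals. `ToolTimeoutError` and `with_timeout` are hypothetical names.

```python
import asyncio
from typing import Any, Awaitable


class ToolTimeoutError(Exception):
    """Raised when a tool call exceeds its configured timeout."""


async def with_timeout(call: Awaitable[Any], timeout: float, tool_name: str) -> Any:
    """Await a tool call, cancelling it if it runs longer than `timeout` seconds."""
    try:
        return await asyncio.wait_for(call, timeout=timeout)
    except asyncio.TimeoutError:
        raise ToolTimeoutError(
            f"{tool_name} did not finish within {timeout:g} seconds"
        ) from None


async def _demo_slow_tool() -> str:
    # Stand-in for a slow remote tool call
    await asyncio.sleep(0.5)
    return "done"


if __name__ == "__main__":
    try:
        asyncio.run(with_timeout(_demo_slow_tool(), timeout=0.1, tool_name="slow_tool"))
    except ToolTimeoutError as exc:
        print(exc)
```

Raising the timeout helps with slow but healthy tools; if a call never returns, check the server logs before increasing the limit further.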

Authentication Errors

Problem: 401 Unauthorized or 403 Forbidden

Solutions:

# Add authentication headers
mcp_servers:
  secure_server:
    url: "https://server/mcp"
    headers:
      Authorization: "Bearer ${API_TOKEN}"

# Set environment variable
export API_TOKEN="your-token"


Connect your domain expertise to Perpendicularity with MCP! 🔌

For questions, see Troubleshooting or open an issue.