II. Implementation

Best Practices

Overview

Follow these guidelines to build robust, performant applications with the Gracenote Video MCP Server.

Best Practices

Working with Images

Image URIs are constructed to include your subdomain as follows:

https://acme-media.tmsimg.com/assets/o8_l_h15_ac.png

Do not expose these image URLs directly to end users. Instead, cache the images in your own content delivery network (CDN), for the following reasons:

URLs may expire
Reduces external dependencies
Improves performance
Required for production use

Implementation:


Code
 
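import requests
from pathlib import Path

CACHE_DIR = Path("./image_cache")

def cache_image(image_url, image_uri):
    # Use the last path segment of the URI as the filename for consistency
    local_path = CACHE_DIR / Path(image_uri).name

    if not local_path.exists():
        CACHE_DIR.mkdir(parents=True, exist_ok=True)  # Create the cache directory on first use
        response = requests.get(image_url, timeout=10)
        response.raise_for_status()  # Avoid caching an error page as an image
        local_path.write_bytes(response.content)

    return str(local_path)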

Production: Store in CDN or object storage (S3, GCS, Azure Blob).
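
One way to keep the caching logic independent of where the bytes end up is to hide the storage backend behind a small interface. The sketch below is illustrative only: `ImageStore`, `InMemoryStore`, and `cache_image_to_store` are hypothetical names, the `images/` key prefix is an arbitrary choice, and the fetch function is injected so it can be replaced with a real HTTP download. A production deployment would presumably swap the in-memory store for an S3, GCS, or Azure Blob client.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict, Protocol


class ImageStore(Protocol):
    """Hypothetical storage interface; not part of any Gracenote SDK."""

    def exists(self, key: str) -> bool: ...
    def put(self, key: str, data: bytes) -> None: ...


class InMemoryStore:
    """Stand-in backend for local testing."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def exists(self, key: str) -> bool:
        return key in self.objects

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data


def cache_image_to_store(
    image_url: str,
    image_uri: str,
    store: ImageStore,
    fetch: Callable[[str], bytes],
) -> str:
    """Copy the image into the store once and return its storage key."""
    key = f"images/{PurePosixPath(image_uri).name}"
    if not store.exists(key):
        store.put(key, fetch(image_url))  # Download only on a cache miss
    return key
```

Because the download function is a parameter, the same code path can be exercised in tests without network access.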

User Experience

Handle Multiple Candidates

Entity Resolution returns up to 5 results ranked by confidence.

Implementation Options:

Option 1: Auto-select highest confidence


Code
 
if candidates:
    best_match = candidates[0]  # Highest confidence
    return best_match

Option 2: Present options to user


Code
 
if len(candidates) > 1:
    # Show disambiguation UI
    return present_options(candidates)
elif candidates:
    return candidates[0]  # Single match
else:
    return None  # No matches

Option 3: Use confidence threshold


Code
 
if not candidates:
    return None  # No matches; avoids an IndexError below
if candidates[0].confidence > 0.9:
    return candidates[0]  # High confidence, auto-select
else:
    return present_options(candidates)  # Low confidence, ask user
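
The threshold option can be written as a pure function that returns a decision and leaves the UI to the caller. This is a minimal sketch under stated assumptions: the `Candidate` dataclass, its field names, and the action strings are hypothetical, and the candidates are assumed to arrive already ranked by confidence as described above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    """Illustrative shape only; the real result fields may differ."""
    title: str
    confidence: float


def choose_candidate(
    candidates: List[Candidate], threshold: float = 0.9
) -> Tuple[str, Optional[Candidate]]:
    """Return (action, candidate), where action is 'none', 'auto', or 'ask'."""
    if not candidates:
        return ("none", None)  # Nothing to select or show
    best = candidates[0]  # Assumed to be the highest-confidence result
    if best.confidence > threshold:
        return ("auto", best)  # Confident enough to skip the prompt
    return ("ask", None)  # Caller should show a disambiguation UI
```

Returning an action instead of calling the UI directly keeps the selection rule easy to unit test and lets the threshold be tuned per application.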

Optimize Query Patterns

Encourage users to provide specific details for better results.

Implementation:


Code
 
SYSTEM_PROMPT = """
When users ask vague questions, prompt them for more details:
- "Which year was that movie released?"
- "Do you remember any actors in it?"
- "Was it a movie or TV series?"
"""

UI Hints:

Placeholder text: "e.g., The Matrix (1999) with Keanu Reeves"
Suggested filters: Year, genre, cast
Auto-complete with popular titles

Provide Helpful Error Messages

Set user expectations when content isn't found.

Implementation:


Code
 
if not results:
    return {
        "message": "We couldn't find that title. Try:",
        "suggestions": [
            "Check the spelling",
            "Include the release year",
            "Try an alternate title",
            "This content may not be available in your region"
        ]
    }

Production Requirements

Environment Variables

Always use environment variables for sensitive data:

Code
 
# LLM API Keys (at least one required)
ANTHROPIC_API_KEY=sk-ant-...
# OPENAI_API_KEY=sk-...
# GEMINI_API_KEY=your-key-here

# MCP Server Configuration
MCP_SERVER_HOST=your-server.gracenote.com
MCP_SERVER_PORT=443

# Cognito Authentication
COGNITO_USER_POOL_REGION=us-west-2
COGNITO_CLIENT_ID=your-client-id
COGNITO_CLIENT_SECRET=your-client-secret
COGNITO_USERNAME=your-username
COGNITO_PASSWORD=your-password
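
At startup, the application can read these variables once and fail fast if any are missing. The following is a sketch rather than part of the Gracenote tooling: `Settings` and `load_settings` are hypothetical names, and falling back to port 443 when `MCP_SERVER_PORT` is unset is an assumption based on the example value above.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping

REQUIRED_VARS = (
    "MCP_SERVER_HOST",
    "COGNITO_USER_POOL_REGION",
    "COGNITO_CLIENT_ID",
    "COGNITO_CLIENT_SECRET",
    "COGNITO_USERNAME",
    "COGNITO_PASSWORD",
)


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    region: str
    client_id: str
    username: str
    # repr=False keeps secrets out of logs and tracebacks
    client_secret: str = field(repr=False)
    password: str = field(repr=False)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables listed above and raise if any are missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        host=env["MCP_SERVER_HOST"],
        port=int(env.get("MCP_SERVER_PORT", "443")),  # Default is an assumption
        region=env["COGNITO_USER_POOL_REGION"],
        client_id=env["COGNITO_CLIENT_ID"],
        username=env["COGNITO_USERNAME"],
        client_secret=env["COGNITO_CLIENT_SECRET"],
        password=env["COGNITO_PASSWORD"],
    )
```

Passing the environment as a mapping makes the loader testable with a plain dictionary.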

Connection Pooling

Reuse a single connected MCP client across requests instead of reconnecting for each one:


Code
 
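import asyncio

from mcp_utils import GracenoteClient

class MCPConnectionPool:
    def __init__(self):
        self.client = None
        self.lock = asyncio.Lock()  # Prevents concurrent callers from connecting twice
    
    async def get_client(self):
        async with self.lock:
            if self.client is None:
                client = GracenoteClient()
                await client.connect()
                self.client = client  # Store only after a successful connect
            return self.client

# Usage (inside an async function)
pool = MCPConnectionPool()
client = await pool.get_client()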
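
To check that concurrent callers really share one connection, the same pattern can be exercised against a stand-in client. In this sketch `FakeClient` and `SharedClient` are hypothetical names used only for the demonstration, and the factory argument replaces the direct `GracenoteClient()` call so that no server is needed.

```python
import asyncio


class FakeClient:
    """Stand-in for a real client; counts connection attempts."""
    connects = 0

    async def connect(self) -> None:
        await asyncio.sleep(0)  # Yield control, as a network call would
        FakeClient.connects += 1


class SharedClient:
    def __init__(self, factory):
        self._factory = factory
        self._client = None
        self._lock = asyncio.Lock()

    async def get_client(self):
        async with self._lock:
            if self._client is None:
                client = self._factory()
                await client.connect()
                self._client = client  # Publish only after connect succeeds
            return self._client


async def demo() -> int:
    FakeClient.connects = 0
    shared = SharedClient(FakeClient)
    # Ten concurrent requests should all receive the same client object
    clients = await asyncio.gather(*(shared.get_client() for _ in range(10)))
    assert all(c is clients[0] for c in clients)
    return FakeClient.connects
```

Running `asyncio.run(demo())` returns the number of connection attempts, which stays at one because the lock serializes the first-time setup.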

Caching Strategy

Cache frequently requested data:


Code
 
from functools import lru_cache
import time

# Simple in-memory cache
@lru_cache(maxsize=1000)
def get_cached_result(tmsid):
    return fetch_from_mcp(tmsid)

# Time-based cache
cache = {}
CACHE_TTL = 3600  # 1 hour

def get_with_ttl_cache(key):
    if key in cache:
        data, timestamp = cache[key]
        if time.time() - timestamp < CACHE_TTL:
            return data
    
    result = fetch_from_mcp(key)
    cache[key] = (result, time.time())
    return result
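
Note that `lru_cache` entries never expire on their own, and the dictionary-based cache above grows without bound. A small class can combine expiry with an injectable clock so that the behaviour is testable. This is a sketch, not a Gracenote utility: `TTLCache` and `get_or_fetch` are hypothetical names, and the fetch function stands in for `fetch_from_mcp`.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[Any, float]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[Hashable], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]  # Still fresh: serve from the cache
        value = fetch(key)  # Missing or expired: fetch and store again
        self._entries[key] = (value, now)
        return value

    def purge_expired(self) -> int:
        """Drop stale entries and return how many were removed."""
        now = self._clock()
        stale = [k for k, (_, ts) in self._entries.items() if now - ts >= self._ttl]
        for k in stale:
            del self._entries[k]
        return len(stale)
```

Using `time.monotonic` as the default clock avoids surprises when the system time is adjusted; calling `purge_expired` periodically keeps memory use in check.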

Implement Token Refresh

Tokens expire after 1 hour. The GracenoteClient from mcp_utils.py handles this automatically: it checks the token age before each call and reconnects if the token is within 5 minutes of expiry. No additional code is needed.

For custom implementations without GracenoteClient, the pattern is:


Code
 
import time

class TokenManager:
    def __init__(self):
        self.token_acquired_at = None
        self.token_expires_in = 3600  # 1 hour
    
    def should_refresh(self):
        if not self.token_acquired_at:
            return True
        elapsed = time.time() - self.token_acquired_at
        return elapsed > (self.token_expires_in - 300)  # 5 min buffer
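
The class above only decides when a refresh is due. A slightly fuller sketch, shown below, also stores the token and records when it was acquired. `RefreshingTokenSource` and the injected `fetch_token` callable are hypothetical; how a token is actually obtained (for example, from Cognito) is outside the scope of this example.

```python
import time
from typing import Callable, Optional


class RefreshingTokenSource:
    def __init__(
        self,
        fetch_token: Callable[[], str],
        expires_in: float = 3600,  # 1 hour, as stated above
        buffer: float = 300,  # Refresh 5 minutes before expiry
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch_token = fetch_token
        self._expires_in = expires_in
        self._buffer = buffer
        self._clock = clock
        self._token: Optional[str] = None
        self._acquired_at = 0.0

    def should_refresh(self) -> bool:
        if self._token is None:
            return True
        elapsed = self._clock() - self._acquired_at
        return elapsed > self._expires_in - self._buffer

    def get_token(self) -> str:
        """Return a valid token, fetching a new one when the old one is near expiry."""
        if self.should_refresh():
            self._token = self._fetch_token()
            self._acquired_at = self._clock()
        return self._token
```

Callers ask for `get_token()` before each request and never handle expiry themselves.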

Use Secrets Manager

Never hardcode credentials.

Why: Credentials committed to source code are visible to anyone with repository access and are difficult to rotate.

Options:

AWS Secrets Manager
HashiCorp Vault
Azure Key Vault
Google Secret Manager
Environment variables (minimum)

Example with AWS Secrets Manager:


Code
 
import boto3
import json

def get_credentials():
    client = boto3.client('secretsmanager')
    response = client.get_secret_value(SecretId='mcp-credentials')
    return json.loads(response['SecretString'])

creds = get_credentials()
client = GracenoteClient(
    username=creds['username'],
    password=creds['password'],
    # ...
)

Implement Health Checks

Monitor MCP Server connectivity.

Implementation:


Code
 
import asyncio
import time

async def health_check():
    try:
        await client.list_tools()
        return {"status": "healthy", "timestamp": time.time()}
    except Exception as e:
        return {"status": "unhealthy", "error": str(e), "timestamp": time.time()}

async def monitor():
    while True:
        health = await health_check()
        log_health_status(health)  # Your own logging or alerting hook
        await asyncio.sleep(60)  # Check every minute
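
A health check that hangs is as unhelpful as one that fails, so it may be worth bounding the call with a timeout. The sketch below assumes only that the client exposes an awaitable `list_tools()` method, as in the example above; the function name, the 5-second default, and the `latency_seconds` field are illustrative choices.

```python
import asyncio
import time


async def health_check_with_timeout(client, timeout_seconds: float = 5.0) -> dict:
    started = time.monotonic()
    try:
        # Cancel the probe if the server does not answer in time
        await asyncio.wait_for(client.list_tools(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return {"status": "unhealthy", "error": "timed out"}
    except Exception as exc:
        return {"status": "unhealthy", "error": str(exc)}
    return {"status": "healthy", "latency_seconds": time.monotonic() - started}
```

Reporting latency alongside the status makes slow degradation visible before the check starts failing outright.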

Monitor Usage

Track API usage via the Customer Dashboard.

What to Monitor:

Query volume and patterns
Error rates
Response times
Tool usage distribution
Cache hit rates

Implementation:


Code
 
import logging
import time

logger = logging.getLogger(__name__)

async def call_tool_with_logging(tool_name, args):
    start = time.time()
    try:
        result = await client.call_tool(tool_name, args)
        duration = time.time() - start
        logger.info(f"Tool: {tool_name}, Duration: {duration}s, Success: True")
        return result
    except Exception as e:
        duration = time.time() - start
        logger.error(f"Tool: {tool_name}, Duration: {duration}s, Error: {e}")
        raise
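
Log lines are useful for debugging, but rates such as error rate and average response time are easier to read from counters. The following in-process aggregator is a minimal sketch: `UsageStats` and its methods are hypothetical, the tool names in the usage note are placeholders, and a production system would more likely export these numbers to a metrics service.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UsageStats:
    calls: Counter = field(default_factory=Counter)
    errors: Counter = field(default_factory=Counter)
    total_seconds: float = 0.0

    def record(self, tool_name: str, duration: float, ok: bool) -> None:
        """Register one tool call and its outcome."""
        self.calls[tool_name] += 1
        self.total_seconds += duration
        if not ok:
            self.errors[tool_name] += 1

    def error_rate(self) -> float:
        total = sum(self.calls.values())
        return sum(self.errors.values()) / total if total else 0.0

    def average_seconds(self) -> float:
        total = sum(self.calls.values())
        return self.total_seconds / total if total else 0.0
```

A wrapper like `call_tool_with_logging` could call `stats.record(tool_name, duration, ok=True)` on success and `ok=False` in the exception handler; `stats.calls` then gives the tool usage distribution directly.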

Error Handling

Common Error Scenarios

Input Validation Errors:

Missing required parameters
Invalid parameter formats
Exceeded input limits (e.g., >3 cast members)

Response: Human-readable error message with guidance on proper usage

No Results Found:

Query matches no content
Content not available in specified country/language

Response: Empty results (not an error)

Server Errors:

Internal processing failures
Service unavailability

Response: Error status with message

Implement Retry Logic

Handle transient failures gracefully.

Implementation:


Code
 
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10)
)
async def call_tool_with_retry(tool_name, args):
    return await client.call_tool(tool_name, args)
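
The decorator above retries on any exception, including input validation errors that will fail again on every attempt. A hand-rolled alternative that retries only transient failures might look like the sketch below. It uses only the standard library; `retry_transient` is a hypothetical helper, and treating `ConnectionError` and `TimeoutError` as the transient set is an assumption, since the exception types raised by the client are not specified here.

```python
import asyncio

# Assumed set of retryable errors; adjust to what your client raises
TRANSIENT_ERRORS = (ConnectionError, TimeoutError)


async def retry_transient(
    call,
    attempts: int = 3,
    base_delay: float = 2.0,
    max_delay: float = 10.0,
    sleep=asyncio.sleep,
):
    """Await call() up to `attempts` times, backing off exponentially."""
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except TRANSIENT_ERRORS:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error
            # Delays of 2s, 4s, 8s, ... capped at max_delay
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await sleep(delay)
```

A call site would pass a zero-argument coroutine function, for example `await retry_transient(lambda: client.call_tool(tool_name, args))`. Any exception outside the transient set propagates immediately without a retry.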

Handle Missing Optional Fields

Not all titles have complete metadata. Data coverage varies by title and region.

Implementation:


Code
 
# Append each sentence only when the field is present and non-empty
cast = metadata.get('cast', [])
if cast:
    response += f"Starring {', '.join(cast)}. "

genres = metadata.get('genres', [])
if genres:
    response += f"Genres: {', '.join(genres)}. "
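
The same idea can be packaged as a function that builds the description from whichever fields are present. This is a sketch under the assumption that `cast` and `genres`, when present, are lists of strings; `describe_title` is a hypothetical helper name.

```python
from typing import Any, Mapping


def describe_title(metadata: Mapping[str, Any]) -> str:
    """Build a short description, skipping fields that are missing or empty."""
    sentences = []
    cast = metadata.get("cast") or []  # Treats a missing key and None alike
    if cast:
        sentences.append(f"Starring {', '.join(cast)}.")
    genres = metadata.get("genres") or []
    if genres:
        sentences.append(f"Genres: {', '.join(genres)}.")
    return " ".join(sentences)
```

Using `metadata.get(...) or []` also covers the case where a key exists but holds `None`, which a plain default argument to `get` would not.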

Validate Before Production

Test with edge cases and error scenarios.

Test Cases:

Ambiguous titles (multiple matches)
Obscure content (no matches)
Invalid TMSIDs
Missing metadata fields
Token expiration

LLM Self-Correction

The MCP Server returns descriptive error messages that enable your LLM to:

Understand what went wrong
Adjust parameters automatically
Retry with corrected input

Example: If your LLM sends 5 cast members, the server responds with "Maximum 3 cast members allowed". Your LLM can then retry automatically with only 3.

Converting MCP Tools to LiteLLM Format

LiteLLM uses the OpenAI tool format. The GracenoteClient from mcp_utils.py provides a convenience method:


Code
 
# Recommended: use the built-in helper
tools = await client.tools_for_litellm()

If you need to convert manually:


Code
 
mcp_tools = await mcp_client.list_tools()

tools = [{
    "type": "function",
    "function": {
        "name": tool.name,
        "description": tool.description,
        "parameters": tool.inputSchema
    }
} for tool in mcp_tools.tools]

This format works with LiteLLM-supported models that accept tool definitions (Claude, GPT, Gemini, etc.).

Last modified on March 31, 2026
