Do not expose image URLs to end users. Cache the images and serve them from your own content delivery network (CDN) instead. Reasons for this are:
URLs may expire
Reduces external dependencies
Improves performance
Required for production use
Implementation:
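```python
from pathlib import Path

import requests

CACHE_DIR = Path("./image_cache")

def cache_image(image_url, image_uri):
    # Use the URI's final path component as the filename for consistency
    local_path = CACHE_DIR / Path(image_uri).name
    if not local_path.exists():
        CACHE_DIR.mkdir(parents=True, exist_ok=True)  # Create the cache directory on first use
        response = requests.get(image_url, timeout=10)
        response.raise_for_status()  # Do not cache error responses
        local_path.write_bytes(response.content)
    return str(local_path)
```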
Production: Store cached images in a CDN or object storage (S3, GCS, Azure Blob) rather than on local disk.
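When cached images move to object storage, it helps to derive a stable object key from the image URI so that the same image always maps to the same location. The following is a minimal sketch, not a prescribed layout: the CDN base URL `https://cdn.example.com` and the `images` key prefix are hypothetical placeholders, and the upload itself is left to whichever storage SDK you use.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical values for illustration; substitute your own CDN and prefix.
CDN_BASE_URL = "https://cdn.example.com"
KEY_PREFIX = "images"


def object_key(image_uri: str) -> str:
    """Build a stable storage key from the image URI.

    Hashing the full URI avoids collisions between images that share a
    filename but live under different paths.
    """
    digest = hashlib.sha256(image_uri.encode("utf-8")).hexdigest()[:16]
    suffix = PurePosixPath(image_uri).suffix  # keeps ".jpg", ".png", etc.
    return f"{KEY_PREFIX}/{digest}{suffix}"


def public_url(image_uri: str) -> str:
    """URL on your own CDN that can be shown to end users."""
    return f"{CDN_BASE_URL}/{object_key(image_uri)}"
```

Using a hash of the whole URI instead of only the filename is a design choice: it trades human-readable keys for protection against two different images overwriting each other.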
User Experience
Handle Multiple Candidates
Entity Resolution returns up to 5 results ranked by confidence.
Implementation Options:
Option 1: Auto-select highest confidence
```python
def select_candidate(candidates):
    if not candidates:
        return None  # Nothing to select
    return candidates[0]  # Highest confidence
```
Option 2: Present options to user
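```python
def select_candidate(candidates):
    if not candidates:
        return None
    if len(candidates) > 1:
        # Show disambiguation UI (present_options is application-defined)
        return present_options(candidates)
    return candidates[0]
```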
Option 3: Use confidence threshold
```python
def select_candidate(candidates, threshold=0.9):
    if not candidates:
        return None
    if candidates[0].confidence > threshold:
        return candidates[0]  # High confidence, auto-select
    return present_options(candidates)  # Low confidence, ask user
```
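To make the threshold approach easier to try out, here is a self-contained sketch that returns a decision instead of calling a UI function. The `Candidate` dataclass and the action names (`auto_select`, `ask_user`, `not_found`) are assumptions made for illustration; real Entity Resolution results may have a different shape.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical shape; real results may carry more fields.
    title: str
    confidence: float


def choose(candidates, threshold=0.9):
    """Return (action, payload) for a confidence-ranked candidate list."""
    if not candidates:
        return ("not_found", [])
    best = candidates[0]  # list is assumed sorted, highest confidence first
    if best.confidence > threshold:
        return ("auto_select", [best])
    return ("ask_user", candidates)


if __name__ == "__main__":
    results = [Candidate("The Matrix", 0.97), Candidate("The Matrix Reloaded", 0.41)]
    action, payload = choose(results)
    print(action, [c.title for c in payload])
```

Returning an action keeps the selection logic independent of any particular UI, so the same function can back a chat flow or a web form.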
Optimize Query Patterns
Encourage users to provide specific details for better results.
Implementation:
```python
SYSTEM_PROMPT = """When users ask vague questions, prompt them for more details:
- "Which year was that movie released?"
- "Do you remember any actors in it?"
- "Was it a movie or TV series?"
"""
```
UI Hints:
Placeholder text: "e.g., The Matrix (1999) with Keanu Reeves"
Suggested filters: Year, genre, cast
Auto-complete with popular titles
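The auto-complete hint can be sketched as a simple prefix match. This assumes you maintain your own list of popular titles, ordered by popularity; the function name `suggest_titles` and its parameters are illustrative and not part of any Gracenote API.

```python
def suggest_titles(prefix, popular_titles, limit=5):
    """Return up to `limit` titles that start with the typed prefix.

    `popular_titles` is assumed to be ordered by popularity, so the first
    matches are the most relevant ones.
    """
    needle = prefix.strip().casefold()
    if not needle:
        return []  # Do not suggest anything for empty input
    matches = [t for t in popular_titles if t.casefold().startswith(needle)]
    return matches[:limit]
```

A linear scan like this is fine for a few thousand titles; larger catalogs would likely need an index or a search service.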
Provide Helpful Error Messages
Set user expectations when content isn't found.
Implementation:
```python
if not results:
    return {
        "message": "We couldn't find that title. Try:",
        "suggestions": [
            "Check the spelling",
            "Include the release year",
            "Try an alternate title",
            "This content may not be available in your region"
        ]
    }
```
Production Requirements
Environment Variables
Always use environment variables for sensitive data:
```bash
# LLM API Keys (at least one required)
ANTHROPIC_API_KEY=sk-ant-...
# OPENAI_API_KEY=sk-...
# GEMINI_API_KEY=your-key-here

# MCP Server Configuration
MCP_SERVER_HOST=your-server.gracenote.com
MCP_SERVER_PORT=443

# Cognito Authentication
COGNITO_USER_POOL_REGION=us-west-2
COGNITO_CLIENT_ID=your-client-id
COGNITO_CLIENT_SECRET=your-client-secret
COGNITO_USERNAME=your-username
COGNITO_PASSWORD=your-password
```
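Reading these variables at startup and failing fast when one is missing tends to surface configuration problems early. The sketch below uses only the variable names listed above; the `load_settings` helper is hypothetical, and treating every MCP and Cognito variable as required is an assumption you may need to relax.

```python
import os

LLM_KEY_NAMES = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY")

# Assumption: all MCP and Cognito variables are mandatory.
REQUIRED_NAMES = (
    "MCP_SERVER_HOST",
    "MCP_SERVER_PORT",
    "COGNITO_USER_POOL_REGION",
    "COGNITO_CLIENT_ID",
    "COGNITO_CLIENT_SECRET",
    "COGNITO_USERNAME",
    "COGNITO_PASSWORD",
)


def load_settings(env=None):
    """Read settings from the environment and fail fast on missing values."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_NAMES if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    llm_keys = {name: env[name] for name in LLM_KEY_NAMES if env.get(name)}
    if not llm_keys:
        raise RuntimeError("Set at least one of: " + ", ".join(LLM_KEY_NAMES))
    settings = {name: env[name] for name in REQUIRED_NAMES}
    settings["MCP_SERVER_PORT"] = int(settings["MCP_SERVER_PORT"])
    settings["llm_keys"] = llm_keys
    return settings
```

Passing a plain dictionary as `env` makes the function easy to exercise in tests without touching the real environment.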
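Implement Caching
Cache MCP responses so that repeated lookups for the same ID do not trigger another request.
```python
import time
from functools import lru_cache

# Simple in-memory cache (fetch_from_mcp is your application's lookup function)
@lru_cache(maxsize=1000)
def get_cached_result(tmsid):
    return fetch_from_mcp(tmsid)

# Time-based cache
cache = {}
CACHE_TTL = 3600  # 1 hour

def get_with_ttl_cache(key):
    if key in cache:
        data, timestamp = cache[key]
        if time.time() - timestamp < CACHE_TTL:
            return data  # Entry is still fresh
    result = fetch_from_mcp(key)
    cache[key] = (result, time.time())
    return result
```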
Implement Token Refresh
Tokens expire after 1 hour. The GracenoteClient from mcp_utils.py handles this automatically: it checks the token's age before each call and reconnects if the token is within 5 minutes of expiry. No additional code is needed.
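The timing described above amounts to a small piece of arithmetic: with a 1-hour lifetime and a 5-minute margin, a token should be refreshed once it is 55 minutes old. The helper below illustrates only that check; it is not the actual GracenoteClient code, and the names `needs_refresh`, `TOKEN_LIFETIME_SECONDS`, and `REFRESH_MARGIN_SECONDS` are hypothetical.

```python
import time

TOKEN_LIFETIME_SECONDS = 3600  # tokens expire after 1 hour
REFRESH_MARGIN_SECONDS = 300   # refresh when within 5 minutes of expiry


def needs_refresh(issued_at, now=None):
    """Return True once the token is inside the refresh window."""
    now = time.time() if now is None else now
    age = now - issued_at
    return age >= TOKEN_LIFETIME_SECONDS - REFRESH_MARGIN_SECONDS
```

The optional `now` argument exists so the check can be tested with fixed timestamps.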
For custom implementations without GracenoteClient, the pattern is: