LLM Integration
How to use rendered prompts with LLM clients
Using Rendered Prompts with LLM Providers
This guide covers all the ways you can use PromptEngine's rendered prompts with various LLM provider gems and custom clients in your Rails application.
Basic Usage
First, render a prompt with your variables:
rendered = PromptEngine.render("customer-support",
  { customer_name: "Alice", issue: "Password reset" }
)
The rendered object is a PromptEngine::RenderedPrompt instance that provides multiple ways to integrate with LLM providers.
Accessing Rendered Prompt Data
The rendered prompt provides accessors for all the data you might need:
# Core content
rendered.content # => "Help Alice with Password reset"
rendered.system_message # => "You are a helpful support agent"
# Model configuration
rendered.model # => "gpt-4"
rendered.temperature # => 0.7
rendered.max_tokens # => 1000
# Metadata
rendered.status # => "active"
rendered.version # => 3
# Parameters used
rendered.parameters # => {"customer_name" => "Alice", "issue" => "Password reset"}
rendered.parameter(:customer_name) # => "Alice"
# Options passed during rendering
rendered.options # => {model: "gpt-4", temperature: 0.7}
Integration Methods
Automatic Integration with execute_with
The simplest way to execute a rendered prompt is the execute_with method, which automatically detects the client type and formats the request appropriately:
# OpenAI
client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])
response = rendered.execute_with(client)
# Anthropic
client = Anthropic::Client.new(access_token: ENV["ANTHROPIC_API_KEY"])
response = rendered.execute_with(client)
# RubyLLM
client = RubyLLM::ChatModels::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
response = rendered.execute_with(client)
You can also pass additional options:
response = rendered.execute_with(client,
  stream: true,
  tools: [...],
  response_format: { type: "json_object" }
)
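Because execute_with adapts to whichever client you hand it, you can also pick the client at runtime based on the model stored with the prompt. A minimal sketch, assuming you only use the OpenAI and Anthropic clients shown above:
# Route to a client based on the model configured for this prompt
client =
  if rendered.model.to_s.start_with?("claude")
    Anthropic::Client.new(access_token: ENV["ANTHROPIC_API_KEY"])
  else
    OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])
  end

response = rendered.execute_with(client)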
OpenAI Gem Integration
For direct OpenAI gem usage, use the to_openai_params method:
require "openai"
client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])
# Get formatted parameters
params = rendered.to_openai_params
# Make the API call
response = client.chat(parameters: params)
# Access the response
puts response.dig("choices", 0, "message", "content")
Adding additional OpenAI-specific options:
# Add tools, functions, or other OpenAI-specific parameters
params = rendered.to_openai_params(
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get the current weather",
        parameters: {
          type: "object",
          properties: {
            location: { type: "string" }
          }
        }
      }
    }
  ],
  response_format: { type: "json_object" }
)
response = client.chat(parameters: params)
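When the model decides to call one of the tools, the reply carries tool_calls instead of plain text. A minimal sketch of reading them back, assuming the standard OpenAI chat response shape:
require "json"

message = response.dig("choices", 0, "message")

if message["tool_calls"]
  message["tool_calls"].each do |tool_call|
    name = tool_call.dig("function", "name")                  # e.g. "get_weather"
    args = JSON.parse(tool_call.dig("function", "arguments")) # arguments arrive as a JSON string
    # Dispatch to your own code, then send the result back in a follow-up request
  end
else
  puts message["content"]
end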
RubyLLM Integration
RubyLLM provides a unified interface for multiple LLM providers:
require "ruby_llm"
# Configure RubyLLM
RubyLLM.configure do |config|
  config.openai_api_key = ENV["OPENAI_API_KEY"]
  config.anthropic_api_key = ENV["ANTHROPIC_API_KEY"]
end
# Create a chat model
chat = RubyLLM.chat(provider: "openai", model: rendered.model)
# Get formatted parameters
params = rendered.to_ruby_llm_params
# Execute the prompt
response = chat.messages(params[:messages])
  .with_temperature(params[:temperature])
  .with_max_tokens(params[:max_tokens])
  .ask

# Or more concisely
response = chat.ask(params[:messages][-1][:content],
  instructions: params[:messages][0][:content],
  temperature: params[:temperature],
  max_tokens: params[:max_tokens]
)
Using RubyLLM with different providers:
# Anthropic via RubyLLM
anthropic_chat = RubyLLM.chat(provider: "anthropic", model: "claude-3-opus")
response = anthropic_chat.messages(rendered.messages)
  .with_temperature(rendered.temperature)
  .ask
# OpenAI via RubyLLM
openai_chat = RubyLLM.chat(provider: "openai", model: rendered.model)
response = openai_chat.messages(rendered.messages)
  .with_temperature(rendered.temperature)
  .ask
Anthropic Integration
For direct Anthropic usage:
require "anthropic"
client = Anthropic::Client.new(access_token: ENV["ANTHROPIC_API_KEY"])
# Get formatted parameters
params = rendered.to_ruby_llm_params
# Anthropic expects a specific format
response = client.messages(
  model: params[:model] || "claude-3-opus",
  messages: params[:messages],
  max_tokens: params[:max_tokens] || 1000,
  temperature: params[:temperature]
)
puts response.dig("content", 0, "text")
Manual Integration Patterns
Using the Messages Array
The messages method returns a properly formatted array for chat models:
messages = rendered.messages
# => [
# { role: "system", content: "You are a helpful support agent" },
# { role: "user", content: "Help Alice with Password reset" }
# ]
# Use with any client that expects messages format
response = custom_client.complete(
  messages: messages,
  model: rendered.model,
  temperature: rendered.temperature
)
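The same array also works for in-house HTTP services that follow the OpenAI chat format. A minimal sketch using Faraday; the endpoint URL and environment variable are placeholders for your own service:
require "faraday"
require "json"

http_response = Faraday.post("https://llm.internal.example.com/v1/chat/completions") do |req|
  req.headers["Content-Type"] = "application/json"
  req.headers["Authorization"] = "Bearer #{ENV['INTERNAL_LLM_API_KEY']}"
  req.body = {
    model: rendered.model,
    messages: rendered.messages,
    temperature: rendered.temperature,
    max_tokens: rendered.max_tokens
  }.to_json
end

completion = JSON.parse(http_response.body)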
Using Hash Representation
The to_h method provides complete access to all data:
data = rendered.to_h
# => {
# content: "Help Alice with Password reset",
# system_message: "You are a helpful support agent",
# model: "gpt-4",
# temperature: 0.7,
# max_tokens: 1000,
# messages: [...],
# options: {...},
# status: "active",
# version: 3,
# parameters: {"customer_name" => "Alice", "issue" => "Password reset"}
# }
# Use with custom API clients
response = CustomLLMClient.generate(
  prompt: data[:content],
  system: data[:system_message],
  model: data[:model],
  config: {
    temperature: data[:temperature],
    max_tokens: data[:max_tokens]
  }
)
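The hash is also handy for auditing and debugging, for example persisting exactly what was sent alongside the provider's reply. A minimal sketch, assuming a hypothetical PromptAuditLog model with a JSON payload column:
# PromptAuditLog is a hypothetical model with a json/jsonb "payload" column
PromptAuditLog.create!(
  prompt_name: "customer-support",
  prompt_version: data[:version],
  payload: data
)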
Custom Client Integration
For clients with unique interfaces, access individual components:
# For a client that separates system and user prompts
response = ProprietaryLLM.complete(
  system_prompt: rendered.system_message,
  user_prompt: rendered.content,
  model_name: rendered.model,
  creativity: rendered.temperature,
  token_limit: rendered.max_tokens
)
# For a client that uses a single prompt
combined_prompt = if rendered.system_message.present?
  "#{rendered.system_message}\n\n#{rendered.content}"
else
  rendered.content
end

response = SimpleLLM.generate(
  text: combined_prompt,
  settings: {
    model: rendered.model,
    temp: rendered.temperature
  }
)
Advanced Usage
Overriding Options at Execution Time
You can override prompt settings when executing:
# Render with default settings
rendered = PromptEngine.render("creative-writer", { topic: "Space exploration" })
# Override temperature for more creativity
params = rendered.to_openai_params(temperature: 0.9)
creative_response = client.chat(parameters: params)
# Override for more deterministic output
params = rendered.to_openai_params(temperature: 0.1)
factual_response = client.chat(parameters: params)
Working with Different Model Types
# Chat models (most common)
chat_response = rendered.execute_with(chat_client)
# Completion models (older style)
# Extract just the content for completion APIs
completion_response = completion_client.complete(
  prompt: rendered.content,
  max_tokens: rendered.max_tokens,
  temperature: rendered.temperature
)
# Embedding models
embedding_client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])
embedding_response = embedding_client.embeddings(
  parameters: {
    model: "text-embedding-ada-002",
    input: rendered.content
  }
)
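The vector itself comes back under data[0].embedding in the OpenAI-style response:
vector = embedding_response.dig("data", 0, "embedding")
# => [0.0013, -0.0092, ...] an array of floats (1536 for text-embedding-ada-002)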
Handling Responses
Different providers return responses in different formats:
# OpenAI response handling
openai_response = rendered.execute_with(openai_client)
content = openai_response.dig("choices", 0, "message", "content")
usage = openai_response["usage"]
# Anthropic response handling
anthropic_response = rendered.execute_with(anthropic_client)
content = anthropic_response.dig("content", 0, "text")
usage = anthropic_response["usage"]
# RubyLLM provides a unified response interface
rubyllm_response = rendered.execute_with(rubyllm_client)
content = rubyllm_response.content
tokens = rubyllm_response.input_tokens + rubyllm_response.output_tokens
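If you switch providers behind a single code path, a small normalizer keeps the calling code provider-agnostic. A minimal sketch; the method name is hypothetical and the branches mirror the response shapes shown above:
# Pull the generated text out of whichever response object we received
def extract_content(response)
  if response.respond_to?(:content)                   # RubyLLM-style response object
    response.content
  elsif response.is_a?(Hash) && response["choices"]   # OpenAI-style hash
    response.dig("choices", 0, "message", "content")
  elsif response.is_a?(Hash) && response["content"]   # Anthropic-style hash
    response.dig("content", 0, "text")
  else
    response.to_s
  end
end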
Best Practices
- Use execute_with for simplicity: unless you need fine-grained control, execute_with handles most use cases automatically.

- Cache rendered prompts: if you call the same prompt repeatedly with the same variables, render it once and reuse the result:

  @support_prompt ||= PromptEngine.render("support-base", base_variables)
  response = @support_prompt.execute_with(client)

- Handle errors gracefully: rescue failures and fall back sensibly (a retry sketch follows this list):

  begin
    response = rendered.execute_with(client)
  rescue => e
    Rails.logger.error "LLM Error: #{e.message}"
    # Fallback behavior
  end

- Use appropriate models: override the model when needed:

  # Use a faster model for simple tasks
  params = rendered.to_openai_params(model: "gpt-3.5-turbo")

  # Use a more capable model for complex tasks
  params = rendered.to_openai_params(model: "gpt-4-turbo")

- Monitor token usage: track usage for cost management:

  response = rendered.execute_with(client)
  Rails.logger.info "Tokens used: #{response['usage']['total_tokens']}"
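For error handling, a small retry wrapper around execute_with covers most transient failures. A minimal sketch, assuming the errors your client gem raises are safe to retry; tighten the rescue list for production use:
MAX_ATTEMPTS = 3

def execute_with_retries(rendered, client)
  attempts = 0
  begin
    attempts += 1
    rendered.execute_with(client)
  rescue StandardError => e
    raise if attempts >= MAX_ATTEMPTS
    Rails.logger.warn "LLM attempt #{attempts} failed: #{e.message}"
    sleep(2**attempts) # simple exponential backoff: 2s, then 4s
    retry
  end
end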
Complete Example
Here's a full example using a rendered prompt in a Rails service:
class CustomerSupportService
  def initialize
    @client = OpenAI::Client.new(access_token: Rails.application.credentials.openai_api_key)
  end

  def handle_ticket(customer_name:, issue:, context: nil)
    # Render the prompt with variables
    rendered = PromptEngine.render("customer-support",
      {
        customer_name: customer_name,
        issue: issue,
        context: context || "No additional context"
      },
      options: {
        temperature: 0.3, # Lower temperature for consistent support responses
        model: "gpt-4"    # Use GPT-4 for better understanding
      }
    )

    # Log what we're sending
    Rails.logger.info "Support prompt for #{customer_name}: #{rendered.content.truncate(100)}"

    # Execute with additional options
    response = rendered.execute_with(@client,
      presence_penalty: 0.1,
      frequency_penalty: 0.1
    )

    # Extract and return the response
    ai_response = response.dig("choices", 0, "message", "content")

    # Store the interaction
    SupportInteraction.create!(
      customer_name: customer_name,
      issue: issue,
      prompt_version: rendered.version,
      ai_response: ai_response,
      tokens_used: response.dig("usage", "total_tokens")
    )

    ai_response
  rescue => e
    Rails.logger.error "Support AI Error: #{e.message}"
    "I apologize, but I'm having trouble processing your request. Please contact human support."
  end
end
# Usage
service = CustomerSupportService.new
response = service.handle_ticket(
  customer_name: "Alice Johnson",
  issue: "Cannot reset password",
  context: "Customer has tried resetting 3 times today"
)
This guide covers the current integration capabilities of PromptEngine's RenderedPrompt class, giving you multiple ways to work with your preferred LLM provider.