Track LLM Costs
This guide walks you through setting up cost tracking for Large Language Model (LLM) API calls in your Python application. By the end you will see per-request token counts and costs appear in Beakpoint Insights automatically.
Prerequisites
Before you begin, ensure you have:
- A Beakpoint Insights account
- A Beakpoint Insights API key
- A Python application that calls the OpenAI or Anthropic API
Install Dependencies
Install the OpenTelemetry SDK, the OTLP exporter, and the GenAI instrumentation package for your provider. The command below installs the OpenAI instrumentation; Anthropic requires its own instrumentation package instead.
pip install opentelemetry-sdk opentelemetry-exporter-otlp-proto-http opentelemetry-instrumentation-openai-v2
Configure the Exporter
Set your Beakpoint API key and OTLP endpoint as environment variables before running your application:
export OTEL_EXPORTER_OTLP_ENDPOINT="https://ingest.beakpointinsights.com"
export OTEL_EXPORTER_OTLP_HEADERS="x-api-key=YOUR_API_KEY"
export OTEL_SERVICE_NAME="my-llm-service"
Replace YOUR_API_KEY with your Beakpoint Insights API key.
Instrument Your Application
Add the following setup code once at application startup, before you create any API clients:
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor
from openai import OpenAI
# 1. Configure the tracer provider
provider = TracerProvider()
exporter = OTLPSpanExporter() # reads OTEL_EXPORTER_OTLP_* env vars
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)
# 2. Instrument the OpenAI client
OpenAIInstrumentor().instrument()
# 3. Use the client as normal — all calls are now traced
client = OpenAI()
response = client.chat.completions.create(
model="gpt-4.1-mini",
messages=[{"role": "user", "content": "Summarise the key points of the Beakpoint docs."}],
)
print(response.choices[0].message.content)
The instrumentation automatically attaches the following attributes to each span:
- gen_ai.system: identifies the provider (openai)
- gen_ai.request.model: the model you requested
- gen_ai.usage.input_tokens / gen_ai.usage.output_tokens: token counts used for cost calculation
- gen_ai.usage.input_tokens.cached: cached input tokens (when prompt caching is active)
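To make the cost calculation concrete, the sketch below estimates a per-request cost from these span attributes. The prices in the table are hypothetical placeholders, not real OpenAI or Beakpoint pricing; Beakpoint performs this calculation for you server-side:

```python
# Hypothetical USD prices per million tokens, for illustration only.
PRICES = {"gpt-4.1-mini": {"input": 0.40, "output": 1.60}}

def estimate_cost(attrs: dict) -> float:
    """Estimate the USD cost of one request from its GenAI span attributes."""
    prices = PRICES[attrs["gen_ai.request.model"]]
    input_tokens = attrs.get("gen_ai.usage.input_tokens", 0)
    output_tokens = attrs.get("gen_ai.usage.output_tokens", 0)
    return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000

attrs = {
    "gen_ai.system": "openai",
    "gen_ai.request.model": "gpt-4.1-mini",
    "gen_ai.usage.input_tokens": 1200,
    "gen_ai.usage.output_tokens": 350,
}
print(f"{estimate_cost(attrs):.6f}")  # → 0.001040
```

Input and output tokens are priced separately because providers charge different rates for each, which is why the instrumentation records them as distinct attributes.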
Verify Traces in Beakpoint
- Run your instrumented application and make at least one LLM API call.
- Log in to Beakpoint Insights.
- Navigate to Traces and search for your service name (the value you set in OTEL_SERVICE_NAME).
- Open a trace and confirm you can see a span with gen_ai.system = openai and non-zero token counts.
- Navigate to Costs to see the calculated spend broken down by model and request.
If no traces appear within a minute of running your application, check that your OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_HEADERS environment variables are set correctly and that outbound HTTPS traffic to the Beakpoint ingest endpoint is not blocked by a firewall.
Next Steps
- OpenAI cost tags reference — full list of supported attributes and models
- Anthropic Claude cost tags reference — Claude-specific attributes including prompt cache tokens
- Configure the exporter — advanced exporter configuration options