CI/CD
Evalite runs in CI/CD pipelines, so you can validate LLM-powered features as part of your automated testing workflow.
Static UI Export
Export eval results as a static HTML bundle for viewing in CI artifacts without running a live server.
Basic Usage
evalite export
Exports latest full run to ./evalite-export directory.
Options
Custom output directory:
evalite export --output=./my-export
Export specific run:
evalite export --run-id=123
Custom base path for non-root hosting:
evalite export --basePath=/evals-123
Use when hosting at subpaths (e.g., S3/CloudFront with path-based URLs). The base path must start with /.
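When the base path is derived from a CI variable, it can help to check the leading-slash rule before calling the CLI. The following is a minimal sketch: `normalizeBasePath` is a hypothetical helper, not part of Evalite, and stripping trailing slashes is an assumption made for tidiness rather than a documented requirement.

```typescript
// Hypothetical helper (not part of Evalite): builds a value that satisfies
// the documented rule that --basePath must start with "/".
function normalizeBasePath(raw: string): string {
  const trimmed = raw.trim();
  if (trimmed === "") {
    throw new Error("Base path must not be empty");
  }
  // Prepend the leading slash if the caller omitted it.
  const withSlash = trimmed.startsWith("/") ? trimmed : `/${trimmed}`;
  // Assumption: drop trailing slashes so the path matches the upload prefix.
  const withoutTrailing = withSlash.replace(/\/+$/, "");
  return withoutTrailing === "" ? "/" : withoutTrailing;
}

console.log(normalizeBasePath("evals-123")); // prints /evals-123
```

The result can then be passed to `evalite export --basePath=...` from a wrapper script.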
Export Structure
Generated bundle contains:
- index.html - Standalone UI (works without server)
- data/*.json - Pre-computed API responses
- files/* - Images, audio, etc. from eval results
- assets/* - UI JavaScript/CSS
Viewing Exports
Local preview:
npx serve -s ./evalite-export
Static hosting: Upload to artifact.ci, S3, GitHub Pages, etc.
CI Integration Example
GitHub Actions workflow exporting UI to artifacts:
name: Run Evals
on: [push, pull_request]
jobs:
  evals:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "22"
      - run: npm install
      - name: Run evaluations
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: npx evalite --threshold=70
      - name: Export UI
        run: npx evalite export --output=./ui-export
      # upload-artifact@v3 is deprecated and no longer runs on github.com
      - name: Upload static UI
        uses: actions/upload-artifact@v4
        with:
          name: evalite-ui
          path: ui-export
View results by downloading the artifact and running npx serve -s ./ui-export.
Path-Based Deployment Example
Deploy to S3/CloudFront with unique paths per run:
- name: Export UI with base path
  run: |
    RUN_PATH="/evals-${{ github.run_id }}"
    npx evalite export --basePath="$RUN_PATH" --output=./ui-export
- name: Upload to S3
  run: |
    aws s3 sync ./ui-export s3://my-bucket/evals-${{ github.run_id }}/
    echo "View at: https://my-domain.com/evals-${{ github.run_id }}"
To test locally with base path:
# Export with base path
evalite export --basePath=/evals-123
# Create matching directory structure
mkdir -p /tmp/test/evals-123
cp -r evalite-export/* /tmp/test/evals-123/
# Serve and visit http://localhost:3000/evals-123
npx serve /tmp/test
Running on CI
Run Evalite in run-once mode (default):
evalite
Executes all evals and exits.
Score Thresholds
Fail CI builds if scores fall below threshold:
evalite --threshold=70
Exits with code 1 if average score < 70.
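The threshold rule can be pictured as a small function. This is an illustration of the documented behaviour only, not Evalite's implementation. It assumes scores are fractions between 0 and 1 while the threshold is a percentage, and `exitCodeForScores` is a hypothetical name.

```typescript
// Illustration only: mirrors the documented behaviour of --threshold.
// Assumption: scores are fractions in [0, 1]; the threshold is a percentage.
function exitCodeForScores(scores: number[], thresholdPercent: number): 0 | 1 {
  // Assumption: with no scores there is nothing to fail on.
  if (scores.length === 0) return 0;
  const average = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  // Fail (exit code 1) when the average falls below the threshold.
  return average * 100 < thresholdPercent ? 1 : 0;
}

console.log(exitCodeForScores([0.5, 0.7], 70)); // prints 1 (average 60 is below 70)
console.log(exitCodeForScores([0.9, 0.8], 70)); // prints 0 (average 85 meets 70)
```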
JSON Export
For programmatic analysis, export raw JSON:
evalite --outputPath=./results.json
Export Format
Typed hierarchical structure:
import type { Evalite } from "evalite";
type Output = Evalite.Exported.Output;
Contains:
- run: Metadata (id, runType, createdAt)
- evals: Array of evaluations with:
  - Basic info (name, filepath, duration, status, averageScore)
  - results: Individual test results with:
    - Test data (input, output, expected)
    - scores: Scorer results
    - traces: LLM call traces
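As a rough sketch of how a script might consume this structure, the example below lists evals whose average score falls under a cutoff. The interfaces are a simplified local mirror of the fields listed above, not the real `Evalite.Exported.Output` type. The field types (numeric `id`, 0 to 1 `averageScore`) are assumptions, and the sample data is invented for illustration.

```typescript
// Assumption: simplified local mirror of the documented fields.
// In a real project, prefer the Evalite.Exported.Output type from "evalite".
interface ExportedEval {
  name: string;
  filepath: string;
  duration: number;
  status: string;
  averageScore: number; // assumed to be a fraction in [0, 1]
}

interface ExportedOutput {
  run: { id: number; runType: string; createdAt: string };
  evals: ExportedEval[];
}

// Hypothetical helper: describes every eval scoring below minScore.
function findLowScoringEvals(output: ExportedOutput, minScore: number): string[] {
  return output.evals
    .filter((e) => e.averageScore < minScore)
    .map((e) => `${e.name} (${e.filepath}): ${e.averageScore.toFixed(2)}`);
}

// Invented sample standing in for the parsed contents of ./results.json.
const sample: ExportedOutput = JSON.parse(`{
  "run": { "id": 123, "runType": "full", "createdAt": "2025-01-01T00:00:00Z" },
  "evals": [
    { "name": "Summarize", "filepath": "src/summarize.eval.ts",
      "duration": 1200, "status": "success", "averageScore": 0.45 },
    { "name": "Classify", "filepath": "src/classify.eval.ts",
      "duration": 800, "status": "success", "averageScore": 0.9 }
  ]
}`);

console.log(findLowScoringEvals(sample, 0.7));
```

In practice the string passed to `JSON.parse` would come from the file written by `--outputPath`.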
Use Cases
- Analytics: Import into dashboards for performance tracking
- Archiving: Store historical results for comparison
- Custom tooling: Build scripts around eval data
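For the archiving use case, comparing two stored exports might look like the sketch below. It assumes each export has already been reduced to name and averageScore pairs (two of the per-eval fields in the export format). `findRegressions` and the sample data are hypothetical.

```typescript
// Assumption: each archived export has been reduced to these two fields.
interface EvalSummary {
  name: string;
  averageScore: number;
}

interface ScoreChange {
  name: string;
  before: number;
  after: number;
}

// Hypothetical helper: returns evals whose average score dropped between runs.
function findRegressions(previous: EvalSummary[], current: EvalSummary[]): ScoreChange[] {
  const previousScores = new Map<string, number>();
  for (const e of previous) previousScores.set(e.name, e.averageScore);

  const regressions: ScoreChange[] = [];
  for (const e of current) {
    const before = previousScores.get(e.name);
    // Skip evals that did not exist in the previous run.
    if (before === undefined) continue;
    if (e.averageScore < before) {
      regressions.push({ name: e.name, before, after: e.averageScore });
    }
  }
  return regressions;
}

// Invented data standing in for two archived runs.
const previousRun: EvalSummary[] = [
  { name: "Summarize", averageScore: 0.8 },
  { name: "Classify", averageScore: 0.9 },
];
const currentRun: EvalSummary[] = [
  { name: "Summarize", averageScore: 0.6 },
  { name: "Classify", averageScore: 0.95 },
  { name: "Translate", averageScore: 0.5 },
];

console.log(findRegressions(previousRun, currentRun));
```

Matching evals by name keeps the comparison stable when evals are added or removed between runs. Evals renamed between runs would be treated as new and skipped.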