CLI
Watch Mode
You can run Evalite in watch mode by running evalite watch:
evalite watch
This watches for changes to your .eval.ts files and re-runs the evals when they change.
[!IMPORTANT]
I strongly recommend implementing a caching layer in your LLM calls when using watch mode. This will keep your evals running fast and avoid burning through your API credits.
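As a rough illustration of the caching recommendation above, here is a minimal sketch of a disk-backed cache wrapped around an LLM call. It is not part of Evalite: the LlmCall type, the withDiskCache helper, and the cache directory are all hypothetical names chosen for this example, and it assumes the response depends only on the prompt string. Results are written to disk rather than held in memory so they can survive between re-runs.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical signature for whatever function actually calls your LLM provider.
type LlmCall = (prompt: string) => Promise<string>;

// Wrap an LLM call so that a prompt seen before is answered from disk
// instead of triggering another API request.
function withDiskCache(call: LlmCall, cacheDir: string): LlmCall {
  mkdirSync(cacheDir, { recursive: true });

  return async (prompt) => {
    // Hash the prompt to get a stable, filesystem-safe cache key.
    const key = createHash("sha256").update(prompt).digest("hex");
    const file = join(cacheDir, `${key}.json`);

    if (existsSync(file)) {
      // Cache hit: return the stored response without calling the LLM.
      return JSON.parse(readFileSync(file, "utf8")).response as string;
    }

    // Cache miss: call the LLM once and store the result for later runs.
    const response = await call(prompt);
    writeFileSync(file, JSON.stringify({ prompt, response }));
    return response;
  };
}
```

If the model, temperature, or system prompt also affect the output, they would need to be included in the hashed key; this sketch leaves that out for brevity.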
Hiding the Table Output
When debugging with console.log, the detailed table output can make it harder to see your logs. You can hide it with --hideTable:
evalite watch --hideTable
This keeps the score summary but removes the detailed results table from the CLI output.
Serve Mode
You can run evals once and serve the UI without re-running on file changes:
evalite serve
This runs your evals once and keeps the UI server running at http://localhost:3006. Unlike watch mode, tests won’t re-run when files change.
Since evals can take a while to run, this can be a useful alternative to watch mode.
To re-run evals after making changes, restart evalite serve.
Running Specific Files
You can run specific files by passing them as arguments:
evalite my-eval.eval.ts
This also works for watch and serve modes:
evalite watch my-eval.eval.ts
evalite serve my-eval.eval.ts
Threshold
You can require your evals to reach a minimum score by passing --threshold:
evalite --threshold=50 # Score must be greater than or equal to 50
evalite watch --threshold=70 # Also works in watch mode
This is useful for running on CI. If the score does not meet the threshold, the process fails.
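To make the pass/fail rule concrete, here is a small sketch of the comparison described above. It is purely illustrative and not Evalite's implementation: meetsThreshold and exitCodeFor are hypothetical helpers, and the use of exit code 1 for failure is an assumption (the only thing stated here is that the process fails).

```typescript
// Illustrative only: mirrors the rule "score must be greater than or
// equal to the threshold" from the command comments above.
function meetsThreshold(score: number, threshold: number): boolean {
  return score >= threshold;
}

// Assumed mapping to a process exit code: 0 on success, 1 on failure.
function exitCodeFor(score: number, threshold: number): number {
  return meetsThreshold(score, threshold) ? 0 : 1;
}
```

Note that a score exactly equal to the threshold passes, so a score of 50 satisfies --threshold=50.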
Export Command
Export eval results as a static HTML bundle:
evalite export
This exports the latest run to ./evalite-export by default.
See the CI/CD guide for full documentation on exporting and viewing static UI bundles.