The beta version of Evalite v1 is now available! Install with pnpm add evalite@beta • View beta docs →

Running Programmatically

You can run Evalite programmatically using the Node API. This is useful when you want to integrate Evalite into your own scripts, CI/CD pipelines, or custom tooling.

Basic Usage

Import the runEvalite function from evalite/runner:

import { runEvalite } from "evalite/runner";

await runEvalite({
  mode: "run-once-and-exit",
});

That’s it! The path and cwd parameters are optional and default to running all evals in the current directory.

Run Modes

Run Once and Exit

This mode runs all evals once and exits. It’s ideal for CI/CD pipelines:

await runEvalite({
  mode: "run-once-and-exit",
});

Watch Mode

This mode watches for file changes and re-runs evals automatically. It also starts the Evalite UI server:

await runEvalite({
  mode: "watch-for-file-changes",
});

[!IMPORTANT]

I strongly recommend implementing a caching layer in your LLM calls when using watch mode. This will keep your evals running fast and avoid burning through your API credits.

Options

`path`

Optional path filter to run specific eval files. Defaults to undefined (runs all evals):

await runEvalite({
  path: "my-eval.eval.ts",
  mode: "run-once-and-exit",
});

`cwd`

The working directory to run evals from. Defaults to process.cwd():

await runEvalite({
  cwd: "/path/to/my/project",
  mode: "run-once-and-exit",
});

`scoreThreshold`

Set a minimum score threshold (0-100). If the average score falls below this threshold, the process will exit with a non-zero exit code:

await runEvalite({
  mode: "run-once-and-exit",
  scoreThreshold: 80, // Fail if score is below 80
});

This is particularly useful for CI/CD pipelines where you want to fail the build if evals don’t meet a quality threshold.

`outputPath`

Export the results to a JSON file after the run completes:

await runEvalite({
  mode: "run-once-and-exit",
  outputPath: "./results.json",
});

The exported JSON file contains the complete run data including all evals, results, scores, and traces.

Complete Example

Here’s a complete example that combines multiple options:

import { runEvalite } from "evalite/runner";

async function runEvals() {
  try {
    await runEvalite({
      mode: "run-once-and-exit",
      scoreThreshold: 75, // Fail if average score < 75
      outputPath: "./evalite-results.json", // Export results
    });
    console.log("All evals passed!");
  } catch (error) {
    console.error("Evals failed:", error);
    process.exit(1);
  }
}

runEvals();