Integration Tests
This document provides information about the integration testing framework used in this project.
Overview
The integration tests are designed to validate the end-to-end functionality of the Gemini CLI. They execute the built binary in a controlled environment and verify that it behaves as expected when interacting with the file system.
These tests are located in the integration-tests directory and are run using a custom test runner.
Building the tests
Before running any integration tests, build the release bundle that you want to test:
npm run bundle
You must re-run this command after making any changes to the CLI source code, but not after changing the tests themselves.
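For example, a typical local iteration after changing CLI source code is to rebuild the bundle and then rerun the end-to-end shortcut (both commands are described in this document):
npm run bundle
npm run test:e2e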
Running the tests
The integration tests are not run as part of the default npm run test command. They must be run explicitly using the npm run test:integration:all script.
The integration tests can also be run using the following shortcut:
npm run test:e2e
Running a specific set of tests
To run a subset of test files, you can use
npm run <integration test command> <file_name1> ...
where <integration test command> is either test:e2e or test:integration* and <file_name> is any of the .test.js files in the integration-tests/ directory. For example, the following command runs list_directory.test.js and write_file.test.js:
npm run test:e2e list_directory write_file
Running a single test by name
To run a single test by its name, use the --test-name-pattern flag:
npm run test:e2e -- --test-name-pattern "reads a file"
Regenerating model responses
Some integration tests use faked-out model responses, which may need to be regenerated from time to time as the implementations change.
To regenerate these golden files, set the REGENERATE_MODEL_GOLDENS environment variable to "true" when running the tests, for example:
REGENERATE_MODEL_GOLDENS="true" npm run test:e2e
WARNING: If running locally, review the updated responses for any information about yourself or your system that Gemini may have included in them.
WARNING: Make sure you run await rig.cleanup() at the end of your test, otherwise the golden files will not be updated.
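For reference, the shape of such a test is sketched below. This is a minimal illustration only: the TestRig helper, its module path, and its setup/createFile/run methods are assumptions made for this example and may not match the project's actual helper API. The point it illustrates is that rig.cleanup() must be awaited at the end of the test.

```js
// Minimal sketch of an integration test (assumed TestRig API, illustrative only).
import { test } from 'node:test';
import { strict as assert } from 'node:assert';
import { TestRig } from './test-helper.js'; // hypothetical helper module

test('reads a file', async () => {
  const rig = new TestRig();
  await rig.setup('reads a file');            // assumed: creates the per-test temp directory
  rig.createFile('hello.txt', 'hello world'); // assumed: seeds the test's file system

  const output = await rig.run('read the file hello.txt'); // assumed: runs the built CLI
  assert.ok(output.includes('hello world'));

  await rig.cleanup(); // required: without this, regenerated golden files are not written
});
```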
Deflaking a test
Before adding a new integration test, run it at least 5 times with the deflake script or workflow to make sure that it is not flaky.
Deflake script
npm run deflake -- --runs=5 --command="npm run test:e2e -- -- --test-name-pattern '<your-new-test-name>'"
Deflake Workflow
Alternatively, trigger the deflake workflow against your branch with the GitHub CLI:
gh workflow run deflake.yml --ref <your-branch> -f test_name_pattern="<your-test-name-pattern>"
Running all tests
To run the entire suite of integration tests, use the following command:
npm run test:integration:all
Sandbox matrix
The all command runs the tests with no sandboxing, with Docker, and with Podman.
Each individual type can be run using the following commands:
npm run test:integration:sandbox:none
npm run test:integration:sandbox:docker
npm run test:integration:sandbox:podman
Diagnostics
The integration test runner provides several options for diagnostics to help track down test failures.
Keeping test output
You can preserve the temporary files created during a test run for inspection. This is useful for debugging issues with file system operations.
To keep the test output, set the KEEP_OUTPUT environment variable to true.
KEEP_OUTPUT=true npm run test:integration:sandbox:none
When output is kept, the test runner will print the path to the unique directory for the test run.
Verbose output
For more detailed debugging, set the VERBOSE environment variable to true.
VERBOSE=true npm run test:integration:sandbox:none
When using VERBOSE=true and KEEP_OUTPUT=true in the same command, the output is streamed to the console and also saved to a log file within the test's temporary directory.
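For example, both variables can be set in a single invocation to stream the logs and keep them on disk at the same time:
VERBOSE=true KEEP_OUTPUT=true npm run test:integration:sandbox:none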
The verbose output is formatted to clearly identify the source of the logs:
--- TEST: <log dir>:<test-name> ---
... output from the gemini command ...
--- END TEST: <log dir>:<test-name> ---
Linting and formatting
To ensure code quality and consistency, the integration test files are linted as part of the main build process. You can also manually run the linter and auto-fixer.
Running the linter
To check for linting errors, run the following command:
npm run lint
You can use the :fix suffix in the command to automatically fix any fixable linting errors:
npm run lint:fix
Directory structure
The integration tests create a unique directory for each test run inside the .integration-tests directory. Within this directory, a subdirectory is created for each test file, and within that, a subdirectory is created for each individual test case.
This structure makes it easy to locate the artifacts for a specific test run, file, or case.
.integration-tests/
└── <run-id>/
    └── <test-file-name>.test.js/
        └── <test-case-name>/
            ├── output.log
            └── ...other test artifacts...
Continuous integration
To ensure the integration tests are always run, a GitHub Actions workflow is defined in .github/workflows/e2e.yml. This workflow automatically runs the integration tests for pull requests against the main branch, or when a pull request is added to a merge queue.
The workflow runs the tests in each sandboxing environment to ensure Gemini CLI is tested in all of them:
sandbox:none: Runs the tests without any sandboxing.
sandbox:docker: Runs the tests in a Docker container.
sandbox:podman: Runs the tests in a Podman container.
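If you want to check recent runs of this workflow from your terminal, the GitHub CLI (already used above for the deflake workflow) can list them; the command below assumes gh is installed and authenticated:
gh run list --workflow=e2e.yml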