Configuration - OpenGround

OpenGround uses a JSON configuration file to manage settings for database paths, embedding models, query parameters, and more.

Configuration File Location

OpenGround stores configuration at:

Linux/macOS:

~/.config/openground/config.json

Or if XDG_CONFIG_HOME is set:

$XDG_CONFIG_HOME/openground/config.json

Windows:

%LOCALAPPDATA%\openground\config.json

Typically:

C:\Users\YourName\AppData\Local\openground\config.json

The config file is automatically created with default values when you first run any openground command.
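
The lookup rules above can be sketched as a small function. This is an illustration of the documented behavior, not OpenGround's actual implementation; `resolve_config_path` and its parameters are hypothetical names, and the sketch assumes LOCALAPPDATA is always set on Windows.

```python
from pathlib import PurePosixPath, PureWindowsPath


def resolve_config_path(env, platform, home):
    """Return the expected config.json location (hypothetical helper).

    env:      mapping of environment variables, e.g. os.environ
    platform: "windows", or any other value for Linux/macOS
    home:     the user's home directory as a string
    """
    if platform == "windows":
        # %LOCALAPPDATA%\openground\config.json
        base = PureWindowsPath(env["LOCALAPPDATA"])
        return str(base / "openground" / "config.json")
    # $XDG_CONFIG_HOME takes precedence when set; otherwise use ~/.config
    xdg = env.get("XDG_CONFIG_HOME")
    base = PurePosixPath(xdg) if xdg else PurePosixPath(home) / ".config"
    return str(base / "openground" / "config.json")


print(resolve_config_path({}, "linux", "/home/user"))
print(resolve_config_path({"XDG_CONFIG_HOME": "/custom/config"}, "linux", "/home/user"))
```

In practice, `openground config path` remains the authoritative way to find the file.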

View Configuration

Display your current configuration:

openground config show

View Defaults Only

See hardcoded default values (ignoring your custom settings):

openground config show --defaults

Get Config File Path

Print the path to the config file:

openground config path

Default Configuration

OpenGround uses these defaults:

{
  "db_path": "~/.local/share/openground/lancedb",
  "table_name": "documents",
  "raw_data_dir": "~/.local/share/openground/raw_data",
  "extraction": {
    "concurrency_limit": 50
  },
  "embeddings": {
    "batch_size": 32,
    "chunk_size": 800,
    "chunk_overlap": 200,
    "embedding_model": "BAAI/bge-small-en-v1.5",
    "embedding_dimensions": 384,
    "embedding_backend": "fastembed"
  },
  "query": {
    "top_k": 5
  },
  "sources": {
    "auto_add_local": true
  }
}

Set Configuration Values

Modify configuration settings using the config set command:

openground config set <key> <value>

Use dot notation for nested keys:

openground config set embeddings.chunk_size 1000
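
To make the dot notation concrete, here is a minimal sketch of how a dotted key can map onto a nested JSON object. The helpers `set_by_path` and `get_by_path` are hypothetical and only illustrate the idea; OpenGround's own handling may differ (for example, in how it treats unknown keys).

```python
def set_by_path(config, dotted_key, value):
    """Set a nested value, e.g. "embeddings.chunk_size" -> config["embeddings"]["chunk_size"]."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        # Create intermediate objects on demand (an assumption of this sketch)
        node = node.setdefault(name, {})
    node[leaf] = value
    return config


def get_by_path(config, dotted_key):
    """Read a nested value using the same dotted key; raises KeyError if absent."""
    node = config
    for name in dotted_key.split("."):
        node = node[name]
    return node


config = {"embeddings": {"chunk_size": 800, "chunk_overlap": 200}}
set_by_path(config, "embeddings.chunk_size", 1000)
print(get_by_path(config, "embeddings.chunk_size"))
```

Top-level keys such as db_path are simply the one-segment case of the same rule.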

Configuration Examples

# Change database path
openground config set db_path "/data/openground/db"

# Change chunk size
openground config set embeddings.chunk_size 1200

# Change chunk overlap
openground config set embeddings.chunk_overlap 300

# Change default top_k for queries
openground config set query.top_k 10

# Change embedding backend
openground config set embeddings.embedding_backend "sentence-transformers"

# Change embedding model
openground config set embeddings.embedding_model "sentence-transformers/all-MiniLM-L6-v2"

# Disable auto-add to sources.json
openground config set sources.auto_add_local false

# Change concurrency limit
openground config set extraction.concurrency_limit 100

Values are automatically parsed as JSON. For booleans and numbers, just type the value. For strings with spaces, use quotes.
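
The following sketch shows one way JSON-based value parsing can behave. The function name `parse_cli_value` is hypothetical, and the fallback to a plain string for input that is not valid JSON is an assumption inferred from the examples above (a bare path such as /data/openground/db is not valid JSON, yet it is accepted).

```python
import json


def parse_cli_value(raw):
    """Interpret a command-line value as JSON, falling back to a plain string."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Assumed fallback: anything that is not valid JSON is kept as text
        return raw


for raw in ["1200", "false", "/data/openground/db", "sentence-transformers"]:
    value = parse_cli_value(raw)
    print(repr(raw), "->", type(value).__name__)
```

Under this sketch, JSON booleans must be lowercase: `false` becomes a boolean, while `False` would be stored as a string.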

Get Configuration Values

Retrieve a specific configuration value:

openground config get <key>

Get Examples

# Get chunk size
openground config get embeddings.chunk_size
# Output: 800

# Get embedding model
openground config get embeddings.embedding_model
# Output: BAAI/bge-small-en-v1.5

# Get database path
openground config get db_path
# Output: ~/.local/share/openground/lancedb

# Get top_k
openground config get query.top_k
# Output: 5

Reset Configuration

Delete your config file and restore defaults:

openground config reset

Skip the confirmation prompt:

openground config reset --yes

This permanently deletes your custom configuration. You cannot undo this action.

Configuration Settings Reference

Database Settings

db_path

string

default:"~/.local/share/openground/lancedb"

Path to LanceDB database directory. Stores embeddings and vector indexes.

table_name

string

default:"documents"

Name of the LanceDB table for storing document chunks.

raw_data_dir

string

default:"~/.local/share/openground/raw_data"

Directory for storing extracted JSON files before embedding.

Extraction Settings

extraction.concurrency_limit

integer

default:"50"

Maximum number of concurrent HTTP requests during sitemap extraction. Higher values extract faster but use more resources.

Embedding Settings

embeddings.batch_size

integer

default:"32"

Number of text chunks to embed in each batch. Larger batches are faster but use more memory.

embeddings.chunk_size

integer

default:"800"

Maximum number of characters per chunk when splitting documents. Affects embedding granularity.

embeddings.chunk_overlap

integer

default:"200"

Number of overlapping characters between consecutive chunks. Helps maintain context across chunk boundaries.

embeddings.embedding_model

string

default:"BAAI/bge-small-en-v1.5"

Name of the embedding model from Hugging Face. The model must be one that the configured embedding backend can load.

embeddings.embedding_dimensions

integer

default:"384"

Dimensionality of embedding vectors. Must match the model’s output dimensions.

embeddings.embedding_backend

string

default:"fastembed"

Backend for generating embeddings. Options: "fastembed" or "sentence-transformers"

Query Settings

query.top_k

integer

default:"5"

Default number of search results to return. Can be overridden with the --top-k flag.

Source Settings

sources.auto_add_local

boolean

default:"true"

Automatically save library sources to ~/.openground/sources.json when adding libraries with the --source flag.

sources.file_path

string

Custom path to sources.json file. If not set, uses standard search order (project → user → package).

Embedding Backend Options

OpenGround supports two embedding backends:

FastEmbed (Default)

openground config set embeddings.embedding_backend "fastembed"

Advantages:

✅ Faster inference
✅ Lower memory usage
✅ Optimized for CPU
✅ No dependency on PyTorch

Models: Uses ONNX-optimized models

Sentence Transformers

openground config set embeddings.embedding_backend "sentence-transformers"

Advantages:

✅ More model options
✅ Better GPU support
✅ Active development
✅ Widely used

Models: Uses standard Hugging Face models

Changing the embedding backend requires re-embedding all libraries. Delete existing libraries and re-add them after changing backends.

How do I change embedding models?

Set the new model:

openground config set embeddings.embedding_model "sentence-transformers/all-MiniLM-L6-v2"
openground config set embeddings.embedding_dimensions 384

Delete existing embeddings:

openground nuke embeddings --yes

Re-embed your libraries:

openground add mylib

Note: Make sure embedding_dimensions matches your model’s output size.
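
A quick sanity check can catch a mismatch before you re-embed everything. The sketch below is an assumption about how you might verify it yourself: `check_embedding_dimensions` is a hypothetical helper, and `vector` stands for a single embedding produced by your chosen model for any probe sentence.

```python
def check_embedding_dimensions(config, vector):
    """Raise ValueError if the vector length differs from the configured dimensions."""
    expected = config["embeddings"]["embedding_dimensions"]
    actual = len(vector)
    if actual != expected:
        raise ValueError(
            f"embedding_dimensions is {expected}, but the model produced {actual} values"
        )
    return actual


config = {"embeddings": {"embedding_dimensions": 384}}
probe_vector = [0.0] * 384  # placeholder for a real embedding of a probe sentence
check_embedding_dimensions(config, probe_vector)
```

Both BAAI/bge-small-en-v1.5 and sentence-transformers/all-MiniLM-L6-v2 produce 384-dimensional vectors, which is why the example above keeps the value at 384.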

Chunking Configuration

Chunking parameters affect how documents are split before embedding:

# Larger chunks (more context, fewer chunks)
openground config set embeddings.chunk_size 1200
openground config set embeddings.chunk_overlap 300

# Smaller chunks (more granular, more chunks)
openground config set embeddings.chunk_size 600
openground config set embeddings.chunk_overlap 150

How do chunk_size and chunk_overlap affect search quality?

Larger chunks (800-1200 chars):

Better context preservation
Fewer total chunks (faster indexing)
May include irrelevant content
Good for conceptual queries

Smaller chunks (400-600 chars):

More precise results
Better for specific queries
More chunks (slower indexing)
May lose broader context

Overlap (150-300 chars):

Prevents information loss at boundaries
Higher overlap = more context retention
Higher overlap = more chunks

Recommendation: Start with defaults (800/200) and adjust based on your documentation structure.
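
To see how the two parameters interact, consider a simple character-based sliding window. This is a sketch for building intuition only: `chunk_text` is a hypothetical function, and OpenGround's real splitter may break on different boundaries (for example, headings or sentences).

```python
def chunk_text(text, chunk_size=800, chunk_overlap=200):
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be between 0 and chunk_size - 1")
    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reached the end of the text
        start += step
    return chunks


document = "x" * 2000
for size, overlap in [(1200, 300), (800, 200), (600, 150)]:
    print(size, overlap, len(chunk_text(document, size, overlap)))
```

Each chunk advances by chunk_size minus chunk_overlap characters, so shrinking the chunk size or raising the overlap both increase the number of chunks that must be embedded and stored.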

Source File Configuration

OpenGround searches for library source configurations in this order:

1. Custom path (if provided via --sources-file or sources.file_path)
2. Project-local: ./.openground/sources.json
3. User home: ~/.openground/sources.json
4. Package-level: openground/extract/sources.json
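
The search order can be sketched as a first-match lookup. The function `find_sources_file` and its parameters are hypothetical, and the sketch assumes that the first existing file wins and that a missing custom path falls through to the next location; OpenGround may instead merge files or report an error.

```python
from pathlib import Path


def find_sources_file(custom_path, project_dir, home_dir, package_dir):
    """Return the first sources.json that exists, following the documented order."""
    candidates = []
    if custom_path:
        candidates.append(Path(custom_path))  # --sources-file or sources.file_path
    candidates += [
        Path(project_dir) / ".openground" / "sources.json",  # project-local
        Path(home_dir) / ".openground" / "sources.json",     # user home
        Path(package_dir) / "extract" / "sources.json",      # bundled with the package
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


found = find_sources_file(None, Path.cwd(), Path.home(), Path("openground"))
print(found)
```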

Disable Auto-Add to Sources

By default, when you add a library with --source, its source definition is saved to ~/.openground/sources.json. To turn this behavior off:

# Disable automatic saving
openground config set sources.auto_add_local false

Custom Sources File

# Use a custom sources.json location
openground config set sources.file_path "/path/to/my/sources.json"

# Or pass directly to add command
openground add mylib --sources-file "/path/to/sources.json"

Environment Variables

OpenGround respects the XDG Base Directory specification:

# Override config directory
export XDG_CONFIG_HOME="/custom/config"
# Config will be at: /custom/config/openground/config.json

# Override data directory
export XDG_DATA_HOME="/custom/data"
# Data will be at: /custom/data/openground/

Configuration in Python

Access configuration from Python code:

from openground.config import (
    get_effective_config,
    get_default_config,
    load_config,
    save_config,
    get_config_path,
    clear_config_cache
)

# Get merged config (user + defaults)
config = get_effective_config()
print(config["embeddings"]["chunk_size"])  # 800

# Get only defaults
defaults = get_default_config()
print(defaults["query"]["top_k"])  # 5

# Load user config only (no defaults)
user_config = load_config()
print(user_config)  # Only user overrides

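# Modify and save config
# (load_config() returns only user overrides, so the "query" section may not exist yet)
user_config.setdefault("query", {})["top_k"] = 10
save_config(user_config)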

# Clear cache after modifications
clear_config_cache()

# Get config file path
path = get_config_path()
print(path)  # /home/user/.config/openground/config.json
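
The "user + defaults" merge mentioned above can be pictured as a recursive dictionary merge in which user values win. The function below is a sketch under that assumption, not the code behind `get_effective_config`; the name `merge_config` is hypothetical.

```python
import copy


def merge_config(defaults, overrides):
    """Return a new dict where values in overrides replace those in defaults, section by section."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections instead of replacing them wholesale
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


defaults = {"embeddings": {"chunk_size": 800, "chunk_overlap": 200}, "query": {"top_k": 5}}
user_overrides = {"embeddings": {"chunk_size": 1000}}
print(merge_config(defaults, user_overrides))
```

Merging per section is what lets a user file that sets only embeddings.chunk_size still pick up the default chunk_overlap.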

Advanced Configuration Examples

High-Performance Setup

# Optimize for speed
openground config set extraction.concurrency_limit 100
openground config set embeddings.batch_size 64
openground config set embeddings.embedding_backend "fastembed"

High-Quality Setup

# Optimize for quality
openground config set embeddings.chunk_size 1200
openground config set embeddings.chunk_overlap 300
openground config set query.top_k 10

Custom Storage Setup

# Use custom paths
openground config set db_path "/mnt/ssd/openground/db"
openground config set raw_data_dir "/mnt/storage/openground/raw"

Large-Scale Setup

# For indexing many libraries
openground config set extraction.concurrency_limit 200
openground config set embeddings.batch_size 128

High concurrency and batch sizes require more memory and network bandwidth. Monitor resource usage when increasing these values.
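
A concurrency limit of this kind is commonly enforced with a semaphore. The sketch below shows the general technique with Python's asyncio; it is not taken from OpenGround's source, and `fetch_all`, `demo`, and `fake_fetch` are hypothetical names, with a short sleep standing in for a real HTTP request.

```python
import asyncio


async def fetch_all(urls, concurrency_limit, fetch):
    """Run fetch(url) for every URL with at most concurrency_limit calls in flight."""
    semaphore = asyncio.Semaphore(concurrency_limit)

    async def bounded(url):
        async with semaphore:  # waits here once the limit is reached
            return await fetch(url)

    return await asyncio.gather(*(bounded(url) for url in urls))


async def demo(concurrency_limit=3, total=10):
    in_flight = 0
    peak = 0

    async def fake_fetch(url):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for network latency
        in_flight -= 1
        return url

    urls = [f"https://example.com/page/{i}" for i in range(total)]
    results = await fetch_all(urls, concurrency_limit, fake_fetch)
    return results, peak


results, peak = asyncio.run(demo())
print(len(results), "pages fetched; peak concurrency:", peak)
```

Raising the limit lets more requests overlap, which is where the extra memory and bandwidth use comes from.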

Configuration Best Practices

Start with Defaults

Use default settings initially and only customize when needed:

openground config show --defaults

Test Before Committing

Test configuration changes on a small library before re-embedding large libraries:

openground config set embeddings.chunk_size 1000
openground add testlib --source https://example.com/sitemap.xml
openground query "test query" --library testlib

Document Custom Settings

Keep a record of why you changed settings:

# Save current config
openground config show > my-config-backup.json
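
A backup is easier to annotate when it lists only the values that differ from the defaults. The sketch below assumes both configurations are already available as dictionaries (for instance from `get_default_config()` and `get_effective_config()`); `diff_config` is a hypothetical helper and not part of OpenGround.

```python
def diff_config(defaults, current, prefix=""):
    """Return {dotted_key: value} for every setting in current that differs from defaults."""
    changes = {}
    for key, value in current.items():
        dotted = prefix + key
        if key in defaults and isinstance(value, dict) and isinstance(defaults[key], dict):
            # Descend into nested sections and report individual keys
            changes.update(diff_config(defaults[key], value, dotted + "."))
        elif key not in defaults or defaults[key] != value:
            changes[dotted] = value
    return changes


defaults = {"embeddings": {"chunk_size": 800, "chunk_overlap": 200}, "query": {"top_k": 5}}
current = {"embeddings": {"chunk_size": 1200, "chunk_overlap": 300}, "query": {"top_k": 5}}
for key, value in diff_config(defaults, current).items():
    print(f"{key} = {value}")
```

The dotted keys in the result match the keys accepted by `openground config set`, so each line can be paired with a note on why the value was changed.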

Version Control

For team projects, commit .openground/sources.json to version control:

git add .openground/sources.json
git commit -m "Add OpenGround library sources"

Should I commit config.json to version control?

No. config.json is user-specific and contains local paths, so it should not be committed.

Yes for .openground/sources.json, which is project-specific and can be shared with your team.

# .gitignore
# Do not commit the user-specific config
.openground/config.json
# Do commit the shared sources file
!.openground/sources.json

What happens if I change chunk_size after embedding?

Existing embeddings use the old chunk_size. They won’t automatically update. To apply new chunking:

Change the setting:

openground config set embeddings.chunk_size 1000

Delete and re-add libraries:

openground remove mylib --version latest --yes
openground add mylib

Or use nuke for all libraries:

openground nuke embeddings --yes
# Then re-add all libraries

Can I use different configs for different projects?

Yes, for library sources. Give each project its own project-local sources file:

# In project A
mkdir -p .openground
echo '{ "mylib": { "type": "sitemap", "sitemap_url": "..." } }' > .openground/sources.json

# In project B
mkdir -p .openground
echo '{ "otherlib": { "type": "git_repo", "repo_url": "..." } }' > .openground/sources.json

Each project can have different source configurations.

Troubleshooting

Config changes don't take effect

Clear the config cache:

from openground.config import clear_config_cache
clear_config_cache()

Or restart your Python process/terminal.

Invalid JSON error when loading config

Your config file has syntax errors. View it:

cat "$(openground config path)"

Fix JSON syntax or reset:

openground config reset --yes
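
Before resetting, it can help to locate the syntax error. The sketch below uses Python's standard json module to report where parsing failed; `describe_json_error` is a hypothetical helper, and the exact wording of the message varies between Python versions.

```python
import json


def describe_json_error(text):
    """Return a short description of the first JSON syntax error, or None if the text is valid."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


broken = '{\n  "query": {"top_k": 5,}\n}'  # trailing comma is not allowed in JSON
print(describe_json_error(broken))
print(describe_json_error('{"query": {"top_k": 5}}'))
```

To check your real file, pass it the contents of the path printed by `openground config path`.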

Permission denied writing config

Check directory permissions:

ls -ld "$(dirname "$(openground config path)")"

Fix permissions:

chmod 755 "$(dirname "$(openground config path)")"
