OpenGround uses a JSON configuration file to manage settings for database paths, embedding models, query parameters, and more.

Configuration File Location

OpenGround stores configuration at:
~/.config/openground/config.json
Or if XDG_CONFIG_HOME is set:
$XDG_CONFIG_HOME/openground/config.json
The config file is automatically created with default values when you first run any openground command.

View Configuration

Display your current configuration:
openground config show

View Defaults Only

See hardcoded default values (ignoring your custom settings):
openground config show --defaults

Get Config File Path

Print the path to the config file:
openground config path

Default Configuration

OpenGround uses these defaults:
{
  "db_path": "~/.local/share/openground/lancedb",
  "table_name": "documents",
  "raw_data_dir": "~/.local/share/openground/raw_data",
  "extraction": {
    "concurrency_limit": 50
  },
  "embeddings": {
    "batch_size": 32,
    "chunk_size": 800,
    "chunk_overlap": 200,
    "embedding_model": "BAAI/bge-small-en-v1.5",
    "embedding_dimensions": 384,
    "embedding_backend": "fastembed"
  },
  "query": {
    "top_k": 5
  },
  "sources": {
    "auto_add_local": true
  }
}

Set Configuration Values

Modify configuration settings using the config set command:
openground config set <key> <value>
Use dot notation for nested keys:
openground config set embeddings.chunk_size 1000
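Dot notation maps each segment of the key onto one level of the nested JSON object. The following is a minimal sketch of that mapping, not OpenGround's actual code: the helper name `set_by_dotted_key` is hypothetical, and creating missing intermediate objects is an assumption.

```python
def set_by_dotted_key(config: dict, dotted_key: str, value) -> dict:
    """Set a nested value addressed by a dot-separated key (hypothetical helper)."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        # Assumption: missing intermediate objects are created on demand.
        node = node.setdefault(name, {})
    node[leaf] = value
    return config


config = {"embeddings": {"chunk_size": 800}}
set_by_dotted_key(config, "embeddings.chunk_size", 1000)
set_by_dotted_key(config, "query.top_k", 10)
print(config)
```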

Configuration Examples

# Change database path
openground config set db_path "/data/openground/db"

# Change chunk size
openground config set embeddings.chunk_size 1200

# Change chunk overlap
openground config set embeddings.chunk_overlap 300

# Change default top_k for queries
openground config set query.top_k 10

# Change embedding backend
openground config set embeddings.embedding_backend "sentence-transformers"

# Change embedding model
openground config set embeddings.embedding_model "sentence-transformers/all-MiniLM-L6-v2"

# Disable auto-add to sources.json
openground config set sources.auto_add_local false

# Change concurrency limit
openground config set extraction.concurrency_limit 100
Values are automatically parsed as JSON. For booleans and numbers, just type the value. For strings with spaces, use quotes.
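To illustrate what "parsed as JSON" means for the values in the examples above, here is a small sketch using Python's standard `json` module. The function name `parse_cli_value` is hypothetical, and falling back to a plain string when the value is not valid JSON is an assumption about how unquoted paths and model names are handled. Note that the shell removes the surrounding quotes before the program sees the value.

```python
import json


def parse_cli_value(raw: str):
    """Parse a command-line value as JSON, keeping it as a plain string on failure."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Assumption: anything that is not valid JSON is kept as a string.
        return raw


for raw in ["1000", "false", "/data/openground/db", "sentence-transformers"]:
    value = parse_cli_value(raw)
    print(f"{raw!r} -> {value!r} ({type(value).__name__})")
```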

Get Configuration Values

Retrieve a specific configuration value:
openground config get <key>

Get Examples

# Get chunk size
openground config get embeddings.chunk_size
# Output: 800

# Get embedding model
openground config get embeddings.embedding_model
# Output: BAAI/bge-small-en-v1.5

# Get database path
openground config get db_path
# Output: ~/.local/share/openground/lancedb

# Get top_k
openground config get query.top_k
# Output: 5

Reset Configuration

Delete your config file and restore defaults:
openground config reset
To skip the confirmation prompt:
openground config reset --yes
This permanently deletes your custom configuration. You cannot undo this action.

Configuration Settings Reference

Database Settings

db_path
string
default:"~/.local/share/openground/lancedb"
Path to LanceDB database directory. Stores embeddings and vector indexes.
table_name
string
default:"documents"
Name of the LanceDB table for storing document chunks.
raw_data_dir
string
default:"~/.local/share/openground/raw_data"
Directory for storing extracted JSON files before embedding.

Extraction Settings

extraction.concurrency_limit
integer
default:"50"
Maximum number of concurrent HTTP requests during sitemap extraction. Higher values extract faster but use more resources.
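A concurrency limit of this kind is commonly enforced with a semaphore. The sketch below shows the general technique with `asyncio.Semaphore`; it is not OpenGround's extraction code. The names `fetch_all` and `fake_fetch` are hypothetical, and the HTTP request is simulated with a short sleep so the example runs without network access.

```python
import asyncio


async def fetch_all(urls, concurrency_limit, fetch):
    """Run fetch(url) for every URL with at most concurrency_limit calls in flight."""
    semaphore = asyncio.Semaphore(concurrency_limit)

    async def bounded(url):
        async with semaphore:
            return await fetch(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(url) for url in urls))


in_flight = 0
peak = 0


async def fake_fetch(url):
    """Stand-in for an HTTP request; records how many calls overlap."""
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)
    in_flight -= 1
    return url


urls = [f"https://example.com/page-{i}" for i in range(20)]
results = asyncio.run(fetch_all(urls, concurrency_limit=5, fetch=fake_fetch))
print(len(results), "pages fetched, peak concurrency:", peak)
```

Raising the limit lets more requests overlap, which is where the extra memory and bandwidth use comes from.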

Embedding Settings

embeddings.batch_size
integer
default:"32"
Number of text chunks to embed in each batch. Larger batches are faster but use more memory.
embeddings.chunk_size
integer
default:"800"
Maximum number of characters per chunk when splitting documents. Affects embedding granularity.
embeddings.chunk_overlap
integer
default:"200"
Number of overlapping characters between consecutive chunks. Helps maintain context across chunk boundaries.
embeddings.embedding_model
string
default:"BAAI/bge-small-en-v1.5"
Name of the embedding model from Hugging Face. Must match the embedding backend.
embeddings.embedding_dimensions
integer
default:"384"
Dimensionality of embedding vectors. Must match the model’s output dimensions.
embeddings.embedding_backend
string
default:"fastembed"
Backend for generating embeddings. Options: "fastembed" or "sentence-transformers"

Query Settings

query.top_k
integer
default:"5"
Default number of search results to return. Can be overridden with the --top-k flag.

Source Settings

sources.auto_add_local
boolean
default:"true"
Automatically save library sources to ~/.openground/sources.json when adding libraries with the --source flag.
sources.file_path
string
Custom path to sources.json file. If not set, uses standard search order (project → user → package).

Embedding Backend Options

OpenGround supports two embedding backends:

FastEmbed (Default)

openground config set embeddings.embedding_backend "fastembed"
Advantages:
  • ✅ Faster inference
  • ✅ Lower memory usage
  • ✅ Optimized for CPU
  • ✅ No dependency on PyTorch
Models: Uses ONNX-optimized models

Sentence Transformers

openground config set embeddings.embedding_backend "sentence-transformers"
Advantages:
  • ✅ More model options
  • ✅ Better GPU support
  • ✅ Active development
  • ✅ Widely used
Models: Uses standard Hugging Face models
Changing the embedding backend requires re-embedding all libraries. Delete existing libraries and re-add them after changing backends.
  1. Set the new model:
openground config set embeddings.embedding_model "sentence-transformers/all-MiniLM-L6-v2"
openground config set embeddings.embedding_dimensions 384
  2. Delete existing embeddings:
openground nuke embeddings --yes
  3. Re-embed your libraries:
openground add mylib
openground add mylib
Note: Make sure embedding_dimensions matches your model’s output size.
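One way to catch a mismatch before re-embedding is to compare the configured value against the output size published on the model card. The sketch below is illustrative only: the lookup table and the function `dimension_mismatch` are hypothetical, the table covers just a few well-known models, and OpenGround may or may not perform a similar check itself.

```python
from typing import Optional

# Output sizes as published on each model's Hugging Face model card.
KNOWN_DIMENSIONS = {
    "BAAI/bge-small-en-v1.5": 384,
    "BAAI/bge-base-en-v1.5": 768,
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "sentence-transformers/all-mpnet-base-v2": 768,
}


def dimension_mismatch(embeddings: dict) -> Optional[str]:
    """Return a description of the mismatch, or None if nothing looks wrong."""
    model = embeddings["embedding_model"]
    expected = KNOWN_DIMENSIONS.get(model)
    if expected is None:
        return None  # Unknown model: nothing to compare against.
    configured = embeddings["embedding_dimensions"]
    if configured != expected:
        return f"{model} outputs {expected} dimensions, but embedding_dimensions is {configured}"
    return None


print(dimension_mismatch({
    "embedding_model": "sentence-transformers/all-mpnet-base-v2",
    "embedding_dimensions": 384,
}))
```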

Chunking Configuration

Chunking parameters affect how documents are split before embedding:
# Larger chunks (more context, fewer chunks)
openground config set embeddings.chunk_size 1200
openground config set embeddings.chunk_overlap 300

# Smaller chunks (more granular, more chunks)
openground config set embeddings.chunk_size 600
openground config set embeddings.chunk_overlap 150
Larger chunks (800-1200 chars):
  • Better context preservation
  • Fewer total chunks (faster indexing)
  • May include irrelevant content
  • Good for conceptual queries
Smaller chunks (400-600 chars):
  • More precise results
  • Better for specific queries
  • More chunks (slower indexing)
  • May lose broader context
Overlap (150-300 chars):
  • Prevents information loss at boundaries
  • Higher overlap = more context retention
  • Higher overlap = more chunks
Recommendation: Start with defaults (800/200) and adjust based on your documentation structure.
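To get a feel for how chunk_size and chunk_overlap interact, the following sketch splits text by plain character slicing and counts the resulting chunks. This is a simplified model under stated assumptions: OpenGround's real splitter may respect sentence or section boundaries, and the function name `split_into_chunks` is hypothetical.

```python
def split_into_chunks(text: str, chunk_size: int = 800, chunk_overlap: int = 200) -> list:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


document = "x" * 10_000  # stand-in for a 10,000-character page
for size, overlap in [(600, 150), (800, 200), (1200, 300)]:
    count = len(split_into_chunks(document, size, overlap))
    print(f"chunk_size={size}, chunk_overlap={overlap}: {count} chunks")
```

Each chunk advances by chunk_size minus chunk_overlap characters, so raising the overlap at a fixed chunk size increases the chunk count.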

Source File Configuration

OpenGround searches for library source configurations in this order:
  1. Custom path (if provided via --sources-file or sources.file_path)
  2. Project-local: ./.openground/sources.json
  3. User home: ~/.openground/sources.json
  4. Package-level: openground/extract/sources.json
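The search order above amounts to "first existing file wins". A minimal sketch of that lookup follows; the function `find_sources_file` and its parameters are hypothetical, and falling through to the next location when a custom path does not exist is an assumption. The demo uses a temporary directory so it does not touch real files.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_sources_file(custom_path, project_dir, home_dir, package_dir) -> Optional[Path]:
    """Return the first sources.json that exists, following the documented order."""
    candidates = []
    if custom_path:
        candidates.append(Path(custom_path))  # 1. custom path
    candidates += [
        Path(project_dir) / ".openground" / "sources.json",  # 2. project-local
        Path(home_dir) / ".openground" / "sources.json",     # 3. user home
        Path(package_dir) / "extract" / "sources.json",      # 4. package-level
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    project, home, package = root / "project", root / "home", root / "openground"
    # Only the user-level file exists at first.
    (home / ".openground").mkdir(parents=True)
    (home / ".openground" / "sources.json").write_text("{}")
    first = find_sources_file(None, project, home, package)
    # A project-local file takes precedence once it exists.
    (project / ".openground").mkdir(parents=True)
    (project / ".openground" / "sources.json").write_text("{}")
    second = find_sources_file(None, project, home, package)
    print(first.relative_to(root), "then", second.relative_to(root))
```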

Disable Auto-Add to Sources

By default, when you add a library with --source, it’s saved to ~/.openground/sources.json:
# Disable automatic saving
openground config set sources.auto_add_local false

Custom Sources File

# Use a custom sources.json location
openground config set sources.file_path "/path/to/my/sources.json"

# Or pass directly to add command
openground add mylib --sources-file "/path/to/sources.json"

Environment Variables

OpenGround respects the XDG Base Directory specification:
# Override config directory
export XDG_CONFIG_HOME="/custom/config"
# Config will be at: /custom/config/openground/config.json

# Override data directory
export XDG_DATA_HOME="/custom/data"
# Data will be at: /custom/data/openground/
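The resolution rules can be sketched in a few lines of Python. This is an illustration of the XDG fallback logic under the defaults shown on this page (~/.config and ~/.local/share), not OpenGround's own implementation; the function name `openground_dirs` is hypothetical.

```python
import os
from pathlib import Path


def openground_dirs(env=None, home=None) -> dict:
    """Resolve the config file and data directory from XDG variables."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # Per the XDG spec, an unset or empty variable falls back to the default.
    config_home = env.get("XDG_CONFIG_HOME") or home / ".config"
    data_home = env.get("XDG_DATA_HOME") or home / ".local" / "share"
    return {
        "config_file": Path(config_home) / "openground" / "config.json",
        "data_dir": Path(data_home) / "openground",
    }


print(openground_dirs(env={}, home="/home/user"))
print(openground_dirs(env={"XDG_CONFIG_HOME": "/custom/config",
                           "XDG_DATA_HOME": "/custom/data"}, home="/home/user"))
```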

Configuration in Python

Access configuration from Python code:
from openground.config import (
    get_effective_config,
    get_default_config,
    load_config,
    save_config,
    get_config_path,
    clear_config_cache
)

# Get merged config (user + defaults)
config = get_effective_config()
print(config["embeddings"]["chunk_size"])  # 800

# Get only defaults
defaults = get_default_config()
print(defaults["query"]["top_k"])  # 5

# Load user config only (no defaults)
user_config = load_config()
print(user_config)  # Only user overrides

# Modify and save config
# setdefault avoids a KeyError when "query" has no user overrides yet
user_config.setdefault("query", {})["top_k"] = 10
save_config(user_config)

# Clear cache after modifications
clear_config_cache()

# Get config file path
path = get_config_path()
print(path)  # /home/user/.config/openground/config.json
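The "merged config (user + defaults)" idea can be pictured as a recursive dictionary merge in which user values win. The sketch below is an assumption about the merge semantics, not the body of get_effective_config; the helper name `merge_config` is hypothetical.

```python
import copy


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Return defaults with overrides applied, merging nested objects key by key."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


defaults = {"embeddings": {"chunk_size": 800, "chunk_overlap": 200}, "query": {"top_k": 5}}
user = {"embeddings": {"chunk_size": 1000}}
effective = merge_config(defaults, user)
print(effective)
```

A shallow merge would replace the whole "embeddings" object and drop chunk_overlap, which is why the merge recurses into nested objects.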

Advanced Configuration Examples

High-Performance Setup

# Optimize for speed
openground config set extraction.concurrency_limit 100
openground config set embeddings.batch_size 64
openground config set embeddings.embedding_backend "fastembed"

High-Quality Setup

# Optimize for quality
openground config set embeddings.chunk_size 1200
openground config set embeddings.chunk_overlap 300
openground config set query.top_k 10

Custom Storage Setup

# Use custom paths
openground config set db_path "/mnt/ssd/openground/db"
openground config set raw_data_dir "/mnt/storage/openground/raw"

Large-Scale Setup

# For indexing many libraries
openground config set extraction.concurrency_limit 200
openground config set embeddings.batch_size 128
High concurrency and batch sizes require more memory and network bandwidth. Monitor resource usage when increasing these values.

Configuration Best Practices

1. Start with Defaults

Use default settings initially and only customize when needed:
openground config show --defaults

2. Test Before Committing

Test configuration changes on a small library before re-embedding large libraries:
openground config set embeddings.chunk_size 1000
openground add testlib --source https://example.com/sitemap.xml
openground query "test query" --library testlib

3. Document Custom Settings

Keep a record of why you changed settings:
# Save current config
openground config show > my-config-backup.json

4. Version Control

For team projects, commit .openground/sources.json to version control:
git add .openground/sources.json
git commit -m "Add OpenGround library sources"
Do not commit config.json: it is user-specific and contains local paths. Do commit .openground/sources.json: it is project-specific and can be shared with your team.
# .gitignore
# Do not commit the config file
.openground/config.json
# Do commit the sources file
!.openground/sources.json
Existing embeddings use the old chunk_size. They won’t automatically update. To apply new chunking:
  1. Change the setting:
openground config set embeddings.chunk_size 1000
  2. Delete and re-add libraries:
openground remove mylib --version latest --yes
openground add mylib
Or use nuke for all libraries:
openground nuke embeddings --yes
# Then re-add all libraries
To use different sources in different projects, use project-local sources:
# In project A
mkdir -p .openground
echo '{ "mylib": { "type": "sitemap", "sitemap_url": "..." } }' > .openground/sources.json

# In project B
mkdir -p .openground
echo '{ "otherlib": { "type": "git_repo", "repo_url": "..." } }' > .openground/sources.json
Each project can have different source configurations.

Troubleshooting

If configuration changes are not picked up by a running Python process, clear the config cache:
from openground.config import clear_config_cache
clear_config_cache()
Or restart your Python process/terminal.
If your config file has syntax errors, view it:
cat "$(openground config path)"
Fix JSON syntax or reset:
openground config reset --yes
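Before resetting, it can help to locate the syntax error. The sketch below uses Python's standard `json` module to report the line and column of the first problem; the function name `find_json_error` is hypothetical, and the sample text is a made-up config with a trailing comma. To check your real file, pass it the contents of the path printed by openground config path.

```python
import json


def find_json_error(text: str):
    """Return 'line N, column M: message' for the first syntax error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# Made-up example: the trailing comma after 5 is not valid JSON.
broken = '{\n  "query": {\n    "top_k": 5,\n  }\n}'
print(find_json_error(broken))
print(find_json_error('{"query": {"top_k": 5}}'))
```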
If the config file cannot be read or written, check directory permissions:
ls -ld "$(dirname "$(openground config path)")"
Fix permissions:
chmod 755 "$(dirname "$(openground config path)")"
