Configuration File
Location: ~/.openground/config.json
Access:
Top-Level Keys
db_path
Path to the LanceDB database directory. Stores vector embeddings and metadata.
Example:
table_name
Name of the LanceDB table for storing documents.
Example:
raw_data_dir
Directory for storing extracted documentation (JSON files) before embedding.
Structure: {raw_data_dir}/{library}/{version}/page_*.json
Example:
embeddings
Embedding and chunking configuration.
embeddings.embedding_backend
Embedding backend to use.
Valid values:
- sentence-transformers - Full transformers library (GPU/MPS/CPU support)
- fastembed - Lightweight, optimized for CPU speed (default)
embeddings.embedding_model
Name of the embedding model to use.
Default: BAAI/bge-small-en-v1.5 (384 dimensions, optimized for quality/speed balance)
Other models:
- BAAI/bge-base-en-v1.5 (768 dimensions, higher quality)
- sentence-transformers/all-MiniLM-L6-v2 (384 dimensions, faster)
- all-mpnet-base-v2 (768 dimensions, best quality, sentence-transformers only)
embeddings.embedding_dimensions
Dimension of the embedding vectors. Must match the model’s output dimensions.
Common dimensions:
- 384 - BAAI/bge-small-en-v1.5 (default), all-MiniLM-L6-v2
- 768 - BAAI/bge-base-en-v1.5, all-mpnet-base-v2
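A mismatch between embedding_model and embedding_dimensions is easy to introduce when editing the file by hand. The sketch below checks one against the other using only the model/dimension pairs listed in this section. The function name check_dimensions is hypothetical and not part of OpenGround, and the input is assumed to be the parsed embeddings object from config.json.

```python
# Model/dimension pairs as listed in this section.
KNOWN_DIMENSIONS = {
    "BAAI/bge-small-en-v1.5": 384,
    "BAAI/bge-base-en-v1.5": 768,
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
}


def check_dimensions(embeddings: dict) -> bool:
    """Return True when embedding_dimensions matches the configured model.

    `embeddings` is assumed to be the parsed "embeddings" object from
    config.json. Models missing from the table cannot be checked here,
    so they raise ValueError instead of returning a guess.
    """
    model = embeddings["embedding_model"]
    if model not in KNOWN_DIMENSIONS:
        raise ValueError(f"unknown model {model!r}; check its dimensions manually")
    return embeddings["embedding_dimensions"] == KNOWN_DIMENSIONS[model]
```

Running a check like this before re-embedding may catch a stale dimension value early, for example after switching from a 384-dimension model to a 768-dimension one.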
embeddings.batch_size
Number of chunks to process in a single batch when generating embeddings.
Trade-offs:
- Larger batches (64-128): Faster on GPU, more memory usage
- Smaller batches (8-16): Lower memory usage, slower processing
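To make the trade-off concrete, here is a minimal sketch of what batching means: the list of chunks is cut into groups of at most batch_size items, and each group is embedded in one call. The helper name batched is hypothetical, and this is not OpenGround's actual embedding loop.

```python
def batched(chunks: list, batch_size: int):
    """Yield consecutive slices of `chunks`, each with at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(chunks), batch_size):
        yield chunks[start:start + batch_size]
```

A larger batch_size means fewer, bigger groups held in memory at once; a smaller one means more calls with a smaller peak footprint.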
embeddings.chunk_size
Maximum size of text chunks (in characters) for embedding.
Trade-offs:
- Smaller chunks (256-512): More granular, better for specific queries
- Larger chunks (1024-2048): More context, better for broad topics
embeddings.chunk_overlap
Number of overlapping characters between consecutive chunks.
Purpose: Prevents information loss at chunk boundaries.
Recommended: 20-25% of chunk_size (default is 25% of 800)
Example:
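As an illustration of how chunk_size and chunk_overlap interact, the sketch below splits text into character-based chunks in which each chunk repeats the tail of the previous one. It assumes a plain sliding window and uses 800 and 200 as defaults, inferred from "25% of 800" above. The function chunk_text is hypothetical, and OpenGround's real chunker may split differently (for example on sentence or heading boundaries).

```python
def chunk_text(text: str, chunk_size: int = 800, chunk_overlap: int = 200) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Each chunk starts chunk_size - chunk_overlap characters after the
    previous one, so neighbours share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

With these values, a sentence that straddles the boundary at character 800 still appears whole in the second chunk as long as it begins after character 600.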
query
Search and query configuration.
query.top_k
Number of results to return from queries by default.
Can be overridden: Using the --top-k flag in openground query
Recommended range: 3-20
Example:
extraction
Extraction and scraping configuration.
extraction.concurrency_limit
Maximum number of concurrent requests when extracting from sitemaps.
Trade-offs:
- Higher values: Faster extraction, more memory/bandwidth
- Lower values: Slower extraction, less resource usage
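A concurrency limit of this kind is commonly implemented with a semaphore. The sketch below shows the general pattern using asyncio. It is an assumption about the technique, not a description of OpenGround's extractor, and fetch_all and the fetch callable are hypothetical names.

```python
import asyncio


async def fetch_all(urls, fetch, concurrency_limit):
    """Run fetch(url) for every URL with at most concurrency_limit in flight.

    `fetch` is any coroutine function taking a URL; results are returned
    in the same order as `urls`.
    """
    semaphore = asyncio.Semaphore(concurrency_limit)

    async def bounded(url):
        # Wait here whenever concurrency_limit requests are already running.
        async with semaphore:
            return await fetch(url)

    return await asyncio.gather(*(bounded(url) for url in urls))
```

Raising the limit lets more requests overlap, which is where the extra speed and the extra memory and bandwidth use both come from.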
sources
Source management configuration.
sources.auto_add_local
Automatically add sources to ~/.openground/sources.json when using openground add with the --source flag.
When enabled: Source configs are saved for future use
When disabled: Must provide --source every time
Example:
sources.file_path
Path to the sources configuration file.
Use case: Custom location for source definitions
Example:
Example Configuration
Common Configurations
High-Quality Embeddings
Better semantic understanding, slower, more disk space:
Fast and Lightweight
Faster extraction and queries, less disk space:
Large Context
Better for documentation with long-form content:
Detailed Queries
More results, better for exploration:
Changing Settings
View Current Value
Change a Value
Reset to Defaults
When Changes Take Effect
| Setting | Takes Effect |
|---|---|
| db_path, table_name, raw_data_dir | Immediately (affects new operations) |
| embeddings.* | Requires re-embedding existing libraries |
| query.top_k | Immediately (or use --top-k flag) |
| extraction.concurrency_limit | Next extraction |
| sources.* | Immediately |
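The embeddings.* row is the one that has consequences for existing data. As a rough sketch, the check below compares two parsed configs and reports whether anything under embeddings changed. It assumes the dotted key names correspond to a nested embeddings object in config.json, and needs_reembedding is a hypothetical helper, not an OpenGround function.

```python
def needs_reembedding(old_config: dict, new_config: dict) -> bool:
    """Return True when any value under "embeddings" differs between configs.

    Both arguments are assumed to be parsed config.json contents with a
    nested "embeddings" object; a missing object is treated as empty.
    """
    return old_config.get("embeddings", {}) != new_config.get("embeddings", {})
```

Changes to other keys, such as query.top_k, leave the result False, which matches the table: those settings apply without touching stored vectors.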
Re-embedding After Embedding Changes
If you change embedding settings, you must re-embed:
Related Commands
- openground config - Manage configuration
- openground nuke - Delete data when changing embeddings
- openground add - Add documentation