Understanding embedding backends, models, dimensions, and how OpenGround converts text to vectors
Embeddings are the core of OpenGround’s semantic search. They transform text chunks into numerical vectors that capture meaning, enabling the system to find relevant documentation even when query words don’t exactly match.
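At search time, relevance is scored by comparing vectors, typically with cosine similarity: vectors pointing in similar directions represent similar meanings. A minimal sketch in plain Python (the vectors here are toy examples; real embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings"
query_vec = [0.1, 0.9, 0.2]
doc_close = [0.15, 0.85, 0.25]  # similar meaning -> similar direction
doc_far = [0.9, 0.05, 0.1]      # unrelated meaning

print(cosine_similarity(query_vec, doc_close))  # close to 1.0
print(cosine_similarity(query_vec, doc_far))    # much lower
```

Because similarity is geometric rather than lexical, a query like "auth token expiry" can match a chunk about "session credential lifetimes" even though no words overlap.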
For the FastEmbed backend, OpenGround checks for GPU hardware and the `fastembed-gpu` package, then prints installation hints:

```python
from importlib.metadata import PackageNotFoundError, version

def check_gpu_compatibility() -> None:
    """Check for GPU compatibility and provide optimization tips."""
    gpu_hardware = is_gpu_hardware_available()  # nvidia-smi check

    # Check if fastembed-gpu is installed
    has_gpu_pkg = False
    try:
        version("fastembed-gpu")
        has_gpu_pkg = True
    except PackageNotFoundError:
        pass

    # Check for a functional GPU via onnxruntime
    functional_gpu = False
    try:
        import onnxruntime as ort
        functional_gpu = "CUDAExecutionProvider" in ort.get_available_providers()
    except ImportError:
        pass

    # Provide helpful hints
    if gpu_hardware and not has_gpu_pkg:
        hint("GPU detected! Install the GPU version for faster performance:")
        hint("  uv tool install 'openground[fastembed-gpu]'\n")
    elif gpu_hardware and has_gpu_pkg and not functional_gpu:
        error("GPU package is installed but CUDA is not functional.")
        # Suggest fixes...
```
Sentence-Transformers uses PyTorch with automatic hardware acceleration (from embeddings.py:14-25):
```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_st_model(model_name: str):
    """Get a cached instance of SentenceTransformer."""
    from sentence_transformers import SentenceTransformer

    # Automatically uses:
    # - CUDA on NVIDIA GPUs
    # - MPS on Apple Silicon
    # - CPU otherwise
    return SentenceTransformer(model_name)
```
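The `@lru_cache(maxsize=1)` decorator means the model is loaded once per process and reused on every subsequent call. That caching behaviour can be illustrated without pulling in sentence-transformers at all; the loader below is a stand-in for the real model load:

```python
from functools import lru_cache

LOAD_COUNT = 0

@lru_cache(maxsize=1)
def get_model(model_name: str) -> dict:
    """Stand-in for an expensive model load; cached so it runs only once."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return {"name": model_name}

a = get_model("all-MiniLM-L6-v2")
b = get_model("all-MiniLM-L6-v2")
print(a is b)      # True: both calls return the same cached instance
print(LOAD_COUNT)  # 1: the loader body ran only once
```

Note that with `maxsize=1`, requesting a different model name evicts the previously cached model, so switching models mid-process triggers a fresh load.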
- Language: match your docs (multilingual, English-only, etc.)
- Size: smaller models mean faster inference
2. Update Configuration
```shell
# Set model and dimensions
openground config set embeddings.embedding_model "sentence-transformers/all-MiniLM-L6-v2"
openground config set embeddings.embedding_dimensions 384
```
3. Delete Existing Embeddings
```shell
# Remove all embedded data
openground nuke embeddings
```
This deletes the LanceDB table but preserves raw documentation.
FastEmbed distinguishes between passage (document) and query embeddings (from embeddings.py:163-204):
```python
def _generate_embeddings_fastembed(
    texts: Iterable[str],
    show_progress: bool = True,
) -> list[list[float]]:
    # ...
    model = get_fastembed_model(model_name)
    for i in range(0, len(texts_list), batch_size):
        batch = texts_list[i : i + batch_size]
        # Use passage_embed for document chunks
        batch_embeddings = list(model.passage_embed(batch))
        all_embeddings.extend([emb.tolist() for emb in batch_embeddings])
```
Some models are trained differently for documents vs. queries. FastEmbed uses passage_embed() for document chunks and would use query_embed() for search queries (though OpenGround currently uses passage_embed for both).
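The batching loop in the snippet above can be sketched in isolation; the `batch_size` value here is arbitrary:

```python
def batched(items: list[str], batch_size: int) -> list[list[str]]:
    """Split items into consecutive batches, mirroring the range-based loop."""
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]

chunks = [f"chunk-{n}" for n in range(7)]
print(batched(chunks, 3))
# -> [['chunk-0', 'chunk-1', 'chunk-2'], ['chunk-3', 'chunk-4', 'chunk-5'], ['chunk-6']]
```

Batching keeps memory use bounded regardless of how many chunks are being embedded; the final batch is simply shorter when the total isn't an exact multiple of `batch_size`.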
```shell
# Lightweight FastEmbed with small model
openground config set embeddings.embedding_backend fastembed
openground config set embeddings.embedding_model "BAAI/bge-small-en-v1.5"
openground config set embeddings.embedding_dimensions 384
openground config set embeddings.batch_size 32
```
```shell
# Sentence-Transformers with larger model
openground config set embeddings.embedding_backend sentence-transformers
openground config set embeddings.embedding_model "BAAI/bge-base-en-v1.5"
openground config set embeddings.embedding_dimensions 768
openground config set embeddings.batch_size 64  # Larger batches for GPU
```
```shell
# Sentence-Transformers with MPS acceleration
openground config set embeddings.embedding_backend sentence-transformers
openground config set embeddings.embedding_model "BAAI/bge-small-en-v1.5"
openground config set embeddings.embedding_dimensions 384
openground config set embeddings.batch_size 32
```
Apple Silicon automatically uses MPS (Metal Performance Shaders) via sentence-transformers. No special configuration needed.
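To see which accelerator PyTorch would pick on your machine, you can query it directly. This helper is not part of OpenGround; it is a diagnostic sketch that falls back to `"cpu"` when PyTorch isn't installed:

```python
def detect_device() -> str:
    """Report the accelerator PyTorch would select: cuda, mps, or cpu."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed; sentence-transformers unavailable
    if torch.cuda.is_available():
        return "cuda"  # NVIDIA GPU
    if torch.backends.mps.is_available():
        return "mps"   # Apple Silicon Metal backend
    return "cpu"

print(detect_device())
```

This mirrors the priority order noted in the `get_st_model` comments: CUDA first, then MPS, then CPU.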
```shell
# Larger chunks (more context, less precise)
openground config set embeddings.chunk_size 1200
openground config set embeddings.chunk_overlap 300

# Smaller chunks (more precise, less context)
openground config set embeddings.chunk_size 512
openground config set embeddings.chunk_overlap 128
```
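The effect of `chunk_size` and `chunk_overlap` can be illustrated with a character-based sliding window. OpenGround's actual chunker may split on different boundaries (tokens, paragraphs); this is a simplified sketch of the trade-off:

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Sliding window: each chunk starts chunk_size - chunk_overlap after the last."""
    step = chunk_size - chunk_overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]

text = "x" * 2000
large = chunk_text(text, chunk_size=1200, chunk_overlap=300)  # step of 900
small = chunk_text(text, chunk_size=512, chunk_overlap=128)   # step of 384

print(len(large), len(small))  # -> 3 6
```

The same document yields fewer, larger chunks in the first configuration and more, smaller chunks in the second; overlap ensures content near a chunk boundary still appears intact in at least one chunk.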