Overview

Chroma is an open-source embedding database designed for AI applications. It suits local development and can be deployed to Chroma Cloud or to a self-hosted server.

Setup

No setup required! Chroma runs in-memory by default.
// Simply leave Chroma URL empty
Collection Name: my-collection
Embeddings: OpenAI Embeddings

Configuration

Required Parameters

collectionName (string, required)
Name of the collection used to store and retrieve embeddings.
embeddings (Embeddings, required)
Embedding model to use (e.g., OpenAI Embeddings).

Optional Parameters

document (Document[])
Documents to upsert into the collection.
chromaURL (string)
URL of the Chroma server. Leave empty for in-memory mode:
  • Empty = In-memory
  • http://localhost:8000 = Local server
  • https://api.trychroma.com = Chroma Cloud
credential (credential)
Chroma API credential (only needed for cloud-hosted instances).
recordManager (RecordManager)
Tracks indexed documents to prevent duplication.
chromaMetadataFilter (json)
Filters search results by metadata:
{
  "source": "docs",
  "category": "tutorial"
}
topK (number, default: 4)
Number of results to return.
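
The URL rules above can be summarized in a few lines of Python. This is a sketch of the mapping as this page describes it, not Flowise's own logic; the function name `chroma_mode` and the "self-hosted server" label for any other host are assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import urlparse


def chroma_mode(chroma_url: Optional[str]) -> str:
    """Classify a Chroma URL value following the list on this page (illustrative only)."""
    url = (chroma_url or "").strip()
    if not url:
        # Empty field: the page describes this as in-memory mode.
        return "in-memory"
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"not a valid http(s) URL: {url!r}")
    if parsed.hostname in ("localhost", "127.0.0.1", "::1"):
        return "local server"
    if parsed.hostname == "api.trychroma.com":
        return "Chroma Cloud"
    # Assumption: any other host is treated as a self-hosted deployment.
    return "self-hosted server"
```

A value such as `localhost:8000` (no scheme) raises `ValueError`, which mirrors the advice under Common Issues to check the URL format.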

Usage Examples

In-Memory (Development)

// Fastest setup - no persistence
Collection Name: test-collection
Embeddings: OpenAI Embeddings
Chroma URL: [leave empty]
Top K: 4

// Data stored in memory, lost on restart

Local Persistent

# Start Chroma server (an absolute host path is used because older Docker versions reject relative paths in -v)
docker run -p 8000:8000 -v "$(pwd)/chroma_data:/chroma/chroma" chromadb/chroma

// Connect to local server
Collection Name: my-docs
Chroma URL: http://localhost:8000
Embeddings: OpenAI Embeddings

Chroma Cloud

// Cloud configuration
Chroma URL: https://api.trychroma.com
Collection Name: production-docs
Credential: Chroma API (with key, tenant, database)
Embeddings: OpenAI Embeddings

With Metadata Filtering

// Search only specific documents
{
  "chromaMetadataFilter": {
    "type": "api-docs",
    "version": "v2"
  }
}
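
Because the filter is entered as JSON, a typo in an operator name is easy to miss. The following is a minimal sketch of a pre-check that parses the filter text and rejects operators not listed in the Metadata Filter Syntax section of this page. The helper names are hypothetical, and the operator sets are copied from this page rather than from Chroma itself, so extend them if your Chroma version supports more.

```python
import json

# Operators taken from this page; not an authoritative list for every Chroma version.
COMPARISON_OPS = {"$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in"}
LOGICAL_OPS = {"$and", "$or"}


def parse_metadata_filter(text: str) -> dict:
    """Parse filter JSON and raise ValueError on malformed input or unknown operators."""
    parsed = json.loads(text)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(parsed, dict):
        raise ValueError("filter must be a JSON object")
    _check(parsed)
    return parsed


def _check(node: dict) -> None:
    for key, value in node.items():
        if key in LOGICAL_OPS:
            is_filter_list = (
                isinstance(value, list)
                and len(value) > 0
                and all(isinstance(item, dict) for item in value)
            )
            if not is_filter_list:
                raise ValueError(f"{key} expects a non-empty list of filter objects")
            for child in value:
                _check(child)
        elif key.startswith("$"):
            raise ValueError(f"unknown operator: {key}")
        elif isinstance(value, dict):
            unknown = set(value) - COMPARISON_OPS
            if unknown:
                raise ValueError(f"unknown operator(s) for {key!r}: {sorted(unknown)}")
```

Calling `parse_metadata_filter('{"type": "api-docs", "version": "v2"}')` returns the parsed dictionary, while a misspelled operator such as `$gtt` raises `ValueError` before the filter ever reaches the node.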

With Record Manager

// Prevent duplicate indexing
Document: Text Loader
Collection Name: knowledge-base
Record Manager: Postgres Record Manager
Embeddings: OpenAI Embeddings

// Only new/changed docs are processed
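
To illustrate why only new or changed documents are processed, here is a minimal sketch of the idea behind a record manager: remember a content hash per document and skip anything whose hash has not changed. `InMemoryRecordManager` is a hypothetical stand-in written for this example; it is not the Flowise or Postgres Record Manager implementation, which persists its records in a database.

```python
import hashlib
from typing import Dict


class InMemoryRecordManager:
    """Hypothetical record manager: keeps one content hash per document key."""

    def __init__(self) -> None:
        self._hashes: Dict[str, str] = {}

    def filter_changed(self, docs: Dict[str, str]) -> Dict[str, str]:
        """Return only documents that are new or whose content differs from the last run."""
        changed: Dict[str, str] = {}
        for key, text in docs.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self._hashes.get(key) != digest:
                changed[key] = text
                self._hashes[key] = digest  # remember the latest version
        return changed
```

On a second run with identical input, `filter_changed` returns an empty dictionary, so nothing would be re-embedded or re-upserted.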

Metadata Filter Syntax

Chroma supports WHERE clause filtering:
// Simple equality
{ "category": "tutorial" }

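// Operators: $eq, $ne, $gt, $gte, $lt, $lte
// (if your Chroma version rejects a filter with several top-level keys, wrap the conditions in $and)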
{
  "year": { "$gte": 2023 },
  "rating": { "$gt": 4.5 }
}

// $in operator
{
  "status": { "$in": ["published", "reviewed"] }
}

// Logical operators: $and, $or
{
  "$and": [
    { "category": "docs" },
    { "language": "en" }
  ]
}

{
  "$or": [
    { "priority": "high" },
    { "urgent": true }
  ]
}
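
To make the semantics of these operators concrete, the sketch below evaluates a filter against a single metadata dictionary in plain Python. It is an illustration of how the examples above read, not Chroma's implementation: the function name `matches` is hypothetical, a missing key is simply treated as a non-match, and an operator not in the table raises `KeyError`.

```python
import operator

# Comparison operators from this page mapped to Python equivalents.
COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
}


def matches(metadata: dict, where: dict) -> bool:
    """Return True if the metadata satisfies every condition in the filter."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            if key not in metadata:
                return False  # simplification: missing fields never match
            value = metadata[key]
            if isinstance(condition, dict):
                # Operator form, e.g. {"$gte": 2023}; all operators must hold.
                for op, operand in condition.items():
                    if not COMPARATORS[op](value, operand):
                        return False
            elif value != condition:
                # Shorthand form, e.g. {"category": "tutorial"} means equality.
                return False
    return True
```

For example, a document tagged `{"year": 2024, "rating": 4.8}` satisfies `{"year": {"$gte": 2023}, "rating": {"$gt": 4.5}}`, while one tagged `{"status": "draft"}` does not satisfy `{"status": {"$in": ["published", "reviewed"]}}`.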

Best Practices

Development

  • Use in-memory for quick testing
  • Use local server for development
  • Small datasets work great in-memory
  • Easy to reset and iterate

Production

  • Use Chroma Cloud or self-hosted server
  • Enable authentication
  • Set up backups
  • Monitor collection sizes

Performance

  • Create indexes on frequently queried metadata
  • Use appropriate collection sizes
  • Consider sharding for very large datasets
  • Batch upserts when possible
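
As a rough illustration of batching, the helper below splits any sequence of documents into fixed-size chunks so each chunk can be sent in one upsert call. The name `batched` and the batch size used in the usage note are assumptions for this sketch; a suitable size depends on document length and server limits.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

With the Chroma Python client, one possible use is to loop over `batched(records, 100)` and call `collection.upsert(ids=..., documents=...)` once per batch instead of once per document.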

Data Management

  • Use descriptive collection names
  • Tag documents with metadata
  • Use record manager to avoid duplicates
  • Implement collection lifecycle management

Collection Management

Creating Collections

Collections are created automatically when you first upsert documents.

Deleting Collections

# Via Chroma client (if needed)
import chromadb
client = chromadb.HttpClient(host="localhost", port=8000)
client.delete_collection(name="old-collection")

Listing Collections

# Check existing collections
import chromadb
client = chromadb.HttpClient(host="localhost", port=8000)
collections = client.list_collections()
for collection in collections:
    print(f"{collection.name}: {collection.count()} documents")

Deployment Options

Docker Compose

version: '3'
services:
  chroma:
    image: chromadb/chroma
    ports:
      - "8000:8000"
    volumes:
      - ./chroma_data:/chroma/chroma
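    environment:
      # Auth variable names and provider values differ between Chroma releases;
      # confirm them against the documentation for the image tag you deploy.
      - CHROMA_SERVER_AUTH_PROVIDER=token
      - CHROMA_SERVER_AUTH_CREDENTIALS=your-secret-token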
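
Once token authentication is enabled, clients have to send the token with every request. The sketch below builds such a request with the standard library, without sending it. The `Authorization: Bearer` header format and the `/api/v2/heartbeat` path are assumptions that depend on the Chroma version and its auth configuration, and `build_heartbeat_request` is a hypothetical helper.

```python
import urllib.request


def build_heartbeat_request(
    base_url: str,
    token: str,
    path: str = "/api/v2/heartbeat",  # assumption: older servers may expose /api/v1/heartbeat
) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request to a Chroma server."""
    url = base_url.rstrip("/") + path
    # Assumption: the server expects the token as a Bearer value in the Authorization header.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# To send it against a running server:
# request = build_heartbeat_request("http://localhost:8000", "your-secret-token")
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.status)
```

If the server answers with an authorization error, compare the header name and token format with the auth settings of the Chroma version you run.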

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: chroma
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chroma
  template:
    metadata:
      labels:
        app: chroma
    spec:
      containers:
      - name: chroma
        image: chromadb/chroma
        ports:
        - containerPort: 8000
        volumeMounts:
        - name: chroma-data
          mountPath: /chroma/chroma
      volumes:
      - name: chroma-data
        persistentVolumeClaim:
          claimName: chroma-pvc

Common Issues

Can’t connect to Chroma server
Solution:
  • Verify Chroma server is running
  • Check URL format: http://localhost:8000
  • Ensure port 8000 is not blocked
  • For Docker: Check container is running
Error accessing collection
Solution:
  • Collections are auto-created on first upsert
  • Check collection name spelling
  • Ensure documents were successfully indexed
  • Verify you’re connecting to correct server
Data disappears after restart
Solution:
  • In-memory mode doesn’t persist
  • Use Chroma server for persistence
  • Configure persistent volume
  • Consider Chroma Cloud for managed hosting
Chroma Cloud connection issues
Solution:
  • Verify API key is correct
  • Check tenant and database names
  • Ensure credential is properly configured
  • Test with Chroma Cloud console
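
For the connection problems listed above, a quick first check is whether anything is listening on the host and port in the Chroma URL. The following sketch uses only the standard library and tests TCP reachability; it does not confirm that the service is Chroma or that credentials are valid. The helper names `host_and_port` and `is_reachable` are hypothetical.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse


def host_and_port(chroma_url: str) -> Tuple[str, int]:
    """Extract host and port from a URL, falling back to the scheme's default port."""
    parsed = urlparse(chroma_url)
    if not parsed.hostname:
        raise ValueError(f"URL has no host (is the scheme missing?): {chroma_url!r}")
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def is_reachable(chroma_url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(chroma_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

If `is_reachable("http://localhost:8000")` returns `False`, the server or container is probably not running or the port is blocked; if it returns `True` and Flowise still fails, the collection name or credentials are the more likely cause.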

Chroma vs Other Vector DBs

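| Feature       | Chroma      | Pinecone   | Qdrant     |
|---------------|-------------|------------|------------|
| Open Source   | Yes         | No         | Yes        |
| In-Memory     | Yes         | No         | Yes        |
| Managed Cloud | Yes         | Yes        | Yes        |
| Self-Hosted   | Yes         | No         | Yes        |
| Best For      | Development | Production | Production |
| Ease of Use   | Excellent   | Very Good  | Good       |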

Outputs

retriever (VectorStoreRetriever)
Retriever interface for use in chains and agents.
vectorStore (ChromaVectorStore)
Direct vector store access for custom operations.