diff --git a/README.md b/README.md
index ce04340..135f11e 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@ SEREACT is a secure API for storing, organizing, and retrieving images with adva
 - Team-based organization and access control
 - **Hybrid authentication model**: Public management endpoints + API key protected data endpoints
 - **Asynchronous image processing with Pub/Sub and Cloud Functions**
-- **AI-powered image embeddings using Google Cloud Vision API**
+- **AI-powered image embeddings using Google Vertex AI Multimodal Embedding API**
 - **Semantic search using vector similarity with Qdrant Vector Database**
 - **Self-hosted vector database on Google Compute Engine VM**
 - **Automatic retry mechanism for failed processing (up to 3 attempts)**
@@ -102,7 +102,7 @@ sereact/
    - **Cloud Function is triggered by Pub/Sub message**
    - Function updates image status to `"processing"`
    - **Function downloads image from Cloud Storage**
-   - **Function calls Google Cloud Vision API to generate embeddings**
+   - **Function calls Google Vertex AI Multimodal Embedding API to generate 1408-dimensional embeddings**
    - **Embeddings are stored in Qdrant Vector Database on dedicated VM**
    - **Firestore is updated with embedding info and status: "success"**
 
@@ -123,7 +123,7 @@ sereact/
 - **Google Cloud Storage** - Image storage
 - **Google Pub/Sub** - Message queue for async processing
 - **Google Cloud Functions** - Serverless image processing
-- **Google Cloud Vision API** - AI-powered image analysis and embedding generation
+- **Google Vertex AI Multimodal Embedding API** - AI-powered image analysis and embedding generation
 - **Qdrant** - Self-hosted vector database for semantic search (on Google Compute Engine VM)
 - **Google Compute Engine** - VM hosting for vector database
 - **Pydantic** - Data validation
@@ -155,6 +155,20 @@ The system includes a dedicated Google Compute Engine VM running Qdrant vector d
 - **Cosine Similarity**: Optimized for image embedding comparisons
 - **Metadata Filtering**: Support for complex search filters
 
+### **AI Embedding Model**
+
+SEREACT uses Google's Vertex AI multimodal embedding model for generating high-quality image embeddings:
+
+- **Model**: `multimodalembedding@001`
+- **Provider**: Google Vertex AI
+- **Type**: Multimodal embedding model (supports both images and text)
+- **Output Dimensions**: **1408-dimensional vectors**
+- **Data Type**: Float32, normalized vectors (norm ≈ 1.0)
+- **Similarity Metric**: Cosine similarity
+- **Use Case**: Semantic image search and similarity matching
+
+**⚠️ Important Configuration**: Ensure your Qdrant collection is configured with **1408 dimensions** to match the Vertex AI model output. Dimension mismatches will cause embedding storage failures.
+
 ## Setup and Installation
 
 ### Prerequisites
@@ -197,8 +211,9 @@ The system includes a dedicated Google Compute Engine VM running Qdrant vector d
    PUBSUB_TOPIC=image-processing-topic
    PUBSUB_SUBSCRIPTION=image-processing-subscription
 
-   # Google Cloud Vision
-   VISION_API_ENABLED=true
+   # Google Vertex AI (for image embeddings)
+   VERTEX_AI_LOCATION=us-central1
+   GOOGLE_CLOUD_PROJECT=your-gcp-project-id
 
    # Security
    API_KEY_SECRET=your-secret-key
@@ -206,6 +221,8 @@ The system includes a dedicated Google Compute Engine VM running Qdrant vector d
    # Vector database (Qdrant)
    QDRANT_HOST=your-vm-external-ip
    QDRANT_API_KEY=your-qdrant-api-key # Optional
+   QDRANT_COLLECTION=image_vectors
+   QDRANT_HTTPS=false
    ```
 
 5. **Deploy Infrastructure** (Required for vector database):
diff --git a/src/services/vector_db.py b/src/services/vector_db.py
index b0eab9c..c77ef0e 100644
--- a/src/services/vector_db.py
+++ b/src/services/vector_db.py
@@ -86,7 +86,7 @@ class VectorDatabaseService:
                 self.client.create_collection(
                     collection_name=self.collection_name,
                     vectors_config=VectorParams(
-                        size=512, # Typical size for image embeddings
+                        size=1408, # Typical size for image embeddings
                         distance=Distance.COSINE
                     )
                 )
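
The README change warns that a dimension mismatch between the embedding model and the Qdrant collection causes storage failures. One possible guard, not part of this change, is to check each vector before it is written. The sketch below uses only the standard library; `EXPECTED_DIM`, `validate_embedding`, `is_normalized`, and the tolerance value are hypothetical names and defaults chosen for illustration, with the 1408 figure and the norm ≈ 1.0 property taken from the README text in this diff.

```python
import math
from typing import Sequence

# Output size of multimodalembedding@001 as stated in the README change.
# Hypothetical constant; the diff hard-codes 1408 in VectorParams instead.
EXPECTED_DIM = 1408


def validate_embedding(vector: Sequence[float],
                       expected_dim: int = EXPECTED_DIM) -> None:
    """Raise ValueError if the vector length does not match the collection size."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, expected {expected_dim}"
        )


def is_normalized(vector: Sequence[float], tolerance: float = 1e-2) -> bool:
    """Return True if the L2 norm is within `tolerance` of 1.0.

    The tolerance is an assumed value, not something the README specifies.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    return abs(norm - 1.0) <= tolerance


if __name__ == "__main__":
    unit = [0.0] * EXPECTED_DIM
    unit[0] = 1.0
    validate_embedding(unit)       # passes silently
    print(is_normalized(unit))
    try:
        validate_embedding([0.0] * 512)  # the old collection size
    except ValueError as exc:
        print(exc)
```

Calling a check like this before the upsert would surface a stale 512-dimensional collection or a misconfigured model as a clear error message rather than a failed write.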