Cloud Function for Image Embedding Processing

This Cloud Function processes images to generate embeddings using Google's Vertex AI multimodal embedding model and stores them in a Qdrant vector database.

Overview

The function is triggered by Pub/Sub messages containing image processing tasks. It:

Downloads images from Google Cloud Storage
Generates embeddings using Vertex AI's multimodalembedding@001 model
Stores the embeddings in the Qdrant vector database
Updates image metadata in Firestore
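
The four steps above could be wired together roughly as in the sketch below. It is not the function's actual code: the message fields (`image_id`, `gcs_path`), the status strings, and the four injected callables are hypothetical names chosen for illustration. The one fixed fact it relies on is that Pub/Sub delivers the message payload base64-encoded.

```python
import base64
import json
from typing import Callable, Sequence


def decode_task(data_b64: str) -> dict:
    """Decode the base64-encoded JSON payload carried by a Pub/Sub message."""
    return json.loads(base64.b64decode(data_b64).decode("utf-8"))


def process_image(
    task: dict,
    download: Callable[[str], bytes],
    embed: Callable[[bytes], Sequence[float]],
    store: Callable[[str, Sequence[float]], None],
    set_status: Callable[[str, str], None],
) -> None:
    """Run download, embed, and store in order, recording status along the way."""
    image_id = task["image_id"]          # hypothetical field name
    set_status(image_id, "processing")   # hypothetical status value
    try:
        image_bytes = download(task["gcs_path"])  # hypothetical field name
        vector = embed(image_bytes)
        store(image_id, vector)
    except Exception:
        # Record the failure, then re-raise so the caller can decide to retry.
        set_status(image_id, "failed")
        raise
    set_status(image_id, "completed")
```

Passing the Cloud Storage, Vertex AI, Qdrant, and Firestore operations in as callables keeps the control flow testable without any cloud credentials; in a real deployment each callable would wrap the corresponding client library.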

Key Features

Vertex AI Multimodal Embeddings: Uses Google's multimodalembedding@001 model
1408-dimensional vectors: Embeddings suited to semantic image search
Automatic retry: Built-in retry logic for failed processing
Status tracking: Real-time status updates in Firestore
Scalable: Auto-scaling Cloud Function with configurable limits
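
This README does not say how the retry logic is implemented, so the following is only one plausible shape for it: a small helper that retries a callable with exponential backoff. The name `with_retry` and its parameters are assumptions, not part of the function's code.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds or max_attempts is reached.

    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so tests can substitute a recorder for `time.sleep` and run without real delays.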

Dependencies

google-cloud-aiplatform: Vertex AI SDK for multimodal embeddings
google-cloud-firestore: Firestore database client
google-cloud-storage: Cloud Storage client
qdrant-client: Vector database client
numpy: Numerical operations
Pillow: Image processing

Environment Variables

The function requires these environment variables:

# Google Cloud Configuration
GOOGLE_CLOUD_PROJECT=your-project-id
VERTEX_AI_LOCATION=us-central1

# Firestore Configuration
FIRESTORE_PROJECT_ID=your-project-id
FIRESTORE_DATABASE_NAME=(default)

# Cloud Storage Configuration
GCS_BUCKET_NAME=your-bucket-name

# Qdrant Configuration
QDRANT_HOST=your-qdrant-host
QDRANT_PORT=6333
QDRANT_API_KEY=your-api-key
QDRANT_COLLECTION=image_vectors
QDRANT_HTTPS=false

# Logging
LOG_LEVEL=INFO
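
As an illustration of how these variables might be read at startup, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical, and the defaults simply mirror the example values above; which variables the real function treats as required is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    project_id: str
    vertex_location: str
    firestore_project_id: str
    firestore_database: str
    bucket_name: str
    qdrant_host: str
    qdrant_port: int
    qdrant_api_key: Optional[str]
    qdrant_collection: str
    qdrant_https: bool
    log_level: str


def _required(env: Mapping[str, str], name: str) -> str:
    """Return a non-empty variable or fail fast with a clear message."""
    value = env.get(name)
    if not value:
        raise RuntimeError(f"{name} not found in environment variables")
    return value


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping (the process environment by default)."""
    project_id = _required(env, "GOOGLE_CLOUD_PROJECT")
    return Settings(
        project_id=project_id,
        vertex_location=env.get("VERTEX_AI_LOCATION", "us-central1"),
        # Assumption: Firestore lives in the same project unless overridden.
        firestore_project_id=env.get("FIRESTORE_PROJECT_ID", project_id),
        firestore_database=env.get("FIRESTORE_DATABASE_NAME", "(default)"),
        bucket_name=_required(env, "GCS_BUCKET_NAME"),
        qdrant_host=_required(env, "QDRANT_HOST"),
        qdrant_port=int(env.get("QDRANT_PORT", "6333")),
        qdrant_api_key=env.get("QDRANT_API_KEY") or None,
        qdrant_collection=env.get("QDRANT_COLLECTION", "image_vectors"),
        qdrant_https=env.get("QDRANT_HTTPS", "false").strip().lower() == "true",
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Parsing everything once into a frozen dataclass surfaces a missing or malformed variable at cold start rather than in the middle of processing an image.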

Testing

Local Testing

Set up your environment:

export GOOGLE_CLOUD_PROJECT=your-project-id
export VERTEX_AI_LOCATION=us-central1

Install dependencies:

pip install -r requirements.txt

Run the test script:

python test_vertex_ai_embeddings.py

This will create a test image and verify that embeddings are generated correctly.

Expected Output

The test should output something like:

INFO:__main__:Testing Vertex AI multimodal embeddings...
INFO:__main__:Using project: your-project-id
INFO:__main__:Creating test image...
INFO:__main__:Created test image with 1234 bytes
INFO:__main__:Generating embeddings using Vertex AI...
INFO:__main__:Generated embeddings with shape: (1408,)
INFO:__main__:Embeddings dtype: float32
INFO:__main__:Embeddings range: [-0.1234, 0.5678]
INFO:__main__:Embeddings norm: 1.0000
INFO:__main__:✅ All tests passed! Vertex AI embeddings are working correctly.
INFO:__main__:🎉 Test completed successfully!
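
The checks behind that output are not shown in this README. A minimal sketch of what they might look like, based only on the shape and norm lines in the sample log, follows. The helper name `check_embedding` and the tolerance are assumptions, and the unit-norm expectation is inferred from the sample output rather than from a documented guarantee.

```python
import math
from typing import Sequence

EXPECTED_DIM = 1408  # dimensionality stated in this README


def check_embedding(vector: Sequence[float], norm_tolerance: float = 1e-3) -> float:
    """Validate the vector's length and L2 norm, and return the norm."""
    if len(vector) != EXPECTED_DIM:
        raise ValueError(f"expected {EXPECTED_DIM} dimensions, got {len(vector)}")
    norm = math.sqrt(sum(x * x for x in vector))
    # Assumption: embeddings are (approximately) unit length, as in the sample log.
    if abs(norm - 1.0) > norm_tolerance:
        raise ValueError(f"expected unit norm, got {norm:.4f}")
    return norm
```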

Deployment

The function is deployed using Terraform. See the main deployment documentation for details.

Monitoring

Check the Cloud Function logs in the Google Cloud Console
Monitor Firestore for image status updates
Check Qdrant for stored embeddings

Troubleshooting

Common Issues

Authentication errors: Ensure the function's service account has been granted the roles/aiplatform.user role
API not enabled: Ensure aiplatform.googleapis.com is enabled
Quota limits: Check Vertex AI quotas in your project
Network issues: Ensure the function can reach Qdrant and other services
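
For the network case, a quick TCP reachability probe can help separate connectivity problems from application errors. The helper below is a generic sketch, not part of the function; `can_reach` is a hypothetical name, and the host and port would come from QDRANT_HOST and QDRANT_PORT.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A successful TCP connection only shows that the port is open; it does not confirm that the API key is valid or that the collection exists.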

Error Messages

"Failed to generate embeddings - no image embedding returned": Check image format and size
"PROJECT_ID not found in environment variables": Set GOOGLE_CLOUD_PROJECT
"Error generating embeddings": Check Vertex AI API access and quotas
