image_management_api/deployment/cloud-function/README.md

# Cloud Function for Image Embedding Processing

This Cloud Function processes images to generate embeddings using Google's Vertex AI multimodal embedding model and stores them in a Qdrant vector database.

## Overview

The function is triggered by Pub/Sub messages containing image processing tasks. It:

1. Downloads images from Google Cloud Storage
2. Generates embeddings using Vertex AI's `multimodalembedding@001` model
3. Stores the embeddings in a Qdrant vector database
4. Updates image metadata in Firestore
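
As a rough illustration of the trigger step, the sketch below decodes a Pub/Sub-style event into a task dictionary. It assumes the event shape used by background functions, where the payload is base64-encoded under `data`. The field names `image_id` and `gcs_path` are hypothetical; this README does not document the actual message schema.

```python
import base64
import json


def parse_task(event: dict) -> dict:
    """Decode a Pub/Sub-style event into a task dictionary.

    Pub/Sub delivers the message payload base64-encoded under "data".
    The required field names below are hypothetical placeholders.
    """
    if "data" not in event:
        raise ValueError("Pub/Sub event has no 'data' field")
    payload = base64.b64decode(event["data"]).decode("utf-8")
    task = json.loads(payload)
    # Hypothetical required fields for an image processing task.
    for field in ("image_id", "gcs_path"):
        if field not in task:
            raise ValueError(f"Task is missing required field: {field}")
    return task


if __name__ == "__main__":
    message = {"image_id": "img-123", "gcs_path": "images/img-123.png"}
    encoded = base64.b64encode(json.dumps(message).encode("utf-8")).decode("ascii")
    print(parse_task({"data": encoded}))
```

Validating the payload before any download or embedding call keeps malformed messages from consuming Vertex AI quota.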

## Key Features

- **Vertex AI Multimodal Embeddings**: Uses Google's `multimodalembedding@001` model through Vertex AI
- **1408-dimensional vectors**: Each image is embedded as a 1408-dimensional vector for semantic image search
- **Automatic retry**: Built-in retry logic for failed processing
- **Status tracking**: Real-time status updates in Firestore
- **Scalable**: Auto-scaling Cloud Function with configurable limits
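
This README does not describe how the retry logic is implemented. One common pattern is exponential backoff around the failing call, sketched below. The helper name `retry_with_backoff` and its parameters are assumptions made for illustration, not the function's actual code.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds or max_attempts is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            # The delay doubles after each failure: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")
```

The `sleep` parameter is injectable so the helper can be unit-tested without real delays.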

## Dependencies

- `google-cloud-aiplatform`: Vertex AI SDK for multimodal embeddings
- `google-cloud-firestore`: Firestore database client
- `google-cloud-storage`: Cloud Storage client
- `qdrant-client`: Vector database client
- `numpy`: Numerical operations
- `Pillow`: Image processing

## Environment Variables

The function requires these environment variables:

```bash
# Google Cloud Configuration
GOOGLE_CLOUD_PROJECT=your-project-id
VERTEX_AI_LOCATION=us-central1

# Firestore Configuration
FIRESTORE_PROJECT_ID=your-project-id
FIRESTORE_DATABASE_NAME="(default)"

# Cloud Storage Configuration
GCS_BUCKET_NAME=your-bucket-name

# Qdrant Configuration
QDRANT_HOST=your-qdrant-host
QDRANT_PORT=6333
QDRANT_API_KEY=your-api-key
QDRANT_COLLECTION=image_vectors
QDRANT_HTTPS=false

# Logging
LOG_LEVEL=INFO
```
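
A minimal sketch of how the Qdrant variables above might be read and validated at startup. The `QdrantSettings` class and `load_qdrant_settings` helper are hypothetical names, and the fallback defaults simply mirror the example values shown above rather than any confirmed behavior of the function.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class QdrantSettings:
    host: str
    port: int
    api_key: Optional[str]
    collection: str
    https: bool


def load_qdrant_settings(env: Mapping[str, str] = os.environ) -> QdrantSettings:
    """Build Qdrant settings from environment variables, failing fast if the host is missing."""
    host = env.get("QDRANT_HOST")
    if not host:
        raise ValueError("QDRANT_HOST is not set")
    return QdrantSettings(
        host=host,
        # Assumed defaults, copied from the example values in this README.
        port=int(env.get("QDRANT_PORT", "6333")),
        api_key=env.get("QDRANT_API_KEY") or None,
        collection=env.get("QDRANT_COLLECTION", "image_vectors"),
        # Environment variables are strings, so parse the boolean explicitly.
        https=env.get("QDRANT_HTTPS", "false").strip().lower() in ("1", "true", "yes"),
    )
```

Parsing `QDRANT_HTTPS` explicitly matters because the non-empty string `"false"` is truthy in Python.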

## Testing

### Local Testing

1. Set up your environment:
```bash
export GOOGLE_CLOUD_PROJECT=your-project-id
export VERTEX_AI_LOCATION=us-central1
```

2. Install dependencies:
```bash
pip install -r requirements.txt
```

3. Run the test script:
```bash
python test_vertex_ai_embeddings.py
```

This will create a test image and verify that embeddings are generated correctly.

### Expected Output

The test should output something like:
```
INFO:__main__:Testing Vertex AI multimodal embeddings...
INFO:__main__:Using project: your-project-id
INFO:__main__:Creating test image...
INFO:__main__:Created test image with 1234 bytes
INFO:__main__:Generating embeddings using Vertex AI...
INFO:__main__:Generated embeddings with shape: (1408,)
INFO:__main__:Embeddings dtype: float32
INFO:__main__:Embeddings range: [-0.1234, 0.5678]
INFO:__main__:Embeddings norm: 1.0000
INFO:__main__:✅ All tests passed! Vertex AI embeddings are working correctly.
INFO:__main__:🎉 Test completed successfully!
```
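
The checks implied by this output (dimension and norm) can be sketched as a small validator. This is not the contents of `test_vertex_ai_embeddings.py`; the helper name `check_embedding` is hypothetical, and the unit-norm check is an assumption based only on the sample output above.

```python
import math
from typing import Sequence

EXPECTED_DIMENSION = 1408  # vector size stated in this README


def check_embedding(vector: Sequence[float], tolerance: float = 1e-3) -> float:
    """Validate an embedding and return its L2 norm."""
    if len(vector) != EXPECTED_DIMENSION:
        raise ValueError(
            f"Expected {EXPECTED_DIMENSION} dimensions, got {len(vector)}"
        )
    if not all(math.isfinite(x) for x in vector):
        raise ValueError("Embedding contains NaN or infinite values")
    norm = math.sqrt(sum(x * x for x in vector))
    # Assumption: embeddings are unit-normalized, as the sample output suggests.
    if abs(norm - 1.0) > tolerance:
        raise ValueError(f"Expected unit norm, got {norm:.4f}")
    return norm
```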

## Deployment

The function is deployed using Terraform. See the main deployment documentation for details.

## Monitoring

- Check Cloud Function logs in Google Cloud Console
- Monitor Firestore for image status updates
- Check Qdrant for stored embeddings

## Troubleshooting

### Common Issues

1. **Authentication errors**: Ensure the function's service account has been granted the `roles/aiplatform.user` role
2. **API not enabled**: Ensure the Vertex AI API (`aiplatform.googleapis.com`) is enabled for the project
3. **Quota limits**: Check Vertex AI quotas in your project
4. **Network issues**: Ensure the function can reach Qdrant and other services
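
For the network case, a quick TCP reachability probe can help separate connectivity problems from authentication problems. The sketch below uses only the standard library; `can_connect` is a hypothetical helper, and a successful connection only shows that the port is reachable, not that the API key or collection is valid.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_connect(os.environ["QDRANT_HOST"], 6333)` could be logged once at cold start, assuming the port matches `QDRANT_PORT`.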

### Error Messages

- `"Failed to generate embeddings - no image embedding returned"`: Check image format and size
- `"PROJECT_ID not found in environment variables"`: Set `GOOGLE_CLOUD_PROJECT`
- `"Error generating embeddings"`: Check Vertex AI API access and quotas