# Image Management API

A secure API for storing, organizing, and retrieving images with advanced search capabilities powered by AI-generated embeddings.

## Features

- Secure image storage in Google Cloud Storage
- Team-based organization and access control
- **Hybrid authentication model**: Public management endpoints + API key protected data endpoints
- **Asynchronous image processing with Pub/Sub and Cloud Functions**
- **AI-powered image embeddings using Google Vertex AI Multimodal Embedding API**
- **Semantic search using vector similarity with Qdrant Vector Database**
- **Self-hosted vector database on Google Compute Engine VM**
- **Automatic retry mechanism for failed processing (up to 3 attempts)**
- Metadata extraction and storage
- Image processing capabilities
- Multi-team support
- **Public user and team management APIs for easy integration**
- **Comprehensive E2E testing with real database support**

## Architecture

```
root/
├── images/                    # Sample images for testing
├── deployment/                # Deployment configurations
│   ├── cloud-function/        # **Cloud Function for image processing**
│   ├── cloud-run/             # Google Cloud Run configuration
│   └── terraform/             # Infrastructure as code
│       ├── vm.tf              # **Vector database VM configuration**
│       └── scripts/           # **VM installation scripts**
├── docs/                      # Documentation
│   ├── api/                   # API documentation
│   └── TESTING.md             # Comprehensive testing guide
├── scripts/                   # Utility scripts
├── src/                       # Source code
│   ├── api/                   # API endpoints and routers
│   │   └── v1/                # API version 1 routes
│   ├── auth/                  # Authentication and authorization
│   ├── config/                # Configuration management
│   ├── db/                    # Database layer
│   │   ├── providers/         # Database providers (Firestore)
│   │   └── repositories/      # Data access repositories
│   ├── models/                # Database models
│   ├── schemas/               # API request/response schemas
│   ├── services/              # Business logic services
│   │   ├── pubsub_service.py  # **Pub/Sub message publishing**
│   │   └── vector_db.py       # **Qdrant vector database service**
│   └── utils/                 # Utility functions
├── tests/                     # Test code
│   ├── api/                   # API tests
│   ├── auth/                  # Authentication tests
│   ├── models/                # Model tests
│   ├── services/              # Service tests
│   ├── integration/           # Integration tests
│   │   └── test_cloud_function.py  # **Cloud Function tests**
│   └── test_e2e.py            # **Comprehensive E2E workflow tests**
├── main.py                    # Application entry point
├── requirements.txt           # Python dependencies
└── README.md                  # This file
```

## System Architecture

```
┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│             │         │             │         │             │
│   FastAPI   │ ───────▶│  Firestore  │◀────────│    Cloud    │
│   Backend   │         │  Database   │         │  Functions  │
│             │         │             │         │             │
└─────┬───────┘         └─────────────┘         └──────┬──────┘
      │                                                │
      │                                                │
      ▼                                                │
┌─────────────┐         ┌─────────────┐                │
│             │         │             │                │
│    Cloud    │         │   Pub/Sub   │                │
│   Storage   │────────▶│    Queue    │────────────────┘
│             │         │             │
└─────────────┘         └─────────────┘
      │
      │
      ▼
┌─────────────┐         ┌─────────────┐
│             │         │             │
│    Cloud    │         │   Qdrant    │
│ Vision API  │────────▶│  Vector DB  │
│             │         │    (VM)     │
└─────────────┘         └─────────────┘
```

## **Image Processing Workflow**

### 1. **Image Upload Flow**:
- User uploads image through FastAPI backend
- Image is stored in Google Cloud Storage
- Image metadata is saved to Firestore with `embedding_status: "pending"`
- **Pub/Sub message is published to trigger async processing**

### 2. **Embedding Generation Flow** (Asynchronous):
- **Cloud Function is triggered by Pub/Sub message**
- Function updates image status to `"processing"`
- **Function downloads image from Cloud Storage**
- **Function calls Google Vertex AI Multimodal Embedding API to generate 1408-dimensional embeddings**
- **Embeddings are stored in Qdrant Vector Database on dedicated VM**
- **Firestore is updated with embedding info and status: "success"**

### 3. **Error Handling & Retry**:
- **Failed processing updates status to "failed" with error message**
- **Automatic retry up to 3 times using Pub/Sub retry policy**
- **Dead letter queue for permanently failed messages**

### 4. **Search Flow**:
- Search queries are processed by the FastAPI backend
- Vector similarity search is performed against the Qdrant vector database on a VM
- Results are combined with metadata from Firestore

## Technology Stack

- **FastAPI** - Web framework
- **Firestore** - Database
- **Google Cloud Storage** - Image storage
- **Google Pub/Sub** - Message queue for async processing
- **Google Cloud Functions** - Serverless image processing
- **Google Vertex AI Multimodal Embedding API** - AI-powered image analysis and embedding generation
- **Qdrant** - Self-hosted vector database for semantic search (on Google Compute Engine VM)
- **Google Compute Engine** - VM hosting for vector database
- **Pydantic** - Data validation

## **Vector Database Infrastructure**

### **Qdrant Vector Database VM**

The system includes a dedicated Google Compute Engine VM running the Qdrant vector database:

- **VM Specifications**: 2 vCPUs, 8GB RAM, 50GB disk (e2-standard-2)
- **Operating System**: Ubuntu 22.04 LTS
- **Vector Database**: Qdrant (latest version via Docker)

### **AI Embedding Model**

The system uses Google's Vertex AI multimodal embedding model to generate image embeddings:

- **Model**: `multimodalembedding@001`
- **Provider**: Google Vertex AI
- **Type**: Multimodal embedding model (supports both images and text)
- **Output Dimensions**: **1408-dimensional vectors**

## Setup and Installation

### Prerequisites

- Python 3.8+
- Google Cloud account with Firestore, Storage, Pub/Sub, Cloud Functions, Compute Engine, and Vision API enabled
- Terraform (for infrastructure deployment)

### Installation

1. Clone the repository:
   ```bash
   git clone {repo-url}
   cd {repo-name}
   ```

2. Create and activate a virtual environment:
   ```bash
   python -m venv venv
   source venv/bin/activate  # Linux/macOS
   venv\Scripts\activate     # Windows
   ```

3. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

4. Create a `.env` file with the following environment variables:
   ```
   # Project Environment Variables
   ENVIRONMENT=development
   LOG_LEVEL=DEBUG

   # CORS settings - Must be a valid JSON list of strings or comma-separated values
   CORS_ORIGINS=["*"]

   # Firestore settings
   FIRESTORE_PROJECT_ID=gen-lang-client-0424120530
   FIRESTORE_DATABASE_NAME=sereact-imagedb
   FIRESTORE_CREDENTIALS_FILE=firestore-credentials.json

   # Google Cloud Storage settings
   GCS_BUCKET_NAME=sereact-images
   GCS_CREDENTIALS_FILE=firestore-credentials.json

   # Security settings
   API_KEY_SECRET=super-secret-key
   API_KEY_EXPIRY_DAYS=365

   # Vector Database settings
   QDRANT_PORT=6333
   QDRANT_HTTPS=false
   QDRANT_PREFER_GRPC=false
   ```

5. **Deploy Infrastructure**
   ```bash
   ./deployment/deploy.sh --build --deploy
   python ./scripts/seed_firestore.py
   ```

6. **Destroy Infrastructure**
   ```bash
   ./deployment/deploy.sh --destroy
   ```

7. **Local Development**
   ```bash
   ./scripts/start.sh
   ```

8. **Local Testing**

## API Endpoints

The API provides the following main endpoints, listed with their authentication requirements and pagination support:

### 🔓 **Public Endpoints (No Authentication Required)**

#### Authentication & API Key Management
- `/api/v1/auth/api-keys` (POST) - Create new API key (requires `user_id` and `team_id` parameters)

#### Team Management
- `/api/v1/teams/*` - **Complete team management (no authentication required)**
  - `POST /api/v1/teams` - Create new team
  - `GET /api/v1/teams` - List all teams with **pagination support**
    - **Query Parameters:**
      - `skip` (default: 0, min: 0) - Number of items to skip
      - `limit` (default: 50, min: 1, max: 100) - Number of items per page
    - **Response includes:** `teams`, `total`, `skip`, `limit`
  - `GET /api/v1/teams/{team_id}` - Get team by ID
  - `PUT /api/v1/teams/{team_id}` - Update team
  - `DELETE /api/v1/teams/{team_id}` - Delete team

#### User Management
- `/api/v1/users/*` - **Complete user management (no authentication required)**
  - `POST /api/v1/users` - Create new user (requires `team_id`)
  - `GET /api/v1/users` - List users with **pagination support**
    - **Query Parameters:**
      - `skip` (default: 0, min: 0) - Number of items to skip
      - `limit` (default: 50, min: 1, max: 100) - Number of items per page
      - `team_id` (optional) - Filter by team
    - **Response includes:** `users`, `total`, `skip`, `limit`
  - `GET /api/v1/users/{user_id}` - Get user by ID
  - `PUT /api/v1/users/{user_id}` - Update user
  - `DELETE /api/v1/users/{user_id}` - Delete user
  - `GET /api/v1/users/me?user_id={id}` - Get user info (requires `user_id` parameter)
  - `PUT /api/v1/users/me?user_id={id}` - Update user info (requires `user_id` parameter)

### 🔐 **Protected Endpoints (API Key Authentication Required)**

#### API Key Management (Authenticated)
- `/api/v1/auth/api-keys` (GET) - List API keys for current user with **pagination support**
  - **Query Parameters:**
    - `skip` (default: 0, min: 0) - Number of items to skip
    - `limit` (default: 50, min: 1, max: 100) - Number of items per page
  - **Response includes:** `api_keys`, `total`, `skip`, `limit`
- `/api/v1/auth/api-keys/{key_id}` (DELETE) - Revoke API key
- `/api/v1/auth/admin/api-keys/{user_id}` (POST) - Create API key for another user (admin only)
- `/api/v1/auth/verify` - Verify current authentication

#### Image Management ✅ **Fully Paginated & Protected**
- `/api/v1/images/*` - **Image upload, download, and management (with async processing)**
  - `GET /api/v1/images` - List images with **full pagination support**
    - **Query Parameters:**
      - `skip` (default: 0, min: 0) - Number of items to skip
      - `limit` (default: 50, min: 1, max: 100) - Number of items per page
      - `collection_id` (optional) - Filter by collection
    - **Response includes:** `images`, `total`, `skip`, `limit`

#### Search Functionality ✅ **Fully Paginated & Protected**
- `/api/v1/search/*` - **Image search functionality (semantic search via Qdrant)**
  - `GET /api/v1/search` - Search images with **pagination support**
    - **Query Parameters:**
      - `q` (required) - Search query
      - `skip` (default: 0, min: 0) - Number of items to skip
      - `limit` (default: 10, min: 1, max: 50) - Number of results
      - `similarity_threshold` (default: 0.7, min: 0.0, max: 1.0) - Similarity threshold
      - `collection_id` (optional) - Filter by collection
    - **Response includes:** `results`, `total`, `skip`, `limit`, `similarity_threshold`, `query`
  - `POST /api/v1/search` - Advanced search with the same pagination parameters

### 🔑 **Authentication Model**

The API uses a **hybrid authentication model**:

1. **Public Management Endpoints**: Users, teams, and API key creation are **publicly accessible** for easy integration and setup
2. **Protected Data Endpoints**: Image storage and search require **API key authentication**

### **Authentication & Pagination Status**

| Endpoint Category | Authentication | Pagination Status | Notes |
|------------------|----------------|------------------|-------|
| **Users Management** | 🔓 **Public** | ✅ **Fully Implemented** | `skip`, `limit`, `total` with team filtering |
| **Teams Management** | 🔓 **Public** | ✅ **Fully Implemented** | `skip`, `limit`, `total` with proper validation |
| **API Key Creation** | 🔓 **Public** | N/A | Requires `user_id` and `team_id` parameters |
| **Images API** | 🔐 **Protected** | ✅ **Fully Implemented** | `skip`, `limit`, `total` with proper validation |
| **Search API** | 🔐 **Protected** | ✅ **Fully Implemented** | `skip`, `limit`, `total` with similarity scoring |
| **API Key Management** | 🔐 **Protected** | ✅ **Fully Implemented** | `skip`, `limit`, `total` for user's API keys |

**Note:** All list and search endpoints implement consistent pagination with the `skip` and `limit` parameters. Refer to the Swagger UI documentation at `/docs` for detailed endpoint information.
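
The pagination and search parameters described above can be exercised from a client along these lines. This is a minimal sketch and not part of the project code: the helper names `build_search_url` and `page_offsets` and the base URL `http://localhost:8000` are hypothetical, and the way the API key is sent with a request is not specified in this README, so it is left out. Only the parameter names and bounds are taken from the endpoint list.

```python
from urllib.parse import urlencode


def build_search_url(base_url, q, skip=0, limit=10,
                     similarity_threshold=0.7, collection_id=None):
    """Build a GET /api/v1/search URL (hypothetical client-side helper).

    Checks the documented bounds locally before any request is made.
    """
    if not q:
        raise ValueError("q is required")
    if skip < 0:
        raise ValueError("skip must be >= 0")
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    if not 0.0 <= similarity_threshold <= 1.0:
        raise ValueError("similarity_threshold must be between 0.0 and 1.0")

    params = {
        "q": q,
        "skip": skip,
        "limit": limit,
        "similarity_threshold": similarity_threshold,
    }
    # collection_id is optional, so it is only sent when provided.
    if collection_id is not None:
        params["collection_id"] = collection_id
    return f"{base_url.rstrip('/')}/api/v1/search?{urlencode(params)}"


def page_offsets(total, limit):
    """Return the `skip` value of every page needed to cover `total` items."""
    return list(range(0, total, limit))


# Example: the first page of a search against an assumed local server.
url = build_search_url("http://localhost:8000", "red car")
```

A client would typically read `total` from the first response and then call `page_offsets(total, limit)` to obtain the `skip` values for the remaining pages.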
## Development

### Running Tests

```bash
# venv/Scripts/activate is the Windows (Git Bash) path; on Linux/macOS use venv/bin/activate
source venv/Scripts/activate && python scripts/run_tests.py all
```

## API Modules Architecture

The SEREACT API is organized into the following key modules to keep concerns separated and the code maintainable:

```
src/
├── api/                # API endpoints and routers
│   └── v1/             # API version 1 routes
├── auth/               # Authentication and authorization
├── config/             # Configuration management
├── models/             # Database models
├── services/           # Business logic services
│   └── vector_db.py    # **Qdrant vector database service**
└── utils/              # Utility functions
```

### Module Responsibilities

#### Router Module
- Defines API endpoints and routes
- Handles HTTP requests and responses
- Validates incoming request data
- Directs requests to appropriate services
- Implements API versioning

#### Auth Module
- Manages user authentication
- Handles API key validation and verification
- Implements role-based access control
- Provides security middleware
- Manages user sessions and tokens

#### Services Module
- Contains core business logic
- Orchestrates operations across multiple resources
- Implements domain-specific rules and workflows
- Integrates with external services (Cloud Vision, Storage, **Qdrant**)
- Handles image processing and embedding generation

#### Models Module
- Defines data structures and schemas
- Provides database entity representations
- Handles data validation and serialization
- Implements data relationships and constraints
- Manages database migrations

#### Utils Module
- Provides helper functions and utilities
- Implements common functionality used across modules
- Handles error processing and logging
- Provides formatting and conversion utilities
- Implements reusable middleware components

#### Config Module
- Manages application configuration
- Handles environment variable loading
- Provides centralized settings management
- Configures service connections and credentials
- Defines application constants and defaults

### Module Interactions
```
┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│             │         │             │         │             │
│   Router    │ ───────▶│  Services   │ ◀───────│   Config    │
│   Module    │         │   Module    │         │   Module    │
│             │         │             │         │             │
└──────┬──────┘         └──────┬──────┘         └─────────────┘
       │                       │
       │                       │
       ▼                       ▼
┌─────────────┐         ┌─────────────┐
│             │         │             │
│    Auth     │         │   Models    │
│   Module    │         │   Module    │
│             │         │             │
└──────┬──────┘         └──────┬──────┘
       │                       │
       │                       │
       └───────────────────────┘
                   │
                   ▼
            ┌─────────────┐
            │             │
            │    Utils    │
            │   Module    │
            │             │
            └─────────────┘
```

The modules interact in the following ways:

- **Request Flow**:
  - Client request arrives at the Router Module
  - Auth Module validates the request authentication
  - Router delegates to appropriate Service functions
  - Service uses Models to interact with the database
  - **Service integrates with Qdrant Vector Database for similarity search**
  - Service returns data to Router which formats the response

- **Cross-Cutting Concerns**:
  - Config Module provides settings to all other modules
  - Utils Module provides helper functions across the application
  - Auth Module secures access to routes and services

- **Dependency Direction**:
  - Router depends on Services and Auth
  - Services depend on Models and Config
  - Models depend on Utils for helper functions
  - Auth depends on Models for user information
  - All modules may use Utils and Config

This modular architecture provides several benefits:

- **Maintainability**: Changes in one module have minimal impact on others
- **Testability**: Modules can be tested in isolation with mocked dependencies
- **Scalability**: New features can be added by extending existing modules
- **Reusability**: Common functionality is centralized for consistent implementation
- **Security**: Authentication and authorization are handled consistently

## TODO

### High Priority
- [ ] Thumbnail generation
- [ ] Secret management
- [ ] Scale Vector DB to multiple nodes

### Medium Priority
- [ ] Implement caching layer for frequently accessed embeddings
- [ ] Implement caching for frequently accessed data

### Low Priority
- [ ] Move all auth logic to auth module
- [ ] Move cloud function code to src folder and reuse code with embedding service
- [ ] Remove Pinecone integration