# Image Management API
A secure API for storing, organizing, and retrieving images with advanced search capabilities powered by AI-generated embeddings.
## Features
- Secure image storage in Google Cloud Storage
- Team-based organization and access control
- **Hybrid authentication model**: Public management endpoints + API key protected data endpoints
- **Asynchronous image processing with Pub/Sub and Cloud Functions**
- **AI-powered image embeddings using Google Vertex AI Multimodal Embedding API**
- **Semantic search using vector similarity with Qdrant Vector Database**
- **Self-hosted vector database on Google Compute Engine VM**
- **Automatic retry mechanism for failed processing (up to 3 attempts)**
- Metadata extraction and storage
- Image processing capabilities
- Multi-team support
- **Public user and team management APIs for easy integration**
- **Comprehensive E2E testing with real database support**
## Architecture
```
root/
├── images/                          # Sample images for testing
├── deployment/                      # Deployment configurations
│   ├── cloud-function/              # **Cloud Function for image processing**
│   ├── cloud-run/                   # Google Cloud Run configuration
│   └── terraform/                   # Infrastructure as code
│       ├── vm.tf                    # **Vector database VM configuration**
│       └── scripts/                 # **VM installation scripts**
├── docs/                            # Documentation
│   ├── api/                         # API documentation
│   └── TESTING.md                   # Comprehensive testing guide
├── scripts/                         # Utility scripts
├── src/                             # Source code
│   ├── api/                         # API endpoints and routers
│   │   └── v1/                      # API version 1 routes
│   ├── auth/                        # Authentication and authorization
│   ├── config/                      # Configuration management
│   ├── db/                          # Database layer
│   │   ├── providers/               # Database providers (Firestore)
│   │   └── repositories/            # Data access repositories
│   ├── models/                      # Database models
│   ├── schemas/                     # API request/response schemas
│   ├── services/                    # Business logic services
│   │   ├── pubsub_service.py        # **Pub/Sub message publishing**
│   │   └── vector_db.py             # **Qdrant vector database service**
│   └── utils/                       # Utility functions
├── tests/                           # Test code
│   ├── api/                         # API tests
│   ├── auth/                        # Authentication tests
│   ├── models/                      # Model tests
│   ├── services/                    # Service tests
│   ├── integration/                 # Integration tests
│   │   └── test_cloud_function.py   # **Cloud Function tests**
│   └── test_e2e.py                  # **Comprehensive E2E workflow tests**
├── main.py                          # Application entry point
├── requirements.txt                 # Python dependencies
└── README.md                        # This file
```
## System Architecture
```
┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│             │         │             │         │             │
│   FastAPI   │ ───────▶│  Firestore  │◀────────│    Cloud    │
│   Backend   │         │  Database   │         │  Functions  │
│             │         │             │         │             │
└─────┬───────┘         └─────────────┘         └──────┬──────┘
      │                                                │
      │                                                │
      ▼                                                │
┌─────────────┐         ┌─────────────┐                │
│             │         │             │                │
│    Cloud    │         │   Pub/Sub   │                │
│   Storage   │────────▶│    Queue    │────────────────┘
│             │         │             │
└─────────────┘         └─────────────┘

┌─────────────┐         ┌─────────────┐
│             │         │             │
│  Vertex AI  │         │   Qdrant    │
│Embedding API│────────▶│  Vector DB  │
│             │         │    (VM)     │
└─────────────┘         └─────────────┘
```
## **Image Processing Workflow**
### 1. **Image Upload Flow**:
- User uploads image through FastAPI backend
- Image is stored in Google Cloud Storage
- Image metadata is saved to Firestore with `embedding_status: "pending"`
- **Pub/Sub message is published to trigger async processing**
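
The upload steps above can be sketched as follows. This is a minimal illustration, not the project's code: `build_upload_records` and every field name other than `embedding_status` are hypothetical, and the actual Cloud Storage, Firestore, and Pub/Sub calls are left out.

```python
import json
from datetime import datetime, timezone


def build_upload_records(image_id: str, team_id: str, gcs_path: str):
    """Return (firestore_doc, pubsub_payload) for a newly uploaded image.

    Hypothetical helper: only `embedding_status: "pending"` comes from the
    workflow description; every other field name is an assumption.
    """
    firestore_doc = {
        "id": image_id,
        "team_id": team_id,
        "storage_path": gcs_path,
        "embedding_status": "pending",  # set at upload time
        "uploaded_at": datetime.now(timezone.utc).isoformat(),
    }
    # Pub/Sub message data must be bytes, so the payload is JSON-encoded.
    pubsub_payload = json.dumps(
        {"image_id": image_id, "storage_path": gcs_path}
    ).encode("utf-8")
    return firestore_doc, pubsub_payload
```

The document would be written to Firestore and the payload published to the processing topic, in that order, so the Cloud Function always finds a record to update.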
### 2. **Embedding Generation Flow** (Asynchronous):
- **Cloud Function is triggered by Pub/Sub message**
- Function updates image status to `"processing"`
- **Function downloads image from Cloud Storage**
- **Function calls Google Vertex AI Multimodal Embedding API to generate 1408-dimensional embeddings**
- **Embeddings are stored in Qdrant Vector Database on dedicated VM**
- **Firestore is updated with embedding info and status: "success"**
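
The status values named in this workflow form a small state machine. The sketch below encodes it so that illegal updates are rejected; treating `"failed"` to `"processing"` as the retry path is an assumption, and `next_status` is a hypothetical helper rather than a function from the codebase.

```python
# Allowed embedding_status transitions, taken from the workflow text.
# Treating "failed" -> "processing" as the retry path is an assumption.
ALLOWED_TRANSITIONS = {
    "pending": {"processing"},
    "processing": {"success", "failed"},
    "failed": {"processing"},
    "success": set(),
}


def next_status(current: str, new: str) -> str:
    """Return `new` if the transition is allowed, otherwise raise ValueError."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current!r} -> {new!r}")
    return new
```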
### 3. **Error Handling & Retry**:
- **Failed processing updates status to "failed" with error message**
- **Automatic retry up to 3 times using Pub/Sub retry policy**
- **Dead letter queue for permanently failed messages**
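
In the deployed system the retries are delegated to Pub/Sub, so no retry loop needs to live in the function itself. The in-process simulation below only illustrates the intended behaviour (three attempts, then dead-lettering); `process_with_retry` and the dead-letter list are hypothetical stand-ins.

```python
from typing import Callable, List

MAX_ATTEMPTS = 3  # matches "up to 3 attempts" in the feature list


def process_with_retry(message: dict, handler: Callable[[dict], None],
                       dead_letters: List[dict]) -> str:
    """Run `handler` up to MAX_ATTEMPTS times; dead-letter the message if all fail."""
    last_error = ""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            handler(message)
            return "success"
        except Exception as exc:  # the sketch treats every error as retryable
            last_error = f"attempt {attempt}: {exc}"
    # Every attempt failed: record the error and park the message.
    dead_letters.append({"message": message, "error": last_error})
    return "failed"
```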
### 4. **Search Flow**:
- Search queries processed by FastAPI backend
- Vector similarity search performed against Qdrant vector database on a VM
- Results combined with metadata from Firestore
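
The search flow can be illustrated with plain Python. In this sketch, in-memory dictionaries stand in for Qdrant and Firestore, and cosine similarity is assumed as the distance metric (the README does not state which metric the collection uses). In the real system the ranking is done inside Qdrant.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: Sequence[float], vectors: Dict[str, Sequence[float]],
           metadata: Dict[str, dict], threshold: float = 0.7,
           limit: int = 10) -> List[dict]:
    """Rank stored vectors by similarity and attach metadata to each hit."""
    scored = [(cosine_similarity(query_vec, vec), image_id)
              for image_id, vec in vectors.items()]
    # Keep only hits at or above the threshold, best score first.
    hits = sorted((s for s in scored if s[0] >= threshold), reverse=True)
    return [{"id": image_id, "score": score, **metadata.get(image_id, {})}
            for score, image_id in hits[:limit]]
```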
## Technology Stack
- **FastAPI** - Web framework
- **Firestore** - Database
- **Google Cloud Storage** - Image storage
- **Google Pub/Sub** - Message queue for async processing
- **Google Cloud Functions** - Serverless image processing
- **Google Vertex AI Multimodal Embedding API** - AI-powered image analysis and embedding generation
- **Qdrant** - Self-hosted vector database for semantic search (on Google Compute Engine VM)
- **Google Compute Engine** - VM hosting for vector database
- **Pydantic** - Data validation
## **Vector Database Infrastructure**
### **Qdrant Vector Database VM**
The system includes a dedicated Google Compute Engine VM running the Qdrant vector database:
- **VM Specifications**: 2 vCPUs, 8GB RAM, 50GB disk (e2-standard-2)
- **Operating System**: Ubuntu 22.04 LTS
- **Vector Database**: Qdrant (latest version via Docker)
### **AI Embedding Model**
Uses Google's Vertex AI multimodal embedding model for generating high-quality image embeddings:
- **Model**: `multimodalembedding@001`
- **Provider**: Google Vertex AI
- **Type**: Multimodal embedding model (supports both images and text)
- **Output Dimensions**: **1408-dimensional vectors**
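
A rough sizing estimate follows from the vector dimension. The sketch assumes vectors are stored as 4-byte floats and counts only the raw vectors, ignoring index and payload overhead, so actual usage will be higher.

```python
EMBEDDING_DIM = 1408      # from the model description above
BYTES_PER_FLOAT32 = 4     # assumption: vectors stored as float32


def raw_vector_bytes(num_images: int) -> int:
    """Raw size of the vectors alone, excluding index and payload overhead."""
    return num_images * EMBEDDING_DIM * BYTES_PER_FLOAT32


print(raw_vector_bytes(1))                   # 5632 bytes per image
print(raw_vector_bytes(1_000_000) / 2**30)   # about 5.25 GiB for one million images
```

Under that assumption, the raw vectors for roughly a million images would occupy most of the VM's 8GB of RAM if kept fully in memory.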
## Setup and Installation
### Prerequisites
- Python 3.8+
- Google Cloud account with Firestore, Storage, Pub/Sub, Cloud Functions, Compute Engine, Vision API, and Vertex AI enabled
- Terraform (for infrastructure deployment)
### Installation
1. Clone the repository:
```bash
git clone {repo-url}
cd {repo-name}
```
2. Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate # Linux/macOS
# venv\Scripts\activate  # Windows (run this instead of the line above)
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```
4. Create a `.env` file with the following environment variables:
```
# Project Environment Variables
ENVIRONMENT=development
LOG_LEVEL=DEBUG
# CORS settings - Must be a valid JSON list of strings or comma-separated values
CORS_ORIGINS=["*"]
# Firestore settings
FIRESTORE_PROJECT_ID=gen-lang-client-0424120530
FIRESTORE_DATABASE_NAME=sereact-imagedb
FIRESTORE_CREDENTIALS_FILE=firestore-credentials.json
# Google Cloud Storage settings
GCS_BUCKET_NAME=sereact-images
GCS_CREDENTIALS_FILE=firestore-credentials.json
# Security settings
API_KEY_SECRET=super-secret-key
API_KEY_EXPIRY_DAYS=365
# Vector Database settings
QDRANT_PORT=6333
QDRANT_HTTPS=false
QDRANT_PREFER_GRPC=false
```
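
The `CORS_ORIGINS` comment above allows two formats. A minimal sketch of a parser that accepts both is shown below; `parse_cors_origins` is a hypothetical helper, and the project's config module may handle this differently.

```python
import json
from typing import List


def parse_cors_origins(raw: str) -> List[str]:
    """Accept either a JSON list of strings or a comma-separated string."""
    raw = raw.strip()
    if raw.startswith("["):
        origins = json.loads(raw)
        if not (isinstance(origins, list)
                and all(isinstance(o, str) for o in origins)):
            raise ValueError("CORS_ORIGINS must be a JSON list of strings")
        return origins
    # Fall back to comma-separated values, dropping empty entries.
    return [part.strip() for part in raw.split(",") if part.strip()]
```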
5. **Deploy Infrastructure**
```bash
./deployment/deploy.sh --build --deploy
python ./scripts/seed_firestore.py
```
6. **Destroy Infrastructure**
```bash
./deployment/deploy.sh --destroy
```
7. **Local Development**
```bash
./scripts/start.sh
```
8. **Local Testing**
## API Endpoints
The API provides the following main endpoints with their authentication and pagination support:
### 🔓 **Public Endpoints (No Authentication Required)**
#### Authentication & API Key Management
- `/api/v1/auth/api-keys` (POST) - Create new API key (requires `user_id` and `team_id` parameters)
#### Team Management
- `/api/v1/teams/*` - **Complete team management (no authentication required)**
  - `POST /api/v1/teams` - Create new team
  - `GET /api/v1/teams` - List all teams with **pagination support**
    - **Query Parameters:**
      - `skip` (default: 0, min: 0) - Number of items to skip
      - `limit` (default: 50, min: 1, max: 100) - Number of items per page
    - **Response includes:** `teams`, `total`, `skip`, `limit`
  - `GET /api/v1/teams/{team_id}` - Get team by ID
  - `PUT /api/v1/teams/{team_id}` - Update team
  - `DELETE /api/v1/teams/{team_id}` - Delete team
#### User Management
- `/api/v1/users/*` - **Complete user management (no authentication required)**
  - `POST /api/v1/users` - Create new user (requires `team_id`)
  - `GET /api/v1/users` - List users with **pagination support**
    - **Query Parameters:**
      - `skip` (default: 0, min: 0) - Number of items to skip
      - `limit` (default: 50, min: 1, max: 100) - Number of items per page
      - `team_id` (optional) - Filter by team
    - **Response includes:** `users`, `total`, `skip`, `limit`
  - `GET /api/v1/users/{user_id}` - Get user by ID
  - `PUT /api/v1/users/{user_id}` - Update user
  - `DELETE /api/v1/users/{user_id}` - Delete user
  - `GET /api/v1/users/me?user_id={id}` - Get user info (requires `user_id` parameter)
  - `PUT /api/v1/users/me?user_id={id}` - Update user info (requires `user_id` parameter)
### 🔐 **Protected Endpoints (API Key Authentication Required)**
#### API Key Management (Authenticated)
- `/api/v1/auth/api-keys` (GET) - List API keys for current user with **pagination support**
  - **Query Parameters:**
    - `skip` (default: 0, min: 0) - Number of items to skip
    - `limit` (default: 50, min: 1, max: 100) - Number of items per page
  - **Response includes:** `api_keys`, `total`, `skip`, `limit`
- `/api/v1/auth/api-keys/{key_id}` (DELETE) - Revoke API key
- `/api/v1/auth/admin/api-keys/{user_id}` (POST) - Create API key for another user (admin only)
- `/api/v1/auth/verify` - Verify current authentication
#### Image Management ✅ **Fully Paginated & Protected**
- `/api/v1/images/*` - **Image upload, download, and management (with async processing)**
  - `GET /api/v1/images` - List images with **full pagination support**
    - **Query Parameters:**
      - `skip` (default: 0, min: 0) - Number of items to skip
      - `limit` (default: 50, min: 1, max: 100) - Number of items per page
      - `collection_id` (optional) - Filter by collection
    - **Response includes:** `images`, `total`, `skip`, `limit`
#### Search Functionality ✅ **Fully Paginated & Protected**
- `/api/v1/search/*` - **Image search functionality (semantic search via Qdrant)**
  - `GET /api/v1/search` - Search images with **pagination support**
    - **Query Parameters:**
      - `q` (required) - Search query
      - `skip` (default: 0, min: 0) - Number of items to skip
      - `limit` (default: 10, min: 1, max: 50) - Number of results
      - `similarity_threshold` (default: 0.7, min: 0.0, max: 1.0) - Similarity threshold
      - `collection_id` (optional) - Filter by collection
    - **Response includes:** `results`, `total`, `skip`, `limit`, `similarity_threshold`, `query`
  - `POST /api/v1/search` - Advanced search with same pagination
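
As an illustration of the `GET /api/v1/search` parameters, the sketch below builds (but does not send) a request using only the standard library. The `X-API-Key` header name and the `build_search_request` helper are assumptions; the Swagger UI at `/docs` is the authority on how the key is actually passed.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(base_url: str, api_key: str, q: str,
                         skip: int = 0, limit: int = 10,
                         similarity_threshold: float = 0.7) -> Request:
    """Build (but do not send) a GET request for /api/v1/search."""
    query = urlencode({"q": q, "skip": skip, "limit": limit,
                       "similarity_threshold": similarity_threshold})
    # "X-API-Key" is an assumed header name; check /docs for the real one.
    return Request(f"{base_url}/api/v1/search?{query}",
                   headers={"X-API-Key": api_key}, method="GET")
```

Passing the returned object to `urllib.request.urlopen` would send the request to a running instance.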
### 🔑 **Authentication Model**
The API uses a **hybrid authentication model**:
1. **Public Management Endpoints**: Users, teams, and API key creation are **publicly accessible** for easy integration and setup
2. **Protected Data Endpoints**: Image storage and search require **API key authentication**
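
The split between public and protected routes can be expressed as a small predicate. This is a sketch based only on the endpoint lists in this README; `requires_api_key` is hypothetical and is not how the auth module is necessarily implemented.

```python
PUBLIC_PREFIXES = ("/api/v1/teams", "/api/v1/users")


def requires_api_key(method: str, path: str) -> bool:
    """Return True if an /api/v1 route is listed as protected in this README."""
    if path.startswith(PUBLIC_PREFIXES):
        return False
    # Creating a key is public; listing or revoking keys is not.
    if method.upper() == "POST" and path == "/api/v1/auth/api-keys":
        return False
    return True
```

Note that the same path, `/api/v1/auth/api-keys`, is public for `POST` and protected for `GET`, so the decision has to consider the HTTP method as well as the path.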
### **Authentication & Pagination Status**
| Endpoint Category | Authentication | Pagination Status | Notes |
|------------------|----------------|------------------|-------|
| **Users Management** | 🔓 **Public** | ✅ **Fully Implemented** | `skip`, `limit`, `total` with team filtering |
| **Teams Management** | 🔓 **Public** | ✅ **Fully Implemented** | `skip`, `limit`, `total` with proper validation |
| **API Key Creation** | 🔓 **Public** | N/A | Requires `user_id` and `team_id` parameters |
| **Images API** | 🔐 **Protected** | ✅ **Fully Implemented** | `skip`, `limit`, `total` with proper validation |
| **Search API** | 🔐 **Protected** | ✅ **Fully Implemented** | `skip`, `limit`, `total` with similarity scoring |
| **API Key Management** | 🔐 **Protected** | ✅ **Fully Implemented** | `skip`, `limit`, `total` for user's API keys |
**Note:** All list endpoints implement consistent pagination with the `skip` and `limit` parameters.
Refer to the Swagger UI documentation at `/docs` for detailed endpoint information.
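
The `skip`/`limit` contract shared by the list endpoints can be sketched as a single helper. `paginate` is a hypothetical function operating on an in-memory list; the real endpoints query Firestore or Qdrant instead.

```python
from typing import Dict, List, Sequence


def paginate(items: Sequence[dict], key: str, skip: int = 0,
             limit: int = 50, max_limit: int = 100) -> Dict[str, object]:
    """Validate skip/limit and return one page in the documented shape."""
    if skip < 0:
        raise ValueError("skip must be >= 0")
    if not 1 <= limit <= max_limit:
        raise ValueError(f"limit must be between 1 and {max_limit}")
    page: List[dict] = list(items[skip:skip + limit])
    # `total` counts all matching items, not just the current page.
    return {key: page, "total": len(items), "skip": skip, "limit": limit}
```

In a FastAPI route the same bounds would typically be declared on the parameters themselves, for example `Query(0, ge=0)` for `skip` and `Query(50, ge=1, le=100)` for `limit`.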
## Development
### Running Tests
```bash
# Git Bash on Windows; on Linux/macOS activate with `source venv/bin/activate`
source venv/Scripts/activate && python scripts/run_tests.py all
```
## API Modules Architecture
The SEREACT API is organized into the following key modules to ensure separation of concerns and maintainable code:
```
src/
├── api/                   # API endpoints and routers
│   └── v1/                # API version 1 routes
├── auth/                  # Authentication and authorization
├── config/                # Configuration management
├── models/                # Database models
├── services/              # Business logic services
│   └── vector_db.py       # **Qdrant vector database service**
└── utils/                 # Utility functions
```
### Module Responsibilities
#### Router Module
- Defines API endpoints and routes
- Handles HTTP requests and responses
- Validates incoming request data
- Directs requests to appropriate services
- Implements API versioning
#### Auth Module
- Manages user authentication
- Handles API key validation and verification
- Implements role-based access control
- Provides security middleware
- Manages user sessions and tokens
#### Services Module
- Contains core business logic
- Orchestrates operations across multiple resources
- Implements domain-specific rules and workflows
- Integrates with external services (Vertex AI, Cloud Storage, **Qdrant**)
- Handles image processing and embedding generation
#### Models Module
- Defines data structures and schemas
- Provides database entity representations
- Handles data validation and serialization
- Implements data relationships and constraints
- Manages database migrations
#### Utils Module
- Provides helper functions and utilities
- Implements common functionality used across modules
- Handles error processing and logging
- Provides formatting and conversion utilities
- Implements reusable middleware components
#### Config Module
- Manages application configuration
- Handles environment variable loading
- Provides centralized settings management
- Configures service connections and credentials
- Defines application constants and defaults
### Module Interactions
```
┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│             │         │             │         │             │
│   Router    │ ───────▶│  Services   │ ◀───────│   Config    │
│   Module    │         │   Module    │         │   Module    │
│             │         │             │         │             │
└──────┬──────┘         └──────┬──────┘         └─────────────┘
       │                       │
       │                       │
       ▼                       ▼
┌─────────────┐         ┌─────────────┐
│             │         │             │
│    Auth     │         │   Models    │
│   Module    │         │   Module    │
│             │         │             │
└──────┬──────┘         └──────┬──────┘
       │                       │
       │                       │
       └───────────────────────┘
            ┌─────────────┐
            │             │
            │    Utils    │
            │   Module    │
            │             │
            └─────────────┘
```
The modules interact in the following ways:
- **Request Flow**:
- Client request arrives at the Router Module
- Auth Module validates the request authentication
- Router delegates to appropriate Service functions
- Service uses Models to interact with the database
- **Service integrates with Qdrant Vector Database for similarity search**
- Service returns data to Router which formats the response
- **Cross-Cutting Concerns**:
- Config Module provides settings to all other modules
- Utils Module provides helper functions across the application
- Auth Module secures access to routes and services
- **Dependency Direction**:
- Router depends on Services and Auth
- Services depend on Models and Config
- Models depend on Utils for helper functions
- Auth depends on Models for user information
- All modules may use Utils and Config
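
The dependency direction listed above implies that the module graph has no cycles. A minimal sketch of how that property could be checked is shown below; the graph is transcribed from the list, and `has_cycle` is an illustrative helper, not part of the codebase.

```python
# Edges copied from the "Dependency Direction" list; Utils and Config are
# leaves that any module may use.
DEPENDS_ON = {
    "router": {"services", "auth"},
    "services": {"models", "config"},
    "models": {"utils"},
    "auth": {"models"},
    "utils": set(),
    "config": set(),
}


def has_cycle(graph: dict) -> bool:
    """Depth-first search that reports whether any dependency cycle exists."""
    visiting, done = set(), set()

    def visit(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node that is still on the current path
        visiting.add(node)
        cyclic = any(visit(dep) for dep in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return cyclic

    return any(visit(node) for node in graph)
```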
This modular architecture provides several benefits:
- **Maintainability**: Changes in one module have minimal impact on others
- **Testability**: Modules can be tested in isolation with mocked dependencies
- **Scalability**: New features can be added by extending existing modules
- **Reusability**: Common functionality is centralized for consistent implementation
- **Security**: Authentication and authorization are handled consistently
## TODO
### High Priority
- [ ] Thumbnail generation
- [ ] Secret management
- [ ] Scale Vector DB to multiple nodes
### Medium Priority
- [ ] Implement caching layer for frequently accessed embeddings
- [ ] Implement caching for frequently accessed data
### Low Priority
- [ ] Move all auth logic to auth module
- [ ] Move cloud function code to src folder and reuse code with embedding service
- [ ] Remove Pinecone integration