Image Management API

A secure API for storing, organizing, and retrieving images with advanced search capabilities powered by AI-generated embeddings.

Features

  • Secure image storage in Google Cloud Storage
  • Team-based organization and access control
  • Hybrid authentication model: public management endpoints plus API-key-protected data endpoints
  • Asynchronous image processing with Pub/Sub and Cloud Functions
  • AI-powered image embeddings using Google Vertex AI Multimodal Embedding API
  • Semantic search using vector similarity with Qdrant Vector Database
  • Self-hosted vector database on Google Compute Engine VM
  • Automatic retry mechanism for failed processing (up to 3 attempts)
  • Metadata extraction and storage
  • Image processing capabilities
  • Multi-team support
  • Public user and team management APIs for easy integration
  • Comprehensive E2E testing with real database support

Architecture

root/
  ├── images/                    # Sample images for testing
  ├── deployment/                # Deployment configurations
  │   ├── cloud-function/        # **Cloud Function for image processing**
  │   ├── cloud-run/             # Google Cloud Run configuration
  │   └── terraform/             # Infrastructure as code
  │       ├── vm.tf              # **Vector database VM configuration**
  │       └── scripts/           # **VM installation scripts**
  ├── docs/                      # Documentation
  │   ├── api/                   # API documentation
  │   └── TESTING.md             # Comprehensive testing guide
  ├── scripts/                   # Utility scripts
  ├── src/                       # Source code
  │   ├── api/                   # API endpoints and routers
  │   │   └── v1/                # API version 1 routes
  │   ├── auth/                  # Authentication and authorization
  │   ├── config/                # Configuration management
  │   ├── db/                    # Database layer
  │   │   ├── providers/         # Database providers (Firestore)
  │   │   └── repositories/      # Data access repositories
  │   ├── models/                # Database models
  │   ├── schemas/               # API request/response schemas
  │   ├── services/              # Business logic services
  │   │   ├── pubsub_service.py  # **Pub/Sub message publishing**
  │   │   └── vector_db.py       # **Qdrant vector database service**
  │   └── utils/                 # Utility functions
  ├── tests/                     # Test code
  │   ├── api/                   # API tests
  │   ├── auth/                  # Authentication tests
  │   ├── models/                # Model tests
  │   ├── services/              # Service tests
  │   ├── integration/           # Integration tests
  │   │   └── test_cloud_function.py  # **Cloud Function tests**
  │   └── test_e2e.py           # **Comprehensive E2E workflow tests**
  ├── main.py                    # Application entry point
  ├── requirements.txt           # Python dependencies
  └── README.md                  # This file

System Architecture

┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│             │         │             │         │             │
│  FastAPI    │ ───────▶│  Firestore  │◀────────│  Cloud      │
│  Backend    │         │  Database   │         │  Functions  │
│             │         │             │         │             │
└─────┬───────┘         └─────────────┘         └──────┬──────┘
      │                                                │
      │                                                │
      ▼                                                │
┌─────────────┐         ┌─────────────┐                │
│             │         │             │                │
│  Cloud      │         │  Pub/Sub    │                │
│  Storage    │────────▶│  Queue      │────────────────┘
│             │         │             │
└─────────────┘         └─────────────┘
                               │
                               │
                               ▼
                        ┌─────────────┐         ┌─────────────┐
                        │             │         │             │
                        │  Vertex AI  │         │  Qdrant     │
                        │  Embeddings │────────▶│  Vector DB  │
                        │             │         │  (VM)       │
                        └─────────────┘         └─────────────┘

Image Processing Workflow

1. Image Upload Flow:

  • User uploads image through FastAPI backend
  • Image is stored in Google Cloud Storage
  • Image metadata is saved to Firestore with embedding_status: "pending"
  • Pub/Sub message is published to trigger async processing
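
The upload steps above can be sketched as a single function. This is a minimal illustration, not the project's code: `handle_upload`, the `store`, `save_metadata`, and `publish` callables, the storage path layout, and the message fields are all hypothetical stand-ins for the Cloud Storage, Firestore, and Pub/Sub clients.

```python
import json
import uuid
from typing import Callable, Dict


def handle_upload(
    data: bytes,
    filename: str,
    team_id: str,
    store: Callable[[str, bytes], None],
    save_metadata: Callable[[str, Dict], None],
    publish: Callable[[bytes], None],
) -> Dict:
    """Store the image, record it as pending, then queue it for embedding."""
    image_id = uuid.uuid4().hex
    storage_path = f"{team_id}/{image_id}/{filename}"  # hypothetical layout

    # 1. Write the raw bytes to object storage.
    store(storage_path, data)

    # 2. Save metadata with the initial status named in the workflow.
    metadata = {
        "id": image_id,
        "team_id": team_id,
        "filename": filename,
        "storage_path": storage_path,
        "embedding_status": "pending",
    }
    save_metadata(image_id, metadata)

    # 3. Publish a small message; the worker downloads the image itself.
    message = {"image_id": image_id, "storage_path": storage_path, "team_id": team_id}
    publish(json.dumps(message).encode("utf-8"))
    return metadata
```

Passing the three clients in as callables keeps the sketch runnable without Google Cloud credentials; in tests, plain dicts and lists can stand in for them.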

2. Embedding Generation Flow (Asynchronous):

  • Cloud Function is triggered by Pub/Sub message
  • Function updates image status to "processing"
  • Function downloads image from Cloud Storage
  • Function calls Google Vertex AI Multimodal Embedding API to generate 1408-dimensional embeddings
  • Embeddings are stored in Qdrant Vector Database on dedicated VM
  • Firestore is updated with embedding info and status: "success"
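
The status values used in this workflow ("pending", "processing", "success", "failed") form a small state machine. The sketch below is one way to guard updates to embedding_status; the `ALLOWED` transition table and the `transition` helper are assumptions, including the choice to let a failed image re-enter "processing" on retry.

```python
from enum import Enum


class EmbeddingStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    SUCCESS = "success"
    FAILED = "failed"


# Assumed transition table. The failed -> processing edge models a retry.
ALLOWED = {
    EmbeddingStatus.PENDING: {EmbeddingStatus.PROCESSING},
    EmbeddingStatus.PROCESSING: {EmbeddingStatus.SUCCESS, EmbeddingStatus.FAILED},
    EmbeddingStatus.FAILED: {EmbeddingStatus.PROCESSING},
    EmbeddingStatus.SUCCESS: set(),
}


def transition(current: EmbeddingStatus, new: EmbeddingStatus) -> EmbeddingStatus:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new
```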

3. Error Handling & Retry:

  • Failed processing updates status to "failed" with error message
  • Automatic retry up to 3 times using Pub/Sub retry policy
  • Dead letter queue for permanently failed messages
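
The retry behaviour can be simulated in-process as follows. In the deployed system redelivery is handled by Pub/Sub rather than by a loop; `process_with_retry`, the `dead_letters` list, and the reading of "up to 3" as three attempts in total are assumptions made for illustration.

```python
from typing import Callable, Dict, List

MAX_ATTEMPTS = 3  # assumed to mean three attempts in total


def process_with_retry(
    message: Dict,
    process: Callable[[Dict], None],
    dead_letters: List[Dict],
) -> Dict:
    """Call `process` until it succeeds or MAX_ATTEMPTS is exhausted."""
    last_error = ""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            process(message)
            return {"status": "success", "attempts": attempt}
        except Exception as exc:  # a real worker would catch narrower errors
            last_error = str(exc)
    # Every attempt failed: park the message instead of retrying forever.
    dead_letters.append(message)
    return {"status": "failed", "attempts": MAX_ATTEMPTS, "error": last_error}
```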

4. Search Flow:

  • Search queries are processed by the FastAPI backend
  • A vector similarity search is performed against the Qdrant vector database on the VM
  • Results are combined with metadata from Firestore
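
To make the search steps concrete, here is a pure-Python sketch of similarity ranking followed by a metadata join. It stands in for what Qdrant does at scale: the `search` function, the in-memory `vectors` and `metadata` dicts, and the use of cosine similarity are assumptions, while the `limit` and `similarity_threshold` defaults mirror the search endpoint's documented defaults.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query_vector: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    metadata: Dict[str, Dict],
    limit: int = 10,
    similarity_threshold: float = 0.7,
) -> List[Dict]:
    """Rank stored vectors against the query and attach stored metadata."""
    scored = [
        (cosine_similarity(query_vector, vector), image_id)
        for image_id, vector in vectors.items()
    ]
    # Keep only hits at or above the threshold, best score first.
    hits = sorted(
        (pair for pair in scored if pair[0] >= similarity_threshold), reverse=True
    )[:limit]
    return [
        {**metadata.get(image_id, {}), "id": image_id, "score": score}
        for score, image_id in hits
    ]
```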

Technology Stack

  • FastAPI - Web framework
  • Firestore - Database
  • Google Cloud Storage - Image storage
  • Google Pub/Sub - Message queue for async processing
  • Google Cloud Functions - Serverless image processing
  • Google Vertex AI Multimodal Embedding API - AI-powered image analysis and embedding generation
  • Qdrant - Self-hosted vector database for semantic search (on Google Compute Engine VM)
  • Google Compute Engine - VM hosting for vector database
  • Pydantic - Data validation

Vector Database Infrastructure

Qdrant Vector Database VM

The system includes a dedicated Google Compute Engine VM running the Qdrant vector database:

  • VM Specifications: 2 vCPUs, 8GB RAM, 50GB disk (e2-standard-2)
  • Operating System: Ubuntu 22.04 LTS
  • Vector Database: Qdrant (latest version via Docker)

AI Embedding Model

The system uses Google's Vertex AI multimodal embedding model to generate image embeddings:

  • Model: multimodalembedding@001
  • Provider: Google Vertex AI
  • Type: Multimodal embedding model (supports both images and text)
  • Output Dimensions: 1408-dimensional vectors
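
Because the model emits 1408-dimensional vectors, the Qdrant collection has to be created with a matching vector size. The sketch below builds, without sending, the request for Qdrant's REST "create collection" call (PUT /collections/{collection_name}). The host address, the collection name, and the Cosine distance metric are assumptions; the port and HTTPS defaults follow the QDRANT_PORT and QDRANT_HTTPS values in the sample .env.

```python
import json
import urllib.request

EMBEDDING_DIM = 1408  # output size of multimodalembedding@001


def build_create_collection_request(
    host: str, collection: str, port: int = 6333, https: bool = False
) -> urllib.request.Request:
    """Build (but do not send) a Qdrant 'create collection' REST request."""
    scheme = "https" if https else "http"
    # Distance metric is an assumption; the README does not state which one is used.
    body = {"vectors": {"size": EMBEDDING_DIM, "distance": "Cosine"}}
    return urllib.request.Request(
        url=f"{scheme}://{host}:{port}/collections/{collection}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request to a running Qdrant instance.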

Setup and Installation

Prerequisites

  • Python 3.8+
  • Google Cloud account with Firestore, Storage, Pub/Sub, Cloud Functions, Compute Engine, and the Vertex AI API enabled
  • Terraform (for infrastructure deployment)

Installation

  1. Clone the repository:

    git clone {repo-url}
    cd {repo-name}
    
  2. Create and activate a virtual environment:

    python -m venv venv
    source venv/bin/activate  # Linux/macOS
    venv\Scripts\activate     # Windows
    
  3. Install dependencies:

    pip install -r requirements.txt
    
  4. Create a .env file with the following environment variables:

    # Project Environment Variables
    ENVIRONMENT=development
    LOG_LEVEL=DEBUG

    # CORS settings - Must be a valid JSON list of strings or comma-separated values
    CORS_ORIGINS=["*"]

    # Firestore settings
    FIRESTORE_PROJECT_ID=gen-lang-client-0424120530
    FIRESTORE_DATABASE_NAME=sereact-imagedb
    FIRESTORE_CREDENTIALS_FILE=firestore-credentials.json

    # Google Cloud Storage settings
    GCS_BUCKET_NAME=sereact-images
    GCS_CREDENTIALS_FILE=firestore-credentials.json

    # Security settings
    API_KEY_SECRET=super-secret-key
    API_KEY_EXPIRY_DAYS=365

    # Vector Database settings
    QDRANT_PORT=6333
    QDRANT_HTTPS=false
    QDRANT_PREFER_GRPC=false
    
  5. Deploy Infrastructure

    ./deployment/deploy.sh --build --deploy
    python ./scripts/seed_firestore.py
    
  6. Destroy Infrastructure

    ./deployment/deploy.sh --destroy
    
  7. Local Development

    ./scripts/start.sh
    
  8. Local Testing

API Endpoints

The API provides the following main endpoints with their authentication and pagination support:

🔓 Public Endpoints (No Authentication Required)

Authentication & API Key Management

  • /api/v1/auth/api-keys (POST) - Create new API key (requires user_id and team_id parameters)

Team Management

  • /api/v1/teams/* - Complete team management (no authentication required)
    • POST /api/v1/teams - Create new team
    • GET /api/v1/teams - List all teams (no pagination - returns all teams)
    • GET /api/v1/teams/{team_id} - Get team by ID
    • PUT /api/v1/teams/{team_id} - Update team
    • DELETE /api/v1/teams/{team_id} - Delete team

User Management

  • /api/v1/users/* - Complete user management (no authentication required)
    • POST /api/v1/users - Create new user (requires team_id)
    • GET /api/v1/users - List users (no pagination - returns all users, optionally filtered by team)
    • GET /api/v1/users/{user_id} - Get user by ID
    • PUT /api/v1/users/{user_id} - Update user
    • DELETE /api/v1/users/{user_id} - Delete user
    • GET /api/v1/users/me?user_id={id} - Get user info (requires user_id parameter)
    • PUT /api/v1/users/me?user_id={id} - Update user info (requires user_id parameter)

🔐 Protected Endpoints (API Key Authentication Required)

API Key Management (Authenticated)

  • /api/v1/auth/api-keys (GET) - List API keys for current user
  • /api/v1/auth/api-keys/{key_id} (DELETE) - Revoke API key
  • /api/v1/auth/admin/api-keys/{user_id} (POST) - Create API key for another user (admin only)
  • /api/v1/auth/verify - Verify current authentication

Image Management (Fully Paginated & Protected)

  • /api/v1/images/* - Image upload, download, and management (with async processing)
    • GET /api/v1/images - List images with full pagination support
      • Query Parameters:
        • skip (default: 0, min: 0) - Number of items to skip
        • limit (default: 50, min: 1, max: 100) - Number of items per page
        • collection_id (optional) - Filter by collection
      • Response includes: images, total, skip, limit
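
The pagination contract for GET /api/v1/images can be sketched with a plain function. The bounds and response keys come from the parameter list above; the `list_images` name and the in-memory list of image dicts are assumptions standing in for the real repository query.

```python
from typing import Dict, List, Optional


def list_images(
    images: List[Dict],
    skip: int = 0,
    limit: int = 50,
    collection_id: Optional[str] = None,
) -> Dict:
    """Validate skip/limit, filter by collection, and return one page."""
    if skip < 0:
        raise ValueError("skip must be >= 0")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if collection_id is not None:
        images = [img for img in images if img.get("collection_id") == collection_id]
    return {
        "images": images[skip : skip + limit],
        "total": len(images),  # size of the filtered set, not of the page
        "skip": skip,
        "limit": limit,
    }
```

In a FastAPI route the same bounds would typically be declared on the query parameters (for example with `Query(50, ge=1, le=100)`) rather than checked by hand.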

Search Functionality (Fully Paginated & Protected)

  • /api/v1/search/* - Image search functionality (semantic search via Qdrant)
    • GET /api/v1/search - Search images with pagination support
      • Query Parameters:
        • q (required) - Search query
        • limit (default: 10, min: 1, max: 50) - Number of results
        • similarity_threshold (default: 0.7, min: 0.0, max: 1.0) - Similarity threshold
        • collection_id (optional) - Filter by collection
      • Response includes: results, total, limit, similarity_threshold, query
    • POST /api/v1/search - Advanced search with same pagination
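
From the client side, a GET search request is just a URL carrying these query parameters. The helper below is a hypothetical convenience function; the base URL is an assumption, and sending the request would also require an API key in whatever header the deployment expects, which this README does not specify.

```python
from typing import Optional
from urllib.parse import urlencode


def build_search_url(
    base_url: str,
    q: str,
    limit: int = 10,
    similarity_threshold: float = 0.7,
    collection_id: Optional[str] = None,
) -> str:
    """Build a GET /api/v1/search URL, enforcing the documented bounds."""
    if not q:
        raise ValueError("q is required")
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    if not 0.0 <= similarity_threshold <= 1.0:
        raise ValueError("similarity_threshold must be between 0.0 and 1.0")
    params = {"q": q, "limit": limit, "similarity_threshold": similarity_threshold}
    if collection_id is not None:
        params["collection_id"] = collection_id
    return f"{base_url.rstrip('/')}/api/v1/search?{urlencode(params)}"
```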

🔑 Authentication Model

A hybrid authentication model:

  1. Public Management Endpoints: Users, teams, and API key creation are publicly accessible for easy integration and setup
  2. Protected Data Endpoints: Image storage and search require API key authentication
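
The README does not describe how API keys are generated or checked, so the following is only one plausible scheme: a random key is handed to the client once, and the server stores an HMAC-SHA256 digest of it together with an expiry time. All function names and the record layout are hypothetical; the `secret` and `expiry_days` arguments correspond to the API_KEY_SECRET and API_KEY_EXPIRY_DAYS settings in the sample .env.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple


def hash_api_key(raw_key: str, secret: str) -> str:
    """Keyed digest of the raw key, so the raw key never needs to be stored."""
    return hmac.new(secret.encode(), raw_key.encode(), hashlib.sha256).hexdigest()


def create_api_key(
    secret: str, expiry_days: int = 365, now: Optional[datetime] = None
) -> Tuple[str, Dict]:
    """Return the raw key (shown to the user once) and the record to persist."""
    if now is None:
        now = datetime.now(timezone.utc)
    raw_key = secrets.token_urlsafe(32)
    record = {
        "key_hash": hash_api_key(raw_key, secret),
        "expires_at": now + timedelta(days=expiry_days),
    }
    return raw_key, record


def verify_api_key(
    raw_key: str, record: Dict, secret: str, now: Optional[datetime] = None
) -> bool:
    """Reject expired keys, then compare digests in constant time."""
    if now is None:
        now = datetime.now(timezone.utc)
    if now >= record["expires_at"]:
        return False
    return hmac.compare_digest(hash_api_key(raw_key, secret), record["key_hash"])
```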

Authentication & Pagination Status

| Endpoint Category  | Authentication | Pagination Status | Notes                                      |
|--------------------|----------------|-------------------|--------------------------------------------|
| Users Management   | 🔓 Public      | Not Implemented   | Complete CRUD operations, no auth required |
| Teams Management   | 🔓 Public      | Not Implemented   | Complete CRUD operations, no auth required |
| API Key Creation   | 🔓 Public      | N/A               | Requires user_id and team_id parameters    |
| Images API         | 🔐 Protected   | Fully Implemented | skip, limit, total with proper validation  |
| Search API         | 🔐 Protected   | Fully Implemented | limit, total with similarity scoring       |
| API Key Management | 🔐 Protected   | Not Implemented   | List/revoke existing keys (small datasets) |

Note: Public endpoints (users, teams) don't implement pagination as they typically return small datasets and are designed for management use cases where full data visibility is preferred.

Refer to the Swagger UI documentation at /docs for detailed endpoint information.

Development

Running Tests

source venv/bin/activate && python scripts/run_tests.py all      # Linux/macOS
source venv/Scripts/activate && python scripts/run_tests.py all  # Windows (Git Bash)

API Modules Architecture

The SEREACT API is organized into the following key modules to ensure separation of concerns and maintainable code:

src/
  ├── api/             # API endpoints and routers
  │   └── v1/          # API version 1 routes
  ├── auth/            # Authentication and authorization
  ├── config/          # Configuration management
  ├── models/          # Database models
  ├── services/        # Business logic services
  │   └── vector_db.py # **Qdrant vector database service**
  └── utils/           # Utility functions

Module Responsibilities

Router Module

  • Defines API endpoints and routes
  • Handles HTTP requests and responses
  • Validates incoming request data
  • Directs requests to appropriate services
  • Implements API versioning

Auth Module

  • Manages user authentication
  • Handles API key validation and verification
  • Implements role-based access control
  • Provides security middleware
  • Manages user sessions and tokens

Services Module

  • Contains core business logic
  • Orchestrates operations across multiple resources
  • Implements domain-specific rules and workflows
  • Integrates with external services (Vertex AI, Cloud Storage, Qdrant)
  • Handles image processing and embedding generation

Models Module

  • Defines data structures and schemas
  • Provides database entity representations
  • Handles data validation and serialization
  • Implements data relationships and constraints
  • Manages database migrations

Utils Module

  • Provides helper functions and utilities
  • Implements common functionality used across modules
  • Handles error processing and logging
  • Provides formatting and conversion utilities
  • Implements reusable middleware components

Config Module

  • Manages application configuration
  • Handles environment variable loading
  • Provides centralized settings management
  • Configures service connections and credentials
  • Defines application constants and defaults
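
As an example of environment variable handling, the sample .env notes that CORS_ORIGINS may be either a JSON list of strings or comma-separated values. A parser accepting both forms might look like this; `parse_cors_origins` is a hypothetical helper, not necessarily how the Config Module implements it.

```python
import json
from typing import List


def parse_cors_origins(raw: str) -> List[str]:
    """Accept '["a", "b"]' (JSON list) or 'a, b' (comma-separated)."""
    raw = raw.strip()
    if raw.startswith("["):
        parsed = json.loads(raw)
        if not isinstance(parsed, list) or not all(isinstance(o, str) for o in parsed):
            raise ValueError("CORS_ORIGINS must be a JSON list of strings")
        return parsed
    # Fall back to comma-separated values, dropping empty entries.
    return [origin.strip() for origin in raw.split(",") if origin.strip()]
```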

Module Interactions

┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│             │         │             │         │             │
│  Router     │ ───────▶│  Services   │ ◀───────│  Config     │
│  Module     │         │  Module     │         │  Module     │
│             │         │             │         │             │
└──────┬──────┘         └──────┬──────┘         └─────────────┘
       │                       │
       │                       │
       ▼                       ▼
┌─────────────┐         ┌─────────────┐
│             │         │             │
│  Auth       │         │  Models     │
│  Module     │         │  Module     │
│             │         │             │
└──────┬──────┘         └──────┬──────┘
       │                       │
       │                       │
       └───────────────────────┘
                 │
                 ▼
          ┌─────────────┐
          │             │
          │  Utils      │
          │  Module     │
          │             │
          └─────────────┘

The modules interact in the following ways:

  • Request Flow:

    • Client request arrives at the Router Module
    • Auth Module validates the request authentication
    • Router delegates to appropriate Service functions
    • Service uses Models to interact with the database
    • Service integrates with Qdrant Vector Database for similarity search
    • Service returns data to Router which formats the response
  • Cross-Cutting Concerns:

    • Config Module provides settings to all other modules
    • Utils Module provides helper functions across the application
    • Auth Module secures access to routes and services
  • Dependency Direction:

    • Router depends on Services and Auth
    • Services depend on Models and Config
    • Models depend on Utils for helper functions
    • Auth depends on Models for user information
    • All modules may use Utils and Config

This modular architecture provides several benefits:

  • Maintainability: Changes in one module have minimal impact on others
  • Testability: Modules can be tested in isolation with mocked dependencies
  • Scalability: New features can be added by extending existing modules
  • Reusability: Common functionality is centralized for consistent implementation
  • Security: Authentication and authorization are handled consistently
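
The testability point can be illustrated with a service that depends on a repository interface rather than on Firestore directly. The class and method names below are hypothetical and not taken from the codebase; the sketch only shows how an in-memory double lets the business logic run without a database.

```python
from typing import Dict, List, Protocol


class ImageRepository(Protocol):
    """Interface the service needs; a Firestore-backed class would satisfy it."""

    def list_by_team(self, team_id: str) -> List[Dict]: ...


class ImageService:
    """Business logic that only knows about the repository interface."""

    def __init__(self, repository: ImageRepository) -> None:
        self._repository = repository

    def count_pending(self, team_id: str) -> int:
        images = self._repository.list_by_team(team_id)
        return sum(1 for image in images if image.get("embedding_status") == "pending")


class InMemoryImageRepository:
    """Test double standing in for the real database layer."""

    def __init__(self, images: List[Dict]) -> None:
        self._images = images

    def list_by_team(self, team_id: str) -> List[Dict]:
        return [image for image in self._images if image.get("team_id") == team_id]
```

A test can then construct `ImageService(InMemoryImageRepository([...]))` and assert on the result without any cloud resources.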

TODO

High Priority

  • Thumbnail generation
  • Secret management
  • Scale Vector DB to multiple nodes

Medium Priority

  • Implement caching layer for frequently accessed embeddings
  • Implement caching for frequently accessed data
  • Consider adding pagination to admin endpoints (users, teams, API keys) if datasets grow large

Low Priority

  • Move all auth logic to auth module
  • Move cloud function code to src folder and reuse code with embedding service
  • Remove Pinecone integration