Secure image storage in Google Cloud Storage
Team-based organization and access control
Hybrid authentication model: public management endpoints plus API-key-protected data endpoints
Asynchronous image processing with Pub/Sub and Cloud Functions
AI-powered image embeddings using Google Vertex AI Multimodal Embedding API
Semantic search using vector similarity with Qdrant Vector Database
Self-hosted vector database on Google Compute Engine VM
Automatic retry mechanism for failed processing (up to 3 attempts)
Metadata extraction and storage
Image processing capabilities
Multi-team support
Public user and team management APIs for easy integration
Comprehensive E2E testing with real database support

Architecture

root/
  ├── images/                    # Sample images for testing
  ├── deployment/                # Deployment configurations
  │   ├── cloud-function/        # **Cloud Function for image processing**
  │   ├── cloud-run/             # Google Cloud Run configuration
  │   └── terraform/             # Infrastructure as code
  │       ├── vm.tf              # **Vector database VM configuration**
  │       └── scripts/           # **VM installation scripts**
  ├── docs/                      # Documentation
  │   ├── api/                   # API documentation
  │   └── TESTING.md             # Comprehensive testing guide
  ├── scripts/                   # Utility scripts
  ├── src/                       # Source code
  │   ├── api/                   # API endpoints and routers
  │   │   └── v1/                # API version 1 routes
  │   ├── auth/                  # Authentication and authorization
  │   ├── config/                # Configuration management
  │   ├── db/                    # Database layer
  │   │   ├── providers/         # Database providers (Firestore)
  │   │   └── repositories/      # Data access repositories
  │   ├── models/                # Database models
  │   ├── schemas/               # API request/response schemas
  │   ├── services/              # Business logic services
  │   │   ├── pubsub_service.py  # **Pub/Sub message publishing**
  │   │   └── vector_db.py       # **Qdrant vector database service**
  │   └── utils/                 # Utility functions
  ├── tests/                     # Test code
  │   ├── api/                   # API tests
  │   ├── auth/                  # Authentication tests
  │   ├── models/                # Model tests
  │   ├── services/              # Service tests
  │   ├── integration/           # Integration tests
  │   │   └── test_cloud_function.py  # **Cloud Function tests**
  │   └── test_e2e.py            # **Comprehensive E2E workflow tests**
  ├── main.py                    # Application entry point
  ├── requirements.txt           # Python dependencies
  └── README.md                  # This file

System Architecture

┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│             │         │             │         │             │
│  FastAPI    │ ───────▶│  Firestore  │◀────────│  Cloud      │
│  Backend    │         │  Database   │         │  Functions  │
│             │         │             │         │             │
└─────┬───────┘         └─────────────┘         └──────┬──────┘
      │                                                │
      │                                                │
      ▼                                                │
┌─────────────┐         ┌─────────────┐                │
│             │         │             │                │
│  Cloud      │         │  Pub/Sub    │                │
│  Storage    │────────▶│  Queue      │────────────────┘
│             │         │             │
└─────────────┘         └─────────────┘
                               │
                               │
                               ▼
                        ┌─────────────┐         ┌─────────────┐
                        │             │         │             │
                        │  Vertex AI  │         │  Qdrant     │
                        │  Embeddings │────────▶│  Vector DB  │
                        │             │         │  (VM)       │
                        └─────────────┘         └─────────────┘

Image Processing Workflow

1. Image Upload Flow:

User uploads image through FastAPI backend
Image is stored in Google Cloud Storage
Image metadata is saved to Firestore with embedding_status: "pending"
Pub/Sub message is published to trigger async processing
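
The upload steps above can be sketched as a single function. This is a minimal illustration, not the project's actual code: `handle_upload`, the injected callables, the storage path layout, and the message fields are all hypothetical, and in-memory containers stand in for Cloud Storage, Firestore, and Pub/Sub.

```python
import uuid
from typing import Callable, Dict


def handle_upload(
    image_bytes: bytes,
    filename: str,
    team_id: str,
    store_blob: Callable[[str, bytes], None],
    save_metadata: Callable[[str, Dict], None],
    publish: Callable[[Dict], None],
) -> Dict:
    """Run the upload steps in order and return the saved metadata."""
    image_id = uuid.uuid4().hex
    storage_path = f"{team_id}/{image_id}/{filename}"  # hypothetical layout

    # 1. Store the raw bytes (Cloud Storage in the deployed system).
    store_blob(storage_path, image_bytes)

    # 2. Save metadata with the initial embedding status (Firestore).
    metadata = {
        "id": image_id,
        "team_id": team_id,
        "filename": filename,
        "storage_path": storage_path,
        "embedding_status": "pending",
    }
    save_metadata(image_id, metadata)

    # 3. Publish a message so processing happens asynchronously (Pub/Sub).
    publish({"image_id": image_id, "storage_path": storage_path})
    return metadata


# In-memory stand-ins for the three backing services.
blobs, docs, queue = {}, {}, []
result = handle_upload(
    b"\x89PNG", "cat.png", "team-1",
    store_blob=blobs.__setitem__,
    save_metadata=docs.__setitem__,
    publish=queue.append,
)
print(result["embedding_status"])  # pending
```

Passing the three services in as callables keeps the ordering of the steps testable without any Google Cloud credentials.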

2. Embedding Generation Flow (Asynchronous):

Cloud Function is triggered by Pub/Sub message
Function updates image status to "processing"
Function downloads image from Cloud Storage
Function calls Google Vertex AI Multimodal Embedding API to generate 1408-dimensional embeddings
Embeddings are stored in Qdrant Vector Database on dedicated VM
Firestore is updated with embedding info and status: "success"
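
The status transitions in this flow can be sketched as follows. The function name, its parameters, and the `error` field are assumptions made for illustration; the embedding call and the Qdrant write are replaced by injected callables, so this shows only the order of operations and the 1408-dimension check.

```python
from typing import Callable, Dict, List

EMBEDDING_DIM = 1408  # dimension stated in this README


def process_image(
    message: Dict,
    docs: Dict[str, Dict],
    download: Callable[[str], bytes],
    embed: Callable[[bytes], List[float]],
    upsert_vector: Callable[[str, List[float]], None],
) -> str:
    """Move one image through pending -> processing -> success/failed."""
    image_id = message["image_id"]
    doc = docs[image_id]
    doc["embedding_status"] = "processing"
    try:
        image_bytes = download(message["storage_path"])
        vector = embed(image_bytes)
        if len(vector) != EMBEDDING_DIM:
            raise ValueError(
                f"expected {EMBEDDING_DIM} dimensions, got {len(vector)}"
            )
        upsert_vector(image_id, vector)
        doc["embedding_status"] = "success"
    except Exception as exc:
        # Record the failure and its message on the metadata document.
        doc["embedding_status"] = "failed"
        doc["error"] = str(exc)
    return doc["embedding_status"]


docs = {"img-1": {"embedding_status": "pending"}}
vectors = {}
status = process_image(
    {"image_id": "img-1", "storage_path": "team-1/img-1/cat.png"},
    docs,
    download=lambda path: b"fake-bytes",
    embed=lambda data: [0.0] * EMBEDDING_DIM,
    upsert_vector=vectors.__setitem__,
)
print(status)  # success
```

In a deployed function the exception would probably need to be re-raised so that Pub/Sub redelivers the message; this sketch swallows it to stay self-contained.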

3. Error Handling & Retry:

Failed processing updates status to "failed" with error message
Automatic retry up to 3 times using Pub/Sub retry policy
Dead letter queue for permanently failed messages
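
The retry behaviour described above can be simulated in-process. This is not Pub/Sub configuration: `deliver_with_retry` and the dead-letter list are hypothetical stand-ins that only illustrate "try up to 3 times, then park the message".

```python
from typing import Callable, Dict, List

MAX_ATTEMPTS = 3  # retry limit stated in this README


def deliver_with_retry(
    message: Dict,
    handler: Callable[[Dict], None],
    dead_letters: List[Dict],
    max_attempts: int = MAX_ATTEMPTS,
) -> int:
    """Call handler until it succeeds or attempts run out.

    Returns the number of attempts used. If every attempt fails, the
    message is appended to dead_letters together with the last error.
    """
    last_error = ""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return attempt
        except Exception as exc:
            last_error = str(exc)
    dead_letters.append(
        {"message": message, "error": last_error, "attempts": max_attempts}
    )
    return max_attempts


calls = {"n": 0}


def flaky(msg: Dict) -> None:
    """Fail on the first two calls, succeed on the third."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient error")


dead: List[Dict] = []
print(deliver_with_retry({"image_id": "img-1"}, flaky, dead))  # 3
print(len(dead))  # 0
```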

4. Search Flow:

Search queries processed by FastAPI backend
Vector similarity search performed against Qdrant vector database on a VM
Results combined with metadata from Firestore
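
A minimal sketch of the search flow, assuming cosine similarity as the distance metric (the README does not state which metric the Qdrant collection uses). A brute-force scan over an in-memory dict stands in for Qdrant, and a second dict stands in for Firestore; the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query_vector: Sequence[float],
    vectors: Dict[str, List[float]],
    metadata: Dict[str, Dict],
    threshold: float = 0.7,
    limit: int = 10,
) -> List[Dict]:
    """Score every stored vector, keep hits above threshold, attach metadata."""
    scored = [
        (cosine_similarity(query_vector, vec), image_id)
        for image_id, vec in vectors.items()
    ]
    hits = sorted((s for s in scored if s[0] >= threshold), reverse=True)
    return [
        {**metadata.get(image_id, {}), "id": image_id, "score": round(score, 4)}
        for score, image_id in hits[:limit]
    ]


vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
metadata = {"a": {"filename": "cat.png"}, "c": {"filename": "kitten.png"}}
results = search([1.0, 0.0], vectors, metadata)
print([r["id"] for r in results])  # ['a', 'c']
```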

Technology Stack

FastAPI - Web framework
Firestore - Database
Google Cloud Storage - Image storage
Google Pub/Sub - Message queue for async processing
Google Cloud Functions - Serverless image processing
Google Vertex AI Multimodal Embedding API - AI-powered image analysis and embedding generation
Qdrant - Self-hosted vector database for semantic search (on Google Compute Engine VM)
Google Compute Engine - VM hosting for vector database
Pydantic - Data validation

Vector Database Infrastructure

Qdrant Vector Database VM

The system includes a dedicated Google Compute Engine VM running Qdrant vector database:

VM Specifications: 2 vCPUs, 8GB RAM, 50GB disk (e2-standard-2)
Operating System: Ubuntu 22.04 LTS
Vector Database: Qdrant (latest version via Docker)

AI Embedding Model

Uses Google's Vertex AI multimodal embedding model for generating high-quality image embeddings:

Model: multimodalembedding@001
Provider: Google Vertex AI
Type: Multimodal embedding model (supports both images and text)
Output Dimensions: 1408-dimensional vectors

Setup and Installation

Prerequisites

Python 3.8+
Docker
Google Cloud account with Firestore, Storage, Pub/Sub, Cloud Functions, Compute Engine, and Vision API enabled
Terraform (for infrastructure deployment): https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli
gcloud CLI: https://cloud.google.com/sdk/docs/install

Deployment

1. Clone the Repository

git clone {repo-url}
cd {repo-name}

2. Configure Google Cloud Project

Set your Google Cloud project ID:

gcloud config set project PROJECT_ID
gcloud auth login
gcloud auth application-default login
gcloud auth configure-docker

Replace PROJECT_ID with your actual Google Cloud project ID.

3. Configure Terraform Variables

Create your Terraform variables file from the example:

cp deployment/terraform/terraform.tfvars.example deployment/terraform/terraform.tfvars

Edit the variables file with your specific values.

4. Download Service Account Credentials

Download the service account credentials for the project and save them as firestore-credentials.json in the root directory.

5. Deploy Infrastructure

Deploy the complete infrastructure including VM, Cloud Functions, and all Google Cloud services:

./deployment/deploy.sh --build --deploy

6. Seed Initial Data (Optional; requires the virtual environment to be set up first)

Initialize the database with sample data:

python ./scripts/seed_firestore.py

7. Verify Deployment

Your API will be available at the Cloud Run URL provided after deployment. Check the deployment status and test the endpoints.

8. Destroy Infrastructure (Optional)

To tear down all deployed resources:

./deployment/deploy.sh --destroy

Local Development

1. Setup Local Environment

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # Linux/macOS
venv\Scripts\activate     # Windows

2. Install Dependencies

pip install -r requirements.txt

3. Configure Local Environment

Create a .env file for local development (use values that differ from production):

# Project Environment Variables
ENVIRONMENT=development
LOG_LEVEL=DEBUG

# CORS settings - Must be a valid JSON list of strings or comma-separated values
CORS_ORIGINS=["*"]

# Firestore settings
FIRESTORE_PROJECT_ID=gen-lang-client-0424120530
FIRESTORE_DATABASE_NAME=contoso-imagedb-dev
FIRESTORE_CREDENTIALS_FILE=firestore-credentials.json

# Google Cloud Storage settings
GCS_BUCKET_NAME=contoso-images-dev
GCS_CREDENTIALS_FILE=firestore-credentials.json

# Security settings
API_KEY_SECRET=local-dev-secret-key
API_KEY_EXPIRY_DAYS=30

# Vector Database settings (point to deployed VM or local instance)
QDRANT_PORT=6333
QDRANT_HTTPS=false
QDRANT_PREFER_GRPC=false
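
The CORS_ORIGINS comment above allows either a JSON list or comma-separated values. One way to accept both forms is sketched below; `parse_cors_origins` is a hypothetical helper, and the project's config module may handle this differently.

```python
import json
from typing import List


def parse_cors_origins(raw: str) -> List[str]:
    """Parse CORS_ORIGINS given as a JSON list or as comma-separated values."""
    raw = raw.strip()
    if raw.startswith("["):
        parsed = json.loads(raw)
        if not isinstance(parsed, list) or not all(
            isinstance(origin, str) for origin in parsed
        ):
            raise ValueError("CORS_ORIGINS JSON must be a list of strings")
        return parsed
    # Fall back to comma-separated values, ignoring empty entries.
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


print(parse_cors_origins('["*"]'))  # ['*']
print(parse_cors_origins("http://localhost:3000, https://example.com"))
# ['http://localhost:3000', 'https://example.com']
```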

4. Start Local Development Server

./scripts/start.sh

5. Run Tests

source venv/bin/activate && python scripts/run_tests.py all  # Windows (Git Bash): source venv/Scripts/activate

The local development server will be available at http://localhost:8000 with API documentation at http://localhost:8000/docs.

API Endpoints

The API provides the following main endpoints with their authentication and pagination support:

🔓 Public Endpoints (No Authentication Required)

Authentication & API Key Management

/api/v1/auth/api-keys (POST) - Create new API key (requires user_id and team_id parameters)

Team Management

/api/v1/teams/* - Complete team management (no authentication required)
- POST /api/v1/teams - Create new team
- GET /api/v1/teams - List all teams with pagination support
  - Query Parameters:
    - skip (default: 0, min: 0) - Number of items to skip
    - limit (default: 50, min: 1, max: 100) - Number of items per page
  - Response includes: teams, total, skip, limit
- GET /api/v1/teams/{team_id} - Get team by ID
- PUT /api/v1/teams/{team_id} - Update team
- DELETE /api/v1/teams/{team_id} - Delete team
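
The skip/limit contract used by the list endpoints can be illustrated with a small helper. `paginate` is a hypothetical function operating on an in-memory list; the actual API presumably applies the same bounds through its request validation and queries Firestore instead.

```python
from typing import Dict, Sequence


def paginate(
    items: Sequence,
    skip: int = 0,
    limit: int = 50,
    max_limit: int = 100,
    key: str = "items",
) -> Dict:
    """Return one page plus the total, skip, and limit fields."""
    if skip < 0:
        raise ValueError("skip must be >= 0")
    if not 1 <= limit <= max_limit:
        raise ValueError(f"limit must be between 1 and {max_limit}")
    return {
        key: list(items[skip:skip + limit]),
        "total": len(items),
        "skip": skip,
        "limit": limit,
    }


teams = [{"id": f"team-{i}"} for i in range(120)]
page = paginate(teams, skip=100, limit=50, key="teams")
print(len(page["teams"]), page["total"])  # 20 120
```

Returning `total` alongside the page lets a client work out how many further requests it needs without issuing a separate count call.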

User Management

/api/v1/users/* - Complete user management (no authentication required)
- POST /api/v1/users - Create new user (requires team_id)
- GET /api/v1/users - List users with pagination support
  - Query Parameters:
    - skip (default: 0, min: 0) - Number of items to skip
    - limit (default: 50, min: 1, max: 100) - Number of items per page
    - team_id (optional) - Filter by team
  - Response includes: users, total, skip, limit
- GET /api/v1/users/{user_id} - Get user by ID
- PUT /api/v1/users/{user_id} - Update user
- DELETE /api/v1/users/{user_id} - Delete user
- GET /api/v1/users/me?user_id={id} - Get user info (requires user_id parameter)
- PUT /api/v1/users/me?user_id={id} - Update user info (requires user_id parameter)

🔐 Protected Endpoints (API Key Authentication Required)

API Key Management (Authenticated)

/api/v1/auth/api-keys (GET) - List API keys for current user with pagination support
- Query Parameters:
  - skip (default: 0, min: 0) - Number of items to skip
  - limit (default: 50, min: 1, max: 100) - Number of items per page
- Response includes: api_keys, total, skip, limit
/api/v1/auth/api-keys/{key_id} (DELETE) - Revoke API key
/api/v1/auth/admin/api-keys/{user_id} (POST) - Create API key for another user (admin only)
/api/v1/auth/verify - Verify current authentication

Image Management ✅ Fully Paginated & Protected

/api/v1/images/* - Image upload, download, and management (with async processing)
- GET /api/v1/images - List images with full pagination support
  - Query Parameters:
    - skip (default: 0, min: 0) - Number of items to skip
    - limit (default: 50, min: 1, max: 100) - Number of items per page
    - collection_id (optional) - Filter by collection
  - Response includes: images, total, skip, limit

Search Functionality ✅ Fully Paginated & Protected

/api/v1/search/* - Image search functionality (semantic search via Qdrant)
- GET /api/v1/search - Search images with pagination support
  - Query Parameters:
    - q (required) - Search query
    - skip (default: 0, min: 0) - Number of items to skip
    - limit (default: 10, min: 1, max: 50) - Number of results
    - similarity_threshold (default: 0.7, min: 0.0, max: 1.0) - Similarity threshold
    - collection_id (optional) - Filter by collection
  - Response includes: results, total, skip, limit, similarity_threshold, query
- POST /api/v1/search - Advanced search with same pagination
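
As a client-side illustration, the GET search URL can be assembled with the parameter bounds listed above. `build_search_url` is a hypothetical helper; it only builds the URL and does not send the request, and the way the API key is passed is not shown because this README does not specify it.

```python
from typing import Optional
from urllib.parse import urlencode


def build_search_url(
    base_url: str,
    q: str,
    skip: int = 0,
    limit: int = 10,
    similarity_threshold: float = 0.7,
    collection_id: Optional[str] = None,
) -> str:
    """Validate the documented bounds and return the full search URL."""
    if not q:
        raise ValueError("q is required")
    if skip < 0:
        raise ValueError("skip must be >= 0")
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    if not 0.0 <= similarity_threshold <= 1.0:
        raise ValueError("similarity_threshold must be between 0.0 and 1.0")
    params = {
        "q": q,
        "skip": skip,
        "limit": limit,
        "similarity_threshold": similarity_threshold,
    }
    if collection_id is not None:
        params["collection_id"] = collection_id
    return f"{base_url.rstrip('/')}/api/v1/search?{urlencode(params)}"


print(build_search_url("http://localhost:8000", "red car", limit=5))
# http://localhost:8000/api/v1/search?q=red+car&skip=0&limit=5&similarity_threshold=0.7
```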

🔑 Authentication Model

The API uses a hybrid authentication model:

Public Management Endpoints: Users, teams, and API key creation are publicly accessible for easy integration and setup
Protected Data Endpoints: Image storage and search require API key authentication
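
The public/protected split can be expressed as a small predicate over method and path. This is a sketch based only on the endpoint lists in this README; `requires_api_key` is a hypothetical name, and any path not listed as public is treated as protected here.

```python
PUBLIC_PREFIXES = ("/api/v1/teams", "/api/v1/users")


def requires_api_key(method: str, path: str) -> bool:
    """Return True when a request needs an API key under the hybrid model."""
    method = method.upper()
    path = path.rstrip("/")
    # Team and user management are public.
    if any(path == p or path.startswith(p + "/") for p in PUBLIC_PREFIXES):
        return False
    # API key creation is public; listing and revoking keys are not.
    if method == "POST" and path == "/api/v1/auth/api-keys":
        return False
    return True


print(requires_api_key("POST", "/api/v1/auth/api-keys"))  # False
print(requires_api_key("GET", "/api/v1/auth/api-keys"))   # True
print(requires_api_key("GET", "/api/v1/search"))          # True
```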

Authentication & Pagination Status

| Endpoint Category | Authentication | Pagination Status | Notes |
|---|---|---|---|
| Users Management | 🔓 Public | ✅ Fully Implemented | `skip`, `limit`, `total` with team filtering |
| Teams Management | 🔓 Public | ✅ Fully Implemented | `skip`, `limit`, `total` with proper validation |
| API Key Creation | 🔓 Public | N/A | Requires `user_id` and `team_id` parameters |
| Images API | 🔐 Protected | ✅ Fully Implemented | `skip`, `limit`, `total` with proper validation |
| Search API | 🔐 Protected | ✅ Fully Implemented | `skip`, `limit`, `total` with similarity scoring |
| API Key Management | 🔐 Protected | ✅ Fully Implemented | `skip`, `limit`, `total` for user's API keys |

Note: All list endpoints implement consistent pagination with the skip and limit parameters.

Refer to the Swagger UI documentation at /docs for detailed endpoint information.

Development

Running Tests

source venv/bin/activate && python scripts/run_tests.py all  # Windows (Git Bash): source venv/Scripts/activate

API Modules Architecture

The CONTOSO API is organized into the following key modules to ensure separation of concerns and maintainable code:

src/
  ├── api/             # API endpoints and routers
  │   └── v1/          # API version 1 routes
  ├── auth/            # Authentication and authorization
  ├── config/          # Configuration management
  ├── models/          # Database models
  ├── services/        # Business logic services
  │   └── vector_db.py # **Qdrant vector database service**
  └── utils/           # Utility functions

Module Responsibilities

Router Module

Defines API endpoints and routes
Handles HTTP requests and responses
Validates incoming request data
Directs requests to appropriate services
Implements API versioning

Auth Module

Manages user authentication
Handles API key validation and verification
Implements role-based access control
Provides security middleware
Manages user sessions and tokens

Services Module

Contains core business logic
Orchestrates operations across multiple resources
Implements domain-specific rules and workflows
Integrates with external services (Vertex AI, Cloud Storage, Qdrant)
Handles image processing and embedding generation

Models Module

Defines data structures and schemas
Provides database entity representations
Handles data validation and serialization
Implements data relationships and constraints
Manages database migrations

Utils Module

Provides helper functions and utilities
Implements common functionality used across modules
Handles error processing and logging
Provides formatting and conversion utilities
Implements reusable middleware components

Config Module

Manages application configuration
Handles environment variable loading
Provides centralized settings management
Configures service connections and credentials
Defines application constants and defaults

Module Interactions

┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│             │         │             │         │             │
│  Router     │ ───────▶│  Services   │ ◀───────│  Config     │
│  Module     │         │  Module     │         │  Module     │
│             │         │             │         │             │
└──────┬──────┘         └──────┬──────┘         └─────────────┘
       │                       │
       │                       │
       ▼                       ▼
┌─────────────┐         ┌─────────────┐
│             │         │             │
│  Auth       │         │  Models     │
│  Module     │         │  Module     │
│             │         │             │
└──────┬──────┘         └──────┬──────┘
       │                       │
       │                       │
       └───────────────────────┘
                 │
                 ▼
          ┌─────────────┐
          │             │
          │  Utils      │
          │  Module     │
          │             │
          └─────────────┘

The modules interact in the following ways:

Request Flow:
- Client request arrives at the Router Module
- Auth Module validates the request authentication
- Router delegates to appropriate Service functions
- Service uses Models to interact with the database
- Service integrates with Qdrant Vector Database for similarity search
- Service returns data to Router which formats the response

Cross-Cutting Concerns:
- Config Module provides settings to all other modules
- Utils Module provides helper functions across the application
- Auth Module secures access to routes and services

Dependency Direction:
- Router depends on Services and Auth
- Services depend on Models and Config
- Models depend on Utils for helper functions
- Auth depends on Models for user information
- All modules may use Utils and Config
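
The dependency direction listed above can be written down as a graph and checked for cycles. The sketch below encodes only the explicitly named edges (the general "all modules may use Utils and Config" rule is left out), and `find_cycle` is a hypothetical helper, not part of the codebase.

```python
from typing import Dict, List, Optional, Set

# Edges read as "module -> modules it depends on".
DEPENDENCIES: Dict[str, Set[str]] = {
    "router": {"services", "auth"},
    "services": {"models", "config"},
    "auth": {"models"},
    "models": {"utils"},
    "utils": set(),
    "config": set(),
}


def find_cycle(graph: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return a list of modules forming a cycle, or None if there is none."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state = {node: WHITE for node in graph}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = GRAY
        stack.append(node)
        for dep in sorted(graph.get(node, ())):
            if state.get(dep, WHITE) == GRAY:
                # dep is already on the current path: a cycle was found.
                return stack[stack.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        state[node] = BLACK
        return None

    for node in graph:
        if state[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


print(find_cycle(DEPENDENCIES))  # None
```

A check like this could run in a test to catch an import that reverses the intended direction, for example Models importing from Services.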

This modular architecture provides several benefits:

Maintainability: Changes in one module have minimal impact on others
Testability: Modules can be tested in isolation with mocked dependencies
Scalability: New features can be added by extending existing modules
Reusability: Common functionality is centralized for consistent implementation
Security: Authentication and authorization are handled consistently

TODO

High Priority

Thumbnail generation
Secret management
Scale Vector DB to multiple nodes

Medium Priority

Implement caching layer for frequently accessed embeddings
Implement caching for frequently accessed data

Low Priority

Move all auth logic to auth module
Move cloud function code to src folder and reuse code with embedding service
Remove Pinecone integration

Design Decisions

1. Modular Backend Architecture

Decision: Organized the backend into distinct modules: `routers`, `services`, `repositories`, `auth`, `models`, `schemas` and `utils`.

Rationale:

Separation of Concerns: Each module has a single, well-defined responsibility
Maintainability: Changes in one module have minimal impact on others
Testability: Modules can be tested in isolation with mocked dependencies
Scalability: New features can be added by extending existing modules without affecting the entire codebase

2. Firestore NoSQL Database

Rationale:

Schema Flexibility: NoSQL structure accommodates evolving data models without migrations
Scalability: Automatic scaling handles varying workloads without manual intervention
Google Cloud Integration: Native integration with other GCP services (Cloud Storage, Pub/Sub, Cloud Functions)
Managed Service: No database administration overhead

Trade-offs Considered:

Eventual Consistency: Acceptable for this use case as image metadata doesn't require strict consistency
Query Limitations: Compensated by designing data models around access patterns
Cost: Predictable pricing model based on operations, suitable for image metadata workload

3. Cloud Functions with Pub/Sub for Asynchronous Image Processing

Decision: Google Cloud Functions triggered by Pub/Sub messages

Rationale:

Decoupling: Separates image upload from processing, improving API response times
Scalability: Functions automatically scale based on message volume
Reliability: Built-in retry mechanisms and dead letter queues for failed processing
Fault Tolerance: Failed processing doesn't affect the main API
Native GCP Integration: Seamless integration with Pub/Sub for message handling

4. Hybrid Authentication Model

Decision: Implemented a hybrid authentication approach with public management endpoints and API key-protected data endpoints.

Rationale:

Easy Integration: Public user/team management enables simple onboarding

5. Terraform for Infrastructure as Code

Rationale:

Reproducibility: Infrastructure can be recreated identically across environments
Version Control: Infrastructure changes tracked in Git with proper review process
Sustainability: Easy to maintain, update, and scale infrastructure

6. Additional Architectural Decisions

Vector Database on Dedicated VM

Decision: Self-hosted Qdrant on Google Compute Engine instead of managed vector database
Rationale: Cost optimization

Vertex AI for Embeddings

Decision: Google Vertex AI Multimodal Embedding API over custom models
Rationale: High-quality embeddings, managed service, and consistent performance
