
SEREACT - Secure Image Management API

SEREACT is a secure API for storing, organizing, and retrieving images with advanced search capabilities powered by AI-generated embeddings.

Features

  • Secure image storage in Google Cloud Storage
  • Team-based organization and access control
  • API key authentication
  • Asynchronous image processing with Pub/Sub and Cloud Functions
  • AI-powered image embeddings using Google Cloud Vision API
  • Semantic search using vector similarity in Pinecone
  • Automatic retry mechanism for failed processing (up to 3 attempts)
  • Metadata extraction and storage
  • Image processing capabilities
  • Multi-team support
  • Comprehensive E2E testing with real database support

Architecture

sereact/
  ├── images/                    # Sample images for testing
  ├── deployment/                # Deployment configurations
  │   ├── cloud-function/        # Cloud Function for image processing
  │   ├── cloud-run/             # Google Cloud Run configuration
  │   └── terraform/             # Infrastructure as code
  ├── docs/                      # Documentation
  │   ├── api/                   # API documentation
  │   └── TESTING.md             # Comprehensive testing guide
  ├── scripts/                   # Utility scripts
  ├── src/                       # Source code
  │   ├── api/                   # API endpoints and routers
  │   │   └── v1/                # API version 1 routes
  │   ├── auth/                  # Authentication and authorization
  │   ├── config/                # Configuration management
  │   ├── core/                  # Core application logic
  │   ├── db/                    # Database layer
  │   │   ├── providers/         # Database providers (Firestore)
  │   │   └── repositories/      # Data access repositories
  │   ├── models/                # Database models
  │   ├── schemas/               # API request/response schemas
  │   ├── services/              # Business logic services
  │   │   └── pubsub_service.py  # Pub/Sub message publishing
  │   └── utils/                 # Utility functions
  ├── tests/                     # Test code
  │   ├── api/                   # API tests
  │   ├── auth/                  # Authentication tests
  │   ├── models/                # Model tests
  │   ├── services/              # Service tests
  │   ├── integration/           # Integration tests
  │   │   └── test_cloud_function.py  # Cloud Function tests
  │   └── test_e2e.py            # Comprehensive E2E workflow tests
  ├── main.py                    # Application entry point
  ├── requirements.txt           # Python dependencies
  └── README.md                  # This file

System Architecture

┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│             │         │             │         │             │
│  FastAPI    │ ───────▶│  Firestore  │◀────────│  Cloud      │
│  Backend    │         │  Database   │         │  Functions  │
│             │         │             │         │             │
└─────┬───────┘         └─────────────┘         └──────┬──────┘
      │                                                │
      │                                                │
      ▼                                                │
┌─────────────┐         ┌─────────────┐                │
│             │         │             │                │
│  Cloud      │         │  Pub/Sub    │                │
│  Storage    │────────▶│  Queue      │────────────────┘
│             │         │             │
└─────────────┘         └─────────────┘
                               │
                               │
                               ▼
                        ┌─────────────┐         ┌─────────────┐
                        │             │         │             │
                        │  Cloud      │         │  Pinecone   │
                        │  Vision API │────────▶│  Vector DB  │
                        │             │         │             │
                        └─────────────┘         └─────────────┘

Image Processing Workflow

1. Image Upload Flow:

  • User uploads image through FastAPI backend
  • Image is stored in Google Cloud Storage
  • Image metadata is saved to Firestore with embedding_status: "pending"
  • Pub/Sub message is published to trigger async processing
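
The upload steps above can be sketched roughly as follows. This is a minimal illustration using in-memory stand-ins, not the project's actual code: `InMemoryBackends`, `handle_upload`, the storage path layout, and the message fields are all hypothetical names chosen for the example. A real deployment would call the Cloud Storage, Firestore, and Pub/Sub clients instead.

```python
import json
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InMemoryBackends:
    """Hypothetical stand-ins for Cloud Storage, Firestore, and Pub/Sub."""
    blobs: Dict[str, bytes] = field(default_factory=dict)     # storage path -> bytes
    documents: Dict[str, dict] = field(default_factory=dict)  # image id -> metadata
    messages: List[bytes] = field(default_factory=list)       # published payloads


def handle_upload(backends: InMemoryBackends, team_id: str,
                  filename: str, data: bytes) -> dict:
    """Store the image, record pending metadata, then queue processing."""
    image_id = uuid.uuid4().hex
    storage_path = f"{team_id}/{image_id}/{filename}"  # assumed path layout

    # 1. Store the raw bytes.
    backends.blobs[storage_path] = data

    # 2. Save metadata with the initial embedding status.
    metadata = {
        "id": image_id,
        "filename": filename,
        "team_id": team_id,
        "storage_path": storage_path,
        "embedding_status": "pending",
        "embedding_retry_count": 0,
    }
    backends.documents[image_id] = metadata

    # 3. Publish a small message; the worker looks up the rest by id.
    payload = json.dumps({"image_id": image_id, "storage_path": storage_path})
    backends.messages.append(payload.encode("utf-8"))
    return metadata
```

Writing the metadata before publishing means the worker can always find a record for the id it receives, which is one plausible reason for the ordering described above.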

2. Embedding Generation Flow (Asynchronous):

  • Cloud Function is triggered by Pub/Sub message
  • Function updates image status to "processing"
  • Function downloads image from Cloud Storage
  • Function calls Google Cloud Vision API to generate embeddings
  • Embeddings are stored in Pinecone Vector Database
  • Firestore is updated with embedding info and status: "success"
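
The status values used in this flow form a small state machine. The sketch below encodes the transitions implied by the text; the transition table is inferred from this README and is an assumption, and `EmbeddingStatus` and `transition` are illustrative names rather than the project's own.

```python
from enum import Enum


class EmbeddingStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    SUCCESS = "success"
    FAILED = "failed"


# Allowed moves, inferred from the workflow text; the real rules may differ.
ALLOWED_TRANSITIONS = {
    EmbeddingStatus.PENDING: {EmbeddingStatus.PROCESSING},
    EmbeddingStatus.PROCESSING: {EmbeddingStatus.SUCCESS, EmbeddingStatus.FAILED},
    EmbeddingStatus.FAILED: {EmbeddingStatus.PROCESSING},  # a retry picks it up again
    EmbeddingStatus.SUCCESS: set(),
}


def transition(current: EmbeddingStatus, new: EmbeddingStatus) -> EmbeddingStatus:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {new.value}")
    return new
```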

3. Error Handling & Retry:

  • Failed processing updates status to "failed" with error message
  • Automatic retry up to 3 times using Pub/Sub retry policy
  • Dead letter queue for permanently failed messages
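
One way to picture the retry bookkeeping is shown below. This is a sketch under assumptions: `record_attempt` and its return values are hypothetical, and how the real Cloud Function acknowledges or rejects the Pub/Sub message is not shown here. Only the field names and the limit of 3 attempts come from this README.

```python
from typing import Callable, Dict

MAX_ATTEMPTS = 3  # matches the "up to 3 attempts" limit described above


def record_attempt(image: Dict, generate_embedding: Callable[[Dict], None]) -> str:
    """Run one processing attempt and record the outcome on the image record.

    Returns "done" on success, "retry" while attempts remain, and "give_up"
    once the attempt limit is reached.
    """
    image["embedding_status"] = "processing"
    try:
        generate_embedding(image)
    except Exception as exc:
        image["embedding_status"] = "failed"
        image["embedding_error"] = str(exc)
        image["embedding_retry_count"] = image.get("embedding_retry_count", 0) + 1
        if image["embedding_retry_count"] >= MAX_ATTEMPTS:
            return "give_up"  # candidate for the dead letter queue
        return "retry"
    image["embedding_status"] = "success"
    image["embedding_error"] = None
    return "done"
```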

4. Search Flow:

  • Search queries processed by FastAPI backend
  • Vector similarity search performed against Pinecone
  • Results combined with metadata from Firestore
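
The search flow can be illustrated with a small in-memory version. In the actual service the similarity search is performed by Pinecone and the metadata comes from Firestore; here both are plain dictionaries, and `cosine_similarity` and `search` are hypothetical helper names. Cosine similarity is used as the metric for illustration only, since this README does not say which metric the index uses.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vector: Sequence[float],
           vectors: Dict[str, Sequence[float]],
           metadata: Dict[str, dict],
           top_k: int = 5) -> List[dict]:
    """Rank stored vectors by similarity and attach each image's metadata."""
    scored = sorted(
        ((cosine_similarity(query_vector, vec), image_id)
         for image_id, vec in vectors.items()),
        reverse=True,
    )
    results = []
    for score, image_id in scored[:top_k]:
        doc = metadata.get(image_id)
        if doc is None:  # vector with no metadata record: skip it
            continue
        results.append({**doc, "score": score})
    return results
```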

Technology Stack

  • FastAPI - Web framework
  • Firestore - Database
  • Google Cloud Storage - Image storage
  • Google Pub/Sub - Message queue for async processing
  • Google Cloud Functions - Serverless image processing
  • Google Cloud Vision API - AI-powered image analysis and embedding generation
  • Pinecone - Vector database for semantic search
  • Pydantic - Data validation

Setup and Installation

Prerequisites

  • Python 3.8+
  • Google Cloud account with Firestore, Storage, Pub/Sub, Cloud Functions, and Vision API enabled
  • Pinecone account for vector database

Installation

  1. Clone the repository:

    git clone https://github.com/yourusername/sereact.git
    cd sereact
    
  2. Create and activate a virtual environment:

    python -m venv venv
    source venv/bin/activate  # Linux/macOS
    venv\Scripts\activate     # Windows
    
  3. Install dependencies:

    pip install -r requirements.txt
    
  4. Create a .env file with the following environment variables:

    # Firestore
    FIRESTORE_PROJECT_ID=your-gcp-project-id
    FIRESTORE_CREDENTIALS_FILE=path/to/firestore-credentials.json
    
    # Google Cloud Storage
    GCS_BUCKET_NAME=your-bucket-name
    GCS_CREDENTIALS_FILE=path/to/credentials.json
    
    # Google Pub/Sub
    PUBSUB_TOPIC=image-processing-topic
    PUBSUB_SUBSCRIPTION=image-processing-subscription
    
    # Google Cloud Vision
    VISION_API_ENABLED=true
    
    # Security
    API_KEY_SECRET=your-secret-key
    
    # Vector database (Pinecone)
    VECTOR_DB_API_KEY=your-pinecone-api-key
    VECTOR_DB_ENVIRONMENT=your-pinecone-environment
    VECTOR_DB_INDEX_NAME=image-embeddings
    
  5. Deploy Infrastructure (Optional - for production):

    # Deploy Pub/Sub infrastructure with Terraform
    cd deployment/terraform
    terraform init
    terraform plan
    terraform apply
    
    # Deploy Cloud Function
    cd ../cloud-function
    ./deploy.sh
    
  6. Run the application:

    uvicorn main:app --reload
    
  7. Visit http://localhost:8000/docs in your browser to access the API documentation.
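
As an illustration of how the variables from step 4 might be read at startup, the sketch below validates a few of them from a mapping. `Settings` and `load_settings` are hypothetical names, not the project's config module, and the choice of which variables to treat as required is an assumption. The sketch does not parse the .env file itself; it assumes the values are already in the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    firestore_project_id: str
    gcs_bucket_name: str
    pubsub_topic: str
    vector_db_index_name: str
    vision_api_enabled: bool


# Assumed to be mandatory for this example.
REQUIRED = (
    "FIRESTORE_PROJECT_ID",
    "GCS_BUCKET_NAME",
    "PUBSUB_TOPIC",
    "VECTOR_DB_INDEX_NAME",
)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on gaps."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        firestore_project_id=env["FIRESTORE_PROJECT_ID"],
        gcs_bucket_name=env["GCS_BUCKET_NAME"],
        pubsub_topic=env["PUBSUB_TOPIC"],
        vector_db_index_name=env["VECTOR_DB_INDEX_NAME"],
        vision_api_enabled=env.get("VISION_API_ENABLED", "false").strip().lower() == "true",
    )
```

Failing at startup with a list of missing names tends to be easier to debug than a connection error later in a request.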

Deployment

Cloud Function Deployment

The image processing Cloud Function can be deployed using the provided script:

cd deployment/cloud-function

# Set environment variables
export GOOGLE_CLOUD_PROJECT=your-project-id
export PINECONE_API_KEY=your-pinecone-api-key
export PINECONE_ENVIRONMENT=your-pinecone-environment

# Deploy the function
./deploy.sh

Infrastructure as Code

Use Terraform to deploy the complete infrastructure:

cd deployment/terraform

# Initialize Terraform
terraform init

# Review the deployment plan
terraform plan

# Deploy infrastructure
terraform apply

This will create:

  • Pub/Sub topic and subscription with retry policy
  • Dead letter queue for failed messages
  • IAM bindings for service accounts

API Endpoints

The API provides the following main endpoints:

  • /api/v1/auth/* - Authentication and API key management
  • /api/v1/teams/* - Team management
  • /api/v1/users/* - User management
  • /api/v1/images/* - Image upload, download, and management (with async processing)
  • /api/v1/search/* - Image search functionality (semantic search)
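
A client call against these routes might be prepared as in the sketch below. Only the /api/v1/images path prefix comes from this README. The `X-API-Key` header name, the `limit` query parameter, and the `build_request` helper are assumptions made for illustration; check the Swagger UI at /docs for the real header and parameters.

```python
from typing import Mapping, Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_request(base_url: str, path: str, api_key: str,
                  params: Optional[Mapping[str, object]] = None) -> Request:
    """Prepare (but do not send) an authenticated GET request."""
    url = base_url.rstrip("/") + path
    if params:
        url += "?" + urlencode(params)
    # Header name is an assumption; the API may expect a different one.
    return Request(url, headers={"X-API-Key": api_key, "Accept": "application/json"})


# Example: list images on a local development server (hypothetical parameter).
request = build_request("http://localhost:8000", "/api/v1/images",
                        "your-api-key", {"limit": 5})
```

The prepared request can then be sent with `urllib.request.urlopen(request)` once the server is running.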

Image Processing Status

Each image record includes its embedding processing status. The embedding_status field is one of "pending", "processing", "success", or "failed":

{
  "id": "image-id",
  "filename": "example.jpg",
  "embedding_status": "success",
  "embedding_error": null,
  "embedding_retry_count": 0,
  "has_embedding": true
}
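
Because embeddings are generated asynchronously, a client that needs the result may have to poll the image record. A minimal sketch follows, assuming the caller supplies a `fetch_image` function that returns the JSON above as a dictionary; `wait_for_embedding` and its parameters are hypothetical and not part of the API.

```python
import time
from typing import Callable, Dict

TERMINAL_STATUSES = {"success", "failed"}


def wait_for_embedding(fetch_image: Callable[[], Dict],
                       max_polls: int = 10,
                       delay_seconds: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Poll until embedding_status is terminal or max_polls fetches are used."""
    image = fetch_image()
    for _ in range(max_polls - 1):
        if image["embedding_status"] in TERMINAL_STATUSES:
            break
        sleep(delay_seconds)
        image = fetch_image()
    return image
```

The caller should still check the returned status, since the function also returns when the poll budget runs out while the image is still "pending" or "processing".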

Refer to the Swagger UI documentation at /docs for detailed endpoint information.

Development

Running Tests

# Run all tests
pytest

# Run specific test categories
pytest tests/services/test_pubsub_service.py  # Pub/Sub service tests
pytest tests/integration/test_cloud_function.py  # Cloud Function tests
pytest tests/api/test_images_pubsub.py  # API integration tests

Comprehensive End-to-End Testing

SEREACT includes an E2E test suite that covers complete user workflows using self-contained, artificially generated test data:

# Run all E2E tests (completely self-contained - no setup required!)
python scripts/run_tests.py e2e

# Run unit tests only (fast)
python scripts/run_tests.py unit

# Run integration tests (requires real database)
python scripts/run_tests.py integration

License

This project is licensed under the MIT License - see the LICENSE file for details.

API Modules Architecture

The SEREACT API is organized into the following key modules to ensure separation of concerns and maintainable code:

src/
  ├── api/             # API endpoints and routers
  │   └── v1/          # API version 1 routes
  ├── auth/            # Authentication and authorization
  ├── config/          # Configuration management
  ├── models/          # Database models
  ├── services/        # Business logic services
  └── utils/           # Utility functions

Module Responsibilities

Router Module

  • Defines API endpoints and routes
  • Handles HTTP requests and responses
  • Validates incoming request data
  • Directs requests to appropriate services
  • Implements API versioning

Auth Module

  • Manages user authentication
  • Handles API key validation and verification
  • Implements role-based access control
  • Provides security middleware
  • Manages user sessions and tokens
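
API key validation is often built on a keyed hash so that raw keys never need to be stored. The sketch below shows that general technique with the standard library. It is an assumption about one reasonable approach, not a description of how the Auth module is actually implemented; the function names are hypothetical, and `secret` plays the role that API_KEY_SECRET might play.

```python
import hashlib
import hmac
import secrets


def generate_api_key() -> str:
    """Create a random, URL-safe API key to hand to the user once."""
    return secrets.token_urlsafe(32)


def hash_api_key(api_key: str, secret: str) -> str:
    """HMAC-SHA256 of the key; only this digest would be stored."""
    return hmac.new(secret.encode("utf-8"), api_key.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def verify_api_key(presented: str, stored_hash: str, secret: str) -> bool:
    """Compare digests in constant time to avoid timing leaks."""
    return hmac.compare_digest(hash_api_key(presented, secret), stored_hash)
```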

Services Module

  • Contains core business logic
  • Orchestrates operations across multiple resources
  • Implements domain-specific rules and workflows
  • Integrates with external services (Cloud Vision, Storage)
  • Handles image processing and embedding generation

Models Module

  • Defines data structures and schemas
  • Provides database entity representations
  • Handles data validation and serialization
  • Implements data relationships and constraints
  • Manages database migrations

Utils Module

  • Provides helper functions and utilities
  • Implements common functionality used across modules
  • Handles error processing and logging
  • Provides formatting and conversion utilities
  • Implements reusable middleware components

Config Module

  • Manages application configuration
  • Handles environment variable loading
  • Provides centralized settings management
  • Configures service connections and credentials
  • Defines application constants and defaults

Module Interactions

┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│             │         │             │         │             │
│  Router     │ ───────▶│  Services   │ ◀───────│  Config     │
│  Module     │         │  Module     │         │  Module     │
│             │         │             │         │             │
└──────┬──────┘         └──────┬──────┘         └─────────────┘
       │                       │
       │                       │
       ▼                       ▼
┌─────────────┐         ┌─────────────┐
│             │         │             │
│  Auth       │         │  Models     │
│  Module     │         │  Module     │
│             │         │             │
└──────┬──────┘         └──────┬──────┘
       │                       │
       │                       │
       └───────────────────────┘
                 │
                 ▼
          ┌─────────────┐
          │             │
          │  Utils      │
          │  Module     │
          │             │
          └─────────────┘

The modules interact in the following ways:

  • Request Flow:

    • Client request arrives at the Router Module
    • Auth Module validates the request authentication
    • Router delegates to appropriate Service functions
    • Service uses Models to interact with the database
    • Service returns data to Router which formats the response
  • Cross-Cutting Concerns:

    • Config Module provides settings to all other modules
    • Utils Module provides helper functions across the application
    • Auth Module secures access to routes and services
  • Dependency Direction:

    • Router depends on Services and Auth
    • Services depend on Models and Config
    • Models depend on Utils for helper functions
    • Auth depends on Models for user information
    • All modules may use Utils and Config
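
The dependency rules above can be written down as a graph and checked for cycles, for example in a unit test. The sketch below is illustrative: the `DEPENDENCIES` table is transcribed from the list above, with the added assumption that Utils and Config depend on no other module, and `import_order` is a hypothetical helper.

```python
from typing import Dict, List, Set

# Module -> modules it depends on, transcribed from the list above.
# Assumption: utils and config have no dependencies of their own.
DEPENDENCIES: Dict[str, Set[str]] = {
    "router": {"services", "auth", "utils", "config"},
    "services": {"models", "config", "utils"},
    "auth": {"models", "utils", "config"},
    "models": {"utils", "config"},
    "utils": set(),
    "config": set(),
}


def import_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return modules with dependencies first; raise ValueError on a cycle."""
    order: List[str] = []
    done: Set[str] = set()
    visiting: Set[str] = set()

    def visit(name: str) -> None:
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"dependency cycle through {name!r}")
        visiting.add(name)
        for dep in sorted(deps[name]):
            visit(dep)
        visiting.discard(name)
        done.add(name)
        order.append(name)

    for name in sorted(deps):
        visit(name)
    return order
```

If a later change made Models depend on Services, for instance, `import_order` would raise, flagging that the layering described here no longer holds.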

This modular architecture provides several benefits:

  • Maintainability: Changes in one module have minimal impact on others
  • Testability: Modules can be tested in isolation with mocked dependencies
  • Scalability: New features can be added by extending existing modules
  • Reusability: Common functionality is centralized for consistent implementation
  • Security: Authentication and authorization are handled consistently