diff --git a/README.md b/README.md
index 832ae4a..10903fd 100644
--- a/README.md
+++ b/README.md
@@ -517,4 +517,66 @@ This modular architecture provides several benefits:
 ### Low Priority
 - [ ] Move all auth logic to auth module
 - [ ] Move cloud function code to src folder and reuse code with embedding service
-- [ ] Remove Pinecone integration
\ No newline at end of file
+- [ ] Remove Pinecone integration
+
+
+
+## Design Decisions
+
+### 1. **Modular Backend Architecture**
+
+#### **Decision**: Organized the backend into distinct modules: `routers`, `services`, `repositories`, `auth`, `models`, `schemas` and `utils`.
+
+#### **Rationale**:
+- **Separation of Concerns**: Each module has a single, well-defined responsibility
+- **Maintainability**: Changes in one module have minimal impact on others
+- **Testability**: Modules can be tested in isolation with mocked dependencies
+- **Scalability**: New features can be added by extending existing modules without affecting the entire codebase
+
+### 2. **Firestore NoSQL Database**
+
+#### **Rationale**:
+- **Schema Flexibility**: NoSQL structure accommodates evolving data models without migrations
+- **Scalability**: Automatic scaling handles varying workloads without manual intervention
+- **Google Cloud Integration**: Native integration with other GCP services (Cloud Storage, Pub/Sub, Cloud Functions)
+- **Managed Service**: No database administration overhead
+
+#### **Trade-offs Considered**:
+- **Eventual Consistency**: Acceptable for this use case as image metadata doesn't require strict consistency
+- **Query Limitations**: Compensated by designing data models around access patterns
+- **Cost**: Predictable pricing model based on operations, suitable for image metadata workload
+
+### 3. **Cloud Functions with Pub/Sub for Asynchronous Image Processing**
+
+#### **Decision**: Google Cloud Functions triggered by Pub/Sub messages
+
+#### **Rationale**:
+- **Decoupling**: Separates image upload from processing, improving API response times
+- **Scalability**: Functions automatically scale based on message volume
+- **Reliability**: Built-in retry mechanisms and dead letter queues for failed processing
+- **Fault Tolerance**: Failed processing doesn't affect the main API
+- **Native GCP Integration**: Seamless integration with Pub/Sub for message handling
+
+### 4. **Hybrid Authentication Model**
+
+#### **Decision**: Implemented a hybrid authentication approach with public management endpoints and API key-protected data endpoints.
+
+#### **Rationale**:
+- **Easy Integration**: Public user/team management enables simple onboarding
+
+### 5. **Terraform for Infrastructure as Code**
+
+#### **Rationale**:
+- **Reproducibility**: Infrastructure can be recreated identically across environments
+- **Version Control**: Infrastructure changes tracked in Git with proper review process
+- **Sustainability**: Easy to maintain, update, and scale infrastructure
+
+### 6. **Additional Architectural Decisions**
+
+#### **Vector Database on Dedicated VM**
+- **Decision**: Self-hosted Qdrant on Google Compute Engine instead of managed vector database
+- **Rationale**: Cost optimization
+
+#### **Vertex AI for Embeddings**
+- **Decision**: Google Vertex AI Multimodal Embedding API over custom models
+- **Rationale**: High-quality embeddings, managed service, and consistent performance
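
Note on decision 3 (Cloud Functions with Pub/Sub): the decoupling it describes could look roughly like the sketch below. It follows the shape of a first-generation Python background function, where the Pub/Sub message body arrives base64-encoded in `event["data"]`. The payload fields (`image_id`, `storage_path`), the function names, and the `process_image` placeholder are assumptions made for illustration; none of them come from this repository, and the actual function may be structured differently.

```python
import base64
import json


def parse_message(event: dict) -> dict:
    """Decode the JSON payload of a Pub/Sub-triggered event."""
    # Pub/Sub delivers the message body base64-encoded under the "data" key.
    raw = base64.b64decode(event["data"]).decode("utf-8")
    payload = json.loads(raw)
    # Hypothetical required fields; the real message schema is not shown here.
    for field in ("image_id", "storage_path"):
        if field not in payload:
            raise ValueError(f"missing field: {field}")
    return payload


def process_image(image_id: str, storage_path: str) -> str:
    """Placeholder for the real work (for example, computing an embedding)."""
    return f"processed {image_id} from {storage_path}"


def handle_image_message(event: dict, context=None) -> str:
    """Entry point. An uncaught exception marks the invocation as failed,
    which is retried only if retries are configured for the function."""
    payload = parse_message(event)
    return process_image(payload["image_id"], payload["storage_path"])


if __name__ == "__main__":
    # Build an event the way Pub/Sub would deliver it, then handle it locally.
    body = json.dumps({"image_id": "img-1", "storage_path": "gs://bucket/img-1.png"})
    event = {"data": base64.b64encode(body.encode("utf-8")).decode("ascii")}
    print(handle_image_message(event))
```

Because the API only publishes a message and returns, a failure inside `handle_image_message` stays within the function and does not surface in the upload request, which is the fault-tolerance property the rationale lists.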