```
# Generic Project Architecture Template
## 1⃣ Standard Structure for Python Web/API Projects
```
project-name/
├── README.md # Project description document
├── LICENSE # Open source license
├── requirements.txt # Dependency management (pip)
├── pyproject.toml # Modern Python project configuration (recommended)
├── setup.py # Package installation script (if packaged as a library)
├── .gitignore # Git ignore file
├── .env # Environment variables (not committed to Git)
├── .env.example # Example environment variables
├── CLAUDE.md # Claude persistent context
├── AGENTS.md # Codex persistent context
├── Sublime-Text.txt # Personal scratch file: requirements, notes, and CLI session-recovery commands ^_^
├── docs/ # Documentation directory
│ ├── api.md # API documentation
│ ├── development.md # Development guide
│ └── architecture.md # Architecture description
├── scripts/ # Script tools
│ ├── deploy.sh # Deployment script
│ ├── backup.sh # Backup script
│ └── init_db.sh # Database initialization
├── tests/ # Test code
│ ├── __init__.py
│ ├── conftest.py # pytest configuration
│ ├── unit/ # Unit tests
│ ├── integration/ # Integration tests
│ └── test_config.py # Configuration tests
├── src/ # Source code (recommended approach)
│ ├── __init__.py
│ ├── main.py # Program entry point
│ ├── app.py # Flask/FastAPI application
│ ├── config.py # Configuration management
│ │
│ ├── core/ # Core business logic
│ │ ├── __init__.py
│ │ ├── models/ # Data models
│ │ ├── services/ # Business services
│ │ └── utils/ # Utility functions
│ │
│ ├── api/ # API interface layer
│ │ ├── __init__.py
│ │ ├── v1/ # Version 1
│ │ └── dependencies.py
│ │
│ ├── data/ # Data processing
│ │ ├── __init__.py
│ │ ├── repository/ # Data access layer
│ │ └── migrations/ # Database migrations
│ │
│ └── external/ # External services
│   ├── __init__.py
│   ├── clients/ # API clients
│   └── integrations/ # Integrated services
├── logs/ # Log directory (not committed to Git)
│ ├── app.log
│ └── error.log
└── data/ # Data directory (not committed to Git)
  ├── raw/ # Raw data
  ├── processed/ # Processed data
  └── cache/ # Cache
```
**Usage Scenarios**: Flask/FastAPI Web applications, RESTful API services, Web backends
---
## 2⃣ Standard Structure for Data Science/Quant Projects
```
project-name/
├── README.md
├── LICENSE
├── requirements.txt
├── .gitignore
├── .env
├── .env.example
├── CLAUDE.md # Claude persistent context
├── AGENTS.md # Codex persistent context
├── Sublime-Text.txt # Personal scratch file: requirements, notes, and CLI session-recovery commands ^_^
├── docs/ # Documentation directory
│ ├── notebooks/ # Jupyter documentation
│ └── reports/ # Analysis reports
├── notebooks/ # Jupyter Notebook
│ ├── 01_data_exploration.ipynb
│ ├── 02_feature_engineering.ipynb
│ └── 03_model_training.ipynb
├── scripts/ # Script tools
│ ├── train_model.py # Training script
│ ├── backtest.py # Backtesting script
│ ├── collect_data.py # Data collection
│ └── deploy_model.py # Model deployment
├── tests/ # Tests
│ ├── test_data/
│ └── test_models/
├── configs/ # Configuration files
│ ├── model.yaml
│ ├── database.yaml
│ └── trading.yaml
├── src/ # Source code
│ ├── __init__.py
│ │
│ ├── data/ # Data processing module
│ │ ├── __init__.py
│ │ ├── collectors/ # Data collectors
│ │ ├── processors/ # Data cleaning
│ │ ├── features/ # Feature engineering
│ │ └── loaders.py # Data loaders
│ │
│ ├── models/ # Model module
│ │ ├── __init__.py
│ │ ├── strategies/ # Trading strategies
│ │ ├── backtest/ # Backtesting engine
│ │ └── risk/ # Risk management
│ │
│ ├── utils/ # Utility module
│ │ ├── __init__.py
│ │ ├── logging.py # Logging configuration
│ │ ├── database.py # Database tools
│ │ └── api_client.py # API client
│ │
│ └── core/ # Core module
│   ├── __init__.py
│   ├── config.py # Configuration management
│   ├── signals.py # Signal generation
│   └── portfolio.py # Investment portfolio
├── data/ # Data directory (Git ignored)
│ ├── raw/ # Raw data
│ ├── processed/ # Processed data
│ ├── external/ # External data
│ └── cache/ # Cache
├── models/ # Model files (Git ignored)
│ ├── checkpoints/ # Checkpoints
│ └── exports/ # Exported models
└── logs/ # Logs (Git ignored)
  ├── trading.log
  └── errors.log
```
**Usage Scenarios**: Quantitative trading, machine learning, data analysis, AI research
---
## 3⃣ Standard Structure for Monorepo (Multi-Project Repository)
```
project-name-monorepo/
├── README.md
├── LICENSE
├── .gitignore
├── .gitmodules # Git submodules
├── docker-compose.yml # Docker orchestration
├── CLAUDE.md # Claude persistent context
├── AGENTS.md # Codex persistent context
├── Sublime-Text.txt # Personal scratch file: requirements, notes, and CLI session-recovery commands ^_^
├── docs/ # Global documentation
│ ├── architecture.md
│ └── deployment.md
├── scripts/ # Global scripts
│ ├── build_all.sh
│ ├── test_all.sh
│ └── deploy.sh
├── backups/ # Backup files
│ ├── archive/ # Old backup files
│ └── gz/ # Compressed backup files
├── services/ # Microservices directory
│ │
│ ├── user-service/ # User service
│ │ ├── Dockerfile
│ │ ├── requirements.txt
│ │ ├── src/
│ │ └── tests/
│ │
│ ├── trading-service/ # Trading service
│ │ ├── Dockerfile
│ │ ├── requirements.txt
│ │ ├── src/
│ │ └── tests/
│ ...
│ └── data-service/ # Data service
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── src/
│   └── tests/
├── libs/ # Shared libraries
│ ├── common/ # Common modules
│ │ ├── utils/
│ │ └── models/
│ ├── external/ # Third-party libraries (not modifiable, only callable)
│ └── database/ # Database access library
├── infrastructure/ # Infrastructure
│ ├── terraform/ # Cloud resource definitions
│ ├── kubernetes/ # K8s configuration
│ └── nginx/ # Reverse proxy configuration
└── monitoring/ # Monitoring system
  ├── prometheus/ # Metric collection
  ├── grafana/ # Visualization
  └── alertmanager/ # Alerting
```
**Usage Scenarios**: Microservices architecture, large-scale projects, team collaboration
---
## 4⃣ Standard Structure for Full-Stack Web Applications
```
project-name/
├── README.md
├── LICENSE
├── .gitignore
├── docker-compose.yml # Frontend and backend orchestration together
├── CLAUDE.md # Claude persistent context
├── AGENTS.md # Codex persistent context
├── Sublime-Text.txt # Personal scratch file: requirements, notes, and CLI session-recovery commands ^_^
├── frontend/ # Frontend directory
│ ├── public/ # Static assets
│ ├── src/ # Source code
│ │ ├── components/ # React/Vue components
│ │ ├── pages/ # Pages
│ │ ├── store/ # State management
│ │ └── utils/ # Utilities
│ ├── package.json # NPM dependencies
│ └── vite.config.js # Build configuration
└── backend/ # Backend directory
  ├── requirements.txt
  ├── Dockerfile
  ├── src/
  │ ├── api/ # API interfaces
  │ ├── core/ # Business logic
  │ │ └── models/ # Data models
  └── tests/
```
**Usage Scenarios**: Full-stack applications, SPA single-page applications, frontend/backend separation projects
---
## 📌 Core Design Principles
### 1. Separation of Concerns
```
API → Service → Data Access → Database
Clear, hierarchical, and easy to understand
```
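A minimal sketch of this layering in Python, assuming an in-memory dictionary stands in for the database. The names `UserRepository`, `UserService`, and `get_user_handler` are hypothetical and not part of the template; the point is only that each layer talks to the layer directly below it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: int
    name: str


class UserRepository:
    """Data access layer: the only code that touches storage."""

    def __init__(self) -> None:
        # Stand-in for a real database table (assumption for this sketch).
        self._rows = {1: {"id": 1, "name": "alice"}}

    def find_by_id(self, user_id: int) -> Optional[User]:
        row = self._rows.get(user_id)
        return User(**row) if row else None


class UserService:
    """Service layer: business rules, no HTTP and no storage details."""

    def __init__(self, repo: UserRepository) -> None:
        self._repo = repo

    def get_display_name(self, user_id: int) -> str:
        user = self._repo.find_by_id(user_id)
        if user is None:
            raise LookupError(f"user {user_id} not found")
        return user.name.title()


def get_user_handler(service: UserService, user_id: int) -> dict:
    """API layer: turns service results into a response payload."""
    try:
        return {"status": 200, "name": service.get_display_name(user_id)}
    except LookupError as err:
        return {"status": 404, "error": str(err)}


service = UserService(UserRepository())
print(get_user_handler(service, 1))
print(get_user_handler(service, 2))
```

Because the handler only knows the service and the service only knows the repository, swapping the dictionary for a real database should touch `UserRepository` alone.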
### 2. Testability
```
Each module can be tested independently
Dependencies can be mocked
```
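One way to read "dependencies can be mocked" is sketched below with the standard library's `unittest.mock`. `PriceService` and its `fetch_price` client method are hypothetical names chosen for illustration; the only assumption is that the dependency is passed in rather than created inside the class.

```python
from unittest.mock import Mock


class PriceService:
    """Receives its client as an argument, so tests can swap it out."""

    def __init__(self, client) -> None:
        self._client = client

    def price_with_fee(self, symbol: str, fee_rate: float = 0.01) -> float:
        price = self._client.fetch_price(symbol)
        return round(price * (1 + fee_rate), 2)


def test_price_with_fee() -> None:
    # The mock replaces the real network client; no HTTP call is made.
    fake_client = Mock()
    fake_client.fetch_price.return_value = 100.0

    service = PriceService(fake_client)

    assert service.price_with_fee("BTCUSDT") == 101.0
    fake_client.fetch_price.assert_called_once_with("BTCUSDT")


test_price_with_fee()
```

In a real project the test function would live under `tests/unit/` and be collected by pytest instead of being called directly.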
### 3. Configurability
```
Configuration separated from code
Environment variables > Configuration files > Default values
```
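The precedence rule can be sketched as a small loader. This is an illustration under stated assumptions: the config file is JSON (to stay within the standard library), and the keys `DB_URL` and `LOG_LEVEL` as well as the function name `load_config` are hypothetical.

```python
import json
import os
from pathlib import Path

# Lowest priority: built-in defaults (hypothetical keys).
DEFAULTS = {"DB_URL": "sqlite:///local.db", "LOG_LEVEL": "INFO"}


def load_config(config_path=None, environ=None) -> dict:
    """Merge settings: environment variables > configuration file > defaults."""
    environ = os.environ if environ is None else environ
    config = dict(DEFAULTS)

    # Middle priority: values from the configuration file, if it exists.
    if config_path is not None and Path(config_path).exists():
        config.update(json.loads(Path(config_path).read_text(encoding="utf-8")))

    # Highest priority: environment variables override known keys.
    for key in config:
        if key in environ:
            config[key] = environ[key]
    return config


print(load_config(environ={"LOG_LEVEL": "DEBUG"}))
```

Passing `environ` explicitly keeps the function easy to test; production code would normally call `load_config("config.json")` and let it read `os.environ`.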
### 4. Maintainability
```
Self-explanatory code
Reasonable file naming
Clear directory structure
```
### 5. Git-Friendly
```
data/, logs/, models/ added to .gitignore
Only commit source code and configuration examples
```
---
## 🎯 Best Practice Recommendations
1. **Use the `src/` directory**: Place source code in a dedicated `src` directory to avoid cluttering the top-level directory.
2. **Consistent imports**: Use one import style throughout, such as the absolute form `from src.module import thing`.
3. **Test coverage**: Ensure core business logic has unit and integration tests.
4. **Documentation first**: Write `README.md` for important modules.
5. **Environment isolation**: Use virtualenv or conda to create independent environments.
6. **Explicit dependencies**: All dependencies should be listed in `requirements.txt` with locked versions.
7. **Configuration management**: Use a combination of environment variables and configuration files.
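8. **Logging levels**: Use the standard levels DEBUG, INFO, WARNING, ERROR, and CRITICAL (Python's `logging` module treats `FATAL` as an alias for `CRITICAL`).
9. **Error handling**: Do not swallow exceptions; preserve the full exception chain, for example with `raise ... from err`.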
10. **Code style**: Use black for formatting and flake8 for linting.
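
The logging and error-handling points in the list above can be sketched with the standard library alone. `ConfigError` and `parse_port` are hypothetical names used only for illustration; the technique shown is exception chaining with `raise ... from` plus `logger.exception`, which records the chained traceback.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class ConfigError(Exception):
    """Hypothetical domain-level error for this example."""


def parse_port(raw: str) -> int:
    try:
        return int(raw)
    except ValueError as err:
        # "from err" keeps the original ValueError attached as __cause__.
        raise ConfigError(f"invalid port: {raw!r}") from err


try:
    parse_port("80a")
except ConfigError:
    # Logs at ERROR level and includes the full chained traceback.
    logger.exception("could not load configuration")
```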
---
## 🔥 .gitignore Recommended Template
```gitignore
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
*.egg-info/
dist/
build/
# Environment
.env
.venv/
env/
venv/
ENV/
# IDE
.vscode/
.idea/
*.swp
*.swo
*~
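# Data (the leading slash anchors the pattern at the repository root, so src/data/ stays tracked)
/data/
*.csv
# Caution: *.json also matches package.json and other config files; narrow it or re-include them with a negated pattern
*.json
*.db
*.sqlite
*.duckdb
# Logs
/logs/
*.log
# Models (anchored so that src/models/ stays tracked)
/models/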
*.h5
*.pkl
# Temporary files
tmp/
temp/
*.tmp
.DS_Store
```
---
## 📚 Technology Stack Reference
| Scenario | Recommended Technology Stack |
|----------|-----------------------------|
| Web API | FastAPI + Pydantic + SQLAlchemy |
| Data Processing | Pandas + NumPy + Polars |
| Machine Learning | Scikit-learn + XGBoost + LightGBM |
| Deep Learning | PyTorch / TensorFlow |
| Databases | PostgreSQL + Redis |
| Message Queue | RabbitMQ / Kafka |
| Task Queue | Celery |
| Monitoring | Prometheus + Grafana |
| Deployment | Docker + Docker Compose |
| CI/CD | GitHub Actions / GitLab CI |
---
## 📝 File Template Examples
### requirements.txt
```txt
# Core Dependencies
fastapi==0.104.1
uvicorn[standard]==0.24.0
pydantic==2.5.0
# Database
sqlalchemy==2.0.23
alembic==1.12.1
psycopg2-binary==2.9.9
# Testing
pytest==7.4.3
pytest-cov==4.1.0
pytest-asyncio==0.21.1
# Utilities
python-dotenv==1.0.0
loguru==0.7.2
# Development (optional)
black==23.11.0
flake8==6.1.0
mypy==1.7.1
```
### pyproject.toml (Recommended for modern Python projects)
```toml
[project]
name = "project-name"
version = "0.1.0"
description = "Project description"
authors = [{name = "Author", email = "email@example.com"}]
dependencies = [
"fastapi>=0.104.0",
"uvicorn[standard]>=0.24.0",
"sqlalchemy>=2.0.0",
]
[project.optional-dependencies]
dev = ["pytest", "black", "flake8", "mypy"]
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
```
---
## ✅ New Project Checklist
When starting a new project, ensure the following are completed:
- [ ] Create `README.md`, including project introduction and usage instructions.
- [ ] Create `LICENSE` file, clarifying the open-source license.
- [ ] Set up a Python virtual environment (venv/conda).
- [ ] Create `requirements.txt` and lock dependency versions.
- [ ] Create `.gitignore`, excluding sensitive and unnecessary files.
- [ ] Create `.env.example`, explaining required environment variables.
- [ ] Design the directory structure, adhering to the principle of separation of concerns.
- [ ] Create basic configuration files.
- [ ] Set up a code formatter (black).
- [ ] Set up a code linter (flake8/ruff).
- [ ] Write the first test case.
- [ ] Set up a Git repository and commit initial code.
- [ ] Create `CHANGELOG.md` to record version changes.
---
In software development, **project architecture** (also called software architecture) refers to:
> **A design plan for how a project, taken as a whole, is decomposed, organized, connected, and evolved.**
> It determines how code is layered, how modules are divided, how data flows, and how the system is extended and maintained.
---
## In One Sentence
**Project Architecture = Before writing specific business code, first decide "where the code goes, how modules connect, and how responsibilities are divided."**
---
## I. What problems does project architecture mainly solve?
Project architecture is not about "coding tricks," but about solving these **higher-level problems**:
* 📦 How to organize code so it doesn't get messy?
* 🔁 How do modules communicate with each other?
* 🧱 Which parts can be modified independently without affecting the whole?
* 🚀 How will the project expand in the future?
* 🧪 How to facilitate testing, debugging, and deployment?
* 👥 How can multiple people collaborate without stepping on each other's code?
---
## II. What does project architecture generally include?
### 1⃣ Directory Structure (Most intuitive)
```text
project/
├── src/
│ ├── main/
│ ├── services/
│ ├── models/
│ ├── utils/
│ └── config/
├── tests/
├── docs/
└── README.md
```
👉 Determines **"where different types of code are placed"**
---
### 2⃣ Layered Design (Core)
The most common form is **layered architecture**:
```text
Presentation Layer (UI / API)
        ↓
Business Logic Layer (Service)
        ↓
Data Access Layer (DAO / Repository)
        ↓
Database / External Systems
```
**Rules:**
* Upper layers can call lower layers
* Lower layers must not depend on upper layers
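
The two rules can be expressed as a small check over a list of dependencies. This is only a sketch: the layer names and the helpers `is_allowed` and `find_violations` are assumptions made for illustration, and the check deliberately permits an upper layer to skip a level, which stricter teams may want to forbid.

```python
# Layer order from top (index 0) to bottom; names are illustrative assumptions.
LAYERS = ["api", "service", "repository", "database"]
RANK = {name: index for index, name in enumerate(LAYERS)}


def is_allowed(importer: str, imported: str) -> bool:
    """A dependency is allowed only when it points to the same or a lower layer."""
    return RANK[importer] <= RANK[imported]


def find_violations(dependencies):
    """Return the (importer, imported) pairs that point upward."""
    return [(a, b) for a, b in dependencies if not is_allowed(a, b)]


deps = [("api", "service"), ("service", "repository"), ("repository", "service")]
print(find_violations(deps))  # [('repository', 'service')]
```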
---
### 3⃣ Module Partitioning (Responsibility Boundaries)
For example, a trading system:
```text
- market_data # Market data
- strategy # Strategy
- risk # Risk control
- order # Order placement
- account # Account
```
👉 Each module:
* Does only one type of thing
* Strives for low coupling, high cohesion
---
### 4⃣ Data and Control Flow
* Where does the data come from?
* Who is responsible for processing?
* Who is responsible for storage?
* Who is responsible for output?
For example:
```text
WebSocket → Data Cleaning → Indicator Calculation → AI Scoring → SQLite → API → Frontend
```
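A flow like this can be sketched as a chain of small functions, one per stage. Everything here is an assumption for illustration: a plain list stands in for the WebSocket feed, a simple moving average stands in for the indicator, a threshold rule stands in for AI scoring, and the table name `scores` is made up. Only the SQLite step uses a real storage engine, via the standard `sqlite3` module.

```python
import sqlite3


def clean(ticks):
    """Data cleaning: drop ticks with a missing or non-positive price."""
    return [t for t in ticks if t.get("price") and t["price"] > 0]


def moving_average(ticks, window=3):
    """Indicator calculation: mean of the last `window` prices."""
    prices = [t["price"] for t in ticks][-window:]
    return sum(prices) / len(prices)


def score(last_price, average):
    """Placeholder for the scoring step: +1 above the average, -1 otherwise."""
    return 1 if last_price > average else -1


def store(conn, symbol, value):
    """Storage: persist the score so an API layer could read it back later."""
    conn.execute("CREATE TABLE IF NOT EXISTS scores (symbol TEXT, score INTEGER)")
    conn.execute("INSERT INTO scores VALUES (?, ?)", (symbol, value))
    conn.commit()


def run_pipeline(conn, symbol, raw_ticks):
    """Wire the stages together in the order the data flows."""
    ticks = clean(raw_ticks)
    average = moving_average(ticks)
    result = score(ticks[-1]["price"], average)
    store(conn, symbol, result)
    return result


conn = sqlite3.connect(":memory:")
raw = [{"price": 100.0}, {"price": None}, {"price": 102.0}, {"price": 107.0}]
print(run_pipeline(conn, "BTCUSDT", raw))  # 1
print(conn.execute("SELECT symbol, score FROM scores").fetchall())
```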
---
### 5⃣ Technology Stack Selection (Part of architecture)
* Programming language (Python / Java / Go)
* Framework (FastAPI / Spring / Django)
* Communication method (HTTP / WebSocket / MQ)
* Storage (SQLite / Redis / PostgreSQL)
* Deployment (Local / Docker / Cloud)
---
## III. Common Project Architecture Types (Must-Know for Beginners)
### 1⃣ Monolithic Architecture
```text
One project, one process
```
**Suitable for:**
* Personal projects
* Prototypes
* Small systems
**Advantages:**
* Simple
* Easy to debug
**Disadvantages:**
* Difficult to scale later
---
### 2⃣ Layered Architecture (Most common)
```text
Controller → Service → Repository
```
**Suitable for:**
* Web backends
* Business systems
---
### 3⃣ Modular Architecture
```text
core + plugins
```
**Suitable for:**
* Pluggable systems
* Strategy / indicator systems
👉 **A very good fit for the quant and AI analysis work you are doing**
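
A "core + plugins" layout can be sketched with a registry and a decorator. The names `INDICATORS`, `register`, and `run_all` are hypothetical; the assumption is that each indicator is a plain function from a list of prices to a number.

```python
from typing import Callable, Dict, List

# The "core": a registry that knows nothing about concrete indicators.
INDICATORS: Dict[str, Callable[[List[float]], float]] = {}


def register(name: str):
    """Decorator that adds a function to the registry under `name`."""
    def decorator(func):
        INDICATORS[name] = func
        return func
    return decorator


# The "plugins": each registers itself; adding another needs no core change.
@register("mean")
def mean(prices: List[float]) -> float:
    return sum(prices) / len(prices)


@register("range")
def price_range(prices: List[float]) -> float:
    return max(prices) - min(prices)


def run_all(prices: List[float]) -> Dict[str, float]:
    """Core loop: run every registered indicator on the same input."""
    return {name: func(prices) for name, func in INDICATORS.items()}


print(run_all([1.0, 2.0, 6.0]))  # {'mean': 3.0, 'range': 5.0}
```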
---
### 4⃣ Microservices Architecture (Advanced)
```text
Each service is an independent process + API communication
```
**Suitable for:**
* Large teams
* High concurrency
* Long-term evolution
**Not recommended as a starting point for beginners**
---
## IV. Understand with a "Real Example" (Closer to what you are doing)
Suppose you are building an **AI analysis system for Binance perpetual contracts**:
```text
backend/
├── data/
│ └── binance_ws.py # Market data subscription
├── indicators/
│ └── vpvr.py
├── strategy/
│ └── signal_score.py
├── storage/
│ └── sqlite_writer.py
├── api/
│ └── http_server.py
└── main.py
```
This is **project architecture design**:
* Each folder is responsible for one thing
* Replaceable, testable
* Adding a Telegram bot or a web frontend later does not require rewriting the core logic
---
## V. Common Mistakes for Beginners ⚠️
* ❌ Starting with microservices
* ❌ All code written in one file
* ❌ Pursuing "advanced" architecture instead of "maintainable" architecture
* ❌ Writing code without a clear understanding of data flow
---
## VI. Learning Path Recommendations (Very Important)
Since you are studying CS, this order is highly recommended:
1. **First build a runnable project (not perfect)**
2. **When code starts getting messy → then learn architecture**
3. Learn:
   * Module partitioning
   * Layering
   * Dependency direction
4. Then learn:
   * Design patterns
   * Microservices / Message queues
---
**Version**: 1.0
**Update Date**: 2025-11-24
**Maintainers**: CLAUDE, CODEX, KIMI
```