# AGENTS.md - AI Coding Assistant Guidelines

## Project Overview
BTC Accumulation Bot - Data Collection Phase. High-performance async data collection system for cbBTC on Hyperliquid with TimescaleDB storage. Python 3.11, asyncio, FastAPI, asyncpg, WebSockets.
## Build/Run Commands

### Docker (Primary deployment - Synology DS218+)

```bash
# Build and start all services (timescaledb, data_collector, api_server)
cd docker && docker-compose up -d --build

# View logs
docker-compose logs -f data_collector
docker-compose logs -f api_server

# Full deploy (creates dirs, pulls, builds, starts)
bash scripts/deploy.sh
```
### Development

```bash
# API server (requires DB running)
cd src/api && uvicorn server:app --reload --host 0.0.0.0 --port 8000
# Docs: http://localhost:8000/docs | Dashboard: http://localhost:8000/dashboard

# Data collector
cd src/data_collector && python -m data_collector.main
```
### Testing

```bash
# Run all tests
pytest

# Run a specific test file
pytest tests/data_collector/test_websocket_client.py

# Run a single test by name
pytest tests/data_collector/test_websocket_client.py::test_websocket_connection -v

# Run with coverage
pytest --cov=src --cov-report=html
```
Note: The `tests/` directory structure exists, but test files have not been written yet. When creating tests, use pytest with `pytest-asyncio` for async test support.
### Linting & Formatting

No config files exist for these tools; use these flags:

```bash
flake8 src/ --max-line-length=100 --extend-ignore=E203,W503
black --check src/   # Check formatting
black src/           # Auto-format
mypy src/ --ignore-missing-imports
```
## Project Structure

```
src/
├── data_collector/          # WebSocket client, buffer, database
│   ├── __init__.py
│   ├── main.py              # Entry point, orchestration, signal handling
│   ├── websocket_client.py  # Hyperliquid WS client, Candle dataclass
│   ├── candle_buffer.py     # Circular buffer with async flush
│   ├── database.py          # asyncpg/TimescaleDB interface
│   └── backfill.py          # Historical data backfill from REST API
└── api/
    ├── server.py            # FastAPI app, all endpoints
    └── dashboard/static/
        └── index.html       # Real-time web dashboard
config/data_config.yaml      # Non-secret operational config
docker/
├── docker-compose.yml       # 3-service orchestration
├── Dockerfile.api / .collector  # python:3.11-slim based
└── init-scripts/            # 01-schema.sql, 02-optimization.sql
scripts/                     # deploy.sh, backup.sh, health_check.sh, backfill.sh
tests/data_collector/        # Test directory (empty - tests not yet written)
```
## Code Style Guidelines

### Imports

Group in this order, separated by blank lines:

- Standard library (`import asyncio`, `from datetime import datetime`)
- Third-party (`import websockets`, `import asyncpg`, `from fastapi import FastAPI`)
- Local/relative (`from .websocket_client import Candle`)

Use relative imports (`.module`) within the `data_collector` package. Use absolute imports for third-party packages.
### Formatting
- Line length: 100 characters max
- Indentation: 4 spaces
- Strings: double quotes (single only to avoid escaping)
- Trailing commas in multi-line collections
- Formatter: black
### Type Hints

- Required on all function parameters and return values
- `Optional[Type]` for nullable values
- `List[Type]`, `Dict[str, Any]` from the `typing` module
- `@dataclass` for data-holding classes (e.g., `Candle`, `BufferStats`)
- Callable types for callbacks: `Callable[[Candle], Awaitable[None]]`
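A minimal sketch putting these conventions together (the `Candle` fields and `summarize` helper here are illustrative, not the project's actual dataclass in `websocket_client.py`):

```python
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, List, Optional

@dataclass
class Candle:
    """Illustrative candle; the real fields live in websocket_client.py."""
    open_time: int
    close: float
    volume: Optional[float] = None  # Optional[...] for nullable values

# Callable type for async candle callbacks
OnCandle = Callable[[Candle], Awaitable[None]]

def summarize(candles: List[Candle]) -> Dict[str, Any]:
    """Every parameter and return value carries a type hint."""
    return {
        "count": len(candles),
        "last_close": candles[-1].close if candles else None,
    }

print(summarize([Candle(open_time=0, close=97000.0)]))
# → {'count': 1, 'last_close': 97000.0}
```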
### Naming Conventions

- Classes: `PascalCase` (`DataCollector`, `CandleBuffer`)
- Functions/variables: `snake_case` (`get_candles`, `buffer_size`)
- Constants: `UPPER_SNAKE_CASE` (`DB_HOST`, `MAX_BUFFER_SIZE`)
- Private methods: `_leading_underscore` (`_handle_reconnect`, `_flush_loop`)
### Docstrings
- Triple double quotes on all modules, classes, and public methods
- Brief one-line description on first line
- Optional blank line + detail if needed
- No Args/Returns sections (not strict Google-style)
```python
"""Add a candle to the buffer.

Returns True if added, False if buffer full and candle dropped.
"""
```
### Error Handling

- `try`/`except` with specific exceptions (never bare `except:`)
- Log errors with `logger.error()` before re-raising in critical paths
- Catch `asyncio.CancelledError` to break loops cleanly
- Use `finally` blocks for cleanup (always call `self.stop()`)
- Use `@asynccontextmanager` for resource acquisition (DB connections)
### Async Patterns

- `async`/`await` for all I/O operations
- `asyncio.Lock()` for thread-safe buffer access
- `asyncio.Event()` for stop/flush coordination
- `asyncio.create_task()` for background loops
- `asyncio.gather(*tasks, return_exceptions=True)` for parallel cleanup
- `asyncio.wait_for(coro, timeout)` for graceful shutdown
- `asyncio.run(main())` as the entry point
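These primitives typically appear together in a buffer/flush shape like this sketch (all names are illustrative, not the project's `CandleBuffer`):

```python
import asyncio

buffer: list[int] = []
flushed: list[list[int]] = []
lock = asyncio.Lock()          # thread-safe buffer access
stop_event = asyncio.Event()   # stop/flush coordination

async def flush_loop() -> None:
    """Background loop: drain the buffer until stop is signalled."""
    while not stop_event.is_set():
        async with lock:
            if buffer:
                flushed.append(buffer.copy())
                buffer.clear()
        await asyncio.sleep(0.01)

async def main() -> None:
    task = asyncio.create_task(flush_loop())   # background loop
    async with lock:
        buffer.extend([1, 2, 3])
    await asyncio.sleep(0.05)                  # let the flush loop run
    stop_event.set()                           # signal shutdown
    # parallel cleanup under a shutdown timeout
    await asyncio.wait_for(
        asyncio.gather(task, return_exceptions=True), timeout=1.0
    )

asyncio.run(main())   # asyncio.run(main()) as the entry point
print(flushed)        # → [[1, 2, 3]]
```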
### Logging

- Module-level: `logger = logging.getLogger(__name__)` in every file
- Format: `'%(asctime)s - %(name)s - %(levelname)s - %(message)s'`
- Log level from env: `getattr(logging, os.getenv('LOG_LEVEL', 'INFO'))`
- Use f-strings in log messages with relevant context
- Levels: DEBUG (candle receipt), INFO (lifecycle), WARNING (gaps), ERROR (failures)
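A sketch of this setup in one place (the flush message is an illustrative example, not a real log line from the project):

```python
import logging
import os

logging.basicConfig(
    level=getattr(logging, os.getenv("LOG_LEVEL", "INFO")),
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger(__name__)   # module-level logger in every file

flushed = 42
logger.info(f"Buffer flushed: {flushed} candles")   # f-string with context
```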
### Database (asyncpg + TimescaleDB)

- Connection pool: `asyncpg.create_pool(min_size=1, max_size=N)`
- `@asynccontextmanager` wrapper for connection acquisition
- Batch inserts with `executemany()`
- Upserts with `ON CONFLICT ... DO UPDATE`
- Positional params: `$1, $2, ...` (not `%s`)
- Use `conn.fetch()`, `conn.fetchrow()`, `conn.fetchval()` for results
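Put together, the connection pattern looks roughly like this. To keep the sketch self-contained and runnable without a database, it uses `unittest.mock.AsyncMock` as a stand-in for the real asyncpg pool; the table and column names are assumptions, not the project's actual schema:

```python
import asyncio
from contextlib import asynccontextmanager
from unittest.mock import AsyncMock

# Stand-in for `await asyncpg.create_pool(dsn, min_size=1, max_size=10)`
pool = AsyncMock()
pool.acquire.return_value = AsyncMock()   # mock connection with awaitable methods

@asynccontextmanager
async def get_connection():
    """Acquire a pooled connection and always release it."""
    conn = await pool.acquire()
    try:
        yield conn
    finally:
        await pool.release(conn)

# Positional params ($1, $2, ...) and an upsert via ON CONFLICT
UPSERT_SQL = """
    INSERT INTO candles (coin, interval, open_time, close, volume)
    VALUES ($1, $2, $3, $4, $5)
    ON CONFLICT (coin, interval, open_time)
    DO UPDATE SET close = EXCLUDED.close, volume = EXCLUDED.volume
"""

async def insert_batch(rows: list[tuple]) -> None:
    async with get_connection() as conn:
        await conn.executemany(UPSERT_SQL, rows)   # batch insert

asyncio.run(insert_batch([("BTC", "1m", 0, 97000.0, 1.5)]))
```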
### Configuration

- Secrets via environment variables (`os.getenv('DB_PASSWORD')`)
- Non-secret config in `config/data_config.yaml`
- Constructor defaults fall back to env vars
- Never commit `.env` files (they contain real credentials)
## Common Tasks

### Add New API Endpoint

- Add route in `src/api/server.py` with `@app.get()`/`@app.post()`
- Type-hint query params with `Query()`; return `dict` or raise `HTTPException`
- Use the `asyncpg` pool for database queries
### Add New Data Source

- Create a module in `src/data_collector/` following the `websocket_client.py` pattern
- Implement async `connect()`, `disconnect()`, `receive()` methods
- Use the callback architecture: `on_data`, `on_error` callables
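The shape of such a source, sketched with a hypothetical `MockSource` (the class and the message payload are illustrative; only the method and callback names mirror the stated pattern):

```python
import asyncio
from typing import Awaitable, Callable, Optional

class MockSource:
    """Hypothetical data source following the connect/disconnect/receive pattern."""

    def __init__(
        self,
        on_data: Callable[[str], Awaitable[None]],
        on_error: Optional[Callable[[Exception], Awaitable[None]]] = None,
    ) -> None:
        self.on_data = on_data
        self.on_error = on_error
        self.connected = False

    async def connect(self) -> None:
        self.connected = True

    async def disconnect(self) -> None:
        self.connected = False

    async def receive(self) -> None:
        try:
            await self.on_data('{"price": "97000"}')   # stand-in for a real message
        except Exception as exc:
            if self.on_error:
                await self.on_error(exc)

received: list[str] = []

async def handle(msg: str) -> None:
    received.append(msg)

async def main() -> None:
    src = MockSource(on_data=handle)
    await src.connect()
    await src.receive()
    await src.disconnect()

asyncio.run(main())
```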
### Database Schema Changes

- Update `docker/init-scripts/01-schema.sql`
- Update `DatabaseManager` methods in `src/data_collector/database.py`
- Rebuild: `docker-compose down -v && docker-compose up -d --build`
### Writing Tests

- Create test files in `tests/data_collector/` (e.g., `test_websocket_client.py`)
- Use `pytest-asyncio` for async tests: `@pytest.mark.asyncio`
- Mock external services (WebSocket, database) with `unittest.mock`
- Use descriptive names: `test_websocket_reconnection_with_backoff`
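A mocked-WebSocket test body might look like this sketch. It is made runnable here with a plain `asyncio.run`; in the real suite the function would instead carry `@pytest.mark.asyncio` and be collected by pytest:

```python
import asyncio
from unittest.mock import AsyncMock

async def test_websocket_receive_returns_message() -> None:
    ws = AsyncMock()   # mock the external WebSocket connection
    ws.recv.return_value = '{"channel": "candle"}'
    msg = await ws.recv()
    assert msg == '{"channel": "candle"}'
    ws.recv.assert_awaited_once()

asyncio.run(test_websocket_receive_returns_message())
```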
## Historical Data Backfill

The `backfill.py` module downloads historical candle data from Hyperliquid's REST API.
API Limitations:
- Max 5000 candles per coin/interval combination
- 500 candles per response (requires pagination)
- Available intervals: 1m, 3m, 5m, 15m, 30m, 1h, 2h, 4h, 8h, 12h, 1d, 3d, 1w, 1M
Usage - Python Module:

```python
from data_collector.backfill import HyperliquidBackfill
from data_collector.database import DatabaseManager

async with HyperliquidBackfill(db, coin="BTC", intervals=["1m", "1h"]) as backfill:
    # Backfill last 7 days for all configured intervals
    results = await backfill.backfill_all_intervals(days_back=7)

    # Or backfill a specific interval
    count = await backfill.backfill_interval("1m", days_back=3)
```
Usage - CLI:

```bash
# Backfill 7 days of 1m candles for BTC
cd src/data_collector && python -m data_collector.backfill --coin BTC --days 7 --intervals 1m

# Backfill multiple intervals
python -m data_collector.backfill --coin BTC --days 30 --intervals 1m 5m 1h

# Backfill MAXIMUM available data (5000 candles per interval)
python -m data_collector.backfill --coin BTC --days max --intervals 1m 1h 1d

# Or use the convenience script
bash scripts/backfill.sh BTC 7 "1m 5m 1h"
bash scripts/backfill.sh BTC max "1m 1h 1d"   # Maximum data
```
Data Coverage by Interval:
- 1m candles: ~3.5 days (5000 candles)
- 1h candles: ~7 months (5000 candles)
- 1d candles: ~13.7 years (5000 candles)
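These coverage figures follow directly from the 5000-candle cap; a quick sanity check:

```python
MAX_CANDLES = 5000
MINUTES_PER_DAY = 60 * 24

for interval, minutes in [("1m", 1), ("1h", 60), ("1d", 1440)]:
    days = MAX_CANDLES * minutes / MINUTES_PER_DAY
    print(f"{interval}: ~{days:,.1f} days (~{days / 365.25:.2f} years)")
# 1m → ~3.5 days; 1h → ~208.3 days (≈7 months); 1d → ~5,000 days (≈13.7 years)
```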