Compare commits

...

38 Commits

Author SHA1 Message Date
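9abca3eeab docs: reflect the /infra worker observability rule + trade-monitor climax alignment
- CLAUDE.md: add the rule that every WSL docker worker must be observable in /infra (BE team rule) and
  add trade-monitor (:18715) to the services row
- README.md: update the wording for the sell_climax alignment, 36 tests, and env precedence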

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01N83vbXEA8h83GMXQcg8fxD
2026-07-03 11:15:36 +09:00
5dbb11ac83 fix(trade-monitor): align sell_climax with holdings_intel
Following the BE reply (holdings_intel.py:109-118), change the reversal criterion from
price<day_open to price<day_high×climax_close_pct (upper wick).
- add day_high (stck_hgpr) to kis_client.get_quote
- monitor._build_ctx passes day_high into ctx
- read climax_vol_x and climax_close_pct from the monitor-set exit_params
  (fallback: TM_CLIMAX_VOL_MULT/0.97)
- tests 36/36 (2 climax exit_params cases added)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01N83vbXEA8h83GMXQcg8fxD
2026-07-03 11:15:27 +09:00
8e1b20190d docs(readme): add the trade-monitor worker section + /infra observability rule 2026-07-03 11:05:57 +09:00
fa6ef6c5c8 feat(trade-monitor): Dockerfile + compose service (18715) + .env.example 2026-07-03 01:48:14 +09:00
12aa55ed14 feat(trade-monitor): FastAPI lifespan + heartbeat wiring + /health 2026-07-03 01:47:34 +09:00
ce8983c1b9 feat(trade-monitor): monitor orchestration (run_cycle/loop/state_fn) 2026-07-03 01:47:34 +09:00
04aff34883 feat(trade-monitor): NAS trade-alert client (monitor-set/report) 2026-07-03 01:46:37 +09:00
d761716e00 feat(trade-monitor): KIS own token + quote + daily-bar client 2026-07-03 01:46:37 +09:00
241ce41a6a feat(trade-monitor): buy/sell condition logic (§6, 8 conditions) 2026-07-03 01:45:41 +09:00
366a9160d5 feat(trade-monitor): pure indicator module (sma/rsi/highest_high) 2026-07-03 01:45:41 +09:00
141209ad42 feat(trade-monitor): scaffolding + config 2026-07-03 01:44:25 +09:00
03e50d2be1 fix(task-watcher): include _shared in the build context (fixes the heartbeat import crash)
task-watcher's build context was ./task-watcher, so services/_shared was missing from the
image → the `from _shared.heartbeat import` added by A3 raised ModuleNotFoundError and the
container crashed immediately (alive=false after restart). Fixed by following the render worker
pattern: context=. + COPY _shared /app/_shared + PYTHONPATH=/app.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019LV86jBozkNhSFXJA412fq
2026-07-01 02:26:57 +09:00
54fca07d43 feat(ai_trade): NAS Redis heartbeat (trader market_open/closed)
- ai_trade/heartbeat.py: build_trader_payload() + heartbeat_loop() as a self-contained mini helper
  (it runs on the Windows host where the _shared import path differs, so it is implemented independently with the same contract)
- ai_trade/main.py: spawn hb_task in lifespan + cancel on shutdown
  state_fn = combination of scheduler._is_market_day & _is_polling_window(KST now)
  signals = len(state.signals), injected live
- requirements.txt: add redis>=5.0
- ai_trade/tests/test_heartbeat.py: TDD verification of build_trader_payload (3 cases)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019LV86jBozkNhSFXJA412fq
2026-07-01 01:07:00 +09:00
574b5712c3 feat(task-watcher): send heartbeat (state=mode, exposes the paused reason)
- in watcher_loop, right after the mode decision: worker:task-watcher:heartbeat SET EX 45
- payload: build_payload(state=mode, extra={"mode": mode})
- LOOP_INTERVAL 30s < TTL 45s → refreshed periodically before expiry
- add conftest.py: injects services/ into sys.path so _shared can be imported
- tests/test_watcher.py: verifies the payload kind/state/mode fields (1 passed)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-07-01 00:59:28 +09:00
2ff31b2e76 feat(render-workers): heartbeat wiring for the 4 render workers + poll_once counters
- consume WorkerStats/utc_now_iso/heartbeat_loop from services/_shared/heartbeat.py (A1)
- each worker.py of image-render / video-render / music-render / insta-render:
  add module-level stats = WorkerStats(); in poll_once set busy=True before dispatch,
  jobs_done+1 after ack / jobs_failed+1 after fail + last_job_at + busy=False
- each main.py: spawn aioredis(decode_responses=False) + heartbeat_loop task in lifespan,
  cancel + aclose on shutdown
- each tests/test_worker.py: add test_poll_once_increments_jobs_done
  (image:flux / video:sora / music:suno / insta:_process_one mock)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019LV86jBozkNhSFXJA412fq
2026-07-01 00:52:57 +09:00
d1b9ff570d feat(_shared): worker heartbeat module (worker:<name>:heartbeat TTL SET)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-07-01 00:43:01 +09:00
4fb3d12244 merge: co-gahusb AI client wiring 2026-06-12 23:46:35 +09:00
789a807d50 feat(co-gahusb): AI client wiring (.mcp.json + role block)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 23:46:34 +09:00
ad141a2887 fix(insta-render): align INSTA_MEDIA_ROOT under insta_cards (matches the nginx serving path)
The worker writes PNGs to INSTA_MEDIA_ROOT/{slate_id}, but the default /mnt/nas/webpage/data/insta omitted the insta_cards subdirectory → files were saved to data/insta/{id}. However, nginx (/media/insta→/data/insta_cards), insta-lab CARDS_DIR, the frontend mount, and the old renderer all expect data/insta/insta_cards/{id} → /media/insta/{id}/NN.png returned 404.

Corrected INSTA_MEDIA_ROOT to /mnt/nas/webpage/data/insta/insta_cards (.env + compose default + .env.example). No code change → applied by recreating the container only (no rebuild needed). The SMB volume mount is on the parent directory, so it remains valid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 01:18:09 +09:00
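6774067505 fix(insta-render): queue connection socket_timeout=30 (corrects None→30)
Root cause confirmed by experiment: in redis-py blocking reads, when socket_timeout is at or below
the BLMOVE block (5s), or None, a race at the read_timeout boundary causes intermittent
"Timeout reading" → dequeue fails → slates stall in draft. socket_timeout 10/30 was stable in every
experiment. Set explicitly to 30, larger than the block (the previous None commit was misleading
because only a standalone test passed; it broke under the connection-reuse pattern).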

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 03:17:34 +09:00
c451f5313b fix(insta-render): fix BLMOVE dequeue breaking on a short socket_timeout
The socket_timeout in REDIS_URL (<5s) was shorter than the ReliableQueue BLMOVE 5-second block, so
every idle dequeue hit "Timeout reading", no jobs were pulled, and slates stalled in draft (~2026-05-22~).
Normalized by creating the queue connection with socket_timeout=None + socket_keepalive (make_queue_redis).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 16:08:43 +09:00
9241b5cd90 fix(insta-render): wait for fonts.ready + verify the PNG is non-empty (resolves the render known-issue)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 12:53:07 +09:00
8bfc8e153f polish(insta-render): CSS accent | safe + cover sub clamp
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 12:50:25 +09:00
232aa52adb feat(insta-render): modern minimal design system template
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 12:46:19 +09:00
d2f7030446 docs: add README.md with an overview of ai_trade (V2) + services workers
- directory structure (ai_trade / services 4 workers + task-watcher / legacy)
- ai_trade: buy/sell rules, key files, start/health
- services: ReliableQueue reliability pattern, operations, environment variables
- full environment variables / tests / known pitfalls / Phase progress (0-7)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 00:59:46 +09:00
43ee610780 fix(image-render): apply F6 ReliableQueue (F6 part 5)
- worker.py: poll_once + ReliableQueue + startup recovery
- 3 providers (gpt_image/nano_banana/flux): dispatch table preserved
- Dockerfile: build context=services/, includes _shared, PYTHONPATH=/app
- docker-compose.yml: update the image-render build context

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 20:17:08 +09:00
f79c5c26df fix(video-render): apply F6 ReliableQueue (F6 part 4)
- worker.py: poll_once + ReliableQueue + startup recovery
- 4 providers (sora/veo/kling/seedance): dispatch table preserved
- Dockerfile: build context=services/, includes _shared, PYTHONPATH=/app
- docker-compose.yml: update the video-render build context

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 20:16:01 +09:00
7108e5e4f5 fix(music-render): apply F6 ReliableQueue (F6 part 3)
- worker.py: add poll_once, BLPOP → ReliableQueue.dequeue/ack/fail + startup recovery
- 12 job_type dispatch table preserved (the existing 13 tests still PASS)
- Dockerfile: build context=services/, includes _shared, PYTHONPATH=/app
- docker-compose.yml: update the music-render build context

If dispatch itself raises an unhandled exception, fail(raw, payload) routes the job to retry/dead-letter.
The normal case, where the provider function catches the error and sends webhook("failed"), is acked (idempotent webhook).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 20:14:59 +09:00
1e6638a64b fix(insta-render): apply F6 ReliableQueue: BLMOVE + ack/fail (F6 part 2)
- worker.py: BLPOP → ReliableQueue.dequeue / ack / fail / startup recovery
- _process_one: on exception, send webhook(failed) and then raise; poll_once handles
  retry/dead-letter via fail(raw, payload)
- add the poll_once function (the unit under test)
- Dockerfile: raise the build context to services/ and include _shared, PYTHONPATH=/app
- docker-compose.yml: update the insta-render build context

Existing webhook behavior is unchanged (idempotent): even if failed is reported to the NAS on every retry,
only the last state is visible. Dead-letter is handled separately through operational monitoring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 20:13:24 +09:00
32308bede6 feat(services): add _shared/reliable_queue: BLMOVE + processing list + retry (F6 part 1)
Code review F6: the render workers (insta/music/video/image) lose the job if they crash right after BLPOP.
Adds a common ReliableQueue class under services/_shared/:

- dequeue: BLMOVE main → processing (atomic)
- ack: LREM processing 1 (removes one entry on success)
- fail: attempts++ and requeue to the main queue; moved to dead_letter:* once max_attempts is reached
- recover: at startup, move orphans in the worker's own processing list to the main queue (attempts incremented)

No change on the producer side. The paired NAS workers (insta-lab/music-lab/video-lab/NAS-side image-render)
keep using LPUSH. An optional attempts field is added to the payload schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 20:10:47 +09:00
ac6409605c feat(ai_trade): poll_loop purges expired signals at the end of every cycle (F5 part 4)
Calls state.purge_expired_signals(now) at the end of every cycle so that state.signals does not
grow without bound even while the Phase 5 consumer (agent-office /signal) is not attached.
Signals with expires_at < now are removed automatically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 20:01:40 +09:00
e4d02b8059 feat(ai_trade): attach cycle_id + expires_at to emitted signals (F5 part 3)
- state.signal_cycle_id += 1 on entry to generate_signals (incremented whether or not anything is emitted)
- add cycle_id + expires_at fields to _build_buy_signal/_build_sell_signal
- expires_at = as_of + settings.signal_ttl_seconds (default 300s)
- add cycle=N to both the buy and sell logs

Set signal_ttl_seconds=300 explicitly on the settings MagicMock in the existing
test_poll_loop_calls_generate_signals_after_cycle (timedelta raises TypeError when given a MagicMock).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 19:59:35 +09:00
94a034ef38 feat(ai_trade): add the SIGNAL_TTL_SECONDS env (F5 part 2)
TTL used to compute a signal's expires_at (default 300s). Adjustable per environment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 19:54:45 +09:00
2a11d05f4a feat(ai_trade): add an expires_at + cycle_id lifecycle to state.signals (F5 part 1)
Code review F5, a prerequisite for the Phase 5 consumer (agent-office /signal):
PollState.signal_cycle_id (process auto-increment) + get_active_signals(now) +
purge_expired_signals(now) helper. Legacy signals without expires_at are treated as expired.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 19:54:18 +09:00
c2e77a7310 fix(ai_trade): unify Chronos confidence on absolute spread (F4)
Code review F4: the signal_generator hard gate (L79) uses absolute spread (0.6 threshold), but
the confidence at chronos_predictor:106 used relative spread (q90-q10)/max(|median|, 0.001).
In the zero-shot median≈0 case the spread blows up, conf is clamped to 0, and in the end no
buy signal can clear confidence_threshold (0.7).

Unified formula: conf = max(0, min(1, 1 - spread/_SPREAD_THRESHOLD)). _SPREAD_THRESHOLD=0.6
is the same value as the signal_generator hard gate.

- spread≈0 → conf≈1 (confident)
- spread=0.3 → conf=0.5 (middle)
- spread≥0.6 → conf=0 (rejected)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 19:39:15 +09:00
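bea27a75cf fix(ai_trade): make the post-close trigger state-based (F3)
Code review F3: _is_post_close_trigger was true only in the 1-minute window 16:00:00-16:00:59.
Combined with the 5-minute sleep and a non-deterministic cycle start time, it can be missed entirely
(e.g. a cycle that starts at 15:31 wakes at 15:36, 15:41 ... 16:01).

Changed to a state-based check: "post-close has not run yet today and the current time is ≥ 16:00".
poll_loop guarantees one run per day through the last_post_close_date variable.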

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 19:36:10 +09:00
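39adfc5fc5 fix(ai_trade): serialize the KIS throttle with asyncio.Lock (F2)
Code review F2: pull_worker.py fetches per-ticker minute bars / order book concurrently with asyncio.gather, but
_throttle() only updated _last_throttle_at without a lock, which is a race condition. Several coroutines could
compute the same elapsed and wake at the same time, risking a violation of the KIS limit of 2 calls per second (EGW00201).

Measured in a test with 5 concurrent gather calls: 0.51s before the fix → 2.0s+ after, confirming serialization.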

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 19:32:50 +09:00
1a848faac4 fix(ai_trade): fix the V1_TOKEN_PATH default to go through legacy/signal_v1/ (F1)
Code review F1: V1 was moved to legacy/signal_v1/, but the config.py default still pointed at
the old path, so with .env unset KIS REST failed with "V1 token file missing".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 19:31:55 +09:00
81 changed files with 4666 additions and 156 deletions

9
.mcp.json Normal file

@@ -0,0 +1,9 @@
{
  "mcpServers": {
    "co-gahusb": {
      "type": "http",
      "url": "https://gahusb.synology.me/api/co/mcp",
      "headers": { "Authorization": "Bearer ${CO_BUS_KEY}" }
    }
  }
}

View File

@@ -15,12 +15,14 @@ Windows AI 머신 (AMD 9800X3D + RTX 5070 Ti 16GB) 의 두 신호 파이프라
 | `ai_trade/` | Main auto-trading service (formerly `signal_v2`, renamed 2026-05-19): Chronos-bolt + minute-bar momentum + KIS WebSocket + signal generation | `:8001` | **Phase 4 complete (2026-05-17)**, Phase 5 pending |
 | `legacy/start_v1.bat` | (deprecated) V1 entry point, moved from the root `start.bat`. Auto-start blocked | - | **OFF** |
 | `ai_trade/start.bat` | Auto-trading entry point | - | runs `ai_trade/main.py` with uvicorn |
-| `services/` | (planned) NAS↔Windows distributed workers: insta-render·music-render·video-render·task-watcher | 18710~ | **Plan-B-Insta in progress** |
+| `services/` | NAS↔Windows distributed workers: insta·music·video·image-render·task-watcher·**trade-monitor (:18715, real-time trade alerts)** | 18710~ | In operation |
 | `.env` | Environment variables (`KIS_REAL_*`, `TELEGRAM_*`, `STOCK_API_URL`, `WEBAI_API_KEY`, `LOG_LEVEL`) | - | |
 | `requirements.txt` | Shared dependencies | - | torch, chronos-forecasting, fastapi, httpx, websockets, etc. |
 `.venv` is **structurally broken**: `pyvenv.cfg` contains a Korean user path (`C:\Users\박재오\...`) that the console code page cannot round-trip. Run tests with the system Python: `C:\Users\jaeoh\AppData\Local\Programs\Python\Python312\python.exe -m pytest ai_trade/tests -q`.
+> **Distributed worker /infra observability rule (team rule, set by BE)**: every WSL docker worker must ① send a Redis `worker:<name>:heartbeat` (EX45) ② be registered in BE `node_monitor.WORKER_REGISTRY` ③ be exposed in `/api/agent-office/nodes` and web-ui `/infra`. trade-monitor is registered as kind=trader.
 ---
 ## How to start the server
@@ -137,3 +139,14 @@ cd C:\Users\jaeoh\Desktop\workspace\web-ai\ai_trade
 - **When a spec amendment occurs**: commit the code to `web-ai`, and commit the spec update to `web-ui/docs/superpowers/specs/` (example: the Phase 4 spread formula change = web-ui commit `534ded5`)
 For the detailed V1 guide see `signal_v1/CLAUDE.md` (if present).
+---
+## Collaboration team bus (co-gahusb): this session's role is **AI**
+This session plays the AI research (AI) role. It collaborates with the other sessions (FE/BE/Producer) through the co-gahusb MCP tools.
+- **Ownership**: this session writes only to the `web-ai` repo (FE=web-ui, BE=web-backend).
+- **Always call `acquire_lock(resource, "AI")` before changing a shared resource**: targets = `nas-deploy`, `stock-db-schema`, `lotto-db-schema`, `memory-mirror`, `nginx-conf`, `compose`. If the lock is held, wait; for long tasks use `heartbeat_lock`; when done, call `release_lock`.
+- **Pass `role="AI"` on every tool call** (or AI in `from_role`/`created_by`).
+- **Receiving**: use `/loop` to periodically check `read_inbox("AI", after_id=<last>)` + `list_tasks(assignee_role="AI")`.
+- `CO_BUS_KEY` is injected as an environment variable (never commit it). `${CO_BUS_KEY}` in `.mcp.json` is substituted from the process environment → run `setx CO_BUS_KEY "..."`, then start `claude` in a new terminal.

269
README.md Normal file

@@ -0,0 +1,269 @@
# web-ai
Services in two areas, running on the Windows AI machine (AMD 9800X3D + RTX 5070 Ti 16GB):
1. **ai_trade**: Confidence Signal Pipeline V2. A FastAPI worker that combines the NAS stock backend with the KIS Open API to generate buy/sell signals.
2. **services**: NAS↔Windows distributed workers: rendering (Instagram cards / music / video / image) + task-watcher + **trade-monitor** (real-time trade alerts).
For the parent workspace context see `../CLAUDE.md`; for details on this directory see `CLAUDE.md`; for operational checkpoints see `CHECK_POINT.md`.
---
## Directory structure
| Path | Role | Port |
|------|------|------|
| `ai_trade/` | Main auto-trading service. Chronos-bolt (or Chronos-2) + minute-bar momentum + KIS WebSocket order book + buy/sell signal generator. | `:8001` |
| `services/_shared/` | Modules shared by the 4 render workers (`ReliableQueue`: BLMOVE + ack/fail + recovery). | - |
| `services/insta-render/` | Playwright render worker for Instagram cards. Consumes NAS Redis `queue:insta-render`. | `:18710` |
| `services/music-render/` | Suno + MusicGen music generation worker. Consumes `queue:music-render`. | `:18711` |
| `services/video-render/` | Video generation gateway with 4 providers: sora / veo / kling / seedance. Consumes `queue:video-render`. | `:18712` |
| `services/image-render/` | 3 providers: gpt_image / nano_banana / flux (local ComfyUI). Consumes `queue:image-render`. | `:18714` |
| `services/task-watcher/` | Toggles `queue:paused` during 박재오's working hours to pause the workers. | `:18713` |
| `services/trade-monitor/` | Real-time trade alerts. Pulls monitor-set → KIS quotes + TA conditions (§6, 8 kinds) → report + heartbeat (kind=trader). Stateless. | `:18715` |
| `legacy/signal_v1/` | ⚠ **DEPRECATED** (2026-05-19). LSTM bot. Auto-start is blocked. | OFF |
---
## ai_trade: Confidence Signal Pipeline V2
Pulls portfolio / news_sentiment / screener from the NAS stock backend (`:18500`), enriches them with minute bars and order-book data via KIS REST/WebSocket, and then generates buy/sell signals from the Chronos forecast and a 5-minute-bar momentum classification.
### Buy (screener Top-N + portfolio)
When all of the following hold, confidence is computed and a signal is emitted if it exceeds the threshold:
1. `chronos.median > 0`
2. `chronos.q90 - chronos.q10 < 0.6` (absolute spread)
3. `minute_momentum == strong_up`
4. `asking_price.bid_ratio >= 0.6`

Composite confidence = `chronos_conf * 0.5 + minute_score * 0.3 + screener_norm * 0.2`. Emitted when `> 0.7`.
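
The buy rule can be sketched as a small pure function. This is a minimal illustration, not the code in `signal_generator.py`: the `BuyInputs` container and function names are hypothetical, `minute_score` and `screener_norm` are assumed to be normalized to [0, 1], and `chronos_conf` follows the absolute-spread formula described in the F4 commit message.

```python
from dataclasses import dataclass
from typing import Optional

SPREAD_THRESHOLD = 0.6      # absolute spread gate (rule 2)
BID_RATIO_THRESHOLD = 0.6   # order-book gate (rule 4)
CONFIDENCE_THRESHOLD = 0.7  # composite confidence must exceed this


@dataclass
class BuyInputs:  # hypothetical container, for illustration only
    median: float
    q10: float
    q90: float
    momentum: str
    bid_ratio: float
    minute_score: float   # assumed to lie in [0, 1]
    screener_norm: float  # assumed to lie in [0, 1]


def chronos_conf(q10: float, q90: float) -> float:
    """conf = clamp(1 - spread / threshold, 0, 1) with an absolute spread."""
    spread = q90 - q10
    return max(0.0, min(1.0, 1.0 - spread / SPREAD_THRESHOLD))


def buy_confidence(x: BuyInputs) -> Optional[float]:
    """Return the composite confidence if a buy would be emitted, else None."""
    gates = (
        x.median > 0,
        (x.q90 - x.q10) < SPREAD_THRESHOLD,
        x.momentum == "strong_up",
        x.bid_ratio >= BID_RATIO_THRESHOLD,
    )
    if not all(gates):
        return None
    conf = (chronos_conf(x.q10, x.q90) * 0.5
            + x.minute_score * 0.3
            + x.screener_norm * 0.2)
    return conf if conf > CONFIDENCE_THRESHOLD else None
```

Keeping the hard gates separate from the weighted score makes it easy to see that a high score cannot compensate for a failed gate.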
### Sell (portfolio only, priority stop_loss → anomaly → take_profit)
- **stop_loss**: `pnl_pct < -7%`, immediate (confidence=1.0)
- **anomaly**: `chronos.median < -1%` + `strong_down` + `bid_ratio < 0.4` + composite conf > 0.7
- **take_profit**: `pnl_pct > 15%`, flagged for review (confidence=0.6)
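
The priority order above can be illustrated with a first-match function. This is only a sketch under assumptions: the function name and argument list are hypothetical, and returning the composite confidence for the anomaly case is an assumption, since the rules above only state its threshold.

```python
from typing import Optional, Tuple


def sell_decision(pnl_pct: float, median: float, momentum: str,
                  bid_ratio: float, composite_conf: float
                  ) -> Optional[Tuple[str, float]]:
    """Return (reason, confidence) for the first matching rule, else None."""
    # 1. stop_loss has the highest priority and fires immediately.
    if pnl_pct < -0.07:
        return ("stop_loss", 1.0)
    # 2. anomaly requires every sub-condition to hold at once.
    if (median < -0.01 and momentum == "strong_down"
            and bid_ratio < 0.4 and composite_conf > 0.7):
        return ("anomaly", composite_conf)  # assumption: conf passed through
    # 3. take_profit is checked last.
    if pnl_pct > 0.15:
        return ("take_profit", 0.6)
    return None
```

Because the checks run top to bottom, a position that qualifies for both anomaly and take_profit is reported as anomaly.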
### Key files
| File | Responsibility |
|------|------|
| `main.py` | FastAPI app + lifespan (dependency wiring) + creates the poll_loop task |
| `config.py` | `Settings` dataclass: loads environment variables |
| `state.py` | `PollState` (process-wide singleton): portfolio, screener, signals, etc. + `get_active_signals` / `purge_expired_signals` |
| `stock_client.py` | Pulls from the NAS stock backend (X-WebAI-Key + in-memory cache) |
| `kis_client.py` | KIS REST minute bars / order book + asyncio.Lock serialization + exponential backoff |
| `kis_websocket.py` | KIS WebSocket order book + approval_key + reconnect |
| `chronos_predictor.py` | HuggingFace Chronos zero-shot quantile forecast (FP32 forced) |
| `minute_momentum.py` | 5-minute bars → strong_up / weak_up / neutral / weak_down / strong_down |
| `signal_generator.py` | Buy/sell rule engine. Attaches cycle_id + expires_at |
| `pull_worker.py` | asyncio cron: time-of-day branching + post-close trigger + signal generation + expired purge |
| `scheduler.py` | Polling-window decision (KST calendar + market holidays) |
| `rate_limit.py` | N-per-second token bucket + `SignalDedup` SQLite WAL |
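
To illustrate the "asyncio.Lock serialization" noted for `kis_client.py`, here is a minimal standalone sketch of a lock-guarded throttle. The `Throttle` class and `demo` helper are hypothetical names; the real client keeps this logic inside `KISClient._throttle`.

```python
import asyncio
import time


class Throttle:
    """Lets callers through at most once per `interval` seconds."""

    def __init__(self, interval: float) -> None:
        self._interval = interval
        self._last_at = 0.0
        self._lock = asyncio.Lock()

    async def wait(self) -> None:
        # The lock makes read-sleep-update one critical section, so
        # concurrent callers cannot all compute the same `elapsed`.
        async with self._lock:
            elapsed = time.monotonic() - self._last_at
            if elapsed < self._interval:
                await asyncio.sleep(self._interval - elapsed)
            self._last_at = time.monotonic()


async def demo(n: int = 5, interval: float = 0.05) -> float:
    """Run n concurrent waits and return the total wall time."""
    throttle = Throttle(interval)
    start = time.monotonic()
    await asyncio.gather(*(throttle.wait() for _ in range(n)))
    return time.monotonic() - start
```

With the lock, `n` concurrent callers take roughly `(n - 1) * interval` in total; without it, they would all sleep once and wake together.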
### Start
```bat
cd ai_trade
start.bat
```
`Uvicorn running on http://0.0.0.0:8001`, `poll_loop started`.
On market holidays and outside market hours, poll_loop simply idles.
### Health / logs
```powershell
curl http://localhost:8001/health
Get-Content logs\ai_trade.log -Wait
nvidia-smi
```
---
## services: NAS↔Windows distributed workers
The NAS-side lab services (insta-lab / music-lab / video-lab / NAS-side image-render) enqueue jobs with LPUSH onto `queue:<worker>-render`. The Windows worker dequeues atomically with BLMOVE, processes the job, and on completion reports the result through the NAS internal webhook.
### Reliability pattern (`_shared.ReliableQueue`)
- **dequeue**: `BLMOVE main → processing:<queue>:<worker_id>` (atomic).
- **ack**: `LREM processing 1 raw` (on success).
- **fail**: `LREM processing`, then `attempts++` and requeue to main, or move to `dead_letter:<queue>` once `max_attempts` is reached.
- **recover**: at startup, move orphans in the worker's own processing list back to the main queue (attempts incremented).
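
The four operations above can be modeled in memory to make the state transitions concrete. This is a sketch of the semantics only, not the Redis-backed `ReliableQueue`: the class name is hypothetical, the right-to-left move direction is an assumption, and reusing `fail` for recovery (so that recovered jobs can also reach the dead-letter list) is an assumption as well.

```python
import json
from collections import deque
from typing import Optional


class InMemoryReliableQueue:  # hypothetical stand-in for the Redis lists
    def __init__(self, max_attempts: int = 3) -> None:
        self.main: deque = deque()
        self.processing: list = []
        self.dead_letter: list = []
        self.max_attempts = max_attempts

    def enqueue(self, payload: dict) -> None:
        self.main.appendleft(json.dumps(payload))  # producer side: LPUSH

    def dequeue(self) -> Optional[str]:
        # Models BLMOVE: the job is never "in flight" outside a list.
        if not self.main:
            return None
        raw = self.main.pop()
        self.processing.insert(0, raw)
        return raw

    def ack(self, raw: str) -> None:
        self.processing.remove(raw)  # models LREM processing 1 raw

    def fail(self, raw: str) -> None:
        self.processing.remove(raw)
        payload = json.loads(raw)
        payload["attempts"] = payload.get("attempts", 0) + 1
        updated = json.dumps(payload)
        if payload["attempts"] >= self.max_attempts:
            self.dead_letter.append(updated)
        else:
            self.main.appendleft(updated)

    def recover(self) -> int:
        """Requeue orphans left in processing (e.g. after a crash)."""
        orphans = list(self.processing)
        for raw in orphans:
            self.fail(raw)
        return len(orphans)
```

The key property is that a crash between `dequeue` and `ack` leaves the job in `processing`, where `recover` can find it at the next startup.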
### Start (NAS, WSL2 Docker)
```bash
cd services
docker compose up -d insta-render music-render video-render image-render task-watcher
```
The build context is the `services/` root. Each Dockerfile also COPYs the `_shared` module and sets `PYTHONPATH=/app`.
### Operations
```bash
# pause / resume the workers
redis-cli -h 192.168.45.54 SET queue:paused 1
redis-cli -h 192.168.45.54 DEL queue:paused
# inspect queues / dead-letter
redis-cli -h 192.168.45.54 LLEN queue:insta-render
redis-cli -h 192.168.45.54 LLEN dead_letter:queue:insta-render
redis-cli -h 192.168.45.54 KEYS 'processing:*'
```
### Environment variables
| Variable | Purpose |
|------|------|
| `REDIS_URL` | NAS Redis (`redis://192.168.45.54:6379`) |
| `NAS_BASE_URL` | URL of the target NAS service (insta-lab `:18700`, music-lab `:18600`, video-lab `:18801`, NAS-side image-render `:18802`) |
| `INTERNAL_API_KEY` | NAS internal webhook authentication |
| `WORKER_ID` | (recommended) A persistent ID such as `<service>-prod-1`. The hostname-based default changes when the container is restarted, which makes orphans untraceable |
| `OPENAI_API_KEY` / `GEMINI_API_KEY` / `KLING_*` / `SEEDANCE_API_KEY` / `SUNO_API_KEY` | Per-provider authentication |
| `COMFYUI_URL` | Local ComfyUI for image-render FLUX (`http://host.docker.internal:8188`) |
| `FLUX_BLOCK_TRADING_HOURS` | When `1`, FLUX is blocked during market hours (09:00~15:30) to protect the GPU for Chronos |
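
As a rough illustration of the `FLUX_BLOCK_TRADING_HOURS` window, the check could look like the sketch below. The function name is hypothetical, the inclusive boundaries are an assumption, and weekends and market holidays are deliberately not handled here.

```python
from datetime import datetime, time, timedelta, timezone

# KST is a fixed UTC+9 offset (no daylight saving time).
KST = timezone(timedelta(hours=9))


def flux_blocked(now: datetime, enabled: bool = True) -> bool:
    """True if FLUX jobs should be held back at `now` (timezone-aware)."""
    if not enabled:  # corresponds to FLUX_BLOCK_TRADING_HOURS != "1"
        return False
    local = now.astimezone(KST).time()
    return time(9, 0) <= local <= time(15, 30)
```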
---
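## trade-monitor: real-time trade alert worker (new, 2026-07-03)
Pulls `monitor-set` from the NAS stock backend (`:18500`) every 60 seconds, evaluates TA conditions against KIS real-time and daily quotes, and sends the firing set as a `report`. The NAS applies an edge diff so that only new alerts reach Telegram and the frontend. **The worker is stateless** (dedup is persisted on the NAS). Port `:18715`, WSL2 Docker. See `services/trade-monitor/DESIGN.md` for the detailed design and condition rules.
### Loop (1 minute)
1. `GET /api/webai/trade-alert/monitor-set` (X-WebAI-Key) → buy_targets(watchscreener) + sell_targets (holdings: avg_price/qty/holding_high) + buy_params/exit_params + session.
2. If `session=="closed"`, make 0 KIS calls and idle. **Skip non-KRX (alphabetic) tickers** (the worker's responsibility).
3. KIS quote + 250 daily bars → compute indicators → evaluate conditions (failures are isolated per ticker).
4. `POST /api/webai/trade-alert/report {as_of, firing:[...]}`: an empty array is sent too (edge clear).
5. heartbeat `worker:trade-monitor:heartbeat` EX45 (kind=trader, state=market_open|market_closed|idle + last_alert_at). Because the 60-second loop is longer than the 45-second TTL, the heartbeat runs as an **independent 15-second task** to avoid expiry gaps.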
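
Steps 1 to 4 can be sketched as one function with injected dependencies. This is an illustration under assumptions, not `monitor.run_cycle` itself: the `ticker` field name, the callable signatures, the digits-only test for KRX tickers, and skipping the report on a closed session are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Callable, Optional


def run_cycle(monitor_set: dict,
              fetch_ctx: Callable[[str], dict],
              evaluate: Callable[[dict, dict], list],
              send_report: Callable[[dict], None]) -> Optional[dict]:
    """One monitoring cycle over an already-fetched monitor-set."""
    # Step 2: a closed session means no quote calls at all.
    if monitor_set.get("session") == "closed":
        return None
    firing = []
    targets = (monitor_set.get("buy_targets", [])
               + monitor_set.get("sell_targets", []))
    for target in targets:
        ticker = target["ticker"]  # hypothetical field name
        if not ticker.isdigit():   # assumed test for "alphabetic" tickers
            continue
        try:
            ctx = fetch_ctx(ticker)               # quote + daily bars
            firing.extend(evaluate(target, ctx))  # zero or more alerts
        except Exception:
            continue  # step 3: one failing ticker must not stop the rest
    # Step 4: report even when empty so the NAS can clear edges.
    report = {"as_of": datetime.now(timezone.utc).isoformat(),
              "firing": firing}
    send_report(report)
    return report
```

Injecting `fetch_ctx`, `evaluate`, and `send_report` keeps the cycle testable without network access.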
### Conditions (§6: the `condition` string maps directly to the FE label/badge)
- **Buy**: `buy_ma20_pullback` (bullish MA alignment + bounce near ma20), `buy_breakout` (break above the 20-bar high + volume multiple), `buy_rsi_bounce` (RSI oversold bounce, stateless).
- **Sell**: `sell_stop_loss`, `sell_take_profit`, `sell_trailing_stop`, `sell_ma_break` (ma50/ma200 severity), `sell_climax` (volume surge + `price<day_high×0.97` upper wick; aligned with holdings_intel and parameterized through `exit_params`).
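
As an example of one condition, `sell_climax` can be sketched as follows. The parameter names `climax_vol_x` and `climax_close_pct` and their fallbacks come from the commit message for this change; the function signature, the use of an average volume as the surge baseline, and the `>=` comparison are assumptions.

```python
from typing import Optional


def sell_climax(price: float, day_high: float,
                volume: float, avg_volume: float,
                exit_params: Optional[dict] = None,
                fallback_vol_mult: float = 3.0,
                fallback_close_pct: float = 0.97) -> bool:
    """True when a volume surge coincides with a long upper wick."""
    params = exit_params or {}
    # monitor-set exit_params take precedence over the local fallbacks.
    vol_x = params.get("climax_vol_x", fallback_vol_mult)
    close_pct = params.get("climax_close_pct", fallback_close_pct)
    volume_surge = volume >= avg_volume * vol_x  # baseline is an assumption
    upper_wick = price < day_high * close_pct
    return volume_surge and upper_wick
```

For instance, with the defaults a price of 960 against a day high of 1000 satisfies the wick test (960 < 970), so the result then depends only on the volume multiple.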
### Key files
| File | Responsibility |
|------|------|
| `config.py` | Settings (`TM_` prefix, separate from ai_trade) |
| `indicators.py` | Pure: `sma` / `rsi_series` (Wilder) / `highest_high` |
| `conditions.py` | Pure §6: `evaluate_buy` / `evaluate_sell` |
| `kis_client.py` | KIS **own OAuth token** + `get_quote` + `get_daily_ohlcv` + 0.5s throttle |
| `nas_client.py` | monitor-set / report (X-WebAI-Key + retry) |
| `monitor.py` | `run_cycle` / `monitor_loop` / `make_state_fn` |
| `main.py` | FastAPI lifespan + `_shared.heartbeat_loop` wiring + `/health` |
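
For readers unfamiliar with the indicators named in `indicators.py`, here is a minimal pure-Python sketch. The function names match the table, but the signatures, the `None` padding, and the default period of 14 are assumptions rather than the module's actual interface.

```python
from typing import List, Optional


def sma(values: List[float], period: int) -> Optional[float]:
    """Simple moving average of the last `period` values."""
    if len(values) < period:
        return None
    return sum(values[-period:]) / period


def highest_high(highs: List[float], period: int) -> Optional[float]:
    """Highest value among the last `period` bars."""
    if len(highs) < period:
        return None
    return max(highs[-period:])


def _rsi(avg_gain: float, avg_loss: float) -> float:
    if avg_loss == 0:
        return 100.0
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)


def rsi_series(closes: List[float], period: int = 14) -> List[Optional[float]]:
    """Wilder RSI aligned to `closes`; None until enough data exists."""
    out: List[Optional[float]] = [None] * len(closes)
    if len(closes) <= period:
        return out
    deltas = [closes[i] - closes[i - 1] for i in range(1, len(closes))]
    # Seed with a plain average over the first `period` changes.
    avg_gain = sum(max(d, 0.0) for d in deltas[:period]) / period
    avg_loss = sum(max(-d, 0.0) for d in deltas[:period]) / period
    out[period] = _rsi(avg_gain, avg_loss)
    # Wilder smoothing: new_avg = (prev_avg * (period - 1) + current) / period
    for i in range(period + 1, len(closes)):
        d = deltas[i - 1]
        avg_gain = (avg_gain * (period - 1) + max(d, 0.0)) / period
        avg_loss = (avg_loss * (period - 1) + max(-d, 0.0)) / period
        out[i] = _rsi(avg_gain, avg_loss)
    return out
```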
### Environment variables
| Variable | Default | Description |
|------|------|------|
| `NAS_BASE_URL` | `http://192.168.45.54:18500` | stock backend |
| `WEBAI_API_KEY` | (required) | X-WebAI-Key |
| `REDIS_URL` | `redis://192.168.45.54:6379` | heartbeat |
| `TM_KIS_APP_KEY` / `TM_KIS_APP_SECRET` / `TM_KIS_ACCOUNT` | (required) | **Dedicated** KIS token of its own (issued separately from ai_trade → avoids mutual token invalidation and EGW00201) |
| `TM_KIS_IS_VIRTUAL` | `0` | real / paper trading |
| `TM_LOOP_INTERVAL` | `60` | loop period (seconds) |
| `TM_CLIMAX_VOL_MULT` | `3.0` | fallback volume multiple for sell_climax (precedence: monitor-set `exit_params.climax_vol_x` > this value) |
### Status
⏳ Implemented and merged (tests 36/36, including the sell_climax holdings_intel alignment), **not yet deployed**. Before deployment: ① issue and inject a dedicated KIS app key (박재오, in progress) ② verify the KIS fields on the first production run (stck_hgpr, etc.). BE has already registered the worker in `node_monitor.WORKER_REGISTRY` → once deployed, the trader node appears automatically in `/api/agent-office/nodes` and web-ui `/infra` (while undeployed it shows as down, with no alert).
### Start (NAS, WSL2 Docker)
```bash
cd services
docker compose up -d trade-monitor
```
---
## Environment variables (ai_trade)
| Variable | Default | Description |
|------|------|------|
| `STOCK_API_URL` | (required) | NAS stock backend base URL |
| `WEBAI_API_KEY` | (required) | X-WebAI-Key for stock backend calls |
| `SIGNAL_V2_PORT` | `8001` | uvicorn port |
| `KIS_ENV_TYPE` | `virtual` | `virtual` / `real` |
| `KIS_REAL_APP_KEY` / `KIS_REAL_APP_SECRET` / `KIS_REAL_ACCOUNT` | - | KIS real account |
| `KIS_VIRTUAL_APP_KEY` / `KIS_VIRTUAL_APP_SECRET` / `KIS_VIRTUAL_ACCOUNT` | - | KIS paper account |
| `V1_TOKEN_PATH` | `legacy/signal_v1/data/kis_token.json` | KIS token file (V1 token, shared read-only) |
| `CHRONOS_MODEL` | `amazon/chronos-2` | Chronos model ID |
| `STOP_LOSS_PCT` | `-0.07` | stop-loss threshold |
| `TAKE_PROFIT_PCT` | `0.15` | take-profit threshold |
| `CHRONOS_SPREAD_THRESHOLD` | `0.6` | upper bound on spread for the buy hard gate |
| `ASKING_BID_RATIO_THRESHOLD` | `0.6` | order-book bid ratio for the buy hard gate |
| `CONFIDENCE_THRESHOLD` | `0.7` | lower bound on composite confidence for buys |
| `MIN_MOMENTUM_FOR_BUY` | `strong_up` | momentum level for the buy hard gate |
| `SIGNAL_TTL_SECONDS` | `300` | TTL used for expires_at on emitted signals |

Place `.env` in the web-ai root (this directory). **Never commit it.**
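
To show how `SIGNAL_TTL_SECONDS` drives the signal lifecycle, here is a small sketch. The helper names mirror `get_active_signals` / `purge_expired_signals` from `state.py`, but storing signals as a list of dicts with `datetime` values, and the `attach_lifecycle` helper itself, are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import List

SIGNAL_TTL_SECONDS = 300  # default from the table above


def attach_lifecycle(signal: dict, as_of: datetime, cycle_id: int,
                     ttl: int = SIGNAL_TTL_SECONDS) -> dict:
    """Return a copy of the signal with cycle_id and expires_at set."""
    return {**signal, "cycle_id": cycle_id,
            "expires_at": as_of + timedelta(seconds=ttl)}


def get_active_signals(signals: List[dict], now: datetime) -> List[dict]:
    # Signals without expires_at (legacy) are treated as expired.
    return [s for s in signals
            if s.get("expires_at") is not None and s["expires_at"] >= now]


def purge_expired_signals(signals: List[dict], now: datetime) -> int:
    """Drop expired signals in place and return how many were removed."""
    kept = get_active_signals(signals, now)
    removed = len(signals) - len(kept)
    signals[:] = kept
    return removed
```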
---
## Tests
```bash
# ai_trade
python -m pytest ai_trade/tests -q
# services/_shared common module
cd services/_shared && python -m pytest tests/ -q
# each worker
cd services/insta-render && python -m pytest tests/ -q
cd services/music-render && python -m pytest tests/ -q
cd services/video-render && python -m pytest tests/ -q
cd services/image-render && python -m pytest tests/ -q
cd services/trade-monitor && python -m pytest tests/ -q # 36 tests
```
Because **the `.venv` is broken by the Korean user path**, using the system Python (`C:\Users\jaeoh\AppData\Local\Programs\Python\Python312\python.exe`) is recommended. Alternatively, use `py -3.12 -m pytest …`.
---
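## Known pitfalls
1. **KIS rate limit (EGW00201)**: V1 and V2 collide when run at the same time. V1 is isolated under `legacy/`. ai_trade serializes its throttle with `asyncio.Lock` (`kis_client.py`).
2. **Korean path in `.venv`**: use the system Python.
3. **Chronos FP16 overflow**: Korean stock prices of 50,000 KRW and above overflow to inf. FP32 is forced.
4. **post-close trigger**: now state-based (`last_post_close_date`). It triggers when the time is past 16:00 and it has not yet run today.
5. **services worker_id**: setting it explicitly via env is recommended. The hostname-based default changes on container restart, which risks losing orphans.
6. **dead-letter accumulation**: check each `dead_letter:<queue>` list regularly with `redis-cli LLEN` (`LLEN` takes a single key and does not expand `*`).
7. **Dockerfile build context**: the `services/` root (not each worker directory). Requires the matching compose change.
8. **Distributed workers must be observable in /infra (team rule)**: every WSL docker worker must send a heartbeat (`worker:<name>:heartbeat` EX45), be registered in BE `node_monitor.WORKER_REGISTRY`, and be exposed in `/infra`. trade-monitor is registered as kind=trader.
9. **Separate KIS app key for trade-monitor**: it uses a **dedicated app_key** (`TM_KIS_*`) different from ai_trade's. Sharing one app_key causes mutual token invalidation + EGW00201. Up to 89 apps can be issued for real trading.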
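
The state-based post-close trigger from pitfall 4 reduces to a two-part check. This is a minimal sketch with a hypothetical function name; it assumes `now` is already in KST and leaves the market-day check to the caller.

```python
from datetime import date, datetime, time
from typing import Optional


def should_run_post_close(now: datetime,
                          last_post_close_date: Optional[date]) -> bool:
    """True once per day, at the first check at or after 16:00."""
    return now.time() >= time(16, 0) and last_post_close_date != now.date()


# Usage idea: the polling loop stores the date after a successful run.
# if should_run_post_close(now, last_post_close_date):
#     run_post_close()            # hypothetical job
#     last_post_close_date = now.date()
```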
---
## Phase status (Confidence Signal Pipeline V2)
| Phase | Content | Status |
|-------|------|------|
| 0 | Architecture & contract spec | ✅ |
| 1 | stock backend WebAI API enhancements (NAS) | ✅ |
| 1.5 | V1 → `signal_v1/` rename → isolated under `legacy/` | ✅ |
| 2 | ai_trade pull worker + signal API client + scheduler | ✅ |
| 3a | KIS REST minute bars + WebSocket order book + NXT schedule | ✅ |
| 3b | Chronos-bolt-base inference + 5-minute-bar momentum classifier | ✅ |
| 4 | Signal Generator + logging | ✅ |
| 4.5 | Code review F1-F6 hotfixes (token path / throttle Lock / state-based post-close / Chronos abs / state.signals lifecycle / render queue reliability) | ✅ |
| 5 | agent-office `/signal` + Ollama Qwen3 14B + dual Telegram | ⏳ |
| 6 | signal_v1 deprecation (legacy move done, only archiving remains) | partially ✅ |
| 7 | Operational monitoring + 4-week IC validation | ⏳ |

For detailed specs/plans see `../web-ui/docs/superpowers/specs/` / `../web-ui/docs/superpowers/plans/` (separate repo).
---
## License / usage
Private. 박재오's personal web platform.

View File

@@ -10,6 +10,10 @@ import numpy as np
 logger = logging.getLogger(__name__)
 KST = ZoneInfo("Asia/Seoul")
+# F4: the same absolute spread threshold as the signal_generator hard gate.
+# Replaces the relative formula (spread/abs(median)), whose conf collapsed to 0 when the zero-shot median≈0.
+_SPREAD_THRESHOLD = 0.6
 @dataclass
 class ChronosPrediction:
@@ -103,8 +107,8 @@ class ChronosPredictor:
             median = float((q50_price - last_close) / last_close)
             q10 = float((q10_price - last_close) / last_close)
             q90 = float((q90_price - last_close) / last_close)
-            spread = (q90 - q10) / max(abs(median), 0.001)
-            conf = float(max(0.0, min(1.0, 1.0 - spread / 2.0)))
+            spread = q90 - q10  # F4: absolute spread
+            conf = float(max(0.0, min(1.0, 1.0 - spread / _SPREAD_THRESHOLD)))
             results[ticker] = ChronosPrediction(
                 median=median, q10=q10, q90=q90, conf=conf, as_of=now_iso,
             )
@@ -124,8 +128,8 @@ class ChronosPredictor:
             median = float(np.quantile(returns, 0.5))
             q10 = float(np.quantile(returns, 0.1))
             q90 = float(np.quantile(returns, 0.9))
-            spread = (q90 - q10) / max(abs(median), 0.001)
-            conf = float(max(0.0, min(1.0, 1.0 - spread / 2.0)))
+            spread = q90 - q10  # F4: absolute spread
+            conf = float(max(0.0, min(1.0, 1.0 - spread / _SPREAD_THRESHOLD)))
             results[ticker] = ChronosPrediction(
                 median=median, q10=q10, q90=q90, conf=conf, as_of=now_iso,
             )

View File

@@ -31,7 +31,7 @@ class Settings:
     v1_token_path: Path = field(
         default_factory=lambda: Path(
             os.getenv("V1_TOKEN_PATH",
-                      str(Path(__file__).parent.parent / "signal_v1" / "data" / "kis_token.json"))
+                      str(Path(__file__).parent.parent / "legacy" / "signal_v1" / "data" / "kis_token.json"))
         )
     )
     chronos_model: str = field(default_factory=lambda: os.getenv("CHRONOS_MODEL", "amazon/chronos-2"))
@@ -53,6 +53,9 @@ class Settings:
     min_momentum_for_buy: str = field(
         default_factory=lambda: os.getenv("MIN_MOMENTUM_FOR_BUY", "strong_up")
     )
+    signal_ttl_seconds: int = field(
+        default_factory=lambda: int(os.getenv("SIGNAL_TTL_SECONDS", "300"))
+    )
     @property
     def kis_is_virtual(self) -> bool:

57
ai_trade/heartbeat.py Normal file

@@ -0,0 +1,57 @@
"""ai_trade heartbeat: SET worker:ai_trade:heartbeat on the NAS Redis.
Global Constraints contract 1: kind=trader, state=market_open|market_closed.
ai_trade runs on the Windows host, where the _shared import path differs, so it keeps its own mini helper.
"""
from __future__ import annotations
import asyncio
import datetime as dt
import json
import logging
import os
import redis.asyncio as aioredis
logger = logging.getLogger(__name__)
REDIS_URL = os.getenv("REDIS_URL", "redis://192.168.45.54:6379")
KEY = "worker:ai_trade:heartbeat"
INTERVAL = int(os.getenv("HEARTBEAT_INTERVAL", "15"))
TTL = int(os.getenv("HEARTBEAT_TTL", "45"))
def build_trader_payload(state: str, signals: int = 0) -> str:
    """Return a JSON string. state: 'market_open' | 'market_closed'."""
    return json.dumps({
        "name": "ai_trade",
        "kind": "trader",
        "state": state,
        "ts": dt.datetime.now(dt.timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "last_job_at": None,
        "jobs_done": signals,
        "jobs_failed": 0,
    })
async def heartbeat_loop(state_fn) -> None:
    """SET EX TTL on Redis every HEARTBEAT_INTERVAL.
    Args:
        state_fn: () -> (state: str, signals: int). The caller injects the polling-window decision.
    """
    redis = aioredis.from_url(REDIS_URL, decode_responses=False)
    try:
        while True:
            try:
                state, signals = state_fn()
                payload = build_trader_payload(state, signals)
                await redis.set(KEY, payload, ex=TTL)
                logger.debug("ai_trade heartbeat sent: state=%s signals=%d", state, signals)
            except asyncio.CancelledError:
                raise
            except Exception:
                # Log text: "ai_trade heartbeat failed; retry on the next cycle"
                logger.exception("ai_trade heartbeat 실패 — 다음 주기에 재시도")
            await asyncio.sleep(INTERVAL)
    finally:
        await redis.aclose()

View File

@@ -38,6 +38,7 @@ class KISClient:
         self._client = httpx.AsyncClient(timeout=timeout)
         self._token_cache: tuple[str, float] | None = None  # (token, file_mtime)
         self._last_throttle_at = 0.0
+        self._throttle_lock = asyncio.Lock()
     async def close(self) -> None:
         await self._client.aclose()
@@ -56,10 +57,13 @@ class KISClient:
return token return token
async def _throttle(self) -> None: async def _throttle(self) -> None:
elapsed = time.monotonic() - self._last_throttle_at # F2: Lock으로 직렬화. 없으면 asyncio.gather 동시 호출 시 race로
if elapsed < _THROTTLE_INTERVAL: # 같은 elapsed 계산 후 동시에 깨어나 KIS 초당 2회(EGW00201) 위반.
await asyncio.sleep(_THROTTLE_INTERVAL - elapsed) async with self._throttle_lock:
self._last_throttle_at = time.monotonic() elapsed = time.monotonic() - self._last_throttle_at
if elapsed < _THROTTLE_INTERVAL:
await asyncio.sleep(_THROTTLE_INTERVAL - elapsed)
self._last_throttle_at = time.monotonic()
def _common_headers(self, tr_id: str) -> dict[str, str]: def _common_headers(self, tr_id: str) -> dict[str, str]:
token = self._read_v1_token() token = self._read_v1_token()

@@ -3,9 +3,12 @@ from __future__ import annotations
 import asyncio
 import logging
 from contextlib import asynccontextmanager
+from datetime import datetime
+from zoneinfo import ZoneInfo

 from fastapi import FastAPI

+from ai_trade import heartbeat as _hb
 from ai_trade import state as state_mod
 from ai_trade.chronos_predictor import ChronosPredictor
 from ai_trade.config import get_settings
@@ -13,8 +16,11 @@ from ai_trade.kis_client import KISClient
 from ai_trade.kis_websocket import KISWebSocket
 from ai_trade.pull_worker import poll_loop, make_asking_price_callback
 from ai_trade.rate_limit import SignalDedup
+from ai_trade.scheduler import _is_polling_window, _is_market_day
 from ai_trade.stock_client import StockClient

+_KST = ZoneInfo("Asia/Seoul")
+
 logger = logging.getLogger(__name__)

@@ -23,6 +29,7 @@ class AppContext:
     dedup: SignalDedup | None = None
     shutdown: asyncio.Event | None = None
     poll_task: asyncio.Task | None = None
+    hb_task: asyncio.Task | None = None
     kis_client: KISClient | None = None
     kis_ws: KISWebSocket | None = None
     chronos: ChronosPredictor | None = None
@@ -87,9 +94,27 @@ async def lifespan(app: FastAPI):
         )
     )

+    def _trader_state() -> tuple[str, int]:
+        """Decide market_open/market_closed from the scheduler's actual polling-window check."""
+        now = datetime.now(_KST)
+        is_open = _is_market_day(now) and _is_polling_window(now)
+        state_str = "market_open" if is_open else "market_closed"
+        signals = len(state_mod.state.signals)
+        return state_str, signals
+
+    _ctx.hb_task = asyncio.create_task(_hb.heartbeat_loop(_trader_state))
+
     yield

-    # Shutdown
+    # Shutdown heartbeat task
+    if _ctx.hb_task is not None:
+        _ctx.hb_task.cancel()
+        try:
+            await _ctx.hb_task
+        except asyncio.CancelledError:
+            pass
+
+    # Shutdown poll task
     if _ctx.shutdown is not None:
         _ctx.shutdown.set()
     if _ctx.poll_task is not None:

@@ -24,6 +24,7 @@ async def poll_loop(
 ) -> None:
     """Started from the FastAPI lifespan via asyncio.create_task."""
     logger.info("poll_loop started")
+    last_post_close_date = None  # F3: state-based post-close trigger
     while not shutdown.is_set():
         now = datetime.now(KST)
         if _is_market_day(now) and _is_polling_window(now):
@@ -36,10 +37,14 @@ async def poll_loop(
                 update_minute_momentum_for_all(state)
             except Exception:
                 logger.exception("minute momentum update failed")
-            # Post-close trigger (16:00 KST)
-            if _is_post_close_trigger(now) and chronos is not None and kis_client is not None:
+            # Post-close trigger (F3: state-based; after 16:00 and not yet run today)
+            if (
+                _is_post_close_trigger(now, last_post_close_date)
+                and chronos is not None and kis_client is not None
+            ):
                 try:
                     await _run_post_close_cycle(kis_client, chronos, state)
+                    last_post_close_date = now.date()
                 except Exception:
                     logger.exception("post-close cycle failed")
             # Phase 4: generate signals
@@ -49,6 +54,11 @@ async def poll_loop(
                     generate_signals(state, dedup, settings)
                 except Exception:
                     logger.exception("generate_signals failed")
+            # F5: purge expired signals at the end of each cycle (covers the case where no consumer reads them)
+            try:
+                state.purge_expired_signals(datetime.now(KST))
+            except Exception:
+                logger.exception("purge_expired_signals failed")
         interval = _next_interval(now)
         try:
             await asyncio.wait_for(shutdown.wait(), timeout=interval)

@@ -76,12 +76,21 @@ def _seconds_until_nxt_or_market_open(now: datetime) -> float:
     return 86400.0


-def _is_post_close_trigger(now: datetime) -> bool:
-    """16:00 KST ±1 min (post-close cycle trigger). Weekdays/business days only."""
+def _is_post_close_trigger(now: datetime, last_post_close_date) -> bool:
+    """F3: True if it is past 16:00 KST and today's post-close cycle has not run yet (state-based).
+
+    The old trigger was a one-minute window (16:00:00-16:00:59). Combined with the 5-minute sleep and a
+    non-deterministic cycle start time, it could be missed forever (e.g. a cycle starting at 15:31 wakes at 16:01).
+
+    Args:
+        now: current KST datetime.
+        last_post_close_date: business date of the last post-close run (None = not run yet).
+    """
     if not _is_market_day(now):
         return False
-    t = now.time()
-    return time(16, 0) <= t < time(16, 1)
+    if now.time() < time(16, 0):
+        return False
+    return last_post_close_date != now.date()


 def _seconds_until_next_market_open(now: datetime) -> float:
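
The difference between the two trigger rules shows up when both are replayed against the wake-up times from the docstring's example. The sketch below is illustrative only: `window_trigger`, `state_trigger`, and `simulate` are hypothetical names, the market-day check is assumed to pass, and a fixed +09:00 offset stands in for `ZoneInfo("Asia/Seoul")`.

```python
from __future__ import annotations

from datetime import date, datetime, time, timedelta, timezone

# Assumption: fixed +09:00 offset instead of ZoneInfo("Asia/Seoul"), for portability.
KST = timezone(timedelta(hours=9))


def window_trigger(now: datetime) -> bool:
    """Old rule: fire only inside 16:00:00-16:00:59."""
    return time(16, 0) <= now.time() < time(16, 1)


def state_trigger(now: datetime, last_run: date | None) -> bool:
    """New rule (market-day check omitted): fire once per day at or after 16:00."""
    if now.time() < time(16, 0):
        return False
    return last_run != now.date()


def simulate(wakeups: list[datetime]) -> tuple[list[datetime], list[datetime]]:
    """Replay a wake-up schedule and collect the times at which each rule fires."""
    window_hits = [t for t in wakeups if window_trigger(t)]
    state_hits: list[datetime] = []
    last_run: date | None = None
    for t in wakeups:
        if state_trigger(t, last_run):
            state_hits.append(t)
            last_run = t.date()  # the caller records the run, as poll_loop does
    return window_hits, state_hits


# A loop that sleeps 5 minutes and happens to wake at :56, :01, :06.
WAKEUPS = [datetime(2026, 5, 18, h, m, tzinfo=KST) for h, m in [(15, 56), (16, 1), (16, 6)]]
WINDOW_HITS, STATE_HITS = simulate(WAKEUPS)
```

Here the window rule never fires because no wake-up lands inside the 16:00 minute, while the state-based rule fires once at 16:01 and stays quiet at 16:06.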

@@ -4,7 +4,7 @@
 """
 from __future__ import annotations
 import logging
-from datetime import datetime
+from datetime import datetime, timedelta
 from zoneinfo import ZoneInfo

 logger = logging.getLogger(__name__)
@@ -20,7 +20,12 @@ MOMENTUM_SCORES = {


 def generate_signals(state, dedup, settings) -> None:
-    """Phase 4 entry — state-mutating. Evaluation order: sell first (priority), then buy. A ticker receiving a sell signal in this cycle is excluded from buy evaluation to avoid silent overwrite."""
+    """Phase 4 entry — state-mutating. F5: cycle_id += 1 (on every call, whether or not a signal is emitted).
+
+    Evaluation order: sell first (priority), then buy. A ticker receiving a sell
+    signal in this cycle is excluded from buy evaluation to avoid silent overwrite.
+    """
+    state.signal_cycle_id += 1
     _evaluate_sell_signals(state, dedup, settings)
     _evaluate_buy_signals(state, dedup, settings)

@@ -45,9 +50,10 @@ def _evaluate_buy_signals(state, dedup, settings) -> None:
         if dedup.is_recent(ticker, "buy", within_hours=24):
             logger.debug("buy %s skipped: dedup 24h", ticker)
             continue
-        state.signals[ticker] = _build_buy_signal(state, ticker, name, rank, confidence)
+        state.signals[ticker] = _build_buy_signal(state, ticker, name, rank, confidence, settings)
         dedup.record(ticker, "buy", confidence=confidence)
-        logger.info("signal emit %s buy conf=%.3f rank=%s", ticker, confidence, rank)
+        logger.info("signal emit %s buy conf=%.3f rank=%s cycle=%d",
+                    ticker, confidence, rank, state.signal_cycle_id)


 def _buy_candidates(state) -> list[tuple[str, str, int | None]]:
@@ -96,8 +102,11 @@ def _compute_buy_confidence(state, ticker: str, rank: int | None) -> float:
     return chronos_conf * 0.5 + minute_score * 0.3 + screener_norm * 0.2


-def _build_buy_signal(state, ticker: str, name: str, rank: int | None, confidence: float) -> dict:
+def _build_buy_signal(state, ticker: str, name: str, rank: int | None, confidence: float, settings) -> dict:
     ap = state.asking_price[ticker]
+    as_of_dt = datetime.now(KST)
+    ttl = getattr(settings, "signal_ttl_seconds", 300)
+    expires_at = (as_of_dt + timedelta(seconds=ttl)).isoformat()
     return {
         "ticker": ticker,
         "name": name,
@@ -107,7 +116,9 @@ def _build_buy_signal(state, ticker: str, name: str, rank: int | None, confidenc
         "avg_price": None,
         "pnl_pct": None,
         "context": _build_context(state, ticker, rank),
-        "as_of": datetime.now(KST).isoformat(),
+        "as_of": as_of_dt.isoformat(),
+        "cycle_id": state.signal_cycle_id,
+        "expires_at": expires_at,
     }

@@ -132,23 +143,24 @@ def _evaluate_sell_signals(state, dedup, settings) -> None:
             continue
         state.signals[ticker] = sell
         dedup.record(ticker, "sell", confidence=sell["confidence_webai"])
-        logger.info("signal emit %s sell conf=%.3f reason=%s",
+        logger.info("signal emit %s sell conf=%.3f reason=%s cycle=%d",
                     ticker, sell["confidence_webai"],
-                    sell.get("context", {}).get("sell_reason"))
+                    sell.get("context", {}).get("sell_reason"),
+                    state.signal_cycle_id)


 def _try_stop_loss(state, holding: dict, settings) -> dict | None:
     pnl = holding.get("pnl_pct")
     if pnl is None or pnl >= settings.stop_loss_pct:
         return None
-    return _build_sell_signal(state, holding, confidence=1.0, reason="stop_loss")
+    return _build_sell_signal(state, holding, confidence=1.0, reason="stop_loss", settings=settings)


 def _try_take_profit(state, holding: dict, settings) -> dict | None:
     pnl = holding.get("pnl_pct")
     if pnl is None or pnl <= settings.take_profit_pct:
         return None
-    return _build_sell_signal(state, holding, confidence=0.6, reason="take_profit")
+    return _build_sell_signal(state, holding, confidence=0.6, reason="take_profit", settings=settings)


 def _try_anomaly(state, holding: dict, settings) -> dict | None:
@@ -168,11 +180,14 @@ def _try_anomaly(state, holding: dict, settings) -> dict | None:
     confidence = pred["conf"] * 0.5 + minute_score * 0.3 + 1.0 * 0.2
     if confidence <= settings.confidence_threshold:
         return None
-    return _build_sell_signal(state, holding, confidence=confidence, reason="anomaly")
+    return _build_sell_signal(state, holding, confidence=confidence, reason="anomaly", settings=settings)


-def _build_sell_signal(state, holding: dict, confidence: float, reason: str) -> dict:
+def _build_sell_signal(state, holding: dict, confidence: float, reason: str, settings=None) -> dict:
     ticker = holding["ticker"]
+    as_of_dt = datetime.now(KST)
+    ttl = getattr(settings, "signal_ttl_seconds", 300) if settings else 300
+    expires_at = (as_of_dt + timedelta(seconds=ttl)).isoformat()
     return {
         "ticker": ticker,
         "name": holding.get("name", ticker),
@@ -182,7 +197,9 @@ def _build_sell_signal(state, holding: dict, confidence: float, reason: str) ->
         "avg_price": holding.get("avg_price"),
         "pnl_pct": holding.get("pnl_pct"),
         "context": _build_context(state, ticker, rank=None, sell_reason=reason),
-        "as_of": datetime.now(KST).isoformat(),
+        "as_of": as_of_dt.isoformat(),
+        "cycle_id": state.signal_cycle_id,
+        "expires_at": expires_at,
     }

@@ -1,6 +1,7 @@
 """PollState — process-wide singleton."""
 from collections import deque
 from dataclasses import dataclass, field
+from datetime import datetime


 @dataclass
@@ -15,8 +16,44 @@ class PollState:
     chronos_predictions: dict[str, dict] = field(default_factory=dict)
     minute_momentum: dict[str, str] = field(default_factory=dict)
     signals: dict[str, dict] = field(default_factory=dict)
+    # F5 lifecycle
+    signal_cycle_id: int = 0
     last_updated: dict[str, str] = field(default_factory=dict)
     fetch_errors: dict[str, int] = field(default_factory=dict)

+    def get_active_signals(self, now: datetime) -> list[dict]:
+        """Return only signals with expires_at > now. A missing or unparseable expires_at counts as expired."""
+        active: list[dict] = []
+        for sig in self.signals.values():
+            expires_at = sig.get("expires_at")
+            if not expires_at:
+                continue
+            try:
+                exp_dt = datetime.fromisoformat(expires_at)
+            except ValueError:
+                continue
+            if exp_dt > now:
+                active.append(sig)
+        return active
+
+    def purge_expired_signals(self, now: datetime) -> int:
+        """Remove expired signals, including those with a missing or unparseable expires_at. Returns the number removed."""
+        to_drop = []
+        for ticker, sig in self.signals.items():
+            expires_at = sig.get("expires_at")
+            if not expires_at:
+                to_drop.append(ticker)
+                continue
+            try:
+                exp_dt = datetime.fromisoformat(expires_at)
+            except ValueError:
+                to_drop.append(ticker)
+                continue
+            if exp_dt <= now:
+                to_drop.append(ticker)
+        for t in to_drop:
+            del self.signals[t]
+        return len(to_drop)
+

 state = PollState()
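
One detail of the expiry check is easy to miss: `expires_at` is written from a timezone-aware `datetime.now(KST)`, so the `now` passed to these methods has to be timezone-aware as well. The sketch below illustrates this with a hypothetical `is_active` helper that mirrors the per-signal logic; it is not part of the change, and a fixed +09:00 offset stands in for `ZoneInfo("Asia/Seoul")`.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed +09:00 offset instead of ZoneInfo("Asia/Seoul").
KST = timezone(timedelta(hours=9))


def is_active(sig: dict, now: datetime) -> bool:
    """Hypothetical per-signal version of the get_active_signals check."""
    expires_at = sig.get("expires_at")
    if not expires_at:
        return False  # legacy signal without expires_at counts as expired
    try:
        exp_dt = datetime.fromisoformat(expires_at)
    except ValueError:
        return False  # unparseable timestamp counts as expired
    return exp_dt > now


NOW = datetime(2026, 5, 25, 10, 0, tzinfo=KST)
FRESH = {"ticker": "A", "expires_at": (NOW + timedelta(seconds=300)).isoformat()}
OLD = {"ticker": "B", "expires_at": (NOW - timedelta(seconds=60)).isoformat()}

# Comparing an aware expires_at with a naive `now` raises TypeError,
# which the ValueError handler above does not catch.
try:
    is_active(FRESH, datetime(2026, 5, 25, 10, 0))
    NAIVE_RAISES = False
except TypeError:
    NAIVE_RAISES = True
```

The call sites in this change all pass `datetime.now(KST)`, so the mismatch would only arise if a future caller passed a naive timestamp.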

@@ -90,3 +90,54 @@ def test_return_computed_from_price_relative_to_last_close(mock_pipeline, mock_t
     daily = {"005930": _daily_ohlcv(list(range(41, 101)))}  # last = 100
     result = predictor.predict_batch(daily)
     assert abs(result["005930"].median - 0.10) < 0.001
+
+
+# ----- F4: confidence based on absolute spread -----
+
+def test_confidence_high_when_spread_near_zero(mock_pipeline, mock_torch_cpu):
+    """F4: conf≈1 when median≈0 and spread≈0 (regression case for the current relative formula).
+
+    Korean stock price of 100000 KRW, q10=q50=q90=100000 → median=0, spread=0.
+    The relative formula (spread/abs(median)) has a 0/0.001 guard, so spread=0 still gives conf=1,
+    but with median≈0 and a tiny spread (e.g. 1 KRW) the ratio blows up → conf=0.
+    The absolute formula has no such blow-up.
+    """
+    quantiles = _mk_quantiles_tensor(100000.0, 100000.0, 100000.0)
+    mock_pipeline.predict_quantiles.return_value = (quantiles, None)
+
+    from ai_trade.chronos_predictor import ChronosPredictor
+    predictor = ChronosPredictor(model_name="mock-model")
+    daily = {"005930": _daily_ohlcv([100000] * 60)}
+    result = predictor.predict_batch(daily)
+    assert result["005930"].conf > 0.95, (
+        f"median≈0 and spread≈0 but conf={result['005930'].conf} (F4 regression)"
+    )
+
+
+def test_confidence_half_at_spread_03(mock_pipeline, mock_torch_cpu):
+    """F4: conf ≈ 0.5 when spread is 0.30 (1 - 0.3/0.6)."""
+    # q10=85000 → -0.15, q90=115000 → 0.15, q50=100000 → 0.0
+    # spread = 0.30, conf = 1 - 0.30/0.60 = 0.50
+    quantiles = _mk_quantiles_tensor(85000.0, 100000.0, 115000.0)
+    mock_pipeline.predict_quantiles.return_value = (quantiles, None)
+
+    from ai_trade.chronos_predictor import ChronosPredictor
+    predictor = ChronosPredictor(model_name="mock-model")
+    daily = {"005930": _daily_ohlcv([100000] * 60)}
+    result = predictor.predict_batch(daily)
+    conf = result["005930"].conf
+    assert 0.45 < conf < 0.55, f"conf={conf} at spread=0.30 (expected ≈0.5)"
+
+
+def test_confidence_zero_at_threshold_spread(mock_pipeline, mock_torch_cpu):
+    """F4: conf=0 when spread equals _SPREAD_THRESHOLD (0.6)."""
+    quantiles = _mk_quantiles_tensor(70000.0, 100000.0, 130000.0)
+    mock_pipeline.predict_quantiles.return_value = (quantiles, None)
+
+    from ai_trade.chronos_predictor import ChronosPredictor
+    predictor = ChronosPredictor(model_name="mock-model")
+    daily = {"005930": _daily_ohlcv([100000] * 60)}
+    result = predictor.predict_batch(daily)
+    assert result["005930"].conf < 0.05, (
+        f"conf={result['005930'].conf} at spread=threshold (expected ≈0)"
+    )
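
The three tests above pin down the shape of the absolute-spread confidence without showing it. One formula consistent with them is a linear ramp clamped to [0, 1]. This is an inference from the test docstrings only: `absolute_spread_confidence` is a hypothetical name, and the actual code in `ChronosPredictor` is not part of this diff and may differ.

```python
# Mirrors the _SPREAD_THRESHOLD value quoted in the test docstring (0.6).
SPREAD_THRESHOLD = 0.6


def absolute_spread_confidence(q10: float, q90: float, last_close: float) -> float:
    """Assumed formula: conf = clamp(1 - spread / threshold, 0, 1).

    spread is the q90-q10 band expressed as a return relative to the last close,
    so it does not depend on the median and cannot blow up when median is near 0.
    """
    spread = (q90 - q10) / last_close
    conf = 1.0 - spread / SPREAD_THRESHOLD
    return max(0.0, min(1.0, conf))


# The three quantile sets used by the tests, all with last close = 100000.
CONF_TIGHT = absolute_spread_confidence(100000.0, 100000.0, 100000.0)
CONF_MID = absolute_spread_confidence(85000.0, 115000.0, 100000.0)
CONF_WIDE = absolute_spread_confidence(70000.0, 130000.0, 100000.0)
```

Under this assumption the tight band maps to a confidence near 1, the 0.30 spread to about 0.5, and the 0.60 spread to about 0, matching the ranges the tests assert.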

@@ -0,0 +1,22 @@
+"""F1: verify that the V1_TOKEN_PATH default goes through legacy/signal_v1/."""
+from pathlib import Path
+
+from ai_trade.config import Settings
+
+
+def test_v1_token_default_path_uses_legacy_dir(monkeypatch):
+    """Without V1_TOKEN_PATH in the env, the default is legacy/signal_v1/data/kis_token.json."""
+    monkeypatch.delenv("V1_TOKEN_PATH", raising=False)
+    settings = Settings()
+    expected_suffix = Path("legacy") / "signal_v1" / "data" / "kis_token.json"
+    assert str(settings.v1_token_path).endswith(str(expected_suffix)), (
+        f"expected default to end with {expected_suffix}, got {settings.v1_token_path}"
+    )
+
+
+def test_v1_token_env_override_wins(monkeypatch, tmp_path):
+    """A path set explicitly via env overrides the default."""
+    custom = tmp_path / "custom_token.json"
+    monkeypatch.setenv("V1_TOKEN_PATH", str(custom))
+    settings = Settings()
+    assert settings.v1_token_path == custom

@@ -0,0 +1,38 @@
+"""Tests for ai_trade heartbeat payload builder."""
+import json
+
+import pytest
+
+
+def test_trader_payload_market_open():
+    from ai_trade.heartbeat import build_trader_payload
+
+    p = json.loads(build_trader_payload("market_open", signals=2))
+    assert p["name"] == "ai_trade"
+    assert p["kind"] == "trader"
+    assert p["state"] == "market_open"
+    assert p["ts"].endswith("Z")
+    assert p["jobs_done"] == 2
+
+
+def test_trader_payload_market_closed():
+    from ai_trade.heartbeat import build_trader_payload
+
+    p = json.loads(build_trader_payload("market_closed"))
+    assert p["name"] == "ai_trade"
+    assert p["kind"] == "trader"
+    assert p["state"] == "market_closed"
+    assert p["jobs_done"] == 0
+    assert p["jobs_failed"] == 0
+    assert p["last_job_at"] is None
+
+
+def test_trader_payload_ts_format():
+    """Check that the ts field is in ISO 8601 UTC form (YYYY-MM-DDTHH:MM:SSZ)."""
+    from ai_trade.heartbeat import build_trader_payload
+    import re
+
+    p = json.loads(build_trader_payload("market_open"))
+    assert re.match(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z", p["ts"]), (
+        f"ts={p['ts']!r} does not match expected UTC format"
+    )

@@ -1,5 +1,7 @@
 """Tests for KISClient (REST)."""
+import asyncio
 import json
+import time as time_module
 from pathlib import Path

 import httpx
@@ -159,3 +161,30 @@ async def test_get_daily_ohlcv_returns_60_bars(kis_client_factory):
         assert "datetime" in bars[0]
     finally:
         await client.close()
+
+
+@respx.mock
+async def test_throttle_serializes_concurrent_gather(kis_client_factory):
+    """F2: five concurrent requests arriving via asyncio.gather are still serialized at 0.5 s intervals.
+
+    2 calls per second = 0.5 s spacing. Five requests take at least (5-1)*0.5 = 2.0 s.
+    Without the Lock, a race condition lets them go out almost simultaneously and finish in about 0.5 s.
+    """
+    sample = {"output2": []}
+    respx.get(
+        "https://openapivts.koreainvestment.com:29443"
+        "/uapi/domestic-stock/v1/quotations/inquire-time-itemchartprice"
+    ).mock(return_value=httpx.Response(200, json=sample))
+
+    client = kis_client_factory()
+    try:
+        start = time_module.monotonic()
+        await asyncio.gather(*[client.get_minute_ohlcv(f"00593{i}") for i in range(5)])
+        elapsed = time_module.monotonic() - start
+        # 5 throttles = at least (5-1)*0.5 = 2.0s, tolerance 0.3s
+        assert elapsed >= 1.7, (
+            f"throttle race condition: 5 concurrent calls took only {elapsed:.2f}s, "
+            f"expected >=1.7s (0.5s * 4 inter-call gaps)"
+        )
+    finally:
+        await client.close()

@@ -122,6 +122,7 @@ def test_poll_loop_calls_generate_signals_after_cycle(monkeypatch):
     settings.asking_bid_ratio_threshold = 0.6
     settings.confidence_threshold = 0.7
     settings.min_momentum_for_buy = "strong_up"
+    settings.signal_ttl_seconds = 300

     generate_signals(state, dedup, settings)

@@ -129,3 +130,112 @@ def test_poll_loop_calls_generate_signals_after_cycle(monkeypatch):
     assert state.signals["005930"]["action"] == "sell"
     assert state.signals["005930"]["confidence_webai"] == 1.0
     dedup.record.assert_called_with("005930", "sell", confidence=1.0)
+
+
+async def test_post_close_fires_at_1601_when_not_yet_today(monkeypatch):
+    """F3: a cycle that wakes at 16:01 still runs post_close if it has not run today (regression guard)."""
+    from datetime import datetime as _dt
+    from zoneinfo import ZoneInfo as _ZI
+    import asyncio as _asyncio
+    from ai_trade import pull_worker
+
+    _kst = _ZI("Asia/Seoul")
+    now_at_1601 = _dt(2026, 5, 18, 16, 1, tzinfo=_kst)
+
+    class FrozenDateTime:
+        @staticmethod
+        def now(tz=None):
+            return now_at_1601
+
+    monkeypatch.setattr(pull_worker, "datetime", FrozenDateTime)
+    monkeypatch.setattr(pull_worker, "_is_market_day", lambda n: True)
+    monkeypatch.setattr(pull_worker, "_is_polling_window", lambda n: True)
+    monkeypatch.setattr(pull_worker, "_next_interval", lambda n: 0.01)
+    monkeypatch.setattr(pull_worker, "_run_polling_cycle", AsyncMock())
+    monkeypatch.setattr(pull_worker, "update_minute_momentum_for_all", lambda s: None)
+    post_close = AsyncMock()
+    monkeypatch.setattr(pull_worker, "_run_post_close_cycle", post_close)
+
+    state = MagicMock()
+    chronos = MagicMock()
+    kis = MagicMock()
+    shutdown = _asyncio.Event()
+
+    async def _stop_soon():
+        await _asyncio.sleep(0.05)
+        shutdown.set()
+
+    _asyncio.create_task(_stop_soon())
+    await pull_worker.poll_loop(
+        client=MagicMock(),
+        state=state,
+        shutdown=shutdown,
+        kis_client=kis,
+        chronos=chronos,
+        dedup=None,
+        settings=None,
+    )
+    assert post_close.await_count >= 1, "post-close was not called at 16:01 (F3 regression)"
+
+
+async def test_poll_loop_purges_expired_signals(monkeypatch):
+    """F5: expired signals are removed at the end of every cycle."""
+    from datetime import datetime as _dt
+    from zoneinfo import ZoneInfo as _ZI
+    import asyncio as _asyncio
+    from ai_trade import pull_worker
+    from ai_trade.state import PollState
+
+    _kst = _ZI("Asia/Seoul")
+    now = _dt(2026, 5, 18, 10, 0, tzinfo=_kst)
+
+    class FrozenDT:
+        @staticmethod
+        def now(tz=None):
+            return now
+
+    state = PollState()
+    state.signals = {
+        "OLD": {
+            "ticker": "OLD",
+            "expires_at": _dt(2026, 5, 18, 9, 0, tzinfo=_kst).isoformat(),
+            "cycle_id": 1,
+        },
+        "FRESH": {
+            "ticker": "FRESH",
+            "expires_at": _dt(2026, 5, 18, 10, 30, tzinfo=_kst).isoformat(),
+            "cycle_id": 1,
+        },
+    }
+
+    monkeypatch.setattr(pull_worker, "datetime", FrozenDT)
+    monkeypatch.setattr(pull_worker, "_is_market_day", lambda n: True)
+    monkeypatch.setattr(pull_worker, "_is_polling_window", lambda n: True)
+    monkeypatch.setattr(pull_worker, "_next_interval", lambda n: 0.01)
+    monkeypatch.setattr(pull_worker, "_run_polling_cycle", AsyncMock())
+    monkeypatch.setattr(pull_worker, "update_minute_momentum_for_all", lambda s: None)
+    monkeypatch.setattr(pull_worker, "_is_post_close_trigger", lambda *a, **k: False)
+
+    shutdown = _asyncio.Event()
+
+    async def stop_soon():
+        await _asyncio.sleep(0.05)
+        shutdown.set()
+
+    _asyncio.create_task(stop_soon())
+    await pull_worker.poll_loop(
+        client=MagicMock(),
+        state=state,
+        shutdown=shutdown,
+        kis_client=MagicMock(),
+        chronos=MagicMock(),
+        dedup=None,
+        settings=None,
+    )
+    assert "OLD" not in state.signals
+    assert "FRESH" in state.signals

@@ -79,3 +79,41 @@ def test_next_interval_dead_zone_skip():
     interval = _next_interval(now)
     # 02:00 → 04:30 = 2.5h = 9000s
     assert 9000 - 60 < interval < 9000 + 60
+
+
+# ----- F3: state-based post-close trigger -----
+
+from datetime import date as _date  # noqa: E402
+
+from ai_trade.scheduler import _is_post_close_trigger  # noqa: E402
+
+
+def test_post_close_trigger_fires_at_1601_if_not_yet_today():
+    """F3: a cycle that wakes at 16:01 still triggers if post-close has not run today."""
+    now = _kst(2026, 5, 18, 16, 1)
+    assert _is_post_close_trigger(now, last_post_close_date=None) is True
+
+
+def test_post_close_trigger_skips_if_already_today():
+    """F3: no trigger if it already ran today."""
+    now = _kst(2026, 5, 18, 16, 5)
+    today = _date(2026, 5, 18)
+    assert _is_post_close_trigger(now, last_post_close_date=today) is False
+
+
+def test_post_close_trigger_skips_before_1600():
+    """F3: no trigger before 16:00."""
+    now = _kst(2026, 5, 18, 15, 59)
+    assert _is_post_close_trigger(now, last_post_close_date=None) is False
+
+
+def test_post_close_trigger_fires_next_day_after_reset():
+    """F3: triggers again on the next business day."""
+    now = _kst(2026, 5, 19, 16, 0)
+    yesterday = _date(2026, 5, 18)
+    assert _is_post_close_trigger(now, last_post_close_date=yesterday) is True
+
+
+def test_post_close_trigger_skips_on_holiday():
+    """F3: no trigger on a market holiday (2026-05-05, Children's Day)."""
+    now = _kst(2026, 5, 5, 16, 30)
+    assert _is_post_close_trigger(now, last_post_close_date=None) is False

@@ -16,6 +16,7 @@ def _settings(**overrides):
         asking_bid_ratio_threshold=0.6,
         confidence_threshold=0.7,
         min_momentum_for_buy="strong_up",
+        signal_ttl_seconds=300,
     )
     defaults.update(overrides)
     m = MagicMock()
@@ -170,3 +171,48 @@ def test_sell_signal_triggers_on_anomaly_path(dedup_mock):
     assert sig["action"] == "sell"
     assert sig["context"]["sell_reason"] == "anomaly"
     assert sig["confidence_webai"] > 0.7
+
+
+# ----- F5: attach cycle_id + expires_at -----
+
+def test_emit_attaches_cycle_id_and_expires_at(dedup_mock):
+    """F5: an emitted signal carries cycle_id (state.signal_cycle_id) and expires_at."""
+    from datetime import datetime, timedelta
+    from zoneinfo import ZoneInfo
+    _kst = ZoneInfo("Asia/Seoul")
+
+    state = _make_state_with_buy_candidate()
+    before = datetime.now(_kst)
+    generate_signals(state, dedup_mock, _settings(signal_ttl_seconds=300))
+    after = datetime.now(_kst)
+
+    sig = state.signals["005930"]
+    assert sig["cycle_id"] == 1
+    assert "expires_at" in sig
+    exp_dt = datetime.fromisoformat(sig["expires_at"])
+    assert before + timedelta(seconds=295) < exp_dt < after + timedelta(seconds=305)
+
+
+def test_cycle_id_increments_each_call(dedup_mock):
+    """F5: cycle_id += 1 on every generate_signals call (whether or not a signal is emitted)."""
+    state = _make_state_with_buy_candidate()
+    generate_signals(state, dedup_mock, _settings())
+    assert state.signal_cycle_id == 1
+    # Second call: cycle_id still increments even when dedup blocks the emit
+    dedup_mock.is_recent.return_value = True
+    generate_signals(state, dedup_mock, _settings())
+    assert state.signal_cycle_id == 2
+
+
+def test_sell_signal_also_carries_cycle_id_and_expires_at(dedup_mock):
+    """F5: sell signals carry the same fields."""
+    from datetime import datetime
+    state = _make_state_with_holding(pnl_pct=-0.08, current_price=68000)
+    generate_signals(state, dedup_mock, _settings(signal_ttl_seconds=120))
+
+    assert "005930" in state.signals
+    sig = state.signals["005930"]
+    assert sig["action"] == "sell"
+    assert sig["cycle_id"] == 1
+    # parse expires_at as ISO — must succeed
+    datetime.fromisoformat(sig["expires_at"])

@@ -0,0 +1,66 @@
+"""F5 — state.signals lifecycle (expires_at + cycle_id)."""
+from datetime import datetime, timedelta
+from zoneinfo import ZoneInfo
+
+from ai_trade.state import PollState
+
+KST = ZoneInfo("Asia/Seoul")
+
+
+def test_initial_signal_cycle_id_is_zero():
+    state = PollState()
+    assert state.signal_cycle_id == 0
+
+
+def test_get_active_signals_excludes_expired():
+    state = PollState()
+    now = datetime(2026, 5, 25, 10, 0, tzinfo=KST)
+    future = (now + timedelta(seconds=300)).isoformat()
+    past = (now - timedelta(seconds=60)).isoformat()
+    state.signals = {
+        "A": {"ticker": "A", "expires_at": future, "cycle_id": 1, "action": "buy"},
+        "B": {"ticker": "B", "expires_at": past, "cycle_id": 1, "action": "buy"},
+    }
+    active = state.get_active_signals(now)
+    tickers = [s["ticker"] for s in active]
+    assert "A" in tickers
+    assert "B" not in tickers
+
+
+def test_get_active_signals_treats_missing_expires_as_expired():
+    """Legacy signals without expires_at are treated as expired."""
+    state = PollState()
+    now = datetime(2026, 5, 25, 10, 0, tzinfo=KST)
+    state.signals = {"C": {"ticker": "C", "action": "buy"}}
+    assert state.get_active_signals(now) == []
+
+
+def test_purge_expired_signals_removes_expired():
+    state = PollState()
+    now = datetime(2026, 5, 25, 10, 0, tzinfo=KST)
+    future = (now + timedelta(seconds=300)).isoformat()
+    past = (now - timedelta(seconds=60)).isoformat()
+    state.signals = {
+        "A": {"ticker": "A", "expires_at": future, "cycle_id": 1},
+        "B": {"ticker": "B", "expires_at": past, "cycle_id": 1},
+    }
+    removed = state.purge_expired_signals(now)
+    assert "A" in state.signals
+    assert "B" not in state.signals
+    assert removed == 1
+
+
+# ----- SIGNAL_TTL_SECONDS env -----
+
+def test_signal_ttl_seconds_default(monkeypatch):
+    monkeypatch.delenv("SIGNAL_TTL_SECONDS", raising=False)
+    from ai_trade.config import Settings
+    s = Settings()
+    assert s.signal_ttl_seconds == 300
+
+
+def test_signal_ttl_seconds_env_override(monkeypatch):
+    monkeypatch.setenv("SIGNAL_TTL_SECONDS", "60")
+    from ai_trade.config import Settings
+    s = Settings()
+    assert s.signal_ttl_seconds == 60

@@ -7,6 +7,7 @@ pytest>=8.0
 pytest-asyncio>=0.23
 respx>=0.21
 websockets>=12
+redis>=5.0
 # Phase 3b dependencies (Chronos-2 + ML)
 transformers>=4.40
 chronos-forecasting>=1.4

@@ -0,0 +1,55 @@
+"""Distributed worker heartbeat: SET worker:<name>:heartbeat (with TTL). Global Constraints contract 1."""
+from __future__ import annotations
+
+import asyncio, datetime as dt, json, logging, os
+
+logger = logging.getLogger(__name__)
+
+DEFAULT_INTERVAL = int(os.getenv("HEARTBEAT_INTERVAL", "15"))
+DEFAULT_TTL = int(os.getenv("HEARTBEAT_TTL", "45"))
+
+
+class WorkerStats:
+    """Mutable counters updated by worker_loop and read by heartbeat_loop."""
+
+    def __init__(self):
+        self.busy = False
+        self.jobs_done = 0
+        self.jobs_failed = 0
+        self.last_job_at = None  # ISO str | None
+
+
+def utc_now_iso() -> str:
+    return dt.datetime.now(dt.timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
+
+
+def build_payload(name: str, kind: str, state: str, stats: WorkerStats, extra: dict | None = None) -> str:
+    payload = {
+        "name": name, "kind": kind, "state": state, "ts": utc_now_iso(),
+        "last_job_at": stats.last_job_at,
+        "jobs_done": stats.jobs_done, "jobs_failed": stats.jobs_failed,
+    }
+    if extra:
+        payload.update(extra)
+    return json.dumps(payload)
+
+
+async def render_state(redis, stats: WorkerStats, paused_key: str = "queue:paused") -> str:
+    if await redis.get(paused_key) == b"1":
+        return "paused"
+    return "busy" if stats.busy else "idle"
+
+
+async def heartbeat_loop(redis, name, kind, stats, *, interval=DEFAULT_INTERVAL,
+                         ttl=DEFAULT_TTL, paused_key="queue:paused", state_fn=None):
+    key = f"worker:{name}:heartbeat"
+    logger.info("heartbeat started name=%s ttl=%ds", name, ttl)
+    while True:
+        try:
+            if state_fn is not None:
+                state, extra = await state_fn(redis, stats)
+            else:
+                state, extra = await render_state(redis, stats, paused_key), None
+            await redis.set(key, build_payload(name, kind, state, stats, extra), ex=ttl)
+        except asyncio.CancelledError:
+            raise
+        except Exception:
+            logger.exception("heartbeat send failed name=%s", name)
+        await asyncio.sleep(interval)

@@ -0,0 +1,2 @@
+[pytest]
+asyncio_mode = auto

View File

@@ -0,0 +1,135 @@
"""F6: Reliable Redis queue with processing list + recovery + retry.

Pattern:
  - BLMOVE main → processing (atomic dequeue)
  - ack: LREM processing (1 occurrence)
  - fail: LREM processing + (re-enqueue with attempts++ OR move to dead-letter)
  - recover: startup-time orphan recovery (worker's processing list → main queue)

Producer side stays unchanged: LPUSH queue:<x> <json payload>.
Worker side: dequeue() → process → ack(raw) on success or fail(raw, payload) on error.
Startup: await queue.recover() to re-enqueue orphans.
"""
from __future__ import annotations

import json
import logging
import os
import socket
from typing import Optional

logger = logging.getLogger(__name__)


def default_worker_id(queue_key: str) -> str:
    """WORKER_ID env var if set, else <queue_key>-<hostname>-<pid>."""
    explicit = os.getenv("WORKER_ID")
    if explicit:
        return explicit
    return f"{queue_key}-{socket.gethostname()}-{os.getpid()}"


class ReliableQueue:
    """BLMOVE-backed atomic dequeue + processing list + retry/dead-letter."""

    def __init__(
        self,
        redis,
        queue_key: str,
        worker_id: Optional[str] = None,
        max_attempts: int = 3,
    ):
        self._redis = redis
        self._queue_key = queue_key
        self._worker_id = worker_id or default_worker_id(queue_key)
        self._processing_key = f"processing:{queue_key}:{self._worker_id}"
        self._dead_letter_key = f"dead_letter:{queue_key}"
        self._max_attempts = max_attempts

    @property
    def worker_id(self) -> str:
        return self._worker_id

    @property
    def processing_key(self) -> str:
        return self._processing_key

    async def dequeue(self, timeout: int = 5) -> Optional[tuple[dict, bytes]]:
        """Atomically move 1 item from main queue tail to processing head.

        Returns (parsed_dict, raw_bytes) or None on timeout/parse-error.
        Caller MUST call ack(raw) on success or fail(raw, payload) on error.
        """
        raw = await self._redis.blmove(
            self._queue_key, self._processing_key,
            timeout, "RIGHT", "LEFT",
        )
        if raw is None:
            return None
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            # Unparseable payloads cannot be retried; park them in dead-letter.
            logger.error(
                "invalid payload on dequeue, moving to dead-letter: %r", raw[:200]
            )
            await self._redis.lrem(self._processing_key, 1, raw)
            await self._redis.lpush(self._dead_letter_key, raw)
            return None
        return payload, raw

    async def ack(self, raw: bytes) -> None:
        """Successful processing: remove from processing list."""
        removed = await self._redis.lrem(self._processing_key, 1, raw)
        if removed == 0:
            logger.warning("ack on missing payload (already removed?): %r", raw[:100])

    async def fail(self, raw: bytes, payload: dict) -> None:
        """Failed processing: remove from processing list and re-enqueue or dead-letter."""
        await self._redis.lrem(self._processing_key, 1, raw)
        attempts = int(payload.get("attempts", 0)) + 1
        if attempts >= self._max_attempts:
            payload["attempts"] = attempts
            await self._redis.lpush(self._dead_letter_key, json.dumps(payload).encode())
            logger.error(
                "task moved to dead-letter after %d attempts: task_id=%s",
                attempts, payload.get("task_id"),
            )
            return
        payload["attempts"] = attempts
        await self._redis.lpush(self._queue_key, json.dumps(payload).encode())
        logger.info(
            "task re-enqueued (attempt %d/%d): task_id=%s",
            attempts, self._max_attempts, payload.get("task_id"),
        )

    async def recover(self) -> int:
        """Startup: move all orphans from this worker's processing list back to main queue.

        Increments attempts counter (orphan == implicit failure). Returns count.
        """
        count = 0
        while True:
            raw = await self._redis.lpop(self._processing_key)
            if raw is None:
                break
            try:
                payload = json.loads(raw)
            except json.JSONDecodeError:
                await self._redis.lpush(self._dead_letter_key, raw)
                count += 1
                continue
            payload["attempts"] = int(payload.get("attempts", 0)) + 1
            if payload["attempts"] >= self._max_attempts:
                await self._redis.lpush(
                    self._dead_letter_key, json.dumps(payload).encode()
                )
            else:
                await self._redis.lpush(
                    self._queue_key, json.dumps(payload).encode()
                )
            count += 1
        if count:
            logger.info(
                "recovered %d orphaned items for worker %s", count, self._worker_id
            )
        return count
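
Reviewer note: to make the retry accounting concrete, here is a small in-memory model of the fail path, with plain Python lists standing in for the Redis lists. `simulate_failures` is a hypothetical helper, not part of this diff; it mirrors only the `attempts` arithmetic of `ReliableQueue.fail` and assumes every processing attempt fails.

```python
from __future__ import annotations

import json


def simulate_failures(payload: dict, max_attempts: int = 3) -> tuple[int, dict]:
    """Hypothetical model: fail a task repeatedly until it is dead-lettered.

    Returns (number_of_failed_runs, dead_lettered_payload).
    """
    queue = [json.dumps(payload)]   # stands in for queue:<x>
    dead_letter: list[str] = []     # stands in for dead_letter:queue:<x>
    failures = 0
    while queue:
        task = json.loads(queue.pop())         # dequeue from the tail
        failures += 1                          # every run fails in this model
        task["attempts"] = int(task.get("attempts", 0)) + 1
        if task["attempts"] >= max_attempts:
            dead_letter.append(json.dumps(task))
        else:
            queue.insert(0, json.dumps(task))  # LPUSH back onto the main queue
    return failures, json.loads(dead_letter[0])
```

Under this model, `max_attempts=3` means a task is processed at most three times before it lands in dead-letter. Since `recover()` also increments `attempts`, an orphaned task spends one of those attempts on the crash.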

View File

@@ -0,0 +1 @@
redis>=5.0.0

View File

@@ -0,0 +1,46 @@
"""Tests for _shared.heartbeat — Task A1."""
import json
import sys
from pathlib import Path

import pytest

# Make `_shared` importable (same pattern as test_reliable_queue.py)
sys.path.insert(0, str(Path(__file__).resolve().parent.parent.parent))

from _shared.heartbeat import WorkerStats, build_payload, render_state


def test_build_payload_has_contract_fields():
    s = WorkerStats(); s.jobs_done = 3; s.last_job_at = "2026-06-29T00:00:00Z"
    payload = json.loads(build_payload("image-render", "render", "idle", s))
    assert payload["name"] == "image-render"
    assert payload["kind"] == "render"
    assert payload["state"] == "idle"
    assert payload["jobs_done"] == 3
    assert payload["last_job_at"] == "2026-06-29T00:00:00Z"
    assert payload["ts"].endswith("Z")


def test_build_payload_merges_extra():
    payload = json.loads(build_payload("task-watcher", "watcher", "free", WorkerStats(), extra={"mode": "free"}))
    assert payload["mode"] == "free"


class _FakeRedis:
    def __init__(self, paused): self._paused = paused
    async def get(self, key): return b"1" if self._paused else None


@pytest.mark.asyncio
async def test_render_state_paused_overrides_busy():
    s = WorkerStats(); s.busy = True
    assert await render_state(_FakeRedis(paused=True), s) == "paused"


@pytest.mark.asyncio
async def test_render_state_busy_then_idle():
    s = WorkerStats(); s.busy = True
    assert await render_state(_FakeRedis(paused=False), s) == "busy"
    s.busy = False
    assert await render_state(_FakeRedis(paused=False), s) == "idle"

View File

@@ -0,0 +1,84 @@
"""F6: ReliableQueue tests (atomic dequeue + recovery + retry)."""
import json
import sys
from pathlib import Path

import fakeredis.aioredis
import pytest

# Make `_shared` importable when tests run from services/_shared
sys.path.insert(0, str(Path(__file__).resolve().parent.parent.parent))

from _shared.reliable_queue import ReliableQueue


@pytest.fixture
async def redis():
    r = fakeredis.aioredis.FakeRedis(decode_responses=False)
    yield r
    await r.flushall()
    await r.aclose()


async def test_dequeue_atomically_moves_to_processing(redis):
    """BLMOVE: atomic move from queue to processing."""
    q = ReliableQueue(redis, queue_key="queue:test", worker_id="w1")
    await redis.lpush("queue:test", json.dumps({"task_id": "t1"}).encode())
    result = await q.dequeue(timeout=1)
    assert result is not None
    payload, raw = result
    assert payload["task_id"] == "t1"
    assert await redis.llen("queue:test") == 0
    assert await redis.llen("processing:queue:test:w1") == 1


async def test_dequeue_returns_none_on_timeout(redis):
    q = ReliableQueue(redis, queue_key="queue:test", worker_id="w1")
    result = await q.dequeue(timeout=1)
    assert result is None


async def test_ack_removes_from_processing(redis):
    q = ReliableQueue(redis, queue_key="queue:test", worker_id="w1")
    await redis.lpush("queue:test", json.dumps({"task_id": "t1"}).encode())
    _, raw = await q.dequeue(timeout=1)
    await q.ack(raw)
    assert await redis.llen("processing:queue:test:w1") == 0


async def test_recover_returns_orphaned_to_main_queue(redis):
    """Startup recovery: leftover processing-list items are returned to the main queue."""
    orphan = json.dumps({"task_id": "t1", "attempts": 0}).encode()
    await redis.lpush("processing:queue:test:w1", orphan)
    q = ReliableQueue(redis, queue_key="queue:test", worker_id="w1")
    recovered = await q.recover()
    assert recovered == 1
    assert await redis.llen("processing:queue:test:w1") == 0
    payload, _ = await q.dequeue(timeout=1)
    assert payload["task_id"] == "t1"
    assert payload["attempts"] == 1  # incremented on recover


async def test_fail_below_max_attempts_returns_to_main_queue(redis):
    q = ReliableQueue(redis, queue_key="queue:test", worker_id="w1", max_attempts=3)
    await redis.lpush("queue:test", json.dumps({"task_id": "t1", "attempts": 0}).encode())
    payload, raw = await q.dequeue(timeout=1)
    await q.fail(raw, payload)
    assert await redis.llen("processing:queue:test:w1") == 0
    assert await redis.llen("queue:test") == 1
    requeued_raw = await redis.lindex("queue:test", 0)
    requeued = json.loads(requeued_raw)
    assert requeued["attempts"] == 1


async def test_fail_at_max_attempts_moves_to_dead_letter(redis):
    q = ReliableQueue(redis, queue_key="queue:test", worker_id="w1", max_attempts=3)
    await redis.lpush(
        "queue:test", json.dumps({"task_id": "t1", "attempts": 2}).encode()
    )
    payload, raw = await q.dequeue(timeout=1)
    await q.fail(raw, payload)
    # attempts 2 → 3 (== max) → dead-letter
    assert await redis.llen("queue:test") == 0
    assert await redis.llen("processing:queue:test:w1") == 0
    assert await redis.llen("dead_letter:queue:test") == 1

View File

@@ -3,7 +3,8 @@ name: web-ai-services
services: services:
insta-render: insta-render:
build: build:
context: ./insta-render context: .
dockerfile: insta-render/Dockerfile
container_name: insta-render container_name: insta-render
restart: unless-stopped restart: unless-stopped
ports: ports:
@@ -13,7 +14,7 @@ services:
- REDIS_URL=${REDIS_URL:-redis://192.168.45.54:6379} - REDIS_URL=${REDIS_URL:-redis://192.168.45.54:6379}
- NAS_BASE_URL=${NAS_BASE_URL:-http://192.168.45.54:18700} - NAS_BASE_URL=${NAS_BASE_URL:-http://192.168.45.54:18700}
- INTERNAL_API_KEY=${INTERNAL_API_KEY:-} - INTERNAL_API_KEY=${INTERNAL_API_KEY:-}
- INSTA_MEDIA_ROOT=${INSTA_MEDIA_ROOT:-/mnt/nas/webpage/data/insta} - INSTA_MEDIA_ROOT=${INSTA_MEDIA_ROOT:-/mnt/nas/webpage/data/insta/insta_cards}
- INSTA_MEDIA_URL_PREFIX=${INSTA_MEDIA_URL_PREFIX:-/media/insta} - INSTA_MEDIA_URL_PREFIX=${INSTA_MEDIA_URL_PREFIX:-/media/insta}
- CARD_TEMPLATE_DIR=/app/templates - CARD_TEMPLATE_DIR=/app/templates
volumes: volumes:
@@ -26,7 +27,8 @@ services:
music-render: music-render:
build: build:
context: ./music-render context: .
dockerfile: music-render/Dockerfile
container_name: music-render container_name: music-render
restart: unless-stopped restart: unless-stopped
ports: ports:
@@ -52,7 +54,8 @@ services:
video-render: video-render:
build: build:
context: ./video-render context: .
dockerfile: video-render/Dockerfile
container_name: video-render container_name: video-render
restart: unless-stopped restart: unless-stopped
ports: ports:
@@ -79,7 +82,8 @@ services:
task-watcher: task-watcher:
build: build:
context: ./task-watcher context: .
dockerfile: task-watcher/Dockerfile
container_name: task-watcher container_name: task-watcher
restart: unless-stopped restart: unless-stopped
ports: ports:
@@ -98,7 +102,8 @@ services:
image-render: image-render:
build: build:
context: ./image-render context: .
dockerfile: image-render/Dockerfile
container_name: image-render container_name: image-render
restart: unless-stopped restart: unless-stopped
ports: ports:
@@ -122,3 +127,28 @@ services:
       interval: 60s
       timeout: 5s
       retries: 3
+
+  trade-monitor:
+    build:
+      context: .
+      dockerfile: trade-monitor/Dockerfile
+    container_name: trade-monitor
+    restart: unless-stopped
+    ports:
+      - "18715:8000"
+    environment:
+      - TZ=Asia/Seoul
+      - REDIS_URL=${REDIS_URL:-redis://192.168.45.54:6379}
+      - NAS_BASE_URL=${NAS_BASE_URL:-http://192.168.45.54:18500}
+      - WEBAI_API_KEY=${WEBAI_API_KEY:-}
+      - TM_KIS_APP_KEY=${TM_KIS_APP_KEY:-}
+      - TM_KIS_APP_SECRET=${TM_KIS_APP_SECRET:-}
+      - TM_KIS_ACCOUNT=${TM_KIS_ACCOUNT:-}
+      - TM_KIS_IS_VIRTUAL=${TM_KIS_IS_VIRTUAL:-0}
+      - TM_LOOP_INTERVAL=${TM_LOOP_INTERVAL:-60}
+      - TM_CLIMAX_VOL_MULT=${TM_CLIMAX_VOL_MULT:-3.0}
+    healthcheck:
+      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
+      interval: 60s
+      timeout: 5s
+      retries: 3

View File

@@ -7,10 +7,13 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
     ca-certificates \
     && rm -rf /var/lib/apt/lists/*
-COPY requirements.txt .
+COPY image-render/requirements.txt /app/
 RUN pip install --no-cache-dir --timeout 600 --retries 5 -r requirements.txt
-COPY . .
+# F6: shared ReliableQueue module (services/_shared)
+COPY _shared /app/_shared
+COPY image-render/. /app/
+ENV PYTHONPATH=/app
 EXPOSE 8000
 CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "1"]

View File

@@ -0,0 +1,5 @@
"""Make services/ root importable so `from _shared.reliable_queue import ...` works during tests."""
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

View File

@@ -3,11 +3,14 @@ from __future__ import annotations
 import asyncio
 import logging
+import os
 from contextlib import asynccontextmanager
+import redis.asyncio as aioredis
 from fastapi import FastAPI
 import worker
+from _shared.heartbeat import heartbeat_loop
 logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(levelname)s %(message)s")
 logger = logging.getLogger(__name__)
@@ -16,15 +19,19 @@ logger = logging.getLogger(__name__)
 @asynccontextmanager
 async def lifespan(app: FastAPI):
     worker_task = asyncio.create_task(worker.worker_loop())
+    hb_redis = aioredis.from_url(os.getenv("REDIS_URL", "redis://192.168.45.54:6379"), decode_responses=False)
+    hb_task = asyncio.create_task(heartbeat_loop(hb_redis, "image-render", "render", worker.stats))
     logger.info("image-render lifespan started")
     try:
         yield
     finally:
-        worker_task.cancel()
-        try:
-            await worker_task
-        except asyncio.CancelledError:
-            pass
+        for t in (worker_task, hb_task):
+            t.cancel()
+            try:
+                await t
+            except asyncio.CancelledError:
+                pass
+        await hb_redis.aclose()
         logger.info("image-render lifespan stopped")

View File

@@ -0,0 +1,83 @@
{
"5": {
"inputs": {
"width": 1024,
"height": 1024,
"batch_size": 1
},
"class_type": "EmptyLatentImage",
"_meta": {"title": "Empty Latent Image"}
},
"6": {
"inputs": {
"text": "%PROMPT%",
"clip": ["11", 0]
},
"class_type": "CLIPTextEncode",
"_meta": {"title": "Positive Prompt"}
},
"8": {
"inputs": {
"samples": ["13", 0],
"vae": ["10", 0]
},
"class_type": "VAEDecode",
"_meta": {"title": "VAE Decode"}
},
"9": {
"inputs": {
"filename_prefix": "flux",
"images": ["8", 0]
},
"class_type": "SaveImage",
"_meta": {"title": "Save Image"}
},
"10": {
"inputs": {
"vae_name": "ae.safetensors"
},
"class_type": "VAELoader",
"_meta": {"title": "Load VAE"}
},
"11": {
"inputs": {
"clip_name1": "clip_l.safetensors",
"clip_name2": "t5xxl_fp8_e4m3fn.safetensors",
"type": "flux"
},
"class_type": "DualCLIPLoader",
"_meta": {"title": "Dual CLIP Loader"}
},
"12": {
"inputs": {
"unet_name": "flux1-schnell-fp8.safetensors",
"weight_dtype": "default"
},
"class_type": "UNETLoader",
"_meta": {"title": "Load Diffusion Model"}
},
"13": {
"inputs": {
"seed": 0,
"steps": 4,
"cfg": 1.0,
"sampler_name": "euler",
"scheduler": "simple",
"denoise": 1.0,
"model": ["12", 0],
"positive": ["6", 0],
"negative": ["33", 0],
"latent_image": ["5", 0]
},
"class_type": "KSampler",
"_meta": {"title": "KSampler"}
},
"33": {
"inputs": {
"text": "",
"clip": ["11", 0]
},
"class_type": "CLIPTextEncode",
"_meta": {"title": "Negative Prompt (empty for Schnell)"}
}
}

View File

@@ -1,3 +1,8 @@
import json
from unittest.mock import AsyncMock, MagicMock
import pytest
import worker import worker
@@ -13,3 +18,74 @@ def test_dispatch_unknown_job_type_reports_failed(monkeypatch):
monkeypatch.setattr(worker, "webhook_update_task", lambda *a, **k: calls.append((a, k))) monkeypatch.setattr(worker, "webhook_update_task", lambda *a, **k: calls.append((a, k)))
worker._dispatch({"job_type": "midjourney_generation", "task_id": "t9", "params": {}}) worker._dispatch({"job_type": "midjourney_generation", "task_id": "t9", "params": {}})
assert calls[-1][0][1] == "failed" assert calls[-1][0][1] == "failed"
# ----- F6: ReliableQueue poll_once -----
@pytest.mark.asyncio
async def test_poll_once_acks_on_success(monkeypatch):
payload = {"task_id": "t1", "job_type": "gpt_image_generation", "params": {}}
raw = json.dumps(payload).encode()
fake_queue = AsyncMock()
fake_queue.dequeue = AsyncMock(return_value=(payload, raw))
fake_queue.ack = AsyncMock()
fake_queue.fail = AsyncMock()
monkeypatch.setattr(worker, "_dispatch", MagicMock())
handled = await worker.poll_once(fake_queue)
assert handled is True
fake_queue.ack.assert_awaited_once_with(raw)
fake_queue.fail.assert_not_awaited()
@pytest.mark.asyncio
async def test_poll_once_calls_fail_on_dispatch_exception(monkeypatch):
payload = {"task_id": "t2", "job_type": "gpt_image_generation", "params": {}}
raw = json.dumps(payload).encode()
fake_queue = AsyncMock()
fake_queue.dequeue = AsyncMock(return_value=(payload, raw))
fake_queue.ack = AsyncMock()
fake_queue.fail = AsyncMock()
def _boom(p):
raise RuntimeError("dispatch crash")
monkeypatch.setattr(worker, "_dispatch", _boom)
handled = await worker.poll_once(fake_queue)
assert handled is True
fake_queue.fail.assert_awaited_once_with(raw, payload)
fake_queue.ack.assert_not_awaited()
@pytest.mark.asyncio
async def test_poll_once_returns_false_on_timeout(monkeypatch):
fake_queue = AsyncMock()
fake_queue.dequeue = AsyncMock(return_value=None)
fake_queue.ack = AsyncMock()
fake_queue.fail = AsyncMock()
monkeypatch.setattr(worker, "_dispatch", MagicMock())
handled = await worker.poll_once(fake_queue)
assert handled is False
fake_queue.ack.assert_not_awaited()
fake_queue.fail.assert_not_awaited()
# ----- heartbeat stats counters -----
class _OneJobQueue:
def __init__(self): self.acked = False
async def dequeue(self, timeout=5):
if self.acked: return None
return ({"job_type": "flux_generation", "task_id": "t1", "params": {}}, b"raw")
async def ack(self, raw): self.acked = True
async def fail(self, raw, payload): pass
@pytest.mark.asyncio
async def test_poll_once_increments_jobs_done(monkeypatch):
worker.stats.jobs_done = 0
monkeypatch.setattr(worker, "run_flux_generation", lambda task_id, params: None)
handled = await worker.poll_once(_OneJobQueue())
assert handled is True
assert worker.stats.jobs_done == 1
assert worker.stats.busy is False
assert worker.stats.last_job_at is not None

View File

@@ -1,7 +1,7 @@
-"""Redis BLPOP worker: queue:image-render → job_type dispatch → NAS webhook.
+"""Redis ReliableQueue worker: F6 reliability pattern (BLMOVE + ack/fail + recovery).
 Waits while queue:paused is set (task-watcher sets it when it detects activity by 박재오).
-video-render worker.py pattern: string-based dispatch + getattr (compatible with test patching).
+String-based dispatch + getattr (compatible with test patching).
 """
 from __future__ import annotations
@@ -17,6 +17,8 @@ from nas_client import webhook_update_task
from providers.gpt_image import run_gpt_image_generation from providers.gpt_image import run_gpt_image_generation
from providers.nano_banana import run_nano_banana_generation from providers.nano_banana import run_nano_banana_generation
from providers.flux import run_flux_generation from providers.flux import run_flux_generation
from _shared.reliable_queue import ReliableQueue
from _shared.heartbeat import WorkerStats, utc_now_iso
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -24,6 +26,8 @@ REDIS_URL = os.getenv("REDIS_URL", "redis://192.168.45.54:6379")
QUEUE_KEY = "queue:image-render" QUEUE_KEY = "queue:image-render"
PAUSED_KEY = "queue:paused" PAUSED_KEY = "queue:paused"
stats = WorkerStats()
# string names so `unittest.mock.patch` / `monkeypatch.setattr` on `worker.<name>` # string names so `unittest.mock.patch` / `monkeypatch.setattr` on `worker.<name>`
# is correctly intercepted by getattr(sys.modules[__name__], ...) # is correctly intercepted by getattr(sys.modules[__name__], ...)
_DISPATCH_TABLE = { _DISPATCH_TABLE = {
@@ -52,25 +56,49 @@ def _dispatch(payload: dict) -> None:
     fn(task_id, params)
+async def poll_once(queue: ReliableQueue) -> bool:
+    """F6 — 1 cycle: dequeue → _dispatch → ack/fail. Returns True if a job handled."""
+    result = await queue.dequeue(timeout=5)
+    if result is None:
+        return False
+    payload, raw = result
+    stats.busy = True
+    try:
+        await asyncio.to_thread(_dispatch, payload)
+    except Exception:
+        logger.exception("dispatch unhandled exception task_id=%s",
+                         payload.get("task_id"))
+        await queue.fail(raw, payload)
+        stats.jobs_failed += 1
+        stats.last_job_at = utc_now_iso()
+        stats.busy = False
+        return True
+    await queue.ack(raw)
+    stats.jobs_done += 1
+    stats.last_job_at = utc_now_iso()
+    stats.busy = False
+    return True
 async def worker_loop():
     redis = aioredis.from_url(REDIS_URL, decode_responses=False)
-    logger.info("image-render worker started (queue=%s)", QUEUE_KEY)
+    queue = ReliableQueue(redis, queue_key=QUEUE_KEY)
+    logger.info("image-render worker started worker_id=%s queue=%s",
+                queue.worker_id, QUEUE_KEY)
+    try:
+        recovered = await queue.recover()
+        if recovered:
+            logger.info("recovered %d orphaned items at startup", recovered)
+    except Exception:
+        logger.exception("startup recover failed")
     while True:
         try:
             paused = await redis.get(PAUSED_KEY)
             if paused == b"1":
                 await asyncio.sleep(10)
                 continue
-            item = await redis.blpop(QUEUE_KEY, timeout=5)
-            if item is None:
-                continue
-            _, raw = item
-            try:
-                payload = json.loads(raw)
-            except json.JSONDecodeError:
-                logger.error("invalid queue payload: %r", raw[:200])
-                continue
-            await asyncio.to_thread(_dispatch, payload)
+            await poll_once(queue)
         except asyncio.CancelledError:
             logger.info("worker_loop cancelled")
             raise
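
Reviewer note on `poll_once`: `stats.busy` is reset on both explicit paths, but if `queue.ack` or `queue.fail` itself raises (for example on a Redis connection error), the flag would stay `True` and the heartbeat would keep reporting `busy`. One possible variant, sketched under assumptions, resets it in a `finally` block. `Stats`, `FlakyQueue`, and `poll_once_safe` are hypothetical stand-ins written for this note, not code from the diff, and the `last_job_at` bookkeeping is omitted for brevity.

```python
import asyncio


class Stats:
    """Trimmed stand-in for _shared.heartbeat.WorkerStats (hypothetical)."""
    def __init__(self):
        self.busy = False
        self.jobs_done = 0
        self.jobs_failed = 0


class FlakyQueue:
    """Hypothetical queue whose ack() raises, simulating a Redis outage."""
    async def dequeue(self, timeout=5):
        return {"task_id": "t1"}, b"raw"

    async def ack(self, raw):
        raise ConnectionError("redis unavailable")

    async def fail(self, raw, payload):
        pass


async def poll_once_safe(queue, stats, dispatch) -> bool:
    result = await queue.dequeue(timeout=5)
    if result is None:
        return False
    payload, raw = result
    stats.busy = True
    try:
        try:
            await asyncio.to_thread(dispatch, payload)
        except Exception:
            await queue.fail(raw, payload)
            stats.jobs_failed += 1
        else:
            await queue.ack(raw)
            stats.jobs_done += 1
    finally:
        stats.busy = False  # runs even if ack()/fail() raises
    return True
```

Whether this matters in practice depends on how the outer `worker_loop` handles non-cancellation exceptions, which this hunk does not show.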

View File

@@ -7,8 +7,9 @@ REDIS_URL=redis://192.168.45.54:6379
 NAS_BASE_URL=http://192.168.45.54:18700
 INTERNAL_API_KEY=__copy_from_nas_dotenv__
-# Media directory inside the NAS SMB mount (/mnt/nas/webpage/data/insta/)
-INSTA_MEDIA_ROOT=/mnt/nas/webpage/data/insta
+# Media directory inside the NAS SMB mount.
+# ⚠️ nginx serves /media/insta from data/insta/insta_cards/, so the path must include insta_cards.
+INSTA_MEDIA_ROOT=/mnt/nas/webpage/data/insta/insta_cards
 # nginx serving prefix (used to build the result_path sent in the NAS webhook payload)
 INSTA_MEDIA_URL_PREFIX=/media/insta

View File

@@ -12,11 +12,14 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
libcairo2 libasound2 libatspi2.0-0 \ libcairo2 libasound2 libatspi2.0-0 \
&& rm -rf /var/lib/apt/lists/* && rm -rf /var/lib/apt/lists/*
COPY requirements.txt . COPY insta-render/requirements.txt /app/
RUN pip install --no-cache-dir --timeout 600 --retries 5 -r requirements.txt RUN pip install --no-cache-dir --timeout 600 --retries 5 -r requirements.txt
RUN playwright install chromium RUN playwright install chromium
COPY . . # F6: 공통 ReliableQueue 모듈 (services/_shared)
COPY _shared /app/_shared
COPY insta-render/. /app/
ENV PYTHONPATH=/app
EXPOSE 8000 EXPOSE 8000
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "1"] CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "1"]

View File

@@ -151,8 +151,11 @@ async def _render_slate_locked(slate: dict, slate_id: int, template: str) -> Lis
html_path = f.name html_path = f.name
try: try:
await page.goto(f"file://{html_path}", wait_until="networkidle") await page.goto(f"file://{html_path}", wait_until="networkidle")
await page.evaluate("document.fonts.ready") # wait until web fonts finish loading
out_path = os.path.join(out_dir, f"{spec['page_no']:02d}.png") out_path = os.path.join(out_dir, f"{spec['page_no']:02d}.png")
await page.screenshot(path=out_path, full_page=False, omit_background=False) await page.screenshot(path=out_path, full_page=False, omit_background=False)
if os.path.getsize(out_path) < 1000: # guard against empty or corrupt PNGs
raise RuntimeError(f"rendered PNG too small: {out_path}")
paths.append(out_path) paths.append(out_path)
finally: finally:
try: try:
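
Reviewer note: the byte-size threshold in this hunk catches empty files but not wrong dimensions. As a complementary check, width and height can be read directly from the PNG header without an imaging library. This is a sketch under assumptions: `png_dimensions` is a hypothetical helper, not part of card_renderer, and it relies only on the standard PNG layout (an 8-byte signature followed by the IHDR chunk).

```python
from __future__ import annotations

import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of a PNG byte string."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG or truncated header")
    # Bytes 16..24 hold width and height as big-endian unsigned 32-bit integers.
    return struct.unpack(">II", data[16:24])
```

A test could then compare `png_dimensions(Path(out_path).read_bytes())` against the expected card size, assuming the screenshot viewport matches the 1080x1350 template.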

View File

@@ -0,0 +1,5 @@
"""Make services/ root importable so `from _shared.reliable_queue import ...` works during tests."""
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

View File

@@ -3,12 +3,15 @@ from __future__ import annotations
 import asyncio
 import logging
+import os
 from contextlib import asynccontextmanager
+import redis.asyncio as aioredis
 from fastapi import FastAPI
 import card_renderer
 import worker
+from _shared.heartbeat import heartbeat_loop
 logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(levelname)s %(message)s")
 logger = logging.getLogger(__name__)
@@ -20,15 +23,19 @@ async def lifespan(app: FastAPI):
     await card_renderer.init_browser()
     # start the queue worker in the background
     worker_task = asyncio.create_task(worker.worker_loop())
+    hb_redis = aioredis.from_url(os.getenv("REDIS_URL", "redis://192.168.45.54:6379"), decode_responses=False)
+    hb_task = asyncio.create_task(heartbeat_loop(hb_redis, "insta-render", "render", worker.stats))
     logger.info("insta-render lifespan started")
     try:
         yield
     finally:
-        worker_task.cancel()
-        try:
-            await worker_task
-        except asyncio.CancelledError:
-            pass
+        for t in (worker_task, hb_task):
+            t.cancel()
+            try:
+                await t
+            except asyncio.CancelledError:
+                pass
+        await hb_redis.aclose()
         await card_renderer.shutdown_browser()
         logger.info("insta-render lifespan stopped")

View File

@@ -3,52 +3,85 @@
<head> <head>
<meta charset="UTF-8"> <meta charset="UTF-8">
<style> <style>
@import url('https://fonts.googleapis.com/css2?family=Noto+Sans+KR:wght@400;700;900&display=swap'); @import url('https://cdn.jsdelivr.net/gh/orioncactus/pretendard@v1.3.9/dist/web/static/pretendard.css');
* { margin: 0; padding: 0; box-sizing: border-box; } * { margin: 0; padding: 0; box-sizing: border-box; }
html, body { html, body { width: 1080px; height: 1350px; }
width: 1080px; height: 1350px; body {
font-family: 'Noto Sans KR', sans-serif; font-family: 'Pretendard', 'Noto Sans KR', sans-serif;
background: #F7F7FA; color: #14171A; background: #F7F7FA; color: #14171A;
-webkit-font-smoothing: antialiased;
} }
.card { .card {
width: 1080px; height: 1350px; position: relative; width: 1080px; height: 1350px; overflow: hidden;
padding: 80px 72px; padding: 96px 84px 72px;
display: flex; flex-direction: column; justify-content: space-between; display: flex; flex-direction: column;
background: linear-gradient(180deg, #FFFFFF 0%, #F7F7FA 100%); background: #FFFFFF;
border-top: 16px solid {{ accent_color }};
} }
.accent-bar { position: absolute; top: 0; left: 0; width: 100%; height: 14px; background: {{ accent_color | safe }}; }
.badge { .badge {
display: inline-block; padding: 8px 20px; border-radius: 999px; align-self: flex-start; padding: 10px 24px; border-radius: 999px;
background: {{ accent_color }}; color: #fff; background: {{ accent_color | safe }}; color: #fff;
font-size: 28px; font-weight: 700; letter-spacing: -0.02em; font-size: 30px; font-weight: 700; letter-spacing: -0.02em;
} }
.idx { font-size: 120px; font-weight: 800; line-height: 1; color: {{ accent_color | safe }}; letter-spacing: -0.04em; }
.content { flex: 1; display: flex; flex-direction: column; justify-content: center; gap: 36px; }
.headline { .headline {
font-size: {{ 96 if page_type == 'cover' else 72 }}px; font-weight: 800; line-height: 1.18; letter-spacing: -0.04em; color: #14171A;
font-weight: 900; line-height: 1.15; letter-spacing: -0.04em; display: -webkit-box; -webkit-box-orient: vertical; overflow: hidden;
margin-top: 32px;
} }
.body { .cover .headline { font-size: 104px; -webkit-line-clamp: 4; }
font-size: 40px; font-weight: 400; line-height: 1.55; .body-page .headline { font-size: 76px; -webkit-line-clamp: 3; }
margin-top: 40px; color: #2A2F35; .cta .headline { font-size: 88px; -webkit-line-clamp: 3; }
.sub {
font-size: 42px; font-weight: 400; line-height: 1.5; color: #3A4047;
display: -webkit-box; -webkit-box-orient: vertical; overflow: hidden; -webkit-line-clamp: 8;
white-space: pre-wrap; white-space: pre-wrap;
} }
.cover .sub { -webkit-line-clamp: 5; }
.footer { .footer {
display: flex; justify-content: space-between; align-items: center; display: flex; justify-content: space-between; align-items: center;
font-size: 28px; color: #6B7280; font-weight: 500; font-size: 28px; color: #8A9099; font-weight: 600; margin-top: 40px;
} }
.cta { font-weight: 700; color: {{ accent_color }}; } .cta-pill {
align-self: flex-start; margin-top: 8px; padding: 18px 40px; border-radius: 16px;
background: {{ accent_color | safe }}; color: #fff; font-size: 40px; font-weight: 700;
}
.progress { display: flex; gap: 10px; }
.progress i { width: 14px; height: 14px; border-radius: 50%; background: #D8DCE0; display: inline-block; }
.progress i.on { background: {{ accent_color | safe }}; }
</style> </style>
</head> </head>
<body> <body>
<div class="card"> <div class="card {{ 'cover' if page_type=='cover' else ('cta' if page_type=='cta' else 'body-page') }}">
<div> <div class="accent-bar"></div>
<span class="badge">{{ page_type|upper }}</span>
<h1 class="headline">{{ headline }}</h1> {% if page_type == 'cover' %}
<p class="body">{{ body }}</p> <span class="badge">{{ category_label|default('') or '오늘의 이슈' }}</span>
</div> <div class="content">
<h1 class="headline">{{ headline }}</h1>
<p class="sub">{{ body }}</p>
</div>
{% elif page_type == 'cta' %}
<div class="content">
<h1 class="headline">{{ headline }}</h1>
<p class="sub">{{ body }}</p>
{% if cta %}<div class="cta-pill">{{ cta }}</div>{% endif %}
</div>
{% else %}
<span class="idx">{{ '%02d'|format(page_no - 1) }}</span>
<div class="content">
<h1 class="headline">{{ headline }}</h1>
<p class="sub">{{ body }}</p>
</div>
{% endif %}
<div class="footer"> <div class="footer">
<span>{{ page_no }} / {{ total_pages }}</span> {% if page_type == 'cover' or page_type == 'cta' %}
{% if cta %}<span class="cta">{{ cta }}</span>{% endif %} <span>{{ brand_handle|default('') }}</span><span>{{ page_no }} / {{ total_pages }}</span>
{% else %}
<div class="progress">{% for n in range(2, total_pages) %}<i class="{{ 'on' if n <= page_no }}"></i>{% endfor %}</div>
<span>{{ page_no }} / {{ total_pages }}</span>
{% endif %}
</div> </div>
</div> </div>
</body> </body>

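Reviewer note: the body-page footer and index in the template above are easy to misread, so here is the same arithmetic in plain Python. Both function names are hypothetical and exist only for this note. The sketch assumes the cover is page 1 and the CTA is the last page, which is what `range(2, total_pages)` and `page_no - 1` suggest but the diff does not state outright.

```python
from __future__ import annotations


def progress_dots(page_no: int, total_pages: int) -> list[bool]:
    """Mirror of the template loop: one dot per n in range(2, total_pages), lit when n <= page_no."""
    return [n <= page_no for n in range(2, total_pages)]


def index_label(page_no: int) -> str:
    """Mirror of the Jinja expression '%02d'|format(page_no - 1)."""
    return "%02d" % (page_no - 1)
```

For a 6-page slate, page 3 is the second body page: two of the four dots are lit and the index reads "02".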
View File

@@ -1,10 +1,13 @@
 """worker.py: unit tests for Redis BLPOP + webhook."""
 import json
+import os
+from pathlib import Path
 import pytest
 import httpx
 from unittest.mock import AsyncMock, patch
 import worker
+from card_renderer import render_slate, init_browser, shutdown_browser
 @pytest.fixture
@@ -112,11 +115,142 @@ async def test_process_one_render_failure_reports_failed(monkeypatch, fake_slate
     worker.NAS_BASE_URL = "http://nas.test"
     async with httpx.AsyncClient() as client:
-        await worker._process_one(client, {
-            "task_id": "t-3",
-            "params": {"slate_id": 99},
-        })
+        # F6: _process_one calls webhook(failed) and then raises; poll_once retries or dead-letters via fail(raw).
+        with pytest.raises(RuntimeError, match="Chromium"):
+            await worker._process_one(client, {
+                "task_id": "t-3",
+                "params": {"slate_id": 99},
+            })
     last = calls[-1]
     assert last["status"] == "failed"
     assert "Chromium" in last["error"]
# ----- F6: ReliableQueue (ack on success, fail on exception) -----
@pytest.mark.asyncio
async def test_poll_once_acks_on_success(monkeypatch):
"""F6 — 성공 시 queue.ack(raw) 호출 + fail 안 부름."""
fake_payload = {
"task_id": "t-ok",
"params": {"slate_id": 7, "theme": "default"},
}
fake_raw = json.dumps(fake_payload).encode()
fake_queue = AsyncMock()
fake_queue.dequeue = AsyncMock(return_value=(fake_payload, fake_raw))
fake_queue.ack = AsyncMock()
fake_queue.fail = AsyncMock()
process_mock = AsyncMock()
monkeypatch.setattr(worker, "_process_one", process_mock)
async with httpx.AsyncClient() as client:
handled = await worker.poll_once(fake_queue, client)
assert handled is True
process_mock.assert_awaited_once()
fake_queue.ack.assert_awaited_once_with(fake_raw)
fake_queue.fail.assert_not_awaited()
@pytest.mark.asyncio
async def test_poll_once_calls_fail_on_exception(monkeypatch):
"""F6 — _process_one 예외 시 queue.fail(raw, payload) 호출."""
fake_payload = {
"task_id": "t-err",
"params": {"slate_id": 9, "theme": "default"},
}
fake_raw = json.dumps(fake_payload).encode()
fake_queue = AsyncMock()
fake_queue.dequeue = AsyncMock(return_value=(fake_payload, fake_raw))
fake_queue.ack = AsyncMock()
fake_queue.fail = AsyncMock()
async def boom(client, payload):
raise RuntimeError("simulated dispatch failure")
monkeypatch.setattr(worker, "_process_one", boom)
async with httpx.AsyncClient() as client:
handled = await worker.poll_once(fake_queue, client)
assert handled is True
fake_queue.fail.assert_awaited_once_with(fake_raw, fake_payload)
fake_queue.ack.assert_not_awaited()
@pytest.mark.asyncio
async def test_render_produces_nonempty_1080x1350(tmp_path, monkeypatch):
"""Phase 2 — fonts.ready 대기 + PNG 비어있음 검증: 10장 모두 > 1000 bytes."""
import card_renderer as _cr
templates_dir = str(Path(__file__).resolve().parent.parent / "templates")
monkeypatch.setattr(_cr, "CARD_TEMPLATE_DIR", templates_dir)
monkeypatch.setattr(_cr, "INSTA_MEDIA_ROOT", str(tmp_path))
await init_browser()
try:
slate = {
"cover_copy": {"headline": "헤드라인", "body": "서브", "accent_color": "#0F62FE"},
"body_copies": [{"headline": f"포인트{i}", "body": "본문"} for i in range(8)],
"cta_copy": {"headline": "요약", "body": "마무리", "cta": "팔로우"},
}
paths = await render_slate(slate, slate_id=99999)
assert len(paths) == 10
for p in paths:
assert os.path.getsize(p) > 1000 # 비어있지 않음
finally:
await shutdown_browser()
@pytest.mark.asyncio
async def test_poll_once_returns_false_on_timeout(monkeypatch):
"""F6 — dequeue가 None 반환(타임아웃)이면 False 리턴, ack/fail 안 부름."""
fake_queue = AsyncMock()
fake_queue.dequeue = AsyncMock(return_value=None)
fake_queue.ack = AsyncMock()
fake_queue.fail = AsyncMock()
process_mock = AsyncMock()
monkeypatch.setattr(worker, "_process_one", process_mock)
async with httpx.AsyncClient() as client:
handled = await worker.poll_once(fake_queue, client)
assert handled is False
process_mock.assert_not_awaited()
fake_queue.ack.assert_not_awaited()
fake_queue.fail.assert_not_awaited()
def test_make_queue_redis_socket_timeout_exceeds_block():
"""BLMOVE(블록 5s) dequeue가 read-timeout 경계 경합으로 깨지지 않도록
socket_timeout이 블록보다 충분히 커야 한다 (회귀 가드)."""
c = worker.make_queue_redis()
st = c.connection_pool.connection_kwargs.get("socket_timeout")
assert st is not None and st > 5 # blmove 블록(5s)보다 커야 안정
# ----- heartbeat stats 카운터 -----
class _OneJobQueueInsta:
def __init__(self): self.acked = False
async def dequeue(self, timeout=5):
if self.acked: return None
return ({"task_id": "t1", "params": {"slate_id": 1, "theme": "default"}}, b"raw")
async def ack(self, raw): self.acked = True
async def fail(self, raw, payload): pass
@pytest.mark.asyncio
async def test_poll_once_increments_jobs_done(monkeypatch):
worker.stats.jobs_done = 0
async def fake_process(client, payload): pass
monkeypatch.setattr(worker, "_process_one", fake_process)
async with httpx.AsyncClient() as client:
handled = await worker.poll_once(_OneJobQueueInsta(), client)
assert handled is True
assert worker.stats.jobs_done == 1
assert worker.stats.busy is False
assert worker.stats.last_job_at is not None

View File

@@ -1,11 +1,10 @@
"""Redis BLPOP worker — queue:insta-render → render_slate → NAS webhook. """Redis ReliableQueue worker — F6 신뢰성 패턴 (BLMOVE + ack/fail + recovery).
queue:paused가 set이면 대기 (task-watcher가 박재오 활동 감지 시 set). queue:paused가 set이면 대기 (task-watcher가 박재오 활동 감지 시 set).
""" """
from __future__ import annotations from __future__ import annotations
import asyncio import asyncio
import json
import logging import logging
import os import os
from typing import Any from typing import Any
@@ -14,9 +13,12 @@ import httpx
import redis.asyncio as aioredis import redis.asyncio as aioredis
from card_renderer import render_slate from card_renderer import render_slate
from _shared.reliable_queue import ReliableQueue
from _shared.heartbeat import WorkerStats, utc_now_iso
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
stats = WorkerStats()
REDIS_URL = os.getenv("REDIS_URL", "redis://192.168.45.54:6379") REDIS_URL = os.getenv("REDIS_URL", "redis://192.168.45.54:6379")
NAS_BASE_URL = os.getenv("NAS_BASE_URL", "http://192.168.45.54:18700") NAS_BASE_URL = os.getenv("NAS_BASE_URL", "http://192.168.45.54:18700")
@@ -57,7 +59,10 @@ async def _fetch_slate(client: httpx.AsyncClient, slate_id: int) -> dict:
async def _process_one(client: httpx.AsyncClient, payload: dict) -> None: async def _process_one(client: httpx.AsyncClient, payload: dict) -> None:
"""단일 작업 처리: fetch slate → render → webhook.""" """단일 작업 처리: fetch slate → render → webhook. 예외 발생 시 webhook(failed) 호출 후 raise.
F6: webhook 통신 외 예외는 poll_once가 fail(raw, payload)로 retry/dead-letter 처리.
"""
task_id = payload["task_id"] task_id = payload["task_id"]
params = payload.get("params", {}) params = payload.get("params", {})
slate_id = params.get("slate_id") slate_id = params.get("slate_id")
@@ -69,7 +74,6 @@ async def _process_one(client: httpx.AsyncClient, payload: dict) -> None:
slate = await _fetch_slate(client, slate_id) slate = await _fetch_slate(client, slate_id)
await _post_update(client, task_id, "processing", 50) await _post_update(client, task_id, "processing", 50)
paths = await render_slate(slate, slate_id, template=template) paths = await render_slate(slate, slate_id, template=template)
# 결과 URL은 첫 페이지의 nginx 경로
first_url = f"{INSTA_MEDIA_URL_PREFIX}/{slate_id}/01.png" first_url = f"{INSTA_MEDIA_URL_PREFIX}/{slate_id}/01.png"
await _post_update( await _post_update(
client, task_id, "succeeded", 100, result_path=first_url client, task_id, "succeeded", 100, result_path=first_url
@@ -78,29 +82,68 @@ async def _process_one(client: httpx.AsyncClient, payload: dict) -> None:
except Exception as e: except Exception as e:
logger.exception("render task=%s 실패", task_id) logger.exception("render task=%s 실패", task_id)
await _post_update(client, task_id, "failed", 0, error=str(e)) await _post_update(client, task_id, "failed", 0, error=str(e))
raise
async def poll_once(queue: ReliableQueue, client: httpx.AsyncClient) -> bool:
"""1 cycle: dequeue → _process_one → ack/fail. Returns True if a job handled."""
result = await queue.dequeue(timeout=5)
if result is None:
return False
payload, raw = result
stats.busy = True
try:
await _process_one(client, payload)
except Exception:
await queue.fail(raw, payload)
stats.jobs_failed += 1
stats.last_job_at = utc_now_iso()
stats.busy = False
return True
await queue.ack(raw)
stats.jobs_done += 1
stats.last_job_at = utc_now_iso()
stats.busy = False
return True
# 블로킹 dequeue는 BLMOVE(블록 5s)를 쓴다. redis-py 블로킹 read에서 socket_timeout이
# 블록(5s) 이하이거나 None이면 read-timeout이 블록 경계와 경합해 간헐적으로
# "Timeout reading"이 터져 잡을 못 꺼낸다(슬레이트 draft 정지). 실험상 socket_timeout이
# 블록보다 충분히 크면(10/30) 항상 안정. → 블록보다 넉넉히 큰 값을 명시한다.
QUEUE_SOCKET_TIMEOUT = 30 # > dequeue blmove 블록(5s)
def make_queue_redis():
"""블로킹 dequeue(BLMOVE)용 redis 클라이언트. socket_timeout > 블록(5s) 보장."""
return aioredis.from_url(
REDIS_URL, decode_responses=False,
socket_timeout=QUEUE_SOCKET_TIMEOUT, socket_keepalive=True,
)
async def worker_loop(): async def worker_loop():
"""무한 루프 — paused 체크 → BLPOP → process_one.""" """무한 루프 — paused 체크 → ReliableQueue.dequeue → process_one → ack/fail."""
redis = aioredis.from_url(REDIS_URL, decode_responses=False) redis = make_queue_redis()
queue = ReliableQueue(redis, queue_key=QUEUE_KEY)
async with httpx.AsyncClient() as client: async with httpx.AsyncClient() as client:
logger.info("insta-render worker started (queue=%s)", QUEUE_KEY) logger.info("insta-render worker started worker_id=%s queue=%s",
queue.worker_id, QUEUE_KEY)
# F6: startup recovery — 이전 crash 시 잔존 orphan 재큐
try:
recovered = await queue.recover()
if recovered:
logger.info("recovered %d orphaned items at startup", recovered)
except Exception:
logger.exception("startup recover failed")
while True: while True:
try: try:
paused = await redis.get(PAUSED_KEY) paused = await redis.get(PAUSED_KEY)
if paused == b"1": if paused == b"1":
await asyncio.sleep(10) await asyncio.sleep(10)
continue continue
item = await redis.blpop(QUEUE_KEY, timeout=1) await poll_once(queue, client)
if item is None:
continue
_, raw = item
try:
payload = json.loads(raw)
except json.JSONDecodeError:
logger.error("invalid queue payload: %r", raw[:200])
continue
await _process_one(client, payload)
except asyncio.CancelledError: except asyncio.CancelledError:
logger.info("worker_loop cancelled") logger.info("worker_loop cancelled")
raise raise

View File

@@ -8,10 +8,13 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates \ ca-certificates \
&& rm -rf /var/lib/apt/lists/* && rm -rf /var/lib/apt/lists/*
COPY requirements.txt . COPY music-render/requirements.txt /app/
RUN pip install --no-cache-dir --timeout 600 --retries 5 -r requirements.txt RUN pip install --no-cache-dir --timeout 600 --retries 5 -r requirements.txt
COPY . . # F6: 공통 ReliableQueue 모듈 (services/_shared)
COPY _shared /app/_shared
COPY music-render/. /app/
ENV PYTHONPATH=/app
EXPOSE 8000 EXPOSE 8000
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "1"] CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "1"]

View File

@@ -0,0 +1,5 @@
"""Make services/ root importable so `from _shared.reliable_queue import ...` works during tests."""
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

View File

@@ -7,12 +7,15 @@ from __future__ import annotations
import asyncio import asyncio
import logging import logging
import os
from contextlib import asynccontextmanager from contextlib import asynccontextmanager
import redis.asyncio as aioredis
from fastapi import FastAPI, HTTPException from fastapi import FastAPI, HTTPException
from pydantic import BaseModel from pydantic import BaseModel
import worker import worker
from _shared.heartbeat import heartbeat_loop
from providers.sync_ops import ( from providers.sync_ops import (
generate_lyrics, get_credits, generate_lyrics, get_credits,
get_timestamped_lyrics, generate_style_boost, get_timestamped_lyrics, generate_style_boost,
@@ -25,15 +28,19 @@ logger = logging.getLogger(__name__)
@asynccontextmanager @asynccontextmanager
async def lifespan(app: FastAPI): async def lifespan(app: FastAPI):
worker_task = asyncio.create_task(worker.worker_loop()) worker_task = asyncio.create_task(worker.worker_loop())
hb_redis = aioredis.from_url(os.getenv("REDIS_URL", "redis://192.168.45.54:6379"), decode_responses=False)
hb_task = asyncio.create_task(heartbeat_loop(hb_redis, "music-render", "render", worker.stats))
logger.info("music-render lifespan 시작") logger.info("music-render lifespan 시작")
try: try:
yield yield
finally: finally:
worker_task.cancel() for t in (worker_task, hb_task):
try: t.cancel()
await worker_task try:
except asyncio.CancelledError: await t
pass except asyncio.CancelledError:
pass
await hb_redis.aclose()
logger.info("music-render lifespan 종료") logger.info("music-render lifespan 종료")

View File

@@ -107,3 +107,85 @@ def test_dispatch_add_instrumental_calls_run_add_instrumental():
with patch("worker.run_add_instrumental") as m: with patch("worker.run_add_instrumental") as m:
worker._dispatch(payload) worker._dispatch(payload)
m.assert_called_once_with("t13", {"upload_url": "u"}) m.assert_called_once_with("t13", {"upload_url": "u"})
# ----- F6: ReliableQueue poll_once -----
from unittest.mock import AsyncMock
@pytest.mark.asyncio
async def test_poll_once_acks_on_success(monkeypatch):
"""F6 — _dispatch 정상 return → queue.ack(raw)."""
payload = {"task_id": "t1", "job_type": "suno_generation", "params": {}}
raw = json.dumps(payload).encode()
fake_queue = AsyncMock()
fake_queue.dequeue = AsyncMock(return_value=(payload, raw))
fake_queue.ack = AsyncMock()
fake_queue.fail = AsyncMock()
monkeypatch.setattr(worker, "_dispatch", MagicMock())
handled = await worker.poll_once(fake_queue)
assert handled is True
fake_queue.ack.assert_awaited_once_with(raw)
fake_queue.fail.assert_not_awaited()
@pytest.mark.asyncio
async def test_poll_once_calls_fail_on_dispatch_exception(monkeypatch):
"""F6 — _dispatch unhandled exception → queue.fail(raw, payload)."""
payload = {"task_id": "t2", "job_type": "suno_generation", "params": {}}
raw = json.dumps(payload).encode()
fake_queue = AsyncMock()
fake_queue.dequeue = AsyncMock(return_value=(payload, raw))
fake_queue.ack = AsyncMock()
fake_queue.fail = AsyncMock()
def _boom(p):
raise RuntimeError("dispatch crash")
monkeypatch.setattr(worker, "_dispatch", _boom)
handled = await worker.poll_once(fake_queue)
assert handled is True
fake_queue.fail.assert_awaited_once_with(raw, payload)
fake_queue.ack.assert_not_awaited()
@pytest.mark.asyncio
async def test_poll_once_returns_false_on_timeout(monkeypatch):
fake_queue = AsyncMock()
fake_queue.dequeue = AsyncMock(return_value=None)
fake_queue.ack = AsyncMock()
fake_queue.fail = AsyncMock()
dispatch_mock = MagicMock()
monkeypatch.setattr(worker, "_dispatch", dispatch_mock)
handled = await worker.poll_once(fake_queue)
assert handled is False
dispatch_mock.assert_not_called()
fake_queue.ack.assert_not_awaited()
fake_queue.fail.assert_not_awaited()
# ----- heartbeat stats 카운터 -----
class _OneJobQueue:
def __init__(self): self.acked = False
async def dequeue(self, timeout=5):
if self.acked: return None
return ({"job_type": "suno_generation", "task_id": "t1", "params": {}}, b"raw")
async def ack(self, raw): self.acked = True
async def fail(self, raw, payload): pass
@pytest.mark.asyncio
async def test_poll_once_increments_jobs_done(monkeypatch):
worker.stats.jobs_done = 0
monkeypatch.setattr(worker, "run_suno_generation", lambda task_id, params: None)
handled = await worker.poll_once(_OneJobQueue())
assert handled is True
assert worker.stats.jobs_done == 1
assert worker.stats.busy is False
assert worker.stats.last_job_at is not None

View File

@@ -1,4 +1,4 @@
"""Redis BLPOP worker — queue:music-render → job_type 디스패치 → NAS webhook. """Redis ReliableQueue worker — F6 신뢰성 패턴 (BLMOVE + ack/fail + recovery).
queue:paused 가 set이면 대기 (task-watcher가 박재오 활동 감지 시 set). queue:paused 가 set이면 대기 (task-watcher가 박재오 활동 감지 시 set).
""" """
@@ -20,6 +20,8 @@ from providers.suno import (
run_add_instrumental, run_video_generate, run_add_instrumental, run_video_generate,
) )
from providers.local import run_local_generation from providers.local import run_local_generation
from _shared.reliable_queue import ReliableQueue
from _shared.heartbeat import WorkerStats, utc_now_iso
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -27,6 +29,8 @@ REDIS_URL = os.getenv("REDIS_URL", "redis://192.168.45.54:6379")
QUEUE_KEY = "queue:music-render" QUEUE_KEY = "queue:music-render"
PAUSED_KEY = "queue:paused" PAUSED_KEY = "queue:paused"
stats = WorkerStats()
# Maps job_type → module-level function name (string). # Maps job_type → module-level function name (string).
# _dispatch resolves the name via globals() at call time so unittest.mock.patch # _dispatch resolves the name via globals() at call time so unittest.mock.patch
# on "worker.<name>" is correctly intercepted. # on "worker.<name>" is correctly intercepted.
@@ -67,26 +71,51 @@ def _dispatch(payload: dict) -> None:
fn(task_id, params) fn(task_id, params)
async def poll_once(queue: ReliableQueue) -> bool:
"""F6 — 1 cycle: dequeue → _dispatch → ack/fail. Returns True if a job handled."""
result = await queue.dequeue(timeout=5)
if result is None:
return False
payload, raw = result
stats.busy = True
try:
# sync provider 함수 — thread로 실행해서 이벤트 루프 블로킹 방지
await asyncio.to_thread(_dispatch, payload)
except Exception:
logger.exception("dispatch unhandled exception task_id=%s",
payload.get("task_id"))
await queue.fail(raw, payload)
stats.jobs_failed += 1
stats.last_job_at = utc_now_iso()
stats.busy = False
return True
await queue.ack(raw)
stats.jobs_done += 1
stats.last_job_at = utc_now_iso()
stats.busy = False
return True
async def worker_loop(): async def worker_loop():
redis = aioredis.from_url(REDIS_URL, decode_responses=False) redis = aioredis.from_url(REDIS_URL, decode_responses=False)
logger.info("music-render worker started (queue=%s)", QUEUE_KEY) queue = ReliableQueue(redis, queue_key=QUEUE_KEY)
logger.info("music-render worker started worker_id=%s queue=%s",
queue.worker_id, QUEUE_KEY)
# F6: startup recovery
try:
recovered = await queue.recover()
if recovered:
logger.info("recovered %d orphaned items at startup", recovered)
except Exception:
logger.exception("startup recover failed")
while True: while True:
try: try:
paused = await redis.get(PAUSED_KEY) paused = await redis.get(PAUSED_KEY)
if paused == b"1": if paused == b"1":
await asyncio.sleep(10) await asyncio.sleep(10)
continue continue
item = await redis.blpop(QUEUE_KEY, timeout=1) await poll_once(queue)
if item is None:
continue
_, raw = item
try:
payload = json.loads(raw)
except json.JSONDecodeError:
logger.error("invalid queue payload: %r", raw[:200])
continue
# sync provider 함수 — thread로 실행해서 이벤트 루프 블로킹 방지
await asyncio.to_thread(_dispatch, payload)
except asyncio.CancelledError: except asyncio.CancelledError:
logger.info("worker_loop cancelled") logger.info("worker_loop cancelled")
raise raise

View File

@@ -7,10 +7,13 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates tzdata \ ca-certificates tzdata \
&& rm -rf /var/lib/apt/lists/* && rm -rf /var/lib/apt/lists/*
COPY requirements.txt . COPY task-watcher/requirements.txt /app/
RUN pip install --no-cache-dir --timeout 600 --retries 5 -r requirements.txt RUN pip install --no-cache-dir --timeout 600 --retries 5 -r requirements.txt
COPY . . # 공통 heartbeat 모듈 (services/_shared) — watcher.py가 from _shared.heartbeat import
COPY _shared /app/_shared
COPY task-watcher/. /app/
ENV PYTHONPATH=/app
EXPOSE 8000 EXPOSE 8000
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "1"] CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "1"]

View File

@@ -0,0 +1,5 @@
"""Make services/ root importable so `from _shared.heartbeat import ...` works during tests."""
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

View File

@@ -0,0 +1,16 @@
"""task-watcher heartbeat payload — state=mode + mode 필드 검증."""
import json
from _shared.heartbeat import build_payload, WorkerStats
def test_watcher_heartbeat_payload_carries_mode():
payload = json.loads(
build_payload(
"task-watcher", "watcher", "trading",
WorkerStats(), extra={"mode": "trading"},
)
)
assert payload["kind"] == "watcher"
assert payload["state"] == "trading"
assert payload["mode"] == "trading"

View File

@@ -15,6 +15,7 @@ from zoneinfo import ZoneInfo
import redis.asyncio as aioredis import redis.asyncio as aioredis
from mode import current_mode, fetch_holidays, KST from mode import current_mode, fetch_holidays, KST
from _shared.heartbeat import build_payload, WorkerStats
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -23,6 +24,10 @@ PAUSED_KEY = "queue:paused"
LOOP_INTERVAL = 30 # 초 LOOP_INTERVAL = 30 # 초
HOLIDAYS_REFRESH = 3600 # 1시간 HOLIDAYS_REFRESH = 3600 # 1시간
PAUSED_TTL = 600 # 10분 (watcher 죽어도 자동 해제) PAUSED_TTL = 600 # 10분 (watcher 죽어도 자동 해제)
HEARTBEAT_KEY = "worker:task-watcher:heartbeat"
HEARTBEAT_TTL = 45 # LOOP_INTERVAL 30s < TTL 45s → 만료 전 갱신
_HB_STATS = WorkerStats()
async def watcher_loop(): async def watcher_loop():
@@ -46,6 +51,13 @@ async def watcher_loop():
else: else:
await redis.delete(PAUSED_KEY) await redis.delete(PAUSED_KEY)
# heartbeat (LOOP_INTERVAL=30s < TTL 45s → 만료 전 갱신)
await redis.set(
HEARTBEAT_KEY,
build_payload("task-watcher", "watcher", mode, _HB_STATS, extra={"mode": mode}),
ex=HEARTBEAT_TTL,
)
if mode != last_mode: if mode != last_mode:
logger.info("mode 전환: %s → %s (paused=%s)", last_mode, mode, mode == "trading") logger.info("mode 전환: %s → %s (paused=%s)", last_mode, mode, mode == "trading")
last_mode = mode last_mode = mode

View File

@@ -0,0 +1,18 @@
# Plan-realtime-trade-alerts — trade-monitor
# NAS Redis (heartbeat)
REDIS_URL=redis://192.168.45.54:6379
# NAS stock backend (monitor-set / report)
NAS_BASE_URL=http://192.168.45.54:18500
WEBAI_API_KEY=
# Self-issued KIS token (dedicated app_key, separate from ai_trade)
TM_KIS_APP_KEY=
TM_KIS_APP_SECRET=
TM_KIS_ACCOUNT=
TM_KIS_IS_VIRTUAL=0
# Loop period (seconds) / sell_climax volume-multiple threshold
TM_LOOP_INTERVAL=60
TM_CLIMAX_VOL_MULT=3.0

View File

@@ -0,0 +1,166 @@
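# trade-monitor worker: implementation design
> **2026-07-03 · owned by web-ai.** The Windows-side worker of the real-time trade alert pipeline.
> **Authoritative contract (original spec)**: web-backend repo `docs/superpowers/specs/2026-07-02-realtime-trade-alerts-design.md`, §5 (contract) and §6 (conditions).
> This document covers only the **implementation design** (module decomposition, condition interpretation, deployment) needed to implement that contract as a Windows Docker worker.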
---
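## 1. Role and boundaries
`services/trade-monitor/` is a **FastAPI + asyncio loop** worker that follows the same conventions as its sibling workers (`image-render`, `task-watcher`). It runs on WSL2 Docker (`services/docker-compose.yml`).
**Responsibilities**
1. Poll the NAS `monitor-set` on a 60-second loop, then apply the session gate.
2. Skip non-KRX (alphabetic) tickers.
3. Fetch the KIS real-time price and daily OHLCV, then compute TA indicators.
4. Evaluate the buy/sell conditions and build the firing set F.
5. Send all of F via `POST report` (stateless; dedup is persisted on the NAS).
6. Emit a Redis heartbeat (`worker:trade-monitor:heartbeat`, EX 45).
7. Isolate KIS errors per cycle and per ticker (retry on the next minute).
**Out of scope (not done here)**: keeping dedup state, sending Telegram messages (handled by the NAS), reimplementing the KST session/holiday calendar (the NAS decides `session`), and order execution.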
---
## 2. Module decomposition
| File | Responsibility | Interface (pure / side effects) |
|------|------|------|
| `main.py` | FastAPI app + lifespan. Spawns `monitor_loop` + `heartbeat_loop`; serves `/health` | Side effects (task spawning) |
| `monitor.py` | Orchestration loop: monitor-set → gate → per-ticker iteration → firing → report. Updates the shared `MonitorState` | Side effects |
| `nas_client.py` | `get_monitor_set()` / `post_report(as_of, firing)`, with `X-WebAI-Key` + retry | Side effects (HTTP) |
| `kis_client.py` | KIS REST: `_issue_token()` (self-issued OAuth token, 24h cache) + `get_quote()` + `get_daily_ohlcv()` + 0.5s throttle | Side effects (HTTP) |
| `indicators.py` | `sma`, `rsi`, `avg_volume`, `highest_high` | **Pure** |
| `conditions.py` | `evaluate_buy(ctx, buy_params)` / `evaluate_sell(ctx, exit_params)` → `list[firing]` | **Pure** |
| `config.py` | `Settings`: loads env | Pure |
Condition logic is isolated in the pure modules (`indicators`, `conditions`) so that it can be verified with table-driven unit tests. HTTP, time, and Redis live only in the boundary modules.
---
## 3. Data flow (monitor_loop, 60 s)
```
Every cycle:
  ms = nas.get_monitor_set()                              # §5.1
  state.session = ms.session
  if ms.session == "closed":
    state.hb = "market_closed"; sleep; continue           # zero KIS calls
  targets = filter_krx(ms.buy_targets, ms.sell_targets)   # skip alphabetic tickers
  firing = []
  for t in targets:                                       # per-ticker try/except isolation
    quote = kis.get_quote(t.ticker)                       # current price + cumulative intraday volume
    daily = kis.get_daily_ohlcv(t.ticker, 250)            # for MA200 and the 52-week high
    ctx = build_ctx(t, quote, daily)
    if t is buy_target: firing += evaluate_buy(ctx, ms.buy_params)
    if t is sell_target: firing += evaluate_sell(ctx, ms.exit_params)
  if firing: state.last_alert_at = now
  nas.post_report(as_of=now_kst_iso, firing=firing)       # §5.2: an empty array is sent too (to clear edges)
  state.hb = "market_open"
  stats.jobs_done += 1
```
`heartbeat_loop` (a separate 15-second task) reads `state` and emits the §5.4 payload.
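
The `filter_krx` step above could be sketched as follows. This is a minimal sketch, not the worker's actual code: the rule "a KRX code is all ASCII digits", the dict-shaped targets, and the `roles` field are assumptions made for illustration. Merging by ticker is a design choice of the sketch so that a ticker present in both lists needs only one quote fetch.

```python
def is_krx_ticker(ticker: str) -> bool:
    """Assumed rule: KRX codes are all ASCII digits; anything with letters is skipped."""
    return ticker.isascii() and ticker.isdigit()


def filter_krx(buy_targets: list[dict], sell_targets: list[dict]) -> list[dict]:
    """Drop alphabetic tickers and merge both lists so each ticker appears once."""
    merged: dict[str, dict] = {}
    for role, targets in (("buy", buy_targets), ("sell", sell_targets)):
        for target in targets:
            ticker = target["ticker"]
            if not is_krx_ticker(ticker):
                continue  # non-KRX (alphabetic) ticker: skipped, so no KIS call is made
            entry = merged.setdefault(ticker, {"ticker": ticker, "roles": set()})
            entry["roles"].add(role)
    return list(merged.values())


targets = filter_krx(
    [{"ticker": "005930"}, {"ticker": "AAPL"}],
    [{"ticker": "005930"}, {"ticker": "000660"}],
)
print([(t["ticker"], sorted(t["roles"])) for t in targets])
```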
---
## 4. Interpretation of the condition logic (§6)
Indicators are computed from the **daily candle series** (ascending, newest last) plus the **real-time current price** (`price`). When data is insufficient (for example, fewer than 200 candles for MA200), the condition **does not fire** (graceful skip). Each firing is `{ticker, kind, condition, price, detail}`.
### Buy (`buy_targets`, `buy_params={rsi_oversold, breakout_vol_mult, pullback_pct}`)
| condition | Firing rule (interpretation) | detail |
|-----------|----------------|--------|
| `buy_ma20_pullback` | `ma20>ma50>ma200` (bullish alignment) **AND** the lowest low of the last 3 candles comes down to `ma20*(1+pullback_pct)` or below **AND** `price>ma20` (rebound back above) | `ma20, ma50, ma200, recent_low` |
| `buy_breakout` | `price > highest high of the prior 20 candles` (excluding today) **AND** `today_volume > breakout_vol_mult × avg_volume(20)` | `prior_high_20, vol_mult, avg_vol_20` |
| `buy_rsi_bounce` | In the RSI(14) series, `min(rsi[-3:]) < rsi_oversold` **AND** `rsi[-1] > rsi_oversold` **AND** `rsi[-1] > rsi[-2]` (rebound). Recomputed every cycle, stateless | `rsi, rsi_prev, rsi_oversold` |
Summary: each condition fires independently (a confidence-weighted sum is not a responsibility of the NAS/Telegram stage; the worker only fires conditions).
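
To make the "insufficient data means no firing" rule and the `buy_rsi_bounce` row concrete, here is a minimal sketch. The helper bodies are assumptions: the real `indicators.py` is not shown here, and Wilder smoothing is only one common way to compute an RSI series. Only the three-part bounce rule is taken from the table above.

```python
from typing import Optional


def sma(values: list[float], period: int) -> Optional[float]:
    """Simple moving average of the last `period` values; None when data is insufficient."""
    if len(values) < period:
        return None
    return sum(values[-period:]) / period


def _rsi(avg_gain: float, avg_loss: float) -> float:
    if avg_loss == 0:
        return 100.0  # no losses in the window
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)


def rsi_series(closes: list[float], period: int = 14) -> list[float]:
    """RSI values (Wilder smoothing, an assumption); empty list when data is insufficient."""
    if len(closes) <= period:
        return []
    deltas = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    out = [_rsi(avg_gain, avg_loss)]
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
        out.append(_rsi(avg_gain, avg_loss))
    return out


def rsi_bounce(rs: list[float], oversold: float) -> bool:
    """Bounce rule from the table: recently oversold, now above the line, and rising."""
    return (len(rs) >= 3 and min(rs[-3:]) < oversold
            and rs[-1] > oversold and rs[-1] > rs[-2])
```

Because the rule looks only at the last three RSI values, it can be recomputed from scratch every cycle without the worker remembering whether the ticker was oversold earlier.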
### Sell (`sell_targets`, `exit_params={stop_pct, take_pct, trailing_pct}`; each target carries `avg_price, qty, holding_high`)
| condition | Firing rule | detail |
|-----------|----------|--------|
| `sell_stop_loss` | `(price-avg)/avg ≤ -stop_pct` | `avg_price, pnl_pct, stop_pct` |
| `sell_take_profit` | `(price-avg)/avg ≥ take_pct` | `avg_price, pnl_pct, take_pct` |
| `sell_trailing_stop` | `price ≤ holding_high × (1-trailing_pct)` (default 0.10) | `holding_high, trailing_pct, drawdown_pct` |
| `sell_ma_break` | `price < ma50` (if additionally `price<ma200`, then detail.severity="high") | `ma50, ma200, severity` |
| `sell_climax` | **Aligned with holdings_intel**: `today_volume ≥ climax_vol_x × avg_volume(20)` **AND** `price < day_high × climax_close_pct` (upper wick) | `vol_mult, day_high, climax_close_pct` |
`climax_vol_x` (default 3.0) and `climax_close_pct` (default 0.97) are read from the monitor-set `exit_params` (centralized in the BE, main ed17193). If `climax_vol_x` is absent, it falls back to the env var `TM_CLIMAX_VOL_MULT`; `climax_close_pct` falls back to 0.97. `day_high` is the KIS quote field `stck_hgpr` (cumulative high of the current session).
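
As a worked example of the `sell_climax` row, the sketch below evaluates both halves of the rule with the default parameters. The function name `climax_fires` and the sample numbers are made up for illustration; the two inequalities are the ones in the table.

```python
def climax_fires(price: float, day_high: float, today_volume: float,
                 avg_vol20: float, climax_vol_x: float = 3.0,
                 climax_close_pct: float = 0.97) -> bool:
    """sell_climax: volume spike AND price well below the session high (upper wick)."""
    volume_spike = today_volume >= climax_vol_x * avg_vol20
    upper_wick = price < day_high * climax_close_pct
    return volume_spike and upper_wick


# Session high 9,500, so the wick threshold is about 9,215 (9,500 x 0.97).
# The 20-day average volume is 1,000,000, so the spike threshold is 3,000,000.
print(climax_fires(9100, 9500, 3_500_000, 1_000_000))  # spike and wick
print(climax_fires(9400, 9500, 3_500_000, 1_000_000))  # spike, but price still near the high
print(climax_fires(9100, 9500, 2_900_000, 1_000_000))  # wick, but no volume spike
```

Only the first call satisfies both conditions. The earlier `price < day_open` test would ignore how far the price has fallen from the session high, which is what the upper-wick form measures.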
---
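## 5. KIS client (self-issued token)
- `_issue_token()`: `POST {base}/oauth2/tokenP {grant_type, appkey, appsecret}` → `access_token` (expires in 24h). Cached in memory and reissued 10 minutes before expiry. Uses **`TM_KIS_APP_KEY/SECRET`, separate from ai_trade** (sharing one app_key makes the tokens invalidate each other and triggers EGW00201).
- `get_quote(ticker)`: `inquire-price` (FHKST01010100) → `stck_prpr` (current price), `acml_vol` (cumulative intraday volume), `stck_oprc` (day open), `stck_hgpr` (day high, used by `sell_climax`).
- `get_daily_ohlcv(ticker, days=250)`: `inquire-daily-itemchartprice` (FHKST03010100); replicates the logic of ai_trade's `kis_client.py`, ascending order.
- Throttle of 0.5s (2 calls per second) + serialization via `_throttle_lock` + exponential backoff on 429/timeout (reuses the ai_trade pattern).
> ⚠️ **Operational pitfall**: when KIS is called concurrently with ai_trade, the account-wide KIS rate limit may be shared even with a dedicated app_key. A separate app_key avoids token invalidation, but concurrent load still needs monitoring in operation (linked to the Phase 7 backlog).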
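
The token cache and throttle described in this section might look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of `kis_client.py`: `TokenCache`, `Throttle`, the injected `issue` callable, and the injectable clock are hypothetical names chosen so the logic can be exercised without any HTTP call.

```python
import asyncio
import time
from typing import Callable, Optional, Tuple

REFRESH_MARGIN_S = 600  # reissue 10 minutes before expiry, as stated above


class TokenCache:
    """In-memory token cache; `issue` returns (access_token, lifetime_seconds)."""

    def __init__(self, issue: Callable[[], Tuple[str, float]],
                 clock: Callable[[], float] = time.time):
        self._issue = issue
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - REFRESH_MARGIN_S:
            token, lifetime = self._issue()
            self._token = token
            self._expires_at = now + lifetime
        return self._token


class Throttle:
    """Serializes callers and keeps at least `min_interval` seconds between calls."""

    def __init__(self, min_interval: float = 0.5):
        self._min_interval = min_interval
        self._lock = asyncio.Lock()
        self._last: Optional[float] = None

    async def wait(self) -> None:
        async with self._lock:  # one caller at a time, like `_throttle_lock`
            if self._last is not None:
                delay = self._last + self._min_interval - time.monotonic()
                if delay > 0:
                    await asyncio.sleep(delay)
            self._last = time.monotonic()
```

A caller would `await throttle.wait()` before each KIS request and read the bearer token from `cache.get()`. Holding the lock while sleeping is what makes concurrent callers queue up instead of bursting.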
---
## 6. heartbeat (§5.4)
`_shared.heartbeat.heartbeat_loop(redis, "trade-monitor", "trader", stats, interval=15, ttl=45, state_fn=...)`.
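- `state_fn` reads `MonitorState` and returns `state ∈ {market_open, market_closed, idle}` + `extra={"last_alert_at": ...}`.
- **Why it is decoupled**: the loop period (60s) exceeds the TTL (45s), so emitting inline would leave an expiry gap. A separate 15-second task removes the gap (same structure as the sibling workers). The required §5.4 fields (name/kind/state/ts/last_alert_at) are satisfied; `jobs_done/jobs_failed` are kept as a superset, as in the sibling workers.
- Initial state is `idle` (before the first monitor-set fetch).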
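
The timing argument and the `state_fn` hook can be sketched as below. The `(state, extra)` return shape of `state_fn` and the helper `expiry_gap` are assumptions for illustration; the actual contract of `_shared.heartbeat.heartbeat_loop` is not shown in this document. The `MonitorState` fields mirror the names used in the §3 pseudocode.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class MonitorState:
    hb: str = "idle"                     # idle until the first monitor-set fetch
    last_alert_at: Optional[str] = None


def make_state_fn(state: MonitorState) -> Callable[[], Tuple[str, dict]]:
    """Build a callable that the heartbeat task can poll on every emission."""
    def state_fn() -> Tuple[str, dict]:
        return state.hb, {"last_alert_at": state.last_alert_at}
    return state_fn


def expiry_gap(interval_s: int, ttl_s: int) -> int:
    """Seconds per period during which the heartbeat key would be missing."""
    return max(0, interval_s - ttl_s)


print(expiry_gap(60, 45))  # inline emission from the 60 s loop
print(expiry_gap(15, 45))  # dedicated 15 s heartbeat task
```

With inline emission the key would be absent for 15 of every 60 seconds, so an observer would see the worker flapping. A 15-second task keeps the key alive continuously and also absorbs a single missed emission.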
---
## 7. Configuration (env): the `TM_` prefix keeps it separate from ai_trade
| env | Default | Purpose |
|-----|--------|------|
| `NAS_BASE_URL` | `http://192.168.45.54:18500` | stock backend |
| `WEBAI_API_KEY` | (required) | `X-WebAI-Key` |
| `REDIS_URL` | `redis://192.168.45.54:6379` | heartbeat |
| `TM_KIS_APP_KEY` / `TM_KIS_APP_SECRET` | (required) | self-issued KIS token |
| `TM_KIS_ACCOUNT` | (required) | KIS account |
| `TM_KIS_IS_VIRTUAL` | `0` | live / paper trading |
| `TM_LOOP_INTERVAL` | `60` | loop period (seconds) |
| `TM_CLIMAX_VOL_MULT` | `3.0` | sell_climax volume threshold (fallback when `exit_params` has no `climax_vol_x`) |
---
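## 8. Error handling
- **monitor-set failure**: skip the cycle (no report), heartbeat=`idle`, retry on the next minute.
- **KIS failure for one ticker**: skip only that ticker (log a warning) and continue with the rest.
- **report failure**: log an error; the next cycle resends a fresh firing set (the worker is stateless, so the loss is acceptable).
- A top-level `try/except` wraps the loop, so no exception can kill it (task-watcher pattern).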
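
The per-ticker isolation rule above can be sketched as a small loop. `collect_firing` and the `evaluate_one` callback are hypothetical names; in the worker the callback would wrap the quote fetch, the daily OHLCV fetch, and the condition evaluation for one ticker.

```python
import logging
from typing import Callable

logger = logging.getLogger("trade-monitor-sketch")


def collect_firing(tickers: list[str],
                   evaluate_one: Callable[[str], list[dict]]) -> tuple[list[dict], list[str]]:
    """Evaluate each ticker independently; a failure skips only that ticker."""
    firing: list[dict] = []
    failed: list[str] = []
    for ticker in tickers:
        try:
            firing.extend(evaluate_one(ticker))
        except Exception as exc:  # isolate: one bad ticker must not stop the cycle
            logger.warning("skip %s this cycle: %s", ticker, exc)
            failed.append(ticker)
    return firing, failed


def flaky(ticker: str) -> list[dict]:
    if ticker == "000660":
        raise TimeoutError("KIS timeout")
    return [{"ticker": ticker, "condition": "sell_stop_loss"}]


firing, failed = collect_firing(["005930", "000660", "035720"], flaky)
print([f["ticker"] for f in firing], failed)
```

Since every cycle rebuilds the firing set from scratch, a skipped ticker is simply evaluated again on the next minute.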
---
## 9. Test strategy (pytest, system Python)
| File | Verifies |
|------|------|
| `test_indicators.py` | sma/rsi/avg_volume/highest_high values (known series); None when data is insufficient |
| `test_conditions.py` | the 8 conditions, table-driven (fire/no-fire boundaries); detail fields |
| `test_nas_client.py` | respx: monitor-set parsing, report payload, X-WebAI-Key header, retry |
| `test_kis_client.py` | respx: token issue/cache, quote/daily parsing, throttle |
| `test_monitor.py` | one loop iteration (mocked): closed skip, non-KRX skip, firing assembly, last_alert_at update, per-ticker failure isolation |
---
## 10. Deployment
- `services/trade-monitor/Dockerfile`: replicates the task-watcher convention. `COPY _shared /app/_shared` is **required** (from build context `.`), followed by `COPY trade-monitor/. /app/`, `PYTHONPATH=/app`, and uvicorn on `:8000`.
- `services/docker-compose.yml`: adds the `trade-monitor` service on port **18715** (next after image-render's 18714), with `TZ=Asia/Seoul`, the KIS/WEBAI/REDIS env vars, and a `/health` healthcheck.
- `services/.env` (not committed): real values for `TM_KIS_*` and `WEBAI_API_KEY`. `.env.example` lists the keys only.
---
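## 11. Open flags / follow-ups
1. **sell_climax**: ✅ aligned with holdings_intel on 2026-07-03 (`price < day_high × climax_close_pct` + parameterized via `exit_params`), per the BE reply.
2. **Verifying KIS fields against real data**: the quote's `acml_vol`/`stck_oprc` and the daily TR response fields are to be checked against the first raw capture in operation.
3. **Interpretation of `buy_ma20_pullback` and `buy_rsi_bounce`**: the phrase "current candle series" is interpreted as the daily series. This may be readjusted during the IC validation over the first 4 weeks of operation.
4. **Coexisting with the KIS rate limit**: concurrent load with ai_trade. The dedicated app_key avoids token invalidation; total calls per second are to be monitored in operation.
5. **After-hours quotes in the after session**: check in the first live run whether `inquire-price` reflects the after-hours single price.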

View File

@@ -0,0 +1,19 @@
FROM python:3.12-slim-bookworm
ENV PYTHONUNBUFFERED=1
WORKDIR /app
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates tzdata \
&& rm -rf /var/lib/apt/lists/*
COPY trade-monitor/requirements.txt /app/
RUN pip install --no-cache-dir --timeout 600 --retries 5 -r requirements.txt
# Shared heartbeat module (services/_shared): main.py does `from _shared.heartbeat import`
COPY _shared /app/_shared
COPY trade-monitor/. /app/
ENV PYTHONPATH=/app
EXPOSE 8000
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "1"]

File diff suppressed because it is too large

View File

@@ -0,0 +1,99 @@
"""§6 조건 로직 (순수). ctx + params → firing 리스트."""
from __future__ import annotations

from indicators import sma, rsi_series, highest_high


def _fire(ctx: dict, kind: str, condition: str, price: float, detail: dict) -> dict:
    return {
        "ticker": ctx["ticker"], "kind": kind,
        "condition": condition, "price": price, "detail": detail,
    }


def evaluate_buy(ctx: dict, params: dict) -> list[dict]:
    price = ctx["price"]
    closes, highs, lows, vols = ctx["closes"], ctx["highs"], ctx["lows"], ctx["volumes"]
    rsi_os = params.get("rsi_oversold", 30)
    vol_mult = params.get("breakout_vol_mult", 1.5)
    pullback = params.get("pullback_pct", 0.02)
    firing: list[dict] = []

    # buy_ma20_pullback: bullish alignment (ma20 > ma50 > ma200) + recent low near ma20
    # + price rebounding back above ma20
    ma20, ma50, ma200 = sma(closes, 20), sma(closes, 50), sma(closes, 200)
    if ma20 and ma50 and ma200 and ma20 > ma50 > ma200 and len(lows) >= 3:
        recent_low = min(lows[-3:])
        if recent_low <= ma20 * (1 + pullback) and price > ma20:
            firing.append(_fire(ctx, "buy", "buy_ma20_pullback", price, {
                "ma20": round(ma20, 1), "ma50": round(ma50, 1),
                "ma200": round(ma200, 1), "recent_low": recent_low,
            }))

    # buy_breakout: break above the prior 20-bar high + volume multiple
    prior_high20 = highest_high(highs, 20)
    avg_vol20 = sma(vols, 20)
    if prior_high20 and avg_vol20 and price > prior_high20 \
            and ctx["today_volume"] > vol_mult * avg_vol20:
        firing.append(_fire(ctx, "buy", "buy_breakout", price, {
            "prior_high_20": prior_high20,
            "vol_mult": round(ctx["today_volume"] / avg_vol20, 2),
            "avg_vol_20": round(avg_vol20, 0),
        }))

    # buy_rsi_bounce: rebound after RSI oversold (stateless recomputation)
    rs = rsi_series(closes, 14)
    if len(rs) >= 3 and min(rs[-3:]) < rsi_os and rs[-1] > rsi_os and rs[-1] > rs[-2]:
        firing.append(_fire(ctx, "buy", "buy_rsi_bounce", price, {
            "rsi": round(rs[-1], 1), "rsi_prev": round(rs[-2], 1),
            "rsi_oversold": rsi_os,
        }))
    return firing


def evaluate_sell(ctx: dict, params: dict) -> list[dict]:
    price = ctx["price"]
    avg = ctx.get("avg_price")
    hh = ctx.get("holding_high")
    closes, vols = ctx["closes"], ctx["volumes"]
    stop = params.get("stop_pct", 0.08)
    take = params.get("take_pct", 0.25)
    trail = params.get("trailing_pct", 0.10)
    firing: list[dict] = []

    if avg:
        pnl = (price - avg) / avg
        if pnl <= -stop:
            firing.append(_fire(ctx, "sell", "sell_stop_loss", price, {
                "avg_price": avg, "pnl_pct": round(pnl, 4), "stop_pct": stop}))
        if pnl >= take:
            firing.append(_fire(ctx, "sell", "sell_take_profit", price, {
                "avg_price": avg, "pnl_pct": round(pnl, 4), "take_pct": take}))

    if hh and price <= hh * (1 - trail):
        firing.append(_fire(ctx, "sell", "sell_trailing_stop", price, {
            "holding_high": hh, "trailing_pct": trail,
            "drawdown_pct": round((price - hh) / hh, 4)}))

    ma50, ma200 = sma(closes, 50), sma(closes, 200)
    if ma50 and price < ma50:
        severity = "high" if (ma200 and price < ma200) else "normal"
        firing.append(_fire(ctx, "sell", "sell_ma_break", price, {
            "ma50": round(ma50, 1),
            "ma200": round(ma200, 1) if ma200 else None,
            "severity": severity}))

    # sell_climax: aligned with holdings_intel (stock/app/holdings_intel.py:109-118):
    # volume ≥ 20-day average × climax_vol_x AND close < day high × climax_close_pct (upper wick)
    # Because this runs in real time, day_high is the session's cumulative high for the day
    # (not the high of the latest 1-minute bar).
    climax_vol_x = params.get("climax_vol_x", ctx.get("climax_vol_mult", 3.0))
    climax_close_pct = params.get("climax_close_pct", 0.97)
    avg_vol20 = sma(vols, 20)
    day_high = ctx.get("day_high")
    if avg_vol20 and day_high and ctx["today_volume"] >= climax_vol_x * avg_vol20 \
            and price < day_high * climax_close_pct:
        firing.append(_fire(ctx, "sell", "sell_climax", price, {
            "vol_mult": round(ctx["today_volume"] / avg_vol20, 2),
            "day_high": day_high, "climax_close_pct": climax_close_pct}))
    return firing
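
Review note: the sell_climax predicate can be checked in isolation. The sketch below restates the two-part rule (volume spike plus upper wick) as a standalone function. `is_climax` is a hypothetical helper introduced only for illustration and is not part of this commit; the default thresholds mirror the fallbacks used in `evaluate_sell`.

```python
def is_climax(today_volume: float, avg_vol20: float, price: float,
              day_high: float, vol_x: float = 3.0, close_pct: float = 0.97) -> bool:
    """Hypothetical helper mirroring the sell_climax predicate."""
    # Volume spike: today's cumulative volume is at least vol_x times the 20-day average.
    volume_spike = today_volume >= vol_x * avg_vol20
    # Upper wick: the current price sits below close_pct of the session high.
    upper_wick = price < day_high * close_pct
    return volume_spike and upper_wick


# Same numbers as the unit tests in this change set.
print(is_climax(400.0, 100.0, 96.0, 100.0))  # 400 >= 300 and 96 < 97
print(is_climax(400.0, 100.0, 99.0, 100.0))  # 99 is not below 97
```

Keeping both thresholds as parameters matches the commit's intent that `climax_vol_x` and `climax_close_pct` come from the monitor-set `exit_params`.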


@@ -0,0 +1,32 @@
"""Settings: loaded from environment variables. The TM_ prefix keeps them separate from ai_trade."""
from __future__ import annotations

import os
from dataclasses import dataclass


@dataclass
class Settings:
    nas_base_url: str
    webai_api_key: str
    redis_url: str
    kis_app_key: str
    kis_app_secret: str
    kis_account: str
    kis_is_virtual: bool
    loop_interval: int
    climax_vol_mult: float


def load_settings() -> Settings:
    return Settings(
        nas_base_url=os.getenv("NAS_BASE_URL", "http://192.168.45.54:18500"),
        webai_api_key=os.getenv("WEBAI_API_KEY", ""),
        redis_url=os.getenv("REDIS_URL", "redis://192.168.45.54:6379"),
        kis_app_key=os.getenv("TM_KIS_APP_KEY", ""),
        kis_app_secret=os.getenv("TM_KIS_APP_SECRET", ""),
        kis_account=os.getenv("TM_KIS_ACCOUNT", ""),
        kis_is_virtual=os.getenv("TM_KIS_IS_VIRTUAL", "0") == "1",
        loop_interval=int(os.getenv("TM_LOOP_INTERVAL", "60")),
        climax_vol_mult=float(os.getenv("TM_CLIMAX_VOL_MULT", "3.0")),
    )
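
Review note: `kis_is_virtual` is enabled only when `TM_KIS_IS_VIRTUAL` is exactly the string "1". The sketch below illustrates that parsing rule under that assumption. `env_flag` is a hypothetical helper, not something this commit defines.

```python
import os


def env_flag(name: str, default: str = "0") -> bool:
    """Hypothetical helper: true only when the variable is exactly "1"."""
    return os.getenv(name, default) == "1"


os.environ["TM_KIS_IS_VIRTUAL"] = "true"
print(env_flag("TM_KIS_IS_VIRTUAL"))  # "true" is not "1", so the flag stays off

os.environ["TM_KIS_IS_VIRTUAL"] = "1"
print(env_flag("TM_KIS_IS_VIRTUAL"))  # exact match turns the flag on
```

Values such as "true" or "yes" therefore select the production KIS endpoint, which may be worth stating in `.env.example`.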


@@ -0,0 +1,5 @@
"""Add the services/ root to sys.path so that `from _shared.heartbeat import ...` works."""
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).resolve().parent.parent))


@@ -0,0 +1,38 @@
"""Pure TA indicators: sma / rsi_series / highest_high."""
from __future__ import annotations


def sma(values: list[float], period: int) -> float | None:
    if period <= 0 or len(values) < period:
        return None
    return sum(values[-period:]) / period


def highest_high(highs: list[float], period: int) -> float | None:
    if period <= 0 or len(highs) < period:
        return None
    return max(highs[-period:])


def rsi_series(closes: list[float], period: int = 14) -> list[float]:
    """Wilder RSI. The returned list aligns 1:1 with closes[period:]. Returns [] if there is too little data."""
    if len(closes) <= period:
        return []
    deltas = [closes[i] - closes[i - 1] for i in range(1, len(closes))]
    gains = [d if d > 0 else 0.0 for d in deltas]
    losses = [-d if d < 0 else 0.0 for d in deltas]

    def _rsi(ag: float, al: float) -> float:
        if al == 0:
            return 100.0
        rs = ag / al
        return 100.0 - 100.0 / (1.0 + rs)

    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    out = [_rsi(avg_gain, avg_loss)]
    for i in range(period, len(deltas)):
        avg_gain = (avg_gain * (period - 1) + gains[i]) / period
        avg_loss = (avg_loss * (period - 1) + losses[i]) / period
        out.append(_rsi(avg_gain, avg_loss))
    return out
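
Review note: the Wilder smoothing step inside `rsi_series` can be looked at by itself. This is a minimal sketch of the recurrence and the RSI mapping, assuming the same formulas as above. `wilder_step` and `rsi_from_averages` are hypothetical names used only for illustration.

```python
def wilder_step(prev_avg: float, value: float, period: int = 14) -> float:
    """One Wilder smoothing update: weight the previous average by (period - 1)."""
    return (prev_avg * (period - 1) + value) / period


def rsi_from_averages(avg_gain: float, avg_loss: float) -> float:
    """Map smoothed gain and loss averages to an RSI value in [0, 100]."""
    if avg_loss == 0:
        return 100.0  # no losses in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


print(wilder_step(1.0, 3.0))        # (1.0 * 13 + 3.0) / 14
print(rsi_from_averages(1.0, 1.0))  # equal gains and losses give the midpoint
```

Because each update only needs the previous averages, the series can be recomputed from the daily closes on every cycle without stored state, which is how the buy_rsi_bounce rule uses it.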


@@ -0,0 +1,124 @@
"""KIS REST client: own OAuth token (TM_KIS_*) + quote + daily bars + throttle."""
from __future__ import annotations

import asyncio
import logging
import time
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

import httpx

logger = logging.getLogger(__name__)

KST = ZoneInfo("Asia/Seoul")
_MAX_ATTEMPTS = 3
_THROTTLE_INTERVAL = 0.5  # 2 requests per second
_TOKEN_MARGIN = 600  # reissue 10 minutes before expiry


class KISClient:
    def __init__(self, app_key, app_secret, account, is_virtual, timeout: float = 10.0):
        self._app_key = app_key
        self._app_secret = app_secret
        self._account = account
        self._base_url = (
            "https://openapivts.koreainvestment.com:29443" if is_virtual
            else "https://openapi.koreainvestment.com:9443"
        )
        self._client = httpx.AsyncClient(timeout=timeout)
        self._token: str | None = None
        self._token_exp: float = 0.0
        self._last_throttle_at = 0.0
        self._throttle_lock = asyncio.Lock()
        self._token_lock = asyncio.Lock()

    async def close(self) -> None:
        await self._client.aclose()

    async def _issue_token(self) -> str:
        async with self._token_lock:
            now = time.time()
            if self._token and now < self._token_exp - _TOKEN_MARGIN:
                return self._token
            r = await self._client.post(
                f"{self._base_url}/oauth2/tokenP",
                json={"grant_type": "client_credentials",
                      "appkey": self._app_key, "appsecret": self._app_secret},
            )
            r.raise_for_status()
            data = r.json()
            self._token = data["access_token"]
            self._token_exp = now + int(data.get("expires_in", 86400))
            return self._token

    async def _throttle(self) -> None:
        async with self._throttle_lock:
            elapsed = time.monotonic() - self._last_throttle_at
            if elapsed < _THROTTLE_INTERVAL:
                await asyncio.sleep(_THROTTLE_INTERVAL - elapsed)
            self._last_throttle_at = time.monotonic()

    async def _request(self, method: str, path: str, tr_id: str, **kwargs) -> dict:
        token = await self._issue_token()
        headers = {
            "authorization": f"Bearer {token}",
            "appkey": self._app_key, "appsecret": self._app_secret,
            "tr_id": tr_id, "custtype": "P",
        }
        url = f"{self._base_url}{path}"
        for attempt in range(_MAX_ATTEMPTS):
            await self._throttle()
            try:
                resp = await self._client.request(method, url, headers=headers, **kwargs)
                if resp.status_code == 429 and attempt < _MAX_ATTEMPTS - 1:
                    await asyncio.sleep(2 ** attempt)
                    continue
                resp.raise_for_status()
                return resp.json()
            except httpx.TimeoutException:
                if attempt < _MAX_ATTEMPTS - 1:
                    await asyncio.sleep(2 ** attempt)
                    continue
                raise
        raise RuntimeError("retry exhausted")

    async def get_quote(self, ticker: str) -> dict:
        raw = await self._request(
            "GET", "/uapi/domestic-stock/v1/quotations/inquire-price",
            tr_id="FHKST01010100",
            params={"FID_COND_MRKT_DIV_CODE": "J", "FID_INPUT_ISCD": ticker},
        )
        o = raw.get("output", {})
        return {
            "price": int(o["stck_prpr"]),
            "day_open": int(o["stck_oprc"]),
            "day_high": int(o["stck_hgpr"]),
            "today_volume": int(o["acml_vol"]),
            "as_of": datetime.now(KST).isoformat(),
        }

    async def get_daily_ohlcv(self, ticker: str, days: int = 250) -> list[dict]:
        today = datetime.now(KST).strftime("%Y%m%d")
        start = (datetime.now(KST) - timedelta(days=days * 2)).strftime("%Y%m%d")
        raw = await self._request(
            "GET", "/uapi/domestic-stock/v1/quotations/inquire-daily-itemchartprice",
            tr_id="FHKST03010100",
            params={"FID_COND_MRKT_DIV_CODE": "J", "FID_INPUT_ISCD": ticker,
                    "FID_INPUT_DATE_1": start, "FID_INPUT_DATE_2": today,
                    "FID_PERIOD_DIV_CODE": "D", "FID_ORG_ADJ_PRC": "1"},
        )
        bars = []
        for row in raw.get("output2", []):
            try:
                d = row["stck_bsop_date"]
                bars.append({
                    "datetime": f"{d[:4]}-{d[4:6]}-{d[6:]}",
                    "open": int(row["stck_oprc"]), "high": int(row["stck_hgpr"]),
                    "low": int(row["stck_lwpr"]), "close": int(row["stck_clpr"]),
                    "volume": int(row["acml_vol"]),
                })
            except (KeyError, ValueError):
                continue
        bars.reverse()
        return bars[-days:]
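
Review note: the token cache in `_issue_token` reuses a token only while it is more than `_TOKEN_MARGIN` seconds away from expiry. The sketch below isolates that comparison. `token_is_fresh` is a hypothetical helper for illustration; the real method additionally requires that a token has already been stored.

```python
def token_is_fresh(now: float, token_exp: float, margin: float = 600.0) -> bool:
    """Hypothetical helper: reuse the cached token only outside the safety margin."""
    return now < token_exp - margin


issued_at = 0.0
expires_at = issued_at + 86400  # default expires_in used when the response omits it

print(token_is_fresh(issued_at, expires_at))          # right after issue
print(token_is_fresh(expires_at - 600, expires_at))   # exactly at the margin boundary
```

With the defaults, a token is reissued once fewer than ten minutes of validity remain, so a request that starts just before expiry should not carry a token that lapses mid-flight.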


@@ -0,0 +1,62 @@
"""trade-monitor FastAPI entry: lifespan (monitor_loop + heartbeat_loop) + /health."""
from __future__ import annotations

import asyncio
import logging
from contextlib import asynccontextmanager

import redis.asyncio as aioredis
from fastapi import FastAPI

import monitor
from config import load_settings
from kis_client import KISClient
from nas_client import NASClient
from _shared.heartbeat import heartbeat_loop, WorkerStats

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(name)s %(levelname)s %(message)s")
logger = logging.getLogger(__name__)

# The 60 s monitor loop is longer than the 45 s TTL, so heartbeats are sent
# independently every 15 s to avoid expiry gaps.
HEARTBEAT_INTERVAL = 15
HEARTBEAT_TTL = 45


@asynccontextmanager
async def lifespan(app: FastAPI):
    settings = load_settings()
    nas = NASClient(settings.nas_base_url, settings.webai_api_key)
    kis = KISClient(settings.kis_app_key, settings.kis_app_secret,
                    settings.kis_account, settings.kis_is_virtual)
    state = monitor.MonitorState()
    stats = WorkerStats()
    redis = aioredis.from_url(settings.redis_url, decode_responses=False)
    mon_task = asyncio.create_task(
        monitor.monitor_loop(nas, kis, state, stats, settings))
    hb_task = asyncio.create_task(heartbeat_loop(
        redis, "trade-monitor", "trader", stats,
        interval=HEARTBEAT_INTERVAL, ttl=HEARTBEAT_TTL,
        state_fn=monitor.make_state_fn(state)))
    logger.info("trade-monitor lifespan started")
    try:
        yield
    finally:
        for t in (mon_task, hb_task):
            t.cancel()
            try:
                await t
            except asyncio.CancelledError:
                pass
        await kis.close()
        await nas.close()
        await redis.aclose()
        logger.info("trade-monitor lifespan stopped")


app = FastAPI(lifespan=lifespan)


@app.get("/health")
def health():
    return {"ok": True, "service": "trade-monitor"}
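
Review note: the heartbeat constants encode a small piece of arithmetic. If the heartbeat were emitted from the 60-second monitor loop, consecutive beats would be spaced further apart than the 45-second TTL, so the key would lapse between beats. The sketch below states that check, assuming the heartbeat is a key refreshed with an expiry. `has_expiry_gap` is a hypothetical helper, not part of `_shared.heartbeat`.

```python
def has_expiry_gap(emit_interval: int, ttl: int) -> bool:
    """Hypothetical check: true when beats are spaced further apart than the TTL."""
    return emit_interval > ttl


print(has_expiry_gap(60, 45))  # beat tied to the 60 s loop: the key would lapse
print(has_expiry_gap(15, 45))  # independent 15 s beat: refreshed before it lapses
```

Running the heartbeat as its own task also means a slow KIS or NAS call inside the monitor loop should not delay the next beat.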


@@ -0,0 +1,114 @@
"""Orchestration: fetch monitor-set → evaluate conditions → report + heartbeat state."""
from __future__ import annotations

import asyncio
import logging
from datetime import datetime
from zoneinfo import ZoneInfo

from conditions import evaluate_buy, evaluate_sell

logger = logging.getLogger(__name__)

KST = ZoneInfo("Asia/Seoul")


class MonitorState:
    """Shared state: updated by monitor_loop, read by the heartbeat state_fn."""

    def __init__(self):
        self.session_state = "idle"  # market_open | market_closed | idle
        self.last_alert_at: str | None = None


def filter_krx(targets: list[dict]) -> list[dict]:
    """Keep only 6-digit numeric tickers (KRX). Alphabetic tickers are skipped."""
    out = []
    for t in targets:
        tk = str(t.get("ticker", ""))
        if tk.isdigit() and len(tk) == 6:
            out.append(t)
    return out


async def _build_ctx(kis, target: dict, settings) -> dict:
    ticker = target["ticker"]
    quote = await kis.get_quote(ticker)
    daily = await kis.get_daily_ohlcv(ticker, 250)
    return {
        "ticker": ticker, "name": target.get("name", ""),
        "price": quote["price"], "day_open": quote["day_open"],
        "day_high": quote["day_high"],
        "today_volume": quote["today_volume"],
        "closes": [b["close"] for b in daily],
        "highs": [b["high"] for b in daily],
        "lows": [b["low"] for b in daily],
        "volumes": [b["volume"] for b in daily],
        "avg_price": target.get("avg_price"),
        "qty": target.get("qty"),
        "holding_high": target.get("holding_high"),
        "climax_vol_mult": settings.climax_vol_mult,
    }


async def run_cycle(nas, kis, state, stats, settings) -> None:
    try:
        ms = await nas.get_monitor_set()
    except Exception:
        logger.exception("monitor-set fetch failed")
        state.session_state = "idle"
        stats.jobs_failed += 1
        return

    session = ms.get("session", "closed")
    if session == "closed":
        state.session_state = "market_closed"
        return

    buy_targets = filter_krx(ms.get("buy_targets", []))
    sell_targets = filter_krx(ms.get("sell_targets", []))
    buy_params = ms.get("buy_params", {})
    exit_params = ms.get("exit_params", {})

    firing: list[dict] = []
    for t in buy_targets:
        try:
            firing += evaluate_buy(await _build_ctx(kis, t, settings), buy_params)
        except Exception:
            logger.exception("buy evaluation failed %s", t.get("ticker"))
    for t in sell_targets:
        try:
            firing += evaluate_sell(await _build_ctx(kis, t, settings), exit_params)
        except Exception:
            logger.exception("sell evaluation failed %s", t.get("ticker"))

    as_of = datetime.now(KST).isoformat(timespec="seconds")
    if firing:
        state.last_alert_at = as_of
        logger.info("firing %d: %s", len(firing),
                    [f"{f['ticker']}:{f['condition']}" for f in firing])
    try:
        await nas.post_report(as_of, firing)  # an empty list is sent too (edge clear)
    except Exception:
        logger.exception("report send failed")
    state.session_state = "market_open"
    stats.jobs_done += 1
    stats.last_job_at = as_of


async def monitor_loop(nas, kis, state, stats, settings) -> None:
    logger.info("trade-monitor loop started interval=%ds", settings.loop_interval)
    while True:
        try:
            await run_cycle(nas, kis, state, stats, settings)
        except asyncio.CancelledError:
            logger.info("monitor_loop cancelled")
            raise
        except Exception:
            logger.exception("monitor_loop iteration failed")
        await asyncio.sleep(settings.loop_interval)


def make_state_fn(state):
    async def state_fn(redis, stats):
        return state.session_state, {"last_alert_at": state.last_alert_at}
    return state_fn
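
Review note: `filter_krx` relies on `str.isdigit()`, which also accepts non-ASCII digit characters such as full-width numerals. A stricter variant, sketched below as one possible alternative rather than what this commit does, matches exactly six ASCII digits. `is_krx_ticker` and `_KRX_RE` are hypothetical names.

```python
import re

# Hypothetical stricter check: exactly six ASCII digits.
_KRX_RE = re.compile(r"[0-9]{6}")


def is_krx_ticker(ticker: object) -> bool:
    """Return True only for a six-character string of ASCII digits."""
    return _KRX_RE.fullmatch(str(ticker)) is not None


for candidate in ["005930", "AAPL", "00660", "0059301"]:
    print(candidate, is_krx_ticker(candidate))
```

Whether the extra strictness matters depends on how clean the monitor-set payload is; for tickers coming from the NAS backend, the existing check is probably sufficient.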


@@ -0,0 +1,48 @@
"""trade-alert contract with the NAS stock backend: X-WebAI-Key + retry."""
from __future__ import annotations

import asyncio
import logging

import httpx

logger = logging.getLogger(__name__)

_MAX_ATTEMPTS = 3
_RETRY_STATUSES = {429, 500, 502, 503, 504}


class NASClient:
    def __init__(self, base_url: str, api_key: str, timeout: float = 10.0):
        self._base_url = base_url.rstrip("/")
        self._api_key = api_key
        self._client = httpx.AsyncClient(timeout=timeout)

    async def close(self) -> None:
        await self._client.aclose()

    async def get_monitor_set(self) -> dict:
        return await self._request("GET", "/api/webai/trade-alert/monitor-set")

    async def post_report(self, as_of: str, firing: list[dict]) -> dict:
        return await self._request(
            "POST", "/api/webai/trade-alert/report",
            json={"as_of": as_of, "firing": firing})

    async def _request(self, method: str, path: str, **kwargs) -> dict:
        url = f"{self._base_url}{path}"
        headers = {"X-WebAI-Key": self._api_key}
        for attempt in range(_MAX_ATTEMPTS):
            try:
                resp = await self._client.request(method, url, headers=headers, **kwargs)
                if resp.status_code in _RETRY_STATUSES and attempt < _MAX_ATTEMPTS - 1:
                    await asyncio.sleep(2 ** attempt)
                    continue
                resp.raise_for_status()
                return resp.json()
            except httpx.TimeoutException:
                if attempt < _MAX_ATTEMPTS - 1:
                    await asyncio.sleep(2 ** attempt)
                    continue
                raise
        raise RuntimeError("retry exhausted")
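
Review note: both clients sleep `2 ** attempt` seconds between attempts and skip the sleep after the final one. The sketch below lists the resulting delays for a given attempt budget. `backoff_delays` is a hypothetical helper for illustration only.

```python
def backoff_delays(max_attempts: int = 3) -> list[int]:
    """Hypothetical helper: sleeps taken between attempts, none after the last one."""
    return [2 ** attempt for attempt in range(max_attempts - 1)]


delays = backoff_delays(3)
print(delays)       # sleeps before the second and third attempts
print(sum(delays))  # worst-case added wait per request, excluding request time
```

With three attempts, the backoff itself adds at most three seconds, which stays well inside the 60-second loop interval even before the per-request timeout is counted.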


@@ -0,0 +1,2 @@
[pytest]
asyncio_mode = auto


@@ -0,0 +1,7 @@
fastapi==0.115.6
uvicorn[standard]==0.34.0
redis>=5.0
httpx>=0.27
pytest>=8.0
pytest-asyncio>=0.24
respx>=0.21


@@ -0,0 +1,66 @@
"""evaluate_buy: boundaries of the 3 buy conditions."""
from conditions import evaluate_buy

BUY_PARAMS = {"rsi_oversold": 30, "breakout_vol_mult": 1.5, "pullback_pct": 0.02}


def _ctx(**over):
    base = dict(
        ticker="005930", name="삼성전자", price=100.0, day_open=99.0,
        today_volume=1000.0, closes=[], highs=[], lows=[], volumes=[],
        avg_price=None, qty=None, holding_high=None, climax_vol_mult=3.0,
    )
    base.update(over)
    return base


def _conditions(firing):
    return {f["condition"] for f in firing}


def test_ma20_pullback_fires():
    # Bullish alignment (ma20 > ma50 > ma200), recent low near ma20, price rebounding above ma20
    closes = [90.0] * 200 + [100.0] * 20  # ma20=100, ma50/ma200 lower → bullish alignment
    lows = [90.0] * 217 + [100.5, 100.4, 100.3]  # last 3 lows at or below ma20*(1.02)=102
    ctx = _ctx(price=101.0, closes=closes, highs=closes, lows=lows,
               volumes=[1.0] * len(closes))
    assert "buy_ma20_pullback" in _conditions(evaluate_buy(ctx, BUY_PARAMS))


def test_ma20_pullback_skips_when_not_aligned():
    closes = [100.0] * 200 + [90.0] * 20  # bearish alignment
    ctx = _ctx(price=91.0, closes=closes, highs=closes, lows=closes,
               volumes=[1.0] * len(closes))
    assert "buy_ma20_pullback" not in _conditions(evaluate_buy(ctx, BUY_PARAMS))


def test_breakout_fires():
    closes = [50.0] * 25
    highs = [60.0] * 25  # prior 20-bar high is 60
    vols = [100.0] * 25  # avg20=100
    ctx = _ctx(price=61.0, today_volume=200.0, closes=closes, highs=highs,
               lows=closes, volumes=vols)  # 61>60, 200>1.5*100
    assert "buy_breakout" in _conditions(evaluate_buy(ctx, BUY_PARAMS))


def test_breakout_skips_on_low_volume():
    highs = [60.0] * 25
    ctx = _ctx(price=61.0, today_volume=120.0, closes=[50.0] * 25, highs=highs,
               lows=[50.0] * 25, volumes=[100.0] * 25)  # 120 < 1.5*100=150
    assert "buy_breakout" not in _conditions(evaluate_buy(ctx, BUY_PARAMS))


def test_rsi_bounce_fires():
    # 14 bars of sharp decline push RSI below 30, then 5 bars of rebound bring it back above 30
    closes = [100.0]
    for _ in range(14):
        closes.append(closes[-1] * 0.97)  # decline → RSI falls
    for _ in range(5):
        closes.append(closes[-1] * 1.05)  # rebound → RSI back above 30
    ctx = _ctx(price=closes[-1], closes=closes, highs=closes, lows=closes,
               volumes=[1.0] * len(closes))
    assert "buy_rsi_bounce" in _conditions(evaluate_buy(ctx, BUY_PARAMS))


def test_empty_series_no_fire():
    assert evaluate_buy(_ctx(), BUY_PARAMS) == []


@@ -0,0 +1,89 @@
"""evaluate_sell: boundaries of the 5 sell conditions."""
from conditions import evaluate_sell

EXIT = {"stop_pct": 0.08, "take_pct": 0.25, "trailing_pct": 0.10,
        "climax_vol_x": 3.0, "climax_close_pct": 0.97}


def _ctx(**over):
    base = dict(
        ticker="000660", name="SK하이닉스", price=100.0, day_open=100.0,
        day_high=100.0, today_volume=100.0, closes=[100.0] * 60,
        highs=[100.0] * 60, lows=[100.0] * 60, volumes=[100.0] * 60,
        avg_price=100.0, qty=10, holding_high=100.0, climax_vol_mult=3.0,
    )
    base.update(over)
    return base


def _c(firing):
    return {f["condition"] for f in firing}


def test_stop_loss_fires():
    ctx = _ctx(price=90.0, avg_price=100.0)  # -10% <= -8%
    assert "sell_stop_loss" in _c(evaluate_sell(ctx, EXIT))


def test_stop_loss_skips_above_threshold():
    ctx = _ctx(price=95.0, avg_price=100.0)  # -5% > -8%
    assert "sell_stop_loss" not in _c(evaluate_sell(ctx, EXIT))


def test_take_profit_fires():
    ctx = _ctx(price=130.0, avg_price=100.0)  # +30% >= 25%
    assert "sell_take_profit" in _c(evaluate_sell(ctx, EXIT))


def test_trailing_stop_fires():
    ctx = _ctx(price=89.0, holding_high=100.0)  # 89 <= 100*0.9=90
    assert "sell_trailing_stop" in _c(evaluate_sell(ctx, EXIT))


def test_ma_break_severity_high():
    # price below ma50/ma200 → severity high (200 bars are needed to compute ma200)
    closes = [200.0] * 200
    ctx = _ctx(price=100.0, closes=closes, avg_price=100.0, holding_high=100.0)
    firing = evaluate_sell(ctx, EXIT)
    mb = [f for f in firing if f["condition"] == "sell_ma_break"]
    assert mb and mb[0]["detail"]["severity"] == "high"


def test_climax_fires():
    # holdings_intel alignment: volume at least 3x + close < day high × 0.97 (upper wick)
    ctx = _ctx(price=96.0, day_high=100.0, today_volume=400.0,
               volumes=[100.0] * 60)  # 400>=3*100, 96 < 100*0.97=97
    assert "sell_climax" in _c(evaluate_sell(ctx, EXIT))


def test_climax_skips_when_not_reversal():
    # close at or above 97% of the day high → no upper wick
    ctx = _ctx(price=99.0, day_high=100.0, today_volume=400.0,
               volumes=[100.0] * 60)  # 99 >= 100*0.97=97 → not a reversal
    assert "sell_climax" not in _c(evaluate_sell(ctx, EXIT))


def test_climax_uses_exit_params_vol_x():
    # exit_params.climax_vol_x=5.0 → 400 < 5*100=500 → does not fire
    exit5 = {**EXIT, "climax_vol_x": 5.0}
    ctx = _ctx(price=96.0, day_high=100.0, today_volume=400.0,
               volumes=[100.0] * 60)
    assert "sell_climax" not in _c(evaluate_sell(ctx, exit5))


def test_climax_uses_exit_params_close_pct():
    # climax_close_pct=0.90 → threshold 90, price=95 → 95<90? No → does not fire
    exit90 = {**EXIT, "climax_close_pct": 0.90}
    ctx = _ctx(price=95.0, day_high=100.0, today_volume=400.0,
               volumes=[100.0] * 60)
    assert "sell_climax" not in _c(evaluate_sell(ctx, exit90))
    # with the default 0.97, 95 < 97 → fires
    assert "sell_climax" in _c(evaluate_sell(ctx, EXIT))


def test_no_avg_no_pnl_conditions():
    # avg_price None (no holding info) → stop/take do not fire
    ctx = _ctx(price=50.0, avg_price=None, holding_high=None,
               closes=[100.0] * 60)
    conds = _c(evaluate_sell(ctx, EXIT))
    assert "sell_stop_loss" not in conds and "sell_take_profit" not in conds


@@ -0,0 +1,27 @@
"""Settings env loading: defaults + override."""
from config import load_settings


def test_defaults(monkeypatch):
    for k in ("NAS_BASE_URL", "WEBAI_API_KEY", "REDIS_URL", "TM_KIS_APP_KEY",
              "TM_KIS_APP_SECRET", "TM_KIS_ACCOUNT", "TM_KIS_IS_VIRTUAL",
              "TM_LOOP_INTERVAL", "TM_CLIMAX_VOL_MULT"):
        monkeypatch.delenv(k, raising=False)
    s = load_settings()
    assert s.nas_base_url == "http://192.168.45.54:18500"
    assert s.redis_url == "redis://192.168.45.54:6379"
    assert s.kis_is_virtual is False
    assert s.loop_interval == 60
    assert s.climax_vol_mult == 3.0


def test_override(monkeypatch):
    monkeypatch.setenv("TM_KIS_IS_VIRTUAL", "1")
    monkeypatch.setenv("TM_LOOP_INTERVAL", "30")
    monkeypatch.setenv("TM_CLIMAX_VOL_MULT", "2.5")
    monkeypatch.setenv("WEBAI_API_KEY", "secret")
    s = load_settings()
    assert s.kis_is_virtual is True
    assert s.loop_interval == 30
    assert s.climax_vol_mult == 2.5
    assert s.webai_api_key == "secret"


@@ -0,0 +1,6 @@
"""/health: verify the route handler directly."""
from main import health


def test_health():
    assert health() == {"ok": True, "service": "trade-monitor"}


@@ -0,0 +1,39 @@
"""indicators: pure numeric checks."""
from indicators import sma, rsi_series, highest_high


def test_sma_basic():
    assert sma([1, 2, 3, 4, 5], 5) == 3.0
    assert sma([1, 2, 3, 4, 5], 2) == 4.5


def test_sma_insufficient():
    assert sma([1, 2], 5) is None
    assert sma([], 3) is None


def test_highest_high():
    assert highest_high([1, 9, 3, 4], 3) == 9
    assert highest_high([1, 2, 3], 3) == 3
    assert highest_high([1, 2], 3) is None


def test_rsi_all_gains_is_100():
    # monotonic increase → zero losses → RSI 100
    closes = [float(i) for i in range(1, 20)]
    rs = rsi_series(closes, 14)
    assert rs, "series should not be empty"
    assert rs[-1] == 100.0


def test_rsi_insufficient():
    assert rsi_series([1, 2, 3], 14) == []


def test_rsi_known_range():
    # series with mixed ups and downs → RSI stays between 0 and 100
    closes = [10, 11, 10.5, 11.5, 11, 12, 11.8, 12.5, 12, 13,
              12.7, 13.2, 12.9, 13.5, 13.1, 13.8]
    rs = rsi_series(closes, 14)
    assert len(rs) == len(closes) - 14
    assert all(0.0 <= v <= 100.0 for v in rs)


@@ -0,0 +1,56 @@
"""KISClient: token issue/cache + quote/daily parsing (respx)."""
import httpx
import respx

from kis_client import KISClient

BASE = "https://openapi.koreainvestment.com:9443"


def _client():
    return KISClient("APPKEY", "APPSECRET", "12345678-01", is_virtual=False)


@respx.mock
async def test_issue_token_cached():
    route = respx.post(f"{BASE}/oauth2/tokenP").mock(
        return_value=httpx.Response(200, json={"access_token": "TKN", "expires_in": 86400}))
    c = _client()
    t1 = await c._issue_token()
    t2 = await c._issue_token()
    assert t1 == "TKN" and t2 == "TKN"
    assert route.call_count == 1  # cached → issued only once
    await c.close()


@respx.mock
async def test_get_quote_parses():
    respx.post(f"{BASE}/oauth2/tokenP").mock(
        return_value=httpx.Response(200, json={"access_token": "TKN", "expires_in": 86400}))
    respx.get(f"{BASE}/uapi/domestic-stock/v1/quotations/inquire-price").mock(
        return_value=httpx.Response(200, json={"output": {
            "stck_prpr": "71500", "stck_oprc": "71000", "stck_hgpr": "72000",
            "acml_vol": "1234567"}}))
    c = _client()
    q = await c.get_quote("005930")
    assert q["price"] == 71500 and q["day_open"] == 71000 and q["today_volume"] == 1234567
    assert q["day_high"] == 72000
    await c.close()


@respx.mock
async def test_get_daily_ascending():
    respx.post(f"{BASE}/oauth2/tokenP").mock(
        return_value=httpx.Response(200, json={"access_token": "TKN", "expires_in": 86400}))
    # KIS returns bars in descending order → they must be reversed to ascending
    respx.get(f"{BASE}/uapi/domestic-stock/v1/quotations/inquire-daily-itemchartprice").mock(
        return_value=httpx.Response(200, json={"output2": [
            {"stck_bsop_date": "20260702", "stck_oprc": "100", "stck_hgpr": "110",
             "stck_lwpr": "90", "stck_clpr": "105", "acml_vol": "5"},
            {"stck_bsop_date": "20260701", "stck_oprc": "95", "stck_hgpr": "102",
             "stck_lwpr": "94", "stck_clpr": "100", "acml_vol": "4"}]}))
    c = _client()
    bars = await c.get_daily_ohlcv("005930", days=250)
    assert bars[0]["datetime"] == "2026-07-01"
    assert bars[-1]["close"] == 105
    await c.close()


@@ -0,0 +1,95 @@
"""monitor.run_cycle: gate/filter/assembly/isolation."""
from monitor import MonitorState, filter_krx, run_cycle
from config import load_settings
from _shared.heartbeat import WorkerStats


def test_filter_krx_keeps_only_numeric6():
    targets = [{"ticker": "005930"}, {"ticker": "AAPL"}, {"ticker": "00660"},
               {"ticker": "000660"}, {"ticker": "0059301"}]
    kept = {t["ticker"] for t in filter_krx(targets)}
    assert kept == {"005930", "000660"}


class _FakeNAS:
    def __init__(self, ms):
        self._ms = ms
        self.reported = None

    async def get_monitor_set(self):
        return self._ms

    async def post_report(self, as_of, firing):
        self.reported = {"as_of": as_of, "firing": firing}
        return {"new_alerts": len(firing), "cleared": 0}


class _FakeKIS:
    def __init__(self, price=100, fail_on=None):
        self._price = price
        self._fail_on = fail_on or set()

    async def get_quote(self, ticker):
        if ticker in self._fail_on:
            raise RuntimeError("KIS down")
        return {"price": self._price, "day_open": 99, "day_high": 100,
                "today_volume": 1000, "as_of": "x"}

    async def get_daily_ohlcv(self, ticker, days=250):
        # bullish alignment + low near ma20 → induces ma20_pullback firing
        return [{"open": 90, "high": 90, "low": 90, "close": 90, "volume": 1}] * 200 \
            + [{"open": 100, "high": 100, "low": 100, "close": 100, "volume": 1}] * 20


async def test_closed_session_skips_kis():
    nas = _FakeNAS({"session": "closed"})
    state, stats = MonitorState(), WorkerStats()
    await run_cycle(nas, _FakeKIS(), state, stats, load_settings())
    assert state.session_state == "market_closed"
    assert nas.reported is None  # no report is sent either


async def test_non_krx_skipped_and_report_sent():
    nas = _FakeNAS({"session": "regular",
                    "buy_targets": [{"ticker": "AAPL", "name": "Apple"}],
                    "sell_targets": [], "buy_params": {}, "exit_params": {}})
    state, stats = MonitorState(), WorkerStats()
    await run_cycle(nas, _FakeKIS(), state, stats, load_settings())
    assert state.session_state == "market_open"
    assert nas.reported is not None
    assert nas.reported["firing"] == []  # alphabetic ticker skipped → empty firing


async def test_firing_assembled_and_last_alert_set():
    nas = _FakeNAS({"session": "regular",
                    "buy_targets": [{"ticker": "005930", "name": "삼성전자"}],
                    "sell_targets": [], "buy_params": {"pullback_pct": 0.02},
                    "exit_params": {}})
    state, stats = MonitorState(), WorkerStats()
    await run_cycle(nas, _FakeKIS(price=101), state, stats, load_settings())
    conds = {f["condition"] for f in nas.reported["firing"]}
    assert "buy_ma20_pullback" in conds
    assert state.last_alert_at is not None


async def test_per_ticker_failure_isolated():
    nas = _FakeNAS({"session": "regular",
                    "buy_targets": [{"ticker": "005930"}, {"ticker": "000660"}],
                    "sell_targets": [], "buy_params": {}, "exit_params": {}})
    state, stats = MonitorState(), WorkerStats()
    # 005930 fails, 000660 succeeds → the loop survives and the report is sent
    await run_cycle(nas, _FakeKIS(fail_on={"005930"}), state, stats, load_settings())
    assert nas.reported is not None
    assert state.session_state == "market_open"


async def test_monitor_set_failure_sets_idle():
    class _BadNAS(_FakeNAS):
        async def get_monitor_set(self):
            raise RuntimeError("NAS down")

    nas = _BadNAS({})
    state, stats = MonitorState(), WorkerStats()
    await run_cycle(nas, _FakeKIS(), state, stats, load_settings())
    assert state.session_state == "idle"
    assert nas.reported is None


@@ -0,0 +1,39 @@
"""NASClient: monitor-set/report + X-WebAI-Key (respx)."""
import json as _json

import httpx
import respx

from nas_client import NASClient

BASE = "http://nas.test"


@respx.mock
async def test_get_monitor_set_sends_key():
    route = respx.get(f"{BASE}/api/webai/trade-alert/monitor-set").mock(
        return_value=httpx.Response(200, json={"session": "regular", "buy_targets": []}))
    c = NASClient(BASE, "KEY")
    ms = await c.get_monitor_set()
    assert ms["session"] == "regular"
    assert route.calls.last.request.headers["X-WebAI-Key"] == "KEY"
    await c.close()


@respx.mock
async def test_post_report_payload():
    captured = {}

    def _resp(request):
        captured.update(_json.loads(request.content))
        return httpx.Response(200, json={"new_alerts": 1, "cleared": 0})

    respx.post(f"{BASE}/api/webai/trade-alert/report").mock(side_effect=_resp)
    c = NASClient(BASE, "KEY")
    firing = [{"ticker": "005930", "kind": "buy", "condition": "buy_breakout",
               "price": 71500, "detail": {}}]
    out = await c.post_report("2026-07-02T09:01:00+09:00", firing)
    assert out["new_alerts"] == 1
    assert captured["as_of"] == "2026-07-02T09:01:00+09:00"
    assert captured["firing"] == firing
    await c.close()


@@ -7,10 +7,13 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
     ca-certificates \
     && rm -rf /var/lib/apt/lists/*
-COPY requirements.txt .
+COPY video-render/requirements.txt /app/
 RUN pip install --no-cache-dir --timeout 600 --retries 5 -r requirements.txt
-COPY . .
+# F6: shared ReliableQueue module (services/_shared)
+COPY _shared /app/_shared
+COPY video-render/. /app/
+ENV PYTHONPATH=/app
 EXPOSE 8000
 CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "1"]


@@ -0,0 +1,5 @@
"""Make services/ root importable so `from _shared.reliable_queue import ...` works during tests."""
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))


@@ -3,11 +3,14 @@ from __future__ import annotations
 import asyncio
 import logging
+import os
 from contextlib import asynccontextmanager
+import redis.asyncio as aioredis
 from fastapi import FastAPI
 import worker
+from _shared.heartbeat import heartbeat_loop
 logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(levelname)s %(message)s")
 logger = logging.getLogger(__name__)
@@ -16,15 +19,19 @@ logger = logging.getLogger(__name__)
 @asynccontextmanager
 async def lifespan(app: FastAPI):
     worker_task = asyncio.create_task(worker.worker_loop())
+    hb_redis = aioredis.from_url(os.getenv("REDIS_URL", "redis://192.168.45.54:6379"), decode_responses=False)
+    hb_task = asyncio.create_task(heartbeat_loop(hb_redis, "video-render", "render", worker.stats))
     logger.info("video-render lifespan started")
     try:
         yield
     finally:
-        worker_task.cancel()
-        try:
-            await worker_task
-        except asyncio.CancelledError:
-            pass
+        for t in (worker_task, hb_task):
+            t.cancel()
+            try:
+                await t
+            except asyncio.CancelledError:
+                pass
+        await hb_redis.aclose()
         logger.info("video-render lifespan stopped")


@@ -41,3 +41,78 @@ def test_dispatch_unknown_job_type_logs_error():
args = m.call_args[0] args = m.call_args[0]
assert args[0] == "t5" assert args[0] == "t5"
assert args[1] == "failed" assert args[1] == "failed"
# ----- F6: ReliableQueue poll_once -----
import json
from unittest.mock import AsyncMock, MagicMock
@pytest.mark.asyncio
async def test_poll_once_acks_on_success(monkeypatch):
payload = {"task_id": "t1", "job_type": "sora_generation", "params": {}}
raw = json.dumps(payload).encode()
fake_queue = AsyncMock()
fake_queue.dequeue = AsyncMock(return_value=(payload, raw))
fake_queue.ack = AsyncMock()
fake_queue.fail = AsyncMock()
monkeypatch.setattr(worker, "_dispatch", MagicMock())
handled = await worker.poll_once(fake_queue)
assert handled is True
fake_queue.ack.assert_awaited_once_with(raw)
fake_queue.fail.assert_not_awaited()
@pytest.mark.asyncio
async def test_poll_once_calls_fail_on_dispatch_exception(monkeypatch):
payload = {"task_id": "t2", "job_type": "sora_generation", "params": {}}
raw = json.dumps(payload).encode()
fake_queue = AsyncMock()
fake_queue.dequeue = AsyncMock(return_value=(payload, raw))
fake_queue.ack = AsyncMock()
fake_queue.fail = AsyncMock()
def _boom(p):
raise RuntimeError("dispatch crash")
monkeypatch.setattr(worker, "_dispatch", _boom)
handled = await worker.poll_once(fake_queue)
assert handled is True
fake_queue.fail.assert_awaited_once_with(raw, payload)
fake_queue.ack.assert_not_awaited()
@pytest.mark.asyncio
async def test_poll_once_returns_false_on_timeout(monkeypatch):
fake_queue = AsyncMock()
fake_queue.dequeue = AsyncMock(return_value=None)
fake_queue.ack = AsyncMock()
fake_queue.fail = AsyncMock()
monkeypatch.setattr(worker, "_dispatch", MagicMock())
handled = await worker.poll_once(fake_queue)
assert handled is False
fake_queue.ack.assert_not_awaited()
fake_queue.fail.assert_not_awaited()
+# ----- heartbeat stats counters -----
+class _OneJobQueue:
+    def __init__(self): self.acked = False
+    async def dequeue(self, timeout=5):
+        if self.acked: return None
+        return ({"job_type": "sora_generation", "task_id": "t1", "params": {}}, b"raw")
+    async def ack(self, raw): self.acked = True
+    async def fail(self, raw, payload): pass
+
+@pytest.mark.asyncio
+async def test_poll_once_increments_jobs_done(monkeypatch):
+    worker.stats.jobs_done = 0
+    monkeypatch.setattr(worker, "run_sora_generation", lambda task_id, params: None)
+    handled = await worker.poll_once(_OneJobQueue())
+    assert handled is True
+    assert worker.stats.jobs_done == 1
+    assert worker.stats.busy is False
+    assert worker.stats.last_job_at is not None
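
The tests above pin down the queue contract that `poll_once` relies on: `dequeue` returns `(payload, raw)` or `None`, success leads to `ack(raw)`, and a dispatch exception leads to `fail(raw, payload)`. The `_shared.reliable_queue` module itself is not part of this diff, so the following is only a minimal in-memory sketch of that contract. `InMemoryReliableQueue`, its `enqueue` method, the `dead` list, and the `handler` argument are hypothetical names introduced for illustration; in particular, what the real `fail` does with an item is an assumption here.

```python
import asyncio
import json


class InMemoryReliableQueue:
    """Hypothetical stand-in for ReliableQueue; plain lists replace Redis lists."""

    def __init__(self):
        self.pending = []     # items waiting to be picked up
        self.processing = []  # items handed to a worker but not yet acked
        self.dead = []        # items whose handler raised (assumed policy)

    def enqueue(self, payload: dict) -> None:
        self.pending.append(json.dumps(payload).encode())

    async def dequeue(self, timeout: float = 5):
        # Move one item from pending to processing. A Redis-backed queue would
        # block for up to `timeout` seconds; this sketch returns immediately.
        if not self.pending:
            return None
        raw = self.pending.pop(0)
        self.processing.append(raw)
        return json.loads(raw), raw

    async def ack(self, raw: bytes) -> None:
        self.processing.remove(raw)

    async def fail(self, raw: bytes, payload: dict) -> None:
        self.processing.remove(raw)
        self.dead.append(raw)

    async def recover(self) -> int:
        # Requeue everything left in processing, e.g. after a worker crash.
        recovered = len(self.processing)
        self.pending = self.processing + self.pending
        self.processing = []
        return recovered


async def poll_once(queue, handler) -> bool:
    """Simplified analogue of worker.poll_once, without the stats counters."""
    result = await queue.dequeue(timeout=5)
    if result is None:
        return False
    payload, raw = result
    try:
        await asyncio.to_thread(handler, payload)
    except Exception:
        await queue.fail(raw, payload)
        return True
    await queue.ack(raw)
    return True


async def demo():
    q = InMemoryReliableQueue()
    for task_id in ("t1", "t2", "t3"):
        q.enqueue({"task_id": task_id})

    def handler(payload):
        if payload["task_id"] == "t2":
            raise RuntimeError("boom")

    handled = [await poll_once(q, handler) for _ in range(2)]  # t1 acked, t2 failed
    await q.dequeue()                # t3 is picked up, then the "worker crashes"
    recovered = await q.recover()    # t3 goes back to pending
    empty = await poll_once(InMemoryReliableQueue(), handler)
    return handled, len(q.dead), recovered, len(q.pending), empty
```

Running `asyncio.run(demo())` walks through the three paths the tests cover (ack, fail, empty queue) plus startup recovery of an item that was dequeued but never acknowledged.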


@@ -1,7 +1,7 @@
-"""Redis BLPOP worker: queue:video-render → job_type dispatch → NAS webhook.
+"""Redis ReliableQueue worker: F6 reliability pattern (BLMOVE + ack/fail + recovery).
 Waits while queue:paused is set (task-watcher sets it when it detects activity by 박재오).
-Plan-B-Music worker.py pattern: string-based dispatch + getattr (compatible with test patching).
+string-based dispatch + getattr (compatible with test patching).
 """
 from __future__ import annotations
@@ -18,6 +18,8 @@ from providers.sora import run_sora_generation
 from providers.veo import run_veo_generation
 from providers.kling import run_kling_generation
 from providers.seedance import run_seedance_generation
+from _shared.reliable_queue import ReliableQueue
+from _shared.heartbeat import WorkerStats, utc_now_iso
 logger = logging.getLogger(__name__)
@@ -25,6 +27,8 @@ REDIS_URL = os.getenv("REDIS_URL", "redis://192.168.45.54:6379")
 QUEUE_KEY = "queue:video-render"
 PAUSED_KEY = "queue:paused"
+stats = WorkerStats()
 # string names so `unittest.mock.patch` on `worker.<name>` is correctly intercepted
 _DISPATCH_TABLE = {
     "sora_generation": "run_sora_generation",
@@ -53,25 +57,49 @@ def _dispatch(payload: dict) -> None:
     fn(task_id, params)
+
+async def poll_once(queue: ReliableQueue) -> bool:
+    """F6 — 1 cycle: dequeue → _dispatch → ack/fail. Returns True if a job handled."""
+    result = await queue.dequeue(timeout=5)
+    if result is None:
+        return False
+    payload, raw = result
+    stats.busy = True
+    try:
+        await asyncio.to_thread(_dispatch, payload)
+    except Exception:
+        logger.exception("dispatch unhandled exception task_id=%s",
+                         payload.get("task_id"))
+        await queue.fail(raw, payload)
+        stats.jobs_failed += 1
+        stats.last_job_at = utc_now_iso()
+        stats.busy = False
+        return True
+    await queue.ack(raw)
+    stats.jobs_done += 1
+    stats.last_job_at = utc_now_iso()
+    stats.busy = False
+    return True
 async def worker_loop():
     redis = aioredis.from_url(REDIS_URL, decode_responses=False)
-    logger.info("video-render worker started (queue=%s)", QUEUE_KEY)
+    queue = ReliableQueue(redis, queue_key=QUEUE_KEY)
+    logger.info("video-render worker started worker_id=%s queue=%s",
+                queue.worker_id, QUEUE_KEY)
+    try:
+        recovered = await queue.recover()
+        if recovered:
+            logger.info("recovered %d orphaned items at startup", recovered)
+    except Exception:
+        logger.exception("startup recover failed")
     while True:
         try:
             paused = await redis.get(PAUSED_KEY)
             if paused == b"1":
                 await asyncio.sleep(10)
                 continue
-            item = await redis.blpop(QUEUE_KEY, timeout=1)
-            if item is None:
-                continue
-            _, raw = item
-            try:
-                payload = json.loads(raw)
-            except json.JSONDecodeError:
-                logger.error("invalid queue payload: %r", raw[:200])
-                continue
-            await asyncio.to_thread(_dispatch, payload)
+            await poll_once(queue)
         except asyncio.CancelledError:
             logger.info("worker_loop cancelled")
             raise