35 Commits

Author SHA1 Message Date
ace0339d33 refactor: rename stock-lab → stock (graduation)
- git mv stock-lab/ → stock/
- docker-compose.yml: service key + container_name + build.context +
  frontend.depends_on + agent-office STOCK_LAB_URL → STOCK_URL
- agent-office/app: STOCK_LAB_URL → STOCK_URL in config.py,
  service_proxy.py, agents/stock.py, tests/
- nginx/default.conf: proxy_pass http://stock-lab → http://stock (3 lines)
- wording updated in CLAUDE.md / README.md / STATUS.md / scripts/
- self-references inside stock/ updated

Graduation per the lab naming policy (feedback_lab_naming.md).
No changes to API URLs, Python imports, or DB filenames.
2026-05-15 01:45:44 +09:00
8812bd870a docs(ai_news): mark scraper.py deprecated (Phase 1 transition) 2026-05-14 02:13:30 +09:00
b3fac4f442 feat(ai_news): router forwards mapping stats to telegram 2026-05-14 02:13:06 +09:00
19aed304cb feat(ai_news): telegram includes article mapping stats line 2026-05-14 02:12:17 +09:00
bbe5221e57 feat(ai_news): pipeline uses articles_source (replaces Naver scraper) 2026-05-14 02:09:41 +09:00
ec0ccf649e feat(ai_news): include summary + pub_date in LLM prompt 2026-05-14 02:07:01 +09:00
84d90f6e1c feat(ai_news): articles_source module (substring ticker matching) 2026-05-14 02:04:32 +09:00
ddfe0ca3eb feat(ai_news): add news_sentiment.source column with migration 2026-05-14 02:00:38 +09:00
943f676414 fix(ai_news): set weight=0 and add Spearman IC validation harness
Blocks the gradient before validation and adds IC measurement infrastructure.

- schema.py: DEFAULT_WEIGHTS["ai_news"] 0.8 → 0.0
  + one-time migration: auto-resets the 0.8 value in existing production
  rows (other values the user set explicitly are left untouched)
- ai_news/validation.py: compute_ic() — daily Spearman correlation of
  score_raw vs. forward return; returns ic_mean/ic_std/ic_per_day and a
  verdict classification (skip/weak/strong)
- router.py: GET /api/stock/screener/ai-news/ic?days=30&horizon=1
- 5 unit tests: empty DB, strong +IC, random ≈0 IC, min_news_count
  filter, horizon=5

Background: an adversarial review found that the ai_news weight of 0.8
shipped without validation. Data collection continues, but the weighted-sum
influence stays blocked until 4+ weeks of data accumulate and IC > 0.05 is
confirmed. The automatic 0.8 → 0.0 reset of production DB rows serves the
same intent.
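
As a usage illustration — a hedged sketch, assuming the service's documented
host port 18500; the response fields follow the compute_ic() contract above:

# Hypothetical standalone script (not part of this commit): poll the IC
# endpoint and decide whether the ai_news weight is worth re-enabling.
import httpx

resp = httpx.get(
    "http://localhost:18500/api/stock/screener/ai-news/ic",
    params={"days": 30, "horizon": 1},
    timeout=30.0,
)
resp.raise_for_status()
ic = resp.json()

# verdict: skip (<10 valid days), weak (|ic_mean| <= 0.05), strong (|ic_mean| > 0.05)
if ic["verdict"] == "strong":
    print(f"IC mean {ic['ic_mean']:+.4f} over {ic['ic_count']} days — consider enabling")
else:
    print(f"verdict={ic['verdict']} — keep ai_news weight at 0.0")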

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 01:06:02 +09:00
06162b1e6e feat(ai_news): show stock name (ticker) in telegram top 5/5 2026-05-14 00:36:10 +09:00
c3659eb6c5 fix(ai_news): assistant prefill + temperature=0 + system prompt to force JSON 2026-05-14 00:26:48 +09:00
16941d76e8 fix(ai_news): escape MarkdownV2 reserved chars in score (+, -, .) 2026-05-14 00:17:53 +09:00
9f91dae1a4 feat(agent-office): add run_ai_news command for manual trigger 2026-05-13 23:59:30 +09:00
2a552d3cc8 test(screener): update node count test to 8 (ai_news added) 2026-05-13 23:52:54 +09:00
f37b21a408 fix(agent-office): on_ai_news_schedule — graceful fail on missing telegram_text 2026-05-13 23:48:59 +09:00
df7a8d985e feat(agent-office): cron mon-fri 08:00 ai_news sentiment job 2026-05-13 23:46:37 +09:00
c5d0c84183 feat(agent-office): on_ai_news_schedule (cron handler + telegram dispatch) 2026-05-13 23:46:17 +09:00
53a78a1062 feat(agent-office): refresh_ai_news_sentiment service helper 2026-05-13 23:45:51 +09:00
ca8bcb3fed feat(screener): POST /snapshot/refresh-news-sentiment with telegram_text 2026-05-13 23:44:38 +09:00
4b4f91c052 feat(screener): register ai_news in NODE_REGISTRY 2026-05-13 23:41:21 +09:00
6c3a84b8ec feat(screener): ScreenContext.news_sentiment field + load query 2026-05-13 23:41:01 +09:00
2ff2645240 feat(screener): AiNewsSentiment ScoreNode (percentile_rank + min_news_count) 2026-05-13 23:39:42 +09:00
f2143b3889 feat(screener): ai_news telegram message builder (MarkdownV2 + cost line) 2026-05-13 23:38:07 +09:00
810cc76d40 feat(screener): ai_news pipeline (top-100 parallel, fail-soft, upsert) 2026-05-13 23:36:03 +09:00
0a91f43c46 feat(screener): ai_news Claude Haiku analyzer (-10~+10 + clamp + JSON-fail soft) 2026-05-13 23:33:20 +09:00
3d321f2b4b chore(stock-lab): add pytest + pytest-asyncio to requirements 2026-05-13 23:30:47 +09:00
6ba29599aa feat(screener): ai_news scraper (naver finance ticker news) 2026-05-13 23:29:52 +09:00
658ed13571 feat(screener): add news_sentiment table + ai_news defaults + migration 2026-05-13 23:26:38 +09:00
15ee3c3301 fix(compose): add 6 labs missing from frontend.depends_on
lotto, stock-lab, agent-office, personal, packs-lab, and travel-proxy
were missing, so one container going down risked nginx upstream resolve
failures. Closes the risk made explicit by this cycle's lotto httpx incident.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 22:12:07 +09:00
2b5009f864 fix(sqlite): WAL + busy_timeout 120s standardized across all labs
Unifies the standard concurrency pattern across the _conn() function of 8 labs:
- timeout=120.0 (connection acquisition)
- PRAGMA journal_mode=WAL (separates readers from writers)
- PRAGMA busy_timeout=120000 (wait 120s on transaction conflicts)

Propagates the proven pattern from stock-lab/screener/router.py (d9b6122)
to lotto, stock-lab (main), music-lab, blog-lab, realestate-lab,
agent-office, personal, and travel-proxy. Absorbs the existing
'database is locked' error window.
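
A minimal standalone sketch of that pattern, matching the per-lab diffs
below (DB_PATH is a placeholder for each lab's own path):

import os
import sqlite3

DB_PATH = "/app/data/example.db"  # placeholder — each lab uses its own path

def _conn() -> sqlite3.Connection:
    os.makedirs(os.path.dirname(DB_PATH), exist_ok=True)
    # the 120s wait also applies to acquiring the connection itself
    conn = sqlite3.connect(DB_PATH, timeout=120.0)
    conn.row_factory = sqlite3.Row
    # WAL: readers no longer block on the writer
    conn.execute("PRAGMA journal_mode=WAL")
    # writers still serialize — wait up to 120s instead of failing instantly
    conn.execute("PRAGMA busy_timeout=120000")
    return conn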

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 22:12:01 +09:00
d9b612253a fix(stock-lab): cap snapshot flow at 100 tickers + busy_timeout 2 min (mitigate writer collisions)
Why the 16:30 KST automated job failed:
- agent-office httpx timeout is 180s
- but snapshot/refresh flow scraping (500 tickers × 0.2-0.5s) = 100~250s
- past 180s the client times out while the server keeps processing in the background
- the /run call that soon follows then collides on INSERT with the snapshot's long write transaction
- WAL only separates readers from writers; two writers serialize → the lock held past the 30s busy_timeout

Fix:
- DEFAULT_FLOW_TOP_N 500 → 100 (top 100 by market cap × 0.2s = ~20s)
- busy_timeout 30s → 120s (comfortably longer than the snapshot write)
- connect timeout 30s → 120s

The foreign-buy signal matters most in large caps, so the top 100 tickers
suffice. If broader coverage is ever needed, separate snapshot/refresh and
/run in time via a dedicated cron.
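
A self-contained sketch of the writer collision (hypothetical /tmp path,
for illustration only):

import sqlite3

db = "/tmp/demo.db"                       # hypothetical path
w1 = sqlite3.connect(db)
w1.execute("PRAGMA journal_mode=WAL")
w1.execute("CREATE TABLE IF NOT EXISTS t (x)")
w1.execute("BEGIN IMMEDIATE")             # the snapshot's long write transaction
w1.execute("INSERT INTO t VALUES (1)")

w2 = sqlite3.connect(db)                  # the /run writer
w2.execute("PRAGMA busy_timeout=0")       # old behavior: fail immediately
try:
    w2.execute("BEGIN IMMEDIATE")
except sqlite3.OperationalError as e:     # 'database is locked'
    print(e)

w2.execute("PRAGMA busy_timeout=120000")  # new behavior: wait up to 120s
w1.commit()                               # snapshot finishes inside the window
w2.execute("BEGIN IMMEDIATE")             # now acquires the write lock
w2.commit()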
2026-05-13 19:56:30 +09:00
db4322006d fix(stock-lab): screener DB connection WAL mode + busy_timeout 30s
Fixes /run mode=auto failing with a 500 'database is locked' right after
snapshot/refresh. With SQLite's default rollback-journal mode and
busy_timeout=0, a read overlapping a long write transaction raises
OperationalError immediately.

PRAGMA journal_mode=WAL: readers no longer block on the writer
PRAGMA busy_timeout=30000: wait 30s before timing out (no instant failure)
sqlite3.connect timeout=30: the same wait applies to connection acquisition

Stabilizes the agent-office 16:30 KST automated job flow.
2026-05-13 16:50:25 +09:00
a05e6ba8ca feat(stock-lab): full node labels + explicit won units in telegram
- Dropped the icons (👤외/🆙고/...) in favor of full Korean labels
  (외국인/거래량급증/20일모멘텀/52주신고가/RS레이팅/이평선정배열/VCP수축)
- Prices now state the won unit explicitly, e.g. "103,917원"
- Fallback text when no node is active
- Tests updated to the new format, plus new won-unit assertion cases
2026-05-13 07:52:17 +09:00
4a333434ac Merge feature/stock-screener-board: Stock Screener Board MVP (backend + agent-office)
stock-lab:
- data source switched pykrx → FDR/Naver (avoids KRX auth)
- schema: 7 tables + default seed
- snapshot.py (FDR master + daily candles, Naver foreign flow, top 500 by market cap)
- ScreenContext, ScoreNode/GateNode abstractions, percentile_rank
- 1 gate (HygieneGate) + 7 score nodes (ForeignBuy/VolumeSurge/Momentum20/
  High52WProximity/RsRating/MaAlignment/VcpLite)
- screener engine + combine + position_sizer (Wilder ATR) + telegram builder
- FastAPI router: /nodes, /settings, /run (preview/manual_save/auto),
  /snapshot/refresh, /runs (list/detail), skipped_holiday for holidays/weekends

agent-office:
- StockAgent.on_screener_schedule + run_screener command
- weekday 16:30 KST APScheduler cron (Asia/Seoul)
- service_proxy helpers, send_raw parse_mode extension (MarkdownV2 support)
- 5 new tests, 38 regression tests passing
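
On the shared ScoreNode contract: each node reduces to a per-ticker pandas
Series that percentile_rank maps onto a common 0-100 scale. A hedged sketch
only — the real implementation lives in nodes/base.py, which is not shown in
this diff, and tie handling may differ:

import pandas as pd

def percentile_rank(s: pd.Series) -> pd.Series:
    # rank(pct=True) yields (0, 1]; scale to 0-100 so node scores share one
    # axis and thresholds like ">= 70" are meaningful across nodes
    return s.rank(pct=True) * 100.0

raw = pd.Series({"005930": 3.2, "000660": -1.1, "035420": 0.4})
print(percentile_rank(raw))  # 005930 -> 100.0, 035420 -> ~66.7, 000660 -> ~33.3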
2026-05-13 07:23:43 +09:00
204cee67d6 fix(lotto): add httpx dependency for grade_weekly_review import
The real cause of the production site's nginx emerg 'host not found in
upstream lotto' was the lotto container itself failing to start with
ModuleNotFoundError: httpx. grade_weekly_review.py imports httpx, but it
was missing from requirements. After a rebuild the container boots normally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 08:03:34 +09:00
83 changed files with 1684 additions and 97 deletions

View File

@@ -7,7 +7,7 @@
## 1. 프로젝트 개요
Synology NAS 기반의 개인 웹 플랫폼 백엔드 모노레포.
- **서비스**: lotto-lab, stock-lab, travel-proxy, music-lab, blog-lab, realestate-lab, agent-office, personal, packs-lab, deployer (10개)
- **서비스**: lotto-lab, stock, travel-proxy, music-lab, blog-lab, realestate-lab, agent-office, personal, packs-lab, deployer (10개)
- **프론트엔드**: 별도 레포 (React + Vite SPA), 빌드 산출물만 NAS에 배포
- **인프라**: Docker Compose (10컨테이너) + Nginx(리버스 프록시) + Gitea Webhook 자동 배포
@@ -32,7 +32,7 @@ Synology NAS 기반의 개인 웹 플랫폼 백엔드 모노레포.
/volume1
├── docker/webpage/ # 운영 런타임 (Docker Compose 실행 위치)
│ ├── lotto/ # lotto 소스 (rsync 동기화)
│ ├── stock-lab/ # stock-lab 소스 (rsync 동기화)
│ ├── stock/ # stock 소스 (rsync 동기화)
│ ├── travel-proxy/ # travel-proxy 소스 (rsync 동기화)
│ ├── deployer/ # deployer 소스 (rsync 동기화)
│ ├── nginx/default.conf # Nginx 설정
@@ -54,7 +54,7 @@ Synology NAS 기반의 개인 웹 플랫폼 백엔드 모노레포.
| 컨테이너 | 포트 | 역할 |
|---------|------|------|
| `lotto` | 18000 | 로또 데이터 수집·분석·추천 API |
| `stock-lab` | 18500 | 주식 뉴스·AI 분석·KIS API 연동 |
| `stock` | 18500 | 주식 뉴스·AI 분석·KIS API 연동 |
| `music-lab` | 18600 | AI 음악 생성·라이브러리 관리 API |
| `blog-lab` | 18700 | 블로그 마케팅 수익화 API |
| `realestate-lab` | 18800 | 부동산 청약 자동 수집·매칭 API |
@@ -73,9 +73,9 @@ Synology NAS 기반의 개인 웹 플랫폼 백엔드 모노레포.
|------|------------|------|
| `/api/` | `lotto:8000` | lotto API (기본) |
| `/api/travel/` | `travel-proxy:8000` | travel API |
| `/api/stock/` | `stock-lab:8000` | stock API |
| `/api/trade/` | `stock-lab:8000` | KIS 실계좌 API |
| `/api/portfolio` | `stock-lab:8000` | trailing slash 유무 모두 매칭 |
| `/api/stock/` | `stock:8000` | stock API |
| `/api/trade/` | `stock:8000` | KIS 실계좌 API |
| `/api/portfolio` | `stock:8000` | trailing slash 유무 모두 매칭 |
| `/api/music/` | `music-lab:8000` | AI 음악 생성·라이브러리 API |
| `/api/blog-marketing/` | `blog-lab:8000` | 블로그 마케팅 수익화 API |
| `/api/realestate/` | `realestate-lab:8000` | 부동산 청약 API |
@@ -205,14 +205,14 @@ docker compose up -d
| GET | `/api/lotto/briefing/{draw_no}` | 특정 회차 브리핑 |
| GET | `/api/lotto/briefing` | 브리핑 이력 |
### stock-lab (stock-lab/)
### stock (stock/)
- Windows AI 서버 연동: `WINDOWS_AI_SERVER_URL=http://192.168.45.59:8000`
- KIS API 연동으로 실계좌 잔고·거래 조회
- 뉴스 스크래핑: 네이버 증권 + 해외 사이트
- DB: `/app/data/stock.db` (articles, portfolio, broker_cash, asset_snapshots, sell_history 테이블)
- 파일 구조: `main.py`, `db.py`, `scraper.py`, `price_fetcher.py`, `holidays.json`
**stock-lab API 목록**
**stock API 목록**
| 메서드 | 경로 | 설명 |
|--------|------|------|
@@ -512,7 +512,7 @@ docker compose up -d
### agent-office (agent-office/)
- AI 에이전트 가상 오피스 — 2D 픽셀아트 사무실에서 에이전트가 실제 작업 수행
- stock-lab/music-lab/realestate-lab 기존 API를 서비스 프록시로 호출 (직접 DB 접근 없음)
- stock/music-lab/realestate-lab 기존 API를 서비스 프록시로 호출 (직접 DB 접근 없음)
- 실시간 상태 동기화: WebSocket (`/api/agent-office/ws`)
- 텔레그램 봇: 양방향 알림 + 승인 (인라인 키보드)
- 청약 매칭 알림: realestate-lab이 신규 매칭 발견 시 push → `RealestateAgent.on_new_matches()` → 텔레그램 1통(인라인 [🔖 북마크]/[📄 공고] 또는 [전체 보기] 버튼)
@@ -522,7 +522,7 @@ docker compose up -d
**에이전트 FSM 상태**: idle → working → waiting (승인 대기) → reporting → break (휴식)
**환경변수**
- `STOCK_LAB_URL`: stock-lab 내부 URL (기본 `http://stock-lab:8000`)
- `STOCK_URL`: stock 내부 URL (기본 `http://stock:8000`)
- `MUSIC_LAB_URL`: music-lab 내부 URL (기본 `http://music-lab:8000`)
- `REALESTATE_LAB_URL`: realestate-lab 내부 URL (기본 `http://realestate-lab:8000`) — 북마크 콜백 프록시 대상
- `REALESTATE_DASHBOARD_URL`: 텔레그램 [전체 보기] 버튼 URL (기본 `http://localhost:8080/realestate`)
@@ -697,7 +697,7 @@ docker compose up -d
- **캐시 전략**: `index.html`은 `no-store`, `assets/`는 1년 장기 캐시(immutable)
- **Frontend 배포**: git push로 자동 배포되지 않음. 로컬 빌드 후 NAS에 수동 업로드
- **.env 파일**: 절대 커밋 금지. `.env.example`만 레포에 포함
- **공휴일 목록**: `stock-lab/app/holidays.json` 매년 수동 갱신 필요 (KRX 기준)
- **공휴일 목록**: `stock/app/holidays.json` 매년 수동 갱신 필요 (KRX 기준)
- **Windows AI 서버 IP**: `192.168.45.59` — 공유기 DHCP 고정 예약으로 고정. Tailscale은 Synology에서 TCP 불가(userspace 모드)라 로컬 IP 사용
- **현재가 조회**: 네이버 모바일 API → HTML 파싱 폴백, 3분 TTL 캐시 (`price_fetcher.py`)
- **시뮬레이션 교체 방식**: `best_picks`는 교체형 — 새 시뮬레이션 실행 시 `is_active=0`으로 비활성화 후 신규 입력

View File

@@ -13,8 +13,8 @@ Synology NAS 기반 개인 웹 플랫폼 백엔드 모노레포.
│ ├── 정적 SPA 서빙 (React + Vite) │
│ └── API 리버스 프록시 │
│ ├── /api/ → lotto-backend:8000 (로또·블로그·투두)│
│ ├── /api/stock/, /trade/ → stock-lab:8000 │
│ ├── /api/portfolio → stock-lab:8000 │
│ ├── /api/stock/, /trade/ → stock:8000 │
│ ├── /api/portfolio → stock:8000 │
│ ├── /api/music/ → music-lab:8000 │
│ ├── /api/blog-marketing/ → blog-lab:8000 │
│ ├── /api/realestate/ → realestate-lab:8000 │
@@ -29,7 +29,7 @@ Synology NAS 기반 개인 웹 플랫폼 백엔드 모노레포.
| 컨테이너 | 포트 | 역할 |
|---------|------|------|
| `lotto-backend` | 18000 | 로또 데이터 수집·분석·추천 + 블로그·투두 API |
| `stock-lab` | 18500 | 주식 뉴스·AI 요약·KIS 실계좌·포트폴리오·자산 추적 |
| `stock` | 18500 | 주식 뉴스·AI 요약·KIS 실계좌·포트폴리오·자산 추적 |
| `music-lab` | 18600 | AI 음악 생성 (Suno + 로컬 MusicGen 듀얼 프로바이더) |
| `blog-lab` | 18700 | 블로그 마케팅 수익화 (키워드→글 생성→리뷰→발행) |
| `realestate-lab` | 18800 | 청약 공고 자동 수집·프로필 매칭 |
@@ -45,7 +45,7 @@ Synology NAS 기반 개인 웹 플랫폼 백엔드 모노레포.
```
web-backend/
├── backend/ # lotto-backend (로또·블로그·투두)
├── stock-lab/ # 주식·포트폴리오
├── stock/ # 주식·포트폴리오
├── music-lab/ # AI 음악 생성
├── blog-lab/ # 블로그 마케팅 파이프라인
├── realestate-lab/ # 청약 자동 수집·매칭
@@ -75,7 +75,7 @@ curl http://localhost:18500/health
|--------|----------|
| Frontend + API | http://localhost:8080 |
| lotto-backend | http://localhost:18000 |
| stock-lab | http://localhost:18500 |
| stock | http://localhost:18500 |
| music-lab | http://localhost:18600 |
| blog-lab | http://localhost:18700 |
| realestate-lab | http://localhost:18800 |
@@ -99,7 +99,7 @@ curl http://localhost:18500/health
- 09:10 / 21:10 — 당첨번호 동기화 + 추천 채점
- 00:05, 04:05, 08:05, 12:05, 16:05, 20:05 — 몬테카를로 시뮬레이션 (후보 20,000 → 상위 100 → best_picks 20쌍 교체)
### 2. stock-lab (`/api/stock/`, `/api/trade/`, `/api/portfolio`)
### 2. stock (`/api/stock/`, `/api/trade/`, `/api/portfolio`)
주식 뉴스 스크래핑 + LLM 요약 + KIS 실계좌 연동 + 포트폴리오·자산 스냅샷.
@@ -152,7 +152,7 @@ curl http://localhost:18500/health
AI 에이전트 가상 오피스 — 2D 픽셀아트 사무실에서 4명의 에이전트가 실제 작업을 수행한다.
- **아키텍처**: stock-lab / music-lab / blog-lab / realestate-lab 기존 API를 서비스 프록시로 호출 (직접 DB 접근 없음)
- **아키텍처**: stock / music-lab / blog-lab / realestate-lab 기존 API를 서비스 프록시로 호출 (직접 DB 접근 없음)
- **FSM 상태**: `idle → working → waiting(승인 대기) → reporting → break`
- **실시간 동기화**: WebSocket `/api/agent-office/ws` (init, agent_state, task_complete, command_result)
- **텔레그램 연동**: 양방향 알림 + 인라인 키보드 승인
@@ -224,7 +224,7 @@ Gitea Webhook 수신 → NAS 자동 배포.
| 공동 출현 | 15% | 번호 쌍 동시 출현 빈도 |
| 다양성 | 10% | 연속번호·범위·구간 커버리지 |
### LLM 요약 provider 추상화 (stock-lab)
### LLM 요약 provider 추상화 (stock)
`ai_summarizer.py`는 provider 분리 구조. `summarize_news(articles)` 시그니처는 provider와 무관하게 고정.
@@ -232,7 +232,7 @@ Gitea Webhook 수신 → NAS 자동 배포.
- `_summarize_with_ollama`: Ollama `/api/generate` (타임아웃 180s, qwen3:14b 첫 로드 대응)
- 실패 시 `LLMError` (구 `OllamaError` alias 유지)
### 총 자산 스냅샷 (stock-lab)
### 총 자산 스냅샷 (stock)
평일 15:40 자동 실행 → `holidays.json`으로 공휴일 스킵 → 포트폴리오 현재가 조회 + 예수금 합계 → `asset_snapshots` upsert (date UNIQUE).
@@ -266,7 +266,7 @@ git push → Gitea → X-Gitea-Signature (HMAC SHA256)
| DB | 소유 서비스 | 주요 테이블 |
|----|------------|-----------|
| `lotto.db` | lotto-backend | draws, recommendations, simulation_runs/candidates, best_picks, purchase_history, strategy_performance/weights, weekly_reports, lotto_briefings, todos, blog_posts |
| `stock.db` | stock-lab | articles, portfolio, broker_cash, asset_snapshots, sell_history |
| `stock.db` | stock | articles, portfolio, broker_cash, asset_snapshots, sell_history |
| `music.db` | music-lab | music_tasks, music_library (provider, lyrics, image_url, suno_id, file_hash, cover_images, wav_url, video_url, stem_urls) |
| `blog_marketing.db` | blog-lab | keyword_analyses, blog_posts, brand_links, commissions, generation_tasks, prompt_templates |
| `realestate.db` | realestate-lab | announcements, announcement_models, user_profile, match_results, collect_log |
@@ -292,7 +292,7 @@ PGID=1000
WINDOWS_AI_SERVER_URL=http://192.168.45.59:8000
WEBHOOK_SECRET=your_secret_here
# LLM (stock-lab, blog-lab, agent-office 공통)
# LLM (stock, blog-lab, agent-office 공통)
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-haiku-4-5-20251001
LLM_PROVIDER=claude # claude | ollama
@@ -315,7 +315,7 @@ DATA_GO_KR_API_KEY=
TELEGRAM_BOT_TOKEN=
TELEGRAM_CHAT_ID=
TELEGRAM_WEBHOOK_URL=
STOCK_LAB_URL=http://stock-lab:8000
STOCK_URL=http://stock:8000
MUSIC_LAB_URL=http://music-lab:8000
BLOG_LAB_URL=http://blog-lab:8000
REALESTATE_LAB_URL=http://realestate-lab:8000
@@ -343,7 +343,7 @@ REALESTATE_LAB_URL=http://realestate-lab:8000
- **라우트 순서** — `DELETE /api/todos/done`은 `/api/todos/{id}`보다 먼저 등록 필수 (FastAPI prefix 매칭)
- **캐시 전략** — `index.html`: no-store / `assets/`: 1년 immutable
- **PUID/PGID** — travel-proxy는 NAS 파일 권한을 위해 환경변수 주입 필수
- **공휴일 목록** — `stock-lab/app/holidays.json` 매년 수동 갱신 (KRX 기준)
- **공휴일 목록** — `stock/app/holidays.json` 매년 수동 갱신 (KRX 기준)
- **Windows AI 서버 IP** — `192.168.45.59` 공유기 DHCP 고정 예약. Synology Tailscale은 userspace 모드라 TCP 불가 → 로컬 IP 사용
- **Suno CDN** — `cdn1.suno.ai` URL은 임시 만료 → 생성 즉시 로컬 다운로드 필수
- **LLM provider 롤백** — Claude API 장애 시 `.env`에서 `LLM_PROVIDER=ollama`로 전환 후 `docker compose up -d`

View File

@@ -12,7 +12,7 @@
| 서비스 | 포트 | 상태 | 핵심 기능 |
|--------|------|------|-----------|
| `lotto-backend` | 18000 | ✅ | 로또 추천·통계·리포트·구매내역 + 블로그·투두 |
| `stock-lab` | 18500 | ✅ | 주식 뉴스·지수·트레이딩·포트폴리오·자산 스냅샷 |
| `stock` | 18500 | ✅ | 주식 뉴스·지수·트레이딩·포트폴리오·자산 스냅샷 |
| `music-lab` | 18600 | ✅ | Suno + MusicGen + YouTube 수익화 + 컴파일 |
| `blog-lab` | 18700 | ✅ | 블로그 마케팅 수익화 파이프라인 |
| `realestate-lab` | 18800 | ✅ | 청약 수집·5티어 매칭·매칭 알림 |

View File

@@ -51,7 +51,7 @@ class StockAgent(BaseAgent):
await self.transition("working", "최신 뉴스 수집 중...", task_id)
try:
# stock-lab cron(매일 8:00)이 7:30 브리핑보다 늦게 돌아 어제 뉴스가
# stock cron(매일 8:00)이 7:30 브리핑보다 늦게 돌아 어제 뉴스가
# 요약되던 문제 방지 — 요약 직전에 동기 스크랩으로 DB를 갱신한다.
try:
await service_proxy.scrape_stock_news()
@@ -60,7 +60,7 @@ class StockAgent(BaseAgent):
await self.transition("working", "AI 뉴스 요약 생성 중...")
# AI 요약 호출 (LLM 처리는 stock-lab이 담당)
# AI 요약 호출 (LLM 처리는 stock이 담당)
result = await service_proxy.summarize_stock_news(limit=15)
await self.transition("reporting", "뉴스 요약 전송 중...")
@@ -233,11 +233,118 @@ class StockAgent(BaseAgent):
await self.transition("idle", f"스크리너 오류: {err_msg[:80]}")
async def on_ai_news_schedule(self) -> None:
"""AI 뉴스 sentiment 분석 자동 잡 (평일 08:00 KST).
흐름:
1) stock /snapshot/refresh-news-sentiment 호출
2) status='skipped_weekend'/'skipped_holiday' → 종료 (텔레그램 미발신)
3) updated=0 → 운영자 알림 (HTML)
4) failures > 30% → 경고 알림 후 메인 메시지 발송
5) 정상 → Top 5 호재/악재 메시지 발송 (MarkdownV2)
"""
if self.state not in ("idle", "break"):
return
task_id = create_task(self.agent_id, "ai_news_sentiment", {})
await self.transition("working", "AI 뉴스 분석 중...", task_id)
try:
result = await service_proxy.refresh_ai_news_sentiment()
except Exception as e:
err_msg = str(e)
add_log(self.agent_id, f"AI 뉴스 분석 실패: {err_msg}", "error", task_id)
update_task_status(task_id, "failed", {"error": err_msg})
try:
from ..telegram.messaging import send_raw
await send_raw(
f"⚠️ <b>AI 뉴스 분석 실패</b>\n"
f"<code>{html.escape(err_msg)[:500]}</code>"
)
except Exception as notify_err:
add_log(
self.agent_id,
f"operator notify failed: {notify_err}",
"warning", task_id,
)
await self.transition("idle", f"AI 뉴스 오류: {err_msg[:80]}")
return
status = result.get("status")
if status in ("skipped_weekend", "skipped_holiday"):
update_task_status(task_id, "succeeded", {"status": status})
add_log(self.agent_id, f"AI 뉴스 건너뜀: {status}", "info", task_id)
await self.transition("idle", "휴일/주말 — 건너뜀")
return
updated = int(result.get("updated", 0))
failures = result.get("failures", []) or []
if updated == 0:
update_task_status(task_id, "failed", {"reason": "0 tickers updated"})
try:
from ..telegram.messaging import send_raw
await send_raw(
"⚠️ <b>AI 뉴스 분석 0종목</b>\n"
"스크래핑/LLM 전체 실패 — 어제 데이터 사용"
)
except Exception:
pass
await self.transition("idle", "AI 뉴스 0건")
return
# 실패율 경고 (별도 알림, 본 메시지는 계속 발송)
failure_rate = len(failures) / max(1, updated + len(failures))
if failure_rate > 0.3:
try:
from ..telegram.messaging import send_raw
await send_raw(
f"⚠️ <b>AI 뉴스 실패율 {failure_rate:.0%}</b>\n"
f"updated={updated}, failures={len(failures)}"
)
except Exception:
pass
# 정상 — Top 5 메시지 (stock이 빌드해서 응답에 telegram_text 동봉)
text = result.get("telegram_text") or ""
if not text:
add_log(self.agent_id, "telegram_text 누락 — stock 응답 결함", "error", task_id)
update_task_status(task_id, "failed", {"error": "telegram_text 누락"})
await self.transition("idle", "AI 뉴스 응답 결함")
return
await self.transition("reporting", "AI 뉴스 알림 전송 중...")
from ..telegram.messaging import send_raw
tg = await send_raw(text, parse_mode="MarkdownV2")
update_task_status(task_id, "succeeded", {
"asof": result["asof"],
"updated": updated,
"failures": len(failures),
"tokens_input": int(result.get("tokens_input", 0)),
"tokens_output": int(result.get("tokens_output", 0)),
"telegram_sent": tg.get("ok", False),
})
if not tg.get("ok"):
desc = tg.get("description") or "unknown"
code = tg.get("error_code")
add_log(
self.agent_id,
f"AI news telegram send failed: [{code}] {desc}",
"warning", task_id,
)
await self.transition("idle", "AI 뉴스 완료")
async def on_command(self, command: str, params: dict) -> dict:
if command == "run_screener":
await self.on_screener_schedule()
return {"ok": True, "message": "스크리너 실행 트리거 완료"}
if command == "run_ai_news":
await self.on_ai_news_schedule()
return {"ok": True, "message": "AI 뉴스 분석 트리거 완료"}
if command == "test_telegram":
from ..telegram import send_agent_message
result = await send_agent_message(

View File

@@ -1,7 +1,7 @@
import os
# Service URLs (Docker internal network)
STOCK_LAB_URL = os.getenv("STOCK_LAB_URL", "http://localhost:18500")
STOCK_URL = os.getenv("STOCK_URL", "http://localhost:18500")
MUSIC_LAB_URL = os.getenv("MUSIC_LAB_URL", "http://localhost:18600")
BLOG_LAB_URL = os.getenv("BLOG_LAB_URL", "http://localhost:18700")
REALESTATE_LAB_URL = os.getenv("REALESTATE_LAB_URL", "http://localhost:18800")

View File

@@ -9,9 +9,10 @@ from .config import DB_PATH
def _conn() -> sqlite3.Connection:
os.makedirs(os.path.dirname(DB_PATH), exist_ok=True)
conn = sqlite3.connect(DB_PATH, timeout=10)
conn = sqlite3.connect(DB_PATH, timeout=120.0)
conn.row_factory = sqlite3.Row
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA busy_timeout=120000")
return conn

View File

@@ -19,6 +19,11 @@ async def _run_stock_screener():
if agent:
await agent.on_screener_schedule()
async def _run_stock_ai_news():
agent = AGENT_REGISTRY.get("stock")
if agent:
await agent.on_ai_news_schedule()
async def _run_blog_schedule():
agent = AGENT_REGISTRY.get("blog")
if agent:
@@ -54,6 +59,14 @@ def init_scheduler():
minute=30,
id="stock_screener",
)
scheduler.add_job(
_run_stock_ai_news,
"cron",
day_of_week="mon-fri",
hour=8,
minute=0,
id="stock_ai_news_sentiment",
)
scheduler.add_job(_run_blog_schedule, "cron", hour=10, minute=0, id="blog_pipeline")
scheduler.add_job(_run_lotto_schedule, "cron", day_of_week="mon", hour=9, minute=0, id="lotto_curate")
scheduler.add_job(_run_youtube_research, "cron", hour=9, minute=0, id="youtube_research")

View File

@@ -1,7 +1,7 @@
import httpx
from typing import Any, Dict, List, Optional
from .config import STOCK_LAB_URL, MUSIC_LAB_URL, BLOG_LAB_URL, REALESTATE_LAB_URL
from .config import STOCK_URL, MUSIC_LAB_URL, BLOG_LAB_URL, REALESTATE_LAB_URL
_client = httpx.AsyncClient(timeout=30.0)
@@ -9,23 +9,23 @@ async def fetch_stock_news(limit: int = 10, category: str = None) -> List[Dict[s
params = {"limit": limit}
if category:
params["category"] = category
resp = await _client.get(f"{STOCK_LAB_URL}/api/stock/news", params=params)
resp = await _client.get(f"{STOCK_URL}/api/stock/news", params=params)
resp.raise_for_status()
return resp.json()
async def fetch_stock_indices() -> Dict[str, Any]:
resp = await _client.get(f"{STOCK_LAB_URL}/api/stock/indices")
resp = await _client.get(f"{STOCK_URL}/api/stock/indices")
resp.raise_for_status()
return resp.json()
async def summarize_stock_news(limit: int = 15) -> Dict[str, Any]:
"""stock-lab의 AI 요약 엔드포인트 호출.
"""stock의 AI 요약 엔드포인트 호출.
반환: {"summary": str, "tokens": {...}, "model": str, "duration_ms": int, "article_count": int}
"""
# stock-lab 내부 Ollama 호출이 180s까지 가능하므로 여유있게 200s
# stock 내부 Ollama 호출이 180s까지 가능하므로 여유있게 200s
async with httpx.AsyncClient(timeout=200.0) as client:
resp = await client.post(
f"{STOCK_LAB_URL}/api/stock/news/summarize",
f"{STOCK_URL}/api/stock/news/summarize",
json={"limit": limit},
)
resp.raise_for_status()
@@ -33,18 +33,32 @@ async def summarize_stock_news(limit: int = 15) -> Dict[str, Any]:
async def refresh_screener_snapshot() -> Dict[str, Any]:
"""stock-lab의 KRX 일봉 스냅샷 갱신 (스크리너 실행 전 호출).
"""stock의 KRX 일봉 스냅샷 갱신 (스크리너 실행 전 호출).
네이버 금융 일괄 다운로드라 보통 30~120s, 여유있게 180s.
"""
async with httpx.AsyncClient(timeout=180.0) as client:
resp = await client.post(f"{STOCK_LAB_URL}/api/stock/screener/snapshot/refresh")
resp = await client.post(f"{STOCK_URL}/api/stock/screener/snapshot/refresh")
resp.raise_for_status()
return resp.json()
async def refresh_ai_news_sentiment() -> Dict[str, Any]:
"""stock의 AI 뉴스 sentiment 분석 트리거 (08:00 cron).
네이버 100종목 스크래핑 + Claude Haiku 100콜 병렬 = 약 30-60초.
여유있게 240s timeout.
"""
async with httpx.AsyncClient(timeout=240.0) as client:
resp = await client.post(
f"{STOCK_URL}/api/stock/screener/snapshot/refresh-news-sentiment"
)
resp.raise_for_status()
return resp.json()
async def run_stock_screener(mode: str = "auto") -> Dict[str, Any]:
"""stock-lab의 스크리너 실행.
"""stock의 스크리너 실행.
반환 status:
- 'skipped_holiday': 공휴일/주말 — telegram_payload 없음
@@ -53,7 +67,7 @@ async def run_stock_screener(mode: str = "auto") -> Dict[str, Any]:
"""
async with httpx.AsyncClient(timeout=180.0) as client:
resp = await client.post(
f"{STOCK_LAB_URL}/api/stock/screener/run",
f"{STOCK_URL}/api/stock/screener/run",
json={"mode": mode},
)
resp.raise_for_status()
@@ -61,13 +75,13 @@ async def run_stock_screener(mode: str = "auto") -> Dict[str, Any]:
async def scrape_stock_news() -> Dict[str, Any]:
"""stock-lab의 수동 뉴스 스크랩 트리거 — DB에 최신 뉴스 저장.
"""stock의 수동 뉴스 스크랩 트리거 — DB에 최신 뉴스 저장.
아침 브리핑 직전 호출하여 어제 데이터가 아닌 오늘 새벽 뉴스를 보장한다.
네이버 금융 단일 요청이라 보통 수 초 내 완료, 여유있게 60s.
"""
async with httpx.AsyncClient(timeout=60.0) as client:
resp = await client.post(f"{STOCK_LAB_URL}/api/stock/scrap")
resp = await client.post(f"{STOCK_URL}/api/stock/scrap")
resp.raise_for_status()
return resp.json()

View File

@@ -1,6 +1,6 @@
"""StockAgent.on_screener_schedule — 평일 16:30 KST 자동 잡 단위 테스트.
stock-lab HTTP 호출은 service_proxy mock, 텔레그램은 messaging.send_raw mock.
stock HTTP 호출은 service_proxy mock, 텔레그램은 messaging.send_raw mock.
"""
import os
import sys
@@ -138,7 +138,7 @@ def test_screener_run_failure_notifies_operator():
from app.telegram import messaging
fake_snap = AsyncMock(return_value={"status": "ok"})
fake_run = AsyncMock(side_effect=RuntimeError("stock-lab 500"))
fake_run = AsyncMock(side_effect=RuntimeError("stock 500"))
fake_send = AsyncMock(return_value={"ok": True, "message_id": 1})
with patch.object(service_proxy, "refresh_screener_snapshot", fake_snap), \

View File

@@ -8,9 +8,10 @@ from .config import DB_PATH
def _conn() -> sqlite3.Connection:
os.makedirs(os.path.dirname(DB_PATH), exist_ok=True)
conn = sqlite3.connect(DB_PATH)
conn = sqlite3.connect(DB_PATH, timeout=120.0)
conn.row_factory = sqlite3.Row
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA busy_timeout=120000")
return conn

View File

@@ -22,12 +22,12 @@ services:
timeout: 5s
retries: 3
stock-lab:
stock:
build:
context: ./stock-lab
context: ./stock
args:
APP_VERSION: ${APP_VERSION:-dev}
container_name: stock-lab
container_name: stock
restart: unless-stopped
ports:
- "18500:8000"
@@ -136,7 +136,7 @@ services:
environment:
- TZ=${TZ:-Asia/Seoul}
- CORS_ALLOW_ORIGINS=${CORS_ALLOW_ORIGINS:-http://localhost:3007,http://localhost:8080}
- STOCK_LAB_URL=http://stock-lab:8000
- STOCK_URL=http://stock:8000
- MUSIC_LAB_URL=http://music-lab:8000
- BLOG_LAB_URL=http://blog-lab:8000
- REALESTATE_LAB_URL=http://realestate-lab:8000
@@ -157,7 +157,7 @@ services:
volumes:
- ${RUNTIME_PATH:-.}/data/agent-office:/app/data
depends_on:
- stock-lab
- stock
- music-lab
- blog-lab
- realestate-lab
@@ -241,9 +241,15 @@ services:
container_name: frontend
restart: unless-stopped
depends_on:
- lotto
- stock
- music-lab
- blog-lab
- realestate-lab
- agent-office
- personal
- packs-lab
- travel-proxy
ports:
- "8080:80"
volumes:

View File

@@ -9,8 +9,10 @@ DB_PATH = "/app/data/lotto.db"
def _conn() -> sqlite3.Connection:
os.makedirs(os.path.dirname(DB_PATH), exist_ok=True)
conn = sqlite3.connect(DB_PATH)
conn = sqlite3.connect(DB_PATH, timeout=120.0)
conn.row_factory = sqlite3.Row
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA busy_timeout=120000")
return conn
def _ensure_column(conn: sqlite3.Connection, table: str, col: str, ddl: str) -> None:

View File

@@ -1,5 +1,6 @@
fastapi==0.115.6
uvicorn[standard]==0.30.6
requests==2.32.3
httpx==0.27.2
beautifulsoup4==4.12.3
APScheduler==3.10.4

View File

@@ -9,8 +9,10 @@ DB_PATH = "/app/data/music.db"
def _conn() -> sqlite3.Connection:
os.makedirs(os.path.dirname(DB_PATH), exist_ok=True)
conn = sqlite3.connect(DB_PATH)
conn = sqlite3.connect(DB_PATH, timeout=120.0)
conn.row_factory = sqlite3.Row
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA busy_timeout=120000")
return conn

View File

@@ -139,17 +139,17 @@ server {
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_pass http://stock-lab:8000/api/stock/;
proxy_pass http://stock:8000/api/stock/;
}
# trade API (Stock Lab Proxy)
# trade API (Stock Proxy)
location /api/trade/ {
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_pass http://stock-lab:8000/api/trade/;
proxy_pass http://stock:8000/api/trade/;
}
# blog-marketing API
@@ -166,14 +166,14 @@ server {
proxy_pass http://$blog_backend$request_uri;
}
# portfolio API (Stock Lab) — trailing slash 유무 모두 매칭
# portfolio API (Stock) — trailing slash 유무 모두 매칭
location /api/portfolio {
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_pass http://stock-lab:8000/api/portfolio;
proxy_pass http://stock:8000/api/portfolio;
}

View File

@@ -9,9 +9,10 @@ DB_PATH = "/app/data/personal.db"
def _conn():
c = sqlite3.connect(DB_PATH, timeout=10)
c = sqlite3.connect(DB_PATH, timeout=120.0)
c.row_factory = sqlite3.Row
c.execute("PRAGMA journal_mode=WAL;")
c.execute("PRAGMA busy_timeout=120000;")
c.execute("PRAGMA foreign_keys=ON;")
return c

View File

@@ -12,9 +12,10 @@ DB_PATH = os.getenv("REALESTATE_DB_PATH", "/app/data/realestate.db")
def _conn():
c = sqlite3.connect(DB_PATH, timeout=10)
c = sqlite3.connect(DB_PATH, timeout=120.0)
c.row_factory = sqlite3.Row
c.execute("PRAGMA journal_mode=WAL;")
c.execute("PRAGMA busy_timeout=120000;")
c.execute("PRAGMA foreign_keys=ON;")
return c

View File

@@ -2,7 +2,7 @@
set -euo pipefail
# ── 서비스 목록 (한 곳에서만 관리) ──
SERVICES="lotto travel-proxy deployer stock-lab music-lab blog-lab realestate-lab agent-office personal packs-lab nginx scripts"
SERVICES="lotto travel-proxy deployer stock music-lab blog-lab realestate-lab agent-office personal packs-lab nginx scripts"
# 1. 자동 감지: Docker 컨테이너 내부인가?
if [ -d "/repo" ] && [ -d "/runtime" ]; then

View File

@@ -7,11 +7,11 @@ flock -n 200 || { echo "Deploy already running, skipping"; exit 0; }
# ── 서비스 목록 (한 곳에서만 관리) ──
# docker compose 서비스명 (deployer 제외 — 자기 자신을 재빌드하면 스크립트 중단)
BUILD_TARGETS="lotto travel-proxy stock-lab music-lab blog-lab realestate-lab agent-office personal packs-lab frontend"
BUILD_TARGETS="lotto travel-proxy stock music-lab blog-lab realestate-lab agent-office personal packs-lab frontend"
# 컨테이너 이름 (고아 정리용)
CONTAINER_NAMES="lotto stock-lab music-lab blog-lab realestate-lab agent-office personal packs-lab travel-proxy frontend"
CONTAINER_NAMES="lotto stock music-lab blog-lab realestate-lab agent-office personal packs-lab travel-proxy frontend"
# 헬스체크 대상
HEALTH_ENDPOINTS="lotto stock-lab travel-proxy music-lab blog-lab realestate-lab agent-office personal packs-lab"
HEALTH_ENDPOINTS="lotto stock travel-proxy music-lab blog-lab realestate-lab agent-office personal packs-lab"
# data 디렉토리 (packs-lab은 별도 media/packs 사용)
DATA_DIRS="music stock blog realestate agent-office personal"

View File

@@ -14,7 +14,7 @@ from typing import List, Dict, Any
import httpx
logger = logging.getLogger("stock-lab.ai_summarizer")
logger = logging.getLogger("stock.ai_summarizer")
LLM_PROVIDER = os.getenv("LLM_PROVIDER", "claude").lower().strip()

View File

@@ -12,8 +12,10 @@ def _conn() -> sqlite3.Connection:
parent = os.path.dirname(db_path)
if parent:
os.makedirs(parent, exist_ok=True)
conn = sqlite3.connect(db_path)
conn = sqlite3.connect(db_path, timeout=120.0)
conn.row_factory = sqlite3.Row
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA busy_timeout=120000")
return conn
def init_db():

View File

@@ -11,7 +11,7 @@ from apscheduler.schedulers.background import BackgroundScheduler
from pydantic import BaseModel
logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(name)s] %(levelname)s %(message)s")
logger = logging.getLogger("stock-lab")
logger = logging.getLogger("stock")
from .db import (
init_db, save_articles, get_latest_articles,

View File

@@ -4,7 +4,7 @@ from bs4 import BeautifulSoup
from typing import List, Dict, Any
import time
logger = logging.getLogger("stock-lab.scraper")
logger = logging.getLogger("stock.scraper")
# 네이버 파이낸스 주요 뉴스
NAVER_FINANCE_NEWS_URL = "https://finance.naver.com/news/mainnews.naver"

View File

@@ -0,0 +1,103 @@
"""Claude Haiku 기반 종목 뉴스 호재/악재 분석."""
from __future__ import annotations
import json
import logging
import os
from typing import Any, Dict, List
log = logging.getLogger(__name__)
DEFAULT_MODEL = os.getenv("AI_NEWS_MODEL", "claude-haiku-4-5-20251001")
PROMPT_TEMPLATE = """다음은 종목 {name}({ticker})에 대한 최근 뉴스 {n}개의 헤드라인입니다.
{news_block}
이 뉴스들이 종목에 호재인지 악재인지 평가하세요.
score: -10(매우 강한 악재) ~ +10(매우 강한 호재) 사이의 실수. 0은 중립.
reason: 30자 이내 한 줄 근거.
JSON으로만 응답하세요. 다른 텍스트 금지:
{{"score": <float>, "reason": "<string>"}}"""
def _clamp(x: float, lo: float = -10.0, hi: float = 10.0) -> float:
return max(lo, min(hi, x))
def _format_news_block(news: List[Dict[str, Any]]) -> str:
"""news dict 리스트 → prompt 에 들어가는 텍스트 블록.
summary 가 있으면 title 다음 줄에 indent 해서 포함 (최대 200자).
pub_date 가 있으면 title 앞에 표시.
"""
lines: List[str] = []
for n in news:
date = (n.get("pub_date") or "").strip()
title = (n.get("title") or "").strip()
summary = (n.get("summary") or "").strip()
prefix = f"[{date}] " if date else ""
if summary:
lines.append(f"- {prefix}{title}\n {summary[:200]}")
else:
lines.append(f"- {prefix}{title}")
return "\n".join(lines)
async def score_sentiment(
llm,
ticker: str,
news: List[Dict[str, Any]],
*,
name: str | None = None,
model: str = DEFAULT_MODEL,
) -> Dict[str, Any]:
"""Returns {ticker, score_raw, reason, news_count, tokens_input, tokens_output, model}."""
news_block = _format_news_block(news)
prompt = PROMPT_TEMPLATE.format(
name=name or ticker, ticker=ticker,
n=len(news), news_block=news_block,
)
resp = await llm.messages.create(
model=model,
max_tokens=200,
temperature=0,
system="너는 한국 주식 뉴스 감성 분석가다. JSON 객체 하나만 반환한다.",
messages=[
{"role": "user", "content": prompt},
# Assistant prefill — 첫 토큰을 강제로 '{' 로 시작해 JSON 응답을 보장
{"role": "assistant", "content": "{"},
],
)
raw = resp.content[0].text if resp.content else ""
# prefill '{' 이 응답에 포함되지 않으므로 다시 붙임
text = "{" + raw if not raw.lstrip().startswith("{") else raw
in_tokens = int(getattr(resp.usage, "input_tokens", 0) or 0)
out_tokens = int(getattr(resp.usage, "output_tokens", 0) or 0)
try:
data = json.loads(text)
score = _clamp(float(data["score"]))
reason = str(data["reason"])[:200]
return {
"ticker": ticker,
"score_raw": score,
"reason": reason,
"news_count": len(news),
"tokens_input": in_tokens,
"tokens_output": out_tokens,
"model": model,
}
except (json.JSONDecodeError, KeyError, TypeError, ValueError) as e:
log.warning("ai_news parse fail for %s: %s (raw=%r)", ticker, e, text[:100])
return {
"ticker": ticker,
"score_raw": 0.0,
"reason": f"parse fail: {e!s}"[:200],
"news_count": len(news),
"tokens_input": in_tokens,
"tokens_output": out_tokens,
"model": model,
}

View File

@@ -0,0 +1,70 @@
"""기존 articles 테이블에서 종목별 뉴스 매핑."""
from __future__ import annotations
import datetime as dt
import logging
import sqlite3
from typing import Any, Dict, List, Tuple
log = logging.getLogger(__name__)
def gather_articles_for_tickers(
conn: sqlite3.Connection,
tickers: List[str],
asof: dt.date,
*,
window_days: int = 1,
max_per_ticker: int = 5,
) -> Tuple[Dict[str, List[Dict[str, Any]]], Dict[str, int]]:
"""articles 에서 ticker.name substring 매칭으로 종목별 뉴스 dict 반환.
Returns:
(
{ticker: [{"title": str, "summary": str, "press": str, "pub_date": str}, ...]},
{"total_articles": int, "matched_pairs": int, "hit_tickers": int},
)
"""
out: Dict[str, List[Dict[str, Any]]] = {t: [] for t in tickers}
stats = {"total_articles": 0, "matched_pairs": 0, "hit_tickers": 0}
if not tickers:
return out, stats
cutoff = (asof - dt.timedelta(days=window_days)).isoformat()
placeholders = ",".join("?" * len(tickers))
name_rows = conn.execute(
f"SELECT ticker, name FROM krx_master WHERE ticker IN ({placeholders})",
tickers,
).fetchall()
# 2글자 미만 회사명은 false positive 위험으로 제외
name_map = {r[0]: r[1] for r in name_rows if r[1] and len(r[1]) >= 2}
articles = conn.execute(
"SELECT title, summary, press, pub_date, crawled_at "
"FROM articles WHERE crawled_at >= ? ORDER BY crawled_at DESC",
(cutoff,),
).fetchall()
stats["total_articles"] = len(articles)
for a in articles:
title = (a[0] or "").strip()
summary = (a[1] or "").strip()
haystack = title + " " + summary
for ticker, name in name_map.items():
if name not in haystack:
continue
if len(out[ticker]) >= max_per_ticker:
continue
out[ticker].append({
"title": title,
"summary": summary,
"press": a[2] or "",
"pub_date": a[3] or "",
})
stats["matched_pairs"] += 1
stats["hit_tickers"] = sum(1 for arts in out.values() if arts)
return out, stats

View File

@@ -0,0 +1,141 @@
"""ai_news refresh pipeline — 시총 상위 N종목 병렬 처리."""
from __future__ import annotations
import asyncio
import datetime as dt
import logging
import os
import sqlite3
import time
from typing import Any, Dict, List
from . import scraper as _scraper # legacy, kept for backward import
from . import analyzer as _analyzer
from . import articles_source # 신규
log = logging.getLogger(__name__)
DEFAULT_TOP_N = 100
DEFAULT_CONCURRENCY = 10
DEFAULT_NEWS_PER_TICKER = 5
def _top_market_cap_tickers(conn: sqlite3.Connection, n: int) -> List[str]:
rows = conn.execute(
"SELECT ticker FROM krx_master "
"WHERE market_cap IS NOT NULL AND is_preferred=0 AND is_spac=0 "
"ORDER BY market_cap DESC LIMIT ?",
(n,),
).fetchall()
return [r[0] for r in rows]
def _make_llm():
"""Anthropic AsyncClient — env에 ANTHROPIC_API_KEY 필수."""
from anthropic import AsyncAnthropic
return AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
async def _process_one(
ticker: str, name: str, articles: List[Dict[str, Any]],
sem: asyncio.Semaphore, llm, model: str,
) -> Dict[str, Any]:
async with sem:
return await _analyzer.score_sentiment(
llm, ticker, articles, name=name, model=model,
)
def _upsert_news_sentiment(
conn: sqlite3.Connection, asof: dt.date,
rows: List[Dict[str, Any]], *, source: str = "articles",
) -> None:
iso = asof.isoformat()
data = [
(
r["ticker"], iso, r["score_raw"], r["reason"], r["news_count"],
r["tokens_input"], r["tokens_output"], r["model"], source,
)
for r in rows
]
conn.executemany(
"""INSERT INTO news_sentiment
(ticker, date, score_raw, reason, news_count,
tokens_input, tokens_output, model, source)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
ON CONFLICT(ticker, date) DO UPDATE SET
score_raw=excluded.score_raw,
reason=excluded.reason,
news_count=excluded.news_count,
tokens_input=excluded.tokens_input,
tokens_output=excluded.tokens_output,
model=excluded.model,
source=excluded.source
""",
data,
)
conn.commit()
async def refresh_daily(
conn: sqlite3.Connection,
asof: dt.date,
*,
top_n: int = DEFAULT_TOP_N,
concurrency: int = DEFAULT_CONCURRENCY,
max_news_per_ticker: int = DEFAULT_NEWS_PER_TICKER,
window_days: int = 1,
model: str = _analyzer.DEFAULT_MODEL,
) -> Dict[str, Any]:
started = time.time()
tickers = _top_market_cap_tickers(conn, n=top_n)
name_map = {
r[0]: r[1] for r in conn.execute(
f"SELECT ticker, name FROM krx_master WHERE ticker IN "
f"({','.join('?' * len(tickers))})", tickers,
).fetchall()
} if tickers else {}
articles_by_ticker, mapping_stats = articles_source.gather_articles_for_tickers(
conn, tickers, asof,
window_days=window_days,
max_per_ticker=max_news_per_ticker,
)
sem = asyncio.Semaphore(concurrency)
async with _make_llm() as llm:
tasks = []
for t in tickers:
arts = articles_by_ticker.get(t, [])
if not arts:
continue # 매핑 0 — score 미생성
tasks.append(_process_one(t, name_map.get(t, t), arts, sem, llm, model))
raw_results = await asyncio.gather(*tasks, return_exceptions=True)
successes: List[Dict[str, Any]] = []
failures: List[str] = []
for r in raw_results:
if isinstance(r, BaseException):
failures.append(repr(r))
elif isinstance(r, dict):
successes.append(r)
if successes:
_upsert_news_sentiment(conn, asof, successes, source="articles")
top_pos = sorted(successes, key=lambda r: -r["score_raw"])[:5]
top_neg = sorted(successes, key=lambda r: r["score_raw"])[:5]
return {
"asof": asof.isoformat(),
"updated": len(successes),
"failures": failures,
"duration_sec": round(time.time() - started, 2),
"tokens_input": sum(r["tokens_input"] for r in successes),
"tokens_output": sum(r["tokens_output"] for r in successes),
"top_pos": top_pos,
"top_neg": top_neg,
"model": model,
"mapping": mapping_stats,
}

View File

@@ -0,0 +1,46 @@
"""[DEPRECATED] 네이버 finance 종목 뉴스 스크래핑.
본 모듈은 ai_news Phase 1 (2026-05-14) 에서 더 이상 파이프라인에서 사용되지 않음.
데이터 소스는 stock 의 articles 테이블 (ai_news/articles_source.py) 로 전환됨.
삭제 시점: Phase 2 (DART 도입) 결정 후. IC 검증 4주 누적 후 노드 활성화
여부에 따라 본 모듈을 (a) 완전 삭제 또는 (b) ensemble fallback 으로 재활용.
"""
from __future__ import annotations
import logging
from typing import Any, Dict, List
from bs4 import BeautifulSoup
log = logging.getLogger(__name__)
NAVER_NEWS_URL = "https://finance.naver.com/item/news_news.naver"
NAVER_HEADERS = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
"Referer": "https://finance.naver.com/",
}
async def fetch_news(client, ticker: str, n: int = 5) -> List[Dict[str, Any]]:
"""Scrape top N news headlines for a ticker. Returns [] on any failure."""
try:
r = await client.get(NAVER_NEWS_URL, params={"code": ticker, "page": 1})
except Exception as e:
log.warning("ai_news scrape http error for %s: %s", ticker, e)
return []
if r.status_code != 200:
return []
soup = BeautifulSoup(r.text, "lxml")
out: List[Dict[str, Any]] = []
for row in soup.select("table.type5 tbody tr")[:n]:
title_el = row.select_one("td.title a")
date_el = row.select_one("td.date")
if not title_el or not date_el:
continue
out.append({
"title": title_el.get_text(strip=True),
"date": date_el.get_text(strip=True),
})
return out

View File

@@ -0,0 +1,73 @@
"""ai_news Top 5/5 텔레그램 메시지 빌더 (MarkdownV2)."""
from __future__ import annotations
from typing import Any, Dict, List
_MD_SPECIAL = r"_*[]()~`>#+-=|{}.!\\"
def _escape(text: str) -> str:
return "".join("\\" + c if c in _MD_SPECIAL else c for c in str(text))
def _cost_won(tokens_input: int, tokens_output: int) -> int:
"""Claude Haiku 가격 환산 (대략): in $1/M × ₩1300, out $5/M × ₩1300."""
return int(tokens_input * 0.0013 + tokens_output * 0.0065)
def _row_line(idx: int, r: Dict[str, Any]) -> str:
score = r["score_raw"]
# score 문자열 자체를 _escape 통과 — '+', '-', '.' 모두 MarkdownV2 reserved
score_str = _escape(f"{score:+.1f}")
name = r.get("name") or ""
ticker = r["ticker"]
label = (
f"{_escape(name)} \\({_escape(ticker)}\\)"
if name else _escape(ticker)
)
return f"{idx}\\. {label} \\({score_str}\\) — {_escape(r['reason'])}"
def build_message(
*,
asof: str,
top_pos: List[Dict[str, Any]],
top_neg: List[Dict[str, Any]],
tokens_input: int,
tokens_output: int,
mapping: Dict[str, int] | None = None,
) -> str:
lines: List[str] = [
f"🌅 *AI 뉴스 분석* \\({_escape(asof)} 08:00\\)",
"",
"📈 *호재 Top 5*",
]
if top_pos:
for i, r in enumerate(top_pos, 1):
lines.append(_row_line(i, r))
else:
lines.append(_escape("- (없음)"))
lines += ["", "📉 *악재 Top 5*"]
if top_neg:
for i, r in enumerate(top_neg, 1):
lines.append(_row_line(i, r))
else:
lines.append(_escape("- (없음)"))
cost = _cost_won(tokens_input, tokens_output)
mapping_part = ""
if mapping:
mapping_part = (
f"매핑 {mapping['hit_tickers']}/100 ticker "
f"\\({mapping['matched_pairs']}쌍 / articles {mapping['total_articles']}\\) · "
)
lines += [
"",
f"_분석: 시총 상위 100종목 · {mapping_part}"
f"토큰 {tokens_input:,} in / {tokens_output:,} out · "
f"약 ₩{cost:,}_",
]
return "\n".join(lines)

View File

@@ -0,0 +1,125 @@
"""AI news sentiment validation — Spearman IC vs forward returns.
핵심 metric: 일자별 score_raw 와 다음 N일 forward return 의 Spearman 상관.
4주+ 누적 후 IC mean > 0.05 면 weight 활성화 가치 있음.
"""
from __future__ import annotations
import datetime as dt
import sqlite3
from typing import Any, Dict, List, Optional
import pandas as pd
def _spearman(a: pd.Series, b: pd.Series) -> Optional[float]:
"""Spearman rank correlation. None if insufficient/degenerate data."""
if len(a) < 5 or len(b) < 5:
return None
if a.std(ddof=0) == 0 or b.std(ddof=0) == 0:
return None
return float(a.rank().corr(b.rank()))
def compute_ic(
conn: sqlite3.Connection,
*,
days: int = 30,
horizon: int = 1,
min_news_count: int = 1,
asof_today: Optional[dt.date] = None,
) -> Dict[str, Any]:
"""Compute daily Spearman IC of ai_news.score_raw vs forward return.
Returns:
{
"horizon_days": int,
"min_news_count": int,
"window_days": int,
"ic_count": int, # 유효 일수
"ic_mean": float | None,
"ic_std": float | None,
"ic_per_day": [{"date": "YYYY-MM-DD", "ic": float, "n": int}, ...],
"verdict": "skip" | "weak" | "strong",
}
verdict:
- skip: ic_count < 10
- weak: ic_mean in [-0.05, 0.05]
- strong: |ic_mean| > 0.05
"""
asof_today = asof_today or dt.date.today()
cutoff = (asof_today - dt.timedelta(days=days)).isoformat()
sentiment = pd.read_sql_query(
"SELECT ticker, date, score_raw, news_count "
"FROM news_sentiment WHERE date >= ? AND news_count >= ? ORDER BY date",
conn, params=(cutoff, min_news_count),
)
if sentiment.empty:
return _empty_result(days, horizon, min_news_count)
# forward return 조회: 각 (ticker, date) 에 대해 close[date+horizon] / close[date] - 1
prices = pd.read_sql_query(
"SELECT ticker, date, close FROM krx_daily_prices "
"WHERE date >= ? ORDER BY ticker, date",
conn, params=(cutoff,),
)
if prices.empty:
return _empty_result(days, horizon, min_news_count)
prices = prices.sort_values(["ticker", "date"])
prices["fwd_close"] = prices.groupby("ticker", group_keys=False)["close"].shift(-horizon)
prices["fwd_ret"] = prices["fwd_close"] / prices["close"] - 1.0
merged = sentiment.merge(
prices[["ticker", "date", "fwd_ret"]], on=["ticker", "date"], how="inner"
)
merged = merged.dropna(subset=["fwd_ret"])
if merged.empty:
return _empty_result(days, horizon, min_news_count)
ic_rows: List[Dict[str, Any]] = []
for date, grp in merged.groupby("date"):
ic = _spearman(grp["score_raw"], grp["fwd_ret"])
if ic is not None:
ic_rows.append({"date": date, "ic": ic, "n": int(len(grp))})
if not ic_rows:
return _empty_result(days, horizon, min_news_count)
ic_series = pd.Series([r["ic"] for r in ic_rows], dtype=float)
ic_mean = float(ic_series.mean())
ic_std = float(ic_series.std(ddof=0)) if len(ic_series) > 1 else 0.0
if len(ic_rows) < 10:
verdict = "skip"
elif abs(ic_mean) > 0.05:
verdict = "strong"
else:
verdict = "weak"
return {
"horizon_days": horizon,
"min_news_count": min_news_count,
"window_days": days,
"ic_count": len(ic_rows),
"ic_mean": round(ic_mean, 4),
"ic_std": round(ic_std, 4),
"ic_per_day": ic_rows,
"verdict": verdict,
}
def _empty_result(days: int, horizon: int, min_news_count: int) -> Dict[str, Any]:
return {
"horizon_days": horizon,
"min_news_count": min_news_count,
"window_days": days,
"ic_count": 0,
"ic_mean": None,
"ic_std": None,
"ic_per_day": [],
"verdict": "skip",
}

View File

@@ -17,6 +17,7 @@ class ScreenContext:
flow: pd.DataFrame # cols: ticker,date,foreign_net,institution_net
kospi: pd.Series # index=date(str), name="kospi"
asof: dt.date
news_sentiment: "pd.DataFrame | None" = None
@classmethod
def load(cls, conn: sqlite3.Connection, asof: dt.date,
@@ -38,6 +39,10 @@ class ScreenContext:
"FROM krx_flow WHERE date BETWEEN ? AND ? ORDER BY date",
conn, params=(cutoff, asof_iso),
)
news_sentiment = pd.read_sql_query(
"SELECT ticker, score_raw, news_count FROM news_sentiment WHERE date = ?",
conn, params=(asof_iso,),
)
# KOSPI 지수: MVP에서는 005930(삼성전자) 종가를 시장 대용으로 사용.
# 후속 슬라이스에서 ^KS11 별도 캐시.
@@ -47,7 +52,8 @@ class ScreenContext:
kospi = sub.copy()
kospi.name = "kospi"
return cls(master=master, prices=prices, flow=flow, kospi=kospi, asof=asof)
return cls(master=master, prices=prices, flow=flow, kospi=kospi, asof=asof,
news_sentiment=news_sentiment)
def restrict(self, tickers) -> "ScreenContext":
tickers = pd.Index(tickers)

View File

@@ -0,0 +1,36 @@
"""AI 뉴스 호재/악재 점수 노드.
ScreenContext.news_sentiment (DataFrame: ticker, score_raw, news_count) 를
min_news_count 로 필터한 뒤 percentile_rank 로 0~100 변환.
"""
from __future__ import annotations
import pandas as pd
from .base import ScoreNode, percentile_rank
class AiNewsSentiment(ScoreNode):
name = "ai_news"
label = "AI 뉴스 호재/악재"
default_params = {"min_news_count": 1}
param_schema = {
"type": "object",
"properties": {
"min_news_count": {
"type": "integer", "minimum": 0, "default": 1,
"description": "최소 분석 뉴스 수. 미만이면 점수 미산출.",
},
},
}
def compute(self, ctx, params: dict) -> pd.Series:
df = getattr(ctx, "news_sentiment", None)
if df is None or df.empty:
return pd.Series(dtype=float)
min_news = int(params.get("min_news_count", 1))
df = df[df["news_count"] >= min_news]
if df.empty:
return pd.Series(dtype=float)
return percentile_rank(df.set_index("ticker")["score_raw"])

View File

@@ -8,6 +8,7 @@ from .nodes.high52w import High52WProximity
from .nodes.rs_rating import RsRating
from .nodes.ma_alignment import MaAlignment
from .nodes.vcp_lite import VcpLite
from .nodes.ai_news import AiNewsSentiment
NODE_REGISTRY: dict = {
"foreign_buy": ForeignBuy,
@@ -17,6 +18,7 @@ NODE_REGISTRY: dict = {
"rs_rating": RsRating,
"ma_alignment": MaAlignment,
"vcp_lite": VcpLite,
"ai_news": AiNewsSentiment,
}
GATE_REGISTRY: dict = {

View File

@@ -45,7 +45,13 @@ def _db_path() -> str:
def _conn() -> sqlite3.Connection:
return sqlite3.connect(_db_path())
# WAL 모드 + busy_timeout으로 동시 read/write lock 회피
# WAL은 reader vs writer 동시성만 해결 — writer 두 명은 직렬이므로 busy_timeout이
# snapshot/refresh의 write 시간보다 길어야 함 (네이버 스크래핑 ~20초 + DB upsert).
conn = sqlite3.connect(_db_path(), timeout=120.0)
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA busy_timeout=120000")
return conn
# ---------- /nodes ----------
@@ -270,6 +276,61 @@ def list_runs(limit: int = 30):
]
# ---------- /snapshot/refresh-news-sentiment ----------
from .ai_news import pipeline as _ai_pipeline
from .ai_news import telegram as _ai_telegram
from .ai_news import validation as _ai_validation
@router.post("/snapshot/refresh-news-sentiment")
async def post_refresh_news_sentiment(asof: Optional[str] = None):
asof_date = dt.date.fromisoformat(asof) if asof else dt.date.today()
if asof_date.weekday() >= 5:
return {"asof": asof_date.isoformat(), "status": "skipped_weekend"}
if _is_holiday(asof_date):
return {"asof": asof_date.isoformat(), "status": "skipped_holiday"}
with _conn() as c:
summary = await _ai_pipeline.refresh_daily(c, asof_date)
# top_pos/top_neg 항목에 종목명 주입 (텔레그램 가독성)
tickers = {r["ticker"] for r in summary["top_pos"] + summary["top_neg"]}
if tickers:
placeholders = ",".join("?" * len(tickers))
name_map = {
row[0]: row[1] for row in c.execute(
f"SELECT ticker, name FROM krx_master WHERE ticker IN ({placeholders})",
list(tickers),
).fetchall()
}
for r in summary["top_pos"] + summary["top_neg"]:
r["name"] = name_map.get(r["ticker"], "")
summary["telegram_text"] = _ai_telegram.build_message(
asof=summary["asof"],
top_pos=summary["top_pos"], top_neg=summary["top_neg"],
tokens_input=summary["tokens_input"],
tokens_output=summary["tokens_output"],
mapping=summary.get("mapping"),
)
return summary
# ---------- /ai-news/ic ----------
@router.get("/ai-news/ic")
def get_ai_news_ic(days: int = 30, horizon: int = 1, min_news_count: int = 1):
"""ai_news.score_raw 의 forward return IC (Spearman) 계산.
verdict:
- skip: ic_count < 10 (데이터 부족)
- weak: |ic_mean| <= 0.05
- strong: |ic_mean| > 0.05 (gradient 활성화 가치 있음)
"""
with _conn() as c:
return _ai_validation.compute_ic(
c, days=days, horizon=horizon, min_news_count=min_news_count,
)
@router.get("/runs/{run_id}")
def get_run(run_id: int):
with _conn() as c:

View File

@@ -12,6 +12,9 @@ DEFAULT_WEIGHTS = {
"rs_rating": 1.2,
"ma_alignment": 1.0,
"vcp_lite": 0.8,
# ai_news: 검증 전 gradient 차단 (4주 IC > 0.05 확인 후 활성화).
# 데이터 수집은 계속, 가중합 영향만 0.
"ai_news": 0.0,
}
DEFAULT_NODE_PARAMS = {
"foreign_buy": {"window_days": 5},
@@ -21,6 +24,7 @@ DEFAULT_NODE_PARAMS = {
"rs_rating": {"weights": {"3m": 2, "6m": 1, "9m": 1, "12m": 1}},
"ma_alignment": {"ma_periods": [50, 150, 200]},
"vcp_lite": {"short_window": 40, "long_window": 252},
"ai_news": {"min_news_count": 1},
}
DEFAULT_GATE_PARAMS = {
"min_market_cap_won": 50_000_000_000,
@@ -110,12 +114,76 @@ CREATE TABLE IF NOT EXISTS screener_results (
FOREIGN KEY (run_id) REFERENCES screener_runs(id) ON DELETE CASCADE
);
CREATE INDEX IF NOT EXISTS idx_results_run_rank ON screener_results(run_id, rank);
-- articles 테이블 (도메스틱/해외 뉴스 원본).
-- 메인 app.db.init_db() 에서도 생성하지만, 테스트 환경 단독 screener 컨텍스트
-- (ai_news.articles_source )에서도 참조 가능하도록 idempotent 하게 보장한다.
CREATE TABLE IF NOT EXISTS articles (
id INTEGER PRIMARY KEY AUTOINCREMENT,
hash TEXT UNIQUE NOT NULL,
category TEXT DEFAULT 'domestic',
title TEXT NOT NULL,
link TEXT,
summary TEXT,
press TEXT,
pub_date TEXT,
crawled_at TEXT
);
CREATE INDEX IF NOT EXISTS idx_articles_crawled ON articles(crawled_at DESC);
CREATE TABLE IF NOT EXISTS news_sentiment (
ticker TEXT NOT NULL,
date TEXT NOT NULL,
score_raw REAL NOT NULL,
reason TEXT NOT NULL DEFAULT '',
news_count INTEGER NOT NULL DEFAULT 0,
tokens_input INTEGER NOT NULL DEFAULT 0,
tokens_output INTEGER NOT NULL DEFAULT 0,
model TEXT NOT NULL DEFAULT 'claude-haiku-4-5-20251001',
source TEXT NOT NULL DEFAULT 'articles',
created_at TEXT NOT NULL DEFAULT (datetime('now','localtime')),
PRIMARY KEY (ticker, date)
);
CREATE INDEX IF NOT EXISTS idx_news_sentiment_date ON news_sentiment(date DESC);
"""
def ensure_screener_schema(conn: sqlite3.Connection) -> None:
"""Create tables and seed default settings (idempotent)."""
conn.executescript(DDL)
# ai_news 키 누락 시 1회 보충 (이미 운영 중인 환경에 대해)
row = conn.execute(
"SELECT weights_json, node_params_json FROM screener_settings WHERE id=1"
).fetchone()
if row is not None:
w = json.loads(row[0])
p = json.loads(row[1])
changed = False
if "ai_news" not in w:
w["ai_news"] = DEFAULT_WEIGHTS["ai_news"]
changed = True
# One-time reset: ai_news default 0.8 → 0.0 (검증 전 gradient 차단).
# 사용자가 명시적으로 0.8 외 값을 설정했다면 영향 없음.
elif w.get("ai_news") == 0.8:
w["ai_news"] = 0.0
changed = True
if "ai_news" not in p:
p["ai_news"] = DEFAULT_NODE_PARAMS["ai_news"]
changed = True
if changed:
conn.execute(
"UPDATE screener_settings SET weights_json=?, node_params_json=? WHERE id=1",
(json.dumps(w), json.dumps(p)),
)
# news_sentiment.source 컬럼 1회 추가 (기존 운영 환경)
cols = {r[1] for r in conn.execute(
"PRAGMA table_info(news_sentiment)"
).fetchall()}
if "source" not in cols:
conn.execute(
"ALTER TABLE news_sentiment "
"ADD COLUMN source TEXT NOT NULL DEFAULT 'articles'"
)
existing = conn.execute("SELECT id FROM screener_settings WHERE id=1").fetchone()
if existing is None:
now = datetime.now(timezone.utc).isoformat()

View File

@@ -22,8 +22,11 @@ NAVER_HEADERS = {
"Referer": "https://finance.naver.com/",
}
DEFAULT_FLOW_TOP_N = 500
DEFAULT_FLOW_TOP_N = 100
DEFAULT_RATE_LIMIT_SEC = 0.2
# 시총 상위 100종목 × 0.2초 = ~20초 — agent-office httpx timeout(180s) 안에 여유롭게 완료
# 외국인 매수 시그널은 대형주에서 의미가 크므로 상위 100종목으로 충분.
# 더 많은 종목이 필요하면 별도 cron으로 분리 권장.
@dataclass

View File

@@ -4,14 +4,15 @@ from __future__ import annotations
import datetime as dt
NODE_ICONS = {
"foreign_buy": "👤외",
"volume_surge": "⚡거",
"momentum": "🚀모",
"high52w": "🆙고",
"rs_rating": "💪RS",
"ma_alignment": "📈MA",
"vcp_lite": "🌀VCP",
# 노드별 풀 라벨 (아이콘 대신 사용 — 사용자가 명확한 이름 선호)
NODE_LABELS = {
"foreign_buy": "외국인",
"volume_surge": "거래량급증",
"momentum": "20일모멘텀",
"high52w": "52주신고가",
"rs_rating": "RS레이팅",
"ma_alignment": "이평선정배열",
"vcp_lite": "VCP수축",
}
PAGE_BASE = "https://gahusb.synology.me/stock/screener"
@@ -25,9 +26,21 @@ def _escape_md(s: str) -> str:
def _format_won(n) -> str:
"""1,234,567원 형태 (None 시 '-')."""
if n is None:
return "-"
return f"{int(n):,}"
return "\\-"
return f"{int(n):,}"
def _format_active_nodes(scores: dict, threshold: int = 70) -> str:
"""70점 이상 노드를 '라벨 점수' 형태로 나열, 콤마 구분."""
active = []
for name, sc in scores.items():
label = NODE_LABELS.get(name)
if label is None or sc < threshold:
continue
active.append(f"{_escape_md(label)} {int(sc)}")
return " · ".join(active) if active else "\\(70점 이상 노드 없음\\)"
def build_telegram_payload(asof: dt.date, mode: str, survivors_count: int,
@@ -40,17 +53,14 @@ def build_telegram_payload(asof: dt.date, mode: str, survivors_count: int,
lines = []
for r in rows[:10]:
icons = " ".join(
NODE_ICONS[name] for name, sc in r["scores"].items()
if sc >= 70 and name in NODE_ICONS
)
nodes_str = _format_active_nodes(r.get("scores", {}))
score_str = f"{r['total_score']:.1f}"
r_pct = r.get("r_pct")
r_pct_str = f"{r_pct:.1f}" if r_pct is not None else "-"
lines.append(
f"{r['rank']}\\. *{_escape_md(r['name'])}* `{r['ticker']}` "
f"{_escape_md(score_str)}\n"
f" {icons}\n"
f" {nodes_str}\n"
f" 진입 {_format_won(r.get('entry_price'))} "
f"손절 {_format_won(r.get('stop_price'))} "
f"익절 {_format_won(r.get('target_price'))} "

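_escape_md's body is outside this hunk; a plausible sketch, assuming it escapes the full Telegram MarkdownV2 reserved set (the tests below only pin down '+', '-', '.', parentheses, and brackets):

import re

_MD2_RESERVED = r"_*[]()~`>#+-=|{}.!"  # Telegram MarkdownV2 reserved characters

def _escape_md(s: str) -> str:
    """Backslash-escape every MarkdownV2 reserved character."""
    return re.sub(f"([{re.escape(_MD2_RESERVED)}])", r"\\\1", str(s))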
View File

@@ -1,7 +1,7 @@
"""price_fetcher._select_price_from_response 단위 테스트.
실행:
-cd web-backend/stock-lab
+cd web-backend/stock
python -m unittest app.test_price_fetcher -v
"""
import os

View File

@@ -21,15 +21,16 @@ def client():
return TestClient(app)
-def test_get_nodes_lists_7_score_and_1_gate(client):
+def test_get_nodes_lists_8_score_and_1_gate(client):
r = client.get("/api/stock/screener/nodes")
assert r.status_code == 200
body = r.json()
assert len(body["score_nodes"]) == 7
assert len(body["score_nodes"]) == 8
assert len(body["gate_nodes"]) == 1
assert {n["name"] for n in body["score_nodes"]} == {
"foreign_buy", "volume_surge", "momentum",
"high52w", "rs_rating", "ma_alignment", "vcp_lite",
"ai_news",
}

View File

@@ -31,7 +31,7 @@ def test_build_payload_includes_top10_and_link():
assert "42" in text # run_id 링크
-def test_score_threshold_filters_icons():
+def test_score_threshold_filters_node_labels():
rows = [{
"rank": 1, "ticker": "A", "name": "A주",
"total_score": 80,
@@ -41,11 +41,28 @@ def test_score_threshold_filters_icons():
"target_price": 53750, "r_pct": 3.5,
}]
p = build_telegram_payload(dt.date(2026, 5, 12), "auto", 100, 1, rows, run_id=1)
-# only foreign_buy(90), momentum(70), rs_rating(80), ma_alignment(80) shown (≥70)
-assert "👤외" in p["text"]
-assert "🚀모" in p["text"]
-assert "💪RS" in p["text"]
-assert "📈MA" in p["text"]
-assert "⚡거" not in p["text"]
-assert "🆙고" not in p["text"]
-assert "🌀VCP" not in p["text"]
+text = p["text"]
+# only nodes ≥70 are shown, as full labels (foreign_buy=90, momentum=70, rs_rating=80, ma_alignment=80)
+assert "외국인 90" in text
+assert "20일모멘텀 70" in text
+assert "RS레이팅 80" in text
+assert "이평선정배열 80" in text
+# nodes <70 are hidden (volume_surge=50, high52w=30, vcp_lite=60)
+assert "거래량급증" not in text
+assert "52주신고가" not in text
+assert "VCP수축" not in text
def test_prices_have_won_suffix():
rows = [{
"rank": 1, "ticker": "A", "name": "A주",
"total_score": 80,
"scores": {"foreign_buy": 80},
"close": 50000, "entry_price": 50250, "stop_price": 48500,
"target_price": 53750, "r_pct": 3.5,
}]
p = build_telegram_payload(dt.date(2026, 5, 12), "auto", 100, 1, rows, run_id=1)
text = p["text"]
assert "50,250원" in text
assert "48,500원" in text
assert "53,750원" in text

View File

@@ -1,4 +1,5 @@
# Libraries for the stock service
anthropic==0.39.0
requests==2.32.3
httpx==0.27.2
beautifulsoup4==4.12.3
@@ -8,4 +9,6 @@ apscheduler==3.10.4
python-dotenv==1.0.1
finance-datareader==0.9.110
lxml==6.1.0
pytest==8.3.2
pytest-asyncio==0.24.0

View File

@@ -0,0 +1,70 @@
import json
import pytest
from unittest.mock import AsyncMock, MagicMock
from app.screener.ai_news import analyzer
def _mk_llm(content_text: str, in_tokens: int = 100, out_tokens: int = 20):
llm = AsyncMock()
resp = MagicMock()
block = MagicMock()
block.text = content_text
resp.content = [block]
resp.usage = MagicMock(input_tokens=in_tokens, output_tokens=out_tokens)
llm.messages = MagicMock()
llm.messages.create = AsyncMock(return_value=resp)
return llm
NEWS = [
{"title": "삼성전자, HBM 양산", "summary": "1분기 영업이익 사상 최대", "pub_date": "2026-05-14"},
{"title": "메모리 가격 반등", "summary": "", "pub_date": "2026-05-14"},
]
@pytest.mark.asyncio
async def test_score_sentiment_success_parses_json():
llm = _mk_llm(json.dumps({"score": 7.5, "reason": "HBM 호재"}))
out = await analyzer.score_sentiment(llm, "005930", NEWS, name="삼성전자")
assert out["ticker"] == "005930"
assert out["score_raw"] == 7.5
assert out["reason"] == "HBM 호재"
assert out["news_count"] == 2
assert out["tokens_input"] == 100
assert out["tokens_output"] == 20
@pytest.mark.asyncio
async def test_score_sentiment_json_parse_fail_returns_zero():
llm = _mk_llm("not valid json")
out = await analyzer.score_sentiment(llm, "005930", NEWS)
assert out["score_raw"] == 0.0
assert "parse fail" in out["reason"]
assert out["tokens_input"] == 100 # 호출은 발생했음
@pytest.mark.asyncio
async def test_score_sentiment_clamps_out_of_range():
llm = _mk_llm(json.dumps({"score": 15.0, "reason": "초강세"}))
out = await analyzer.score_sentiment(llm, "005930", NEWS)
assert out["score_raw"] == 10.0 # +10 클램프
@pytest.mark.asyncio
async def test_score_sentiment_clamps_negative_out_of_range():
llm = _mk_llm(json.dumps({"score": -42.0, "reason": "초악재"}))
out = await analyzer.score_sentiment(llm, "005930", NEWS)
assert out["score_raw"] == -10.0
@pytest.mark.asyncio
async def test_score_sentiment_includes_summary_in_prompt():
"""summary 가 있으면 prompt 에 포함, 없으면 title 만."""
llm = _mk_llm(json.dumps({"score": 5.0, "reason": "ok"}))
await analyzer.score_sentiment(llm, "005930", NEWS, name="삼성전자")
call = llm.messages.create.call_args
user_msg = call.kwargs["messages"][0]["content"]
assert "1분기 영업이익 사상 최대" in user_msg # summary 포함
assert "삼성전자, HBM 양산" in user_msg # title 포함
assert "2026-05-14" in user_msg # pub_date 포함

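analyzer.score_sentiment itself is not shown in the diff; a minimal sketch consistent with these tests: temperature=0, a JSON-only instruction, clamping to [-10, +10], and a zero-score fallback when parsing fails. The prompt wording, system text, and max_tokens are assumptions:

import json

async def score_sentiment(llm, ticker, news, *, name=None,
                          model="claude-haiku-4-5-20251001"):
    """Score aggregate news sentiment for one ticker in [-10, +10]."""
    lines = []
    for item in news:
        line = f"- {item['title']}"
        if item.get("summary"):
            line += f" | {item['summary']}"
        if item.get("pub_date"):
            line += f" ({item['pub_date']})"
        lines.append(line)
    prompt = (
        f"Ticker {ticker}" + (f" ({name})" if name else "") + " news:\n"
        + "\n".join(lines)
        + '\nReply with JSON only: {"score": -10..10, "reason": "..."}'
    )
    resp = await llm.messages.create(
        model=model,
        max_tokens=200,
        temperature=0,  # deterministic JSON output
        system="You are a financial news sentiment rater. Output JSON only.",
        messages=[{"role": "user", "content": prompt}],
    )
    out = {
        "ticker": ticker,
        "news_count": len(news),
        "tokens_input": resp.usage.input_tokens,
        "tokens_output": resp.usage.output_tokens,
        "model": model,
    }
    try:
        data = json.loads(resp.content[0].text)
        out["score_raw"] = max(-10.0, min(10.0, float(data["score"])))  # clamp
        out["reason"] = str(data.get("reason", ""))
    except (ValueError, KeyError, TypeError):
        out["score_raw"] = 0.0  # the call happened, so tokens are still counted
        out["reason"] = "parse fail"
    return out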
View File

@@ -0,0 +1,108 @@
import datetime as dt
import sqlite3
import pytest
from app.screener.ai_news import articles_source
from app.screener.schema import ensure_screener_schema
@pytest.fixture
def conn():
c = sqlite3.connect(":memory:")
c.row_factory = sqlite3.Row
ensure_screener_schema(c)
yield c
c.close()
def _seed_master(conn, ticker, name):
conn.execute(
"INSERT INTO krx_master (ticker, name, market, market_cap, updated_at) "
"VALUES (?, ?, 'KOSPI', 1000000000, datetime('now'))",
(ticker, name),
)
def _seed_article(conn, title, summary="", crawled_at="2026-05-14T07:30:00"):
import hashlib
h = hashlib.md5(f"{title}|x".encode()).hexdigest()
conn.execute(
"INSERT INTO articles (hash, title, summary, link, press, pub_date, crawled_at) "
"VALUES (?, ?, ?, '', '', '2026-05-14', ?)",
(h, title, summary, crawled_at),
)
ASOF = dt.date(2026, 5, 14)
def test_single_ticker_match_in_title(conn):
_seed_master(conn, "005930", "삼성전자")
_seed_article(conn, "삼성전자, HBM 양산 가시화")
conn.commit()
out, stats = articles_source.gather_articles_for_tickers(
conn, ["005930"], ASOF, window_days=1, max_per_ticker=5,
)
assert len(out["005930"]) == 1
assert out["005930"][0]["title"] == "삼성전자, HBM 양산 가시화"
assert stats["matched_pairs"] == 1
assert stats["hit_tickers"] == 1
def test_single_ticker_match_in_summary(conn):
_seed_master(conn, "005930", "삼성전자")
_seed_article(conn, "메모리 시장 회복세", summary="삼성전자가 1분기 어닝 서프라이즈")
conn.commit()
out, _ = articles_source.gather_articles_for_tickers(
conn, ["005930"], ASOF, window_days=1, max_per_ticker=5,
)
assert len(out["005930"]) == 1
def test_multi_ticker_match(conn):
_seed_master(conn, "005930", "삼성전자")
_seed_master(conn, "000660", "SK하이닉스")
_seed_article(conn, "삼성전자와 SK하이닉스, 메모리 양산 경쟁")
conn.commit()
out, stats = articles_source.gather_articles_for_tickers(
conn, ["005930", "000660"], ASOF, window_days=1, max_per_ticker=5,
)
assert len(out["005930"]) == 1
assert len(out["000660"]) == 1
assert stats["matched_pairs"] == 2
assert stats["hit_tickers"] == 2
def test_no_match_returns_empty_list(conn):
_seed_master(conn, "005930", "삼성전자")
_seed_article(conn, "엔비디아 실적 발표", summary="AI 칩 수요 견조")
conn.commit()
out, stats = articles_source.gather_articles_for_tickers(
conn, ["005930"], ASOF, window_days=1, max_per_ticker=5,
)
assert out["005930"] == []
assert stats["matched_pairs"] == 0
assert stats["hit_tickers"] == 0
def test_max_per_ticker_caps_results(conn):
_seed_master(conn, "005930", "삼성전자")
for i in range(6):
_seed_article(conn, f"삼성전자 뉴스 #{i}", crawled_at=f"2026-05-14T0{i}:00:00")
conn.commit()
out, _ = articles_source.gather_articles_for_tickers(
conn, ["005930"], ASOF, window_days=1, max_per_ticker=5,
)
assert len(out["005930"]) == 5
def test_window_days_filters_old_articles(conn):
_seed_master(conn, "005930", "삼성전자")
_seed_article(conn, "삼성전자 최신 뉴스", crawled_at="2026-05-14T07:00:00")
_seed_article(conn, "삼성전자 오래된 뉴스", crawled_at="2026-05-01T07:00:00")
conn.commit()
out, _ = articles_source.gather_articles_for_tickers(
conn, ["005930"], ASOF, window_days=1, max_per_ticker=5,
)
assert len(out["005930"]) == 1
assert "최신" in out["005930"][0]["title"]

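gather_articles_for_tickers' implementation is likewise not shown; a sketch of the substring-matching core these tests describe (stock name against title and summary, a crawled_at window, a per-ticker cap). The SQL and ordering details are assumptions:

import datetime as dt
import sqlite3

def gather_articles_for_tickers(conn: sqlite3.Connection, tickers, asof: dt.date,
                                *, window_days: int = 1, max_per_ticker: int = 5):
    """Map each ticker to recent articles whose title/summary contains the stock name."""
    names = dict(conn.execute(
        "SELECT ticker, name FROM krx_master WHERE ticker IN (%s)"
        % ",".join("?" * len(tickers)), list(tickers)).fetchall())
    cutoff = (asof - dt.timedelta(days=window_days)).isoformat()
    articles = conn.execute(
        "SELECT title, summary, press, pub_date FROM articles "
        "WHERE crawled_at >= ? ORDER BY crawled_at DESC", (cutoff,)).fetchall()
    out = {t: [] for t in tickers}
    matched_pairs = 0
    for t in tickers:
        name = names.get(t)
        for a in articles:
            if name and (name in a["title"] or name in (a["summary"] or "")):
                matched_pairs += 1
                if len(out[t]) < max_per_ticker:
                    out[t].append(dict(a))
    stats = {
        "total_articles": len(articles),
        "matched_pairs": matched_pairs,
        "hit_tickers": sum(1 for t in tickers if out[t]),
    }
    return out, stats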
View File

@@ -0,0 +1,57 @@
import datetime as dt
import pandas as pd
import pytest
from app.screener.nodes.ai_news import AiNewsSentiment
class FakeCtx:
def __init__(self, df=None):
self.news_sentiment = df
self.asof = dt.date(2026, 5, 13)
def test_compute_empty_context():
out = AiNewsSentiment().compute(FakeCtx(None), {"min_news_count": 1})
assert out.empty
def test_compute_with_data_percentile_ranks():
df = pd.DataFrame([
{"ticker": "A", "score_raw": -5.0, "news_count": 3},
{"ticker": "B", "score_raw": 0.0, "news_count": 3},
{"ticker": "C", "score_raw": 8.0, "news_count": 3},
])
out = AiNewsSentiment().compute(FakeCtx(df), {"min_news_count": 1})
assert len(out) == 3
# percentile rank: A (lowest) < B < C (highest)
assert out.loc["A"] < out.loc["B"] < out.loc["C"]
# all within [0, 100]
assert (out >= 0).all() and (out <= 100).all()
def test_compute_filters_by_min_news_count():
df = pd.DataFrame([
{"ticker": "A", "score_raw": -5.0, "news_count": 0}, # 필터됨
{"ticker": "B", "score_raw": 0.0, "news_count": 2},
{"ticker": "C", "score_raw": 8.0, "news_count": 5},
])
out = AiNewsSentiment().compute(FakeCtx(df), {"min_news_count": 1})
assert "A" not in out.index
assert "B" in out.index
assert "C" in out.index
def test_compute_all_filtered_returns_empty():
df = pd.DataFrame([
{"ticker": "A", "score_raw": 5.0, "news_count": 0},
])
out = AiNewsSentiment().compute(FakeCtx(df), {"min_news_count": 1})
assert out.empty
def test_metadata():
n = AiNewsSentiment()
assert n.name == "ai_news"
assert "AI" in n.label or "뉴스" in n.label
assert n.default_params == {"min_news_count": 1}
assert "min_news_count" in n.param_schema["properties"]

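For orientation, a sketch of the compute() behavior these tests pin down (min_news_count filter, then percentile rank of score_raw into [0, 100]); the exact column handling is an assumption:

import pandas as pd

class AiNewsSentimentSketch:
    """Sketch of the ai_news ScoreNode's compute(): filter, then percentile-rank."""

    def compute(self, ctx, params) -> pd.Series:
        df = ctx.news_sentiment
        if df is None or df.empty:
            return pd.Series(dtype=float)
        df = df[df["news_count"] >= params["min_news_count"]]
        if df.empty:
            return pd.Series(dtype=float)
        scores = df.set_index("ticker")["score_raw"]
        # rank(pct=True) yields values in (0, 1]; ×100 stays within [0, 100]
        return scores.rank(pct=True) * 100.0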
View File

@@ -0,0 +1,145 @@
import datetime as dt
import sqlite3
import pytest
from unittest.mock import AsyncMock, MagicMock, patch
from app.screener.ai_news import pipeline
from app.screener.schema import ensure_screener_schema
@pytest.fixture
def conn():
c = sqlite3.connect(":memory:")
c.row_factory = sqlite3.Row
ensure_screener_schema(c)
# Seed top 3 tickers by market cap
c.execute("INSERT INTO krx_master (ticker, name, market, market_cap, updated_at) "
"VALUES (?, ?, 'KOSPI', ?, datetime('now'))", ("005930", "삼성전자", 9_000_000))
c.execute("INSERT INTO krx_master (ticker, name, market, market_cap, updated_at) "
"VALUES (?, ?, 'KOSPI', ?, datetime('now'))", ("000660", "SK하이닉스", 8_000_000))
c.execute("INSERT INTO krx_master (ticker, name, market, market_cap, updated_at) "
"VALUES (?, ?, 'KOSPI', ?, datetime('now'))", ("373220", "LG에너지솔루션", 7_000_000))
c.commit()
yield c
c.close()
@pytest.mark.asyncio
async def test_refresh_daily_happy_path(conn):
"""3종목 mini integration — articles_source mock + analyzer mock."""
asof = dt.date(2026, 5, 13)
fake_articles_by_ticker = {
"005930": [{"title": "삼성 뉴스", "summary": "", "press": "", "pub_date": ""}],
"000660": [{"title": "SK 뉴스", "summary": "", "press": "", "pub_date": ""}],
"373220": [{"title": "LG 뉴스", "summary": "", "press": "", "pub_date": ""}],
}
fake_stats = {"total_articles": 3, "matched_pairs": 3, "hit_tickers": 3}
scores_by_ticker = {
"005930": 7.5, "000660": 4.0, "373220": -6.0,
}
async def fake_score(llm, ticker, news, *, name=None, model="m"):
return {
"ticker": ticker, "score_raw": scores_by_ticker[ticker],
"reason": f"r{ticker}", "news_count": 1,
"tokens_input": 100, "tokens_output": 20, "model": model,
}
with patch.object(pipeline, "articles_source") as mas, \
patch.object(pipeline, "_analyzer") as ma, \
patch.object(pipeline, "_make_llm") as ml:
mas.gather_articles_for_tickers = MagicMock(
return_value=(fake_articles_by_ticker, fake_stats)
)
ma.score_sentiment = fake_score
ml.return_value.__aenter__.return_value = AsyncMock()
ml.return_value.__aexit__.return_value = None
result = await pipeline.refresh_daily(conn, asof, concurrency=3)
assert result["asof"] == "2026-05-13"
assert result["updated"] == 3
assert result["failures"] == []
assert result["top_pos"][0]["ticker"] == "005930"
assert result["top_neg"][0]["ticker"] == "373220"
assert result["mapping"] == fake_stats
rows = conn.execute("SELECT ticker, score_raw, source FROM news_sentiment "
"WHERE date=?", ("2026-05-13",)).fetchall()
assert len(rows) == 3
assert all(r["source"] == "articles" for r in rows)
@pytest.mark.asyncio
async def test_refresh_daily_failures_isolated(conn):
asof = dt.date(2026, 5, 13)
fake_articles_by_ticker = {
"005930": [{"title": "h", "summary": "", "press": "", "pub_date": ""}],
"000660": [{"title": "h", "summary": "", "press": "", "pub_date": ""}],
"373220": [{"title": "h", "summary": "", "press": "", "pub_date": ""}],
}
fake_stats = {"total_articles": 3, "matched_pairs": 3, "hit_tickers": 3}
async def fake_score(llm, ticker, news, *, name=None, model="m"):
if ticker == "000660":
raise RuntimeError("llm exploded")
return {
"ticker": ticker, "score_raw": 5.0, "reason": "r", "news_count": 1,
"tokens_input": 100, "tokens_output": 20, "model": model,
}
with patch.object(pipeline, "articles_source") as mas, \
patch.object(pipeline, "_analyzer") as ma, \
patch.object(pipeline, "_make_llm") as ml:
mas.gather_articles_for_tickers = MagicMock(
return_value=(fake_articles_by_ticker, fake_stats)
)
ma.score_sentiment = fake_score
ml.return_value.__aenter__.return_value = AsyncMock()
ml.return_value.__aexit__.return_value = None
result = await pipeline.refresh_daily(conn, asof, concurrency=3)
assert result["updated"] == 2
assert len(result["failures"]) == 1
@pytest.mark.asyncio
async def test_refresh_daily_no_match_ticker_skipped(conn):
"""매핑 0인 ticker 는 LLM 호출 skip + news_sentiment 행 미생성."""
asof = dt.date(2026, 5, 13)
fake_articles_by_ticker = {
"005930": [{"title": "삼성", "summary": "", "press": "", "pub_date": ""}],
"000660": [], # 매핑 없음
"373220": [], # 매핑 없음
}
fake_stats = {"total_articles": 1, "matched_pairs": 1, "hit_tickers": 1}
async def fake_score(llm, ticker, news, *, name=None, model="m"):
return {
"ticker": ticker, "score_raw": 5.0, "reason": "r",
"news_count": 1, "tokens_input": 100, "tokens_output": 20,
"model": model,
}
with patch.object(pipeline, "articles_source") as mas, \
patch.object(pipeline, "_analyzer") as ma, \
patch.object(pipeline, "_make_llm") as ml:
mas.gather_articles_for_tickers = MagicMock(
return_value=(fake_articles_by_ticker, fake_stats)
)
ma.score_sentiment = fake_score
ml.return_value.__aenter__.return_value = AsyncMock()
ml.return_value.__aexit__.return_value = None
result = await pipeline.refresh_daily(conn, asof, concurrency=3)
assert result["updated"] == 1
rows = conn.execute("SELECT ticker FROM news_sentiment "
"WHERE date=?", ("2026-05-13",)).fetchall()
assert {r["ticker"] for r in rows} == {"005930"}
def test_top_market_cap_tickers(conn):
out = pipeline._top_market_cap_tickers(conn, n=2)
assert out == ["005930", "000660"]

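The heart of refresh_daily, as these tests describe it, is bounded concurrency with per-ticker failure isolation: one LLM error must not sink the batch. A sketch of that loop, assuming an asyncio.Semaphore plus gather(return_exceptions=True) and skipping tickers with no mapped articles:

import asyncio

from app.screener.ai_news import analyzer as _analyzer

async def _score_all(llm, articles_by_ticker, names, *, model, concurrency):
    """Score every ticker that has mapped articles; isolate per-ticker failures."""
    sem = asyncio.Semaphore(concurrency)

    async def one(ticker, news):
        async with sem:
            return await _analyzer.score_sentiment(
                llm, ticker, news, name=names.get(ticker), model=model)

    todo = [(t, news) for t, news in articles_by_ticker.items() if news]  # skip unmapped
    results = await asyncio.gather(
        *(one(t, news) for t, news in todo), return_exceptions=True)

    ok, failures = [], []
    for (ticker, _), res in zip(todo, results):
        if isinstance(res, Exception):
            failures.append({"ticker": ticker, "error": str(res)})
        else:
            ok.append(res)
    return ok, failures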
View File

@@ -0,0 +1,36 @@
import datetime as dt
from unittest.mock import AsyncMock, patch
from fastapi.testclient import TestClient
from app.main import app
def test_refresh_news_sentiment_weekend_skip():
# 2026-05-16 = Saturday
client = TestClient(app)
resp = client.post(
"/api/stock/screener/snapshot/refresh-news-sentiment?asof=2026-05-16"
)
assert resp.status_code == 200
assert resp.json()["status"] == "skipped_weekend"
def test_refresh_news_sentiment_weekday_invokes_pipeline():
fake_summary = {
"asof": "2026-05-13", "updated": 3, "failures": [],
"duration_sec": 1.0, "tokens_input": 100, "tokens_output": 20,
"top_pos": [], "top_neg": [], "model": "m",
"mapping": {"total_articles": 5, "matched_pairs": 8, "hit_tickers": 3},
}
with patch("app.screener.router._ai_pipeline") as mp, \
patch("app.screener.router._ai_telegram") as mt:
mp.refresh_daily = AsyncMock(return_value=fake_summary)
mt.build_message = lambda **kw: f"TEXT_with_mapping={kw.get('mapping')}"
client = TestClient(app)
resp = client.post(
"/api/stock/screener/snapshot/refresh-news-sentiment?asof=2026-05-13"
)
assert resp.status_code == 200
body = resp.json()
assert body["mapping"]["hit_tickers"] == 3
assert "mapping=" in body["telegram_text"]

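A sketch of the endpoint's weekend gate consistent with the two tests above; the route path and the _ai_pipeline/_ai_telegram module names come from the tests, while _conn() and the response shape details are assumptions:

import datetime as dt

from fastapi import APIRouter

router = APIRouter()

@router.post("/api/stock/screener/snapshot/refresh-news-sentiment")
async def refresh_news_sentiment(asof: str):
    asof_date = dt.date.fromisoformat(asof)
    if asof_date.weekday() >= 5:  # 5=Sat, 6=Sun
        return {"status": "skipped_weekend", "asof": asof}
    summary = await _ai_pipeline.refresh_daily(_conn(), asof_date)  # _conn(): hypothetical DB helper
    summary["telegram_text"] = _ai_telegram.build_message(
        asof=summary["asof"], top_pos=summary["top_pos"], top_neg=summary["top_neg"],
        tokens_input=summary["tokens_input"], tokens_output=summary["tokens_output"],
        mapping=summary.get("mapping"),
    )
    return summary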
View File

@@ -0,0 +1,55 @@
import pytest
from unittest.mock import AsyncMock
from app.screener.ai_news import scraper
SAMPLE_HTML = """
<html><body>
<table class="type5"><tbody>
<tr><td class="title"><a href="/news1">삼성전자, HBM 양산 가시화</a></td><td class="date">2026.05.13 07:30</td></tr>
<tr><td class="title"><a href="/news2">삼성, 4분기 어닝 쇼크 우려</a></td><td class="date">2026.05.13 06:00</td></tr>
<tr><td class="title"><a href="/news3">메모리 시장 회복세</a></td><td class="date">2026.05.12 18:00</td></tr>
</tbody></table>
</body></html>
"""
EMPTY_HTML = "<html><body><table class='type5'><tbody></tbody></table></body></html>"
def _mk_client(status_code=200, text=SAMPLE_HTML):
client = AsyncMock()
resp = AsyncMock()
resp.status_code = status_code
resp.text = text
client.get = AsyncMock(return_value=resp)
return client
@pytest.mark.asyncio
async def test_fetch_news_success_returns_n_items():
client = _mk_client()
out = await scraper.fetch_news(client, "005930", n=2)
assert len(out) == 2
assert out[0]["title"] == "삼성전자, HBM 양산 가시화"
assert out[0]["date"] == "2026.05.13 07:30"
@pytest.mark.asyncio
async def test_fetch_news_404_returns_empty():
client = _mk_client(status_code=404, text="")
out = await scraper.fetch_news(client, "999999", n=5)
assert out == []
@pytest.mark.asyncio
async def test_fetch_news_empty_table_returns_empty():
client = _mk_client(text=EMPTY_HTML)
out = await scraper.fetch_news(client, "005930", n=5)
assert out == []
@pytest.mark.asyncio
async def test_fetch_news_n_caps_results():
client = _mk_client()
out = await scraper.fetch_news(client, "005930", n=2)
assert len(out) == 2  # sample has 3 rows but n=2 caps the result

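For reference, a sketch of fetch_news matching the selectors visible in SAMPLE_HTML (table.type5 rows with td.title a and td.date); the Naver URL is an assumption:

from bs4 import BeautifulSoup

async def fetch_news(client, ticker: str, n: int = 5) -> list[dict]:
    """Fetch up to n latest headlines for a ticker; [] on any HTTP error."""
    resp = await client.get(
        f"https://finance.naver.com/item/news_news.naver?code={ticker}")  # assumed endpoint
    if resp.status_code != 200:
        return []
    soup = BeautifulSoup(resp.text, "html.parser")
    items = []
    for tr in soup.select("table.type5 tbody tr"):
        a, date_td = tr.select_one("td.title a"), tr.select_one("td.date")
        if a is None or date_td is None:
            continue
        items.append({"title": a.get_text(strip=True),
                      "date": date_td.get_text(strip=True)})
        if len(items) >= n:
            break
    return items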
View File

@@ -0,0 +1,79 @@
from app.screener.ai_news import telegram as tg
def _row(ticker, score, reason="r"):
return {"ticker": ticker, "score_raw": score, "reason": reason,
"news_count": 5, "tokens_input": 100, "tokens_output": 20,
"model": "m"}
def test_build_message_includes_top_sections():
msg = tg.build_message(
asof="2026-05-13",
top_pos=[_row("005930", 8.5, "HBM 호재")],
top_neg=[_row("373220", -6.3, "수주 지연")],
tokens_input=10000, tokens_output=2000,
)
assert "AI 뉴스 분석" in msg
assert "호재 Top" in msg
assert "악재 Top" in msg
assert "005930" in msg
# the score goes through MarkdownV2 escaping, so it appears as "8\.5" ('.' is reserved)
assert "8\\.5" in msg
assert "HBM" in msg
assert "373220" in msg
def test_build_message_escapes_markdownv2_specials():
msg = tg.build_message(
asof="2026-05-13",
top_pos=[_row("005930", 3.0, "테스트(괄호) [대괄호]")],
top_neg=[],
tokens_input=100, tokens_output=20,
)
# MarkdownV2 special characters ( ) [ ] must be escaped
assert r"\(" in msg or r"\)" in msg
assert r"\[" in msg or r"\]" in msg
def test_build_message_cost_won_line():
msg = tg.build_message(
asof="2026-05-13", top_pos=[], top_neg=[],
tokens_input=10000, tokens_output=2000,
)
# tokens_input × 0.0013 + tokens_output × 0.0065 = 13 + 13 = ₩26
assert "₩26" in msg or "₩ 26" in msg or "" in msg
def test_build_message_empty_lists():
msg = tg.build_message(
asof="2026-05-13", top_pos=[], top_neg=[],
tokens_input=0, tokens_output=0,
)
# even with empty lists the headers must be present
assert "호재 Top" in msg
assert "악재 Top" in msg
def test_build_message_includes_mapping_line():
msg = tg.build_message(
asof="2026-05-14",
top_pos=[_row("005930", 8.5, "HBM 호재")],
top_neg=[],
tokens_input=1000, tokens_output=200,
mapping={"total_articles": 35, "matched_pairs": 50, "hit_tickers": 42},
)
assert "매핑" in msg
assert "42" in msg
assert "50" in msg
assert "35" in msg
def test_build_message_without_mapping_omits_line():
msg = tg.build_message(
asof="2026-05-14",
top_pos=[],
top_neg=[],
tokens_input=1000, tokens_output=200,
)
assert "매핑" not in msg

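The cost line reduces to one formula; a sketch using the per-token won rates from the comment above (the rates themselves are carried over from that comment, and _cost_won is a hypothetical helper name):

def _cost_won(tokens_input: int, tokens_output: int) -> int:
    """Approximate cost in KRW: input ₩0.0013/token, output ₩0.0065/token."""
    return round(tokens_input * 0.0013 + tokens_output * 0.0065)

assert _cost_won(10_000, 2_000) == 26  # rendered as "₩26" in the message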
View File

@@ -0,0 +1,120 @@
"""Tests for ai_news validation harness (Spearman IC)."""
import datetime as dt
import sqlite3
import pytest
from app.screener.ai_news import validation
from app.screener.schema import ensure_screener_schema
@pytest.fixture
def conn():
c = sqlite3.connect(":memory:")
c.row_factory = sqlite3.Row
ensure_screener_schema(c)
yield c
c.close()
def _seed_sentiment(conn, date, ticker, score, news_count=3):
conn.execute(
"INSERT INTO news_sentiment (ticker, date, score_raw, reason, news_count, "
"tokens_input, tokens_output, model) "
"VALUES (?, ?, ?, 'r', ?, 100, 20, 'm')",
(ticker, date, score, news_count),
)
def _seed_price(conn, ticker, date, close):
conn.execute(
"INSERT INTO krx_daily_prices (ticker, date, close) VALUES (?, ?, ?)",
(ticker, date, close),
)
def test_empty_db_returns_skip(conn):
out = validation.compute_ic(conn, days=30, horizon=1, asof_today=dt.date(2026, 5, 14))
assert out["ic_count"] == 0
assert out["verdict"] == "skip"
assert out["ic_mean"] is None
def test_strong_positive_ic(conn):
"""5종목 × 12일 — 점수가 높을수록 다음날 수익률 높게 시드 → IC ≈ +1.
score 가 변하지 않는 ticker × day-wise close 로 정확한 monotonic 관계 시드.
"""
base_date = dt.date(2026, 5, 1)
# Seed 13 days of prices (day0..day12): every ticker starts at close=100 and drifts by its score each day.
for i, ticker in enumerate(["A", "B", "C", "D", "E"]):
score = i * 2.0 - 4.0 # fixed per-ticker score (-4, -2, 0, +2, +4)
# day 0 close=100, day n close = 100 + score * n
for day in range(13):
d = (base_date + dt.timedelta(days=day)).isoformat()
_seed_price(conn, ticker, d, 100.0 + score * day)
if day < 12:
_seed_sentiment(conn, d, ticker, score)
conn.commit()
out = validation.compute_ic(conn, days=30, horizon=1, asof_today=dt.date(2026, 5, 14))
assert out["ic_count"] >= 10
assert out["ic_mean"] > 0.5
assert out["verdict"] == "strong"
def test_zero_ic_random_data(conn):
"""점수와 수익률이 무관 → IC ≈ 0."""
import random
random.seed(42)
base_date = dt.date(2026, 5, 1)
for ticker in ["A", "B", "C", "D", "E", "F", "G"]:
for day in range(13):
d = (base_date + dt.timedelta(days=day)).isoformat()
_seed_price(conn, ticker, d, 100.0 + random.uniform(-5, 5))
if day < 12:
_seed_sentiment(conn, d, ticker, random.uniform(-10, 10))
conn.commit()
out = validation.compute_ic(conn, days=30, horizon=1, asof_today=dt.date(2026, 5, 14))
assert out["ic_count"] >= 10
assert abs(out["ic_mean"]) < 0.3 # 약한 신호 — verdict는 weak 가능
assert out["verdict"] in ("weak", "strong") # 시드에 따라 약간 흔들림
def test_min_news_count_filter(conn):
"""news_count < min_news_count 인 row 는 제외."""
_seed_sentiment(conn, "2026-05-13", "A", 5.0, news_count=0)
_seed_sentiment(conn, "2026-05-13", "B", -5.0, news_count=3)
_seed_price(conn, "A", "2026-05-13", 100.0)
_seed_price(conn, "A", "2026-05-14", 105.0)
_seed_price(conn, "B", "2026-05-13", 100.0)
_seed_price(conn, "B", "2026-05-14", 95.0)
conn.commit()
out = validation.compute_ic(
conn, days=30, horizon=1, min_news_count=1,
asof_today=dt.date(2026, 5, 14),
)
# A is filtered out → only 1 ticker remains, too few for Spearman (< 5) → skip
assert out["ic_count"] == 0
def test_horizon_5_days(conn):
"""horizon=5 면 close[date+5] / close[date] - 1 사용."""
base_date = dt.date(2026, 5, 1)
for day in range(20):
d = (base_date + dt.timedelta(days=day)).isoformat()
for i, ticker in enumerate(["A", "B", "C", "D", "E"]):
_seed_sentiment(conn, d, ticker, i * 2.0 - 4.0)
# prices: A=falling, B=falling, C=flat, D=rising, E=rising (slope = i - 2)
for day in range(25):
d = (base_date + dt.timedelta(days=day)).isoformat()
for i, ticker in enumerate(["A", "B", "C", "D", "E"]):
slope = i - 2 # -2 ~ +2
_seed_price(conn, ticker, d, 100.0 + slope * day)
conn.commit()
out = validation.compute_ic(conn, days=30, horizon=5, asof_today=dt.date(2026, 5, 25))
assert out["horizon_days"] == 5
assert out["ic_count"] > 0

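compute_ic is not shown in this changeset; a sketch consistent with these tests: per day, Spearman-correlate score_raw with the horizon-day forward return, require at least 5 tickers per day, then average. The strong/weak cutoff is an assumption:

import datetime as dt
import statistics

from scipy.stats import spearmanr

MIN_TICKERS_PER_DAY = 5

def compute_ic(conn, *, days=30, horizon=1, min_news_count=1, asof_today=None):
    """Spearman IC of score_raw vs. forward return, averaged over days."""
    asof_today = asof_today or dt.date.today()
    start = (asof_today - dt.timedelta(days=days)).isoformat()
    rows = conn.execute(
        "SELECT date, ticker, score_raw FROM news_sentiment "
        "WHERE date >= ? AND news_count >= ?", (start, min_news_count)).fetchall()
    by_day = {}
    for r in rows:
        by_day.setdefault(r["date"], []).append((r["ticker"], r["score_raw"]))

    def _close(ticker, date):
        row = conn.execute(
            "SELECT close FROM krx_daily_prices WHERE ticker=? AND date=?",
            (ticker, date)).fetchone()
        return row["close"] if row else None

    ic_per_day = {}
    for date, pairs in by_day.items():
        d1 = (dt.date.fromisoformat(date) + dt.timedelta(days=horizon)).isoformat()
        scores, rets = [], []
        for ticker, score in pairs:
            c0, c1 = _close(ticker, date), _close(ticker, d1)
            if c0 and c1:
                scores.append(score)
                rets.append(c1 / c0 - 1.0)  # horizon-day forward return
        if len(scores) >= MIN_TICKERS_PER_DAY:
            ic_per_day[date] = spearmanr(scores, rets).correlation

    if not ic_per_day:
        return {"ic_count": 0, "ic_mean": None, "ic_std": None,
                "verdict": "skip", "horizon_days": horizon}
    ics = list(ic_per_day.values())
    ic_mean = statistics.fmean(ics)
    verdict = "strong" if ic_mean > 0.05 else "weak"  # assumed cutoff
    return {"ic_count": len(ics), "ic_mean": ic_mean,
            "ic_std": statistics.pstdev(ics), "ic_per_day": ic_per_day,
            "verdict": verdict, "horizon_days": horizon}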
View File

@@ -7,9 +7,10 @@ DB_PATH = os.getenv("TRAVEL_DB_PATH", "/data/thumbs/travel.db")
def _conn() -> sqlite3.Connection:
os.makedirs(os.path.dirname(DB_PATH), exist_ok=True)
-conn = sqlite3.connect(DB_PATH)
+conn = sqlite3.connect(DB_PATH, timeout=120.0)
conn.row_factory = sqlite3.Row
conn.execute("PRAGMA journal_mode=WAL")
+conn.execute("PRAGMA busy_timeout=120000")
return conn
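Note: connect(timeout=120.0) and PRAGMA busy_timeout=120000 cover the same 120 s lock wait at two levels; the connect() argument is applied by the Python driver, while the PRAGMA pins the wait on the connection itself, presumably a belt-and-braces choice for long WAL writes against travel.db.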