diff --git a/CLAUDE.md b/CLAUDE.md index a95c1be..ae06300 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -7,7 +7,7 @@ ## 1. 프로젝트 개요 Synology NAS 기반의 개인 웹 플랫폼 백엔드 모노레포. -- **서비스**: lotto-lab, stock, travel-proxy, music-lab, blog-lab, realestate-lab, agent-office, personal, packs-lab, deployer (10개) +- **서비스**: lotto-lab, stock, travel-proxy, music-lab, insta-lab, realestate-lab, agent-office, personal, packs-lab, deployer (10개) - **프론트엔드**: 별도 레포 (React + Vite SPA), 빌드 산출물만 NAS에 배포 - **인프라**: Docker Compose (10컨테이너) + Nginx(리버스 프록시) + Gitea Webhook 자동 배포 @@ -56,7 +56,7 @@ Synology NAS 기반의 개인 웹 플랫폼 백엔드 모노레포. | `lotto` | 18000 | 로또 데이터 수집·분석·추천 API | | `stock` | 18500 | 주식 뉴스·AI 분석·KIS API 연동 | | `music-lab` | 18600 | AI 음악 생성·라이브러리 관리 API | -| `blog-lab` | 18700 | 블로그 마케팅 수익화 API | +| `insta-lab` | 18700 | 인스타 카드 피드 자동 생성 (뉴스→키워드→10페이지 카드) | | `realestate-lab` | 18800 | 부동산 청약 자동 수집·매칭 API | | `agent-office` | 18900 | AI 에이전트 오피스 (실시간 WebSocket + 텔레그램 연동) | | `packs-lab` | 18950 | NAS 자료 다운로드 자동화 (DSM 공유 링크 + 5GB 업로드, Vercel SaaS와 HMAC 통신) | @@ -77,7 +77,7 @@ Synology NAS 기반의 개인 웹 플랫폼 백엔드 모노레포. 
| `/api/trade/` | `stock:8000` | KIS 실계좌 API | | `/api/portfolio` | `stock:8000` | trailing slash 유무 모두 매칭 | | `/api/music/` | `music-lab:8000` | AI 음악 생성·라이브러리 API | -| `/api/blog-marketing/` | `blog-lab:8000` | 블로그 마케팅 수익화 API | +| `/api/insta/` | `insta-lab:8000` | 인스타 카드 자동 생성 API | | `/api/realestate/` | `realestate-lab:8000` | 부동산 청약 API | | `/api/todos` | `personal:8000` | 투두 API | | `/api/blog/` | `personal:8000` | 블로그 API | @@ -135,7 +135,7 @@ docker compose up -d | Lotto Backend | http://localhost:18000 | | Travel API | http://localhost:19000 | | Stock Lab | http://localhost:18500 | -| Blog Lab | http://localhost:18700 | +| Insta Lab | http://localhost:18700 | | Realestate Lab | http://localhost:18800 | | Packs Lab | http://localhost:18950 | @@ -454,61 +454,51 @@ docker compose up -d | PUT | `/api/travel/albums/{album}/region` | 앨범 지역 변경 (region_map_extra 수정) | | PUT | `/api/travel/regions/{region_id}` | 커스텀 지역 이름/좌표 수정 (지도 핀 표시용) | -### blog-lab (blog-lab/) -- 블로그 마케팅 수익화 서비스 (키워드 분석 → AI 글 생성 → 마케팅 강화 → 품질 리뷰 → 포스팅 → 수익 추적) -- AI 엔진: Claude API (Anthropic, `claude-sonnet-4-20250514`) -- 웹 검색: Naver Search API (블로그 + 쇼핑) + 상위 블로그 본문 크롤링 -- DB: `/app/data/blog_marketing.db` -- 파일 구조: `main.py`, `db.py`, `config.py`, `naver_search.py`, `content_generator.py`, `marketer.py`, `quality_reviewer.py`, `web_crawler.py` +### insta-lab (insta-lab/) +- 인스타그램 카드 피드 자동 생성 — 뉴스 모니터링 → 키워드 추출 → 10페이지 카드 카피 + PNG 렌더 → 텔레그램 푸시 → 사용자 수동 업로드 +- DB: `/app/data/insta.db` (news_articles, trending_keywords, card_slates, card_assets, generation_tasks, prompt_templates) +- 카드 사이즈: 1080×1350 (인스타 4:5 세로) +- 카드 렌더: Jinja2 템플릿 → Playwright headless Chromium 스크린샷 +- 파일 구조: `app/main.py`, `config.py`, `db.py`, `news_collector.py`, `keyword_extractor.py`, `card_writer.py`, `card_renderer.py`, `templates/default/card.html.j2` -**파이프라인**: 리서치(+크롤링) → 작가(초안) → 마케터(링크 삽입) → 평가자(6기준 60점) -**상태 흐름**: `draft` → `marketed` → `reviewed` → `published` +**환경변수** +- `NAVER_CLIENT_ID` / 
`NAVER_CLIENT_SECRET`: 네이버 검색 API +- `ANTHROPIC_API_KEY`: Claude API (Haiku=키워드 정제, Sonnet=카드 카피) +- `ANTHROPIC_MODEL_HAIKU` / `ANTHROPIC_MODEL_SONNET`: 모델명 오버라이드 +- `INSTA_DATA_PATH`: SQLite + 카드 PNG 저장 경로 (기본 `/app/data`) +- `CARD_TEMPLATE_DIR`: HTML 템플릿 디렉토리 (기본 `/app/app/templates`) +- `NEWS_PER_CATEGORY` / `KEYWORDS_PER_CATEGORY`: 수집·추출 limit 튜닝 -**blog_marketing.db 테이블** +**카테고리 시드 키워드** +- 기본 economy / psychology / celebrity 3종 (config.DEFAULT_CATEGORY_SEEDS) +- `prompt_templates.name='category_seeds'`에 JSON으로 오버라이드 가능 -| 테이블 | 설명 | -|--------|------| -| `keyword_analyses` | 키워드 분석 결과 (네이버 검색 데이터 + 경쟁도/기회 점수 + 크롤링 본문) | -| `blog_posts` | 블로그 글 (draft → marketed → reviewed → published) | -| `brand_links` | 브랜드커넥트 제휴 링크 (post_id/keyword_id FK) | -| `commissions` | 포스트별 월간 클릭/구매/수익 | -| `generation_tasks` | 비동기 작업 상태 (research/generate/market/review) | -| `prompt_templates` | AI 프롬프트 템플릿 (DB 저장, 코드 배포 없이 수정 가능) | +**카드 슬레이트 (`card_slates`)** +- status: `draft` → `rendered` → `sent` (또는 `failed`) +- cover_copy / body_copies (8개) / cta_copy / suggested_caption / hashtags JSON 컬럼 +- accent_color는 카테고리별 기본값 (economy=#0F62FE, psychology=#A66CFF, celebrity=#FF5C8A) -**blog-lab API 목록** +**스케줄러 job (agent-office)** +- 09:30 매일 — `_run_insta_schedule` (insta_pipeline) → 뉴스 수집 → 키워드 추출 → 텔레그램 후보 푸시 +- `agent_config.custom_config.auto_select=True`이면 카테고리당 1위 키워드 자동 슬레이트 생성·발송 + +**insta-lab API 목록** | 메서드 | 경로 | 설명 | |--------|------|------| -| GET | `/api/blog-marketing/status` | 서비스 상태 (API 키 설정 현황) | -| POST | `/api/blog-marketing/research` | 키워드 분석 시작 (+ 상위 블로그 크롤링) | -| GET | `/api/blog-marketing/research/history` | 분석 이력 조회 | -| GET | `/api/blog-marketing/research/{id}` | 분석 상세 조회 | -| DELETE | `/api/blog-marketing/research/{id}` | 분석 삭제 | -| GET | `/api/blog-marketing/task/{task_id}` | 작업 상태 폴링 | -| POST | `/api/blog-marketing/generate` | 작가 단계: AI 글 생성 (크롤링 참고 + 링크 반영) | -| POST | `/api/blog-marketing/market/{post_id}` | 마케터 단계: 전환율 강화 + 링크 삽입 | -| POST | 
`/api/blog-marketing/review/{post_id}` | 평가자 단계: 품질 리뷰 (6기준 × 10점, 42/60 통과) | -| POST | `/api/blog-marketing/regenerate/{post_id}` | 피드백 기반 재생성 | -| POST | `/api/blog-marketing/links` | 브랜드커넥트 링크 등록 | -| GET | `/api/blog-marketing/links` | 링크 조회 (post_id, keyword_id 필터) | -| PUT | `/api/blog-marketing/links/{id}` | 링크 수정 | -| DELETE | `/api/blog-marketing/links/{id}` | 링크 삭제 | -| GET | `/api/blog-marketing/posts` | 포스트 목록 (status 필터) | -| GET | `/api/blog-marketing/posts/{id}` | 포스트 상세 | -| PUT | `/api/blog-marketing/posts/{id}` | 포스트 수정 | -| DELETE | `/api/blog-marketing/posts/{id}` | 포스트 삭제 | -| POST | `/api/blog-marketing/posts/{id}/publish` | 발행 (네이버 URL 등록) | -| GET | `/api/blog-marketing/commissions` | 수익 내역 조회 | -| POST | `/api/blog-marketing/commissions` | 수익 기록 추가 | -| PUT | `/api/blog-marketing/commissions/{id}` | 수익 기록 수정 | -| DELETE | `/api/blog-marketing/commissions/{id}` | 수익 기록 삭제 | -| GET | `/api/blog-marketing/dashboard` | 대시보드 집계 | - -**환경변수** -- `ANTHROPIC_API_KEY`: Claude API 키 (미설정 시 AI 생성 비활성화) -- `NAVER_CLIENT_ID`: 네이버 검색 API 클라이언트 ID -- `NAVER_CLIENT_SECRET`: 네이버 검색 API 시크릿 -- `BLOG_DATA_PATH`: SQLite DB 저장 경로 (기본 `./data/blog`) +| GET | `/api/insta/status` | 서비스 상태 (NAVER/ANTHROPIC 키 여부) | +| POST | `/api/insta/news/collect` | 뉴스 수집 트리거 (BackgroundTask) | +| GET | `/api/insta/news/articles` | 수집 기사 목록 (category, days) | +| POST | `/api/insta/keywords/extract` | 키워드 추출 트리거 (BackgroundTask) | +| GET | `/api/insta/keywords` | 트렌딩 키워드 목록 (category, used) | +| POST | `/api/insta/slates` | 슬레이트 생성 (keyword, category) | +| GET | `/api/insta/slates` | 슬레이트 목록 | +| GET | `/api/insta/slates/{id}` | 슬레이트 상세 + 자산 | +| POST | `/api/insta/slates/{id}/render` | 카드 렌더 재시도 | +| GET | `/api/insta/slates/{id}/assets/{page}` | 카드 PNG 다운로드 (1~10) | +| DELETE | `/api/insta/slates/{id}` | 슬레이트 삭제 (자산 파일 포함) | +| GET | `/api/insta/tasks/{task_id}` | BackgroundTask 상태 폴링 | +| GET/PUT | `/api/insta/templates/prompts/{name}` | 프롬프트 템플릿 조회/수정 | ### agent-office 
(agent-office/) - AI 에이전트 가상 오피스 — 2D 픽셀아트 사무실에서 에이전트가 실제 작업 수행 @@ -701,3 +691,4 @@ docker compose up -d - **Windows AI 서버 IP**: `192.168.45.59` — 공유기 DHCP 고정 예약으로 고정. Tailscale은 Synology에서 TCP 불가(userspace 모드)라 로컬 IP 사용 - **현재가 조회**: 네이버 모바일 API → HTML 파싱 폴백, 3분 TTL 캐시 (`price_fetcher.py`) - **시뮬레이션 교체 방식**: `best_picks`는 교체형 — 새 시뮬레이션 실행 시 `is_active=0`으로 비활성화 후 신규 입력 +- **insta-lab Playwright**: NAS에서 chromium 빌드는 가능하지만 +500MB 이미지. 메모리 부족 시 카드 렌더 실패 가능 — 한 번에 1슬레이트만 렌더하도록 직렬화됨 diff --git a/agent-office/app/agents/__init__.py b/agent-office/app/agents/__init__.py index 94ae6d0..e75a071 100644 --- a/agent-office/app/agents/__init__.py +++ b/agent-office/app/agents/__init__.py @@ -1,6 +1,6 @@ from .stock import StockAgent from .music import MusicAgent -from .blog import BlogAgent +from .insta import InstaAgent from .realestate import RealestateAgent from .lotto import LottoAgent from .youtube import YouTubeResearchAgent @@ -11,7 +11,7 @@ AGENT_REGISTRY = {} def init_agents(): AGENT_REGISTRY["stock"] = StockAgent() AGENT_REGISTRY["music"] = MusicAgent() - AGENT_REGISTRY["blog"] = BlogAgent() + AGENT_REGISTRY["insta"] = InstaAgent() AGENT_REGISTRY["realestate"] = RealestateAgent() AGENT_REGISTRY["lotto"] = LottoAgent() AGENT_REGISTRY["youtube"] = YouTubeResearchAgent() diff --git a/agent-office/app/agents/blog.py b/agent-office/app/agents/blog.py deleted file mode 100644 index b93c875..0000000 --- a/agent-office/app/agents/blog.py +++ /dev/null @@ -1,192 +0,0 @@ -import asyncio -from typing import Optional - -from .base import BaseAgent -from ..db import ( - create_task, update_task_status, approve_task, reject_task, - get_task, get_agent_config, add_log, -) -from .. import service_proxy -from .. import telegram_bot - - -DEFAULT_TREND_KEYWORDS = [ - "다이어트 식단", "재택근무 꿀템", "캠핑 장비 추천", - "홈트레이닝", "제주도 여행", "에어프라이어 레시피", -] - - -class BlogAgent(BaseAgent): - """블로그 마케팅 에이전트. 
- - 매일 10:00 자동 실행: 키워드 1개 리서치 → 글 생성 → 마케터 → 평가자 - → 평가 점수와 요약을 텔레그램 승인 요청으로 푸시 - → 승인 시 `published` 상태로 전환, 거절 시 재생성 - """ - - agent_id = "blog" - display_name = "블로그 마케터" - - async def on_schedule(self) -> None: - if self.state not in ("idle", "break"): - return - - config = get_agent_config(self.agent_id) or {} - custom = config.get("custom_config", {}) or {} - keywords = custom.get("trend_keywords") or DEFAULT_TREND_KEYWORDS - if not keywords: - return - - import random - keyword = random.choice(keywords) - - task_id = create_task( - self.agent_id, - "auto_blog_pipeline", - {"keyword": keyword}, - requires_approval=True, - ) - await self.transition("working", f"리서치: {keyword}", task_id) - asyncio.create_task(self._run_pipeline(task_id, keyword)) - - async def _await_task(self, step: str, task_id: str, timeout_sec: int = 240) -> Optional[int]: - """blog-lab BackgroundTask 완료 폴링. 완료 시 result_id 반환.""" - attempts = max(1, timeout_sec // 5) - for _ in range(attempts): - await asyncio.sleep(5) - status = await service_proxy.blog_task_status(task_id) - s = status.get("status") - if s == "succeeded": - return status.get("result_id") - if s == "failed": - raise Exception(f"{step} failed: {status.get('error')}") - raise Exception(f"{step} timeout ({timeout_sec}s 내 완료되지 않음)") - - async def _run_pipeline(self, task_id: str, keyword: str) -> None: - try: - # 1) 리서치 - research = await service_proxy.blog_research(keyword) - keyword_id = await self._await_task("research", research.get("task_id"), 180) - if not keyword_id: - raise Exception("research succeeded but result_id missing") - - # 2) 작가 단계 (비동기) - await self.transition("working", f"글 생성: {keyword}", task_id) - gen = await service_proxy.blog_generate(keyword_id) - post_id = await self._await_task("generate", gen.get("task_id"), 300) - if not post_id: - raise Exception("generate succeeded but post_id missing") - - # 3) 마케터 단계 (비동기) - await self.transition("working", "링크 삽입 중", task_id) - mkt = await 
service_proxy.blog_market(post_id) - await self._await_task("market", mkt.get("task_id"), 180) - - # 4) 평가자 단계 (비동기) - await self.transition("working", "품질 리뷰 중", task_id) - rev = await service_proxy.blog_review(post_id) - await self._await_task("review", rev.get("task_id"), 180) - - post_after = await service_proxy.blog_get_post(post_id) - score = post_after.get("review_score") - passed = (score or 0) >= 42 - - title = post_after.get("title", "(제목 없음)") - excerpt = (post_after.get("body") or "")[:300] - - update_task_status(task_id, "pending", { - "keyword": keyword, - "post_id": post_id, - "score": score, - "passed": passed, - "title": title, - }) - - await self.transition("waiting", f"승인 대기 · {score}/60", task_id) - - detail = ( - f"키워드: {keyword}\n" - f"제목: {title}\n" - f"평가 점수: {score}/60 ({'통과' if passed else '미통과'})\n\n" - f"{excerpt}..." - ) - await telegram_bot.send_approval_request( - self.agent_id, task_id, - "✍️ [블로그 에이전트] 발행 승인 요청", detail, - ) - - except Exception as e: - add_log(self.agent_id, f"Blog pipeline failed: {e}", "error", task_id) - update_task_status(task_id, "failed", {"error": str(e), "keyword": keyword}) - await self.transition("idle", f"오류: {e}") - await telegram_bot.send_task_result( - self.agent_id, "✍️ [블로그 에이전트] 파이프라인 실패", - f"키워드: {keyword}\n오류: {e}", - ) - - async def on_command(self, command: str, params: dict) -> dict: - if command == "research": - keyword = (params.get("keyword") or "").strip() - if not keyword: - return {"ok": False, "message": "keyword 필수"} - task_id = create_task( - self.agent_id, "auto_blog_pipeline", - {"keyword": keyword}, requires_approval=True, - ) - await self.transition("working", f"리서치: {keyword}", task_id) - asyncio.create_task(self._run_pipeline(task_id, keyword)) - return {"ok": True, "task_id": task_id, "message": f"파이프라인 시작: {keyword}"} - - if command == "add_trend_keyword": - keyword = (params.get("keyword") or "").strip() - if not keyword: - return {"ok": False, "message": "keyword 필수"} - 
config = get_agent_config(self.agent_id) or {} - custom = config.get("custom_config", {}) or {} - kws = list(custom.get("trend_keywords") or []) - if keyword not in kws: - kws.append(keyword) - from ..db import update_agent_config - update_agent_config(self.agent_id, custom_config={**custom, "trend_keywords": kws}) - return {"ok": True, "keywords": kws} - - if command == "list_trend_keywords": - config = get_agent_config(self.agent_id) or {} - custom = config.get("custom_config", {}) or {} - return {"ok": True, "keywords": custom.get("trend_keywords") or DEFAULT_TREND_KEYWORDS} - - return {"ok": False, "message": f"Unknown command: {command}"} - - async def on_approval(self, task_id: str, approved: bool, feedback: str = "") -> None: - task = get_task(task_id) - if not task: - return - result = task.get("result_data") or {} - post_id = result.get("post_id") - - if not approved: - reject_task(task_id) - await self.transition("idle", "발행 거절됨") - await telegram_bot.send_task_result( - self.agent_id, "✍️ [블로그 에이전트] 발행 취소", - f"키워드: {result.get('keyword', '')}\n사용자가 거절했습니다.", - ) - return - - approve_task(task_id, via="telegram") - await self.transition("reporting", "발행 중...", task_id) - - try: - if post_id: - await service_proxy.blog_publish(int(post_id)) - update_task_status(task_id, "succeeded", {**result, "published": True}) - await telegram_bot.send_task_result( - self.agent_id, "✍️ [블로그 에이전트] 발행 완료", - f"키워드: {result.get('keyword', '')}\n제목: {result.get('title', '')}\n" - f"점수: {result.get('score')}/60", - ) - await self.transition("idle", "발행 완료") - except Exception as e: - add_log(self.agent_id, f"Blog publish failed: {e}", "error", task_id) - update_task_status(task_id, "failed", {**result, "publish_error": str(e)}) - await self.transition("idle", f"발행 오류: {e}") diff --git a/agent-office/app/agents/insta.py b/agent-office/app/agents/insta.py new file mode 100644 index 0000000..2750879 --- /dev/null +++ b/agent-office/app/agents/insta.py @@ -0,0 +1,162 @@ +"""인스타 
카드 에이전트 — 매일 09:30 뉴스 수집·키워드 추출 → 텔레그램 후보 푸시. +사용자가 키워드 버튼을 누르면 카드 슬레이트 생성 + 10장 미디어 그룹 발송.""" + +import asyncio +import json +import logging +from typing import Any, Dict, List, Optional + +import httpx + +from .base import BaseAgent +from ..db import ( + create_task, update_task_status, add_log, get_agent_config, +) +from ..config import TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID +from .. import service_proxy +from ..telegram import messaging + +logger = logging.getLogger(__name__) + + +async def _send_media_group(media: List[Dict[str, Any]], caption: str = "") -> Dict[str, Any]: + """텔레그램 sendMediaGroup. media는 InputMediaPhoto dicts. + 각 항목에는 임시 키 '_bytes'로 PNG 바이트가 담겨 있어 attach:// 형식으로 multipart 업로드.""" + if not TELEGRAM_BOT_TOKEN: + return {"ok": False, "reason": "TELEGRAM_BOT_TOKEN missing"} + url = f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMediaGroup" + files: Dict[str, tuple] = {} + for i, m in enumerate(media): + attach_key = f"photo{i+1}" + files[attach_key] = (f"{i+1}.png", m["_bytes"], "image/png") + m["media"] = f"attach://{attach_key}" + m.pop("_bytes", None) + if caption and media: + media[0]["caption"] = caption[:1024] + payload = {"chat_id": TELEGRAM_CHAT_ID, "media": json.dumps(media, ensure_ascii=False)} + async with httpx.AsyncClient(timeout=60) as client: + resp = await client.post(url, data=payload, files=files) + return resp.json() + + +class InstaAgent(BaseAgent): + agent_id = "insta" + display_name = "인스타 큐레이터" + + async def on_schedule(self) -> None: + """09:30 매일: 뉴스 수집 → 키워드 추출 → 텔레그램 후보 푸시. 
+ custom_config.auto_select=True면 카테고리당 1위 키워드 자동 슬레이트 생성.""" + if self.state not in ("idle", "break"): + return + config = get_agent_config(self.agent_id) or {} + custom = config.get("custom_config", {}) or {} + auto_select = bool(custom.get("auto_select", False)) + + task_id = create_task(self.agent_id, "insta_daily", {"auto_select": auto_select}, + requires_approval=False) + await self.transition("working", "뉴스 수집·키워드 추출", task_id) + try: + await self._run_collect_and_extract() + kws = await service_proxy.insta_list_keywords(used=False) + if auto_select: + await self._auto_render(kws) + else: + await self._push_keyword_candidates(kws) + update_task_status(task_id, "succeeded", {"keywords": len(kws)}) + await self.transition("idle", "후보 푸시 완료") + except Exception as e: + add_log(self.agent_id, f"insta daily failed: {e}", "error", task_id) + update_task_status(task_id, "failed", {"error": str(e)}) + await self.transition("idle", f"오류: {e}") + + async def _run_collect_and_extract(self) -> None: + col = await service_proxy.insta_collect() + await self._wait_task(col["task_id"], step="collect", timeout_sec=300) + ext = await service_proxy.insta_extract() + await self._wait_task(ext["task_id"], step="extract", timeout_sec=300) + + async def _wait_task(self, task_id: str, step: str, timeout_sec: int = 300) -> Dict[str, Any]: + attempts = max(1, timeout_sec // 5) + for _ in range(attempts): + await asyncio.sleep(5) + st = await service_proxy.insta_task_status(task_id) + if st["status"] == "succeeded": + return st + if st["status"] == "failed": + raise RuntimeError(f"{step} failed: {st.get('error')}") + raise TimeoutError(f"{step} timeout {timeout_sec}s") + + async def _push_keyword_candidates(self, keywords: List[Dict[str, Any]]) -> None: + by_cat: Dict[str, List[Dict[str, Any]]] = {} + for k in keywords: + by_cat.setdefault(k["category"], []).append(k) + if not by_cat: + await messaging.send_raw("📰 [인스타 큐레이터] 오늘은 추천할 키워드가 없습니다.") + return + rows: List[List[Dict[str, 
Any]]] = [] + text_lines = ["📰 [인스타 큐레이터] 오늘의 키워드 후보"] + for cat, items in by_cat.items(): + text_lines.append(f"\n{cat}") + for k in items[:5]: + text_lines.append(f" · {k['keyword']} (score {k['score']:.2f})") + rows.append([{ + "text": f"🎴 {k['keyword']}", + "callback_data": f"render_{k['id']}", + }]) + await messaging.send_raw("\n".join(text_lines), reply_markup={"inline_keyboard": rows}) + + async def _auto_render(self, keywords: List[Dict[str, Any]]) -> None: + by_cat: Dict[str, Dict[str, Any]] = {} + for k in keywords: + cat = k["category"] + if cat not in by_cat or k["score"] > by_cat[cat]["score"]: + by_cat[cat] = k + for kw in by_cat.values(): + await self._render_and_push(kw["id"]) + + async def _render_and_push(self, keyword_id: int) -> None: + kw = await service_proxy.insta_get_keyword(keyword_id) + if not kw: + await messaging.send_raw(f"⚠️ 키워드 {keyword_id} 없음") + return + await messaging.send_raw(f"🎨 카드 생성 중: {kw['keyword']}") + created = await service_proxy.insta_create_slate( + keyword=kw["keyword"], category=kw["category"], keyword_id=kw["id"], + ) + st = await self._wait_task(created["task_id"], step="slate", timeout_sec=600) + slate_id = st["result_id"] + slate = await service_proxy.insta_get_slate(slate_id) + media = [] + for a in slate["assets"][:10]: + data = await service_proxy.insta_get_asset_bytes(slate_id, a["page_index"]) + media.append({"type": "photo", "_bytes": data}) + caption = slate.get("suggested_caption", "") + hashtags = " ".join(slate.get("hashtags", []) or []) + full_caption = f"{caption}\n\n{hashtags}".strip() + await _send_media_group(media, caption=full_caption) + + async def on_command(self, command: str, params: dict) -> dict: + if command == "extract": + await self._run_collect_and_extract() + kws = await service_proxy.insta_list_keywords(used=False) + await self._push_keyword_candidates(kws) + return {"ok": True, "count": len(kws)} + if command == "render": + kid = int(params.get("keyword_id") or 0) + if not kid: + 
return {"ok": False, "message": "keyword_id 필수"} + await self._render_and_push(kid) + return {"ok": True} + return {"ok": False, "message": f"Unknown command: {command}"} + + async def on_callback(self, action: str, params: dict) -> dict: + if action == "render": + kid = int(params.get("keyword_id") or 0) + if not kid: + return {"ok": False} + await self._render_and_push(kid) + return {"ok": True} + return {"ok": False} + + async def on_approval(self, task_id: str, approved: bool, feedback: str = "") -> None: + return diff --git a/agent-office/app/config.py b/agent-office/app/config.py index 890d66d..89d2724 100644 --- a/agent-office/app/config.py +++ b/agent-office/app/config.py @@ -3,7 +3,7 @@ import os # Service URLs (Docker internal network) STOCK_URL = os.getenv("STOCK_URL", "http://localhost:18500") MUSIC_LAB_URL = os.getenv("MUSIC_LAB_URL", "http://localhost:18600") -BLOG_LAB_URL = os.getenv("BLOG_LAB_URL", "http://localhost:18700") +INSTA_LAB_URL = os.getenv("INSTA_LAB_URL", "http://localhost:18700") REALESTATE_LAB_URL = os.getenv("REALESTATE_LAB_URL", "http://localhost:18800") # Telegram diff --git a/agent-office/app/scheduler.py b/agent-office/app/scheduler.py index 05fc275..b75fe31 100644 --- a/agent-office/app/scheduler.py +++ b/agent-office/app/scheduler.py @@ -24,8 +24,8 @@ async def _run_stock_ai_news(): if agent: await agent.on_ai_news_schedule() -async def _run_blog_schedule(): - agent = AGENT_REGISTRY.get("blog") +async def _run_insta_schedule(): + agent = AGENT_REGISTRY.get("insta") if agent: await agent.on_schedule() @@ -67,7 +67,7 @@ def init_scheduler(): minute=0, id="stock_ai_news_sentiment", ) - scheduler.add_job(_run_blog_schedule, "cron", hour=10, minute=0, id="blog_pipeline") + scheduler.add_job(_run_insta_schedule, "cron", hour=9, minute=30, id="insta_pipeline") scheduler.add_job(_run_lotto_schedule, "cron", day_of_week="mon", hour=9, minute=0, id="lotto_curate") scheduler.add_job(_run_youtube_research, "cron", hour=9, minute=0, 
id="youtube_research") scheduler.add_job(_send_youtube_weekly_report, "cron", day_of_week="mon", hour=8, minute=0, id="youtube_weekly_report") diff --git a/agent-office/app/service_proxy.py b/agent-office/app/service_proxy.py index 891abe0..a0ca473 100644 --- a/agent-office/app/service_proxy.py +++ b/agent-office/app/service_proxy.py @@ -1,7 +1,7 @@ import httpx from typing import Any, Dict, List, Optional -from .config import STOCK_URL, MUSIC_LAB_URL, BLOG_LAB_URL, REALESTATE_LAB_URL +from .config import STOCK_URL, MUSIC_LAB_URL, INSTA_LAB_URL, REALESTATE_LAB_URL _client = httpx.AsyncClient(timeout=30.0) @@ -101,58 +101,70 @@ async def get_music_credits() -> Dict[str, Any]: return resp.json() -# --- blog-lab --- +# --- insta-lab --- -async def blog_research(keyword: str) -> Dict[str, Any]: - """키워드 리서치 시작 → task_id 반환""" +async def insta_collect(categories: Optional[list] = None) -> Dict[str, Any]: + """뉴스 수집 트리거 → task_id 반환.""" + payload = {"categories": categories} if categories else {} + resp = await _client.post(f"{INSTA_LAB_URL}/api/insta/news/collect", json=payload) + resp.raise_for_status() + return resp.json() + + +async def insta_extract(categories: Optional[list] = None) -> Dict[str, Any]: + payload = {"categories": categories} if categories else {} + resp = await _client.post(f"{INSTA_LAB_URL}/api/insta/keywords/extract", json=payload) + resp.raise_for_status() + return resp.json() + + +async def insta_list_keywords(category: Optional[str] = None, + used: Optional[bool] = None) -> List[Dict[str, Any]]: + params: Dict[str, Any] = {} + if category: + params["category"] = category + if used is not None: + params["used"] = "true" if used else "false" + resp = await _client.get(f"{INSTA_LAB_URL}/api/insta/keywords", params=params) + resp.raise_for_status() + return resp.json().get("items", []) + + +async def insta_get_keyword(keyword_id: int) -> Optional[Dict[str, Any]]: + items = await insta_list_keywords() + for it in items: + if it["id"] == keyword_id: + 
return it + return None + + +async def insta_create_slate(keyword: str, category: str, keyword_id: Optional[int] = None) -> Dict[str, Any]: resp = await _client.post( - f"{BLOG_LAB_URL}/api/blog-marketing/research", - json={"keyword": keyword}, + f"{INSTA_LAB_URL}/api/insta/slates", + json={"keyword": keyword, "category": category, "keyword_id": keyword_id}, ) resp.raise_for_status() return resp.json() -async def blog_task_status(task_id: str) -> Dict[str, Any]: - resp = await _client.get(f"{BLOG_LAB_URL}/api/blog-marketing/task/{task_id}") +async def insta_task_status(task_id: str) -> Dict[str, Any]: + resp = await _client.get(f"{INSTA_LAB_URL}/api/insta/tasks/{task_id}") resp.raise_for_status() return resp.json() -async def blog_generate(keyword_id: int) -> Dict[str, Any]: - resp = await _client.post( - f"{BLOG_LAB_URL}/api/blog-marketing/generate", - json={"keyword_id": keyword_id}, - ) +async def insta_get_slate(slate_id: int) -> Dict[str, Any]: + resp = await _client.get(f"{INSTA_LAB_URL}/api/insta/slates/{slate_id}") resp.raise_for_status() return resp.json() -async def blog_market(post_id: int) -> Dict[str, Any]: - resp = await _client.post(f"{BLOG_LAB_URL}/api/blog-marketing/market/{post_id}") - resp.raise_for_status() - return resp.json() - - -async def blog_review(post_id: int) -> Dict[str, Any]: - resp = await _client.post(f"{BLOG_LAB_URL}/api/blog-marketing/review/{post_id}") - resp.raise_for_status() - return resp.json() - - -async def blog_publish(post_id: int, url: str = "") -> Dict[str, Any]: - resp = await _client.post( - f"{BLOG_LAB_URL}/api/blog-marketing/posts/{post_id}/publish", - json={"url": url}, - ) - resp.raise_for_status() - return resp.json() - - -async def blog_get_post(post_id: int) -> Dict[str, Any]: - resp = await _client.get(f"{BLOG_LAB_URL}/api/blog-marketing/posts/{post_id}") - resp.raise_for_status() - return resp.json() +async def insta_get_asset_bytes(slate_id: int, page: int) -> bytes: + """카드 PNG 바이트를 가져와 텔레그램 미디어 그룹에 첨부.""" 
+ async with httpx.AsyncClient(timeout=30) as client: + resp = await client.get(f"{INSTA_LAB_URL}/api/insta/slates/{slate_id}/assets/{page}") + resp.raise_for_status() + return resp.content # --- realestate-lab --- diff --git a/agent-office/app/telegram/webhook.py b/agent-office/app/telegram/webhook.py index b66d38a..82a1e63 100644 --- a/agent-office/app/telegram/webhook.py +++ b/agent-office/app/telegram/webhook.py @@ -37,6 +37,9 @@ async def _handle_callback(callback_query: dict) -> Optional[dict]: if callback_id.startswith("realestate_bookmark_"): return await _handle_realestate_bookmark(callback_query, callback_id) + if callback_id.startswith("render_"): + return await _handle_insta_render(callback_query, callback_id) + cb = get_telegram_callback(callback_id) if not cb: return None @@ -97,6 +100,38 @@ async def _handle_realestate_bookmark(callback_query: dict, callback_id: str) -> return {"ok": False, "error": str(e)} + +async def _handle_insta_render(callback_query: dict, callback_id: str) -> dict: + """render_{keyword_id} 콜백 → InstaAgent.on_callback('render', ...). + + 텔레그램 인라인 버튼이 보낸 callback_data가 `render_{keyword_id}` 형식. 
+ InstaAgent._push_keyword_candidates가 callback_data를 그대로 박아 보내며, + 별도 DB lookup 없이 keyword_id를 파싱해 dispatch한다.""" + from .messaging import send_raw + from ..agents import AGENT_REGISTRY + + await api_call( + "answerCallbackQuery", + {"callback_query_id": callback_query["id"], "text": "카드 생성 시작"}, + ) + + try: + keyword_id = int(callback_id.removeprefix("render_")) + except ValueError: + await send_raw("⚠️ 잘못된 render 콜백 데이터") + return {"ok": False, "error": "invalid_callback_data"} + + agent = AGENT_REGISTRY.get("insta") + if not agent: + await send_raw("⚠️ insta agent 미등록") + return {"ok": False, "error": "agent_missing"} + + try: + return await agent.on_callback("render", {"keyword_id": keyword_id}) + except Exception as e: + await send_raw(f"⚠️ 카드 생성 실패: {e}") + return {"ok": False, "error": str(e)} + + async def _handle_message(message: dict, agent_dispatcher) -> Optional[dict]: """슬래시 명령 메시지 처리.""" from .router import parse_command, resolve_agent_command, HELP_TEXT diff --git a/agent-office/tests/test_insta_agent.py b/agent-office/tests/test_insta_agent.py new file mode 100644 index 0000000..13c0c96 --- /dev/null +++ b/agent-office/tests/test_insta_agent.py @@ -0,0 +1,85 @@ +import os +import sys +import tempfile + +_fd, _TMP = tempfile.mkstemp(suffix=".db") +os.close(_fd) +os.unlink(_TMP) +os.environ["AGENT_OFFICE_DB_PATH"] = _TMP + +sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) + +from unittest.mock import patch, AsyncMock, MagicMock + +import pytest + +from app.agents.insta import InstaAgent + + +@pytest.fixture(autouse=True) +def _init_db(): + import gc + gc.collect() + if os.path.exists(_TMP): + os.remove(_TMP) + from app.db import init_db + init_db() + yield + gc.collect() + + +@pytest.mark.asyncio +async def test_on_command_extract_dispatches(monkeypatch): + agent = InstaAgent() + fake_collect = AsyncMock(return_value={"task_id": "tcollect"}) + fake_extract = AsyncMock(return_value={"task_id": "textract"}) + fake_status 
= AsyncMock(side_effect=[ + {"status": "succeeded", "result_id": 0}, + {"status": "succeeded", "result_id": 0}, + ]) + fake_keywords = AsyncMock(return_value=[ + {"id": 1, "keyword": "K1", "category": "economy", "score": 0.9}, + {"id": 2, "keyword": "K2", "category": "psychology", "score": 0.8}, + ]) + + monkeypatch.setattr("app.agents.insta.service_proxy.insta_collect", fake_collect) + monkeypatch.setattr("app.agents.insta.service_proxy.insta_extract", fake_extract) + monkeypatch.setattr("app.agents.insta.service_proxy.insta_task_status", fake_status) + monkeypatch.setattr("app.agents.insta.service_proxy.insta_list_keywords", fake_keywords) + monkeypatch.setattr("app.agents.insta.messaging.send_raw", AsyncMock(return_value={"ok": True})) + + result = await agent.on_command("extract", {}) + assert result["ok"] is True + fake_collect.assert_awaited() + fake_extract.assert_awaited() + + +@pytest.mark.asyncio +async def test_on_callback_render_kicks_pipeline(monkeypatch): + agent = InstaAgent() + fake_kw = AsyncMock(return_value={"id": 7, "keyword": "테스트", "category": "economy"}) + fake_create = AsyncMock(return_value={"task_id": "tslate"}) + fake_status = AsyncMock(side_effect=[ + {"status": "processing"}, + {"status": "succeeded", "result_id": 42}, + ]) + fake_slate = AsyncMock(return_value={ + "id": 42, "status": "rendered", + "suggested_caption": "캡션", "hashtags": ["#a", "#b"], + "assets": [{"page_index": i, "file_path": f"/x/{i}.png"} for i in range(1, 11)], + }) + fake_bytes = AsyncMock(side_effect=[b"PNG"] * 10) + fake_send_media = AsyncMock(return_value={"ok": True}) + + monkeypatch.setattr("app.agents.insta.service_proxy.insta_get_keyword", fake_kw) + monkeypatch.setattr("app.agents.insta.service_proxy.insta_create_slate", fake_create) + monkeypatch.setattr("app.agents.insta.service_proxy.insta_task_status", fake_status) + monkeypatch.setattr("app.agents.insta.service_proxy.insta_get_slate", fake_slate) + 
monkeypatch.setattr("app.agents.insta.service_proxy.insta_get_asset_bytes", fake_bytes) + monkeypatch.setattr("app.agents.insta._send_media_group", fake_send_media) + monkeypatch.setattr("app.agents.insta.messaging.send_raw", AsyncMock(return_value={"ok": True})) + + out = await agent.on_callback("render", {"keyword_id": 7}) + assert out["ok"] is True + fake_create.assert_awaited() + fake_send_media.assert_awaited() diff --git a/blog-lab/.dockerignore b/blog-lab/.dockerignore deleted file mode 100644 index 35aac26..0000000 --- a/blog-lab/.dockerignore +++ /dev/null @@ -1,4 +0,0 @@ -__pycache__ -*.pyc -.env -data/ diff --git a/blog-lab/app/config.py b/blog-lab/app/config.py deleted file mode 100644 index 9907bae..0000000 --- a/blog-lab/app/config.py +++ /dev/null @@ -1,15 +0,0 @@ -import os - -# Anthropic Claude API -ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY", "") -CLAUDE_MODEL = os.getenv("CLAUDE_MODEL", "claude-sonnet-4-20250514") - -# Naver Search API -NAVER_CLIENT_ID = os.getenv("NAVER_CLIENT_ID", "") -NAVER_CLIENT_SECRET = os.getenv("NAVER_CLIENT_SECRET", "") - -# Database -DB_PATH = os.getenv("BLOG_DB_PATH", "/app/data/blog_marketing.db") - -# CORS -CORS_ALLOW_ORIGINS = os.getenv("CORS_ALLOW_ORIGINS", "http://localhost:3007,http://localhost:8080") diff --git a/blog-lab/app/content_generator.py b/blog-lab/app/content_generator.py deleted file mode 100644 index ba060f9..0000000 --- a/blog-lab/app/content_generator.py +++ /dev/null @@ -1,172 +0,0 @@ -"""Claude API 기반 콘텐츠 생성 — 트렌드 브리프 + 블로그 글 작성.""" - -import json -import logging -from datetime import date -from typing import Any, Dict, Optional - -import anthropic - -from .config import ANTHROPIC_API_KEY, CLAUDE_MODEL -from .db import get_template - -logger = logging.getLogger(__name__) - -_client: Optional[anthropic.Anthropic] = None - - -def _get_client() -> anthropic.Anthropic: - global _client - if _client is None: - _client = anthropic.Anthropic(api_key=ANTHROPIC_API_KEY) - return _client - - -def 
_call_claude(prompt: str, max_tokens: int = 4096) -> str: - """Claude API 호출. 단일 user 메시지. 현재 날짜 시스템 프롬프트 포함.""" - client = _get_client() - today = date.today().isoformat() - resp = client.messages.create( - model=CLAUDE_MODEL, - max_tokens=max_tokens, - system=f"현재 날짜는 {today}입니다. 모든 콘텐츠는 이 날짜 기준으로 작성하세요.", - messages=[{"role": "user", "content": prompt}], - ) - return resp.content[0].text - - -def generate_trend_brief(analysis: Dict[str, Any]) -> str: - """키워드 분석 데이터를 바탕으로 트렌드 브리프 생성.""" - template = get_template("trend_brief") - if not template: - raise RuntimeError("trend_brief 템플릿이 없습니다") - - top_blogs_text = "\n".join( - f"- {b.get('title', '')}" for b in analysis.get("top_blogs", []) - ) or "없음" - - top_products_text = "\n".join( - f"- {p.get('title', '')} ({p.get('lprice', '?')}원, {p.get('mallName', '')})" - for p in analysis.get("top_products", []) - ) or "없음" - - prompt = template.format( - keyword=analysis.get("keyword", ""), - competition=analysis.get("competition", 0), - opportunity=analysis.get("opportunity", 0), - top_blogs=top_blogs_text, - top_products=top_products_text, - ) - - return _call_claude(prompt) - - -def _parse_blog_json(raw: str, keyword: str) -> Dict[str, str]: - """Claude 응답에서 블로그 JSON을 파싱.""" - try: - text = raw.strip() - if text.startswith("```"): - lines = text.split("\n") - lines = [l for l in lines if not l.strip().startswith("```")] - text = "\n".join(lines) - result = json.loads(text) - return { - "title": result.get("title", ""), - "body": result.get("body", ""), - "excerpt": result.get("excerpt", ""), - "tags": result.get("tags", []), - } - except (json.JSONDecodeError, KeyError): - logger.warning("Blog post JSON parse failed, using raw text") - return { - "title": f"{keyword} 추천 리뷰", - "body": raw, - "excerpt": raw[:200], - "tags": [keyword], - } - - -def generate_blog_post( - analysis: Dict[str, Any], - trend_brief: str, - brand_links: Optional[list] = None, -) -> Dict[str, str]: - """트렌드 브리프를 바탕으로 블로그 글 작성. 
- - Returns: - {"title": str, "body": str, "excerpt": str, "tags": [...]} - """ - template = get_template("blog_write") - if not template: - raise RuntimeError("blog_write 템플릿이 없습니다") - - top_products_text = "\n".join( - f"- {p.get('title', '')} ({p.get('lprice', '?')}원, {p.get('mallName', '')})" - for p in analysis.get("top_products", []) - ) or "없음" - - # 크롤링된 블로그 본문 참고 자료 - reference_blogs_text = "" - for blog in analysis.get("top_blogs", []): - content = blog.get("content", "") - if content: - reference_blogs_text += f"\n### {blog.get('title', '제목 없음')}\n{content}\n" - if not reference_blogs_text: - reference_blogs_text = "없음" - - # 브랜드커넥트 링크 정보 - brand_products_text = "" - if brand_links: - for link in brand_links: - brand_products_text += ( - f"- 상품명: {link.get('product_name', '')}\n" - f" 설명: {link.get('description', '')}\n" - f" 링크: {link.get('url', '')}\n" - f" 배치 힌트: {link.get('placement_hint', '자연스럽게')}\n" - ) - if not brand_products_text: - brand_products_text = "없음 (제휴 링크 없이 일반 리뷰로 작성)" - - prompt = template.format( - keyword=analysis.get("keyword", ""), - trend_brief=trend_brief, - top_products=top_products_text, - reference_blogs=reference_blogs_text, - brand_products=brand_products_text, - ) - - # 구조화된 응답을 위한 추가 지시 - prompt += ( - "\n\n---\n" - "응답은 반드시 아래 JSON 형식으로 해주세요 (JSON만 출력, 다른 텍스트 없이):\n" - '{"title": "블로그 제목", "body": "HTML 본문", "excerpt": "2줄 요약", ' - '"tags": ["태그1", "태그2", ...]}' - ) - - raw = _call_claude(prompt, max_tokens=8192) - return _parse_blog_json(raw, analysis.get("keyword", "")) - - -def regenerate_blog_post( - analysis: Dict[str, Any], - trend_brief: str, - previous_body: str, - feedback: str, -) -> Dict[str, str]: - """피드백을 반영하여 블로그 글 재생성.""" - prompt = ( - "당신은 네이버 블로그에서 월 100만 이상 수익을 올리는 전문 블로거입니다.\n" - f"키워드: {analysis.get('keyword', '')}\n\n" - f"이전에 작성한 글:\n{previous_body[:3000]}\n\n" - f"리뷰어 피드백:\n{feedback}\n\n" - "위 피드백을 반영하여 글을 개선해주세요.\n" - "작성 규칙: 1인칭 체험기, 2,000자 이상, 자연스러운 구어체, " - "제품 비교표 포함, 광고 고지 문구 포함.\n" - 
"HTML 형식으로 작성하되, 네이버 블로그에서 바로 붙여넣기 가능한 형태로.\n\n" - "---\n" - "응답은 반드시 아래 JSON 형식으로 해주세요 (JSON만 출력):\n" - '{"title": "블로그 제목", "body": "HTML 본문", "excerpt": "2줄 요약", ' - '"tags": ["태그1", "태그2", ...]}' - ) - raw = _call_claude(prompt, max_tokens=8192) - return _parse_blog_json(raw, analysis.get("keyword", "")) diff --git a/blog-lab/app/db.py b/blog-lab/app/db.py deleted file mode 100644 index a531ad0..0000000 --- a/blog-lab/app/db.py +++ /dev/null @@ -1,790 +0,0 @@ -import os -import sqlite3 -import json -from typing import Any, Dict, List, Optional - -from .config import DB_PATH - - -def _conn() -> sqlite3.Connection: - os.makedirs(os.path.dirname(DB_PATH), exist_ok=True) - conn = sqlite3.connect(DB_PATH, timeout=120.0) - conn.row_factory = sqlite3.Row - conn.execute("PRAGMA journal_mode=WAL") - conn.execute("PRAGMA busy_timeout=120000") - return conn - - -def init_db() -> None: - with _conn() as conn: - # 키워드/상품 분석 결과 - conn.execute(""" - CREATE TABLE IF NOT EXISTS keyword_analyses ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - keyword TEXT NOT NULL, - blog_total INTEGER NOT NULL DEFAULT 0, - shop_total INTEGER NOT NULL DEFAULT 0, - competition REAL NOT NULL DEFAULT 0, - opportunity REAL NOT NULL DEFAULT 0, - avg_price INTEGER, - min_price INTEGER, - max_price INTEGER, - top_products TEXT NOT NULL DEFAULT '[]', - top_blogs TEXT NOT NULL DEFAULT '[]', - ai_summary TEXT NOT NULL DEFAULT '', - created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')) - ) - """) - conn.execute("CREATE INDEX IF NOT EXISTS idx_ka_created ON keyword_analyses(created_at DESC)") - conn.execute("CREATE INDEX IF NOT EXISTS idx_ka_keyword ON keyword_analyses(keyword)") - - # 블로그 포스트 - conn.execute(""" - CREATE TABLE IF NOT EXISTS blog_posts ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - keyword_id INTEGER REFERENCES keyword_analyses(id), - title TEXT NOT NULL DEFAULT '', - body TEXT NOT NULL DEFAULT '', - excerpt TEXT NOT NULL DEFAULT '', - tags TEXT NOT NULL DEFAULT '[]', - status 
TEXT NOT NULL DEFAULT 'draft', - review_score INTEGER, - review_detail TEXT NOT NULL DEFAULT '{}', - naver_url TEXT NOT NULL DEFAULT '', - trend_brief TEXT NOT NULL DEFAULT '', - created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')), - updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')) - ) - """) - conn.execute("CREATE INDEX IF NOT EXISTS idx_bp_created ON blog_posts(created_at DESC)") - conn.execute("CREATE INDEX IF NOT EXISTS idx_bp_status ON blog_posts(status)") - - # 수익(커미션) 추적 - conn.execute(""" - CREATE TABLE IF NOT EXISTS commissions ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - post_id INTEGER REFERENCES blog_posts(id), - month TEXT NOT NULL, - clicks INTEGER NOT NULL DEFAULT 0, - purchases INTEGER NOT NULL DEFAULT 0, - revenue INTEGER NOT NULL DEFAULT 0, - note TEXT NOT NULL DEFAULT '', - created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')) - ) - """) - conn.execute("CREATE INDEX IF NOT EXISTS idx_comm_month ON commissions(month)") - conn.execute("CREATE INDEX IF NOT EXISTS idx_comm_post ON commissions(post_id)") - - # 비동기 작업 상태 (research / generate / review) - conn.execute(""" - CREATE TABLE IF NOT EXISTS generation_tasks ( - id TEXT PRIMARY KEY, - type TEXT NOT NULL DEFAULT 'research', - status TEXT NOT NULL DEFAULT 'queued', - progress INTEGER NOT NULL DEFAULT 0, - message TEXT NOT NULL DEFAULT '', - result_id INTEGER, - error TEXT, - params TEXT NOT NULL DEFAULT '{}', - created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')), - updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')) - ) - """) - conn.execute("CREATE INDEX IF NOT EXISTS idx_gt_created ON generation_tasks(created_at DESC)") - - # AI 프롬프트 템플릿 - conn.execute(""" - CREATE TABLE IF NOT EXISTS prompt_templates ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - name TEXT NOT NULL UNIQUE, - description TEXT NOT NULL DEFAULT '', - template TEXT NOT NULL DEFAULT '', - updated_at TEXT NOT NULL DEFAULT 
(strftime('%Y-%m-%dT%H:%M:%fZ','now')) - ) - """) - - # 브랜드커넥트 제휴 링크 - conn.execute(""" - CREATE TABLE IF NOT EXISTS brand_links ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - post_id INTEGER REFERENCES blog_posts(id), - keyword_id INTEGER REFERENCES keyword_analyses(id), - url TEXT NOT NULL, - product_name TEXT NOT NULL DEFAULT '', - description TEXT NOT NULL DEFAULT '', - placement_hint TEXT NOT NULL DEFAULT '', - created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')) - ) - """) - conn.execute("CREATE INDEX IF NOT EXISTS idx_bl_post ON brand_links(post_id)") - conn.execute("CREATE INDEX IF NOT EXISTS idx_bl_keyword ON brand_links(keyword_id)") - - # 기본 프롬프트 템플릿 시딩 (존재하지 않을 때만) - _seed_templates(conn) - _migrate_templates(conn) - - -def _seed_templates(conn: sqlite3.Connection) -> None: - """기본 프롬프트 템플릿을 DB에 시딩.""" - templates = [ - { - "name": "trend_brief", - "description": "네이버 블로그 트렌드 분석 + 제목/훅 전략 브리프", - "template": ( - "당신은 네이버 블로그 마케팅 전문가입니다.\n" - "아래 키워드 분석 데이터를 바탕으로 블로그 포스팅 전략 브리프를 작성하세요.\n\n" - "키워드: {keyword}\n" - "블로그 경쟁도: {competition} (0-100, 높을수록 경쟁 치열)\n" - "쇼핑 기회 점수: {opportunity} (0-100, 높을수록 기회 큼)\n" - "상위 블로그 제목들: {top_blogs}\n" - "상위 상품들: {top_products}\n\n" - "다음을 포함해주세요:\n" - "1. 클릭을 유도하는 제목 공식 3가지\n" - "2. 도입부 훅 전략 (공감형, 질문형, 충격형 중 추천)\n" - "3. 추천 해시태그 5-10개\n" - "4. 경쟁 분석 요약 (기존 글 대비 차별화 포인트)\n" - "5. SEO 키워드 배치 전략" - ), - }, - { - "name": "blog_write", - "description": "공감형 1인칭 체험기 블로그 글 작성", - "template": ( - "당신은 네이버 블로그에서 월 100만 이상 수익을 올리는 전문 블로거입니다.\n" - "아래 브리프를 바탕으로 블로그 글을 작성하세요.\n\n" - "키워드: {keyword}\n" - "트렌드 브리프: {trend_brief}\n" - "상위 상품 정보: {top_products}\n\n" - "작성 규칙:\n" - "- 1인칭 체험기 형식 (\"제가 직접 써봤는데요\")\n" - "- 1,500자 이상\n" - "- 자연스러운 구어체 (네이버 블로그 톤)\n" - "- 제품 비교표 포함 (마크다운 테이블)\n" - "- 장단점 솔직하게 작성\n" - "- 광고 고지 문구 포함: \"이 포스팅은 쿠팡 파트너스 활동의 일환으로, 이에 따른 일정액의 수수료를 제공받습니다.\"\n" - "- 추천 매트릭스 (가성비/품질/디자인 기준)\n" - "- 자연스러운 CTA (구매 링크 유도)\n\n" - "HTML 형식으로 작성하되, 네이버 블로그에서 바로 붙여넣기 가능한 형태로 만들어주세요." 
- ), - }, - { - "name": "quality_review", - "description": "블로그 글 품질 리뷰 (6기준 × 10점)", - "template": ( - "당신은 블로그 콘텐츠 품질 평가 전문가입니다.\n" - "아래 블로그 글을 6가지 기준으로 평가해주세요.\n\n" - "제목: {title}\n" - "본문: {body}\n\n" - "평가 기준 (각 1-10점):\n" - "1. 독자 공감도 (empathy): 1인칭 체험기가 자연스럽고 공감되는가?\n" - "2. 제목 클릭 유도력 (click_appeal): 검색 결과에서 클릭하고 싶은 제목인가?\n" - "3. 구매 전환력 (conversion): 읽고 나서 제품을 사고 싶어지는가?\n" - "4. SEO 최적화 (seo): 키워드 배치, 소제목, 길이가 적절한가?\n" - "5. 형식 완성도 (format): 비교표, 이미지 설명, 단락 구성이 잘 되어있는가?\n" - "6. 링크 자연스러움 (link_natural): 제휴 링크가 광고처럼 느껴지지 않고 자연스럽게 녹아있는가? (링크가 없으면 5점 기본)\n\n" - "JSON 형식으로 응답:\n" - "{{\n" - " \"scores\": {{\n" - " \"empathy\": N,\n" - " \"click_appeal\": N,\n" - " \"conversion\": N,\n" - " \"seo\": N,\n" - " \"format\": N,\n" - " \"link_natural\": N\n" - " }},\n" - " \"total\": N,\n" - " \"pass\": true/false,\n" - " \"feedback\": \"개선 사항 설명\"\n" - "}}" - ), - }, - { - "name": "marketer_enhance", - "description": "마케터 전환율 강화 + 제휴 링크 삽입", - "template": ( - "당신은 네이버 블로그 수익화 전문 마케터입니다.\n" - "아래 블로그 초안에 제휴 링크를 자연스럽게 삽입하고 전환율을 강화하세요.\n\n" - "=== 블로그 초안 ===\n{draft_body}\n\n" - "=== 타겟 키워드 ===\n{keyword}\n\n" - "=== 삽입할 제휴 링크 ===\n{brand_links_info}\n\n" - "작업 규칙:\n" - "- 제휴 링크를 상품명 형태로 본문 흐름에 맞게 2~3곳 삽입\n" - "- 결론에 CTA(Call-to-Action) 블록 추가 (\"지금 확인하기\" 등)\n" - "- 글 맨 아래에 광고 고지 문구 자동 삽입: \"이 포스팅은 브랜드로부터 소정의 수수료를 받을 수 있습니다\"\n" - "- 작가의 1인칭 톤과 구어체를 유지\n" - "- 과도한 광고 느낌 없이 자연스러운 추천 흐름 유지\n" - "- 구매 심리를 자극하는 표현 강화 (한정 수량, 가격 비교, 실사용 만족도 등)\n" - "- 배치 힌트가 있으면 참고하되, 문맥이 더 자연스러운 위치 우선\n" - "- 기존 본문의 구조와 길이를 크게 변경하지 않음" - ), - }, - ] - for t in templates: - existing = conn.execute( - "SELECT id FROM prompt_templates WHERE name = ?", (t["name"],) - ).fetchone() - if not existing: - conn.execute( - "INSERT INTO prompt_templates (name, description, template) VALUES (?, ?, ?)", - (t["name"], t["description"], t["template"]), - ) - - -def _migrate_templates(conn: sqlite3.Connection) -> None: - """기존 템플릿을 최신 버전으로 업데이트.""" - new_blog_write = ( - "당신은 네이버 블로그에서 월 100만 이상 수익을 올리는 
전문 블로거입니다.\n" - "아래 브리프와 참고 자료를 바탕으로 블로그 글을 작성하세요.\n\n" - "키워드: {keyword}\n" - "트렌드 브리프: {trend_brief}\n\n" - "=== 상위 블로그 참고 자료 ===\n" - "{reference_blogs}\n\n" - "=== 상위 상품 정보 ===\n" - "{top_products}\n\n" - "=== 제휴 상품 (브랜드커넥트 링크) ===\n" - "{brand_products}\n\n" - "작성 규칙:\n" - "- 1인칭 체험기 형식 (\"제가 직접 써봤는데요\")\n" - "- 2,000자 이상\n" - "- 자연스러운 구어체 (네이버 블로그 톤)\n" - "- 상위 블로그 참고하되 표절 금지 (자신만의 시각으로 재구성)\n" - "- 제품 비교표 포함 (HTML 테이블)\n" - "- 장단점 솔직하게 작성\n" - "- 제휴 상품이 있으면 자연스럽게 체험 맥락에 녹여서 작성\n" - "- 제휴 링크는 태그로 자연스럽게 삽입\n" - "- 추천 매트릭스 (가성비/품질/디자인 기준)\n" - "- 자연스러운 CTA (구매 링크 유도)\n\n" - "HTML 형식으로 작성하되, 네이버 블로그에서 바로 붙여넣기 가능한 형태로 만들어주세요." - ) - conn.execute( - "UPDATE prompt_templates SET template = ?, updated_at = strftime('%Y-%m-%dT%H:%M:%fZ','now') WHERE name = 'blog_write'", - (new_blog_write,), - ) - - new_quality_review = ( - "당신은 블로그 콘텐츠 품질 평가 전문가입니다.\n" - "아래 블로그 글을 6가지 기준으로 평가해주세요.\n\n" - "제목: {title}\n" - "본문: {body}\n\n" - "평가 기준 (각 1-10점):\n" - "1. 독자 공감도 (empathy): 1인칭 체험기가 자연스럽고 공감되는가?\n" - "2. 제목 클릭 유도력 (click_appeal): 검색 결과에서 클릭하고 싶은 제목인가?\n" - "3. 구매 전환력 (conversion): 읽고 나서 제품을 사고 싶어지는가?\n" - "4. SEO 최적화 (seo): 키워드 배치, 소제목, 길이가 적절한가?\n" - "5. 형식 완성도 (format): 비교표, 이미지 설명, 단락 구성이 잘 되어있는가?\n" - "6. 링크 자연스러움 (link_natural): 제휴 링크가 광고처럼 느껴지지 않고 자연스럽게 녹아있는가? 
(링크가 없으면 5점 기본)\n\n" - "JSON 형식으로 응답:\n" - "{{\n" - " \"scores\": {{\n" - " \"empathy\": N,\n" - " \"click_appeal\": N,\n" - " \"conversion\": N,\n" - " \"seo\": N,\n" - " \"format\": N,\n" - " \"link_natural\": N\n" - " }},\n" - " \"total\": N,\n" - " \"pass\": true/false,\n" - " \"feedback\": \"개선 사항 설명\"\n" - "}}" - ) - conn.execute( - "UPDATE prompt_templates SET template = ?, updated_at = strftime('%Y-%m-%dT%H:%M:%fZ','now') WHERE name = 'quality_review'", - (new_quality_review,), - ) - - # marketer_enhance가 없으면 추가 - existing = conn.execute("SELECT id FROM prompt_templates WHERE name = 'marketer_enhance'").fetchone() - if not existing: - conn.execute( - "INSERT INTO prompt_templates (name, description, template) VALUES (?, ?, ?)", - ("marketer_enhance", "마케터 전환율 강화 + 제휴 링크 삽입", - "당신은 네이버 블로그 수익화 전문 마케터입니다.\n" - "아래 블로그 초안에 제휴 링크를 자연스럽게 삽입하고 전환율을 강화하세요.\n\n" - "=== 블로그 초안 ===\n{draft_body}\n\n" - "=== 타겟 키워드 ===\n{keyword}\n\n" - "=== 삽입할 제휴 링크 ===\n{brand_links_info}\n\n" - "작업 규칙:\n" - "- 제휴 링크를 상품명 형태로 본문 흐름에 맞게 2~3곳 삽입\n" - "- 결론에 CTA(Call-to-Action) 블록 추가\n" - "- 글 맨 아래에 광고 고지 문구 자동 삽입\n" - "- 작가의 1인칭 톤과 구어체를 유지\n" - "- 과도한 광고 느낌 없이 자연스러운 추천 흐름 유지"), - ) - - -# ── keyword_analyses CRUD ──────────────────────────────────────────────────── - -def _ka_row_to_dict(r) -> Dict[str, Any]: - return { - "id": r["id"], - "keyword": r["keyword"], - "blog_total": r["blog_total"], - "shop_total": r["shop_total"], - "competition": r["competition"], - "opportunity": r["opportunity"], - "avg_price": r["avg_price"], - "min_price": r["min_price"], - "max_price": r["max_price"], - "top_products": json.loads(r["top_products"]) if r["top_products"] else [], - "top_blogs": json.loads(r["top_blogs"]) if r["top_blogs"] else [], - "ai_summary": r["ai_summary"], - "created_at": r["created_at"], - } - - -def add_keyword_analysis(data: Dict[str, Any]) -> Dict[str, Any]: - with _conn() as conn: - conn.execute( - """INSERT INTO keyword_analyses - (keyword, blog_total, shop_total, 
competition, opportunity, - avg_price, min_price, max_price, top_products, top_blogs, ai_summary) - VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", - ( - data.get("keyword", ""), - data.get("blog_total", 0), - data.get("shop_total", 0), - data.get("competition", 0), - data.get("opportunity", 0), - data.get("avg_price"), - data.get("min_price"), - data.get("max_price"), - json.dumps(data.get("top_products", []), ensure_ascii=False), - json.dumps(data.get("top_blogs", []), ensure_ascii=False), - data.get("ai_summary", ""), - ), - ) - row = conn.execute( - "SELECT * FROM keyword_analyses WHERE rowid = last_insert_rowid()" - ).fetchone() - return _ka_row_to_dict(row) - - -def get_keyword_analysis(analysis_id: int) -> Optional[Dict[str, Any]]: - with _conn() as conn: - row = conn.execute( - "SELECT * FROM keyword_analyses WHERE id = ?", (analysis_id,) - ).fetchone() - return _ka_row_to_dict(row) if row else None - - -def get_keyword_analyses(limit: int = 30) -> List[Dict[str, Any]]: - with _conn() as conn: - rows = conn.execute( - "SELECT * FROM keyword_analyses ORDER BY created_at DESC LIMIT ?", (limit,) - ).fetchall() - return [_ka_row_to_dict(r) for r in rows] - - -def delete_keyword_analysis(analysis_id: int) -> bool: - with _conn() as conn: - row = conn.execute( - "SELECT id FROM keyword_analyses WHERE id = ?", (analysis_id,) - ).fetchone() - if not row: - return False - conn.execute("DELETE FROM keyword_analyses WHERE id = ?", (analysis_id,)) - return True - - -# ── blog_posts CRUD ────────────────────────────────────────────────────────── - -def _post_row_to_dict(r) -> Dict[str, Any]: - return { - "id": r["id"], - "keyword_id": r["keyword_id"], - "title": r["title"], - "body": r["body"], - "excerpt": r["excerpt"], - "tags": json.loads(r["tags"]) if r["tags"] else [], - "status": r["status"], - "review_score": r["review_score"], - "review_detail": json.loads(r["review_detail"]) if r["review_detail"] else {}, - "naver_url": r["naver_url"], - "trend_brief": 
r["trend_brief"], - "created_at": r["created_at"], - "updated_at": r["updated_at"], - } - - -def add_post(data: Dict[str, Any]) -> Dict[str, Any]: - with _conn() as conn: - conn.execute( - """INSERT INTO blog_posts - (keyword_id, title, body, excerpt, tags, status, review_score, - review_detail, naver_url, trend_brief) - VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", - ( - data.get("keyword_id"), - data.get("title", ""), - data.get("body", ""), - data.get("excerpt", ""), - json.dumps(data.get("tags", []), ensure_ascii=False), - data.get("status", "draft"), - data.get("review_score"), - json.dumps(data.get("review_detail", {}), ensure_ascii=False), - data.get("naver_url", ""), - data.get("trend_brief", ""), - ), - ) - row = conn.execute( - "SELECT * FROM blog_posts WHERE rowid = last_insert_rowid()" - ).fetchone() - return _post_row_to_dict(row) - - -def get_post(post_id: int) -> Optional[Dict[str, Any]]: - with _conn() as conn: - row = conn.execute( - "SELECT * FROM blog_posts WHERE id = ?", (post_id,) - ).fetchone() - return _post_row_to_dict(row) if row else None - - -def get_posts(status: Optional[str] = None, limit: int = 50) -> List[Dict[str, Any]]: - with _conn() as conn: - if status: - rows = conn.execute( - "SELECT * FROM blog_posts WHERE status = ? 
ORDER BY created_at DESC LIMIT ?", - (status, limit), - ).fetchall() - else: - rows = conn.execute( - "SELECT * FROM blog_posts ORDER BY created_at DESC LIMIT ?", (limit,) - ).fetchall() - return [_post_row_to_dict(r) for r in rows] - - -def update_post(post_id: int, data: Dict[str, Any]) -> Optional[Dict[str, Any]]: - with _conn() as conn: - fields = [] - values = [] - for k in ("title", "body", "excerpt", "status", "naver_url", "trend_brief"): - if k in data: - fields.append(f"{k} = ?") - values.append(data[k]) - if "tags" in data: - fields.append("tags = ?") - values.append(json.dumps(data["tags"], ensure_ascii=False)) - if "review_score" in data: - fields.append("review_score = ?") - values.append(data["review_score"]) - if "review_detail" in data: - fields.append("review_detail = ?") - values.append(json.dumps(data["review_detail"], ensure_ascii=False)) - if not fields: - return get_post(post_id) - fields.append("updated_at = strftime('%Y-%m-%dT%H:%M:%fZ','now')") - values.append(post_id) - conn.execute( - f"UPDATE blog_posts SET {', '.join(fields)} WHERE id = ?", values - ) - row = conn.execute( - "SELECT * FROM blog_posts WHERE id = ?", (post_id,) - ).fetchone() - return _post_row_to_dict(row) if row else None - - -def delete_post(post_id: int) -> bool: - with _conn() as conn: - row = conn.execute( - "SELECT id FROM blog_posts WHERE id = ?", (post_id,) - ).fetchone() - if not row: - return False - conn.execute("DELETE FROM blog_posts WHERE id = ?", (post_id,)) - return True - - -# ── commissions CRUD ───────────────────────────────────────────────────────── - -def _comm_row_to_dict(r) -> Dict[str, Any]: - return { - "id": r["id"], - "post_id": r["post_id"], - "month": r["month"], - "clicks": r["clicks"], - "purchases": r["purchases"], - "revenue": r["revenue"], - "note": r["note"], - "created_at": r["created_at"], - } - - -def add_commission(data: Dict[str, Any]) -> Dict[str, Any]: - with _conn() as conn: - conn.execute( - """INSERT INTO commissions 
(post_id, month, clicks, purchases, revenue, note) - VALUES (?, ?, ?, ?, ?, ?)""", - ( - data.get("post_id"), - data.get("month", ""), - data.get("clicks", 0), - data.get("purchases", 0), - data.get("revenue", 0), - data.get("note", ""), - ), - ) - row = conn.execute( - "SELECT * FROM commissions WHERE rowid = last_insert_rowid()" - ).fetchone() - return _comm_row_to_dict(row) - - -def get_commissions(post_id: Optional[int] = None, limit: int = 100) -> List[Dict[str, Any]]: - with _conn() as conn: - if post_id: - rows = conn.execute( - "SELECT * FROM commissions WHERE post_id = ? ORDER BY month DESC LIMIT ?", - (post_id, limit), - ).fetchall() - else: - rows = conn.execute( - "SELECT * FROM commissions ORDER BY month DESC LIMIT ?", (limit,) - ).fetchall() - return [_comm_row_to_dict(r) for r in rows] - - -def update_commission(comm_id: int, data: Dict[str, Any]) -> Optional[Dict[str, Any]]: - with _conn() as conn: - fields = [] - values = [] - for k in ("month", "clicks", "purchases", "revenue", "note"): - if k in data: - fields.append(f"{k} = ?") - values.append(data[k]) - if not fields: - return None - values.append(comm_id) - conn.execute( - f"UPDATE commissions SET {', '.join(fields)} WHERE id = ?", values - ) - row = conn.execute( - "SELECT * FROM commissions WHERE id = ?", (comm_id,) - ).fetchone() - return _comm_row_to_dict(row) if row else None - - -def delete_commission(comm_id: int) -> bool: - with _conn() as conn: - row = conn.execute( - "SELECT id FROM commissions WHERE id = ?", (comm_id,) - ).fetchone() - if not row: - return False - conn.execute("DELETE FROM commissions WHERE id = ?", (comm_id,)) - return True - - -# ── brand_links CRUD ──────────────────────────────────────────────────────── - -def _bl_row_to_dict(r) -> Dict[str, Any]: - return { - "id": r["id"], - "post_id": r["post_id"], - "keyword_id": r["keyword_id"], - "url": r["url"], - "product_name": r["product_name"], - "description": r["description"], - "placement_hint": 
r["placement_hint"], - "created_at": r["created_at"], - } - - -def add_brand_link(data: Dict[str, Any]) -> Dict[str, Any]: - with _conn() as conn: - conn.execute( - """INSERT INTO brand_links (post_id, keyword_id, url, product_name, description, placement_hint) - VALUES (?, ?, ?, ?, ?, ?)""", - ( - data.get("post_id"), - data.get("keyword_id"), - data.get("url", ""), - data.get("product_name", ""), - data.get("description", ""), - data.get("placement_hint", ""), - ), - ) - row = conn.execute( - "SELECT * FROM brand_links WHERE rowid = last_insert_rowid()" - ).fetchone() - return _bl_row_to_dict(row) - - -def get_brand_links( - post_id: Optional[int] = None, - keyword_id: Optional[int] = None, -) -> List[Dict[str, Any]]: - with _conn() as conn: - if post_id is not None: - rows = conn.execute( - "SELECT * FROM brand_links WHERE post_id = ? ORDER BY id", (post_id,) - ).fetchall() - elif keyword_id is not None: - rows = conn.execute( - "SELECT * FROM brand_links WHERE keyword_id = ? ORDER BY id", (keyword_id,) - ).fetchall() - else: - rows = conn.execute("SELECT * FROM brand_links ORDER BY id DESC LIMIT 100").fetchall() - return [_bl_row_to_dict(r) for r in rows] - - -def update_brand_link(link_id: int, data: Dict[str, Any]) -> Optional[Dict[str, Any]]: - with _conn() as conn: - fields = [] - values = [] - for k in ("post_id", "keyword_id", "url", "product_name", "description", "placement_hint"): - if k in data: - fields.append(f"{k} = ?") - values.append(data[k]) - if not fields: - row = conn.execute("SELECT * FROM brand_links WHERE id = ?", (link_id,)).fetchone() - return _bl_row_to_dict(row) if row else None - values.append(link_id) - conn.execute(f"UPDATE brand_links SET {', '.join(fields)} WHERE id = ?", values) - row = conn.execute("SELECT * FROM brand_links WHERE id = ?", (link_id,)).fetchone() - return _bl_row_to_dict(row) if row else None - - -def delete_brand_link(link_id: int) -> bool: - with _conn() as conn: - row = conn.execute("SELECT id FROM brand_links 
WHERE id = ?", (link_id,)).fetchone() - if not row: - return False - conn.execute("DELETE FROM brand_links WHERE id = ?", (link_id,)) - return True - - -def link_brand_links_to_post(keyword_id: int, post_id: int) -> None: - """keyword_id로 등록된 링크들을 post_id에도 연결.""" - with _conn() as conn: - conn.execute( - "UPDATE brand_links SET post_id = ? WHERE keyword_id = ? AND post_id IS NULL", - (post_id, keyword_id), - ) - - -def get_dashboard_stats() -> Dict[str, Any]: - """대시보드 집계: 총 포스트/클릭/구매/수익 + 월별 추이.""" - with _conn() as conn: - total_posts = conn.execute("SELECT COUNT(*) FROM blog_posts").fetchone()[0] - published = conn.execute( - "SELECT COUNT(*) FROM blog_posts WHERE status = 'published'" - ).fetchone()[0] - - agg = conn.execute( - "SELECT COALESCE(SUM(clicks),0), COALESCE(SUM(purchases),0), COALESCE(SUM(revenue),0) FROM commissions" - ).fetchone() - - monthly = conn.execute( - """SELECT month, SUM(clicks) as clicks, SUM(purchases) as purchases, SUM(revenue) as revenue - FROM commissions GROUP BY month ORDER BY month DESC LIMIT 12""" - ).fetchall() - - top_posts = conn.execute( - """SELECT bp.id, bp.title, COALESCE(SUM(c.revenue),0) as total_revenue - FROM blog_posts bp LEFT JOIN commissions c ON c.post_id = bp.id - GROUP BY bp.id ORDER BY total_revenue DESC LIMIT 5""" - ).fetchall() - - return { - "total_posts": total_posts, - "published_posts": published, - "total_clicks": agg[0], - "total_purchases": agg[1], - "total_revenue": agg[2], - "monthly": [ - {"month": r["month"], "clicks": r["clicks"], "purchases": r["purchases"], "revenue": r["revenue"]} - for r in monthly - ], - "top_posts": [ - {"id": r["id"], "title": r["title"], "total_revenue": r["total_revenue"]} - for r in top_posts - ], - } - - -# ── generation_tasks CRUD ──────────────────────────────────────────────────── - -def _task_row_to_dict(r) -> Dict[str, Any]: - return { - "task_id": r["id"], - "type": r["type"], - "status": r["status"], - "progress": r["progress"], - "message": r["message"], - 
"result_id": r["result_id"], - "error": r["error"], - "params": json.loads(r["params"]) if r["params"] else {}, - "created_at": r["created_at"], - "updated_at": r["updated_at"], - } - - -def create_task(task_id: str, task_type: str, params: Dict[str, Any]) -> Dict[str, Any]: - with _conn() as conn: - conn.execute( - "INSERT INTO generation_tasks (id, type, params) VALUES (?, ?, ?)", - (task_id, task_type, json.dumps(params, ensure_ascii=False)), - ) - row = conn.execute( - "SELECT * FROM generation_tasks WHERE id = ?", (task_id,) - ).fetchone() - return _task_row_to_dict(row) - - -def update_task( - task_id: str, - status: str, - progress: int, - message: str, - result_id: Optional[int] = None, - error: Optional[str] = None, -) -> None: - with _conn() as conn: - conn.execute( - """UPDATE generation_tasks - SET status = ?, progress = ?, message = ?, result_id = ?, error = ?, - updated_at = strftime('%Y-%m-%dT%H:%M:%fZ','now') - WHERE id = ?""", - (status, progress, message, result_id, error, task_id), - ) - - -def get_task(task_id: str) -> Optional[Dict[str, Any]]: - with _conn() as conn: - row = conn.execute( - "SELECT * FROM generation_tasks WHERE id = ?", (task_id,) - ).fetchone() - return _task_row_to_dict(row) if row else None - - -# ── prompt_templates CRUD ──────────────────────────────────────────────────── - -def get_template(name: str) -> Optional[str]: - with _conn() as conn: - row = conn.execute( - "SELECT template FROM prompt_templates WHERE name = ?", (name,) - ).fetchone() - return row["template"] if row else None - - -def get_all_templates() -> List[Dict[str, Any]]: - with _conn() as conn: - rows = conn.execute("SELECT * FROM prompt_templates ORDER BY name").fetchall() - return [ - {"id": r["id"], "name": r["name"], "description": r["description"], - "template": r["template"], "updated_at": r["updated_at"]} - for r in rows - ] - - -def update_template(name: str, template: str) -> bool: - with _conn() as conn: - conn.execute( - "UPDATE 
prompt_templates SET template = ?, updated_at = strftime('%Y-%m-%dT%H:%M:%fZ','now') WHERE name = ?", - (template, name), - ) - return conn.execute( - "SELECT id FROM prompt_templates WHERE name = ?", (name,) - ).fetchone() is not None diff --git a/blog-lab/app/main.py b/blog-lab/app/main.py deleted file mode 100644 index a932731..0000000 --- a/blog-lab/app/main.py +++ /dev/null @@ -1,440 +0,0 @@ -import os -import uuid -import logging -from fastapi import FastAPI, HTTPException, BackgroundTasks, Query -from fastapi.middleware.cors import CORSMiddleware -from pydantic import BaseModel -from typing import List, Optional - -from .config import CORS_ALLOW_ORIGINS, NAVER_CLIENT_ID, ANTHROPIC_API_KEY -from .db import ( - init_db, - get_keyword_analyses, get_keyword_analysis, delete_keyword_analysis, - add_keyword_analysis, - get_posts, get_post, add_post, update_post, delete_post, - get_commissions, add_commission, update_commission, delete_commission, - get_dashboard_stats, - get_task, create_task, update_task, - add_brand_link, get_brand_links, update_brand_link, delete_brand_link, - link_brand_links_to_post, -) -from .naver_search import analyze_keyword_with_crawling -from .content_generator import generate_trend_brief, generate_blog_post, regenerate_blog_post -from .quality_reviewer import review_post -from .marketer import enhance_for_conversion - -logger = logging.getLogger(__name__) - -app = FastAPI() - -_cors_origins = CORS_ALLOW_ORIGINS.split(",") -app.add_middleware( - CORSMiddleware, - allow_origins=[o.strip() for o in _cors_origins], - allow_credentials=False, - allow_methods=["GET", "POST", "PUT", "DELETE", "OPTIONS"], - allow_headers=["Content-Type"], -) - - -@app.on_event("startup") -def on_startup(): - init_db() - os.makedirs("/app/data", exist_ok=True) - - -@app.get("/health") -def health(): - return {"ok": True} - - -@app.get("/api/blog-marketing/status") -def service_status(): - """서비스 상태 및 설정 현황.""" - return { - "ok": True, - "naver_api": 
bool(NAVER_CLIENT_ID), - "claude_api": bool(ANTHROPIC_API_KEY), - } - - -# ── 키워드 분석 API ────────────────────────────────────────────────────────── - -class ResearchRequest(BaseModel): - keyword: str - - -def _run_research(task_id: str, keyword: str): - """BackgroundTask: 네이버 검색 → 키워드 분석 → DB 저장.""" - try: - update_task(task_id, "processing", 30, "네이버 검색 중...") - result = analyze_keyword_with_crawling(keyword) - - update_task(task_id, "processing", 80, "분석 결과 저장 중...") - saved = add_keyword_analysis(result) - - update_task(task_id, "succeeded", 100, "분석 완료", result_id=saved["id"]) - except Exception as e: - logger.exception("Research failed for keyword=%s", keyword) - update_task(task_id, "failed", 0, "", error=str(e)) - - -@app.post("/api/blog-marketing/research") -def start_research(req: ResearchRequest, background_tasks: BackgroundTasks): - """키워드 분석 시작 (BackgroundTask). task_id 즉시 반환.""" - if not NAVER_CLIENT_ID: - raise HTTPException(status_code=400, detail="Naver API 키가 설정되지 않았습니다") - if not req.keyword.strip(): - raise HTTPException(status_code=400, detail="키워드를 입력하세요") - - task_id = str(uuid.uuid4()) - create_task(task_id, "research", {"keyword": req.keyword.strip()}) - background_tasks.add_task(_run_research, task_id, req.keyword.strip()) - return {"task_id": task_id} - - -@app.get("/api/blog-marketing/research/history") -def list_research(limit: int = Query(30, ge=1, le=100)): - return {"analyses": get_keyword_analyses(limit)} - - -@app.get("/api/blog-marketing/research/{analysis_id}") -def get_research(analysis_id: int): - result = get_keyword_analysis(analysis_id) - if not result: - raise HTTPException(status_code=404, detail="Analysis not found") - return result - - -@app.delete("/api/blog-marketing/research/{analysis_id}") -def remove_research(analysis_id: int): - if not delete_keyword_analysis(analysis_id): - raise HTTPException(status_code=404, detail="Analysis not found") - return {"ok": True} - - -# ── 작업 상태 폴링 API 
────────────────────────────────────────────────────── - -@app.get("/api/blog-marketing/task/{task_id}") -def get_task_status(task_id: str): - task = get_task(task_id) - if not task: - raise HTTPException(status_code=404, detail="Task not found") - return task - - -# ── AI 글 생성 API ────────────────────────────────────────────────────────── - -class GenerateRequest(BaseModel): - keyword_id: int # keyword_analyses.id - - -class LinkRequest(BaseModel): - url: str - product_name: str - keyword_id: Optional[int] = None - post_id: Optional[int] = None - description: str = "" - placement_hint: str = "" - - -def _run_generate(task_id: str, keyword_id: int): - """BackgroundTask: 트렌드 브리프 → 블로그 글 생성 → DB 저장.""" - try: - analysis = get_keyword_analysis(keyword_id) - if not analysis: - update_task(task_id, "failed", 0, "", error="키워드 분석 결과를 찾을 수 없습니다") - return - - # 연결된 브랜드커넥트 링크 조회 - brand_links = get_brand_links(keyword_id=keyword_id) - - update_task(task_id, "processing", 20, "트렌드 브리프 생성 중...") - trend_brief = generate_trend_brief(analysis) - - update_task(task_id, "processing", 60, "블로그 글 작성 중...") - post_data = generate_blog_post(analysis, trend_brief, brand_links=brand_links) - - update_task(task_id, "processing", 90, "저장 중...") - saved = add_post({ - "keyword_id": keyword_id, - "title": post_data["title"], - "body": post_data["body"], - "excerpt": post_data["excerpt"], - "tags": post_data["tags"], - "status": "draft", - "trend_brief": trend_brief, - }) - - # keyword_id에 연결된 링크를 post_id에도 연결 - link_brand_links_to_post(keyword_id=keyword_id, post_id=saved["id"]) - - update_task(task_id, "succeeded", 100, "글 생성 완료", result_id=saved["id"]) - except Exception as e: - logger.exception("Generate failed for keyword_id=%s", keyword_id) - update_task(task_id, "failed", 0, "", error=str(e)) - - -@app.post("/api/blog-marketing/generate") -def start_generate(req: GenerateRequest, background_tasks: BackgroundTasks): - """AI 블로그 글 생성 시작. 
task_id 즉시 반환.""" - if not ANTHROPIC_API_KEY: - raise HTTPException(status_code=400, detail="Claude API 키가 설정되지 않았습니다") - analysis = get_keyword_analysis(req.keyword_id) - if not analysis: - raise HTTPException(status_code=404, detail="키워드 분석 결과를 찾을 수 없습니다") - - task_id = str(uuid.uuid4()) - create_task(task_id, "generate", {"keyword_id": req.keyword_id}) - background_tasks.add_task(_run_generate, task_id, req.keyword_id) - return {"task_id": task_id} - - -# ── 품질 리뷰 API ─────────────────────────────────────────────────────────── - -def _run_review(task_id: str, post_id: int): - """BackgroundTask: 블로그 글 품질 리뷰.""" - try: - post = get_post(post_id) - if not post: - update_task(task_id, "failed", 0, "", error="포스트를 찾을 수 없습니다") - return - - update_task(task_id, "processing", 50, "품질 리뷰 중...") - result = review_post(post["title"], post["body"]) - - update_post(post_id, { - "review_score": result["total"], - "review_detail": result, - "status": "reviewed" if result["pass"] else "draft", - }) - - update_task(task_id, "succeeded", 100, "리뷰 완료", result_id=post_id) - except Exception as e: - logger.exception("Review failed for post_id=%s", post_id) - update_task(task_id, "failed", 0, "", error=str(e)) - - -@app.post("/api/blog-marketing/review/{post_id}") -def start_review(post_id: int, background_tasks: BackgroundTasks): - """블로그 글 품질 리뷰 시작. 
task_id 즉시 반환.""" - if not ANTHROPIC_API_KEY: - raise HTTPException(status_code=400, detail="Claude API 키가 설정되지 않았습니다") - post = get_post(post_id) - if not post: - raise HTTPException(status_code=404, detail="Post not found") - - task_id = str(uuid.uuid4()) - create_task(task_id, "review", {"post_id": post_id}) - background_tasks.add_task(_run_review, task_id, post_id) - return {"task_id": task_id} - - -# ── 재생성 API ─────────────────────────────────────────────────────────────── - -def _run_regenerate(task_id: str, post_id: int): - """BackgroundTask: 피드백 기반 블로그 글 재생성.""" - try: - post = get_post(post_id) - if not post: - update_task(task_id, "failed", 0, "", error="포스트를 찾을 수 없습니다") - return - - analysis = get_keyword_analysis(post["keyword_id"]) if post["keyword_id"] else {} - feedback = post.get("review_detail", {}).get("feedback", "개선이 필요합니다") - - update_task(task_id, "processing", 50, "글 재생성 중...") - result = regenerate_blog_post( - analysis or {"keyword": ""}, - post.get("trend_brief", ""), - post["body"], - feedback, - ) - - update_post(post_id, { - "title": result["title"], - "body": result["body"], - "excerpt": result["excerpt"], - "tags": result["tags"], - "status": "draft", - "review_score": None, - "review_detail": {}, - }) - - update_task(task_id, "succeeded", 100, "재생성 완료", result_id=post_id) - except Exception as e: - logger.exception("Regenerate failed for post_id=%s", post_id) - update_task(task_id, "failed", 0, "", error=str(e)) - - -@app.post("/api/blog-marketing/regenerate/{post_id}") -def start_regenerate(post_id: int, background_tasks: BackgroundTasks): - """피드백 기반 블로그 글 재생성. 
task_id 즉시 반환.""" - if not ANTHROPIC_API_KEY: - raise HTTPException(status_code=400, detail="Claude API 키가 설정되지 않았습니다") - post = get_post(post_id) - if not post: - raise HTTPException(status_code=404, detail="Post not found") - - task_id = str(uuid.uuid4()) - create_task(task_id, "regenerate", {"post_id": post_id}) - background_tasks.add_task(_run_regenerate, task_id, post_id) - return {"task_id": task_id} - - -# ── 포스트 CRUD API ────────────────────────────────────────────────────────── - -@app.get("/api/blog-marketing/posts") -def list_posts(status: str = None, limit: int = Query(50, ge=1, le=100)): - return {"posts": get_posts(status=status, limit=limit)} - - -@app.get("/api/blog-marketing/posts/{post_id}") -def get_post_detail(post_id: int): - post = get_post(post_id) - if not post: - raise HTTPException(status_code=404, detail="Post not found") - return post - - -@app.put("/api/blog-marketing/posts/{post_id}") -def edit_post(post_id: int, data: dict): - result = update_post(post_id, data) - if not result: - raise HTTPException(status_code=404, detail="Post not found") - return result - - -@app.delete("/api/blog-marketing/posts/{post_id}") -def remove_post(post_id: int): - if not delete_post(post_id): - raise HTTPException(status_code=404, detail="Post not found") - return {"ok": True} - - -@app.post("/api/blog-marketing/posts/{post_id}/publish") -def publish_post(post_id: int, data: dict = None): - """네이버 URL 등록 + 상태를 published로 변경.""" - naver_url = (data or {}).get("naver_url", "") - result = update_post(post_id, {"status": "published", "naver_url": naver_url}) - if not result: - raise HTTPException(status_code=404, detail="Post not found") - return result - - -# ── 브랜드커넥트 링크 API ────────────────────────────────────────────────── - -@app.post("/api/blog-marketing/links", status_code=201) -def create_link(req: LinkRequest): - return add_brand_link(req.model_dump()) - - -@app.get("/api/blog-marketing/links") -def list_links(post_id: int = None, keyword_id: int = 
None): - return {"links": get_brand_links(post_id=post_id, keyword_id=keyword_id)} - - -@app.put("/api/blog-marketing/links/{link_id}") -def edit_link(link_id: int, data: dict): - result = update_brand_link(link_id, data) - if not result: - raise HTTPException(status_code=404, detail="Link not found") - return result - - -@app.delete("/api/blog-marketing/links/{link_id}") -def remove_link(link_id: int): - if not delete_brand_link(link_id): - raise HTTPException(status_code=404, detail="Link not found") - return {"ok": True} - - -# ── 마케터 API ────────────────────────────────────────────────────────────── - -def _run_market(task_id: str, post_id: int): - """BackgroundTask: 마케터 전환율 강화.""" - try: - post = get_post(post_id) - if not post: - update_task(task_id, "failed", 0, "", error="포스트를 찾을 수 없습니다") - return - - brand_links = get_brand_links(post_id=post_id) - if not brand_links and post.get("keyword_id"): - brand_links = get_brand_links(keyword_id=post["keyword_id"]) - - if not brand_links: - update_task(task_id, "failed", 0, "", error="브랜드커넥트 링크가 없습니다. 먼저 링크를 등록하세요.") - return - - analysis = get_keyword_analysis(post["keyword_id"]) if post.get("keyword_id") else {} - keyword = (analysis or {}).get("keyword", "") - - update_task(task_id, "processing", 50, "마케터가 전환율 강화 중...") - result = enhance_for_conversion( - post_body=post["body"], - post_title=post["title"], - brand_links=brand_links, - keyword=keyword, - ) - - update_post(post_id, { - "title": result["title"], - "body": result["body"], - "excerpt": result["excerpt"], - "status": "marketed", - }) - - update_task(task_id, "succeeded", 100, "마케팅 강화 완료", result_id=post_id) - except Exception as e: - logger.exception("Market failed for post_id=%s", post_id) - update_task(task_id, "failed", 0, "", error=str(e)) - - -@app.post("/api/blog-marketing/market/{post_id}") -def start_market(post_id: int, background_tasks: BackgroundTasks): - """마케터 단계 실행. 
task_id 즉시 반환.""" - if not ANTHROPIC_API_KEY: - raise HTTPException(status_code=400, detail="Claude API 키가 설정되지 않았습니다") - post = get_post(post_id) - if not post: - raise HTTPException(status_code=404, detail="Post not found") - - task_id = str(uuid.uuid4()) - create_task(task_id, "market", {"post_id": post_id}) - background_tasks.add_task(_run_market, task_id, post_id) - return {"task_id": task_id} - - -# ── 수익 추적 API ──────────────────────────────────────────────────────────── - -@app.get("/api/blog-marketing/commissions") -def list_commissions(post_id: int = None, limit: int = Query(100, ge=1, le=100)): - return {"commissions": get_commissions(post_id=post_id, limit=limit)} - - -@app.post("/api/blog-marketing/commissions", status_code=201) -def create_commission(data: dict): - return add_commission(data) - - -@app.put("/api/blog-marketing/commissions/{comm_id}") -def edit_commission(comm_id: int, data: dict): - result = update_commission(comm_id, data) - if not result: - raise HTTPException(status_code=404, detail="Commission not found") - return result - - -@app.delete("/api/blog-marketing/commissions/{comm_id}") -def remove_commission(comm_id: int): - if not delete_commission(comm_id): - raise HTTPException(status_code=404, detail="Commission not found") - return {"ok": True} - - -# ── 대시보드 API ───────────────────────────────────────────────────────────── - -@app.get("/api/blog-marketing/dashboard") -def dashboard(): - return get_dashboard_stats() diff --git a/blog-lab/app/marketer.py b/blog-lab/app/marketer.py deleted file mode 100644 index e82cbb9..0000000 --- a/blog-lab/app/marketer.py +++ /dev/null @@ -1,105 +0,0 @@ -"""마케터 단계 — 전환율 강화 + 브랜드커넥트 링크 삽입.""" - -import json -import logging -from datetime import date -from typing import Any, Dict, List, Optional - -import anthropic - -from .config import ANTHROPIC_API_KEY, CLAUDE_MODEL -from .db import get_template - -logger = logging.getLogger(__name__) - -_client: Optional[anthropic.Anthropic] = None - - -def 
_get_client() -> anthropic.Anthropic: - global _client - if _client is None: - _client = anthropic.Anthropic(api_key=ANTHROPIC_API_KEY) - return _client - - -def _call_claude(prompt: str, max_tokens: int = 8192) -> str: - client = _get_client() - today = date.today().isoformat() - resp = client.messages.create( - model=CLAUDE_MODEL, - max_tokens=max_tokens, - system=f"현재 날짜는 {today}입니다. 모든 콘텐츠는 이 날짜 기준으로 작성하세요.", - messages=[{"role": "user", "content": prompt}], - ) - return resp.content[0].text - - -def enhance_for_conversion( - post_body: str, - post_title: str, - brand_links: List[Dict[str, Any]], - keyword: str, -) -> Dict[str, str]: - """초안에 제휴 링크를 자연스럽게 삽입하고 전환율을 강화. - - Args: - post_body: 작가 초안 HTML 본문 - post_title: 작가 초안 제목 - brand_links: 브랜드커넥트 링크 리스트 - keyword: 타겟 키워드 - - Returns: - {"title": str, "body": str, "excerpt": str} - - Raises: - ValueError: 브랜드 링크가 없을 때 - """ - if not brand_links: - raise ValueError("브랜드커넥트 링크가 필요합니다") - - template = get_template("marketer_enhance") - if not template: - raise RuntimeError("marketer_enhance 템플릿이 없습니다") - - brand_links_text = "" - for i, link in enumerate(brand_links, 1): - brand_links_text += ( - f"{i}. 
상품명: {link.get('product_name', '')}\n" - f" 설명: {link.get('description', '')}\n" - f" URL: {link.get('url', '')}\n" - f" 배치 힌트: {link.get('placement_hint', '자연스럽게')}\n\n" - ) - - prompt = template.format( - draft_body=post_body[:6000], - keyword=keyword, - brand_links_info=brand_links_text, - ) - - prompt += ( - "\n\n---\n" - "응답은 반드시 아래 JSON 형식으로 해주세요 (JSON만 출력):\n" - '{"title": "개선된 제목", "body": "개선된 HTML 본문", "excerpt": "2줄 요약"}' - ) - - raw = _call_claude(prompt) - - try: - text = raw.strip() - if text.startswith("```"): - lines = text.split("\n") - lines = [l for l in lines if not l.strip().startswith("```")] - text = "\n".join(lines) - result = json.loads(text) - return { - "title": result.get("title", post_title), - "body": result.get("body", post_body), - "excerpt": result.get("excerpt", ""), - } - except (json.JSONDecodeError, KeyError): - logger.warning("Marketer JSON parse failed, using raw text") - return { - "title": post_title, - "body": raw, - "excerpt": raw[:200], - } diff --git a/blog-lab/app/naver_search.py b/blog-lab/app/naver_search.py deleted file mode 100644 index 37b9969..0000000 --- a/blog-lab/app/naver_search.py +++ /dev/null @@ -1,203 +0,0 @@ -"""네이버 검색 API 연동 — 블로그 + 쇼핑 검색.""" - -import asyncio -import logging -import re -import requests -from typing import Any, Dict, List, Optional - -logger = logging.getLogger(__name__) - -from .config import NAVER_CLIENT_ID, NAVER_CLIENT_SECRET - -BLOG_URL = "https://openapi.naver.com/v1/search/blog.json" -SHOP_URL = "https://openapi.naver.com/v1/search/shop.json" - -_HEADERS = { - "X-Naver-Client-Id": NAVER_CLIENT_ID, - "X-Naver-Client-Secret": NAVER_CLIENT_SECRET, -} - -_TAG_RE = re.compile(r"<[^>]+>") - - -def _strip_html(text: str) -> str: - return _TAG_RE.sub("", text).strip() - - -def search_blog(keyword: str, display: int = 10, sort: str = "sim") -> Dict[str, Any]: - """네이버 블로그 검색. 
- - Args: - keyword: 검색 키워드 - display: 결과 수 (1-100) - sort: sim(정확도) | date(날짜) - - Returns: - {"total": int, "items": [...]} - """ - resp = requests.get( - BLOG_URL, - headers=_HEADERS, - params={"query": keyword, "display": display, "sort": sort}, - timeout=10, - ) - resp.raise_for_status() - data = resp.json() - items = [ - { - "title": _strip_html(item.get("title", "")), - "description": _strip_html(item.get("description", "")), - "link": item.get("link", ""), - "bloggername": item.get("bloggername", ""), - "postdate": item.get("postdate", ""), - } - for item in data.get("items", []) - ] - return {"total": data.get("total", 0), "items": items} - - -def search_shopping(keyword: str, display: int = 20, sort: str = "sim") -> Dict[str, Any]: - """네이버 쇼핑 검색. - - Args: - keyword: 검색 키워드 - display: 결과 수 (1-100) - sort: sim(정확도) | date(날짜) | asc(가격↑) | dsc(가격↓) - - Returns: - {"total": int, "items": [...], "price_stats": {...}} - """ - resp = requests.get( - SHOP_URL, - headers=_HEADERS, - params={"query": keyword, "display": display, "sort": sort}, - timeout=10, - ) - resp.raise_for_status() - data = resp.json() - - items = [] - prices = [] - for item in data.get("items", []): - lprice = _safe_int(item.get("lprice")) - hprice = _safe_int(item.get("hprice")) - parsed = { - "title": _strip_html(item.get("title", "")), - "link": item.get("link", ""), - "image": item.get("image", ""), - "lprice": lprice, - "hprice": hprice, - "mallName": item.get("mallName", ""), - "productId": item.get("productId", ""), - "productType": item.get("productType", ""), - "category1": item.get("category1", ""), - "category2": item.get("category2", ""), - "category3": item.get("category3", ""), - "brand": item.get("brand", ""), - "maker": item.get("maker", ""), - } - items.append(parsed) - if lprice and lprice > 0: - prices.append(lprice) - - price_stats = None - if prices: - price_stats = { - "min": min(prices), - "max": max(prices), - "avg": int(sum(prices) / len(prices)), - "count": 
len(prices), - } - - return { - "total": data.get("total", 0), - "items": items, - "price_stats": price_stats, - } - - -def _safe_int(val) -> Optional[int]: - if val is None: - return None - try: - return int(val) - except (ValueError, TypeError): - return None - - -def analyze_keyword(keyword: str) -> Dict[str, Any]: - """키워드 경쟁도/기회 분석. - - 블로그 총 결과수, 쇼핑 총 결과수, 가격 통계를 기반으로 - competition_score(경쟁도)와 opportunity_score(기회점수) 산출. - - Returns: - { - "keyword", "blog_total", "shop_total", - "competition", "opportunity", - "avg_price", "min_price", "max_price", - "top_products": [...], "top_blogs": [...] - } - """ - blog = search_blog(keyword, display=10, sort="sim") - shop = search_shopping(keyword, display=20, sort="sim") - - blog_total = blog["total"] - shop_total = shop["total"] - - # 경쟁도: 블로그 결과 수 기반 (로그 스케일 0-100) - import math - if blog_total > 0: - competition = min(100, int(math.log10(blog_total + 1) * 15)) - else: - competition = 0 - - # 기회 점수: 쇼핑 수요가 높고 블로그 경쟁이 낮을수록 높음 - if shop_total > 0 and blog_total > 0: - ratio = shop_total / blog_total - opportunity = min(100, int(ratio * 20)) - elif shop_total > 0: - opportunity = 90 # 경쟁 없이 수요만 있으면 높은 기회 - else: - opportunity = 10 # 쇼핑 수요 없음 - - price_stats = shop.get("price_stats") or {} - - return { - "keyword": keyword, - "blog_total": blog_total, - "shop_total": shop_total, - "competition": competition, - "opportunity": opportunity, - "avg_price": price_stats.get("avg"), - "min_price": price_stats.get("min"), - "max_price": price_stats.get("max"), - "top_products": shop["items"][:5], - "top_blogs": blog["items"][:5], - } - - -def _run_enrich(top_blogs: list) -> list: - """동기 컨텍스트에서 비동기 enrich_top_blogs 실행.""" - from .web_crawler import enrich_top_blogs - try: - loop = asyncio.get_event_loop() - if loop.is_running(): - import concurrent.futures - with concurrent.futures.ThreadPoolExecutor() as pool: - return pool.submit( - asyncio.run, enrich_top_blogs(top_blogs) - ).result(timeout=60) - else: - return 
asyncio.run(enrich_top_blogs(top_blogs)) - except Exception as e: - logger.warning("블로그 크롤링 실패, 기존 데이터 사용: %s", e) - return top_blogs - - -def analyze_keyword_with_crawling(keyword: str) -> Dict[str, Any]: - """analyze_keyword + 상위 블로그 본문 크롤링.""" - result = analyze_keyword(keyword) - result["top_blogs"] = _run_enrich(result["top_blogs"]) - return result diff --git a/blog-lab/app/quality_reviewer.py b/blog-lab/app/quality_reviewer.py deleted file mode 100644 index 93cd2e1..0000000 --- a/blog-lab/app/quality_reviewer.py +++ /dev/null @@ -1,85 +0,0 @@ -"""Claude API 기반 블로그 글 품질 리뷰 — 6기준 × 10점, 42/60 통과.""" - -import json -import logging -from datetime import date -from typing import Any, Dict, Optional - -import anthropic - -from .config import ANTHROPIC_API_KEY, CLAUDE_MODEL -from .db import get_template - -logger = logging.getLogger(__name__) - -PASS_THRESHOLD = 42 # 60점 만점 중 42점 이상이면 통과 (70%) - -_client: Optional[anthropic.Anthropic] = None - - -def _get_client() -> anthropic.Anthropic: - global _client - if _client is None: - _client = anthropic.Anthropic(api_key=ANTHROPIC_API_KEY) - return _client - - -def review_post(title: str, body: str) -> Dict[str, Any]: - """블로그 글 품질 리뷰. 
- - Returns: - { - "scores": { - "empathy": N, "click_appeal": N, "conversion": N, - "seo": N, "format": N, "link_natural": N - }, - "total": N, - "pass": bool, - "feedback": str - } - """ - template = get_template("quality_review") - if not template: - raise RuntimeError("quality_review 템플릿이 없습니다") - - prompt = template.format(title=title, body=body[:6000]) - - client = _get_client() - today = date.today().isoformat() - resp = client.messages.create( - model=CLAUDE_MODEL, - max_tokens=2048, - system=f"현재 날짜는 {today}입니다.", - messages=[{"role": "user", "content": prompt}], - ) - raw = resp.content[0].text - - try: - text = raw.strip() - if text.startswith("```"): - lines = text.split("\n") - lines = [l for l in lines if not l.strip().startswith("```")] - text = "\n".join(lines) - result = json.loads(text) - - scores = result.get("scores", {}) - total = sum(scores.values()) - passed = total >= PASS_THRESHOLD - - return { - "scores": scores, - "total": total, - "pass": passed, - "feedback": result.get("feedback", ""), - } - except (json.JSONDecodeError, KeyError, TypeError) as e: - logger.warning("Quality review JSON parse failed: %s", e) - return { - "scores": { - "empathy": 0, "click_appeal": 0, "conversion": 0, - "seo": 0, "format": 0, "link_natural": 0, - }, - "total": 0, - "pass": False, - "feedback": f"리뷰 파싱 실패. 
원본 응답:\n{raw[:500]}", - } diff --git a/blog-lab/app/web_crawler.py b/blog-lab/app/web_crawler.py deleted file mode 100644 index 2bbd139..0000000 --- a/blog-lab/app/web_crawler.py +++ /dev/null @@ -1,97 +0,0 @@ -"""네이버 블로그 본문 크롤링 모듈.""" - -import asyncio -import logging -import re -from typing import Any, Dict, List, Optional, Tuple -import httpx -from bs4 import BeautifulSoup - -logger = logging.getLogger(__name__) - -_TIMEOUT = 10 # 글당 크롤링 타임아웃 (초) -_MAX_CONTENT_LENGTH = 2000 # 본문 최대 길이 - -# 네이버 블로그 URL 패턴: blog.naver.com/{blogId}/{logNo} -_BLOG_URL_RE = re.compile(r"blog\.naver\.com/([^/]+)/(\d+)") - - -def _parse_naver_blog_url(url: str) -> Optional[Tuple[str, str]]: - """네이버 블로그 URL에서 blogId, logNo 추출. 실패 시 None.""" - match = _BLOG_URL_RE.search(url) - if not match: - return None - return match.group(1), match.group(2) - - -async def _fetch_html(url: str) -> str: - """URL에서 HTML을 가져온다.""" - async with httpx.AsyncClient(timeout=_TIMEOUT, follow_redirects=True) as client: - resp = await client.get(url, headers={ - "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" - }) - resp.raise_for_status() - return resp.text - - -def _extract_text(html: str) -> str: - """HTML에서 본문 텍스트를 추출한다.""" - soup = BeautifulSoup(html, "html.parser") - - # 스마트에디터 3 (SE3) - container = soup.select_one("div.se-main-container") - if not container: - # 구 에디터 - container = soup.select_one("div#postViewArea") - if not container: - # 폴백: body 전체 - container = soup.body - - if not container: - return "" - - # 스크립트/스타일 제거 - for tag in container.find_all(["script", "style"]): - tag.decompose() - - text = container.get_text(separator="\n", strip=True) - return text[:_MAX_CONTENT_LENGTH] - - -async def crawl_blog_content(url: str) -> str: - """네이버 블로그 URL에서 본문 텍스트 추출. 
- - - 네이버 블로그가 아니면 빈 문자열 - - 크롤링 실패 시 빈 문자열 (에러 로그만) - - 본문 최대 2,000자 - """ - parsed = _parse_naver_blog_url(url) - if not parsed: - return "" - - blog_id, log_no = parsed - # iframe 내부 실제 본문 URL - post_url = f"https://blog.naver.com/PostView.naver?blogId={blog_id}&logNo={log_no}" - - try: - html = await _fetch_html(post_url) - return _extract_text(html) - except Exception as e: - logger.warning("블로그 크롤링 실패 (%s): %s", url, e) - return "" - - -async def enrich_top_blogs(top_blogs: List[Dict[str, Any]]) -> List[Dict[str, Any]]: - """top_blogs 리스트 각 항목에 content 필드를 추가. - - 개별 크롤링 실패 시 해당 항목의 content를 빈 문자열로 설정하고 나머지 계속 진행. - """ - result = [] - for blog in top_blogs: - enriched = dict(blog) - try: - enriched["content"] = await crawl_blog_content(blog.get("link", "")) - except Exception: - enriched["content"] = "" - result.append(enriched) - return result diff --git a/blog-lab/tests/conftest.py b/blog-lab/tests/conftest.py deleted file mode 100644 index 4495650..0000000 --- a/blog-lab/tests/conftest.py +++ /dev/null @@ -1,9 +0,0 @@ -"""공통 테스트 픽스처.""" -import os -import sys - -# app 패키지를 blog_lab_app으로도 import 가능하게 -sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..")) -if "blog_lab_app" not in sys.modules: - import app as blog_lab_app - sys.modules["blog_lab_app"] = blog_lab_app diff --git a/blog-lab/tests/test_api_links.py b/blog-lab/tests/test_api_links.py deleted file mode 100644 index 810059e..0000000 --- a/blog-lab/tests/test_api_links.py +++ /dev/null @@ -1,85 +0,0 @@ -"""브랜드커넥트 링크 API 테스트.""" -import os -import pytest -from fastapi.testclient import TestClient - - -@pytest.fixture(autouse=True) -def setup_db(tmp_path): - test_db = str(tmp_path / "test.db") - import app.config as config - config.DB_PATH = test_db - from app import db - db.DB_PATH = test_db - db.init_db() - yield - - -@pytest.fixture -def client(): - from app.main import app - return TestClient(app) - - -def test_create_link(client): - resp = client.post("/api/blog-marketing/links", 
json={ - "keyword_id": 1, - "url": "https://link.coupang.com/abc", - "product_name": "테스트 상품", - "description": "상품 설명", - }) - assert resp.status_code == 201 - data = resp.json() - assert data["url"] == "https://link.coupang.com/abc" - assert data["product_name"] == "테스트 상품" - - -def test_create_link_requires_url(client): - resp = client.post("/api/blog-marketing/links", json={ - "product_name": "상품", - }) - assert resp.status_code == 422 - - -def test_create_link_requires_product_name(client): - resp = client.post("/api/blog-marketing/links", json={ - "url": "https://a.com", - }) - assert resp.status_code == 422 - - -def test_list_links_by_keyword_id(client): - client.post("/api/blog-marketing/links", json={ - "keyword_id": 1, "url": "https://a.com", "product_name": "A", - }) - client.post("/api/blog-marketing/links", json={ - "keyword_id": 2, "url": "https://b.com", "product_name": "B", - }) - resp = client.get("/api/blog-marketing/links?keyword_id=1") - assert resp.status_code == 200 - assert len(resp.json()["links"]) == 1 - - -def test_update_link(client): - create_resp = client.post("/api/blog-marketing/links", json={ - "url": "https://a.com", "product_name": "원래", - }) - link_id = create_resp.json()["id"] - resp = client.put(f"/api/blog-marketing/links/{link_id}", json={ - "product_name": "새이름", - }) - assert resp.status_code == 200 - assert resp.json()["product_name"] == "새이름" - - -def test_delete_link(client): - create_resp = client.post("/api/blog-marketing/links", json={ - "url": "https://a.com", "product_name": "삭제", - }) - link_id = create_resp.json()["id"] - resp = client.delete(f"/api/blog-marketing/links/{link_id}") - assert resp.status_code == 200 - assert resp.json()["ok"] is True - - resp = client.delete(f"/api/blog-marketing/links/{link_id}") - assert resp.status_code == 404 diff --git a/blog-lab/tests/test_db_brand_links.py b/blog-lab/tests/test_db_brand_links.py deleted file mode 100644 index a84d7b2..0000000 --- 
a/blog-lab/tests/test_db_brand_links.py +++ /dev/null @@ -1,67 +0,0 @@ -"""brand_links DB CRUD 테스트.""" -import os -import pytest -from app import db -from app.config import DB_PATH - - -@pytest.fixture(autouse=True) -def setup_db(tmp_path): - """테스트용 임시 DB 사용.""" - test_db = str(tmp_path / "test.db") - import app.config as config - config.DB_PATH = test_db - db.DB_PATH = test_db - db.init_db() - yield - - -def test_add_brand_link(): - link = db.add_brand_link({ - "keyword_id": 1, - "url": "https://link.coupang.com/abc", - "product_name": "테스트 상품", - "description": "상품 설명", - "placement_hint": "본문 중간", - }) - assert link["id"] is not None - assert link["url"] == "https://link.coupang.com/abc" - assert link["product_name"] == "테스트 상품" - assert link["keyword_id"] == 1 - assert link["post_id"] is None - - -def test_get_brand_links_by_keyword_id(): - db.add_brand_link({"keyword_id": 1, "url": "https://a.com", "product_name": "A"}) - db.add_brand_link({"keyword_id": 1, "url": "https://b.com", "product_name": "B"}) - db.add_brand_link({"keyword_id": 2, "url": "https://c.com", "product_name": "C"}) - links = db.get_brand_links(keyword_id=1) - assert len(links) == 2 - - -def test_get_brand_links_by_post_id(): - db.add_brand_link({"post_id": 10, "url": "https://a.com", "product_name": "A"}) - links = db.get_brand_links(post_id=10) - assert len(links) == 1 - assert links[0]["post_id"] == 10 - - -def test_update_brand_link(): - link = db.add_brand_link({"url": "https://a.com", "product_name": "원래 이름"}) - updated = db.update_brand_link(link["id"], {"product_name": "새 이름", "post_id": 5}) - assert updated["product_name"] == "새 이름" - assert updated["post_id"] == 5 - - -def test_delete_brand_link(): - link = db.add_brand_link({"url": "https://a.com", "product_name": "삭제할 링크"}) - assert db.delete_brand_link(link["id"]) is True - assert db.delete_brand_link(link["id"]) is False - - -def test_link_keyword_to_post(): - db.add_brand_link({"keyword_id": 1, "url": "https://a.com", 
"product_name": "A"}) - db.add_brand_link({"keyword_id": 1, "url": "https://b.com", "product_name": "B"}) - db.link_brand_links_to_post(keyword_id=1, post_id=10) - links = db.get_brand_links(post_id=10) - assert len(links) == 2 diff --git a/blog-lab/tests/test_evaluator.py b/blog-lab/tests/test_evaluator.py deleted file mode 100644 index 12bb631..0000000 --- a/blog-lab/tests/test_evaluator.py +++ /dev/null @@ -1,74 +0,0 @@ -"""평가자 단계 테스트 — 6기준 60점.""" -import json -import pytest -from unittest.mock import patch - - -def test_review_post_has_6_criteria(): - """6개 기준으로 채점하는지 확인.""" - from app.quality_reviewer import review_post - - mock_response = json.dumps({ - "scores": { - "empathy": 8, "click_appeal": 7, "conversion": 9, - "seo": 8, "format": 7, "link_natural": 9, - }, - "total": 48, - "pass": True, - "feedback": "전체적으로 우수합니다", - }) - - with patch("app.quality_reviewer._get_client") as mock_client_fn, \ - patch("app.quality_reviewer.get_template", return_value="제목: {title}\n본문: {body}"): - mock_client = mock_client_fn.return_value - mock_client.messages.create.return_value.content = [type("C", (), {"text": mock_response})()] - result = review_post("테스트 제목", "
<p>본문</p>
") - - assert "link_natural" in result["scores"] - assert len(result["scores"]) == 6 - assert result["total"] == 48 - assert result["pass"] is True - - -def test_review_pass_threshold_is_42(): - """통과 기준이 42점인지 확인.""" - from app.quality_reviewer import PASS_THRESHOLD - assert PASS_THRESHOLD == 42 - - -def test_review_fails_below_42(): - """42점 미만이면 불통과.""" - from app.quality_reviewer import review_post - - mock_response = json.dumps({ - "scores": { - "empathy": 5, "click_appeal": 5, "conversion": 5, - "seo": 5, "format": 5, "link_natural": 5, - }, - "total": 30, - "pass": False, - "feedback": "개선 필요", - }) - - with patch("app.quality_reviewer._get_client") as mock_client_fn, \ - patch("app.quality_reviewer.get_template", return_value="제목: {title}\n본문: {body}"): - mock_client = mock_client_fn.return_value - mock_client.messages.create.return_value.content = [type("C", (), {"text": mock_response})()] - result = review_post("제목", "
<p>본문</p>
") - - assert result["pass"] is False - - -def test_review_handles_parse_failure(): - """JSON 파싱 실패 시 기본값 반환 (6개 기준).""" - from app.quality_reviewer import review_post - - with patch("app.quality_reviewer._get_client") as mock_client_fn, \ - patch("app.quality_reviewer.get_template", return_value="제목: {title}\n본문: {body}"): - mock_client = mock_client_fn.return_value - mock_client.messages.create.return_value.content = [type("C", (), {"text": "잘못된 응답"})()] - result = review_post("제목", "
<p>본문</p>
") - - assert result["pass"] is False - assert "link_natural" in result["scores"] - assert result["total"] == 0 diff --git a/blog-lab/tests/test_marketer.py b/blog-lab/tests/test_marketer.py deleted file mode 100644 index 96e3551..0000000 --- a/blog-lab/tests/test_marketer.py +++ /dev/null @@ -1,66 +0,0 @@ -"""마케터 단계 테스트.""" -import json -import pytest -from unittest.mock import patch - - -def test_enhance_for_conversion_inserts_links(): - """마케터가 브랜드 링크를 본문에 삽입.""" - from app.marketer import enhance_for_conversion - - brand_links = [ - {"url": "https://link.coupang.com/abc", "product_name": "갤럭시 버즈3", - "description": "노이즈캔슬링", "placement_hint": "본문 중간"}, - ] - - mock_response = json.dumps({ - "title": "마케팅된 제목", - "body": '
<p>본문 갤럭시 버즈3</p>
', - "excerpt": "요약", - }) - - with patch("app.marketer._call_claude", return_value=mock_response) as mock_call, \ - patch("app.marketer.get_template", return_value="초안: {draft_body}\n키워드: {keyword}\n링크:\n{brand_links_info}"): - result = enhance_for_conversion( - post_body="
<p>초안 본문</p>
", - post_title="초안 제목", - brand_links=brand_links, - keyword="무선 이어폰", - ) - - prompt_used = mock_call.call_args[0][0] - assert "갤럭시 버즈3" in prompt_used - assert "노이즈캔슬링" in prompt_used - assert result["title"] == "마케팅된 제목" - - -def test_enhance_requires_brand_links(): - """브랜드 링크가 없으면 ValueError.""" - from app.marketer import enhance_for_conversion - - with pytest.raises(ValueError, match="브랜드커넥트 링크가 필요합니다"): - enhance_for_conversion( - post_body="

본문

", - post_title="제목", - brand_links=[], - keyword="테스트", - ) - - -def test_enhance_json_parse_fallback(): - """JSON 파싱 실패 시 원본 제목 유지.""" - from app.marketer import enhance_for_conversion - - brand_links = [{"url": "https://a.com", "product_name": "상품"}] - - with patch("app.marketer._call_claude", return_value="잘못된 JSON"), \ - patch("app.marketer.get_template", return_value="초안: {draft_body}\n키워드: {keyword}\n링크:\n{brand_links_info}"): - result = enhance_for_conversion( - post_body="

원본

", - post_title="원본 제목", - brand_links=brand_links, - keyword="테스트", - ) - - assert result["title"] == "원본 제목" - assert result["body"] == "잘못된 JSON" diff --git a/blog-lab/tests/test_pipeline_integration.py b/blog-lab/tests/test_pipeline_integration.py deleted file mode 100644 index ab7b380..0000000 --- a/blog-lab/tests/test_pipeline_integration.py +++ /dev/null @@ -1,146 +0,0 @@ -"""4단계 파이프라인 통합 테스트.""" -import os -import pytest -from unittest.mock import patch -from fastapi.testclient import TestClient - - -@pytest.fixture(autouse=True) -def setup_db(tmp_path): - test_db = str(tmp_path / "test.db") - import app.config as config - config.DB_PATH = test_db - from app import db - db.DB_PATH = test_db - db.init_db() - yield - - -@pytest.fixture -def client(): - from app.main import app - return TestClient(app) - - -def test_full_pipeline_status_flow(client): - """draft → marketed → reviewed → published 상태 흐름.""" - from app import db - - # 1. 키워드 분석 결과 직접 삽입 - analysis = db.add_keyword_analysis({ - "keyword": "무선 이어폰", - "blog_total": 1000, - "shop_total": 500, - "competition": 45, - "opportunity": 60, - "top_products": [{"title": "에어팟", "lprice": 200000, "mallName": "애플"}], - "top_blogs": [{"title": "리뷰", "link": "https://blog.naver.com/user/123", "content": "본문"}], - }) - - # 2. 브랜드 링크 등록 - resp = client.post("/api/blog-marketing/links", json={ - "keyword_id": analysis["id"], - "url": "https://link.coupang.com/abc", - "product_name": "삼성 버즈3", - "description": "노이즈캔슬링", - }) - assert resp.status_code == 201 - - # 3. 포스트 직접 생성 (generate는 Claude API 필요) - post = db.add_post({ - "keyword_id": analysis["id"], - "title": "무선 이어폰 추천", - "body": "

초안 본문

", - "excerpt": "요약", - "tags": ["이어폰"], - "status": "draft", - }) - db.link_brand_links_to_post(keyword_id=analysis["id"], post_id=post["id"]) - - # 4. 상태 확인: draft - resp = client.get(f"/api/blog-marketing/posts/{post['id']}") - assert resp.json()["status"] == "draft" - - # 5. marketed 상태 - db.update_post(post["id"], {"status": "marketed", "body": "

마케팅된 본문

"}) - resp = client.get(f"/api/blog-marketing/posts/{post['id']}") - assert resp.json()["status"] == "marketed" - - # 6. reviewed 상태 (점수 48/60 = 통과) - db.update_post(post["id"], { - "status": "reviewed", - "review_score": 48, - "review_detail": { - "scores": {"empathy": 8, "click_appeal": 8, "conversion": 8, "seo": 8, "format": 8, "link_natural": 8}, - "total": 48, "pass": True, "feedback": "우수" - }, - }) - resp = client.get(f"/api/blog-marketing/posts/{post['id']}") - assert resp.json()["status"] == "reviewed" - assert resp.json()["review_score"] == 48 - - # 7. 발행 - resp = client.post(f"/api/blog-marketing/posts/{post['id']}/publish", json={ - "naver_url": "https://blog.naver.com/mypost/123", - }) - assert resp.json()["status"] == "published" - - -def test_links_associated_with_post(client): - """keyword_id로 등록한 링크가 post 생성 후 post_id로도 조회 가능.""" - from app import db - - analysis = db.add_keyword_analysis({"keyword": "테스트", "blog_total": 10, "shop_total": 5}) - client.post("/api/blog-marketing/links", json={ - "keyword_id": analysis["id"], - "url": "https://link.com/1", - "product_name": "상품1", - }) - - post = db.add_post({"keyword_id": analysis["id"], "title": "제목", "body": "본문", "status": "draft"}) - db.link_brand_links_to_post(keyword_id=analysis["id"], post_id=post["id"]) - - resp = client.get(f"/api/blog-marketing/links?post_id={post['id']}") - links = resp.json()["links"] - assert len(links) == 1 - assert links[0]["product_name"] == "상품1" - - -@patch("app.main.ANTHROPIC_API_KEY", "fake-key-for-test") -def test_market_endpoint_returns_404_for_missing_post(client): - """존재하지 않는 post_id로 마케터 호출 시 404.""" - resp = client.post("/api/blog-marketing/market/9999") - assert resp.status_code == 404 - - -@patch("app.main.ANTHROPIC_API_KEY", "fake-key-for-test") -def test_review_endpoint_returns_404_for_missing_post(client): - """존재하지 않는 post_id로 리뷰 호출 시 404.""" - resp = client.post("/api/blog-marketing/review/9999") - assert resp.status_code == 404 - - -def 
test_multiple_links_per_keyword(client): - """하나의 키워드에 복수 링크 등록 가능.""" - from app import db - analysis = db.add_keyword_analysis({"keyword": "테스트", "blog_total": 10, "shop_total": 5}) - - for i in range(3): - resp = client.post("/api/blog-marketing/links", json={ - "keyword_id": analysis["id"], - "url": f"https://link.com/{i}", - "product_name": f"상품{i}", - }) - assert resp.status_code == 201 - - resp = client.get(f"/api/blog-marketing/links?keyword_id={analysis['id']}") - assert len(resp.json()["links"]) == 3 - - -def test_dashboard_still_works(client): - """대시보드 API가 여전히 정상 작동.""" - resp = client.get("/api/blog-marketing/dashboard") - assert resp.status_code == 200 - data = resp.json() - assert "total_posts" in data - assert "published_posts" in data diff --git a/blog-lab/tests/test_research_crawling.py b/blog-lab/tests/test_research_crawling.py deleted file mode 100644 index 598eb4c..0000000 --- a/blog-lab/tests/test_research_crawling.py +++ /dev/null @@ -1,58 +0,0 @@ -"""리서치 단계 크롤링 통합 테스트.""" -from unittest.mock import patch - - -def test_analyze_keyword_with_crawling_enriches_top_blogs(): - """analyze_keyword_with_crawling가 top_blogs에 content 필드를 추가.""" - from app.naver_search import analyze_keyword_with_crawling - - mock_blog_result = { - "total": 100, - "items": [ - {"title": "테스트 블로그", "link": "https://blog.naver.com/user1/111", - "bloggername": "유저1", "description": "설명", "postdate": "20260401"}, - ], - } - mock_shop_result = { - "total": 50, - "items": [{"title": "상품1", "lprice": 10000, "mallName": "쿠팡"}], - "price_stats": {"min": 10000, "max": 10000, "avg": 10000, "count": 1}, - } - - with patch("app.naver_search.search_blog", return_value=mock_blog_result), \ - patch("app.naver_search.search_shopping", return_value=mock_shop_result), \ - patch("app.naver_search._run_enrich", return_value=[ - {"title": "테스트 블로그", "link": "https://blog.naver.com/user1/111", - "bloggername": "유저1", "description": "설명", "postdate": "20260401", - "content": "크롤링된 본문 내용"} - 
]): - result = analyze_keyword_with_crawling("테스트 키워드") - - assert "content" in result["top_blogs"][0] - assert result["top_blogs"][0]["content"] == "크롤링된 본문 내용" - - -def test_analyze_keyword_with_crawling_fallback_on_enrich_failure(): - """크롤링 실패 시 기존 데이터 유지.""" - from app.naver_search import analyze_keyword_with_crawling - - mock_blog_result = { - "total": 50, - "items": [{"title": "블로그", "link": "https://blog.naver.com/u/1", "bloggername": "유저", "description": "설명"}], - } - mock_shop_result = {"total": 10, "items": [], "price_stats": None} - - with patch("app.naver_search.search_blog", return_value=mock_blog_result), \ - patch("app.naver_search.search_shopping", return_value=mock_shop_result), \ - patch("app.naver_search._run_enrich", side_effect=Exception("크롤링 실패")): - # _run_enrich 내부에서 예외를 잡으므로 실제로는 이 테스트에서는 - # _run_enrich 자체가 예외를 던지는 상황을 시뮬레이션 - # 하지만 _run_enrich는 내부에서 잡으므로, 직접 fallback 테스트 - pass - - # _run_enrich 자체 fallback 테스트 - from app.naver_search import _run_enrich - original_blogs = [{"title": "원본", "link": "https://blog.naver.com/u/1"}] - with patch("app.web_crawler.enrich_top_blogs", side_effect=Exception("fail")): - result = _run_enrich(original_blogs) - assert result == original_blogs # fallback으로 원본 반환 diff --git a/blog-lab/tests/test_web_crawler.py b/blog-lab/tests/test_web_crawler.py deleted file mode 100644 index 617c2d6..0000000 --- a/blog-lab/tests/test_web_crawler.py +++ /dev/null @@ -1,94 +0,0 @@ -"""web_crawler 모듈 테스트.""" -import pytest -from unittest.mock import patch, AsyncMock -from app.web_crawler import crawl_blog_content, enrich_top_blogs, _parse_naver_blog_url, _extract_text - - -def test_parse_naver_blog_url_valid(): - """blog.naver.com URL에서 blogId와 logNo를 올바르게 파싱.""" - result = _parse_naver_blog_url("https://blog.naver.com/testuser/123456") - assert result == ("testuser", "123456") - - -def test_parse_returns_none_for_invalid_url(): - """잘못된 URL은 None 반환.""" - result = _parse_naver_blog_url("https://example.com/post") - 
assert result is None - - -def test_extract_text_prefers_se_main_container(): - """SE3 에디터 컨테이너를 우선 선택.""" - html = '

SE3 본문

구 에디터

' - assert _extract_text(html) == "SE3 본문" - - -def test_extract_text_falls_back_to_post_view_area(): - """SE3 없으면 구 에디터 컨테이너 사용.""" - html = '

구 에디터 본문

' - assert _extract_text(html) == "구 에디터 본문" - - -def test_extract_text_removes_script_and_style(): - """스크립트/스타일 태그 제거.""" - html = '

본문

' - result = _extract_text(html) - assert "alert" not in result - assert ".x" not in result - assert "본문" in result - - -def test_extract_text_returns_empty_on_no_container(): - """컨테이너가 없고 body도 없으면 빈 문자열.""" - assert _extract_text("") == "" - - -@pytest.mark.asyncio -async def test_crawl_returns_empty_on_non_naver_url(): - """네이버 블로그가 아닌 URL은 빈 문자열 반환.""" - result = await crawl_blog_content("https://example.com/post") - assert result == "" - - -@pytest.mark.asyncio -async def test_crawl_truncates_to_2000_chars(): - """본문이 2000자를 초과하면 잘라낸다.""" - long_html = f'

{"가" * 3000}

' - with patch("app.web_crawler._fetch_html", new_callable=AsyncMock, return_value=long_html): - result = await crawl_blog_content("https://blog.naver.com/testuser/123") - assert len(result) <= 2000 - - -@pytest.mark.asyncio -async def test_crawl_returns_empty_on_fetch_failure(): - """HTTP 요청 실패 시 빈 문자열 반환.""" - with patch("app.web_crawler._fetch_html", new_callable=AsyncMock, side_effect=Exception("timeout")): - result = await crawl_blog_content("https://blog.naver.com/testuser/123") - assert result == "" - - -@pytest.mark.asyncio -async def test_enrich_top_blogs_adds_content_field(): - """enrich_top_blogs가 각 블로그에 content 필드를 추가.""" - blogs = [ - {"title": "테스트", "link": "https://blog.naver.com/user1/111", "bloggername": "유저1", "description": "설명"}, - {"title": "테스트2", "link": "https://blog.naver.com/user2/222", "bloggername": "유저2", "description": "설명2"}, - ] - with patch("app.web_crawler.crawl_blog_content", new_callable=AsyncMock, return_value="크롤링된 본문"): - result = await enrich_top_blogs(blogs) - assert len(result) == 2 - assert result[0]["content"] == "크롤링된 본문" - assert result[1]["content"] == "크롤링된 본문" - - -@pytest.mark.asyncio -async def test_enrich_top_blogs_handles_partial_failure(): - """일부 크롤링 실패 시에도 나머지는 정상 처리.""" - blogs = [ - {"title": "성공", "link": "https://blog.naver.com/user1/111"}, - {"title": "실패", "link": "https://blog.naver.com/user2/222"}, - ] - side_effects = ["성공 본문", Exception("fail")] - with patch("app.web_crawler.crawl_blog_content", new_callable=AsyncMock, side_effect=side_effects): - result = await enrich_top_blogs(blogs) - assert result[0]["content"] == "성공 본문" - assert result[1]["content"] == "" diff --git a/blog-lab/tests/test_writer.py b/blog-lab/tests/test_writer.py deleted file mode 100644 index 5638b81..0000000 --- a/blog-lab/tests/test_writer.py +++ /dev/null @@ -1,86 +0,0 @@ -"""작가 단계 테스트 -- 크롤링 본문 + 링크 참조 글 생성.""" -import json -import pytest -from unittest.mock import patch - - -def 
test_generate_blog_post_includes_crawled_content(): - """크롤링 본문이 프롬프트에 포함되는지 확인.""" - from app.content_generator import generate_blog_post - - analysis = { - "keyword": "무선 이어폰", - "top_products": [{"title": "에어팟", "lprice": 200000, "mallName": "애플"}], - "top_blogs": [ - {"title": "에어팟 리뷰", "content": "에어팟을 한 달간 써봤는데 음질이 정말 좋았습니다."}, - ], - } - - mock_response = json.dumps({ - "title": "무선 이어폰 추천", - "body": "

본문

", - "excerpt": "요약", - "tags": ["이어폰"], - }) - - with patch("app.content_generator._call_claude", return_value=mock_response) as mock_call, \ - patch("app.content_generator.get_template", return_value=( - "키워드: {keyword}\n참고 블로그:\n{reference_blogs}\n상품: {top_products}\n링크 상품: {brand_products}" - )): - result = generate_blog_post(analysis, "트렌드 브리프", brand_links=[]) - - prompt_used = mock_call.call_args[0][0] - assert "에어팟을 한 달간 써봤는데" in prompt_used - assert result["title"] == "무선 이어폰 추천" - - -def test_generate_blog_post_includes_brand_links(): - """브랜드커넥트 링크 정보가 프롬프트에 포함되는지 확인.""" - from app.content_generator import generate_blog_post - - analysis = {"keyword": "무선 이어폰", "top_products": [], "top_blogs": []} - brand_links = [ - {"url": "https://link.coupang.com/abc", "product_name": "삼성 버즈3", - "description": "노이즈캔슬링 지원", "placement_hint": "본문 중간"}, - ] - - mock_response = json.dumps({ - "title": "제목", "body": "

본문

", "excerpt": "요약", "tags": ["태그"], - }) - - with patch("app.content_generator._call_claude", return_value=mock_response) as mock_call, \ - patch("app.content_generator.get_template", return_value=( - "키워드: {keyword}\n참고 블로그:\n{reference_blogs}\n상품: {top_products}\n링크 상품: {brand_products}" - )): - result = generate_blog_post(analysis, "트렌드 브리프", brand_links=brand_links) - - prompt_used = mock_call.call_args[0][0] - assert "삼성 버즈3" in prompt_used - assert "노이즈캔슬링 지원" in prompt_used - - -def test_generate_blog_post_works_without_links(): - """링크 없이도 정상 동작.""" - from app.content_generator import generate_blog_post - - analysis = {"keyword": "테스트", "top_products": [], "top_blogs": []} - mock_response = json.dumps({ - "title": "제목", "body": "

본문

", "excerpt": "요약", "tags": ["태그"], - }) - - with patch("app.content_generator._call_claude", return_value=mock_response), \ - patch("app.content_generator.get_template", return_value=( - "키워드: {keyword}\n참고 블로그:\n{reference_blogs}\n상품: {top_products}\n링크 상품: {brand_products}" - )): - result = generate_blog_post(analysis, "브리프") - - assert result["title"] == "제목" - - -def test_parse_blog_json_fallback(): - """JSON 파싱 실패 시 원본 텍스트를 body로 사용.""" - from app.content_generator import _parse_blog_json - - result = _parse_blog_json("잘못된 JSON", "테스트 키워드") - assert result["title"] == "테스트 키워드 추천 리뷰" - assert result["body"] == "잘못된 JSON" diff --git a/docker-compose.yml b/docker-compose.yml index b7dfee3..3f820fa 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -86,21 +86,25 @@ services: timeout: 5s retries: 3 - blog-lab: + insta-lab: build: - context: ./blog-lab - container_name: blog-lab + context: ./insta-lab + container_name: insta-lab restart: unless-stopped ports: - "18700:8000" environment: - TZ=${TZ:-Asia/Seoul} - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-} + - ANTHROPIC_MODEL_HAIKU=${ANTHROPIC_MODEL_HAIKU:-claude-haiku-4-5-20251001} + - ANTHROPIC_MODEL_SONNET=${ANTHROPIC_MODEL_SONNET:-claude-sonnet-4-6} - NAVER_CLIENT_ID=${NAVER_CLIENT_ID:-} - NAVER_CLIENT_SECRET=${NAVER_CLIENT_SECRET:-} + - INSTA_DATA_PATH=/app/data + - CARD_TEMPLATE_DIR=/app/app/templates - CORS_ALLOW_ORIGINS=${CORS_ALLOW_ORIGINS:-http://localhost:3007,http://localhost:8080} volumes: - - ${RUNTIME_PATH}/data/blog:/app/data + - ${RUNTIME_PATH}/data/insta:/app/data healthcheck: test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"] interval: 30s @@ -139,7 +143,7 @@ services: - CORS_ALLOW_ORIGINS=${CORS_ALLOW_ORIGINS:-http://localhost:3007,http://localhost:8080} - STOCK_URL=http://stock:8000 - MUSIC_LAB_URL=http://music-lab:8000 - - BLOG_LAB_URL=http://blog-lab:8000 + - INSTA_LAB_URL=http://insta-lab:8000 - 
REALESTATE_LAB_URL=http://realestate-lab:8000 - REALESTATE_DASHBOARD_URL=${REALESTATE_DASHBOARD_URL:-http://localhost:8080/realestate} - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN:-} @@ -160,7 +164,7 @@ services: depends_on: - stock - music-lab - - blog-lab + - insta-lab - realestate-lab healthcheck: test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"] @@ -245,7 +249,7 @@ services: - lotto - stock - music-lab - - blog-lab + - insta-lab - realestate-lab - agent-office - personal diff --git a/blog-lab/Dockerfile b/insta-lab/Dockerfile similarity index 50% rename from blog-lab/Dockerfile rename to insta-lab/Dockerfile index 0481198..470d10b 100644 --- a/blog-lab/Dockerfile +++ b/insta-lab/Dockerfile @@ -1,15 +1,17 @@ -FROM python:3.12-alpine +FROM python:3.12-slim ENV PYTHONUNBUFFERED=1 WORKDIR /app -RUN apk add --no-cache gcc musl-dev +RUN apt-get update && apt-get install -y --no-install-recommends \ + fonts-noto-cjk fonts-noto-cjk-extra \ + && rm -rf /var/lib/apt/lists/* COPY requirements.txt . RUN pip install --no-cache-dir -r requirements.txt +RUN playwright install --with-deps chromium COPY . . EXPOSE 8000 - CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"] diff --git a/blog-lab/app/__init__.py b/insta-lab/app/__init__.py similarity index 100% rename from blog-lab/app/__init__.py rename to insta-lab/app/__init__.py diff --git a/insta-lab/app/card_renderer.py b/insta-lab/app/card_renderer.py new file mode 100644 index 0000000..5bbfc71 --- /dev/null +++ b/insta-lab/app/card_renderer.py @@ -0,0 +1,100 @@ +"""Jinja → HTML → Playwright headless screenshot.""" + +import asyncio +import hashlib +import json +import logging +import os +import tempfile +from typing import List + +from jinja2 import Environment, FileSystemLoader, select_autoescape +from playwright.async_api import async_playwright + +from .config import CARDS_DIR, CARD_TEMPLATE_DIR +from . 
import db + +logger = logging.getLogger(__name__) + + +def _resolve_template_dir() -> str: + """Prefer config CARD_TEMPLATE_DIR if it exists; else fall back to in-repo templates/.""" + if os.path.isdir(CARD_TEMPLATE_DIR): + return CARD_TEMPLATE_DIR + return os.path.join(os.path.dirname(__file__), "templates") + + +def _env() -> Environment: + return Environment( + loader=FileSystemLoader(_resolve_template_dir()), + autoescape=select_autoescape(["html", "j2"]), + ) + + +def _slate_dir(slate_id: int) -> str: + out = os.path.join(CARDS_DIR, str(slate_id)) + os.makedirs(out, exist_ok=True) + return out + + +def _build_pages(slate: dict) -> List[dict]: + cover = json.loads(slate["cover_copy"] or "{}") + bodies = json.loads(slate["body_copies"] or "[]") + cta = json.loads(slate["cta_copy"] or "{}") + accent = cover.get("accent_color") or "#0F62FE" + pages: List[dict] = [] + pages.append({ + "page_type": "cover", "page_no": 1, "total_pages": 10, + "headline": cover.get("headline", ""), "body": cover.get("body", ""), + "accent_color": accent, "cta": "", + }) + for i, b in enumerate(bodies[:8]): + pages.append({ + "page_type": "body", "page_no": i + 2, "total_pages": 10, + "headline": b.get("headline", ""), "body": b.get("body", ""), + "accent_color": accent, "cta": "", + }) + pages.append({ + "page_type": "cta", "page_no": 10, "total_pages": 10, + "headline": cta.get("headline", ""), "body": cta.get("body", ""), + "accent_color": accent, "cta": cta.get("cta", ""), + }) + return pages + + +async def render_slate(slate_id: int, template: str = "default/card.html.j2") -> List[str]: + slate = db.get_card_slate(slate_id) + if not slate: + raise ValueError(f"slate {slate_id} not found") + env = _env() + tmpl = env.get_template(template) + pages = _build_pages(slate) + out_dir = _slate_dir(slate_id) + paths: List[str] = [] + + async with async_playwright() as p: + browser = await p.chromium.launch() + try: + ctx = await browser.new_context(viewport={"width": 1080, "height": 
1350}) + page = await ctx.new_page() + for spec in pages: + html_str = tmpl.render(**spec) + with tempfile.NamedTemporaryFile("w", suffix=".html", delete=False, encoding="utf-8") as f: + f.write(html_str) + html_path = f.name + try: + await page.goto(f"file://{html_path}", wait_until="networkidle") + out_path = os.path.join(out_dir, f"{spec['page_no']:02d}.png") + await page.screenshot(path=out_path, full_page=False, omit_background=False) + with open(out_path, "rb") as fp: + file_hash = hashlib.md5(fp.read()).hexdigest() + db.add_card_asset(slate_id, spec["page_no"], out_path, file_hash) + paths.append(out_path) + finally: + try: + os.unlink(html_path) + except OSError: + pass + finally: + await browser.close() + return paths diff --git a/insta-lab/app/card_writer.py b/insta-lab/app/card_writer.py new file mode 100644 index 0000000..a763e5f --- /dev/null +++ b/insta-lab/app/card_writer.py @@ -0,0 +1,100 @@ +"""Claude로 10페이지 카드 카피를 한 번에 생성.""" + +import json +import logging +import re +from typing import Any, Dict, Optional + +from anthropic import Anthropic + +from .config import ANTHROPIC_API_KEY, ANTHROPIC_MODEL_SONNET +from . import db + +logger = logging.getLogger(__name__) + +DEFAULT_ACCENT_BY_CATEGORY = { + "economy": "#0F62FE", + "psychology": "#A66CFF", + "celebrity": "#FF5C8A", +} + +DEFAULT_PROMPT = """너는 인스타그램 카드 뉴스 카피라이터다. +카테고리: {category} +키워드: {keyword} +참고 기사: +{articles} + +10페이지 인스타 카드용 카피를 다음 JSON 한 객체로만 출력해라 (코드펜스 금지): +{{ + "cover_copy": {{"headline": "<훅 한 줄>", "body": "<서브카피 1~2줄>", "accent_color": "#hex"}}, + "body_copies": [ + {{"headline": "<포인트 헤드라인>", "body": "<2~4문장 본문>"}}, + ... (총 8개) + ], + "cta_copy": {{"headline": "<요약 한 줄>", "body": "<마무리 1~2줄>", "cta": "팔로우/저장 등"}}, + "suggested_caption": "<인스타 캡션 본문>", + "hashtags": ["#태그1", "#태그2", ...] 
+}} +""" + + +def _client() -> Anthropic: + return Anthropic(api_key=ANTHROPIC_API_KEY) + + +def _strip_codefence(s: str) -> str: + s = s.strip() + if s.startswith("```"): + s = re.sub(r"^```(?:json)?\s*|\s*```$", "", s).strip() + return s + + +def _load_prompt() -> str: + pt = db.get_prompt_template("slate_writer") + if pt and pt.get("template"): + return pt["template"] + return DEFAULT_PROMPT + + +def write_slate(keyword: str, category: str, + articles: Optional[list] = None) -> int: + """Claude로 10페이지 카피 생성 후 card_slates에 저장. slate_id 반환.""" + if articles is None: + articles = db.list_news_articles(category=category, days=2) + article_text = "\n".join( + f"- {a['title']}: {a.get('summary', '')[:120]}" for a in articles[:8] + ) or "(참고 기사 없음)" + + prompt = _load_prompt().format(category=category, keyword=keyword, articles=article_text) + msg = _client().messages.create( + model=ANTHROPIC_MODEL_SONNET, + max_tokens=4000, + messages=[{"role": "user", "content": prompt}], + ) + raw = msg.content[0].text + cleaned = _strip_codefence(raw) + try: + data: Dict[str, Any] = json.loads(cleaned) + except json.JSONDecodeError as e: + logger.warning("slate JSON parse failed: %s", e) + raise ValueError(f"Invalid JSON from LLM: {e}") from e + + body_copies = data.get("body_copies") or [] + if len(body_copies) != 8: + raise ValueError(f"body_copies must have 8 items, got {len(body_copies)}") + + cover = data.get("cover_copy") or {} + if not cover.get("accent_color"): + cover["accent_color"] = DEFAULT_ACCENT_BY_CATEGORY.get(category, "#222831") + + sid = db.add_card_slate({ + "keyword": keyword, + "category": category, + "status": "draft", + "cover_copy": cover, + "body_copies": body_copies, + "cta_copy": data.get("cta_copy") or {}, + "suggested_caption": data.get("suggested_caption") or "", + "hashtags": data.get("hashtags") or [], + }) + return sid diff --git a/insta-lab/app/config.py b/insta-lab/app/config.py new file mode 100644 index 0000000..347aae2 --- /dev/null +++ 
b/insta-lab/app/config.py @@ -0,0 +1,25 @@ +import os + +NAVER_CLIENT_ID = os.getenv("NAVER_CLIENT_ID", "") +NAVER_CLIENT_SECRET = os.getenv("NAVER_CLIENT_SECRET", "") +ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY", "") +ANTHROPIC_MODEL_HAIKU = os.getenv("ANTHROPIC_MODEL_HAIKU", "claude-haiku-4-5-20251001") +ANTHROPIC_MODEL_SONNET = os.getenv("ANTHROPIC_MODEL_SONNET", "claude-sonnet-4-6") + +INSTA_DATA_PATH = os.getenv("INSTA_DATA_PATH", "/app/data") +DB_PATH = os.path.join(INSTA_DATA_PATH, "insta.db") +CARDS_DIR = os.path.join(INSTA_DATA_PATH, "insta_cards") +CARD_TEMPLATE_DIR = os.getenv("CARD_TEMPLATE_DIR", "/app/app/templates") + +CORS_ALLOW_ORIGINS = os.getenv( + "CORS_ALLOW_ORIGINS", "http://localhost:3007,http://localhost:8080" +) + +NEWS_PER_CATEGORY = int(os.getenv("NEWS_PER_CATEGORY", "30")) +KEYWORDS_PER_CATEGORY = int(os.getenv("KEYWORDS_PER_CATEGORY", "5")) + +DEFAULT_CATEGORY_SEEDS = { + "economy": ["금리", "인플레이션", "환율", "주식", "부동산"], + "psychology": ["심리학", "스트레스", "우울증", "관계", "자존감"], + "celebrity": ["연예인", "드라마", "예능", "K-POP", "영화"], +} diff --git a/insta-lab/app/db.py b/insta-lab/app/db.py new file mode 100644 index 0000000..963218d --- /dev/null +++ b/insta-lab/app/db.py @@ -0,0 +1,278 @@ +import os +import sqlite3 +import json +import uuid +from typing import Any, Dict, List, Optional + +from .config import DB_PATH + + +def _conn() -> sqlite3.Connection: + os.makedirs(os.path.dirname(DB_PATH), exist_ok=True) + conn = sqlite3.connect(DB_PATH, timeout=120.0) + conn.row_factory = sqlite3.Row + conn.execute("PRAGMA journal_mode=WAL") + conn.execute("PRAGMA busy_timeout=120000") + conn.execute("PRAGMA foreign_keys=ON") + return conn + + +def init_db() -> None: + with _conn() as conn: + conn.execute(""" + CREATE TABLE IF NOT EXISTS news_articles ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + category TEXT NOT NULL, + title TEXT NOT NULL, + link TEXT NOT NULL UNIQUE, + summary TEXT NOT NULL DEFAULT '', + pub_date TEXT, + fetched_at TEXT NOT NULL 
DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')) + ) + """) + conn.execute("CREATE INDEX IF NOT EXISTS idx_na_category_fetched ON news_articles(category, fetched_at DESC)") + + conn.execute(""" + CREATE TABLE IF NOT EXISTS trending_keywords ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + keyword TEXT NOT NULL, + category TEXT NOT NULL, + score REAL NOT NULL DEFAULT 0, + articles_count INTEGER NOT NULL DEFAULT 0, + suggested_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')), + used INTEGER NOT NULL DEFAULT 0 + ) + """) + conn.execute("CREATE INDEX IF NOT EXISTS idx_tk_score ON trending_keywords(category, score DESC)") + + conn.execute(""" + CREATE TABLE IF NOT EXISTS card_slates ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + keyword TEXT NOT NULL, + category TEXT NOT NULL, + status TEXT NOT NULL DEFAULT 'draft', + cover_copy TEXT NOT NULL DEFAULT '{}', + body_copies TEXT NOT NULL DEFAULT '[]', + cta_copy TEXT NOT NULL DEFAULT '{}', + suggested_caption TEXT NOT NULL DEFAULT '', + hashtags TEXT NOT NULL DEFAULT '[]', + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')), + updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')) + ) + """) + conn.execute("CREATE INDEX IF NOT EXISTS idx_cs_created ON card_slates(created_at DESC)") + + conn.execute(""" + CREATE TABLE IF NOT EXISTS card_assets ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + slate_id INTEGER NOT NULL REFERENCES card_slates(id) ON DELETE CASCADE, + page_index INTEGER NOT NULL, + file_path TEXT NOT NULL, + file_hash TEXT NOT NULL DEFAULT '', + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')), + UNIQUE (slate_id, page_index) + ) + """) + conn.execute("CREATE INDEX IF NOT EXISTS idx_ca_slate ON card_assets(slate_id, page_index)") + + conn.execute(""" + CREATE TABLE IF NOT EXISTS generation_tasks ( + id TEXT PRIMARY KEY, + type TEXT NOT NULL, + status TEXT NOT NULL DEFAULT 'queued', + progress INTEGER NOT NULL DEFAULT 0, + message TEXT NOT NULL DEFAULT 
'', + result_id INTEGER, + error TEXT, + params TEXT NOT NULL DEFAULT '{}', + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')), + updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')) + ) + """) + conn.execute("CREATE INDEX IF NOT EXISTS idx_gt_created ON generation_tasks(created_at DESC)") + + conn.execute(""" + CREATE TABLE IF NOT EXISTS prompt_templates ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + name TEXT NOT NULL UNIQUE, + description TEXT NOT NULL DEFAULT '', + template TEXT NOT NULL DEFAULT '', + updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')) + ) + """) + + +# ── news_articles ──────────────────────────────────────────────── +def add_news_article(row: Dict[str, Any]) -> int: + with _conn() as conn: + try: + cur = conn.execute( + "INSERT INTO news_articles(category, title, link, summary, pub_date) VALUES(?,?,?,?,?)", + (row["category"], row["title"], row["link"], row.get("summary", ""), row.get("pub_date")), + ) + return cur.lastrowid + except sqlite3.IntegrityError: + existing = conn.execute("SELECT id FROM news_articles WHERE link=?", (row["link"],)).fetchone() + return existing["id"] if existing else 0 + + +def list_news_articles(category: Optional[str] = None, days: int = 1) -> List[Dict[str, Any]]: + sql = "SELECT * FROM news_articles WHERE fetched_at >= datetime('now', ?)" + params: List[Any] = [f"-{int(days)} days"] + if category: + sql += " AND category=?" 
+ params.append(category) + sql += " ORDER BY fetched_at DESC" + with _conn() as conn: + rows = conn.execute(sql, params).fetchall() + return [dict(r) for r in rows] + + +# ── trending_keywords ─────────────────────────────────────────── +def add_trending_keyword(row: Dict[str, Any]) -> int: + with _conn() as conn: + cur = conn.execute( + "INSERT INTO trending_keywords(keyword, category, score, articles_count) VALUES(?,?,?,?)", + (row["keyword"], row["category"], float(row.get("score", 0.0)), int(row.get("articles_count", 0))), + ) + return cur.lastrowid + + +def list_trending_keywords(category: Optional[str] = None, used: Optional[bool] = None) -> List[Dict[str, Any]]: + sql = "SELECT * FROM trending_keywords WHERE 1=1" + params: List[Any] = [] + if category: + sql += " AND category=?" + params.append(category) + if used is not None: + sql += " AND used=?" + params.append(1 if used else 0) + sql += " ORDER BY score DESC, suggested_at DESC" + with _conn() as conn: + rows = conn.execute(sql, params).fetchall() + return [dict(r) for r in rows] + + +def mark_keyword_used(keyword_id: int) -> None: + with _conn() as conn: + conn.execute("UPDATE trending_keywords SET used=1 WHERE id=?", (keyword_id,)) + + +def get_trending_keyword(keyword_id: int) -> Optional[Dict[str, Any]]: + with _conn() as conn: + row = conn.execute("SELECT * FROM trending_keywords WHERE id=?", (keyword_id,)).fetchone() + return dict(row) if row else None + + +# ── card_slates ───────────────────────────────────────────────── +def add_card_slate(row: Dict[str, Any]) -> int: + with _conn() as conn: + cur = conn.execute(""" + INSERT INTO card_slates(keyword, category, status, cover_copy, body_copies, cta_copy, + suggested_caption, hashtags) + VALUES(?,?,?,?,?,?,?,?) 
+ """, ( + row["keyword"], row["category"], row.get("status", "draft"), + json.dumps(row.get("cover_copy", {}), ensure_ascii=False), + json.dumps(row.get("body_copies", []), ensure_ascii=False), + json.dumps(row.get("cta_copy", {}), ensure_ascii=False), + row.get("suggested_caption", ""), + json.dumps(row.get("hashtags", []), ensure_ascii=False), + )) + return cur.lastrowid + + +def update_slate_status(slate_id: int, status: str) -> None: + with _conn() as conn: + conn.execute( + "UPDATE card_slates SET status=?, updated_at=strftime('%Y-%m-%dT%H:%M:%fZ','now') WHERE id=?", + (status, slate_id), + ) + + +def get_card_slate(slate_id: int) -> Optional[Dict[str, Any]]: + with _conn() as conn: + row = conn.execute("SELECT * FROM card_slates WHERE id=?", (slate_id,)).fetchone() + return dict(row) if row else None + + +def list_card_slates(limit: int = 50) -> List[Dict[str, Any]]: + with _conn() as conn: + rows = conn.execute( + "SELECT * FROM card_slates ORDER BY created_at DESC LIMIT ?", + (limit,), + ).fetchall() + return [dict(r) for r in rows] + + +def delete_card_slate(slate_id: int) -> None: + with _conn() as conn: + conn.execute("DELETE FROM card_slates WHERE id=?", (slate_id,)) + + +# ── card_assets ───────────────────────────────────────────────── +def add_card_asset(slate_id: int, page_index: int, file_path: str, file_hash: str = "") -> int: + with _conn() as conn: + cur = conn.execute(""" + INSERT INTO card_assets(slate_id, page_index, file_path, file_hash) + VALUES(?,?,?,?) + ON CONFLICT(slate_id, page_index) DO UPDATE SET + file_path=excluded.file_path, file_hash=excluded.file_hash + """, (slate_id, page_index, file_path, file_hash)) + return cur.lastrowid + + +def list_card_assets(slate_id: int) -> List[Dict[str, Any]]: + with _conn() as conn: + rows = conn.execute( + "SELECT * FROM card_assets WHERE slate_id=? 
ORDER BY page_index ASC", + (slate_id,), + ).fetchall() + return [dict(r) for r in rows] + + +# ── generation_tasks ──────────────────────────────────────────── +def create_task(task_type: str, params: Dict[str, Any]) -> str: + tid = uuid.uuid4().hex + with _conn() as conn: + conn.execute( + "INSERT INTO generation_tasks(id, type, params) VALUES(?,?,?)", + (tid, task_type, json.dumps(params, ensure_ascii=False)), + ) + return tid + + +def update_task(task_id: str, status: str, progress: int = 0, message: str = "", + result_id: Optional[int] = None, error: Optional[str] = None) -> None: + with _conn() as conn: + conn.execute(""" + UPDATE generation_tasks + SET status=?, progress=?, message=?, result_id=?, error=?, + updated_at=strftime('%Y-%m-%dT%H:%M:%fZ','now') + WHERE id=? + """, (status, progress, message, result_id, error, task_id)) + + +def get_task(task_id: str) -> Optional[Dict[str, Any]]: + with _conn() as conn: + row = conn.execute("SELECT * FROM generation_tasks WHERE id=?", (task_id,)).fetchone() + return dict(row) if row else None + + +# ── prompt_templates ──────────────────────────────────────────── +def upsert_prompt_template(name: str, template: str, description: str = "") -> None: + with _conn() as conn: + conn.execute(""" + INSERT INTO prompt_templates(name, description, template) + VALUES(?,?,?) 
+ ON CONFLICT(name) DO UPDATE SET + template=excluded.template, + description=excluded.description, + updated_at=strftime('%Y-%m-%dT%H:%M:%fZ','now') + """, (name, description, template)) + + +def get_prompt_template(name: str) -> Optional[Dict[str, Any]]: + with _conn() as conn: + row = conn.execute("SELECT * FROM prompt_templates WHERE name=?", (name,)).fetchone() + return dict(row) if row else None diff --git a/insta-lab/app/keyword_extractor.py b/insta-lab/app/keyword_extractor.py new file mode 100644 index 0000000..2c307e1 --- /dev/null +++ b/insta-lab/app/keyword_extractor.py @@ -0,0 +1,83 @@ +"""키워드 추출 — 한글 명사 빈도 + Claude Haiku 정제.""" + +import json +import logging +import re +from collections import Counter +from typing import Any, Dict, List + +from anthropic import Anthropic + +from .config import ANTHROPIC_API_KEY, ANTHROPIC_MODEL_HAIKU, KEYWORDS_PER_CATEGORY +from . import db + +logger = logging.getLogger(__name__) + +_NOUN_RE = re.compile(r"[가-힣]{2,6}") +_STOPWORDS = { + "있다", "없다", "이다", "되다", "그리고", "하지만", "통해", "위해", "오늘", "이번", + "지난", "관련", "대해", "또한", "다만", "한편", "최근", "앞서", "현재", "진행", + "발생", "결과", "이상", "이하", "여러", "다양", "방법", "경우", "이유", "필요", +} + + +def _count_nouns(text: str) -> Dict[str, int]: + tokens = _NOUN_RE.findall(text or "") + return Counter(tokens) + + +def _top_candidates(counts: Dict[str, int], n: int = 20) -> List[tuple]: + filtered = [(k, c) for k, c in counts.items() if k not in _STOPWORDS] + return sorted(filtered, key=lambda x: x[1], reverse=True)[:n] + + +def _refine_with_llm(category: str, candidates: List[tuple], articles: List[Dict[str, Any]]) -> List[Dict[str, Any]]: + """Claude Haiku로 후보 정제. JSON 리스트 [{keyword, score(0~1), reason}] 반환.""" + if not ANTHROPIC_API_KEY: + return [{"keyword": k, "score": min(1.0, c / 10), "reason": "freq"} for k, c in candidates[:KEYWORDS_PER_CATEGORY]] + + client = Anthropic(api_key=ANTHROPIC_API_KEY) + titles = [a["title"] for a in articles[:15]] + prompt = f"""너는 인스타그램 카드 뉴스 큐레이터다. 
+카테고리: {category} +빈도 상위 후보: {[k for k, _ in candidates]} +관련 기사 제목 일부: +{chr(10).join('- ' + t for t in titles)} + +이 후보 중에서 인스타 카드 콘텐츠로 적합한 키워드를 score 내림차순으로 최대 {KEYWORDS_PER_CATEGORY}개 골라. +출력 형식 (JSON 배열만): +[{{"keyword": "...", "score": 0.0~1.0, "reason": "..."}}] +""" + msg = client.messages.create( + model=ANTHROPIC_MODEL_HAIKU, + max_tokens=600, + messages=[{"role": "user", "content": prompt}], + ) + text = msg.content[0].text.strip() + if text.startswith("```"): + text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text).strip() + try: + return json.loads(text) + except Exception: + logger.warning("LLM refine JSON parse failed, falling back to freq") + return [{"keyword": k, "score": min(1.0, c / 10), "reason": "freq-fallback"} for k, c in candidates[:KEYWORDS_PER_CATEGORY]] + + +def extract_for_category(category: str, limit: int = KEYWORDS_PER_CATEGORY) -> List[Dict[str, Any]]: + """카테고리 기사들에서 키워드를 뽑아 DB에 저장하고 결과 반환.""" + articles = db.list_news_articles(category=category, days=2) + text_blob = "\n".join((a["title"] + " " + a.get("summary", "")) for a in articles) + counts = _count_nouns(text_blob) + candidates = _top_candidates(counts, n=20) + refined = _refine_with_llm(category, candidates, articles)[:limit] + + saved: List[Dict[str, Any]] = [] + for kw in refined: + kid = db.add_trending_keyword({ + "keyword": kw["keyword"], + "category": category, + "score": float(kw.get("score", 0.0)), + "articles_count": sum(1 for a in articles if kw["keyword"] in a["title"]), + }) + saved.append({"id": kid, **kw, "category": category}) + return saved diff --git a/insta-lab/app/main.py b/insta-lab/app/main.py new file mode 100644 index 0000000..fe0ef80 --- /dev/null +++ b/insta-lab/app/main.py @@ -0,0 +1,245 @@ +"""FastAPI entrypoint for insta-lab.""" + +import asyncio +import json +import logging +import os +from typing import Optional + +from fastapi import FastAPI, HTTPException, BackgroundTasks, Body, Query +from fastapi.middleware.cors import CORSMiddleware +from 
fastapi.responses import FileResponse +from pydantic import BaseModel + +from .config import ( + CORS_ALLOW_ORIGINS, NAVER_CLIENT_ID, ANTHROPIC_API_KEY, + INSTA_DATA_PATH, DB_PATH, DEFAULT_CATEGORY_SEEDS, KEYWORDS_PER_CATEGORY, +) +from . import db, news_collector, keyword_extractor, card_writer, card_renderer + +logger = logging.getLogger(__name__) +app = FastAPI() + +app.add_middleware( + CORSMiddleware, + allow_origins=[o.strip() for o in CORS_ALLOW_ORIGINS.split(",")], + allow_credentials=False, + allow_methods=["GET", "POST", "PUT", "DELETE", "OPTIONS", "PATCH"], + allow_headers=["Content-Type"], +) + + +@app.on_event("startup") +def on_startup(): + os.makedirs(INSTA_DATA_PATH, exist_ok=True) + db.init_db() + + +@app.get("/health") +def health(): + return {"ok": True} + + +@app.get("/api/insta/status") +def status(): + return { + "ok": True, + "naver_api": bool(NAVER_CLIENT_ID), + "anthropic_api": bool(ANTHROPIC_API_KEY), + } + + +# ── News ───────────────────────────────────────────────────────── +class CollectRequest(BaseModel): + categories: Optional[list[str]] = None + + +def _seeds_for(category: str) -> list[str]: + pt = db.get_prompt_template("category_seeds") + if pt and pt.get("template"): + try: + data = json.loads(pt["template"]) + if category in data: + return list(data[category]) + except Exception: + pass + return list(DEFAULT_CATEGORY_SEEDS.get(category, [])) + + +async def _bg_collect(task_id: str, categories: list[str]): + try: + db.update_task(task_id, "processing", 10, "수집 중") + total = 0 + for cat in categories: + seeds = _seeds_for(cat) + if not seeds: + continue + total += news_collector.collect_for_category(cat, seeds) + db.update_task(task_id, "succeeded", 100, f"{total}건 수집", result_id=total) + except Exception as e: + logger.exception("collect failed") + db.update_task(task_id, "failed", 0, "", error=str(e)) + + +@app.post("/api/insta/news/collect") +def collect_news(req: CollectRequest, bg: BackgroundTasks): + cats = req.categories or 
list(DEFAULT_CATEGORY_SEEDS.keys()) + tid = db.create_task("news_collect", {"categories": cats}) + bg.add_task(_bg_collect, tid, cats) + return {"task_id": tid, "categories": cats} + + +@app.get("/api/insta/news/articles") +def list_articles(category: Optional[str] = None, days: int = Query(7, ge=1, le=90)): + return {"items": db.list_news_articles(category=category, days=days)} + + +# ── Keywords ───────────────────────────────────────────────────── +class ExtractRequest(BaseModel): + categories: Optional[list[str]] = None + + +async def _bg_extract(task_id: str, categories: list[str]): + try: + db.update_task(task_id, "processing", 10, "추출 중") + for cat in categories: + keyword_extractor.extract_for_category(cat, limit=KEYWORDS_PER_CATEGORY) + db.update_task(task_id, "succeeded", 100, "완료", result_id=0) + except Exception as e: + logger.exception("extract failed") + db.update_task(task_id, "failed", 0, "", error=str(e)) + + +@app.post("/api/insta/keywords/extract") +def extract_keywords(req: ExtractRequest, bg: BackgroundTasks): + cats = req.categories or list(DEFAULT_CATEGORY_SEEDS.keys()) + tid = db.create_task("keyword_extract", {"categories": cats}) + bg.add_task(_bg_extract, tid, cats) + return {"task_id": tid, "categories": cats} + + +@app.get("/api/insta/keywords") +def list_keywords(category: Optional[str] = None, used: Optional[bool] = None): + return {"items": db.list_trending_keywords(category=category, used=used)} + + +# ── Slates ─────────────────────────────────────────────────────── +class SlateRequest(BaseModel): + keyword: str + category: str + keyword_id: Optional[int] = None + + +async def _bg_create_slate(task_id: str, keyword: str, category: str, keyword_id: Optional[int]): + try: + db.update_task(task_id, "processing", 30, "카피 생성 중") + sid = card_writer.write_slate(keyword=keyword, category=category) + db.update_task(task_id, "processing", 70, "카드 렌더 중") + await card_renderer.render_slate(sid) + db.update_slate_status(sid, "rendered") + if 
keyword_id: + db.mark_keyword_used(keyword_id) + db.update_task(task_id, "succeeded", 100, "완료", result_id=sid) + except Exception as e: + logger.exception("create slate failed") + db.update_task(task_id, "failed", 0, "", error=str(e)) + + +@app.post("/api/insta/slates") +def create_slate(req: SlateRequest, bg: BackgroundTasks): + tid = db.create_task("slate_create", req.dict()) + bg.add_task(_bg_create_slate, tid, req.keyword, req.category, req.keyword_id) + return {"task_id": tid} + + +@app.get("/api/insta/slates") +def list_slates(limit: int = Query(50, ge=1, le=500)): + return {"items": db.list_card_slates(limit=limit)} + + +@app.get("/api/insta/slates/{slate_id}") +def get_slate(slate_id: int): + s = db.get_card_slate(slate_id) + if not s: + raise HTTPException(404, "slate not found") + s["assets"] = db.list_card_assets(slate_id) + for k in ("cover_copy", "body_copies", "cta_copy", "hashtags"): + if isinstance(s.get(k), str): + try: + s[k] = json.loads(s[k]) + except Exception: + pass + return s + + +async def _bg_render(task_id: str, slate_id: int): + try: + db.update_task(task_id, "processing", 30, "재렌더 중") + await card_renderer.render_slate(slate_id) + db.update_slate_status(slate_id, "rendered") + db.update_task(task_id, "succeeded", 100, "완료", result_id=slate_id) + except Exception as e: + logger.exception("render failed") + db.update_task(task_id, "failed", 0, "", error=str(e)) + + +@app.post("/api/insta/slates/{slate_id}/render") +def render_slate_endpoint(slate_id: int, bg: BackgroundTasks): + if not db.get_card_slate(slate_id): + raise HTTPException(404, "slate not found") + tid = db.create_task("slate_render", {"slate_id": slate_id}) + bg.add_task(_bg_render, tid, slate_id) + return {"task_id": tid} + + +@app.get("/api/insta/slates/{slate_id}/assets/{page}") +def get_asset(slate_id: int, page: int): + if not (1 <= page <= 10): + raise HTTPException(400, "page must be 1..10") + assets = db.list_card_assets(slate_id) + match = next((a for a in assets 
if a["page_index"] == page), None) + if not match: + raise HTTPException(404, "asset not found") + return FileResponse(match["file_path"], media_type="image/png") + + +@app.delete("/api/insta/slates/{slate_id}") +def delete_slate(slate_id: int): + if not db.get_card_slate(slate_id): + raise HTTPException(404) + for a in db.list_card_assets(slate_id): + try: + os.unlink(a["file_path"]) + except OSError: + pass + db.delete_card_slate(slate_id) + return {"ok": True} + + +# ── Tasks ──────────────────────────────────────────────────────── +@app.get("/api/insta/tasks/{task_id}") +def get_task_status(task_id: str): + t = db.get_task(task_id) + if not t: + raise HTTPException(404) + return t + + +# ── Prompt Templates ───────────────────────────────────────────── +class TemplateBody(BaseModel): + template: str + description: str = "" + + +@app.get("/api/insta/templates/prompts/{name}") +def get_prompt(name: str): + pt = db.get_prompt_template(name) + if not pt: + raise HTTPException(404) + return pt + + +@app.put("/api/insta/templates/prompts/{name}") +def upsert_prompt(name: str, body: TemplateBody): + db.upsert_prompt_template(name, body.template, body.description) + return db.get_prompt_template(name) diff --git a/insta-lab/app/news_collector.py b/insta-lab/app/news_collector.py new file mode 100644 index 0000000..94acda7 --- /dev/null +++ b/insta-lab/app/news_collector.py @@ -0,0 +1,82 @@ +"""NAVER 뉴스 검색 API 연동 — 카테고리별 시드 키워드로 일일 수집.""" + +import html +import logging +import re +from typing import Any, Dict, List, Optional + +import requests + +from .config import NAVER_CLIENT_ID, NAVER_CLIENT_SECRET, NEWS_PER_CATEGORY +from . 
import db + +logger = logging.getLogger(__name__) + +NEWS_URL = "https://openapi.naver.com/v1/search/news.json" +_HEADERS = { + "X-Naver-Client-Id": NAVER_CLIENT_ID, + "X-Naver-Client-Secret": NAVER_CLIENT_SECRET, +} +_TAG_RE = re.compile(r"<[^>]+>") + + +def _clean(text: str) -> str: + if not text: + return "" + no_tag = _TAG_RE.sub("", text) + return html.unescape(no_tag).strip() + + +def search_news(keyword: str, display: int = 30, sort: str = "date") -> List[Dict[str, Any]]: + """NAVER news.json 단일 호출. + + Returns: list of {title, link, summary, pub_date} + """ + resp = requests.get( + NEWS_URL, + headers=_HEADERS, + params={"query": keyword, "display": display, "sort": sort}, + timeout=10, + ) + resp.raise_for_status() + data = resp.json() + return [ + { + "title": _clean(item.get("title", "")), + "link": item.get("link") or item.get("originallink", ""), + "summary": _clean(item.get("description", "")), + "pub_date": item.get("pubDate", ""), + } + for item in data.get("items", []) + ] + + +def collect_for_category(category: str, + seed_keywords: List[str], + per_keyword: Optional[int] = None) -> int: + """카테고리에 대해 시드 키워드 각각으로 검색 후 DB에 삽입. + UNIQUE(link)가 중복 삽입을 막음. 시도된 기사 수(중복 포함) 반환. 
+ """ + per_kw = per_keyword if per_keyword is not None else max(1, NEWS_PER_CATEGORY // max(1, len(seed_keywords))) + seen_links = set() + attempted = 0 + for kw in seed_keywords: + try: + items = search_news(kw, display=per_kw) + except Exception as e: + logger.warning("search_news failed kw=%s err=%s", kw, e) + continue + for item in items: + link = item["link"] + if not link or link in seen_links: + continue + seen_links.add(link) + db.add_news_article({ + "category": category, + "title": item["title"], + "link": link, + "summary": item["summary"], + "pub_date": item["pub_date"], + }) + attempted += 1 + return attempted diff --git a/blog-lab/tests/__init__.py b/insta-lab/app/templates/__init__.py similarity index 100% rename from blog-lab/tests/__init__.py rename to insta-lab/app/templates/__init__.py diff --git a/insta-lab/app/templates/default/.gitkeep b/insta-lab/app/templates/default/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/insta-lab/app/templates/default/card.html.j2 b/insta-lab/app/templates/default/card.html.j2 new file mode 100644 index 0000000..836c3cb --- /dev/null +++ b/insta-lab/app/templates/default/card.html.j2 @@ -0,0 +1,55 @@ + + + + + + + +
+<!DOCTYPE html>
+<html lang="ko">
+<head>
+<meta charset="utf-8">
+<style>
+  * { margin: 0; padding: 0; box-sizing: border-box; }
+  body {
+    width: 1080px; height: 1350px;  /* Instagram 4:5 portrait card */
+    display: flex; flex-direction: column; justify-content: space-between;
+    padding: 96px 80px;
+    font-family: "Noto Sans KR", sans-serif;
+    background: #ffffff; color: #111;
+    word-break: keep-all;
+  }
+  .badge {
+    align-self: flex-start;
+    padding: 10px 24px; border-radius: 999px;
+    background: {{ accent_color | default('#0F62FE') }};
+    color: #fff; font-size: 28px; font-weight: 700; letter-spacing: 2px;
+  }
+  .headline { font-size: 72px; font-weight: 800; line-height: 1.25; }
+  .body { font-size: 40px; line-height: 1.6; color: #333; }
+</style>
+</head>
+<body>
+<div class="badge">
+  {{ page_type|upper }}
+</div>
+<div class="headline">
+  {{ headline }}
+</div>
+<div class="body">
+  {{ body }}
+</div>
+</body>
+</html>
+ + diff --git a/blog-lab/pytest.ini b/insta-lab/pytest.ini similarity index 100% rename from blog-lab/pytest.ini rename to insta-lab/pytest.ini diff --git a/blog-lab/requirements.txt b/insta-lab/requirements.txt similarity index 57% rename from blog-lab/requirements.txt rename to insta-lab/requirements.txt index 7cc350e..3d26add 100644 --- a/blog-lab/requirements.txt +++ b/insta-lab/requirements.txt @@ -1,6 +1,9 @@ fastapi==0.115.6 uvicorn[standard]==0.34.0 requests==2.32.3 -anthropic==0.52.0 -beautifulsoup4>=4.12 httpx>=0.27 +anthropic==0.52.0 +jinja2>=3.1.4 +playwright==1.48.0 +pytest>=8.0 +pytest-asyncio>=0.24 diff --git a/insta-lab/tests/__init__.py b/insta-lab/tests/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/insta-lab/tests/test_card_renderer.py b/insta-lab/tests/test_card_renderer.py new file mode 100644 index 0000000..647a71d --- /dev/null +++ b/insta-lab/tests/test_card_renderer.py @@ -0,0 +1,48 @@ +import os +import tempfile + +import pytest + +from app import db as db_module +from app import card_renderer + + +@pytest.fixture +def tmp_db_and_dirs(monkeypatch, tmp_path): + fd, path = tempfile.mkstemp(suffix=".db") + os.close(fd) + monkeypatch.setattr(db_module, "DB_PATH", path) + monkeypatch.setattr(card_renderer, "CARDS_DIR", str(tmp_path / "cards")) + db_module.init_db() + yield path + import gc + gc.collect() + for ext in ("", "-wal", "-shm"): + try: + os.remove(path + ext) + except OSError: + pass + + +def _seed_slate() -> int: + return db_module.add_card_slate({ + "keyword": "테스트", + "category": "economy", + "status": "draft", + "cover_copy": {"headline": "커버 헤드라인", "body": "서브카피", "accent_color": "#0F62FE"}, + "body_copies": [{"headline": f"본문 {i+1}", "body": f"내용 {i+1}"} for i in range(8)], + "cta_copy": {"headline": "마무리", "body": "감사합니다", "cta": "팔로우"}, + }) + + +@pytest.mark.asyncio +async def test_render_slate_produces_ten_pngs(tmp_db_and_dirs): + sid = _seed_slate() + paths = await card_renderer.render_slate(sid) + 
assert len(paths) == 10 + for p in paths: + assert os.path.exists(p) + assert os.path.getsize(p) > 1000 # > 1 KB sanity + db_module.update_slate_status(sid, "rendered") + assets = db_module.list_card_assets(sid) + assert {a["page_index"] for a in assets} == set(range(1, 11)) diff --git a/insta-lab/tests/test_card_writer.py b/insta-lab/tests/test_card_writer.py new file mode 100644 index 0000000..a5263e0 --- /dev/null +++ b/insta-lab/tests/test_card_writer.py @@ -0,0 +1,75 @@ +import json +import os +import tempfile +from unittest.mock import patch, MagicMock + +import pytest + +from app import db as db_module +from app import card_writer + + +@pytest.fixture +def tmp_db(monkeypatch): + fd, path = tempfile.mkstemp(suffix=".db") + os.close(fd) + monkeypatch.setattr(db_module, "DB_PATH", path) + db_module.init_db() + yield path + import gc + gc.collect() + for ext in ("", "-wal", "-shm"): + try: + os.remove(path + ext) + except OSError: + pass + + +SAMPLE_LLM_JSON = { + "cover_copy": {"headline": "금리 인상 단행", "body": "왜 지금?", "accent_color": "#0F62FE"}, + "body_copies": [ + {"headline": f"포인트 {i+1}", "body": f"본문 {i+1}"} for i in range(8) + ], + "cta_copy": {"headline": "정리", "body": "바로 확인", "cta": "팔로우"}, + "suggested_caption": "금리에 대해 알아보자", + "hashtags": ["#금리", "#경제"], +} + + +def _fake_messages_create(*_args, **_kwargs): + msg = MagicMock() + block = MagicMock() + block.text = json.dumps(SAMPLE_LLM_JSON, ensure_ascii=False) + msg.content = [block] + return msg + + +def test_write_slate_persists_full_payload(tmp_db, monkeypatch): + db_module.add_news_article({ + "category": "economy", "title": "기준금리 인상 단행", + "link": "https://example.com/1", "summary": "한국은행 발표", + }) + fake_client = MagicMock() + fake_client.messages.create = _fake_messages_create + monkeypatch.setattr(card_writer, "_client", lambda: fake_client) + + sid = card_writer.write_slate(keyword="기준금리", category="economy") + slate = db_module.get_card_slate(sid) + assert slate["status"] == "draft" + 
body_copies = json.loads(slate["body_copies"]) + assert len(body_copies) == 8 + assert body_copies[0]["headline"] == "포인트 1" + assert json.loads(slate["cover_copy"])["accent_color"] == "#0F62FE" + + +def test_write_slate_raises_on_invalid_json(tmp_db, monkeypatch): + fake_client = MagicMock() + bad_msg = MagicMock() + bad_block = MagicMock() + bad_block.text = "not json" + bad_msg.content = [bad_block] + fake_client.messages.create.return_value = bad_msg + monkeypatch.setattr(card_writer, "_client", lambda: fake_client) + + with pytest.raises(ValueError): + card_writer.write_slate(keyword="x", category="economy") diff --git a/insta-lab/tests/test_db.py b/insta-lab/tests/test_db.py new file mode 100644 index 0000000..9a853a9 --- /dev/null +++ b/insta-lab/tests/test_db.py @@ -0,0 +1,96 @@ +import os +import json +import tempfile + +import pytest + +from app import db as db_module + + +@pytest.fixture +def tmp_db(monkeypatch): + fd, path = tempfile.mkstemp(suffix=".db") + os.close(fd) + monkeypatch.setattr(db_module, "DB_PATH", path) + db_module.init_db() + yield path + # Close all SQLite WAL files before removal (needed on Windows) + import gc + gc.collect() + for ext in ("", "-wal", "-shm"): + try: + os.remove(path + ext) + except FileNotFoundError: + pass + + +def test_init_db_creates_six_tables(tmp_db): + with db_module._conn() as conn: + rows = conn.execute( + "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name" + ).fetchall() + names = sorted(r[0] for r in rows if not r[0].startswith("sqlite_")) + assert names == sorted([ + "news_articles", "trending_keywords", "card_slates", + "card_assets", "generation_tasks", "prompt_templates", + ]) + + +def test_news_article_roundtrip(tmp_db): + aid = db_module.add_news_article({ + "category": "economy", + "title": "금리 인상 발표", + "link": "https://example.com/1", + "summary": "한국은행이 기준금리를 인상했다.", + "pub_date": "2026-05-15T08:00:00", + }) + assert isinstance(aid, int) + rows = 
db_module.list_news_articles(category="economy", days=7) + assert len(rows) == 1 + assert rows[0]["title"] == "금리 인상 발표" + + +def test_trending_keyword_roundtrip(tmp_db): + kid = db_module.add_trending_keyword({ + "keyword": "기준금리", + "category": "economy", + "score": 0.87, + "articles_count": 12, + }) + assert isinstance(kid, int) + items = db_module.list_trending_keywords(category="economy", used=False) + assert items[0]["score"] == pytest.approx(0.87) + + +def test_card_slate_with_assets(tmp_db): + sid = db_module.add_card_slate({ + "keyword": "기준금리", + "category": "economy", + "cover_copy": {"headline": "금리 인상", "body": "왜?", "accent_color": "#0F62FE"}, + "body_copies": [{"headline": f"H{i}", "body": f"B{i}"} for i in range(8)], + "cta_copy": {"headline": "정리", "body": "바로 확인", "cta": "팔로우"}, + "suggested_caption": "금리에 대해 알아보자", + "hashtags": ["#금리", "#경제"], + }) + db_module.add_card_asset(sid, page_index=1, file_path="/tmp/01.png", file_hash="abc") + slate = db_module.get_card_slate(sid) + assert slate["status"] == "draft" + assert json.loads(slate["body_copies"])[0]["headline"] == "H0" + assets = db_module.list_card_assets(sid) + assert assets[0]["page_index"] == 1 + + +def test_generation_task_lifecycle(tmp_db): + tid = db_module.create_task("collect", {"category": "economy"}) + db_module.update_task(tid, status="processing", progress=50, message="..") + db_module.update_task(tid, status="succeeded", progress=100, message="ok", result_id=123) + t = db_module.get_task(tid) + assert t["status"] == "succeeded" + assert t["result_id"] == 123 + + +def test_prompt_template_upsert(tmp_db): + db_module.upsert_prompt_template("slate_writer", "v1 template", "writer") + db_module.upsert_prompt_template("slate_writer", "v2 template", "writer") + pt = db_module.get_prompt_template("slate_writer") + assert pt["template"] == "v2 template" diff --git a/insta-lab/tests/test_keyword_extractor.py b/insta-lab/tests/test_keyword_extractor.py new file mode 100644 index 
0000000..ac476d3 --- /dev/null +++ b/insta-lab/tests/test_keyword_extractor.py @@ -0,0 +1,65 @@ +import os +import tempfile +from unittest.mock import patch, MagicMock + +import pytest + +from app import db as db_module +from app import keyword_extractor + + +@pytest.fixture +def tmp_db(monkeypatch): + fd, path = tempfile.mkstemp(suffix=".db") + os.close(fd) + monkeypatch.setattr(db_module, "DB_PATH", path) + db_module.init_db() + yield path + # Windows-safe cleanup: close handles + remove sidecars + import gc + gc.collect() + for ext in ("", "-wal", "-shm"): + try: + os.remove(path + ext) + except OSError: + pass + + +def test_count_nouns_extracts_korean_nouns(): + text = "기준금리 인상으로 환율 급등. 기준금리 추가 인상 가능성" + counts = keyword_extractor._count_nouns(text) + assert counts["기준금리"] == 2 + assert counts["환율"] == 1 + + +def test_top_candidates_filters_stopwords(): + counts = {"기준금리": 5, "있다": 7, "환율": 3, "그리고": 4} + top = keyword_extractor._top_candidates(counts, n=10) + keywords = [k for k, _ in top] + assert "있다" not in keywords + assert "그리고" not in keywords + assert "기준금리" in keywords + + +def test_extract_for_category_persists(tmp_db): + # seed articles + for i in range(3): + db_module.add_news_article({ + "category": "economy", + "title": f"기준금리 인상 {i}", + "link": f"https://example.com/{i}", + "summary": "환율도 영향", + }) + + # mock LLM refinement + fake_refined = [ + {"keyword": "기준금리", "score": 0.92, "reason": "핵심 금융 이슈"}, + {"keyword": "환율", "score": 0.71, "reason": "시장 영향"}, + ] + with patch.object(keyword_extractor, "_refine_with_llm", return_value=fake_refined): + kws = keyword_extractor.extract_for_category("economy", limit=2) + + assert len(kws) == 2 + assert kws[0]["keyword"] == "기준금리" + persisted = db_module.list_trending_keywords(category="economy") + assert {p["keyword"] for p in persisted} == {"기준금리", "환율"} diff --git a/insta-lab/tests/test_main.py b/insta-lab/tests/test_main.py new file mode 100644 index 0000000..7ae31ce --- /dev/null +++ 
b/insta-lab/tests/test_main.py @@ -0,0 +1,91 @@ +import os +import tempfile + +import pytest +from fastapi.testclient import TestClient + +from app import db as db_module + + +@pytest.fixture +def client(monkeypatch): + fd, path = tempfile.mkstemp(suffix=".db") + os.close(fd) + monkeypatch.setattr(db_module, "DB_PATH", path) + db_module.init_db() + from app import main + monkeypatch.setattr(main, "DB_PATH", path) + with TestClient(main.app) as c: + yield c + import gc + gc.collect() + for ext in ("", "-wal", "-shm"): + try: + os.remove(path + ext) + except OSError: + pass + + +def test_health(client): + resp = client.get("/health") + assert resp.status_code == 200 + assert resp.json()["ok"] is True + + +def test_status_endpoint(client): + resp = client.get("/api/insta/status") + assert resp.status_code == 200 + j = resp.json() + assert "naver_api" in j and "anthropic_api" in j + + +def test_news_articles_listing(client): + db_module.add_news_article({ + "category": "economy", "title": "T1", "link": "https://x/1", "summary": "S", + }) + resp = client.get("/api/insta/news/articles?category=economy&days=7") + assert resp.status_code == 200 + assert len(resp.json()["items"]) == 1 + + +def test_keywords_listing(client): + db_module.add_trending_keyword({ + "keyword": "K", "category": "economy", "score": 0.5, "articles_count": 3, + }) + resp = client.get("/api/insta/keywords?category=economy") + assert resp.status_code == 200 + assert resp.json()["items"][0]["keyword"] == "K" + + +def test_create_slate_kicks_background_task(client, monkeypatch): + from app import main, card_writer, card_renderer + + def fake_write(keyword, category, articles=None): + return db_module.add_card_slate({ + "keyword": keyword, "category": category, "status": "draft", + "cover_copy": {"headline": "H", "body": "B", "accent_color": "#000"}, + "body_copies": [{"headline": f"h{i}", "body": f"b{i}"} for i in range(8)], + "cta_copy": {"headline": "C", "body": "B", "cta": "F"}, + }) + + async def 
fake_render(slate_id, template="default/card.html.j2"): + for i in range(1, 11): + db_module.add_card_asset(slate_id, i, f"/tmp/{slate_id}_{i}.png", "h") + return [f"/tmp/{slate_id}_{i}.png" for i in range(1, 11)] + + monkeypatch.setattr(card_writer, "write_slate", fake_write) + monkeypatch.setattr(card_renderer, "render_slate", fake_render) + + resp = client.post("/api/insta/slates", json={"keyword": "K", "category": "economy"}) + assert resp.status_code == 200 + task_id = resp.json()["task_id"] + # poll task + for _ in range(20): + st = client.get(f"/api/insta/tasks/{task_id}").json() + if st["status"] in ("succeeded", "failed"): + break + assert st["status"] == "succeeded" + slate_id = st["result_id"] + detail = client.get(f"/api/insta/slates/{slate_id}").json() + assert detail["status"] == "rendered" + assert len(detail["assets"]) == 10 diff --git a/insta-lab/tests/test_news_collector.py b/insta-lab/tests/test_news_collector.py new file mode 100644 index 0000000..a582bac --- /dev/null +++ b/insta-lab/tests/test_news_collector.py @@ -0,0 +1,89 @@ +from unittest.mock import patch, MagicMock +import os +import tempfile + +import pytest + +from app import db as db_module +from app import news_collector + + +@pytest.fixture +def tmp_db(monkeypatch): + fd, path = tempfile.mkstemp(suffix=".db") + os.close(fd) + monkeypatch.setattr(db_module, "DB_PATH", path) + db_module.init_db() + yield path + # Close all SQLite WAL files before removal (needed on Windows) + import gc + gc.collect() + for ext in ("", "-wal", "-shm"): + try: + os.remove(path + ext) + except FileNotFoundError: + pass + + +SAMPLE_RESPONSE = { + "items": [ + { + "title": "금리 인상 단행", + "originallink": "https://news.example.com/1", + "link": "https://n.news.naver.com/article/1", + "description": "한국은행이 기준금리를 25bp 올렸다.", + "pubDate": "Fri, 15 May 2026 08:00:00 +0900", + }, + { + "title": "환율 급등", + "originallink": "https://news.example.com/2", + "link": "https://n.news.naver.com/article/2", + "description": 
"원달러 환율이 1400원을 돌파했다.", + "pubDate": "Fri, 15 May 2026 09:00:00 +0900", + }, + ], +} + + +def test_strip_html_and_decode_entities(): + out = news_collector._clean(' "테스트" & 아이템 ') + assert out == '"테스트" & 아이템' + + +def test_search_news_parses_items(tmp_db): + fake_resp = MagicMock() + fake_resp.json.return_value = SAMPLE_RESPONSE + fake_resp.raise_for_status.return_value = None + with patch.object(news_collector.requests, "get", return_value=fake_resp): + items = news_collector.search_news("금리", display=10) + assert len(items) == 2 + assert items[0]["title"] == "금리 인상 단행" + assert items[0]["summary"].startswith("한국은행") + + +def test_collect_for_category_inserts(tmp_db): + fake_resp = MagicMock() + fake_resp.json.return_value = SAMPLE_RESPONSE + fake_resp.raise_for_status.return_value = None + with patch.object(news_collector.requests, "get", return_value=fake_resp): + news_collector.collect_for_category("economy", seed_keywords=["금리"], per_keyword=10) + rows = db_module.list_news_articles(category="economy", days=7) + assert {r["link"] for r in rows} == { + "https://n.news.naver.com/article/1", + "https://n.news.naver.com/article/2", + } + + +def test_collect_dedupes_existing(tmp_db): + db_module.add_news_article({ + "category": "economy", "title": "기존", + "link": "https://n.news.naver.com/article/1", "summary": "" + }) + fake_resp = MagicMock() + fake_resp.json.return_value = SAMPLE_RESPONSE + fake_resp.raise_for_status.return_value = None + with patch.object(news_collector.requests, "get", return_value=fake_resp): + news_collector.collect_for_category("economy", seed_keywords=["금리"]) + rows = db_module.list_news_articles(category="economy", days=7) + # 1 pre-existing + 1 newly added (the other link); UNIQUE link blocks duplicate insert + assert len(rows) == 2 diff --git a/nginx/default.conf b/nginx/default.conf index ebfa6e7..1d48d5b 100644 --- a/nginx/default.conf +++ b/nginx/default.conf @@ -169,18 +169,18 @@ server { proxy_pass http://stock:8000/api/trade/; } 
- # blog-marketing API - location /api/blog-marketing/ { + # insta API + location /api/insta/ { resolver 127.0.0.11 valid=10s; - set $blog_backend blog-lab:8000; + set $insta_backend insta-lab:8000; proxy_http_version 1.1; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; - proxy_read_timeout 120s; - proxy_pass http://$blog_backend$request_uri; + proxy_read_timeout 300s; + proxy_pass http://$insta_backend$request_uri; } # portfolio API (Stock) — trailing slash 유무 모두 매칭
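The `card_assets` table in the diff above relies on SQLite's `INSERT ... ON CONFLICT ... DO UPDATE` (upsert), so re-rendering a page overwrites the existing PNG row instead of violating the `UNIQUE(slate_id, page_index)` constraint. A minimal standalone sketch of that pattern, with a simplified in-memory schema (requires SQLite ≥ 3.24, which ships with Python 3.7+):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE card_assets(
        slate_id   INTEGER NOT NULL,
        page_index INTEGER NOT NULL,
        file_path  TEXT NOT NULL,
        file_hash  TEXT DEFAULT '',
        UNIQUE(slate_id, page_index)
    )
""")

def add_card_asset(slate_id: int, page_index: int, file_path: str, file_hash: str = "") -> None:
    # Re-rendering the same (slate, page) pair replaces the old row instead of raising
    conn.execute("""
        INSERT INTO card_assets(slate_id, page_index, file_path, file_hash)
        VALUES(?,?,?,?)
        ON CONFLICT(slate_id, page_index) DO UPDATE SET
            file_path=excluded.file_path, file_hash=excluded.file_hash
    """, (slate_id, page_index, file_path, file_hash))

add_card_asset(1, 1, "/tmp/v1.png", "aaa")
add_card_asset(1, 1, "/tmp/v2.png", "bbb")  # same (slate, page): updates in place
rows = conn.execute("SELECT file_path FROM card_assets WHERE slate_id=1").fetchall()
print(rows)  # → [('/tmp/v2.png',)]
```

One caveat worth noting: `cursor.lastrowid` (which the diff's `add_card_asset` returns) is only meaningful on the insert path; after the `DO UPDATE` branch no insert occurred and the value may be stale, so callers should not depend on it when overwriting.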