docs: batch music generation + automated video pipeline spec + plan

2026-05-10 18:49:16 +09:00
parent 84548a326e
commit f074cbec2d
2 changed files with 1320 additions and 0 deletions
--- a/docs/superpowers/specs/2026-05-10-batch-music-generation-design.md
+++ b/docs/superpowers/specs/2026-05-10-batch-music-generation-design.md
@@ -0,0 +1,505 @@
+# Batch Music Generation + Automated Video Pipeline Design
+
+> Date: 2026-05-10
+> Related: `2026-05-09-essential-mix-pipeline-design.md` (base video pipeline)
+
+---
+
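+## 1. Background
+
+Today the Create tab generates one track at a time, after the user manually enters every parameter (genre/mood/instruments/BPM/key/scale/duration/prompt). Building a mix video of an hour or more means creating 10 same-genre tracks by hand, one after another.
+
+Goal: **enter a single genre → 10 tracks generated automatically → automatic compile → video pipeline starts automatically → the video is published once the Telegram approvals are given**.
+
+End-to-end flow: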
+```
+[User] Create tab → batch mode → choose genre + track count → start generation
+   ↓ Sequential Suno API calls (~1-2 min per track)
+   ↓ Track 1: "{Genre} Mix Track 1", random mood/instr/BPM/key
+   ↓ Track 2: "{Genre} Mix Track 2", ...
+   ↓ ... Track 10
+   ↓ All done → compile_job created automatically (acrossfade 3s)
+   ↓ Compile done → video_pipeline starts automatically (cover step)
+   ↓ Telegram notification: "🎵 [{Genre} Mix] cover review"
+[User] Publishes the video with 5 approvals
+```
+
+---
+
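+## 2. Non-goals
+
+- Parallel music generation: avoided to limit VRAM load; sequential keeps it simple
+- Auto-written per-track prompts (Claude): genre + mood + instruments alone are enough for Suno
+- Variable per-track length: every track uses the same `target_duration_sec` (default 180s)
+- Editing a track prompt mid-run: once a batch starts, it runs to the end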
+
+---
+
+## 3. User flow
+
+### 3-1. New "Batch generation" section in the Create tab
+
+```
+┌─ 🎲 Batch generation (genre + automatic video) ──────────┐
+│                                                          │
+│  Genre         [▼ lo-fi                            ]     │
+│  Track count   [● 1 - 10] (10)                           │
+│  Track length  [● 60 - 300s] (180s)                      │
+│  ☑ Start the video pipeline once all tracks are done     │
+│                                                          │
+│  Est. time: about 15-25 min (1-2 min per track × 10)     │
+│  Est. cost: ~$0.10 (Suno 10 tracks + DALL·E + Claude)    │
+│                                                          │
+│  [🎵 Start batch generation]                             │
+│                                                          │
+│  ── Progress ──────────────────────────────────────────  │
+│  Batch #3: lo-fi · 7/10 done · 2:43 elapsed              │
+│  ✓ Track 1: Lo-Fi Mix Track 1 (chill, piano+synth)       │
+│  ✓ Track 2: Lo-Fi Mix Track 2 (relaxing, piano+drums)    │
+│  ...                                                     │
+│  ⏳ Track 8: generating...                                │
+│  ○ Track 9: waiting                                      │
+│  ○ Track 10: waiting                                     │
+└──────────────────────────────────────────────────────────┘
+```
+
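+### 3-2. After completion
+
+All 10 tracks are saved to the Library. A compile_job_id is created automatically and the video pipeline starts from the cover step → Telegram notification. One card is added to the Progress tab.
+
+---
+
+## 4. Data model
+
+### 4-1. New table `music_batch_jobs`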
+
+```sql
+CREATE TABLE music_batch_jobs (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    genre TEXT NOT NULL,
+    count INTEGER NOT NULL,                       -- 1-10
+    target_duration_sec INTEGER NOT NULL DEFAULT 180,
+    auto_pipeline INTEGER NOT NULL DEFAULT 1,     -- 0/1 boolean
+    completed INTEGER NOT NULL DEFAULT 0,
+    track_ids_json TEXT NOT NULL DEFAULT '[]',
+    current_track_index INTEGER NOT NULL DEFAULT 0,  -- track in progress (1..count)
+    current_track_status TEXT,                     -- queued | generating | failed
+    status TEXT NOT NULL DEFAULT 'queued',
+        -- queued: not started yet
+        -- generating: generating tracks
+        -- generated: all tracks generated (before compile starts)
+        -- compiling: compile in progress
+        -- piped: video pipeline started (= cover_pending state)
+        -- failed: failed at some stage
+        -- cancelled: cancelled by the user
+    error TEXT,
+    compile_job_id INTEGER,
+    pipeline_id INTEGER,
+    created_at TEXT NOT NULL,
+    updated_at TEXT NOT NULL
+);
+```
+
+Add the `CREATE TABLE IF NOT EXISTS` statement to `init_db()`.
+
+### 4-2. Helper functions (added to `db.py`)
+
+- `create_batch_job(genre, count, target_duration_sec, auto_pipeline) -> int`
+- `get_batch_job(id) -> dict | None`
+- `update_batch_job(id, **fields)`: validates fields against an allowlist
+- `list_batch_jobs(active_only=False) -> list[dict]`
+- `append_batch_track(batch_id, track_id)`: appends a completed track ID and increments `completed`
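The allowlist check and the append step could look roughly like the sketch below. It is only an illustration: the explicit `conn` argument, the contents of `ALLOWED_FIELDS`, and the `_now` helper are assumptions, since the spec gives only the helper names and signatures.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Assumed allowlist; the spec only states that update_batch_job validates against one.
ALLOWED_FIELDS = {
    "status", "error", "completed", "track_ids_json",
    "current_track_index", "current_track_status",
    "compile_job_id", "pipeline_id",
}


def _now() -> str:
    """Hypothetical timestamp helper (UTC, ISO 8601, second precision)."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


def update_batch_job(conn: sqlite3.Connection, batch_id: int, **fields) -> None:
    """Update only allowlisted columns and reject anything else."""
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"fields not allowed: {sorted(unknown)}")
    if not fields:
        return
    fields["updated_at"] = _now()
    # Column names come from the allowlist; values are bound as parameters.
    assignments = ", ".join(f"{name} = ?" for name in fields)
    conn.execute(
        f"UPDATE music_batch_jobs SET {assignments} WHERE id = ?",
        (*fields.values(), batch_id),
    )
    conn.commit()


def append_batch_track(conn: sqlite3.Connection, batch_id: int, track_id: int) -> None:
    """Append a finished track ID and keep `completed` in sync with the list."""
    row = conn.execute(
        "SELECT track_ids_json FROM music_batch_jobs WHERE id = ?", (batch_id,)
    ).fetchone()
    if row is None:
        raise KeyError(batch_id)
    track_ids = json.loads(row[0])
    track_ids.append(track_id)
    update_batch_job(conn, batch_id,
                     track_ids_json=json.dumps(track_ids),
                     completed=len(track_ids))
```

Deriving `completed` from the length of the stored list, rather than incrementing it separately, keeps the two columns from drifting apart.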
+
+---
+
+## 5. Backend: random pools + batch execution
+
+### 5-1. `app/random_pools.py` (new)
+
+Defines, per genre, a pool of musically compatible values to draw from at random:
+
+```python
+"""Per-genre random pools of music parameters."""
+import random
+
+POOLS = {
+    "lo-fi": {
+        "moods": ["chill", "relaxing", "dreamy", "melancholic", "mellow", "nostalgic", "peaceful"],
+        "instruments_pool": ["piano", "synth", "drums", "vinyl", "rhodes", "soft bass", "ambient pads"],
+        "instruments_count": (3, 4),
+        "bpm": (70, 90),
+        "keys": ["C", "D", "F", "G", "A"],
+        "scales": ["minor", "major"],
+        "prompt_modifiers": ["cozy bedroom vibes", "rainy night", "late night study", "cafe ambience"],
+    },
+    "phonk": {
+        "moods": ["dark", "aggressive", "moody", "intense", "hypnotic"],
+        "instruments_pool": ["808 bass", "hi-hat", "synth lead", "vocal chops", "bass drops", "trap drums"],
+        "instruments_count": (3, 4),
+        "bpm": (130, 160),
+        "keys": ["C", "D", "F", "G"],
+        "scales": ["minor"],
+        "prompt_modifiers": ["drift atmosphere", "dark neon", "midnight drive"],
+    },
+    "ambient": {
+        "moods": ["peaceful", "meditative", "ethereal", "spacious", "dreamy"],
+        "instruments_pool": ["pad synths", "atmospheric guitar", "soft strings", "field recordings", "drone bass"],
+        "instruments_count": (2, 3),
+        "bpm": (50, 75),
+        "keys": ["C", "D", "E", "G", "A"],
+        "scales": ["major", "minor"],
+        "prompt_modifiers": ["misty mountain morning", "deep space", "still water", "forest dawn"],
+    },
+    "pop": {
+        "moods": ["uplifting", "happy", "energetic", "romantic", "catchy"],
+        "instruments_pool": ["acoustic guitar", "piano", "drums", "bass", "synth", "vocals harmonies"],
+        "instruments_count": (3, 5),
+        "bpm": (95, 130),
+        "keys": ["C", "D", "E", "F", "G", "A"],
+        "scales": ["major"],
+        "prompt_modifiers": ["radio-ready", "summer vibe", "feel-good"],
+    },
+    "default": {  # fallback for unknown genres
+        "moods": ["chill", "relaxing", "uplifting", "mellow"],
+        "instruments_pool": ["piano", "synth", "drums", "guitar", "bass", "strings"],
+        "instruments_count": (3, 4),
+        "bpm": (80, 110),
+        "keys": ["C", "D", "F", "G", "A"],
+        "scales": ["minor", "major"],
+        "prompt_modifiers": [""],
+    },
+}
+
+
+def randomize(genre: str, rng: random.Random | None = None) -> dict:
+    """Generate one set of random music parameters."""
+    rng = rng or random.Random()
+    pool = POOLS.get(genre.lower(), POOLS["default"])
+    n_instr = rng.randint(*pool["instruments_count"])
+    instruments = rng.sample(pool["instruments_pool"], min(n_instr, len(pool["instruments_pool"])))
+    return {
+        "moods": [rng.choice(pool["moods"])],
+        "instruments": instruments,
+        "bpm": rng.randint(*pool["bpm"]),
+        "key": rng.choice(pool["keys"]),
+        "scale": rng.choice(pool["scales"]),
+        "prompt_modifier": rng.choice(pool["prompt_modifiers"]),
+    }
+```
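The optional `rng` argument lets a caller pass a seeded `random.Random` and get the same draw every time. The sketch below illustrates that with a trimmed stand-in pool and a hypothetical `draw` helper that condenses the sampling steps of `randomize`; neither name is part of the spec.

```python
import random

# Trimmed stand-in for one POOLS entry; values are taken from the "lo-fi" pool.
LOFI_POOL = {
    "moods": ["chill", "relaxing", "dreamy"],
    "instruments_pool": ["piano", "synth", "drums", "vinyl", "rhodes"],
    "instruments_count": (3, 4),
    "bpm": (70, 90),
}


def draw(pool: dict, rng: random.Random) -> dict:
    """Hypothetical condensed version of the sampling done in randomize()."""
    n_instr = rng.randint(*pool["instruments_count"])
    return {
        "mood": rng.choice(pool["moods"]),
        "instruments": rng.sample(pool["instruments_pool"], n_instr),
        "bpm": rng.randint(*pool["bpm"]),
    }


# Two generators built from the same seed produce identical parameter sets.
first = draw(LOFI_POOL, random.Random(42))
second = draw(LOFI_POOL, random.Random(42))
assert first == second
```

A unit test can rely on this to check that results stay inside the pool without depending on global random state.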
+
+Later (P3): move the per-genre pools into `youtube_setup` or a separate table so they can be edited from SetupTab.
+
+### 5-2. `app/batch_generator.py` (new): sequential orchestrator
+
+```python
+"""Batch music generation + automatic compile and video pipeline."""
+import asyncio
+import logging
+import json
+
+from . import db
+from .suno_provider import run_suno_generation
+from .random_pools import randomize
+
+logger = logging.getLogger("music-lab.batch")
+
+POLL_INTERVAL_S = 5
+TRACK_GEN_TIMEOUT_S = 240  # up to 4 minutes per track
+
+
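+async def run_batch(batch_id: int) -> None:
+    """1) Generate N tracks sequentially with Suno for the given genre.
+       2) Once all are done, create and run a compile_job automatically.
+       3) After the compile finishes, start the video pipeline (cover step).
+    """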
+    job = db.get_batch_job(batch_id)
+    if not job:
+        return
+    genre = job["genre"]
+    count = job["count"]
+    duration = job["target_duration_sec"]
+    auto_pipe = bool(job["auto_pipeline"])
+
+    db.update_batch_job(batch_id, status="generating")
+
+    track_ids: list[int] = []
+    for i in range(1, count + 1):
+        title = f"{genre.title()} Mix Track {i}"
+        params = randomize(genre)
+
+        db.update_batch_job(batch_id,
+                            current_track_index=i,
+                            current_track_status="generating")
+
+        # Suno call (reuses the existing task pattern)
+        task_id = _start_suno(title=title, genre=genre,
+                               duration_sec=duration, **params)
+        track_id = await _wait_for_track(task_id, timeout=TRACK_GEN_TIMEOUT_S)
+
+        if track_id:
+            track_ids.append(track_id)
+            db.append_batch_track(batch_id, track_id)
+        else:
+            logger.warning("batch %d track %d failed; continuing", batch_id, i)
+            db.update_batch_job(batch_id, current_track_status="failed")
+            # Policy: skip a failed track and continue (still produce the remaining 9)
+
+    if not track_ids:
+        db.update_batch_job(batch_id, status="failed",
+                            error="all tracks failed to generate")
+        return
+
+    db.update_batch_job(batch_id, status="generated")
+
+    if not auto_pipe:
+        return  # music only; stop here
+
+    # === automatic compile ===
+    db.update_batch_job(batch_id, status="compiling")
+    compile_id = db.create_compile_job(
+        title=f"{genre.title()} Mix",
+        track_ids=track_ids,
+        crossfade_sec=3,
+    )
+    db.update_batch_job(batch_id, compile_job_id=compile_id)
+
+    # Call the existing compiler (synchronous, so run it via asyncio.to_thread)
+    from . import compiler
+    await asyncio.to_thread(compiler.run, compile_id)
+
+    job_after = db.get_compile_job(compile_id)
+    if not job_after or job_after.get("status") not in ("done", "succeeded"):
+        db.update_batch_job(batch_id, status="failed",
+                            error=f"compile failed (status={job_after.get('status') if job_after else 'unknown'})")
+        return
+
+    # === automatic video pipeline ===
+    pipeline_id = db.create_pipeline(compile_job_id=compile_id)
+    db.update_batch_job(batch_id, pipeline_id=pipeline_id, status="piped")
+
+    from .pipeline import orchestrator
+    await orchestrator.run_step(pipeline_id, "cover")
+```
+
+- `_start_suno(...)`: calls the existing `run_suno_generation` and returns a task_id
+- `_wait_for_track(task_id, timeout)`: polls until the task completes; on success returns the id of the new track in music_library
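The polling helper is not spelled out in the spec, so here is a minimal sketch of one way it could behave. The `get_task` lookup callable, the `"succeeded"`/`"failed"` status names, and the `track_id` key are all assumptions made for illustration; the real task store may differ.

```python
import asyncio
import time
from typing import Callable, Optional


async def wait_for_track(
    get_task: Callable[[str], Optional[dict]],  # hypothetical task lookup
    task_id: str,
    timeout: float,
    poll_interval: float = 5.0,
) -> Optional[int]:
    """Poll until the task succeeds, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        task = get_task(task_id) or {}
        status = task.get("status")
        if status == "succeeded":   # assumed status name
            return task.get("track_id")
        if status == "failed":      # assumed status name
            return None
        # Yield to the event loop between polls instead of blocking it.
        await asyncio.sleep(poll_interval)
    return None  # timed out; the caller treats this like a failed track
```

Returning `None` for both failure and timeout matches how `run_batch` handles a falsy `track_id`: it logs, skips the track, and moves on.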
+
+### 5-3. Changes to existing modules
+
+`app/main.py` gains 3 new endpoints plus a BackgroundTask. Existing endpoints are left unchanged.
+
+`db.py` gains 5 helper functions, and `init_db()` gains the `music_batch_jobs` CREATE statement.
+
+---
+
+## 6. API endpoints
+
+### 6-1. `POST /api/music/generate-batch`
+
+Request:
+```json
+{
+  "genre": "lo-fi",
+  "count": 10,
+  "target_duration_sec": 180,
+  "auto_pipeline": true
+}
+```
+
+Validation:
+- `count` 1-10
+- `target_duration_sec` 60-300
+- `genre` required
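These three rules can be checked in one small function, sketched below. The function name, the error messages, and the choice to treat a missing `count` as invalid are assumptions; the 180-second default for `target_duration_sec` follows the table definition.

```python
def validate_batch_request(payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the payload is valid."""
    errors: list[str] = []

    genre = payload.get("genre")
    if not isinstance(genre, str) or not genre.strip():
        errors.append("genre is required")

    # bool is a subclass of int in Python, so exclude it explicitly.
    count = payload.get("count")
    if not isinstance(count, int) or isinstance(count, bool) or not 1 <= count <= 10:
        errors.append("count must be an integer from 1 to 10")

    duration = payload.get("target_duration_sec", 180)
    if not isinstance(duration, int) or isinstance(duration, bool) or not 60 <= duration <= 300:
        errors.append("target_duration_sec must be an integer from 60 to 300")

    return errors
```

An endpoint would respond with 400 when the list is non-empty, for example for `{"genre": "", "count": 11}`, which yields two errors.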
+
+Response 201:
+```json
+{
+  "id": 3,
+  "status": "queued",
+  ...
+}
+```
+
+The batch job runs as a BackgroundTask (takes ~15-25 min).
+
+### 6-2. `GET /api/music/generate-batch/{id}`
+
+Returns the batch's progress. Example response:
+```json
+{
+  "id": 3,
+  "genre": "lo-fi",
+  "count": 10,
+  "completed": 7,
+  "current_track_index": 8,
+  "current_track_status": "generating",
+  "status": "generating",
+  "track_ids": [12, 13, 14, 15, 16, 17, 18],
+  "tracks": [
+    {"id": 12, "title": "Lo-Fi Mix Track 1", ...},
+    ...
+  ],
+  "compile_job_id": null,
+  "pipeline_id": null,
+  "created_at": "2026-05-10T17:00:00",
+  "updated_at": "2026-05-10T17:08:30"
+}
+```
+
+The `tracks` field is populated with a LEFT JOIN (includes each track's metadata).
+
+### 6-3. `GET /api/music/generate-batch?status=active`
+
+Lists all batches. With `status=active`, returns only those in queued/generating/compiling/piped.
+
+---
+
+## 7. Frontend: batch section in the Create tab
+
+### 7-1. New collapsible in the Create area of `MusicStudio.jsx`
+
+A new section above or beside the Create form (`<details>` or a toggle):
+
+```jsx
+<details className="ms-batch-section" open={batchOpen}>
+    <summary onClick={...}>🎲 Batch generation (1-10 tracks + automatic video)</summary>
+
+    <div className="ms-batch-form">
+        <label>Genre
+            <select value={batchGenre} onChange={...}>
+                <option value="lo-fi">Lo-Fi</option>
+                <option value="phonk">Phonk</option>
+                <option value="ambient">Ambient</option>
+                <option value="pop">Pop</option>
+            </select>
+        </label>
+
+        <label>Track count: {batchCount}
+            <input type="range" min={1} max={10} value={batchCount} onChange={...}/>
+        </label>
+
+        <label>Track length: {batchDuration}s
+            <input type="range" min={60} max={300} step={10} value={batchDuration} onChange={...}/>
+        </label>
+
+        <label>
+            <input type="checkbox" checked={autoPipeline} onChange={...}/>
+            Start the video pipeline once all tracks are generated
+        </label>
+
+        <p className="ms-batch-estimate">
+            Estimate: about {batchCount * 1.5 | 0}-{batchCount * 2} min · cost ~${(batchCount * 0.005 + (autoPipeline ? 0.05 : 0)).toFixed(2)}
+        </p>
+
+        <button className="button primary" onClick={startBatch} disabled={generating}>
+            🎵 Start batch generation
+        </button>
+    </div>
+
+    {currentBatch && <BatchProgress batch={currentBatch} />}
+</details>
+```
+
+### 7-2. New component `BatchProgress.jsx`
+
+```jsx
+export default function BatchProgress({ batch }) {
+    return (
+        <div className="ms-batch-progress">
+            <div className="ms-batch-header">
+                Batch #{batch.id}: {batch.genre} ·
+                {' '}{batch.completed}/{batch.count} done ·
+                {' '}status: <strong>{batch.status}</strong>
+            </div>
+            <ol className="ms-batch-tracks">
+                {Array.from({ length: batch.count }, (_, i) => i + 1).map(n => {
+                    const completed = n <= batch.completed;
+                    const current = n === batch.current_track_index && batch.status === 'generating';
+                    const track = (batch.tracks || []).find(t => t._batch_index === n);
+                    return (
+                        <li key={n} className={completed ? 'done' : current ? 'current' : 'pending'}>
+                            {completed ? '✓' : current ? '⏳' : '○'}
+                            {' '}Track {n}: {track ? track.title : (current ? 'generating...' : 'waiting')}
+                        </li>
+                    );
+                })}
+            </ol>
+            {batch.compile_job_id && <div>📀 Compile #{batch.compile_job_id}</div>}
+            {batch.pipeline_id && (
+                <div>
+                    🎬 Video pipeline #{batch.pipeline_id}:
+                    <a href={`#youtube-pipeline-${batch.pipeline_id}`}> View in the Progress tab</a>
+                </div>
+            )}
+        </div>
+    );
+}
+```
+
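+### 7-3. Polling
+
+Once a batch starts, call `getBatchJob(id)` every 5 seconds. Stop polling when status becomes `piped`, `failed`, or `cancelled`. When `auto_pipeline` is off, also stop at `generated`, because `run_batch` returns at that point and the status never advances.
+
+### 7-4. `api.js` helpers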
+
+```javascript
+export const startBatchGen   = (payload) => apiPost('/api/music/generate-batch', payload);
+export const getBatchJob     = (id)      => apiGet(`/api/music/generate-batch/${id}`);
+export const listBatchJobs   = (status='all') => apiGet(`/api/music/generate-batch?status=${status}`);
+```
+
+---
+
+## 8. Error handling
+
+| Scenario | Behavior |
+|---------|------|
+| One Suno API track fails | Log + skip + continue with the next track. The track is omitted from the final track_ids. |
+| All tracks fail | status=failed, error recorded |
+| Compile fails | status=failed, compile_job_id preserved |
+| Video pipeline cover step fails | The pipeline marks itself failed. The batch stays in piped (handled on the pipeline side) |
+| count > 10 or < 1 | 400 |
+| genre missing | 400 |
+| Suno API key not set | 400 ("SUNO_API_KEY not set") |
+
+---
+
+## 9. Test strategy
+
+### 9-1. Unit tests
+
+- `random_pools.randomize(genre)`: for each genre the results fall within the pool, and a fixed seed reproduces them
+- `db.create_batch_job` / `update_batch_job` / `append_batch_track`: normal flow
+- `_wait_for_track`: mock task success/failure/timeout
+
+### 9-2. Integration tests
+
+- Call `POST /api/music/generate-batch` → returns 201 + batch row created
+- `GET /api/music/generate-batch/{id}` response schema
+- `run_batch` with mocked Suno + mocked compiler + mocked orchestrator → full happy-path flow
+
+### 9-3. Manual E2E
+
+- Create tab → batch generation → choose genre → start → check the progress display
+- 10 tracks complete → confirm 10 tracks added to the Library → confirm compile_job auto-created → confirm a new card appears in the Progress tab
+
+---
+
+## 10. Deliverables
+
+| Area | Files |
+|------|------|
+| Spec/Plan | This document + plan |
+| NAS music-lab | `db.py` (table/helpers), `random_pools.py` (new), `batch_generator.py` (new), `main.py` (3 endpoints) |
+| Frontend | `MusicStudio.jsx` (Create batch section), `BatchProgress.jsx` (new), `MusicStudio.css`, `api.js` helpers |
+| Tests | NAS unit + integration, manual E2E |
+
+---
+
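+## 11. Follow-ups (P3)
+
+- Per-genre pools editable from SetupTab
+- Auto-add scenario or cafe-ambience details to per-track prompts (more variety across tracks)
+- Pause/resume a batch
+- Regenerate individual Track-N within a batch (failed tracks only)
+- Variable track length (random distribution)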