fix(insta-render): fix BLMOVE dequeue breaking under a short socket_timeout

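The socket_timeout in REDIS_URL (<5s) was shorter than the 5-second BLMOVE block in ReliableQueue,
so every idle dequeue failed with "Timeout reading", no jobs could be pulled, and slates stayed stuck in draft (since around 2026-05-22).
The queue connection is now created with socket_timeout=None + socket_keepalive (make_queue_redis), which restores normal operation.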

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 16:08:43 +09:00
parent 9241b5cd90
commit c451f5313b
2 changed files with 26 additions and 1 deletions
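
The failure mode in the commit message can be reproduced without Redis. The following is a minimal sketch using only the standard library: a local socket pair stands in for the Redis connection, and a delayed reply stands in for a BLMOVE that blocks longer than the client's read timeout. The function name `read_with_timeout` and the scaled-down timings are assumptions for illustration, not part of the commit.

```python
import socket
import threading
import time


def read_with_timeout(read_timeout, reply_delay):
    """Return the reply bytes, or None if the read timed out first."""
    client, server = socket.socketpair()

    def delayed_reply():
        # Stand-in for a server answering a blocking command late.
        time.sleep(reply_delay)
        try:
            server.sendall(b"job")
        except OSError:
            pass  # the client already gave up and closed its end

    t = threading.Thread(target=delayed_reply)
    t.start()
    try:
        client.settimeout(read_timeout)  # None means block indefinitely
        return client.recv(16)
    except socket.timeout:
        return None
    finally:
        client.close()
        t.join()
        server.close()
```

With a read timeout shorter than the reply delay, the read gives up before the reply arrives, which corresponds to the "Timeout reading" error on every idle poll. With `read_timeout=None` the same call waits for the reply, which is the behavior the fix relies on.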

@@ -222,3 +222,14 @@ async def test_poll_once_returns_false_on_timeout(monkeypatch):
     process_mock.assert_not_awaited()
     fake_queue.ack.assert_not_awaited()
     fake_queue.fail.assert_not_awaited()
+
+
+def test_make_queue_redis_no_read_timeout():
+    """Ensure socket_timeout=None so the BLMOVE (5s block) dequeue is not broken by a read timeout (regression guard)."""
+    import os, sys
+    _here = os.path.dirname(os.path.abspath(__file__))
+    sys.path.insert(0, os.path.dirname(_here))  # services/insta-render
+    sys.path.insert(0, os.path.dirname(os.path.dirname(_here)))  # services (_shared)
+    import worker
+    c = worker.make_queue_redis()
+    assert c.connection_pool.connection_kwargs.get("socket_timeout") is None

@@ -98,9 +98,23 @@ async def poll_once(queue: ReliableQueue, client: httpx.AsyncClient) -> bool:
     return True
 
 
+def make_queue_redis():
+    """Redis client for the blocking dequeue (BLMOVE 5s).
+
+    If a socket_timeout shorter than the BLMOVE block is in effect (e.g. REDIS_URL ?socket_timeout=),
+    every idle poll fails the dequeue with "Timeout reading" and jobs can never be pulled (slates stuck in draft).
+    → Set no read timeout (socket_timeout=None). Dead connections are detected and recovered via
+    socket_keepalive + worker_loop retries. (The explicit kwarg overrides the URL's socket_timeout.)
+    """
+    return aioredis.from_url(
+        REDIS_URL, decode_responses=False,
+        socket_timeout=None, socket_keepalive=True,
+    )
+
+
 async def worker_loop():
     """Infinite loop: paused check → ReliableQueue.dequeue → process_one → ack/fail."""
-    redis = aioredis.from_url(REDIS_URL, decode_responses=False)
+    redis = make_queue_redis()
     queue = ReliableQueue(redis, queue_key=QUEUE_KEY)
     async with httpx.AsyncClient() as client:
         logger.info("insta-render worker started worker_id=%s queue=%s",
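
The fix depends on the explicit `socket_timeout=None` keyword taking precedence over a `socket_timeout` in the URL query string. Whether it does depends on the client library and version behind `aioredis`, which this diff does not show. In at least some redis-py releases, the documented behavior of `from_url` is that query-string arguments win over keyword arguments. A defensive sketch, assuming that precedence cannot be relied on, strips the parameter from the URL before connecting. `strip_query_params` is a hypothetical helper, not part of this commit.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_query_params(url, names):
    """Return url with the given query-string parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in names
    ]
    # Rebuild the URL with only the remaining parameters.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Under this assumption, `make_queue_redis` could pass `strip_query_params(REDIS_URL, {"socket_timeout"})` to `from_url`, so the result would not depend on which side wins a conflict.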