Private research infrastructure

Turn a research brief into structured evidence.

Signal Room runs a four-stage video evidence pipeline: discover leads from YouTube, pull captions, extract contradictions with an LLM, and curate the strongest signals in a review queue. All of it stays off the open web.

Discover: YouTube search across staged query batches, never re-running a used query

Caption: YouTube captions fetched via InnerTube, parsed into timestamped segments, and stored in D1

Analyze: an LLM reads caption text and extracts claim pairs, actor stances, and contradiction signals

Review: keyboard-driven curation queue to star, annotate, and score evidence items

Pipeline active · D1 indexed · Access gated

How it works

Discover

Query-level search with saturation tracking.

The orchestrator generates search batches from your brief using an LLM and dispatches them to YouTube.

No repetition: every query run is logged to D1 and filtered out of all future batches
Yield feedback: the number of new unique leads each query produced informs the next generation prompt
Expanding coverage: each cycle explores angles the previous batch didn't touch
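
The yield feedback above can be sketched roughly as follows. This is a minimal illustration, assuming leads are identified by YouTube video ID; `score_yield` and `feedback_lines` are hypothetical names, not Signal Room's actual code.

```python
def score_yield(results: list[str], seen: set[str]) -> int:
    """Count video IDs not seen before and record them as seen."""
    new_ids = set(results) - seen
    seen |= new_ids
    return len(new_ids)


def feedback_lines(yields: dict[str, int]) -> str:
    """Render per-query yields, best first, as text for the next generation prompt."""
    ranked = sorted(yields.items(), key=lambda kv: kv[1], reverse=True)
    return "\n".join(f"{query}: {count} new leads" for query, count in ranked)


# Hypothetical search results: lists of video IDs returned per query.
seen_ids: set[str] = set()
yields = {
    "query A": score_yield(["v1", "v2", "v3"], seen_ids),
    "query B": score_yield(["v2", "v3", "v4"], seen_ids),
}
print(feedback_lines(yields))
```

Only leads that no earlier query surfaced count toward a query's yield, so a query that mostly returns known videos scores low even if it returns many results.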

0× Duplicate queries

YT API Search backend

Discovery Queue live

Done · مقابله نظامی ایران اسرائیل ۱۴۰۴ (Iran Israel military confrontation 1404) · 247

Done · Iran ceasefire statement official · 183

Running · IRGC leadership denial press conference · -

Queued · nuclear facility strike response timeline · -

Query saturation: 73 / 120 used

Caption

YouTube auto-captions, fetched and timestamped.

For each discovered video the pipeline fetches the auto-generated caption track using InnerTube, YouTube's internal request layer.

No API quota: InnerTube bypasses the official Data API entirely, so caption fetches need no API key and consume no Data API quota
Full text stored: caption XML is parsed into timestamped segments and written to D1
No audio processing: YouTube's own captioning does the work; coverage follows wherever auto-captions exist
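
The parsing step above might look like the sketch below. It assumes the caption track arrives as XML made of `<text start="…" dur="…">` elements; the sample payload, the `parse_captions` name, and the segment fields are illustrative assumptions, and the InnerTube fetch itself is left out.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical caption payload; the real track format may differ.
SAMPLE_XML = """<transcript>
  <text start="11.2" dur="6.8">The statement released Monday confirmed</text>
  <text start="18.0" dur="5.5">no nuclear facilities were targeted in the first exchange.</text>
</transcript>"""


def parse_captions(xml_text: str) -> list[dict]:
    """Turn caption XML into timestamped segments ready for storage."""
    segments = []
    for node in ET.fromstring(xml_text).iter("text"):
        start = float(node.get("start", "0"))
        whole = int(start)
        segments.append({
            "start_s": start,
            "label": f"{whole // 60}:{whole % 60:02d}",
            # Caption text can carry a second layer of HTML entities.
            "text": html.unescape(node.text or "").strip(),
        })
    return segments


segments = parse_captions(SAMPLE_XML)
for seg in segments:
    print(seg["label"], seg["text"])
```

Each segment keeps both the raw start offset in seconds and a display label, so the stored rows can be sorted numerically and rendered without reformatting.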

YT Caption source

±2s Timestamp precision

Captions · YT_kXGQe8n7m4A

0:11 The statement released Monday confirmed

0:18 no nuclear facilities were targeted in the first exchange.

0:35 Precision was the stated objective, per the briefing.

0:58 Civilian infrastructure was not considered a

1:04 legitimate target under the rules of engagement.

Source: InnerTube · en · fetched 4m ago · no API quota

Analyze

LLM extracts contradictions across sources.

Each caption batch is sent through a structured LLM prompt that extracts claims, stances, and contradiction pairs.

Claim extraction: factual assertions are pulled with speaker attribution and timestamp
Contradiction flagging: direct conflicts with prior statements are stored as pairs with confidence scores
Async pipeline: analysis runs on the VPS via router.darra.ai and surfaces in the dashboard once results are written to D1
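
One way to picture the stored contradiction pairs is the sketch below. It assumes the LLM returns a JSON array of objects; the field names (`claim_a`, `claim_b`, `kind`, `confidence`) and the `parse_pairs` helper are hypothetical, chosen only to mirror the claims, pairs, and confidence scores described above.

```python
import json
from dataclasses import dataclass


@dataclass
class ContradictionPair:
    claim_a: str
    claim_b: str
    kind: str          # e.g. "direct_contradiction" or "stance_shift"
    confidence: float  # expected within the 0.1 to 1.0 range


def parse_pairs(raw: str, lo: float = 0.1, hi: float = 1.0) -> list[ContradictionPair]:
    """Parse LLM output, dropping pairs whose confidence is out of range."""
    pairs = []
    for item in json.loads(raw):
        conf = float(item["confidence"])
        if not lo <= conf <= hi:
            continue  # discard rather than guess at a malformed score
        pairs.append(ContradictionPair(item["claim_a"], item["claim_b"], item["kind"], conf))
    return pairs


# Hypothetical model output: one valid pair, one with an impossible score.
raw_output = json.dumps([
    {"claim_a": "No nuclear facilities targeted in the first exchange.",
     "claim_b": "The Natanz cooling system was struck at 03:14.",
     "kind": "direct_contradiction", "confidence": 0.92},
    {"claim_a": "x", "claim_b": "y", "kind": "stance_shift", "confidence": 1.7},
])
pairs = parse_pairs(raw_output)
print(pairs)
```

Validating the score before storage keeps a malformed model response from entering the review queue with a misleading confidence value.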

N:N Pair matching

0.1–1.0 Confidence range

Contradiction Signals

Direct contradiction · conf 0.92

"No nuclear facilities targeted in the first exchange." vs "The Natanz cooling system was struck at 03:14."

Source A · 2025-04-12 04:30 · Source B · 2025-04-13 09:15

Stance shift · conf 0.77

"Civilian infrastructure was not a target." vs "The power grid failed in three districts simultaneously."

Press briefing · Field report · 6h gap

Intelligence

Entity aggregation across the entire corpus.

The intelligence layer operates above per-video analysis, aggregating extracted claims by named entity across the entire campaign corpus.

Entity timelines: every statement attributed to a person or organization is collected chronologically across all videos, not per-video
Stance drift detection: where an actor's stated position on a specific topic changed between two time-separated appearances, the delta is flagged
Cross-source contradictions: when two different entities make conflicting claims about the same event, the conflict surfaces as a corpus-level finding that is invisible at the per-video layer
Corroboration mapping: which actors are reinforcing or undermining each other's claims, scored by frequency and confidence
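
Entity timelines and stance drift could be sketched as below. This is an illustration only: it assumes each extracted statement already carries an entity, a topic, and a coarse stance label, and the `Statement` fields and function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Statement:
    entity: str
    topic: str
    stance: str
    when: date
    video_id: str


def entity_timelines(statements: list[Statement]) -> dict[str, list[Statement]]:
    """Group statements by entity across all videos, oldest first."""
    timelines = defaultdict(list)
    for s in statements:
        timelines[s.entity].append(s)
    for items in timelines.values():
        items.sort(key=lambda s: s.when)
    return dict(timelines)


def stance_drift(timeline: list[Statement]) -> list[tuple[Statement, Statement]]:
    """Flag consecutive statements on one topic whose stance changed over time."""
    last_by_topic: dict[str, Statement] = {}
    drifts = []
    for s in timeline:
        prev = last_by_topic.get(s.topic)
        if prev is not None and prev.stance != s.stance and prev.when < s.when:
            drifts.append((prev, s))
        last_by_topic[s.topic] = s
    return drifts


# Hypothetical corpus: the same actor appears in two different videos.
corpus = [
    Statement("Actor A", "civilian infrastructure", "permitted", date(2025, 4, 18), "vid2"),
    Statement("Actor A", "civilian infrastructure", "excluded", date(2025, 4, 12), "vid1"),
    Statement("Actor B", "rules of engagement", "compliant", date(2025, 4, 14), "vid3"),
]
timelines = entity_timelines(corpus)
drifts = stance_drift(timelines["Actor A"])
print(len(drifts), "drift(s) flagged for Actor A")
```

Because the grouping ignores which video a statement came from, a change of position that spans two videos is caught here even though neither video contains a contradiction on its own.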

Cross Source matching

Entity Aggregation

Intelligence · Entities

Ali Khamenei · 14 statements · 3 contradictions

Apr 12 · "Precision strikes only. No civilian targets."

Apr 18 · "The resistance has the right to target any infrastructure."

IRGC Spokesperson · 9 statements · 1 contradiction

Apr 14 · "All operations confirmed within rules of engagement."

Saturation

Every query logged. No search runs twice.

The saturation tracker maintains a complete, per-query history of everything the system has ever dispatched.

Hard deduplication: before any new batch is staged, every candidate query is filtered against the full history log
Yield scoring: the number of new non-duplicate leads each past query produced is stored and fed back into the next LLM generation prompt
Depth estimation: the Query Plan view in the dashboard shows remaining estimated search depth per topic cluster and flags when a cluster approaches saturation
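
The hard deduplication step can be sketched as follows. Python's built-in `sqlite3` stands in for D1 here (D1 is SQLite-based); the `query_log` table, its columns, and the case and whitespace normalization are assumptions for illustration, not the system's documented schema.

```python
import sqlite3


def normalize(query: str) -> str:
    """Assumed normalization: case-fold and collapse whitespace."""
    return " ".join(query.casefold().split())


def log_run(db: sqlite3.Connection, query: str, new_leads: int) -> None:
    """Record a dispatched query and the number of new leads it produced."""
    db.execute(
        "INSERT OR IGNORE INTO query_log (query_key, new_leads) VALUES (?, ?)",
        (normalize(query), new_leads),
    )


def stage_batch(db: sqlite3.Connection, candidates: list[str]) -> tuple[list[str], list[str]]:
    """Split candidates into fresh queries and ones rejected as already seen."""
    fresh, rejected, staged_keys = [], [], set()
    for query in candidates:
        key = normalize(query)
        seen = db.execute(
            "SELECT 1 FROM query_log WHERE query_key = ?", (key,)
        ).fetchone()
        if seen or key in staged_keys:
            rejected.append(query)
        else:
            fresh.append(query)
            staged_keys.add(key)
    return fresh, rejected


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE query_log (query_key TEXT PRIMARY KEY, new_leads INTEGER NOT NULL)")
log_run(db, "Iran missile response timeline", 12)

fresh, rejected = stage_batch(db, [
    "Khamenei statement nuclear deal 2025",
    "iran  missile response timeline",
])
print(len(fresh), "fresh ·", len(rejected), "rejected (seen)")
```

Checking candidates against both the history table and the batch being staged prevents a duplicate from slipping through when the LLM proposes the same query twice in one generation.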

0× Query reuse

Yield Score feedback

Query Plan · Batch 7

Staged queries: 12 fresh · 4 rejected (seen)

Khamenei statement nuclear deal 2025 · yield 0.84

IRGC ceasefire announcement April · yield 0.61

بیانیه شورای امنیت درباره ایران (Security Council statement on Iran) · yield 0.79

Iran missile response timeline · seen · skipped

Total queries used: 73 / est. 240

Review

One item at a time, keyboard-driven.

The review queue presents one evidence item at a time: video, captions, and extracted contradiction signals together.

Keyboard-driven: arrow keys to navigate, S to star, Enter to mark seen and advance
Annotation layer: labels, notes, and a 0–10 operator score write back to D1 per video
Starred export view: a separate queue collects all starred items for structured export
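
The annotation write-back and starred export might be modeled like this. As before, `sqlite3` stands in for D1, and the `annotations` table, its columns, and the function names are hypothetical.

```python
import sqlite3


def annotate(db: sqlite3.Connection, video_id: str, score: int,
             note: str = "", starred: bool = False) -> None:
    """Write one operator annotation per video, replacing any earlier one."""
    if not 0 <= score <= 10:
        raise ValueError("operator score must be between 0 and 10")
    db.execute(
        "INSERT OR REPLACE INTO annotations (video_id, score, note, starred) "
        "VALUES (?, ?, ?, ?)",
        (video_id, score, note, int(starred)),
    )


def starred_export(db: sqlite3.Connection) -> list[dict]:
    """Collect starred items, highest score first, for structured export."""
    rows = db.execute(
        "SELECT video_id, score, note FROM annotations "
        "WHERE starred = 1 ORDER BY score DESC"
    ).fetchall()
    return [{"video_id": v, "score": s, "note": n} for v, s, n in rows]


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE annotations ("
    "video_id TEXT PRIMARY KEY, score INTEGER, note TEXT, starred INTEGER)"
)
annotate(db, "YT_kXGQe8n7m4A", 9, note="direct contradiction", starred=True)
annotate(db, "YT_example2", 4)
export = starred_export(db)
print(export)
```

Keying the table on the video ID matches the "per video" write-back described above: re-scoring an item overwrites its earlier annotation instead of adding a second row.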

⌨ Keyboard nav

D1 Durable storage

Review item 3 of 42

Starred · conf 0.92 · S star · ↵ next

★ Star · Mark seen · Hide · ← → Nav

Operator access only.

This workspace is private. If you have credentials, authenticate to open the pipeline console.

pipeline active · access gated