Reverse Engineering the YouTube Algorithm
Published as: Parametric Algorithmic Transformer Based Weighted YouTube Video Analysis
Automated live implementation: 2026 - fully automated end-to-end with Claude Code.
Crawl YouTube's recommendation graph for any search query, detect community clusters with ForceAtlas2 + Louvain, then use an LLM judge to compare your video's transcript against the top recommended videos, weighted by engagement metrics.
Enter a search query, paste your video URL, pick an LLM, and get a full parametric breakdown of how your content stacks up, plus actionable recommendations on what to improve.
How It Works
Crawl: search-based graph construction
Cluster: ForceAtlas2 + Louvain modularity
Analyze: LLM scores 12 transcript parameters
Score: weighted normalization via engagement
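The crawl step can be pictured as a depth-limited breadth-first search over "recommended next" links. The sketch below is only an illustration under stated assumptions: the page does not say how the crawler is implemented, and `fetch_related` is a hypothetical callback that stands in for whatever API call returns the videos recommended alongside a given video. The toy dictionary replaces real network requests.

```python
from collections import deque
from typing import Callable, Iterable, List, Set, Tuple


def crawl_graph(
    seeds: Iterable[str],
    fetch_related: Callable[[str], List[str]],  # hypothetical: video id -> related ids
    max_depth: int = 2,
) -> Tuple[Set[str], List[Tuple[str, str]]]:
    """Breadth-first crawl that returns (nodes, directed edges)."""
    seeds = list(seeds)
    nodes: Set[str] = set(seeds)
    edges: List[Tuple[str, str]] = []
    queue = deque((seed, 0) for seed in seeds)

    while queue:
        video_id, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand videos at the depth limit
        for related_id in fetch_related(video_id):
            edges.append((video_id, related_id))
            if related_id not in nodes:
                nodes.add(related_id)
                queue.append((related_id, depth + 1))
    return nodes, edges


# Toy stand-in for the recommendation graph (made-up ids, no network calls).
TOY_GRAPH = {"a": ["b", "c"], "b": ["c", "d"], "c": ["e"], "d": [], "e": ["f"]}

nodes, edges = crawl_graph(["a"], lambda vid: TOY_GRAPH.get(vid, []), max_depth=2)
```

With `max_depth=2`, video "f" is never reached because "e" sits at the depth limit and is not expanded. The resulting node and edge lists are the kind of input a layout and community-detection step would consume.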
Live Analysis
Requires a YouTube Data API key and one LLM API key (set as server environment variables or entered in the form)
Search query: the topic you want to optimize your video for
Video URL: the video you want to compare against recommendations
Depth 2: ~200 videos, balanced
Estimated cost: ~$0.38
~12 LLM calls via Claude Sonnet 4.6 (76k in / 9.8k out tokens) - ~2100 YouTube API quota units
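As a rough check on the estimate, the token counts can be turned into a dollar figure. The per-million-token rates below are assumptions chosen for illustration (3 USD input, 15 USD output); they are not stated on this page, so check the provider's current pricing before relying on them.

```python
# Assumed rates in USD per million tokens (illustrative, not from this page).
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_llm_cost(
    input_tokens: int,
    output_tokens: int,
    in_rate: float = INPUT_USD_PER_MTOK,
    out_rate: float = OUTPUT_USD_PER_MTOK,
) -> float:
    """Return the estimated cost in USD for one analysis run."""
    return input_tokens / 1_000_000 * in_rate + output_tokens / 1_000_000 * out_rate


# Token counts quoted in the estimate: 76k in, 9.8k out.
cost = estimate_llm_cost(76_000, 9_800)
print(f"Estimated LLM cost: ${cost:.3f}")
```

Under these assumed rates the input side contributes about 0.228 USD and the output side about 0.147 USD, for a total of about 0.375 USD, which is in line with the ~$0.38 shown.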
Methodology
W = (Popularity × Engagement × Sentiment) + Consistency
Popularity = Views × Subscribers
Engagement = (Likes + Comments) / (2 × Views)
Sentiment = Likes / Views
Consistency = (avgLikes + avgComments + avgViews) / avgViews
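The weight formula can be written out directly. This is a minimal sketch, not the tool's actual code: the `VideoStats` container and the function name are hypothetical, and the page does not say over which set of videos avgLikes, avgComments, and avgViews are averaged, so they are passed in as plain arguments.

```python
from dataclasses import dataclass


@dataclass
class VideoStats:
    """Hypothetical container for one comparison video's public metrics."""
    views: int
    likes: int
    comments: int
    subscribers: int


def video_weight(v: VideoStats, avg_views: float, avg_likes: float, avg_comments: float) -> float:
    """W = (Popularity x Engagement x Sentiment) + Consistency."""
    if v.views <= 0 or avg_views <= 0:
        raise ValueError("views and avg_views must be positive")
    popularity = v.views * v.subscribers
    engagement = (v.likes + v.comments) / (2 * v.views)
    sentiment = v.likes / v.views
    consistency = (avg_likes + avg_comments + avg_views) / avg_views
    return popularity * engagement * sentiment + consistency


# Made-up numbers, chosen only to make the arithmetic easy to follow.
example = VideoStats(views=1_000, likes=100, comments=20, subscribers=500)
w = video_weight(example, avg_views=1_000, avg_likes=50, avg_comments=10)
print(round(w, 2))
```

For these made-up inputs, Popularity is 500,000, Engagement is 0.06, Sentiment is 0.1, and Consistency is 1.06, so W comes out at roughly 3001.06. Because Popularity multiplies two large counts, the product term dominates Consistency for any video with a sizeable audience.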
P_i = Σ (T1_j − T2_j) × W_j
For each parameter, the difference between your video's score and each comparison video's score is multiplied by that video's weight, then standardized via min-max normalization.
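One way to read the P_i formula: for parameter i, T1 is your video's score, T2_j is comparison video j's score on the same parameter, and W_j is that video's weight. The sketch below follows that reading; the function name, the dictionary layout, and the parameter names "hook" and "pacing" are hypothetical.

```python
from typing import Dict, List, Tuple

Scores = Dict[str, float]  # parameter name -> LLM-assigned score


def parametric_differences(
    own_scores: Scores,
    comparisons: List[Tuple[Scores, float]],  # (scores, weight) per comparison video
) -> Dict[str, float]:
    """P_i = sum over comparison videos j of (T1 - T2_j) * W_j."""
    return {
        param: sum((own - scores[param]) * weight for scores, weight in comparisons)
        for param, own in own_scores.items()
    }


# Made-up scores and weights for two comparison videos.
own = {"hook": 8.0, "pacing": 5.0}
others = [
    ({"hook": 6.0, "pacing": 7.0}, 2.0),
    ({"hook": 9.0, "pacing": 5.0}, 1.0),
]
p = parametric_differences(own, others)
print(p)  # {'hook': 3.0, 'pacing': -4.0}
```

A positive value means your video outscores the weighted comparison set on that parameter; a negative value means it trails. Here "hook" is (8 - 6) x 2 + (8 - 9) x 1 = 3 and "pacing" is (5 - 7) x 2 + 0 = -4.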
f(x) = (x − min) / (max − min)
Normalizes the weighted parametric differences to a 0-100 scale for interpretability.
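A minimal sketch of the min-max step follows. The formula as written yields values between 0 and 1, so the multiplication by 100 is an assumption made here to reach the stated 0-100 scale. Returning the midpoint when all values are equal is also an assumption, since the page does not say how that case is handled.

```python
from typing import List


def min_max_scale(values: List[float], scale: float = 100.0) -> List[float]:
    """Map values linearly so that min -> 0 and max -> scale."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: with no spread, place every value at the midpoint.
        return [scale / 2 for _ in values]
    return [(v - lo) / (hi - lo) * scale for v in values]


# Made-up weighted differences for three parameters.
scaled = min_max_scale([-4.0, 3.0, 10.0])
print(scaled)  # [0.0, 50.0, 100.0]
```

Note that min-max scaling is relative: the lowest parameter always maps to 0 and the highest to 100, so the scaled values rank parameters against each other rather than measure absolute quality.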
f(x) = −1/x
Floors weight values to maintain a similar baseline and deviation, ensuring proportional comparison.
Parameter Presets
YouTube Optimization
Parameters that map to YouTube algorithm signals
How engaging and attention-grabbing the opening 30 seconds are - does it create curiosity or promise value immediately
Amount of useful, actionable information per minute of content - high density keeps watch time up
Clarity of story arc - setup, development, payoff - does it have a logical flow that drives completion
Use of curiosity gaps, teasers, open loops, pattern interrupts, and 'but wait' moments that keep viewers watching
How well the spoken content covers search-relevant keywords and phrases that match the title topic
Emotional peaks and valleys throughout - humor, surprise, empathy, excitement - drives likes and shares
How well complex concepts are broken down - use of analogies, examples, step-by-step - affects satisfaction
Speed of content delivery - is it well-paced or does it drag/rush - affects audience retention curve
Presence and quality of subscribe, like, comment prompts - drives engagement metrics YouTube tracks
Would someone send this to a friend - unique insights, surprising facts, quotable moments
Does the speaker sound knowledgeable, cite sources, demonstrate expertise - affects trust signals
Tightness of script - minimal filler words, repetition, tangents - polished vs rambling delivery
Original Research (2023)
The 12 linguistic parameters from the published paper
Overall readability score
Flesch-Kincaid grade level - education level needed to understand the text
Coleman-Liau reading difficulty metric based on sentence and word structure
Percentage of content-bearing words vs total words
Flow and connectivity of ideas throughout the transcript
Emotional positivity and tone of the content
Density and frequency of topic-relevant keywords
How well the spoken content matches the video title
How easy the content is to follow and comprehend for a general audience
Level of technical depth and specialized knowledge required
Amount of domain-specific or specialized terminology used