
Beyond the algorithms: why streaming needs human-AI teams

  • mvansabben4
  • Oct 14
  • 5 min read

The scale problem isn't what you think

The streaming industry faces a paradox: platforms already employ legions of human curators and sophisticated hybrid systems, yet viewers still struggle with discovery. The issue isn't that algorithms work alone; it's that the balance between human judgment and machine learning remains poorly calibrated. It's a recalibration my friend Justine Powell is championing in her recent research.


Recent research reveals the complexity. A 2025 study of 50,000 music streaming users found that algorithmic curation may introduce more novelty than organic user behaviour, yet this novelty remains semantically confined to similar genres and styles. The algorithms diversify within boundaries rather than breaking through them.


This matters for streaming video because the stakes are higher: unlike 3-minute songs, viewers commit 30-90 minutes of their time and attention. Poor discovery doesn't just waste time; it haemorrhages subscriber value.


The real cost of algorithmic drift

Industry metrics tell a stark story. Nielsen/Gracenote data shows 20% of viewers abandon browsing sessions entirely, with average decision times exceeding 10 minutes. But the underlying issue runs deeper than choice paralysis.


A comprehensive 2024 review of over 100 empirical studies on human-AI collaboration found that current interaction patterns are "dominated by simplistic collaboration paradigms, leading to little support for truly interactive functionality". The problem isn't algorithms per se; it's how poorly they are integrated with human decision-making processes.


Three specific failure modes emerge:

Feedback loops amplify initial advantages. Once a title gains marginal traction, algorithms reinforce it relentlessly, producing winner-takes-all dynamics while sidelining potentially valuable content (Burghardt & Lerman, 2022). This is an inevitable consequence of optimisation systems without counterbalancing mechanisms.


Cold start problems remain unsolved. Regional content and emerging titles with limited interaction data stay invisible. Algorithms default to proven patterns because they (currently) lack the contextual understanding to take calculated risks on unproven content.


The exposure-selection gap widens. Research shows that while algorithmic systems expose users to diverse options, users' organic selections within those recommendations often constrain consumption diversity further, sometimes to an even greater extent than the algorithm intended. We blame the algorithm, but human selection behaviour appears to amplify homogeneity. I suspect that's why, after 20 minutes of deciding what to watch, I regularly end up on an episode of Seinfeld... again...


Where hybrid systems already work

The critique that "platforms already use curation" is valid, and it is precisely why we need to examine what is working and what is not.


Recent systematic analysis reveals that most human-AI collaboration in decision-making relies on either "AI-first" patterns, where recommendations appear immediately, or "AI-follow" patterns, where humans make initial judgments before seeing AI input. Both have problems: AI-first encourages over-reliance on the machine, while AI-follow adds effort without reliably improving outcomes.


Music streaming provides the proof of concept. Hybrid playlists combining algorithmic suggestions with editorial curation consistently outperform pure automation. But the success hinges on specific implementation: curators don't override algorithms arbitrarily; they provide direction and add context that algorithms can't pick up.

Video streaming needs similar sophistication, adapted for its unique constraints.


A practical framework for implementation

The answer lies in making humans and AI work together better rather than choosing one over the other. Research identifies seven distinct patterns by which people and AI can share decisions, yet most systems use only two or three of the simplest ones. That leaves much of the design space unexplored.


  • Contextual override zones with measurement. Give part of the homepage to hand-picked content and rigorously measure how it performs. Don’t just look at clicks (which favour familiar titles); also track watch time, completions, repeat viewing, and how much viewing comes from less obvious choices (discovery share). That shows whether curation is actually working.

  • Temporal and cultural guardrails. Set simple limits to keep the catalogue balanced: no more than two blockbusters at once, always at least one local title per region, and guaranteed space for new or under-represented voices (see the sketch after this list). These rules protect against the sameness algorithms tend to create.

  • Request-driven assistance. Let users ask for AI help instead of forcing recommendations on them. This gives them more control, as long as the design avoids simply feeding back what they already like. Go beyond search and offer filters and exploration tools that algorithms can fill with fresh options.

  • Delegation with oversight. Split responsibilities between humans and AI. Algorithms can handle the heavy lifting, while curators set rules and check results. They don’t need to oversee every single recommendation, but they should guide the system and make sure it stays aligned with strategic goals.
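
To make the guardrail and override-zone ideas concrete, here is a minimal Python sketch of how such rules might sit on top of an algorithmic ranking. The field names, the two-blockbuster cap, the two curated slots and the ten-title row are illustrative assumptions, not any platform's actual configuration.

def apply_guardrails(ranked_titles, curated_picks, region, row_size=10):
    """Build one homepage row from an algorithmic ranking plus curated picks."""
    MAX_BLOCKBUSTERS = 2   # assumed cap on heavily promoted titles per row
    CURATED_SLOTS = 2      # assumed share of the row reserved for curation

    # Reserve explicit slots for hand-picked content first.
    row = list(curated_picks[:CURATED_SLOTS])
    blockbusters = sum(bool(t.get("is_blockbuster")) for t in row)

    for title in ranked_titles:
        if len(row) >= row_size:
            break
        if title in row:
            continue
        if title.get("is_blockbuster") and blockbusters >= MAX_BLOCKBUSTERS:
            continue  # cap blockbusters to protect catalogue balance
        row.append(title)
        blockbusters += bool(title.get("is_blockbuster"))

    # Guarantee at least one local title per region.
    if not any(t.get("region") == region for t in row):
        local = next((t for t in ranked_titles if t.get("region") == region), None)
        if local is not None and local not in row:
            row[-1] = local

    return row

The point is not these particular rules but that they live in one place curators can inspect and adjust, rather than being implicit in model weights.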


The economic reality

The valid criticism: the cost of hybrid systems scales linearly, while algorithmic costs scale logarithmically. A global platform serving 200+ territories can't have curators watching everything.


This frames the problem incorrectly. Human curation doesn't mean reviewing every title. It means strategic intervention at leverage points:


  • Onboarding new content. Algorithms struggle with cold start; human judgment provides initial signal.

  • Regional calibration. Local curators understand cultural context algorithms miss.

  • Guardrail monitoring. Automated systems check for drift; humans investigate anomalies.

  • A/B test design. Curators hypothesise what might work; algorithms validate at scale.


The cost equation changes when humans amplify rather than replace algorithmic work. A curator setting rules for 1,000 algorithmic decisions creates more value than individually selecting 1,000 titles.


Measuring success differently

Current success metrics create perverse incentives. Optimising for immediate engagement produces algorithm-friendly content that performs well initially but can breed long-term dissatisfaction: the browsing abandonment problem. Hello again, Jerry...

Better metrics capture:


  • Discovery efficiency: How quickly do viewers find satisfying content?

  • Satisfaction persistence: Do viewing sessions lead to subscription renewal?

  • Catalogue activation: What percentage of available content generates meaningful engagement?

  • Serendipity value: How often do viewers discover content outside their predicted preferences and report satisfaction?
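
As a rough illustration, three of these could be computed from ordinary session logs along the lines below. The field names (browse_seconds, started, title_id, completion, predicted, satisfied) are assumptions for the sketch, not any platform's schema.

def discovery_efficiency(sessions):
    """Median seconds spent browsing before a title is actually started."""
    times = sorted(s["browse_seconds"] for s in sessions if s.get("started"))
    return times[len(times) // 2] if times else None

def catalogue_activation(sessions, catalogue_size, min_completion=0.7):
    """Share of the catalogue receiving at least one 'meaningful' view."""
    engaged = {s["title_id"] for s in sessions if s.get("completion", 0) >= min_completion}
    return len(engaged) / catalogue_size

def serendipity_value(sessions):
    """Share of satisfying views that fell outside predicted preferences."""
    satisfying = [s for s in sessions if s.get("satisfied")]
    if not satisfying:
        return 0.0
    return sum(not s.get("predicted", True) for s in satisfying) / len(satisfying)

Satisfaction persistence is the hard one: it requires joining viewing sessions to renewal outcomes months later, which is precisely why it rarely gets optimised.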


These metrics favour human input, because they track results that algorithms can’t easily target or optimise.


The path forward

When examining music consumption across different interaction scales, research found that algorithmic and organic curation each excel at different dimensions: algorithms introduce novelty while human curation adds variety in meaning and perspective. The solution lies in orchestrating both.


Some practical next steps:


Run pilot experiments with clear hypotheses. In one market or genre, inject 10-15% curated content into algorithmic feeds. Measure not just clicks, but completion rates, satisfaction scores, and catalogue diversity accessed. Set success thresholds in advance.
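
A minimal sketch of what "set success thresholds in advance" could look like; the metric names, threshold values and aggregates below are made up for illustration.

SUCCESS_THRESHOLDS = {
    "completion_rate": 0.02,   # assumed: at least +2 points over control
    "satisfaction": 0.05,      # assumed: at least +5 points on post-session survey
    "diversity": 0.10,         # assumed: at least +10 points of distinct-title share
}

def evaluate_pilot(treatment, control):
    """Compare pre-registered metrics and return per-metric lift and pass/fail."""
    return {
        metric: {
            "lift": round(treatment[metric] - control[metric], 4),
            "passed": treatment[metric] - control[metric] >= threshold,
        }
        for metric, threshold in SUCCESS_THRESHOLDS.items()
    }

# Hypothetical aggregates for the curated-injection arm vs. the control arm.
print(evaluate_pilot(
    {"completion_rate": 0.64, "satisfaction": 0.71, "diversity": 0.38},
    {"completion_rate": 0.61, "satisfaction": 0.69, "diversity": 0.33},
))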

Document interaction patterns. Systematic research shows that most human-AI collaboration relies on only a few interaction patterns, missing opportunities for richer collaboration. Map how curators currently interact with recommendation systems. Identify bottlenecks and design interventions.

Build curator-algorithm feedback loops. When human overrides outperform algorithm predictions, feed that signal back into training data. The algorithm should learn from curator judgment, not just user behaviour.
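
A minimal sketch of that loop, assuming each impression record carries its recommendation source, the observed engagement, and the score the algorithm predicted (all illustrative field names):

def collect_override_signals(impressions):
    """Keep curator overrides that out-performed the algorithm's own prediction."""
    signals = []
    for imp in impressions:
        if imp.get("source") != "curator_override":
            continue
        lift = imp["engagement"] - imp["algo_predicted_engagement"]
        if lift > 0:
            signals.append({
                "user_id": imp["user_id"],
                "title_id": imp["title_id"],
                "label": 1.0,   # treat as a positive training example
                "weight": lift, # weight by how much the curator beat the model
            })
    return signals  # merged into the next retraining batch

The records where overrides underperform are just as useful: they tell curators where to step back.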

Acknowledge what algorithms can't know. Licensing windows, rights fragmentation, cultural moments, and competitive dynamics are all contextual factors that fundamentally cannot emerge from user behaviour data alone. Design systems that surface them to algorithms through structured curator input, as sketched below.
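
One way to make that structured input concrete is a small, explicit schema that travels with each title into the ranker as features, rather than hoping the model infers the context. All field names are assumptions for illustration (Python 3.10+).

from dataclasses import dataclass, field
from datetime import date

@dataclass
class CuratorContext:
    title_id: str
    licensing_window_end: date | None = None    # title leaves the catalogue soon
    cultural_moment: str | None = None          # e.g. a local festival or anniversary
    restricted_regions: list[str] = field(default_factory=list)
    boost_reason: str | None = None             # free-text rationale, logged for audit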


AI teams: collaboration, not replacement

The future of streaming discovery is about building systems where humans and AI boost each other’s strengths. Real collaboration is still rare. Most platforms appear to use AI and humans one after the other, instead of working together as true partners.


Platforms that crack this integration, where algorithms handle scale while humans provide context, judgment, and strategic direction, will stand out by helping people discover content in a market where everything else feels the same.


The question is whether we are sophisticated enough to design systems where both actually work as a team.

 
 
 