Data Intelligence — Mast Labs

Data Intelligence · Genomic Taxonomy
Genomic taxonomy and content intelligence rebuild across 200M+ digital assets.
Assets reclassified: 200M+
Analysis time reduction: 60%
Metadata accuracy: +15%
Revenue impact: $215M+
Users on rebuilt system: 70M+

01 · CONTEXT

Who. What scale. What was at stake.

A major audio subscription platform with 70M+ active users ran its recommendation, personalization, and ad targeting systems on a proprietary genomic taxonomy. The taxonomy was the product. It was also the constraint. The original schema was brittle, hard to extend, and incompatible with the ML pipelines the business needed to move toward.

02 · CONSTRAINT

The architectural problem.

The migration had to cover 200M+ assets without breaking the downstream systems generating revenue in production, and none of those systems could be taken offline. The second constraint: the new taxonomy had to unlock capability that the original numeric-identifier schema had been blocking, not just be a cleaner version of the same structure.

03 · DECISION POINTS

Three decisions that shaped everything downstream.

Decision 01

Schema direction

Migrated from numeric genre identifiers to a classified, semantic, multi-dimensional tag structure. Each asset carries multiple dimensions of meaning rather than a single genre assignment. This unlocked cross-genre and mood-based recommendation logic the product had been unable to build.
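
As a rough illustration of the difference between a single numeric genre identifier and a multi-dimensional tag structure, here is a minimal Python sketch. The class name, field names, dimension names ("genre", "mood"), and sample values are all hypothetical; the production schema is not described on this page.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedAsset:
    """One asset under a multi-dimensional taxonomy (illustrative only)."""
    asset_id: str
    legacy_genre_id: int  # the old single numeric identifier (assumed shape)
    # dimension name -> set of semantic tags; dimension names are assumptions
    tags: dict[str, frozenset[str]] = field(default_factory=dict)


def shares_dimension(a: TaggedAsset, b: TaggedAsset, dimension: str) -> bool:
    """True when two assets overlap on one dimension, whatever their genre."""
    return bool(a.tags.get(dimension, frozenset()) & b.tags.get(dimension, frozenset()))


jazz_track = TaggedAsset(
    "a1", 17,
    {"genre": frozenset({"jazz"}), "mood": frozenset({"mellow", "late-night"})},
)
folk_track = TaggedAsset(
    "a2", 42,
    {"genre": frozenset({"folk"}), "mood": frozenset({"mellow"})},
)

print(shares_dimension(jazz_track, folk_track, "genre"))  # False
print(shares_dimension(jazz_track, folk_track, "mood"))   # True
```

With only `legacy_genre_id` the two tracks look unrelated. The shared "mood" tag is the kind of signal a cross-genre or mood-based recommender could use.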

Decision 02

Migration sequencing

Parallel schema operation during migration. Old taxonomy stayed live for downstream systems. New taxonomy built alongside it. Cutover happened system by system, not all at once. Reduced production risk at the cost of temporary infrastructure complexity.
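
One way to picture system-by-system cutover is a small router that keeps both schemas readable and records which one each downstream system currently reads from. This is a sketch under assumptions, not the platform's implementation: `TaxonomyRouter`, the in-memory dict stores, and the system names are invented for illustration.

```python
from enum import Enum


class Schema(Enum):
    LEGACY = "legacy"
    NEW = "new"


class TaxonomyRouter:
    """Keeps both taxonomies live and routes each system to one of them."""

    def __init__(self, legacy_store: dict, new_store: dict, systems: list[str]):
        self._stores = {Schema.LEGACY: legacy_store, Schema.NEW: new_store}
        # Every downstream system starts on the legacy schema.
        self._active = {name: Schema.LEGACY for name in systems}

    def cut_over(self, system: str) -> None:
        """Move a single system to the new schema; others are unaffected."""
        if system not in self._active:
            raise KeyError(f"unknown system: {system}")
        self._active[system] = Schema.NEW

    def roll_back(self, system: str) -> None:
        """Return a single system to the legacy schema."""
        if system not in self._active:
            raise KeyError(f"unknown system: {system}")
        self._active[system] = Schema.LEGACY

    def lookup(self, system: str, asset_id: str):
        """Read an asset's classification from whichever schema the system uses."""
        return self._stores[self._active[system]][asset_id]


legacy = {"a1": 17}                                      # numeric genre id
new = {"a1": {"genre": ["jazz"], "mood": ["mellow"]}}    # semantic tags

router = TaxonomyRouter(legacy, new, ["recommendations", "ad_targeting"])
router.cut_over("recommendations")

print(router.lookup("recommendations", "a1"))  # new-schema tags
print(router.lookup("ad_targeting", "a1"))     # still the legacy id
```

The trade-off matches the one described above: two stores must be kept in sync for the duration of the migration, but any single cutover can be reversed without touching the other systems.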

Decision 03

Knowledge graph integration

Taxonomy became a queryable graph, not a flat tag store. Relationships between assets, artists, moods, tempo, and context became traversable. This was the structural change that enabled the downstream revenue impact.
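
To make "traversable" concrete, the sketch below stores relationships as an undirected adjacency map and answers "what is reachable within N hops" with a breadth-first search. The node naming scheme (`track:`, `mood:`, `artist:`, `tempo:`) and the `TaxonomyGraph` class are assumptions for illustration; the actual graph store and query language are not specified here.

```python
from collections import defaultdict, deque


class TaxonomyGraph:
    """Tiny in-memory graph of assets, artists, moods, and other tags."""

    def __init__(self) -> None:
        self._edges: dict[str, set[str]] = defaultdict(set)

    def relate(self, a: str, b: str) -> None:
        """Add an undirected relationship between two nodes."""
        self._edges[a].add(b)
        self._edges[b].add(a)

    def within(self, start: str, max_hops: int) -> set[str]:
        """Breadth-first search: every node reachable in at most max_hops."""
        seen = {start}
        frontier = deque([(start, 0)])
        while frontier:
            node, hops = frontier.popleft()
            if hops == max_hops:
                continue  # do not expand past the hop limit
            for neighbor in self._edges.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    frontier.append((neighbor, hops + 1))
        seen.discard(start)
        return seen


graph = TaxonomyGraph()
graph.relate("track:a1", "mood:mellow")
graph.relate("track:a2", "mood:mellow")
graph.relate("track:a1", "artist:x")
graph.relate("track:a3", "artist:x")
graph.relate("track:a3", "tempo:slow")

related_tracks = sorted(
    n for n in graph.within("track:a1", 2) if n.startswith("track:")
)
print(related_tracks)  # ['track:a2', 'track:a3']
```

A flat tag store can answer "which tracks have this tag". The two-hop query here also surfaces a track that is related only through a shared artist, which is the kind of relationship a graph makes cheap to ask about.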

04 · SYSTEM

What was built.

Rebuilt genomic tagging system. Migrated from numeric to classified, semantic, multi-dimensional taxonomy across 200M+ assets.

New enrichment pipelines. Faster ingestion, cleaner labels, downstream-ready outputs for ML systems.

Knowledge graph integration. Taxonomy became a queryable graph with traversable relationships.

Parallel schema operation during migration. No production downtime. Old taxonomy maintained while new taxonomy was validated.

Cross-functional alignment across content, data, engineering, and product maintained throughout.

05 · OUTCOMES

All metrics from production systems.

200M+ assets reclassified under the new taxonomy

60% reduction in per-asset analysis time, from 17 minutes to 8 minutes per track

15% metadata accuracy improvement across the full catalog

Foundation for $215M+ in annualized revenue impact across recommendation, personalization, and ad targeting

70M+ users served by systems running on the rebuilt taxonomy

06 · DISCUSS FURTHER

The architecture above is public. What follows is a conversation.

Taxonomy schema design, migration sequencing decisions, ML pipeline integration approach, knowledge graph structure, team scaling model, and the specific enrichment pipeline architecture are shared in a 30-minute conversation with executives evaluating similar domain intelligence work.
