Seven sources. One knowledge graph. Music that remembers what you forget.
Scroll to explore
Spotify's algorithm is optimized to keep you listening, not to deepen your understanding of what you actually like. It feeds you adjacencies to what you've already consumed, reinforcing a narrowing corridor of taste. Over time, the bubble tightens. Artists you loved three years ago disappear from rotation. Genres you explored once never resurface. Your listening history exists only as a stream of play counts with no structure, no context, and no memory of why you were drawn to something.
The deeper problem is fragmentation. Your taste profile is scattered across half a dozen platforms: Spotify has play history, Last.fm has scrobbles, MusicBrainz has metadata, Discogs has pressings and lineage, Genius has lyrics and annotations. Each platform holds a shard of understanding, but none of them talk to each other, and none of them model the relationships between artists, genres, and tracks in a way that evolves with your behavior.
This project is built on a different premise: what if your music library was a knowledge graph that learned from how you listen?
Spotify is the spine — tracks, albums, artists, all keyed on Spotify IDs. From there, six other sources feed in. Last.fm brings listening history and artist similarity. MusicBrainz adds authoritative metadata and recording relationships. ListenBrainz contributes community listening data. Deezer has audio previews and its own recommendation graph. Discogs has physical release lineage. Genius has lyrics.
Everything collapses into a single SQLite database: 27 tables, designed so any entity can be enriched from any source without duplication. A rate limiter keeps seven APIs happy simultaneously. Cross-referencing happens automatically through external ID mapping.
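The external ID mapping can be sketched with two hypothetical tables (the real schema's 27 table and column names aren't shown here, so everything below is an assumption about shape, not the actual schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE tracks (
        spotify_id TEXT PRIMARY KEY,   -- Spotify IDs are the spine
        title      TEXT NOT NULL
    );
    -- One row per (entity, source): maps a Spotify-keyed entity to its
    -- identifier on another platform, so any source can enrich it later
    -- without duplicating the entity itself.
    CREATE TABLE external_ids (
        spotify_id  TEXT REFERENCES tracks(spotify_id),
        source      TEXT NOT NULL,     -- 'musicbrainz', 'deezer', 'discogs', ...
        external_id TEXT NOT NULL,
        UNIQUE (spotify_id, source)    -- at most one mapping per source
    );
""")
con.execute("INSERT INTO tracks VALUES (?, ?)", ("sp:1", "Example Track"))
con.execute("INSERT INTO external_ids VALUES (?, ?, ?)",
            ("sp:1", "musicbrainz", "mbid-1234"))

# Enrichment resolves the platform-specific ID through the mapping:
row = con.execute(
    "SELECT external_id FROM external_ids WHERE spotify_id = ? AND source = ?",
    ("sp:1", "musicbrainz"),
).fetchone()
```

The `UNIQUE (spotify_id, source)` constraint is what makes "enriched from any source without duplication" cheap: re-running an enrichment pass can upsert against it instead of creating a second mapping.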
The result: every track exists in context. Its genre lineage. Its artist's collaborators. Its acoustic neighbors. How often you listen to it and where you found it.
Enrichment Pipeline
Data flows from Spotify through six enrichment sources into the unified SQLite graph. Each source contributes a different dimension of understanding. Illustrative visualization.
Genre labels are a mess. Spotify assigns them to artists, not tracks. Last.fm uses crowd-sourced tags that drift. MusicBrainz has its own vocabulary. Instead of picking one system, the project builds a co-occurrence network: genres connect when they show up together on shared artists.
Post-punk and darkwave cluster together. Math rock bridges prog and indie. Electronic music splinters into dozens of micro-communities. The network reveals more about how genres actually relate than any taxonomy could.
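The co-occurrence construction is simple enough to sketch. The artists and tags below are illustrative, not data from the project:

```python
from collections import Counter
from itertools import combinations

# Illustrative artist -> genre tags; the real graph is built from the library.
artist_genres = {
    "Artist A": ["post-punk", "darkwave"],
    "Artist B": ["post-punk", "darkwave", "coldwave"],
    "Artist C": ["math rock", "indie"],
}

# Genres connect when they appear together on the same artist;
# edge weight = number of artists they share.
edges = Counter()
for genres in artist_genres.values():
    for g1, g2 in combinations(sorted(set(genres)), 2):
        edges[(g1, g2)] += 1
```

Community detection and the force-directed layout then run on this weighted edge list; nothing in the construction depends on any one platform's genre vocabulary.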
Bigger nodes = more tracks. Thicker edges = genres that co-occur more. Colors show detected communities. Drag to explore, hover for details.
Genre Co-occurrence Network
Force-directed layout of genre co-occurrence across 79 genres. Nodes sized by track count, colored by community. Drag to explore, hover for details. Live data from the Resonance database.
Every track has a frequency: a target interval in days between listens. It's not fixed. It evolves from your actual behavior, using the same golden-ratio mechanism from the task intelligence system.
Listen to something ahead of schedule and the interval shrinks — you hear it more often. Let it go overdue and the interval grows — it fades into the background. The multiplier is 1.382, a constant derived from the golden ratio (1 + φ⁻² ≈ 1.382). Stable, symmetric, and fast to adapt.
Over time this creates a natural rotation. Tracks you seek out earn tight intervals. Tracks you ignore drift. The system doesn't decide what you should listen to. It watches what you do listen to. Survival of the chosen.
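The adaptation rule reads like code, so here it is as a minimal sketch — the function name and the exact trigger conditions are assumptions, but the multiply/divide symmetry is as described:

```python
FACTOR = 1.382  # golden-ratio-derived multiplier

def update_interval(interval_days, days_since_listen, listened_now):
    """Adjust a track's target interval after a listening decision."""
    if listened_now and days_since_listen < interval_days:
        return interval_days / FACTOR   # chosen early -> hear it more often
    if not listened_now and days_since_listen > interval_days:
        return interval_days * FACTOR   # neglected -> fades into background
    return interval_days                # on schedule -> unchanged
```

Because the factor is the same in both directions, one early listen and one neglect cancel out exactly — that is the symmetry the prose is pointing at.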
Frequency Evolution Over Time
Choosing tracks early pulls their interval down (÷1.382); neglecting tracks pushes it up (×1.382). The golden-ratio-derived multiplier keeps adaptation stable. Illustrative visualization with simulated data.
Connect artists who share genre tags. More overlap, stronger edge. What falls out is a map of how a personal library actually fits together. Ambient and IDM cluster tight. Vaporwave forms its own island. Rap links through shared tags that have nothing to do with what ambient listeners care about. The structure is there whether you see it or not.
This is also what powers the discovery engine. When the system recommends someone new, it can point to the map: this artist sits between your ambient cluster and your electronic cluster. A location in the network, not a match score.
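One natural way to turn "more overlap, stronger edge" into a number is Jaccard similarity over tag sets. The project's exact weighting isn't stated, so treat this as an assumption:

```python
def genre_overlap(tags_a, tags_b):
    """Jaccard similarity of two artists' genre tag sets (0.0 to 1.0)."""
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Two shared tags out of four distinct ones -> edge weight 0.5
w = genre_overlap(["ambient", "idm", "electronic"],
                  ["idm", "electronic", "techno"])
```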
Artist Similarity Constellation
Top 35 artists by listen count, colored by primary genre. Edges weighted by shared genre overlap. Drag to rearrange, hover for details. Live data from the Resonance database.
A snapshot of your library tells you what you listen to. It doesn't tell you who you were six months ago. The timeline traces genre distribution week by week — which genres dominate when, where new ones enter, where old ones fade. You can see discovery bursts as new colors suddenly appearing in the stack. You can see obsessions as bands of color that swell and then thin.
This also feeds the prediction model. Ambient at 1am. High-energy stuff on weekday mornings. Experimental on weekends. Time becomes a feature, not just a timestamp.
Listening Timeline by Genre
Stacked area chart showing genre distribution over time. Width of each band reflects relative listening volume. Illustrative visualization with simulated data.
Genre labels resist a single taxonomy — Spotify assigns them to artists, Last.fm's crowd tags drift, MusicBrainz has its own vocabulary — so rather than force one, the system normalizes sub-genres into broader families and counts track associations. The result is a structural view of the library: where the mass is concentrated, what the long tail looks like, and how niche genres relate to broader categories.
Soundtrack and electronic music dominate, with ambient and vaporwave forming a substantial core. The "Other" category captures the long tail: genres with too few tracks to form their own cluster, or tracks whose artists span multiple incompatible genre labels. As the library grows, new categories will emerge from this residual.
Library Composition by Genre Family
Top genre families across 1,332 tracks. Sub-genres normalized into display families. Hover for track counts and percentages. Live data from the Resonance database.
Five strategies, each running independently. Last.fm finds similar tracks and artists from scrobble data. Deezer has its own related-artist graph. ListenBrainz pulls collaborative filtering from an open community of listeners. Genre exploration looks for underexplored corners of your own library. And Spotify's catalog, seeded from the frequency model's most overdue artists, resurfaces people you've been neglecting.
Results merge. When multiple sources agree on an artist, that artist ranks higher. When only one source suggests something, it still shows up — it just ranks lower. Add a discovery to the library or dismiss it. Either way, the feedback loop tightens.
The difference from Spotify's recommendations: you own the graph. Every suggestion has provenance. You can see exactly which sources agreed, why, and where the recommendation sits relative to what you already know.
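The merge-and-rank step can be sketched by counting source agreement; the source names are from the text, the function shape is an assumption:

```python
from collections import defaultdict

def merge_discoveries(per_source):
    """per_source: {source_name: [artist, ...]} -> artists ranked by agreement."""
    votes = defaultdict(set)
    for source, artists in per_source.items():
        for artist in artists:
            votes[artist].add(source)   # provenance: which sources suggested whom
    # more agreeing sources -> higher rank; ties broken alphabetically
    return sorted(votes, key=lambda a: (-len(votes[a]), a))

ranked = merge_discoveries({
    "lastfm": ["HTRK", "Boy Harsher"],
    "deezer": ["Boy Harsher", "Drab Majesty"],
    "listenbrainz": ["Boy Harsher"],
})
```

Keeping the `votes` sets around is what makes provenance free: every suggestion carries the list of sources that agreed on it.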
Every artist has a target frequency (how many days should pass between listens) and a cycle progress: days since you last listened, divided by that frequency. Over 1.0 means overdue. The scatter plot puts each artist on a map: target frequency on x, cycle progress on y. Above the line, you owe them a listen.
This is the system's pulse. At a glance: who's due, who's comfortable, and who's been drifting for weeks. It's a living picture. The frequencies never stop moving.
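The pulse metric is a one-line ratio; a minimal sketch, with an illustrative overdue filter on top:

```python
def cycle_progress(days_since_listen, target_frequency_days):
    """Days since last listen / target frequency; above 1.0 means overdue."""
    return days_since_listen / target_frequency_days

def overdue(artists):
    """artists: [(name, days_since_listen, frequency_days)] -> overdue names."""
    return [name for name, days, freq in artists
            if cycle_progress(days, freq) > 1.0]
```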
Artist Frequency Distribution
Each point is an artist. X-axis: target frequency in days. Y-axis: cycle progress (days since listened / frequency). Points above the dashed line are overdue. Illustrative visualization with simulated data.
Genre labels are shortcuts. Two tracks tagged "electronic" can sound nothing alike. Two tracks from different genres can share tempo, harmonic structure, spectral texture — everything that actually hits your ears. Labels approximate. Audio features don't.
CLAP (Contrastive Language-Audio Pretraining) turns each Deezer 30-second preview into a 512-dimensional embedding. Every track becomes a point in acoustic space. Close points sound alike. The t-SNE projection below collapses 512 dimensions into two, and the clusters that emerge have nothing to do with how the tracks were tagged. Cosine similarity in CLAP space powers the audio search: find what sounds like what you like, not what's labeled like what you like.
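Cosine similarity over embeddings is the metric named above; here is a dependency-free sketch with tiny stand-in vectors in place of 512-dimensional CLAP embeddings:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(query, library):
    """library: {track: embedding} -> tracks sorted by acoustic closeness."""
    return sorted(library, key=lambda t: cosine(query, library[t]), reverse=True)

# Tracks close in embedding space sound alike, whatever their genre tags say.
lib = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
order = nearest([1.0, 0.0], lib)
```

In practice one would vectorize this with NumPy and pre-normalize the embeddings so search reduces to a single matrix-vector product, but the ranking is the same.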
Audio Feature Space (CLAP Embeddings)
t-SNE projection of 1,169 CLAP audio embeddings. Each point is a track, colored by genre family. Clusters reveal acoustic neighborhoods that transcend genre labels. Hover for track details. Live data from the Resonance database.
Genres nest inside each other. Electronic contains ambient, techno, house, IDM, and dozens more. Rock branches into post-rock, shoegaze, psychedelic, math rock. Rather than flatten everything into tags, the system builds a hierarchy: 226 genres organized into 14 root families with parent-child relationships at every level.
The sunburst shows relative size by track count. Hover for details. The structure emerges from domain knowledge applied to how genres actually relate in practice, not from any single source's taxonomy.
Genre Hierarchy Sunburst
226 genres organized into 14 root families. Ring depth = hierarchy depth. Arc size = track count. Hover for genre name and percentage.
The primary signal is cycle progress — how overdue something is. This works from day one. No training data. No cold start. The frequency model adapts as you listen, and that's enough to rank meaningfully from the first session.
An ML layer sits on top for when the data is richer: 32-dimensional co-occurrence embeddings trained from listening sessions, temporal features (what time, what day), and a GradientBoosting classifier that learns your selection patterns. The pipeline is built. The frequency model carries things until enough sessions accumulate to train on.
ML Prediction & Taste Profile
Radar chart of genre preferences, temporal listening heatmap (hour × day-of-week), and a live suggestion flow showing ML-ranked predictions with confidence scores and feature attribution.
Training pipeline implemented. Requires ~100 listening sessions for meaningful co-occurrence embeddings and model training. Currently using frequency-based ranking as fallback.
The whole system is exposed through 35 MCP (Model Context Protocol) tools. Add tracks. Log listens. Trigger enrichment. Run discovery. Control playback. Query stats. All within a single conversation with an AI assistant that remembers the collection, understands the graph, and can explain why it's recommending what it's recommending.
Everything sits behind a structured API layer: a general-purpose AI gets deep access to the personal music graph and 35 specialized tools to act on it.
Architecture pattern: a general intelligence agent orchestrates domain-specific tools and specialized models through MCP.
Start a recommendation session and the system meets you with a question: how do you want to find music today? Vibe check, seed artist, genre exploration, or just let it surprise you. Each mode leads to a different discovery strategy, but the interface is the same: a short conversation that narrows intent before the system goes searching.
Pick "Vibe check" and the system asks two follow-ups: what's the mood, and what about vocals? These aren't filters on a database query. They're context for the agent, shaping how it searches, what it prioritizes, and how it sequences the final set.
The system comes back with a curated set: a name, a tracklist with roles (opener, build, peak, cool-down), and a choice. Queue the whole thing, cherry-pick tracks, or start over. Every track is already matched against your library so nothing duplicates what you own.
The structured flow is one way in. The other is just talking. Ask for "more Gesaffelstein type vibes but not his collab stuff with lyrics" and the agent figures out the intent, queries the knowledge graph for what's already in your library, then searches across sources for tracks that fit. It thinks through taste zones, checks for duplicates, and hunts down specific albums and artists.
Behind the scenes, that single request triggers a cascade: SQL queries against the knowledge graph, searches across Spotify's catalog, thinking steps to plan genre coverage, and deduplication against the existing library. All of it visible. The agent searches for Aphex Twin, Boards of Canada, Floating Points, Four Tet, Nils Frahm, Jamie xx, and dozens more, building a set that spans the user's taste zones.
The result: 50 tracks queued and organized by genre zone. Not a flat playlist. A structured set with provenance, grouped by the acoustic neighborhoods that emerged from the search. Ambient/IDM, downtempo electronic, modern classical, electronic/dance, warm electronic, psychedelic world-funk. Each zone with specific artists and tracks, all sourced from the conversation.