Sound and Sentiment
Data Visualisation

For some people, music is more than just sound; it connects them to feelings, moods, and memories. As someone who creates playlists based on how I am feeling, I have always been fascinated by the emotional undertones in music. This project focuses on the larger question: 

Can the emotional landscape of today’s popular music be visualised in a way that feels like the music itself?

The Spark Behind the Project:
We humans tend to respond to music on instinct. But platforms such as Spotify quantify emotion with metrics like valence (a measure of musical positivity), danceability, speechiness, tempo, and energy. I wanted to explore what those numbers actually look like, and how they vary across genres, artists, and songs. 
Would more popular genres tend to be “happier”? Could I spot emotional patterns in different tracks? And what visual language could best represent these feelings?


Data and Methodologies:
Dataset: Spotify API dataset, which is publicly available on Kaggle
Software and Libraries:

I worked in RStudio and used the following libraries:

1. dplyr and the wider tidyverse for data wrangling
2. ggplot2, treemapify, and packcircles for static visualisations
3. plotly and shiny for interactive charts and the final dashboard

The Process - From Data to Design:

I filtered the dataset to songs with a popularity score of 80 or above to isolate major hits, then aggregated by genre to compute total popularity and mean valence as proxies for each genre's reach and emotional tone.
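The filter-then-aggregate step can be sketched in a few lines of dplyr. The column names (`popularity`, `valence`, `track_genre`) follow the Kaggle dataset; the toy rows below are illustrative stand-ins for the real data, not values from it:

```r
library(dplyr)

# Illustrative stand-in for the Kaggle dataset (same column names).
spotify <- data.frame(
  track_genre = c("dance", "dance", "pop", "pop", "edm"),
  popularity  = c(85, 90, 82, 70, 81),
  valence     = c(0.9, 0.8, 0.5, 0.4, 0.3)
)

# Keep only major hits (popularity >= 80), then summarise each genre's
# reach (total popularity) and emotional tone (mean valence).
genre_summary <- spotify %>%
  filter(popularity >= 80) %>%
  group_by(track_genre) %>%
  summarise(
    total_popularity = sum(popularity),
    mean_valence     = mean(valence),
    .groups = "drop"
  )
```

The resulting `genre_summary` table is what the treemap below is drawn from.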

Some of the attributes that were used for these visualisations: 

  • Track ID: The unique Spotify identifier for each track
  • Artists: The artist(s) who performed the track, separated by a semicolon when there are multiple
  • Track Name: The title of the track
  • Track Genre: The genre the track is categorized under.
  • Popularity: A score from 0-100 reflecting how popular the track is, based on how frequently and how recently it has been played.
  • Duration (ms): The length of the track, measured in milliseconds.
  • Energy: A score from 0.0 to 1.0 representing the intensity of a track. Higher values indicate a faster, louder sound.
  • Speechiness: A measure of how much spoken word content a track contains.
                   - Values above 0.66 suggest mostly speech (e.g. podcasts), 
                   - Values between 0.33 and 0.66 suggest a mix of speech and music (e.g. rap), 
                   - Values below 0.33 suggest mostly music
  • Valence: A score from 0.0 to 1.0 indicating the musical positivity of a track. Higher values correspond to a more positive sound (e.g. happy, cheerful, euphoric), while lower values suggest 
    a more negative tone (e.g. sad, depressed, angry).
  • Tempo: The estimated speed of a track in beats per minute (BPM), derived from the average beat duration.



Design Rationale and Visualisation Principles:

While designing the visualisations, I focused on blending data clarity with a visual aesthetic that feels emotionally resonant and musically inspired, guided by key principles of information visualization such as intuitive color encoding and visual hierarchy.

Valence, the core emotional variable, is mapped using a cyan-to-magenta color gradient, chosen for its strong contrast and its alignment with emotional polarity (cool for positive, warm for negative). Color and spatial grouping work together to help users naturally compare patterns across genres, moods, and track features. The neon glow-wave theme reinforces the modern, digital feel of music platforms while maintaining visual consistency across all plots. 
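In base R, a gradient like this can be generated with `colorRampPalette` (in the ggplot2 charts the same idea is expressed through `scale_fill_gradient`). A minimal sketch; the `valence_to_colour` helper is hypothetical, shown only to make the mapping concrete:

```r
# Magenta-to-cyan gradient: low valence -> magenta, high valence -> cyan.
valence_palette <- colorRampPalette(c("magenta", "cyan"))

# Sample 5 colours along the gradient.
cols <- valence_palette(5)

# Hypothetical helper: map a valence score in [0, 1] onto a 100-step gradient.
valence_to_colour <- function(v) valence_palette(100)[pmax(1, ceiling(v * 100))]

valence_to_colour(0.05)  # near magenta (low valence)
valence_to_colour(0.95)  # near cyan (high valence)
```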

Finally, interactive tools such as Plotly and Shiny invite users to explore their own questions, making the dashboard both visually engaging and user-centered. This makes it well-suited for a general audience of music lovers, designers, and data enthusiasts. 



Treemap: Popular Genres and Their Vibe:





Take a look at my interactive visualisation!

This treemap shows how emotional positivity, measured by valence, is distributed across popular music at the genre level. Each tile represents a genre, where size reflects total popularity and color shade represents average valence, with higher values indicating greater emotional positivity. Built using R and the treemapify library, the chart distils hundreds of songs into a single, scannable landscape of mood and genre. 
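The core of such a chart boils down to a small amount of ggplot2 + treemapify code. A sketch under the encoding described above (the genre names and numbers here are illustrative placeholders, not the dataset's actual values):

```r
library(ggplot2)
library(treemapify)

# Illustrative aggregated data: tile area = total popularity,
# fill = mean valence.
genre_summary <- data.frame(
  track_genre      = c("dance", "pop", "reggaeton", "edm", "alt-rock"),
  total_popularity = c(950, 900, 700, 300, 250),
  mean_valence     = c(0.80, 0.60, 0.60, 0.35, 0.30)
)

p <- ggplot(genre_summary,
            aes(area = total_popularity, fill = mean_valence,
                label = track_genre)) +
  geom_treemap() +
  geom_treemap_text(colour = "white", place = "centre") +
  scale_fill_gradient(low = "magenta", high = "cyan", name = "Mean valence")
```

Printing `p` renders the treemap.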

The color palette ranges from magenta (low valence) to cyan (high valence), aligning with the dashboard’s glow-wave aesthetic while clearly differentiating emotional tones. A few patterns worth noting: 

  • Dance music scores high on both valence and popularity, reflecting its broad emotional appeal.
  • Genres like pop and reggaeton, despite their popularity, sit at more moderate levels of emotional positivity. 
  • EDM and alt-rock appear smaller and darker, suggesting a narrower audience and a tendency toward more intense, lower-valence sounds. 

This visualization invites users to explore how genre and mood intersect, encouraging them to think critically about the emotional atmosphere different genres create and how that shapes their listening experience. 



Scatterplot: Duration vs Danceability:





Take a look at my interactive visualisation!

To further explore the relationship between musical attributes and emotional positivity, I created a scatterplot in R to study how track duration relates to both danceability and valence. The axes and encoding are as follows:

  • The x-axis represents the track duration in minutes
  • The y-axis indicates danceability, a measure of how suitable the track is for dancing
  • Each point corresponds to an individual track
  • Point color reflects valence using the dashboard's gradient, from magenta (low valence) to cyan (high valence)
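This encoding maps directly onto a ggplot2 call. A sketch, assuming the dataset's `duration_ms`, `danceability`, and `valence` columns (the rows below are illustrative, not real tracks):

```r
library(ggplot2)

# Illustrative rows; the real data frame comes from the Kaggle dataset.
tracks <- data.frame(
  duration_ms  = c(180000, 210000, 240000, 420000),
  danceability = c(0.85, 0.70, 0.60, 0.40),
  valence      = c(0.90, 0.70, 0.50, 0.20)
)

p <- ggplot(tracks,
            aes(x = duration_ms / 60000,   # convert milliseconds to minutes
                y = danceability,
                colour = valence)) +
  geom_point(size = 3) +
  scale_colour_gradient(low = "magenta", high = "cyan") +
  labs(x = "Duration (minutes)", y = "Danceability", colour = "Valence")

# plotly::ggplotly(p) would turn this into the interactive version.
```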

The scatterplot reveals that most tracks cluster in the 2.5 to 4 minute range, consistent with typical song lengths in popular music. Tracks with higher danceability also tend to show higher valence, visible through the concentration of cyan points in the upper-middle region of the plot. Longer tracks, by contrast, are less common and show greater variability in both danceability and valence. 

Notably, there is no clear linear relationship between duration and valence, suggesting that a track’s length does not inherently make it more or less positive. 



Bubble Chart: Valence and Speechiness by Track:




Take a look at my interactive visualisation!

The bubble chart explores the relationship between valence (emotional positivity) and speechiness, a measure of how much a song resembles spoken word, at the individual track level. Each circle represents a unique song, where color encodes valence using the dashboard’s gradient from magenta (lower valence, darker or sadder mood) to cyan (higher valence, happier or more euphoric tones). Bubble size corresponds to speechiness, so larger bubbles indicate tracks with more spoken-word elements, such as rap, experimental vocals, or interludes.

The chart was built in R using the packcircles library, which generates non-overlapping circular layouts based on speechiness values. Valence was then mapped to color, and selected track names were wrapped and labeled inside the largest bubbles for readability. 
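The circle layout itself might be produced along these lines (a sketch; the speechiness values are illustrative):

```r
library(packcircles)

# Illustrative speechiness values; bubble area is proportional to speechiness.
speechiness <- c(0.45, 0.30, 0.12, 0.08, 0.05)

# Non-overlapping circle layout: one row per track, with centre (x, y)
# and radius, where circle area encodes the speechiness value.
layout <- circleProgressiveLayout(speechiness, sizetype = "area")

# Polygon vertices for drawing each circle, e.g. with ggplot2::geom_polygon(),
# with valence then mapped to the fill colour.
verts <- circleLayoutVertices(layout, npoints = 50)
```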

A few patterns emerge from the data:

  • Larger magenta-toned bubbles tend to represent tracks with high speechiness and lower valence, suggesting a spoken, intense, or reflective mood. 
  • Smaller cyan-toned bubbles reflect melodic, upbeat songs with low speechiness and high valence, typically pop or dance tracks. 


Bar Chart: Comparing the Audio DNA of Three Popular Songs:





Take a look at my interactive visualisation!

This bar chart visualises the normalised and rescaled audio-feature composition of three well-known songs, “La Bachata”, “Quevedo: Bzrp Music Sessions, Vol. 52”, and “Unholy (feat. Kim Petras)”, offering a side-by-side look at what makes each track acoustically distinct. Each bar represents one song, broken into five stacked segments that show the relative contribution of the following attributes: 

1. Valence (Emotional Positivity)
2. Danceability
3. Energy
4. Speechiness
5. Tempo

To ensure a fair comparison, all features were first normalized and then rescaled so that each song’s attributes sum to 100%. This approach highlights the relative strength of each feature within a track’s overall sound rather than comparing raw values across songs. 
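The normalise-then-rescale step can be sketched in a few lines of base R. Min-max scaling is one plausible reading of the normalisation (the feature values below are illustrative, not the three songs' actual numbers):

```r
# Illustrative feature values: three songs (rows), five features (columns).
feats <- rbind(
  c(valence = 0.60, danceability = 0.80, energy = 0.60, speechiness = 0.05, tempo = 120),
  c(valence = 0.55, danceability = 0.75, energy = 0.85, speechiness = 0.10, tempo = 130),
  c(valence = 0.40, danceability = 0.70, energy = 0.70, speechiness = 0.25, tempo = 110)
)

# Step 1: normalise each feature to [0, 1] so tempo (in BPM) becomes
# comparable with the 0-1 audio features.
norm <- apply(feats, 2, function(x) (x - min(x)) / (max(x) - min(x)))

# Step 2: rescale each song's features so they sum to 100%.
shares <- norm / rowSums(norm) * 100
rowSums(shares)  # each song's segments now sum to 100
```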

The chart was built in R using ggplot2 and dplyr, with a consistent color scheme that makes each feature easily identifiable. The neon-inspired palette (magenta for valence, cyan for danceability, green for speechiness, orange for energy, and purple for tempo) reinforces the glow-wave aesthetic carried throughout the dashboard. 

A few patterns worth noting:

  • "La Bachata" scores high on valence and danceability, reflecting its romantic, rhythmic character.
  • "Quevedo: Bzrp Music Sessions, Vol. 52" leans heavily on energy and tempo, giving it a high-momentum feel.
  • "Unholy" carries the highest proportion of speechiness, pointing to its narrative, theatrical delivery and experimental tone.

Together, these comparisons invite viewers to explore how energy, mood and vocal style play out differently across tracks, revealing what gives each song its distinct sonic identity.


Dashboard:



This music dashboard offers a deeper understanding of how listening habits intersect with emotional states, favorite genres, and top artists. By tracking and visualizing patterns in music consumption, it uncovers how different moods influence what we listen to, and how music, in turn, might shape our emotional landscape. It captures not only which songs and genres are most frequently played, but also the emotional valence associated with those choices.

Designed for music lovers, casual listeners, and data enthusiasts alike, the dashboard makes it easy to reflect on personal listening trends through a data-driven lens. Whether you are curious about how your mood influences your taste, interested in which genres dominate your listening history, or simply looking to discover something new about yourself, this tool offers an interactive, visual way to connect music and emotion.
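A shiny dashboard of this kind reduces to a UI definition and a server function. The skeleton below is a hypothetical minimal version, not the project's actual code: a genre picker driving one interactive plotly chart, with placeholder data standing in for the real tracks:

```r
library(shiny)
library(plotly)

# Hypothetical minimal dashboard skeleton.
ui <- fluidPage(
  titlePanel("Sound and Sentiment"),
  sidebarLayout(
    sidebarPanel(
      selectInput("genre", "Genre", choices = c("dance", "pop", "edm"))
    ),
    mainPanel(plotlyOutput("valence_plot"))
  )
)

server <- function(input, output, session) {
  output$valence_plot <- renderPlotly({
    # In the real app: filter the track data to input$genre and plot
    # valence against another feature. Placeholder points shown here.
    plot_ly(x = c(1, 2, 3), y = c(0.4, 0.7, 0.9),
            type = "scatter", mode = "markers")
  })
}

# shinyApp(ui, server)  # uncomment to launch the dashboard locally
```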

Ultimately, the dashboard transforms passive listening into active self-awareness.


There is significant scope to expand this project by deepening the emotional analysis of music and broadening its relevance across different listener contexts. Potential directions include:

  • Tracking changes in valence, energy, or danceability over time to understand evolving emotional trends in listening habits.
  • Comparing mood-based metrics across different countries, regions, or languages to reveal cultural differences in emotional expression through music.
  • Analysing listener behaviour, such as engagement levels, skip rates, or replay frequency, in relation to a song’s emotional tone, to better understand what keeps people emotionally connected to music.
  • Gathering and analysing multiple years of personal listening data to observe long-term trends in music preferences and emotional states, and how they might shift with age, life events, or global moments (like the pandemic).

This project highlights how music analytics can offer nuanced insight into the emotional architecture of songs and shifting patterns in music consumption. By bridging personal mood with musical data, it opens new opportunities for self-reflection, audience segmentation, and even emotionally intelligent recommendation systems.

You can explore the full project, including code and interactive visuals, in my GitHub repository.