Conversation with Minsu Park & Michael Macy: Mining Spotify’s treasure trove of data for worldwide listening habits

SCIENTIFIC INQUIRER: What prompted you to investigate the listening habits of people listening to music around the world? What is music an indicator for?

MINSU PARK & MICHAEL MACY: The study of human emotion is a major topic in both clinical psychology and affective science. Until recently, research in this area has been constrained by data limitations. Previously, most researchers measured emotions based on affective display, that is, emotional expressions such as facial expressions, body postures, and vocal expressions.

To measure affective expression, researchers generally had to rely on lab observations and self-reports that suffer from reporting bias and cannot be feasibly collected at scale, especially with temporal granularity. Lab observation is also not robust to cultural variation. This is rapidly changing with the opportunity to obtain large-scale social media data.

Researchers can now observe digital traces of expressive behavior in real time, at the individual level, and on a global scale. For example, we can measure an individual’s emotional expression through sentiment analysis techniques applied to user-generated text and even track affective rhythms at the individual level, as in this seminal paper published by Golder and Macy in Science in 2011.

While previous big data research made it possible to track the dynamics of affective expression, we still did not know much about how people manage mood over time. That is because affective expression reveals the mood that an individual is experiencing, not the affective state they want to achieve.

Our interest in mood management thus led us to look at music as a ubiquitous stimulus that people actively use in their everyday lives to achieve a desired emotional state. In particular, we focused on diurnal and seasonal variations in musical arousal as indicated by listeners’ musical choices. These choices may reflect current mood as well as the desire to alter mood, and they do so at a very high level of temporal granularity and across diverse cultures and demographic groups.

Michael W. Macy (Credit: Michael W. Macy)

SI: How did you decide to work with data from Spotify? What types of tools and data do they offer?

MP: Michael and I were initially exploring the potential use of a combination of Last.fm and Twitter data. Although the Last.fm dataset was interesting, we also knew that Spotify data would be much richer in terms of size, completeness, and coverage. With the global Spotify streaming logs, we could sample users to match the population distributions in terms of age and gender, which would not be possible with the Last.fm data due to the limited number of users that we could collect. Also, the Spotify data covered 51 countries across diverse cultures, whereas in the Last.fm data only a few countries, such as the US, Canada, the UK, and Brazil, had enough data points for meaningful cross-cultural comparisons.

MM: Minsu had a chance to work at Spotify as a research scientist intern in 2017 and proposed the research to his hosts as one of his summer research projects. The Spotify team liked the proposal, and provided access to the complete streaming logs for the entire global user population, along with the computing resources needed to process massive amounts of data.

Minsu Park (Credit: Minsu Park)

SI: How granular does the Spotify data allow you to get?

MP: As we noted in the paper, we had each user’s listening/streaming logs timestamped in local time, including information about whether the song was skipped, how long it was played, and audio descriptions using 11 audio attributes (e.g., tempo, loudness, danceability, etc.). We also had the user’s self-disclosed age, gender, and country. Based on each user’s temporal activity pattern, we could also classify them into four chronotypes (e.g. morning people and night owls).

SI: How did you design your experiment?

MP: Our study design was largely based on Golder and Macy’s 2011 study of diurnal rhythms using sentiment analysis of tweets. The alignment with their design made it possible to compare rhythms of affective preference (indicated by musical choices) with affective expression (indicated by Golder and Macy’s sentiment analysis of Twitter content). This comparison of the temporal patterns yielded possible insights into how people manage mood in their everyday lives.

We also wanted to know if the temporal rhythms of affective preference follow social activities (e.g. commuting, eating, and interacting with others) or chronobiological processes in which mood is altered by neurochemicals associated with what are sometimes called circadian rhythms (such as the sleep-wake cycle).

SI: What did you discover about people’s listening behavior?

MP&MM: People prefer more relaxing music late at night and start listening to more arousing music early in the morning. The highest level of musical intensity is sustained for about 12 hours, from roughly 8am to 8pm around typical working hours, and decreases overnight. This pattern is remarkably consistent across different demographics (e.g., age, gender, and cultures), chronotypes, and different days of the week, while affective baselines differ across groups (e.g., older users prefer more relaxing music).

One of the most surprising findings is the difference in gender patterns between the Northern and Southern hemispheres. In the Northern hemisphere, women listen to music with lower intensity than do men, while in the Southern hemisphere, it is the other way around.

Another interesting finding was that night owls tend to prefer relaxing music overall, but they listen to significantly more intense music during the daytime. The daytime increase is much larger than the changes observed among the other three chronotypes. Although we did not have data with which to test possible explanations, one possibility might be that night owls need stronger stimuli to stay alert during the day.

SI: Your data was revealing in terms of when and where people listen to high intensity and relaxing music. What about music in the middle? Was mid-tempo music listened to across seasons and at all times of the day at some baseline level?

MP&MM: We measured the average intensity of all the songs a user chose during each of 168 one-hour time periods each week. So a mid-intensity score in one of those 168 time periods could indicate equal numbers of high and low intensity songs, or it could indicate large numbers of mid-intensity songs, or both.

In addition, we measured the average intensity score of all the songs in all 168 time periods and then subtracted this baseline intensity from the intensity score of each individual song. This gives us a “within-individual” measure of the change from hour to hour and day to day instead of a “between-individual” measure of the temporal changes in the number of users who prefer high-intensity or low-intensity music. We did this because we want to tease apart two very different temporal dynamics: how each individual’s affective preferences change over time, and how the psychological composition of different demographic groups changes over time.
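To make the within-individual measure concrete, here is a minimal Python sketch of the centering step described above. It is an illustration under stated assumptions, not the study's actual pipeline: the function names, the `(user_id, local_timestamp, intensity)` record layout, the choice of Monday 00:00 as bin 0, and the reading of "baseline" as the plain average over all of a user's plays are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hour_of_week(ts: datetime) -> int:
    """Map a local timestamp to one of 168 hourly bins (assumed: 0 = Monday 00:00)."""
    return ts.weekday() * 24 + ts.hour


def centered_hourly_intensity(plays):
    """Return {user: {hour_bin: mean baseline-centered intensity}}.

    `plays` is an iterable of (user_id, local_timestamp, intensity) tuples.
    This record layout is hypothetical, chosen only for the example.
    """
    by_user = defaultdict(list)
    for user, ts, intensity in plays:
        by_user[user].append((hour_of_week(ts), intensity))

    result = {}
    for user, rows in by_user.items():
        # Assumed baseline: the user's mean intensity over all of their plays.
        baseline = mean(intensity for _, intensity in rows)
        bins = defaultdict(list)
        for hour_bin, intensity in rows:
            # Center each song on the user's own baseline.
            bins[hour_bin].append(intensity - baseline)
        # Average the centered scores within each hour-of-week bin.
        result[user] = {h: mean(vals) for h, vals in bins.items()}
    return result


# Hypothetical toy data: two morning plays and one late-night play for one user.
example = [
    ("a", datetime(2019, 1, 7, 8, 15), 0.8),
    ("a", datetime(2019, 1, 7, 8, 40), 0.6),
    ("a", datetime(2019, 1, 7, 23, 5), 0.2),
]
profile = centered_hourly_intensity(example)
```

Because each user's scores are centered on that user's own baseline, a listener who always prefers relaxing music and one who always prefers intense music can show the same hour-to-hour shape, which is the point of separating within-individual change from between-individual composition.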

SI: By design, your study doesn’t deal with the causes of people’s behavior. What types of insights can be garnered from the results of your study?

MP&MM: Studies based on observational data are much better for describing patterns than for revealing the causes of those patterns. That is certainly true for our study as well. We had no opportunity to administer a survey to the users and therefore had very limited data beyond the music they choose to stream. Nevertheless, the patterns we observed do suggest some possible directions for future studies designed to test alternative explanations.

For example, as noted above, by comparing the patterns of affective expression in Golder and Macy’s study with affective preferences in the Spotify data, we can speculate as to whether people’s choice of music reflects their existing mood or the mood they are seeking to attain.

For example, late at night, affective expression and affective preference move together, and this may indicate that music reflects current mood. In the afternoon, however, the mood indicated by affective expression goes steadily downhill and hits bottom at 3pm, while affective preference remains at a high level of intensity until late in the evening. This difference may indicate that people use music as a mid-day stimulant, much the way they might use coffee or tea.

Another important insight is that the temporal patterns of affective preference tend to align with daily and seasonal activities more than with chronobiological cycles such as circadian rhythm. For example, biological differences between men and women do not depend on whether the individuals are north or south of the equator, but gender differences in preference for musical intensity do depend on location. So this change almost certainly reflects cultural differences in gender roles and not innate biological sex differences.

SI: What is the next step in terms of research?

MP&MM: We want to drill down deeper into the question of why people choose music with different levels of intensity. Is it similar to how people express emotion in what they choose to write in a Twitter message? Or do people choose music to alter their mood? To find out, Minsu built a smartwatch app that measures continuous physiological arousal. The app can then be used to see whether the emotional intensity of the music people choose matches what they are feeling when the choice is made, or what they are feeling after they have listened to the music.

Diurnal and seasonal variations in affective behavior are not limited to what people choose to write or to listen to. Future research is needed to see if the patterns we observed in music consumption are similar to other temporal patterns, such as the intensity of social interaction, the intensity of physical exertion, the consumption of spicy vs sweet tasting food, and of course libidinal intensity.

You may be wondering, “How can we possibly measure these changes hour to hour at the individual level?” Well, that is exactly how we used to respond to the idea of measuring emotional intensity of what people write and listen to. Even five years ago, it was hard to imagine that we would be able to track the emotional intensity of every song chosen by countless individuals all across the planet. But that is just what we did!

For more information about Minsu Park (@mansumansu) and Michael Macy’s research, follow the links.

IMAGE SOURCE: Creative Commons
