Taking a stab at data mining Dylan

UPDATE (June 3, 2018): About a year after initially attempting this project, I decided to take another stab at data mining Dylan. With more programming experience, especially in the world of “data science”, I wanted to try to do things in a cleaner and more sophisticated way, and produce a more interesting end product. You can view the result at data-mining-dylan.dustinmichels.com.

My goal was the same: count references to cities throughout Bob Dylan’s lyrics and make an interactive bubble map of the results. However, I made a few interesting changes. The second time around:

  1. Scraping: I did the web scraping with Scrapy instead of Beautiful Soup.
  2. Data formats: I saved the web scraped data in a structured way (JSON) instead of plain .txt files.
  3. Data processing: I did the data processing using Pandas within Jupyter Notebooks, rather than using pure Python. So much nicer!! (See code here.)
  4. Identifying cities in lyrics: I identified cities by using a simple regex to search for one or more capitalized words and then cross-referencing those words against a CSV file listing world cities. This was much faster, simpler, and more effective than my original approach of using the nltk package to do named entity recognition, and then cross-referencing that against my list of cities.
  5. Making an interactive map: Finally, for the end product, I created a custom mapping widget using JavaScript, leaflet.js, and vue.js. Previously I just uploaded a CSV of mapping data to CARTO. My tool is much better custom-tailored to this project: it lets you click on a city on the map and easily see exactly which lyrics mention that city.
Summary of the different techniques used in the data mining Dylan project, the first time vs. the second time.

I got to present my project to digital humanities scholars at Carleton College’s “Day of Digital Humanities 2018,” which was a gratifying conclusion to this independent project. (See slides here). The current version of my map is live at: data-mining-dylan.dustinmichels.com.

We know that the freewheelin’ Bob Dylan rambled and roamed all across the United States. He grew up bored and cold in the mining town of Hibbing, Minnesota. When he learned that his musical idol, Woody Guthrie, was on his death bed, he made a pilgrimage to NYC in hopes of seeing Guthrie in the hospital. Once he was in New York, Dylan hung around Greenwich Village for a while, soaking up new musical and lyrical styles from that 1960s creative hub. He recorded an album, got himself famous, and went on to travel all over the US and the world.

We know he went lots of places. But which places did he sing about? To answer that question, I made a tentative foray into text mining with Python and its web scraping and natural language processing modules, then mapped the results with Carto.com. Here’s the result, so far:

Continue reading “Taking a stab at data mining Dylan”

Travel Update

GOTHENBURG, SWEDEN — Today marks exactly 100 days of traveling in Europe. I’ve been backpacking alone, taking classes, and spending time with family. I cycled through Amsterdam and hitchhiked through the Highlands. I learned Swedish folk songs in Uppsala and sailing fundamentals in Norfolk. I swam in the Gulf of Finland (saunaed for warmth) and in the Sound of Raasay (drank whisky for warmth). I caught free jazz in Copenhagen, Dresden and Hamburg. I got to see an ecovillage in Scotland (and help out in the kitchen) and a refugee aid operation in France (and help out in the kitchen). I made a bunch of mistakes and a handful of friends. I’m grateful, grateful, grateful.

Hey, here’s a map of where I slept for the past 100 nights.

What next? Now I’m in Gothenburg, Sweden, embarking on an internship with a small, clean-tech startup company. Next term I’ll go back to school.

The Poop Map I Made

I discovered CartoDB, a free and open source web mapping tool, through a class I’m currently taking titled “Hacking the Humanities.” Upon learning about ol’ Carto and other tools for visualizing and analyzing spatial data, I developed a strong (and unfamiliar) desire to make digital maps.

My ambitions were momentarily thwarted when I realized I had no location data to map. But then came a surreal moment of total clarity, and I knew what had to be done.

Since October 16, I have painstakingly logged the GPS coordinates of my every poop using an app called GPS Logger for Android. I transmitted these time-stamped coordinates to Google Drive, and then uploaded them to CartoDB. Now, as fall term at Carleton comes to an end, it is my honor and privilege to present to you the results of my labor: a gorgeous and interactive poop map!

Continue reading “The Poop Map I Made”