UPDATE (June 3, 2018): About a year after initially attempting this project, I decided to take another stab at data mining Dylan. With more programming experience, especially in the world of “data science”, I wanted to try to do things in a cleaner and more sophisticated way, and produce a more interesting end product. You can view the result at data-mining-dylan.dustinmichels.com.
My goal was the same: count references to cities throughout Bob Dylan’s lyrics and make an interactive bubble map of the results. However, I made a few interesting changes. The second time around:
- Scraping: I did the web scraping with Scrapy instead of Beautiful Soup.
- Data formats: I saved the web-scraped data in a structured format (JSON) instead of plain .txt files.
- Data processing: I did the data processing using Pandas within Jupyter Notebooks, rather than using pure Python. So much nicer!! (See code here.)
- Identifying cities in lyrics: I identified cities by using a simple regex to search for one or more capitalized words and then cross-referencing those words against a CSV file listing world cities. This was much faster, simpler, and more effective than my original approach of using the nltk package to do named entity recognition and then cross-referencing the results against my list of cities.
- Making an interactive map: Finally, for the end product, I created a custom mapping widget using JavaScript, Leaflet.js, and Vue.js. Previously, I just uploaded a CSV of mapping data to CARTO. My tool is much better tailored to this project: it lets you click on a city on the map and easily see exactly which lyrics mention that city.
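To give a rough idea of the regex-and-lookup approach, here is a minimal sketch. It is not the code from my notebooks: the function name `find_cities`, the sample text, and the tiny hard-coded set of cities are all made up for illustration (the real project read cities from a CSV file and did the processing in Pandas).

```python
import re
from collections import Counter

# Hypothetical stand-in for the CSV of world cities (assumption, not the real list).
KNOWN_CITIES = {"New York", "Hibbing", "New Orleans"}

# A run of one or more capitalized words, e.g. "Hibbing" or "Leaving New York".
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def find_cities(lyrics, known_cities):
    """Return every known city mentioned in the text, in order of appearance."""
    found = []
    for match in CAPITALIZED_RUN.finditer(lyrics):
        words = match.group().split()
        i = 0
        while i < len(words):
            # Try the longest phrase starting at word i first, so that
            # "Leaving New York" still yields "New York".
            for j in range(len(words), i, -1):
                phrase = " ".join(words[i:j])
                if phrase in known_cities:
                    found.append(phrase)
                    i = j
                    break
            else:
                i += 1
    return found


# Made-up sample text, not an actual lyric.
sample = "Leaving New York for Hibbing, then on to New Orleans. Back in New York again."
counts = Counter(find_cities(sample, KNOWN_CITIES))
print(counts.most_common())
```

The obvious weakness of this kind of matching is false positives: any capitalized word that happens to share a name with a city somewhere in the world gets counted, so the results still need a sanity check by hand.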
I got to present my project to digital humanities scholars at Carleton College’s “Day of Digital Humanities 2018,” which was a gratifying conclusion to this independent project. (See slides here.) The current version of my map is live at data-mining-dylan.dustinmichels.com.
We know that the freewheelin’ Bob Dylan rambled and roamed all across the United States. He grew up bored and cold in the mining town of Hibbing, Minnesota. When he learned that his musical idol, Woody Guthrie, was on his deathbed, he made a pilgrimage to NYC in hopes of seeing Guthrie in the hospital. Once he was in New York, Dylan hung around Greenwich Village for a while, soaking up new musical and lyrical styles from that 1960s creative hub. He recorded an album, got himself famous, and went on to travel all over the US and the world.
We know he went lots of places. But which places did he sing about? To answer that question, I made a tentative foray into text mining with Python and its web scraping and natural language processing modules, then mapped the results with Carto.com. Here’s the result so far: