Dropbox Files Word Cloud

In one of my previous posts I walked through how I generated a word cloud based on my 20 most recent tweets. I thought it would be neat to do the same for my Dropbox file names, just to see if I could.

When I first tried this (as I mentioned before, the Twitter Word Cloud post was the first Python script I wrote), I ran into some difficulties. I didn't really understand what I was doing. I still don't fully understand, but I at least have a vague idea of what the heck I'm doing now.

The script isn't much different from the Twitter word cloud. The only real differences are:

the way in which the words variable is being populated
the mask that I'm using to display the cloud

To get the information from the file system I use the glob module:

import glob

The next lines have not changed:

import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS
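# Note: scipy.misc.imread was removed in SciPy 1.2; its deprecation notice points to imageio.imread instead
from scipy.misc import imread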

Instead of writing to a 'tweets' file, I loop through the paths, split each one at the / character, take the last item (i.e. the file name), and append it to the list f:

f = []
for filename in glob.glob('/Users/Ryan/Dropbox/Ryan/**/*', recursive=True):
    f.append(filename.split('/')[-1])
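
The loop above splits on '/', which works for macOS-style paths. A slightly more portable variant, sketched here under a few assumptions, uses os.path.basename and skips directories, since the recursive ** pattern returns those as well as files. The function name collect_file_names and its root argument are hypothetical and not part of the original script.

```python
import glob
import os


def collect_file_names(root):
    """Return the base names of all regular files under root (hypothetical helper)."""
    pattern = os.path.join(root, '**', '*')
    names = []
    for path in glob.glob(pattern, recursive=True):
        # The recursive pattern matches directories too; keep only files
        if os.path.isfile(path):
            names.append(os.path.basename(path))
    return names
```

With this in place, the list could be built with f = collect_file_names('/Users/Ryan/Dropbox/Ryan'). Whether directory names should count toward the cloud is a judgment call; the original loop includes them.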

The rest of the script generates the image and saves it to my Dropbox account. Again, instead of using a Twitter logo, I'm using a cloud image I found here.

# Join the file names with spaces so neighboring names don't run together
words = ' '.join(f)

stopwords = {'https'}

logomask = imread('mask-cloud.png')

wordcloud = WordCloud(
    font_path='/Users/Ryan/Library/Fonts/Inconsolata.otf',
    stopwords=STOPWORDS.union(stopwords),
    background_color='white',
    mask=logomask,
    max_words=1000,
    width=1800,
    height=1400
).generate(words)

plt.imshow(wordcloud.recolor(color_func=None, random_state=3))
plt.axis('off')
plt.savefig('/Users/Ryan/Dropbox/Ryan/Post Images/dropbox_wordcloud.png', dpi=300)
plt.show()

And we get this:

Word Cloud
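
Before rendering, it can help to peek at which words are likely to dominate the cloud. The sketch below is only an approximation and is not how the wordcloud library counts words: it splits file names with a simple regular expression chosen for illustration, and the helper name top_words is hypothetical.

```python
import re
from collections import Counter


def top_words(file_names, n=10, stopwords=frozenset()):
    """Count word-like tokens across file names (hypothetical helper)."""
    counts = Counter()
    for name in file_names:
        # Split on anything that isn't a letter or digit, e.g. '.', '_', '-'
        for token in re.findall(r'[A-Za-z0-9]+', name.lower()):
            if token not in stopwords:
                counts[token] += 1
    return counts.most_common(n)
```

Calling print(top_words(f, n=20)) on the list of file names gives a rough preview. File extensions such as pdf or png may well show up near the top, in which case they can be added to the stopwords set passed to WordCloud.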