Word Cloud

Skills: Python, Jupyter Notebook

Objective + Process

This is a Python script I wrote that generates a word cloud from a text file.

The objective of this project was to generate a word cloud that shows only the most frequently used words in a given text, with punctuation and uninteresting words filtered out. As the input, I chose a text file of the novel Alice in Wonderland by Lewis Carroll.

After initializing the necessary variables, I processed the text to remove punctuation: I checked each character in the file against a string of punctuation symbols and appended every character that was not in that string to an initially empty string. Next, I checked each word to make sure that it consisted only of alphabetic characters (no numbers or symbols) and that it was not in a predefined list of uninteresting words (such as “the”, “and”, “it”, etc.). Each word that satisfied both conditions was appended to a list. Finally, I went through that list and added each word to a dictionary, counting its frequency along the way. Once the dictionary was complete, I generated a word cloud from the words with the highest frequencies.
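
The steps above can be sketched roughly as follows. This is a minimal illustration rather than the original script: the names count_words, render_cloud and UNINTERESTING_WORDS are hypothetical, the stop-word list is a small stand-in for the real one, lowercasing words before counting is an added assumption, and the rendering step assumes the third-party wordcloud package, which the write-up does not name.

```python
import string

# Hypothetical stand-in for the predefined list of uninteresting words.
UNINTERESTING_WORDS = {"the", "and", "it", "a", "to", "of", "in", "was", "she"}


def count_words(text, uninteresting=UNINTERESTING_WORDS):
    """Return a dict mapping each interesting word to its frequency."""
    # Step 1: keep every character that is not a punctuation symbol.
    cleaned = ""
    for char in text:
        if char not in string.punctuation:
            cleaned += char

    # Step 2: keep words that are purely alphabetic and not uninteresting.
    # Lowercasing is an assumption so that "Alice" and "alice" count together.
    words = []
    for word in cleaned.split():
        if word.isalpha() and word.lower() not in uninteresting:
            words.append(word.lower())

    # Step 3: tally the frequency of each remaining word in a dictionary.
    frequencies = {}
    for word in words:
        frequencies[word] = frequencies.get(word, 0) + 1
    return frequencies


def render_cloud(frequencies, path="cloud.png"):
    """Draw the cloud; assumes the third-party 'wordcloud' package is installed."""
    from wordcloud import WordCloud  # imported lazily so counting works without it

    cloud = WordCloud().generate_from_frequencies(frequencies)
    cloud.to_file(path)
```

To try it on a whole book, read the file into a string (for example with open("alice.txt").read(), where the filename is a placeholder), pass the string to count_words, and pass the resulting dictionary to render_cloud. Note that string.punctuation only covers ASCII symbols, so words attached to curly quotes fail the alphabetic check and are dropped rather than cleaned.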