Applying Natural Language Processing to a Twitter Project

SGO48 is a girl group trained in dancing and singing by the Japanese entertainment company behind the best-selling group AKB48. SGO48's official Facebook page had garnered 90,000 likes as of May 2019, and the count has kept growing since the group's inception in December 2018. I enjoy SGO48's music and refreshing energy, which is why I run the Twitter account @SGO48ENG to promote them and translate their content for international fans. Using Twitter Analytics, I was able to gain insights into my tweets and audience. Some questions I posed were: Who is the audience (gender, interests...)? On which day of the week do my tweets gain the most impressions? I discovered that tweets associated with a sister group (such as the tweet with #MNL48 in the picture above) gained many more impressions. Tweets written in CAPITAL letters were also more popular. According to Scrunch, engagement rates between 0.02% and 0.09% are considered good, and I have exceeded that.

Natural Language Processing (NLP) is an artificial intelligence technology that takes human language as input and creates useful representations of it. With the help of toolkits such as NLTK, TextBlob, and Tweepy, I was able to tokenize text (word segmentation) and count the most frequent words and the unique words (with NLTK's FreqDist). This is useful in marketing because it shows which words and hashtags appear most often in a set of tweets.
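
A minimal sketch of the tokenize-and-count step follows. It is an illustration rather than the exact script from this project: it uses a simple regular expression and the standard library's collections.Counter as stand-ins for NLTK's tokenizer and FreqDist, and the sample tweets are made up.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word, hashtag, and mention tokens."""
    return re.findall(r"[#@]?\w+", text.lower())


def word_frequencies(tweets):
    """Count how often each token appears across a list of tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tokenize(tweet))
    return counts


# Hypothetical sample tweets, for illustration only.
sample_tweets = [
    "SGO48 releases a new single! #SGO48 #Vpop",
    "New dance practice video from SGO48 #SGO48",
]

freq = word_frequencies(sample_tweets)
top_words = freq.most_common(3)  # the three most frequent tokens with their counts
unique_words = len(freq)         # number of distinct tokens
```

Hashtags and mentions are kept as their own tokens here, so "#sgo48" and "sgo48" are counted separately; whether that is desirable depends on the question being asked.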

In Spyder, launched from Anaconda Navigator, I imported the Python packages TextBlob and Tweepy to analyze the sentiment of tweets associated with a certain keyword. I wanted to see how #Salesforce is mentioned on Twitter. In the bottom right, the Polarity indicator measures sentiment on a scale from -1 to 1: the more negative the number, the more negative the tweet, and vice versa. On the Subjectivity scale, which runs from 0 to 1, the bigger the number, the more subjective and opinion-based the tweet is; a number near 0 means the tweet is more objective and fact-based.
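
As a rough sketch of how these two scores can be read, the helper below turns TextBlob-style numbers into labels. In TextBlob, polarity runs from -1.0 to 1.0 and subjectivity from 0.0 (objective) to 1.0 (subjective). The function names and the cut-off values are assumptions made for illustration, not part of TextBlob or of my original script.

```python
def describe_sentiment(polarity, subjectivity):
    """Turn TextBlob-style scores into readable labels.

    polarity ranges from -1.0 (negative) to 1.0 (positive);
    subjectivity ranges from 0.0 (objective) to 1.0 (subjective).
    The cut-off values below are assumptions chosen for illustration.
    """
    if polarity > 0.1:
        tone = "positive"
    elif polarity < -0.1:
        tone = "negative"
    else:
        tone = "neutral"
    style = "subjective" if subjectivity >= 0.5 else "objective"
    return tone, style


def score_tweet(text):
    """Score one tweet with TextBlob (requires the textblob package)."""
    # Imported lazily so describe_sentiment works without textblob installed.
    from textblob import TextBlob

    sentiment = TextBlob(text).sentiment  # named tuple: (polarity, subjectivity)
    return describe_sentiment(sentiment.polarity, sentiment.subjectivity)
```

With textblob installed, score_tweet can be applied to each tweet text returned by a keyword search; fetching those tweets with Tweepy needs API credentials and is left out of this sketch.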

By analyzing the keyword #SGO48, I discovered that the tweets featured in the "Top Tweet" section were highly emotional and scored low on subjectivity. I used this insight to put more emotion into my tweets, since most of my posts are fact-based. If you search #sgo48 and #Vpop on Twitter, the top tweets for those keywords come from my account.

I generated the Excel dashboard of the tweets

After I exported the tweets of May 2019 to an Excel workbook file, I imported them into Tableau Desktop. Using Tableau treemaps and packed bubble charts, I compared four important indicators: User Profile Clicks (who clicked the profile @SGO48ENG to visit the page), Impressions (who saw my tweets), Engagements (who ACTUALLY did something with the tweets), and Hashtag Clicks (who clicked on the important hashtag #SGO48). I then presented some conclusions with the Story tool.

I applied a "% of Total" calculation to each tweet. If I click on a square (the deeper the color, the more engagement), I see what percentage of the total that tweet accounts for. I also used Filter and Highlight actions: when I click on a tweet, its share of another metric is highlighted.
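
The arithmetic behind a "% of Total" view is simple: each tweet's value divided by the sum over all tweets. Here is a minimal sketch in Python; the function name and the engagement counts are hypothetical, and Tableau performs this as a built-in table calculation rather than with code like this.

```python
def percent_of_total(values):
    """Return each value's share of the sum of all values, in percent."""
    total = sum(values.values())
    if total == 0:
        # Avoid dividing by zero when there is no engagement at all.
        return {key: 0.0 for key in values}
    return {key: 100.0 * value / total for key, value in values.items()}


# Hypothetical engagement counts per tweet, for illustration only.
engagements = {"tweet_a": 50, "tweet_b": 30, "tweet_c": 20}
shares = percent_of_total(engagements)
```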

Using Advanced Excel to format and compute data
I completed a course on Lynda called "Excel for Marketers," where I learned about functions and shortcuts. I also applied Excel pivot tables in an undergraduate Criminal Justice class at Temple University.
Some key metrics I was exposed to:
Cost per Mille
Cost per Click
Cost per Acquisition
Click-through Rate
Conversion Rate
Revenue per Click
Average Order Value
Return on Investment
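
These metrics are all simple ratios of campaign totals. The sketch below uses their commonly cited definitions; the function name and the sample figures are hypothetical, and it assumes one order per conversion so that Average Order Value can be derived from the same inputs.

```python
def marketing_metrics(cost, impressions, clicks, conversions, revenue):
    """Compute common campaign metrics from raw totals.

    Assumes every count is greater than zero and that each conversion
    corresponds to one order (so orders == conversions).
    """
    return {
        "cpm": cost / impressions * 1000,         # Cost per Mille (1,000 impressions)
        "cpc": cost / clicks,                     # Cost per Click
        "cpa": cost / conversions,                # Cost per Acquisition
        "ctr": clicks / impressions,              # Click-through Rate
        "conversion_rate": conversions / clicks,  # Conversion Rate
        "rpc": revenue / clicks,                  # Revenue per Click
        "aov": revenue / conversions,             # Average Order Value
        "roi": (revenue - cost) / cost,           # Return on Investment
    }


# Hypothetical campaign totals, for illustration only.
campaign = marketing_metrics(
    cost=500.0, impressions=100_000, clicks=2_000, conversions=100, revenue=1_500.0
)
```

The rates come back as fractions (0.02 means 2%); multiply by 100 to display them as percentages.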
Here are sample screenshots of my work:

How to work with a marketing dashboard

How to calculate the duration of an A/B Test
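
One common way to estimate the duration is to divide the total sample size the test needs by the daily traffic entering it. The sketch below assumes that approach, with an even traffic split and a required sample size per variant that is already known; the function name and the numbers are hypothetical and may differ from the spreadsheet in the screenshot.

```python
import math


def ab_test_duration_days(sample_size_per_variant, num_variants, daily_visitors):
    """Estimate how many days a test must run to reach its target sample size.

    Assumes traffic is split evenly across variants and that
    daily_visitors counts everyone entering the test each day.
    """
    if daily_visitors <= 0:
        raise ValueError("daily_visitors must be positive")
    total_needed = sample_size_per_variant * num_variants
    # Round up: a partial day still requires running the test that day.
    return math.ceil(total_needed / daily_visitors)


# Hypothetical inputs, for illustration only.
days = ab_test_duration_days(
    sample_size_per_variant=5_000, num_variants=2, daily_visitors=800
)
```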

How to use vlookup to join datasets
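
For comparison, the same kind of exact-match lookup join can be sketched in Python with a dictionary. The function name, column names, and data below are hypothetical, chosen only to illustrate the idea behind VLOOKUP.

```python
def vlookup_join(rows, lookup_table, key, new_column, default=None):
    """Add new_column to each row by looking up row[key] in lookup_table.

    Works like VLOOKUP with exact matching: a missing key yields
    `default` where Excel would show #N/A.
    """
    joined = []
    for row in rows:
        merged = dict(row)  # copy so the input rows stay unchanged
        merged[new_column] = lookup_table.get(row[key], default)
        joined.append(merged)
    return joined


# Hypothetical data, for illustration only.
orders = [
    {"order_id": 1, "campaign_id": "A"},
    {"order_id": 2, "campaign_id": "B"},
    {"order_id": 3, "campaign_id": "Z"},
]
campaign_names = {"A": "Spring Sale", "B": "Newsletter"}

result = vlookup_join(
    orders, campaign_names, key="campaign_id", new_column="campaign_name"
)
```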

Crime Report with Excel Pivot Tables and Government Proprietary Data
In my upper-level Criminal Justice Honors class, I completed a semester-long analytical project investigating crimes in the Northern Philadelphia area, using Excel pivot tables and government data from Simply Analytics. The project also entailed field study and observation. I created graphs and charts, identified patterns of crime, and explained the elements of those crimes using demographic data such as high school drop-out rate, income, and male-to-female ratio. Full report (22 pages): click here.
Result: I received a score of 99.5/100 from the professor on the final presentation of this project's findings.

A map of my crime study site, which I generated. Each color represents a type of crime and each dot is one instance of that crime.

Two screenshots showing the grand totals of the crime data and other information (police dispatch time, location of the crime, etc.).


I also tinkered a bit with R and exploratory analysis. I was able to generate a heatmap from a dataset of airline tickets: the redder the color, the more frequently that ticket is bought. This is useful in marketing for understanding what customers are looking for.
