Introduction

Hi, I am a senior data science student from Tokyo, Japan. I am majoring in data science because I see intersections between quantitative methods and social justice. When I started my major, I had no knowledge of programming or statistics. I still have much more to learn in those fields, but I am grateful for the knowledge I’ve gained over the last three years. Here, I introduce my senior project on topic modeling with web-scraped data.

Abstract

This project aims to compare the topics covered and the tone of the words used across different online media sources.
First, the project collects news articles from the web by web scraping, extracts the topics and keywords of each article through unsupervised machine learning, and visualizes how the topics relate to each other and to their keywords. Second, using the web-scraped data, it conducts sentiment analysis on articles that concern specific topics. Once the sentiment scores are computed, the project illustrates the daily changes in sentiment. Finally, it explores the possible correlation between changes in a topic’s sentiment score and changes in stock prices.
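
To make the last steps more concrete, here is a minimal Python sketch of scoring articles, averaging the scores per day, and correlating daily sentiment changes with daily price changes. It is not the project's actual code. The tiny word lists, the function names, the sample articles, and the sample prices are all made up for illustration, and the simple word-count score stands in for whatever sentiment model a real analysis would use.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy lexicon (assumption): a real analysis would use a full
# sentiment lexicon or a trained model instead of these few words.
POSITIVE = {"gain", "growth", "strong", "rise"}
NEGATIVE = {"loss", "decline", "weak", "fall"}


def sentiment_score(text):
    """Return (positive - negative) / matched words, or 0.0 if nothing matches."""
    words = [w.strip(".,!?;:") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0


def daily_mean_sentiment(articles):
    """Average the scores of (date, text) pairs per date, in date order."""
    by_day = defaultdict(list)
    for date, text in articles:
        by_day[date].append(sentiment_score(text))
    return {day: sum(s) / len(s) for day, s in sorted(by_day.items())}


def daily_changes(series):
    """Return the day-over-day differences of a list of values."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined here; 0.0 is a convention of this sketch
    return cov / (sx * sy)


# Made-up sample data (assumption): articles on one topic and closing prices
# for the same four days.
articles = [
    ("2021-03-01", "Strong growth reported"),
    ("2021-03-01", "Shares fall after loss"),
    ("2021-03-02", "Profits rise on strong demand"),
    ("2021-03-03", "Weak outlook and decline in sales"),
    ("2021-03-04", "Gain follows earlier loss"),
]
prices = [100.0, 102.0, 99.0, 100.0]

daily = daily_mean_sentiment(articles)
sentiment_delta = daily_changes(list(daily.values()))
price_delta = daily_changes(prices)
print("correlation:", pearson(sentiment_delta, price_delta))
```

Correlating the day-over-day changes rather than the raw levels matches the question asked above, which is whether sentiment and prices move together. With only a handful of days, as in this toy data, the resulting number says very little, so a real analysis would need a much longer series.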

Data Architecture Diagram

Presentation Poster