Since my project involves a significant part that’s marketing, I was advised by my instructor to talk to Seth and other professors about how I should approach a dataset. After talking to them, I have decided that a good approach would be creating a dataset using the readability formulas. First I will calculate the average readability and then filter the dataset using that average readability. A marketing dataset has been extremely hard to find, but asking around has led me to the Groupon API – it lets me get 100 deals per second which will help me easily scrape millions of deals in a few days. I plan to run a script in the background that does it. Since last week, I have also successfully implemented word2vec using Genism – a python library.