I have been working on my final proposal this week. I will post it after the deadline for the assignment.
This week I have worked on the presentation. That’s it.
This week I have not been able to do much progress. I have decided to scrap the idea to use Machine Learning in the module for altering images, due to difficulty in implementation. Besides that, I have worked on the second draft of my proposal.
I have been continuing learning machine learning with Python, specifically PyTorch. I started with PyTorch because it has a less steep learning curve compared to Tensorflow (the alternative). However, there are more tutorials for Tensorflow and I might pivot next semester as the image processing gets more complicated and I need more resources in incorporating image processing into the machine learning. I think I will be able to build both a ResNet and AlexNet algorithm and compare them to decide which one to use. I have also written the code for video editing in Python to convert the input video into frames. For this task I am using OpenCV. It is straightforward to do this. I have not yet decided how many frames I will take in the first round.
I finally received the Gourmet Dataset. In fact, I received a devised version that has twice as many images as the original one. I also have the Yelp dataset, although that dataset has not be curated by humans, I am hoping to use it for training my algorithm in addition/instead of ImageNet or AVA.
Since I already have gotten access to the datasets, I have been reading about ResNet/AlexNet implementations, which was my goal for next week.
I have been looking more into the image processing part. I have created my first draft of the code to alter the colors of an image. I have also looked into the rotating of the food in the image. This seems not doable (in the way and scope I wanted to), so I changed my framework to take a video as an input, instead of an image. The video can then be split into images, and the images from the better angels will be picked. I have also written to the Gourmet Food Dataset researchers to ask for their dataset, but have not received a reply yet. I have been looking at the yelp dataset. I have found an online project that assumed all images taken with DSLR cameras were good, and the rest wasn’t. This seems to have worked pretty well for the classifying. I will look into that.
During my weakly meeting with Igor, he brought to my attention a better way to increment the ranking algorithm. In the first round, a certain number of image processing techniques will be applied to the original image and the top 10 or so images will be passed on to the next round. For each round after, permutations of the image processing techniques will be applied to the images, and the next 10 winners will be promoted to the next round. This way, we can keep applying several techniques on the images, and find the best combination. The process would go on until either the machine has a confidence interval beyond a certain threshold, or a certain number of rounds have passed. The latter is important so the machine does not keep going for ever (or for too long) if the image is simply to bad be made decent. This brings me to the question of what to do if no food is found in the image. Should it return an error, or maybe apply the process to the image and see how it turns out? It is possible that the user submitted an image that contains an unusually morphed food, which the AI might not recognize as food, but still be able to make look good.
I also have heard about genetic algorithms, and will look into those as a safety net/supplement.
While preparing my diagram for the quiz, I got a much better conceptual understanding of what I want my project to look like. I have also found nice papers this week. I started thinking of a few different image processing techniques that might help with making the image, and picked a computer vision algorithm for my AI (AlexNet) . I also decided on an image ranking algorithm to decide which image to return, a binary comparison. I feel significantly more comfortable about my project now that I have a more concrete idea for my software architecture, even though I am still fuzzy on the implementation details/
I am going with the auto-image processing for food. I have been looking more into the literature, and there is limited literature that deals with making food look better, and they are all relatively recent publications. However, there is literature about how to train machines to assess how aesthetically pleasing images of food are, and also on how to make regular images more aesthetically pleasing.
I will start experimenting with different machine learning algorithms, and also image processing techniques.
I am pretty certain at this point that I want to do something food and image processing related. This evolved out of my first idea (MyOrder), but is not quite the same.
I want to do some sort of auto image processing to make food look better, and provide an interface/API to use the service. I have found a very large dataset that could help with that, albeit unlabeled. There also is research done on labeling food images as good-loking/not-good-lookingm, which I could use to label the food.
I have two other ideas I will consider (as long as I am still allowed to consider). The first is an app that can tell the ingredients of food and/or some kind of shazaam for food that tells what dish a certain dish is. The other is an app that counts calories by looking at pictures of the food you eat.
After talking to Igor, I realized I could incorporate a food recommendation system to my food ordering app. After thinking some more, I realized the food recommender might actually be more interesting than the food ordering system.
I have read some literature on all three of my ideas, and was pleasantly surprised to see that there was a lot of academic research potential in the ideas that I thought were a little dry in research, which was my main concern with my ideas. The most exciting idea I got was adding a food recommendation system for my food ordering app, where the user can get either a healthy or a tasty recommendation depending on their preferences.
For my second idea, I realized I need to research the intersection of human-computer interaction and pedagogy.
For my third idea, there is a lot to research about indexing a large number of text documents, where the corpora grows, and there is a space constraint. I can also look into text summarization techniques and metadata extraction from the documents.
My first idea stays the same, making an online food ordering system that works over wifi. I need to think about how to differentiate my product, what is new with this, and how to test it.
A web platform aimed at helping people learn/practice math. Using a template, it would generate math questions, and allow people to try to solve them. It would also provide step by step solutions for people. This would be especially applicable to calculus. There are APIs that help with this, at least some open source.
A time tracker app for TAs, who are currently logging in their hours on a Google spreadsheet. This site could allow TAs to clock-in and clock-out, take attendance, and also show them how much they worked in each pay period. The app would also update the currently existing Google sheet with the new logs. This can also be packaged in a way that others can also use it.