I ironed out the final kinks and now have a fully functional version.
This last week I tried to connect my C++ code with my Python code. I failed because I couldn’t convert from cv::Mat to a NumPy array. There were several problems with that. I will instead save the output of my C++ code as a JPG file and read that file in Python. This is still considerably faster than using Python for the image processing part.
I finished the first draft of the poster. While doing that, I needed to get the accuracies for my AI model and my image processing algorithm overall.
My AI accuracy was around 90% for the testing set. However, I realized my initial idea for measuring image processing accuracy was flawed. I have been working on some new, improved ideas.
I am having trouble with binding the C++ code to Python. Besides that, pretty much everything is finished.
This week was (and will be) spent working on the poster.
I am struggling with linking my C++ code and Python code. The conversion from a NumPy array to cv::Mat and back is the problem. There are libraries to help with that, but I have not managed to set them up.
I parallelized some of the Python code.
I also rewrote the color processing code in C++. The C++ code for individually changing each pixel is 600 times faster than the Python version.
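For reference, a per-pixel color change of this kind might look like the sketch below in pure Python. The function name `adjust_pixels` and the `gain` and `offset` parameters are hypothetical, and the image is assumed to be a nested list of RGB tuples rather than the project's actual data structure. Every pixel passes through the interpreter one at a time, which is the pattern that tends to benefit most from a compiled rewrite.

```python
def adjust_pixels(image, gain=1.2, offset=0):
    """Return a new image with every channel scaled by `gain` and shifted by `offset`.

    `image` is assumed to be a list of rows, each row a list of (r, g, b) tuples
    with channel values in 0..255. Results are clamped to that range.
    """
    result = []
    for row in image:
        new_row = []
        for pixel in row:
            new_pixel = tuple(
                max(0, min(255, int(round(channel * gain + offset))))
                for channel in pixel
            )
            new_row.append(new_pixel)
        result.append(new_row)
    return result


if __name__ == "__main__":
    tiny = [[(100, 150, 200), (0, 0, 0)], [(255, 255, 255), (10, 20, 30)]]
    print(adjust_pixels(tiny, gain=1.5))
```

The loop body is trivial, so almost all of the running time in a version like this is interpreter overhead per pixel.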
I read up on how to make images of food look better and added more algorithms. I want to improve them and add more. They are currently not hooked up to the rest of the code.
Next I need to link the C++ code to my Python code, and find a way to generate meaningful variables to pass to the processing algorithms. I might rewrite and benchmark the cropping-and-retargeting code in C++ as well.
I fixed the problem where all returned images look the same. I started looking into how to parallelize the code and learning photography techniques to make images better.
I have also made a more presentable diagram, which I will also use in my paper and poster.
I did not manage to get the foreground (the food) with zero user interaction. I have achieved pretty good success at picking just the food with minimal user interaction. I will try it with a pure white background next.
I can now successfully manipulate individual contour areas.
For next week, I will (finally) work on learning more about food photography.
I have most of the standard functionality working. Pretty much the only untouched coding is the resizing.
I am trying to successfully detect the foreground object in the image, which will be the food. While the initial idea was to use AI to detect the food in the image, this is not a completely solved problem yet. Since I can choose my own input, detecting the foreground instead makes more sense. Charlie recommended a piece of software called ImageMagick, and I will look at that.
For next week, I want to finish the foreground detection, start resizing the images, and pass them to the ranking AI.
I finished the ranking module. It takes a folder of images, converts them to an array, and passes the n best images to another function, which processes them further. The best n are then picked again for the next round of processing. There is no processing yet.
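The rank, process, rank-again loop described here can be sketched as follows. This is only an illustration under assumptions: the names `list_images`, `best_n`, and `rank_in_rounds` are hypothetical, `score` stands in for the ranking AI, and `process` is a placeholder that can simply return its input while no processing exists yet.

```python
import heapq
from pathlib import Path

# Assumed set of image file extensions; adjust to the actual input folder.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_images(folder):
    """Return the image files in `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )


def best_n(candidates, score, n):
    """Return the n highest-scoring candidates, best first."""
    return heapq.nlargest(n, candidates, key=score)


def rank_in_rounds(candidates, score, process, n, rounds):
    """Keep the n best, process them, and keep the n best again, `rounds` times."""
    survivors = best_n(candidates, score, n)
    for _ in range(rounds):
        processed = [process(item) for item in survivors]
        survivors = best_n(processed, score, n)
    return survivors
```

With an identity function as `process`, the loop just returns the same top n every round, which matches the state of the module before any processing is added.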
For the processing, I have started working on genetic pixel changes. I am reading the pixels into an array and changing each individual pixel. While it is (sort of) a genetic algorithm, I want the changes to be a little more intentional.
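A minimal sketch of what "sort of genetic" pixel changing could mean: random mutation of pixels followed by selection of the fittest variant, with no crossover. Everything named here (`mutate`, `evolve`, `rate`, `max_step`, and the `fitness` callback standing in for the ranking AI) is hypothetical, and the image is assumed to be a nested list of RGB tuples.

```python
import random


def mutate(image, rate=0.1, max_step=20, rng=None):
    """Return a copy of `image` where each pixel is randomly nudged with probability `rate`."""
    rng = rng or random.Random()
    mutated = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < rate:
                # Shift each channel by a small random amount, clamped to 0..255.
                pixel = tuple(
                    max(0, min(255, channel + rng.randint(-max_step, max_step)))
                    for channel in pixel
                )
            new_row.append(pixel)
        mutated.append(new_row)
    return mutated


def evolve(image, fitness, generations=10, children=8, rng=None):
    """Repeatedly mutate the current best image and keep the fittest variant."""
    rng = rng or random.Random()
    best = image
    for _ in range(generations):
        # The current best stays in the pool, so fitness never decreases.
        candidates = [best] + [mutate(best, rng=rng) for _ in range(children)]
        best = max(candidates, key=fitness)
    return best
```

Purely random mutations like these are undirected. Making the changes more intentional would mean replacing `mutate` with operations that have a photographic meaning, such as adjusting saturation or contrast.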
I have started working on passing images to the ranking algorithm.
I also have found some online food-photography courses I want to look at. Learning that will be helpful in knowing how to improve my images.
I have spent some time thinking about how to split up the timeline into more detail. I have met with Charlie, and decided that the program should take a batch of images as input rather than a video. The next step is to learn more about the photography aspect of things.
It was the first week, and we had our presentations. I also found an advisor. Everything is swell.
I have been working on my final proposal this week. I will post it after the deadline for the assignment.
This week I have worked on the presentation. That’s it.
This week I have not been able to make much progress. I have decided to scrap the idea of using machine learning in the module for altering images, due to difficulty in implementation. Besides that, I have worked on the second draft of my proposal.
I have continued learning machine learning with Python, specifically PyTorch. I started with PyTorch because it has a less steep learning curve than TensorFlow (the alternative). However, there are more tutorials for TensorFlow, and I might pivot next semester as the image processing gets more complicated and I need more resources on incorporating image processing into the machine learning. I think I will be able to build both a ResNet and an AlexNet model and compare them to decide which one to use. I have also written the Python code that converts the input video into frames. For this task I am using OpenCV, and it is straightforward to do. I have not yet decided how many frames I will take in the first round.
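A minimal sketch of splitting a video into frames with OpenCV, assuming a fixed number of evenly spaced frames is wanted. The helper names `frame_indices` and `extract_frames` and the even-spacing strategy are assumptions, not the project's actual code. The `cv2` import sits inside `extract_frames` so that the index helper can be used on its own.

```python
def frame_indices(total_frames, wanted):
    """Pick up to `wanted` evenly spaced frame indices from `total_frames` frames."""
    if total_frames <= 0 or wanted <= 0:
        return []
    wanted = min(wanted, total_frames)
    step = total_frames / wanted
    return [int(i * step) for i in range(wanted)]


def extract_frames(video_path, wanted):
    """Read `wanted` evenly spaced frames from the video as NumPy arrays (BGR)."""
    import cv2  # requires the opencv-python package

    capture = cv2.VideoCapture(video_path)
    if not capture.isOpened():
        raise IOError(f"could not open video: {video_path}")
    try:
        total = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
        frames = []
        for index in frame_indices(total, wanted):
            # Seek to the chosen frame, then decode it.
            capture.set(cv2.CAP_PROP_POS_FRAMES, index)
            ok, frame = capture.read()
            if ok:
                frames.append(frame)
        return frames
    finally:
        capture.release()
```

The frame count reported by a container is not always exact, so the sketch skips frames that fail to decode instead of raising an error.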
I finally received the Gourmet Dataset. In fact, I received a revised version that has twice as many images as the original one. I also have the Yelp dataset. Although that dataset has not been curated by humans, I am hoping to use it for training my algorithm in addition to, or instead of, ImageNet or AVA.
Since I already have gotten access to the datasets, I have been reading about ResNet/AlexNet implementations, which was my goal for next week.
I have been looking more into the image processing part. I have created my first draft of the code to alter the colors of an image. I have also looked into rotating the food in the image. This does not seem doable (in the way and scope I wanted), so I changed my framework to take a video as input instead of an image. The video can then be split into images, and the images from the better angles will be picked. I have also written to the Gourmet Food Dataset researchers to ask for their dataset, but have not received a reply yet. I have been looking at the Yelp dataset. I have found an online project that assumed all images taken with DSLR cameras were good and the rest were not. This seems to have worked pretty well for classification. I will look into that.
During my weekly meeting with Igor, he brought to my attention a better, incremental way to run the ranking algorithm. In the first round, a certain number of image processing techniques will be applied to the original image, and the top 10 or so images will be passed on to the next round. In each later round, permutations of the image processing techniques will be applied to the images, and the next 10 winners will be promoted to the next round. This way, we can keep applying several techniques to the images and find the best combination. The process would go on until either the machine's confidence exceeds a certain threshold or a certain number of rounds have passed. The latter is important so the machine does not keep going forever (or for too long) if the image is simply too bad to be made decent. This brings me to the question of what to do if no food is found in the image. Should it return an error, or maybe apply the process to the image and see how it turns out? It is possible that the user submitted an image that contains an unusually shaped food, which the AI might not recognize as food but might still be able to make look good.
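The round-based search described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `search`, `techniques`, `score`, `keep`, `threshold`, and `max_rounds` are hypothetical names, each technique is assumed to be a function from image to image, and `score` stands in for the AI's confidence.

```python
import heapq


def search(original, techniques, score, keep=10, threshold=0.9, max_rounds=5):
    """Apply every technique to every survivor each round and keep the `keep` best.

    Stops when the best score reaches `threshold` or after `max_rounds` rounds,
    whichever comes first. Returns (best_image, rounds_used).
    """
    if not techniques:
        raise ValueError("at least one technique is required")
    survivors = [original]
    rounds_used = 0
    while rounds_used < max_rounds:
        rounds_used += 1
        # Because survivors carry earlier edits, later rounds explore ordered
        # combinations of techniques rather than single techniques.
        candidates = [apply(image) for image in survivors for apply in techniques]
        survivors = heapq.nlargest(keep, candidates, key=score)
        if score(survivors[0]) >= threshold:
            break
    return survivors[0], rounds_used
```

Keeping only the top `keep` images bounds the work per round at `keep` times the number of techniques, and `max_rounds` guarantees termination even when no candidate ever reaches the threshold.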
I also have heard about genetic algorithms, and will look into those as a safety net/supplement.
While preparing my diagram for the quiz, I got a much better conceptual understanding of what I want my project to look like. I have also found nice papers this week. I started thinking of a few different image processing techniques that might help with improving the image, and picked a computer vision model for my AI (AlexNet). I also decided on an image ranking algorithm to decide which image to return: a binary comparison. I feel significantly more comfortable about my project now that I have a more concrete idea for my software architecture, even though I am still fuzzy on the implementation details.
I am going with the auto-image processing for food. I have been looking more into the literature. Only a few publications deal with making food look better, and they are all relatively recent. However, there is literature about how to train machines to assess how aesthetically pleasing images of food are, and also on how to make regular images more aesthetically pleasing.
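One way to turn a binary comparison into a ranking, assuming "binary comparison" means comparing two images at a time, is a round-robin in which every image is compared with every other and the images are ordered by number of wins. The names `rank_by_wins` and `prefers` are hypothetical, and `prefers(a, b)` stands in for a model that returns True when image `a` looks better than image `b`.

```python
from itertools import combinations


def rank_by_wins(images, prefers):
    """Order images by how many pairwise comparisons each one wins, best first."""
    wins = [0] * len(images)
    for i, j in combinations(range(len(images)), 2):
        if prefers(images[i], images[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    # sorted() is stable, so ties keep their original relative order.
    order = sorted(range(len(images)), key=lambda k: wins[k], reverse=True)
    return [images[k] for k in order]
```

Counting wins needs a quadratic number of comparisons, but it still produces a usable order when the comparisons are not perfectly consistent, which a comparison-based sort does not guarantee.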
I will start experimenting with different machine learning algorithms, and also image processing techniques.
I am pretty certain at this point that I want to do something food and image processing related. This evolved out of my first idea (MyOrder), but is not quite the same.
I want to do some sort of auto image processing to make food look better, and provide an interface/API to use the service. I have found a very large dataset that could help with that, albeit unlabeled. There is also research on labeling food images as good-looking or not good-looking, which I could use to label the food.
I have two other ideas I will consider (as long as I am still allowed to consider them). The first is an app that can tell the ingredients of food and/or some kind of Shazam for food that identifies a given dish. The other is an app that counts calories by looking at pictures of the food you eat.
After talking to Igor, I realized I could incorporate a food recommendation system into my food ordering app. After thinking some more, I realized the food recommender might actually be more interesting than the food ordering system.
I have read some literature on all three of my ideas, and was pleasantly surprised to see a lot of academic research potential in ideas I had thought were a little dry, which was my main concern. The most exciting idea I got was adding a food recommendation system to my food ordering app, where the user can get either a healthy or a tasty recommendation depending on their preferences.
For my second idea, I realized I need to research the intersection of human-computer interaction and pedagogy.
For my third idea, there is a lot to research about indexing a large number of text documents, where the corpus grows and there is a space constraint. I can also look into text summarization techniques and metadata extraction from the documents.
My first idea stays the same: making an online food ordering system that works over Wi-Fi. I need to think about how to differentiate my product, what is new about it, and how to test it.
A web platform aimed at helping people learn and practice math. Using a template, it would generate math questions and allow people to try to solve them. It would also provide step-by-step solutions. This would be especially applicable to calculus. There are APIs that help with this, at least some of which are open source.
A time tracker app for TAs, who are currently logging their hours on a Google spreadsheet. This site could allow TAs to clock in and clock out, take attendance, and also show them how much they worked in each pay period. The app would also update the existing Google sheet with the new logs. This can also be packaged in a way that others can use it.
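As a rough illustration of template-based question generation with step-by-step solutions, the sketch below fills a single calculus template (the power rule) with numbers. The function names and the output format are hypothetical, and a real platform would need many more templates.

```python
import random


def power_rule(a, n):
    """Build a question, worked steps, and answer for differentiating a*x^n."""
    question = f"Differentiate f(x) = {a}x^{n}."
    steps = [
        "Apply the power rule: d/dx [a x^n] = a n x^(n-1).",
        f"Multiply the coefficient by the exponent: {a} * {n} = {a * n}.",
        f"Lower the exponent by one: {n} - 1 = {n - 1}.",
    ]
    answer = f"f'(x) = {a * n}x^{n - 1}"
    return {"question": question, "steps": steps, "answer": answer}


def random_power_rule_question(rng=None):
    """Fill the template with a random coefficient (2..9) and exponent (3..6)."""
    rng = rng or random.Random()
    return power_rule(rng.randint(2, 9), rng.randint(3, 6))


if __name__ == "__main__":
    generated = random_power_rule_question()
    print(generated["question"])
    for step in generated["steps"]:
        print(" -", step)
    print(generated["answer"])
```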