I just worked on my final paper this week. I met with Xunfei to ask questions about it.
I finished my final proposal and I’m rechecking everything for submission this week.
In the past week, I have been working mostly on my presentation and
my proposal. My proposal is close to a finished state, but I am still
working on collecting preliminary results. I have also been trying to
create new figures (images and charts) which are easier to read on
printed copies of my proposal.
For the implementation itself, I am still working on the things I outlined in the first section of my project timeline (setting up the pipeline of the project without adding all the features at each stage), to try and get a minimum version working. I think that this will take a couple more weeks, but I am hopeful that it will lead to me having some buffer time next semester during my implementation of the project.
I finished my presentation. My next step is to add abstract and more introduction to my proposal paper, and finish the final version of it. I have done more research in the past week and planed to change my modeling method from GMM-UBM to Convolutional Neural Network or Deep Neural Network. GMM-UBM is very classical but also “old-fashioned”. CNN and DNN are newer and better. GMM-UBM’s performance lowers as the amount of speakers increases. But I do not have enough time to change method for this semester. I will do more research during winter break and probably change next semester.
I have researched and read a few more papers in the last week. I have expanded upon my analyze -> split -> replace modules with actual implementation details using an encoder-decoder model to swap less engaging text with more engaging text. In order to do this, the text needs to be vectorized and then trained. I have also found a module that can help me achieve that. I have also extensively worked on my proposal presentation. I also met with my advisor and went over the presentation and was advised to explain the slides in a way that a person with no understanding of neural networks can understand what is being communicated.
- I spent the vast majority of this week looking for projects that have specifically detailed how they implemented a fake news detector and reading through the articles I’ve already found.
- While some have given a lot more detail on their process, unfortunately, I can’t understand some of the details.
- A lot of the details go into the mathematical aspects of machine learning and convolutional neural networks. That’s very difficult for me because math is not my strong suit.
- I will either have to find a tutorial that will actually explain it well or I might have to compromise my big goals for this project. I need help finding papers or tutorials that clearly explain their processes so I can move forward in the way that I want to.
This week I have been working on my presentation and have refined my proposal a bit. I will keep working on my presentation until the upcoming Wednesday as I wait for feedback on my second draft of my proposal.
- I focused this week on fixing my first proposal.
- I re-did all of my diagrams so that they would use the proper shapes
- I re-wrote my design section
- I add more to my introduction to better explain the importance and the gaps
- I elaborated about the timeline and gave a high level overview by month
- I also did research into the postgres database using SQL because that seems like the best tool for my project.
- Next week over the break, I hope to go more in depth into my readings and start to finalize the tools I want to use
I read some new papers and research about different modeling algorithms and started to worry about the accuracy on my system. The accuracy is not only rely on the modeling but also based on the dataset for training and the quality of acoustic input (the speaking environment). But selecting a suitable modeling algorithm is important. Now the popular models are: HMM, VQ, DTW, GMM, UBM, i-Vector. I temporarily chose hybrid GMM-UBM. I might change in the future or mix other modeling to enhance the accuracy. My goal is to reach an accuracy at least 90%.
This week, I continued working on my project proposal, submitting my second draft after some much-needed updates. I still need to work further on the Related Works section. I additionally continued working on early implementation of the project. Lastly, I prepared a first draft of my presentation slides.
I have mostly worked on the second draft of my proposal in the past week. I simplified and restructured my framework model. I also looked for more use cases of my project and decided what frameworks to use while implementing it.
This week, I worked on the similar project posted online. While working on it, I found some challenges in modifying the content-based data set to fit the collaborative-filtering method. I might end up modifying my project from a hybrid recommender to a content-based recommender. But I will keep looking for alternatives to make it possible.
During this week I met with Charlie to review feedback for the first draft of the proposal. I reconstructed this version for the 2nd draft. I also read some papers for the next pass. These were more directly related to the software I am going to use.
I discussed my proposal draft with my advisor. I got her feedback and suggestion, and knew how to revise and improve my proposal. In the past week, I read more papers about the GMM-UBM modeling method that I plan to use for my project. I understood the specific procedure now but it is still hard to fully understand this principle… Now my another problem is to find a suitable dataset and decide if my system is text-dependent. There are three primary ways for speaker verification now: text-dependent, mixed, text-independent. The text-independent way is very difficult and complicated to do because user can say anything to pass the verification. But text-dependent way is restricted and not safe for spoofing attacks. For example, people can replay pre-recorded voice to pass the verification. Therefore, the mixed way is better. It restricts the text in a way but safe for spoofing attacks. For example, they user can only speak numbers one – ten, but every time the text is random. But it is hard to find a dataset of all audio file in numbers in English. Now I need to decide which text way my system will use.
This previous week, the work I’ve done has been two-pronged, as has become the norm and will continue to be for the rest of this semester. First, I continued work on the basic implementation of the game. I currently have the control module working, as well as a looping stage that I created in order to test the controls. On the proposal side, I’ve been making edits based on the in-class peer review that we did, as well as working more recently based on the feedback given by Xunfei. I also met with Xunfei to go over her feedback of my first draft, and updated her on my progress.
This week, I investigated the technologies being used in my found papers more closely to find which technologies would be more feasible for my project. For data collection, I have found that the facebook-sdk python library (https://pypi.org/project/facebook-sdk/) used by Pool and Nissim is the best option to connect to the Facebook Graph API, since it looks well documented and has all the options I might need. I also decided to use the Facebook Pages of politicians as my dataset. I reread As the Tweet, So the Reply?: Gender Bias in Digital Communication with Politicians by Mertens et al. to see if their methods could be adapted to my project. I will need to look at their references for methods in more detail to see if I can feasibly apply them to my project.
This past week, I worked on mainly reading my new papers. I did a third pass reading on all my old papers and did at least second pass reading on the new ones. I tried finishing more than half of the existing project on python notebook and played with the data set. I now have a better sense of how to start my project next semester. I also met Xunfei and updated my progress to her. As soon as the feedback for proposal draft 1 comes out, I will be revising my writing and finishing the existing project I have been working on. I also plan to test the existing project on collaborative filtering to make sure it works with a different data set.
In the past week, I have spent most of my time working on the first draft of the proposal. I decided to research and include a new category of papers in my proposal that I had not spent a lot of time before on. The new category that I included was “Sentiment Analysis.” While working on the proposal and refining the design of my framework, I realized that sentiment analysis, something that has been thoroughly covered by researchers of neural networks is very close to my research since I also need to know the sentiment behind the email/piece of text that is to be improved.
In the past week, I have used my peer review from Jordan, as well as my own proof-reading of a physical copy of my draft to fix a lot of errors. I wrote my draft in a bit of a rush, and as a result, there were a lot of formatting errors, most of which I have now fixed. I have also updated some of my diagrams in accordance with feedback I have received and expanded some content in my draft that needed to be clarified.
In addition to working on my draft, I have been working on my project itself (preliminary work can be found in the git repo). I created a mockup GUI to give me some ideas about how I want to design the actual version next semester, as well as testing some implementations of different filters, operators, and edge detectors. Some of these results will hopefully be represented in the next version of my draft.
During the past week, I finished the first draft of my proposal and started to make those changes for the second draft. I have also continued reading some papers for their next pass. I continued to watch videos and read content related to the USB Rubber Ducky. I have started to put together some scripts that I would like to use for the attack. I also spoke with Charlie to refine my methods for the physical attacks I am going to implement. I now have a better/ more related CS implementation for this attack than what I previously had. During this next week, I am going to be working more with Metasploit on Kali Linux.
- I spent a lot of time this week trying to closely read the texts I’ve found already to try and find any mention of the data set they are using. This was very difficult because the research articles usually don’t name what their data set was called or don’t explain where to find the data set they were using. There is not a lot of details in these papers about the researchers’ process and methodology in a way that would allow me to replicate their results. This made finding fake news data sets extremely difficult. However, through the close reading and intense web searches, I have found 21 fake news related data sets.
- I also spent a lot of time researching what would perhaps be the best machine learning tool to use for my project. I’ve narrowed it down to these possibilities: Oryx 2, Tensorflow, Azure ML Studio, Weka, Shogun, AWS CLI, TensorBoard, Kerras, Caffe2. I think that I might be able to use more than one for my project to get the best results but more research still needs to be done about which tool is better for the type of data set I have (which is not a timeseries data set).
- Started my work in reviewing new found research that has more relevant research about a fake news detector application
- Finished my first draft for my project proposal
- I will need to update my related works section because of the new research I’ve found
- After the peer review session, I will need to go back and redesign my figures so that they are easier to read
- Started finding datasets which is very difficult because the research papers never tell you where to find the data set they use and often they never name the dataset either.
- However, I was able to find some github repositories with datasets and some websites of the authors in the research papers that actually linked to the dataset of their works
Finished my first draft of proposal. I read some blogs about speaker verification tech and found out that I was wrong on some aspects (actually I was confused). Those blogs help me understand more and deeper about speaker verification. So I revised my framework and flowcharts: take voice input -> feature extraction -> modeling -> database. The modeling part is the most difficult part in speaker verification. The most popular models are: Hidden Markov Model, Gaussian Mixture Model, Vector Quantization, etc. I am not sure which one I will use for sure. It all depends on my dataset and customer need. I need to experiment several models to know which one I want the best. But I chose GMM temporarily on my proposal.
I only worked on finishing the first draft of the proposal this week.
This past week, I’ve finalized the basic design for the game I will be implementing. It will be a horizontal auto-runner, where the player ducks/jumps to avoid obstacles to the beat of the music in order to keep playing. I continued familiarizing myself with Unity2D, and plan on starting work on the game this upcoming week. Additionally, I wrote up the first draft of my project proposal.
This past week, my work has been split in two directions: First, I’ve been refamiliarizing myself with Unity, by means of going through my Game Design second project. Further than that, I’ve been familiarizing myself with Unity2D for the first time, which I plan on using for the senior project due to the simplicity as compared to Unity3D. Besides getting used to the main software engine I will be using, I also continued reflection on my proposal outline; I’ve been looking more into different PCG-G algorithms and have decided on using the chunk paradigm as my second stage generation algorithm. Its stages won’t be as directly aligned to the music, but it should improve efficiency.
I extensively worked on the proposal last week, reading more papers and writing out what I plan to do helped me figure out the scope of the proposed project. I also experimented a bit more with tensorflow. I made some changes to my initial framework design to now include a frontend and backend for the end-user to interact with.
This week, I revised my diagram again and selected papers to use for my proposal. I read through all of them carefully and wrote an outline for my proposal. I discussed the details of my ideas with Xunfei, modified some, and finished writing the proposal.
Much of my work this week has consisted on working through the first draft of the proposal as well as reading some papers for the next pass. I also found a couple other sources to use. These new sources were not research papers but rather articles related to my project.
I finally received the Gourmet Dataset. In fact, I received a devised version that has twice as many images as the original one. I also have the Yelp dataset, although that dataset has not be curated by humans, I am hoping to use it for training my algorithm in addition/instead of ImageNet or AVA.
Since I already have gotten access to the datasets, I have been reading about ResNet/AlexNet implementations, which was my goal for next week.
I have narrowed my project to studying Facebook Reactions and how reactions may differ based on the gender of the post creator. I have also found papers that focus on facebook reactions. Because Facebook Reactions were released as a feature in 2016, the papers on the subject are limited, and I haven’t found any relating to gender. However, some papers I found analyze facebook reactions in a way that would be interesting to compare between the gender of the post creator. For example, one paper uses the reactions to measure the controversy of a post, so I could measure if posts by women are more controversial in general than that of men. I also found tools from some of these papers that I could use in my project.
After meeting with Xunfei, I decided to modify my diagram a bit so I redesigned it from my practice proposal. I also collected more papers that could be used in my proposal and read more articles and research papers. I also found an online tutorial of a project that is closely related to mine, so I enrolled in the course for free and downloaded the jupyter file to play with it on my own. I think doing this now will help me figure out some possible options and directions for next semester. I also made a timeline of my work for this semester as well as next semester. I asked Xunfei some remaining questions about the proposal and my project in general to clarify my thoughts. I also checked out the rubric for project proposal and brainstormed ideas for my first draft of proposal.
In order to get more familiar with neural networks I decided to use a program that lets you create neural networks. In order to do this I started reading about tensorflow and tensorflow graphs and their inner workings like variables, constants and operations. I read some tutorials on tensorflow and also studied about the Keras model subclassing API which is one of the building blocks of tensor flow to start building a simple neural network. I also read I also searched for more papers that are similar to my research and read Semantic expansion using word embedding clustering and convolutional neural network for improving short text classification, Semantic Clustering and Convolutional Neural Network for Short Text Categorization in order to familiarize myself more with neural networks that are used for text classification.
- This past week I have been trying to find more research papers that discuss how to create a prototype fake news detector instead of papers that just talk about how to discern fake news from media.
- The hunt for those kinds of papers is more difficult than the one for methods of fake news detection which tells me that my work in creating a user application for fake news detection is very much needed.
- It also seems like a lot of these papers are geared specifically towards twitter and so hopefully my research can fill a gap.
- I have also been trying to figure out a good timeline for myself both for this semester and the next. I do believe this project is feasible if I can just find some data sets in a timely manner.
I have been looking more into the image processing part. I have created my first draft of the code to alter the colors of an image. I have also looked into the rotating of the food in the image. This seems not doable (in the way and scope I wanted to), so I changed my framework to take a video as an input, instead of an image. The video can then be split into images, and the images from the better angels will be picked. I have also written to the Gourmet Food Dataset researchers to ask for their dataset, but have not received a reply yet. I have been looking at the yelp dataset. I have found an online project that assumed all images taken with DSLR cameras were good, and the rest wasn’t. This seems to have worked pretty well for the classifying. I will look into that.
During this past week, I have revised how I want to implement my social engineering attack. I want to use what is called a USB rubber ducky where you insert a MicroSD card into the USB. This card has payloads on it which you insert into the victims computer and then the payload is executed. Many different types of payloads can be written. These scripts are written in a language called duck script.
Charlie and I also discussed how to better implement my physical attack. This includes using a wireless adapter as well as ethernet cords to jack into ports around campus and see how easily I can get in.
I found a MFCC library in GitHub and explored it a little bit. It directly takes a wav file as input and returns one N*1 array (a sequence of acoustic vector). I recorded my voice and converted to a wav file. I briefly tested the code. It took my wav file and return an array containing a sequence of vectors. I will use this library in my project. But there are many related factors that i need to study. I also wrote the timeline for the rest of this semester and next semester. My next step is to keep working on this MFCC library and explore the Dynamic Time Warping library in GitHub.
I finished a first pass of all but one of the papers in my reading list, and also read some of the papers that are highly relevant to my project, which I had already read for a first pass, for a second or third pass depending on how relevant the content seemed. I have also outlined the introduction and motivation for my project, and am working through the related works section. Apart from the project proposal, I have also spent some time trying to find some ‘gadgets’ that will assist with the proof for a simpler variant of parks puzzle, which is an effort that has not yet borne much fruit.
This week, I worked more on my proposal outline. Tuesday morning, I met with somebody from EPIC to go over my grant application to go to GDC, at which I may try getting some playtesting data from my project from professionals. This morning, I met with Xunfei to look at my proposal outline before revising and finalizing it.
During this week, I continued work on my literature review after meeting with Xunfei. After finishing the review, I also started work on my proposal outline, and continued looking for more resources to use in my project. I specifically need to find more procedural generation source code for game stages, I’m fairly happy with my two music generation methods.
- I’ve been working on the Lit Review and Proposal Outline this week and I finally finished the Lit Review fully.
- My Lit Review included sections on:
- Data sets: what kind of data has been tested and what do they extract from the data to use as a way of identifying misinformation
- Identification/Classification Methods: what approaches did people take to test the data and have it respond with whether or not it was fake news
- Prototype Design Consideration: some papers outlined what a good prototype detector should have or be used for and that is something I want to deliver on so it was important I note what I found
- The Proposal was a bit challenging
- While drawing designs, I realized that there are a lot of parts to my idea (not that it’s unfeasible though) so I’ll have to sit down and not only figure out a good overall framework but good designs for all the smaller parts
- I also struggled with the methodology, budget, and timeline section because my framework is very much in flux since I actually need to figure out what works before doing the meat of my project.
I am working on finding more papers that study gender bias in social media posts and narrowing my idea further. One challenge is finding a feasible way to collect data from a site (since APIs have limits), or finding an existing data set or web scraper that fits my needs. I am also looking for authors that have published their code for their work and/or who have described their methods in detail.
This past week I worked on revising my literature review as well as writing my proposal outline that will serve as a starting point for my first proposal draft. I met with my advisor who helped me to come up with a good starting dataset for my initial neural network. I will continue read about neural networks and maybe try to implement a simple one in the upcoming weeks.
I went over the broad categories that need to be addressed by the project proposal, and created a proposal outline for Assignment 7. I also worked with my advisor, Igor, to find a good candidate for the reduction adn start working on the proof. We found that there was a natural way of reducing a subset of 3SAT, 2SAT with distinct variables was, to an instance of Parks Puzzle. For next week I hope to generalize the technique to a larger subset of 3SAT.
This week, I changed my method from hybrid to content-based filtering because there isn’t much research done in the hybrid method. So I chose to improve the content-based filtering instead. I also wrote my proposal outline and revised my diagram with the help of Xunfei. I might explore more ways to improve the existing method and see if there is anything else I can add.
For this week, I wrote my proposal outline. During the next two weeks I will use this to construct the 1st draft of my proposal. I also spoke with Charlie about taking a different approach to the social engineering aspect of my project. Most of these I have found YouTube videos to demonstrate and describe the process but I have yet to find any hard research.
I also constructed more details for implementing the technical part of my project. This will also be discussed with Craig.
I finished my proposal outline. The next step is to write my proposal draft. I also discussed with Xunfei and she help me drew a better flowchart. I gained a clearer understanding about the flow of my project. I downloaded a SDK of the iFlytek company’s voiceprint recognizer product for reference.
During my weakly meeting with Igor, he brought to my attention a better way to increment the ranking algorithm. In the first round, a certain number of image processing techniques will be applied to the original image and the top 10 or so images will be passed on to the next round. For each round after, permutations of the image processing techniques will be applied to the images, and the next 10 winners will be promoted to the next round. This way, we can keep applying several techniques on the images, and find the best combination. The process would go on until either the machine has a confidence interval beyond a certain threshold, or a certain number of rounds have passed. The latter is important so the machine does not keep going for ever (or for too long) if the image is simply to bad be made decent. This brings me to the question of what to do if no food is found in the image. Should it return an error, or maybe apply the process to the image and see how it turns out? It is possible that the user submitted an image that contains an unusually morphed food, which the AI might not recognize as food, but still be able to make look good.
I also have heard about genetic algorithms, and will look into those as a safety net/supplement.
- I wrote my project proposal this week and I already had most of the info ready.
- The sections that I didn’t feel really prepared for were the design and what software/hardware do you need sections.
- I didn’t have a good idea of what should go into my design and what counts as a component
- I’m also not sure what kind of software/hardware I need because I’m not completely sure what my own unique approach will be so I don’t know which software/hardware is the best for my approach yet
- After reading articles for my literature review, I see that there are a lot of different ways to identify and classify media with misinformation.
- I will need to do a bit of work to be able to combine all the methods in a way that it will be able to identify and classify media of all different topics and types
- I’ve also split up my project into smaller more manageable goals to accomplish
- First Level Basic Goals:
- Find a large enough dataset that is properly vetted as credible and one that is properly vetted as not credible
- I want this dataset to be on a variety of topics
- Find an algorithm/classifier that is accurate 80%-100% of the time on the dataset with a variety of topics
- Find the key features that are the most reliable for classifying
- Find a large enough dataset that is properly vetted as credible and one that is properly vetted as not credible
- Second Level Goals:
- Expand the dataset to include pictures where the text was extracted from it
- Re-test the algorithm/classifier to make sure a drop in accuracy hasn’t occurred
- Re-test that the key features for articles can apply to the text within a picture
- Third Level Goals:
- Expand the dataset to include videos where they are transcribed as accurately as possible.
- Re-test the algorithm/classifier to make sure a drop in accuracy hasn’t occurred
- Re-test that the key features for articles can apply to the transcriptions
- Fourth Level Goals:
- Create an app/website where you can upload a piece of media and the app will use the algorithm/classifier and tell you if it is credible or not
- Fifth Level Goal:
- The app/website will keep a record of things that have been deemed credible or not
- Create a browser extension that will take the media from the current tab and check if it is credible
- Sixth Level Goal:
- The app/website and browser extension will scan and search for certain keywords set by the user and check new content that’s been uploaded
- First Level Basic Goals:
- I’ve chosen Charlie as my adviser for my project
- The idea that I picked for my Capstone is my Fake News Detection Idea
- The basic idea is that I would create a website/application and a website extension that takes mediums as input and will tell the user if it is factual or not
In the past few weeks I have settled on my topic being exploring gender bias on a website using a combination of computational linguistics and quantitative analysis. After writing my literature review on work that explored a variety of sites, I decided this week to focus on a social media site for my project. My next step is to explore more papers focused on analyzing social media and the APIs available for different sites in order to choose which site I want to focus on.
I read over 20 papers in the last two weeks to work on my literature review. I met with my advisor and talked to him about my proposal and what could be improved in my literature review. I have found some projects/papers that touch upon what my project aims to be. This will help me find and establish a starting point once I start working on my project. One of the challenges that I am currently facing is finding a dataset. I have come across a few datasets that I can use from kaagle.com.
This is what my initial model looks like (This does not delve deeper into how the neural networks are configured.)
While preparing my diagram for the quiz, I got a much better conceptual understanding of what I want my project to look like. I have also found nice papers this week. I started thinking of a few different image processing techniques that might help with making the image, and picked a computer vision algorithm for my AI (AlexNet) . I also decided on an image ranking algorithm to decide which image to return, a binary comparison. I feel significantly more comfortable about my project now that I have a more concrete idea for my software architecture, even though I am still fuzzy on the implementation details/
This week, I mainly worked on reviewing my papers carefully and summarizing them for the literature review. I also met and talked with Xunfei about my proposal idea and came up with a preliminary idea. I will have to work more to develop it.
This week, I wrapped up my first draft of the literature review. I’ll be meeting with the writing center as I embark on my final draft. I’ve continued working on nailing down my exact idea that I’ll be proposing, as well as looking into the available resources found throughout the papers I’ve read, from algorithms to source codes. I’ve done some basic work on a prototype game, but have been too busy to make much progress yet. I think a good portion of my project may include comparisons between different methods and combinations of methods between the PCG-G (different algorithms, mostly) and music generation (mostly grammar-based versus machine learning).
This week I worked on improving my understanding of the Parks Puzzle and exploring possible proof techniques to show that it is NP complete from two directions. I continued working on the Time Complexity chapters of “Introduction to The Theory of Computation” by Michael Sipser to round out my theoretical understanding, while also solving many instances of the puzzle using an app on my phone. I came onto one general idea for the proof involving only ‘AND’ and ‘OR’ gadgets that I discussed with my advisor, who made some suggestions involving an ‘IFF’ gadget, which I am going to continue working on. I also received feedback on my literature review, which showed some significant problems that I corrected according to the grading rubric.
I finalized my proposal to “Applying Voiceprint Recognition Technology to Identity Verification”. The keywords are voice recognition, voiceprint, feature extraction, voice detection, voice verification. The difficulty I might encounter is that there may be background noise in the voice input. If the noise is loud, it may affect the feature extraction and voice recognition. I probably need to explore methods for removing noise.
I did not add an update to week 6 due to the long weekend but had been working on my literature review, which I finished today. It was very useful to read and re-read certain articles and realise some are useful and some are not. I now have further inspiration with where I can take my idea and am happy with its process. I will be looking in the next week or two to start looking into potential technologies to use for my project, which currently seems to be leaning on public Python libraries.
I now know what idea I am going to go with, it’s a new idea and is not related to any of my old ideas. My new idea is about using neural networks and natural language processing to predict a better way to write emails or other forms of text in order to better engage the reader. This will be focused on business emails and other forms of business-related texts. I have read a lot of papers on neural networks in the past week and have spent most of my time writing my literature review on it.
I’ve finalized the base project idea – I’m going with the one involving music generating AI. Additionally, I’ve officially gotten a proposal adviser, Xunfei, who I’ll be meeting with every Wednesday morning. I’m having some issues in limiting the scope and application of my project, which I’ll be focusing on while I finish up my literature review. In terms of the review, I’ve read 9 of the 10 papers, so I only need to read the last one and put the notes I have into the literature review format.
- The First Responders Tech idea is not panning out in terms of finding any relevant research articles so that may be way to difficult for me to achieve in this time frame
- The Secure Paperless Voting Machine is working out in terms of finding research articles but upon reading a couple of the them, I’ve realized how complex the issues are. Both the hardware itself and the software need to be more secure. This might be too big of a project.
- There are a lot of articles and info about my Fake News Detector idea and my Ancient Sites in VR idea so that bodes really well for the feasibility of those two ideas as my official capstone idea
Updates in Ideas
- My first “Fake News Detector” idea has remained mostly the same
- I still have not come up with a more doable idea for the secure paperless voting machines because there’s so much oversight with voting regulations and laws and because there’s an added obstacle of different voting systems/mechanisms.
- My “911 Tech” has hit a wall in terms of find research articles on it
Updates in Process
- I talked to Charlie about my three ideas and he suggested a change from 911 specific tech to First Responders/Disaster Relief tech because they actually use open source technology and it might be more feasible to create something and find research on it.
- I also created a fourth idea in case the First Responders idea doesn’t pan out. My idea was creating very realistic recreations of ancient archaeological sites that you can interact with in VR.
My project is using Voice Print Recognition technology to check if the voiceprint of the input match the corresponding one in the database. This technology can be used in many identity verification scenes like customer services for bank, door lock, business transaction. The main steps approximately will be: take voice input -> (remove noise -> ) extract voice features -> building models with selected algorithms -> compare voice features -> check if voiceprint match. The possible algorithms might be: VQ, MFCC, DTW.
I read more papers on my ideas, met with my advisor to schedule a weekly meeting and discussed a new idea proposed by the company I interned with this summer.
The new idea is a Neural Network driven A.I. that can learn and predict better ways to write emails, marketing campaigns and other forms of communication with the customer.
This week, I finalized who my advisor will be (Charlie). I also decided that I will be working on my security testing idea as my main project. To start this, I spoke with Brendan Post (IT) to discuss my ideas. He was happy to help me and I will be in further discussion with him as I move forward with my proposal.
I also found a couple other papers related to my idea. There was one that had much more lower level detail and actuially described the implementation of their testing. The researchers used Kali Linux to hack into a router through different ways such as SSH, Telnet, and SNMP. There were images that showed the commands they used. It was the first article I found to have a lot of low-level detail.
This week I spent some time finalizing my proposal idea. I discussed the Parks Puzzle with Igor and come to the conclusion that I should work on proving its NP-Completeness for my final project. I had to discuss this idea with my advisor Igor, as well as Charlie and Xunfei before I could finalize this plan. Once I had this confirmed by Xunfei, I put aside my work on the other ideas and started to solely focus on NP-Completeness. My first task is to go through the relevant chapters of “Introduction to The Theory of Computation” by Michael Sipser nad working through problems to clear up my understanding of the problem, which I have started to work on.
In the past week, I searched and skimmed lots of papers to use for my own research. I also talked with Dave and Xunfei to refine my ideas. I also found a data scraping software to extract product information from Sephora.
I am going with the auto-image processing for food. I have been looking more into the literature, and there is limited literature that deals with making food look better, and they are all relatively recent publications. However, there is literature about how to train machines to assess how aesthetically pleasing images of food are, and also on how to make regular images more aesthetically pleasing.
I will start experimenting with different machine learning algorithms, and also image processing techniques.
Most of the articles I have read this week have been quite informative and take the time to explain even the most basic things. However, this has not been true for all the articles that I have read. Some of these articles assume that the reader is already familiar with the terminologies and technologies used in their research or study. Although this helps the reader dig deeper into the topic and come out with a better understanding of the study or research, it also means that doing a 1 or 2 par reading often does not suffice in these cases. Along with reading research papers this week, I also worked on some technical diagrams for my projects this week.
In the past week, I explored different ideas, talked to the faculty, and found more papers to read. I found lots of interesting paper related to my personalized skin care product recommendation idea. I found out that using content-based filtering to recommend products might be helpful, and I could modify the algorithm to have better performance.
Some papers I found useful include:
 Recommender System By Grasping Individual Preference and Influence from other users
 Recommendations System for Purchase of Cosmetics Using Content- Based Filtering
 Item Clustering as An Input for Skin Care Product Recommended System using Content Based Filtering
These papers gave me an idea for what algorithm to use for the recommendation system and how to modify it according to my need. Some of the research was done using five skin types, but I’m thinking to increase it to 16 or more.
I am pretty certain at this point that I want to do something food and image processing related. This evolved out of my first idea (MyOrder), but is not quite the same.
I want to do some sort of auto image processing to make food look better, and provide an interface/API to use the service. I have found a very large dataset that could help with that, albeit unlabeled. There also is research done on labeling food images as good-loking/not-good-lookingm, which I could use to label the food.
I have two other ideas I will consider (as long as I am still allowed to consider). The first is an app that can tell the ingredients of food and/or some kind of shazaam for food that tells what dish a certain dish is. The other is an app that counts calories by looking at pictures of the food you eat.
I haven’t decide my topic yet, but I was reading papers related to my three ideas to gain a deeper understanding on these ideas.
My first idea is an AI tech for voice print recognition. It can be used for avoiding voice spoofing attacks on business, banks (like mobile phone customer service), etc. The main steps for voice recognition is: take vocal input -> identity-feature analysis -> deviating feature selection -> deviating feature comparison -> distance to reference pattern estimate -> check if voice match. The main algorithms for feature extractions are: GMM, JFA, GMM-SVM, etc. On the paper “Vulnerability of speaker verification systems against voice conversion spoofing attacks: The case of telephone speech”, the authors experimented several algorithms and concluded that although JFA has a high inaccuracy but the converted samples with JFA sounds very mechanical so human can easily distinguish. The authors of paper “Voice command recognition system based on MFCC and VQ algorithms” discuss and examine two significant modules: MFCC and DTW. Their results were good. So I will consider use these two modules.
For my second idea which is creating an AI tool for safety driving, the key tech is 3D dynamic facial recognition. I learned that the most Facial recognition tech can be decided into two main parts: facial detection and facial recognition. I can use open sources like opencv and dlib to do facial detection. There are 3 factors i need to care about: detection rate, misdetection rate, false alarm rate. The authors of the paper “BP4D-Spontaneous: A high-resolution spontaneous 3D dynamic facial expression database” reported a newly developed spontaneous 3D dynamic facial expression database in their paper. I am not sure if I can or should use their new database. Although the paper “Real time facial expression recognition in video using support vector machines” primary discuss detecting emotion from facial expression, it provides some facial recognition tech info for me.
My third idea is creating a smart tool to grade algebra on handwritten homework. The APP takes a photo of the handwritten homework and using OCR tech to extract the texts and grade them. The main tech is just OCR. Although the two papers I read both talk about their own APP and OCR system, I can refer some technologies they used, like matrix matching, fuzzy logic for facial extraction.
This week I researched deeper into my three ideas, and now have developed preferences and better understandings of them. Specifically, I believe the natural language processing idea where I will create summaries of subjects in PDF’s seems to not only be the most interesting to me, but the most achievable and most understandable of the ideas. It requires a lot more CS than the other two, which I like. The others, especially the math proof assistant idea, seem like they will contain much more design choices and therefore will distract from the CS side of it. I did make more progress though on finding approaches to the math notes to LaTeX idea, as there is a decent amount of research in symbol recognition, however handwritten was harder to find as opposed to digitally drawn with electric pens. The proof assistant idea is the hardest to research but I am making slow steady progress in this field.
In the past two weeks, I’ve read through a total of 12 sources, four for each idea, and created annotated bibliographies based on them. While some of these papers are more useful than others, each has been helpful in one way or another – and, indeed, I’ve been able to find more papers to look into further as I modify my ideas based off of both peer feedback and my own research. Some papers have given me actual algorithms to either implement or look at as a building block for my own- others have shown me, for example, how much one of my ideas (copyright detection) is already extensively researched, and pointed me towards new directions in which I could modify the project to make it something original.
I spoke with Charlie this past week and we both decided that my first idea is my best one. So, I decided this was the one I was going to pursue my penetration testing idea and found 6 more articles for this.
A couple of the articles I found were about professors having/teaching a first-time hacking course. It was really cool to see the different designs of the classes. There was a general theme of keeping a subset of computers in a controlled environment and then allowing the students to work from there. One article included a complete description of their syllabus and the assignments of the course. This was helpful as it gave me an idea as to what software was being used in these courses.
For my first idea, I have been researching methods and corner and shape detection that I might be able to apply to my implementation. I found a very relevant paper about shape detection using VLI and NIR imagery from drones, which, while new and not especially
For my second idea, I have been researching different classification methods for audio. Both of the papers I found this week use a metric called Mel-Frequency Cepstral Coefficients, in both KNN and DNN algorithms. I think, having read over these papers, I will continue to search more specifically for research related to this concept.
For my final idea, I have found two papers that discuss the obfuscation and user controls necessary to make geo-location safe, while still preserving its usefulness. One of these papers also discusses a
I found a very promising paper called “an introduction to the conjugate gradient method without the agonizing pain” and started working through it. I have only gotten through the first 10 pages or so, but it is helping me think of narrower research questions. I have also skimmed through a fair number of NP-completeness reductions, and have a much better idea about the background work I will need to do to be able to work on the Parks Puzzle problem. Some especially strange and interesting reductions I have come across are a reduction of SAT to minesweeper and a reduction of the Hamiltonian cycle problem for cubic graphs to the zen garden puzzle, which use boolean ‘gadgets’ and nodes and edges that can be combined to create instances of the game.
For this week I tried to narrow down my research scope; I talked with Dave about my ideas, and where I might begin with researching each of them. I got some good ideas about previous work and what might work well to back up my project ideas, and then used that information to pick what I think were the six more relevant papers.
(I was attending a CS conference in California this week so my post is late.) I have read some papers related to my 3 topics. I gained a clear understanding of the technologies I need for my three topics. I also explored the new ideas from Xunfei’s feedback. There are already available APPs that can scan printed music sheet and play the music. Most of them are not free. I only found two free APPs called PlayScore2 and iSeeNotes. PlayScore2 works much better than iSeeNotes. I tested the APP with my printed music sheet, and the result was not as good as I thought. It couldn’t read all the music notes. If I am going on this topic, my goal will be enhancing the accuracy. But scanning and reading hand written music sheet would be very challenging. Even I cannot read those old music sheets very well.
I did a 2 pass reading of all the sources that I had collected over the last two weeks. I realized that some of the things I had in mind already exist. I’m still researching about pre-existing technologies/frameworks that could help me set up a reference point to build upon as I work on my project.
After talking to Igor, I realized I could incorporate a food recommendation system to my food ordering app. After thinking some more, I realized the food recommender might actually be more interesting than the food ordering system.
I have pretty much decided on my main topic, which is working on the complexity of the Parks Puzzle. I found a survey paper that goes over a number of puzzle problems, and provides a vast reading list, which I am working through now.
For this week, I started reading more into the papers that I found initially interesting for my ideas. I found some to be less relevant than others, but still drew some nice views that may be helpful later to shape my ideas. I want to perhaps find some replacement papers, and also talk to a CS professor this week about idea refinement.
For this week, I created my annotated bibliographies for 2 papers of each of my 3 ideas. I also spoke with Charlie and refined 2 of 3 ideas. For my penetration testing idea, instead of making the project solely on the technical side, I want to basically have three phases of this project- technical, social, and physical. I plan to speak w/ Brendan Post in ITS and see how I could implement this at Earlham. I’d plan to publish the results of this testing to where other schools could do something similar.
In regards to the bibliography, one of my ideas is tracking Digital Footprints of users and I found a paper where the researchers created their own piece of software that was a blend of a password manager and an auditing service. I thought of being able to do something similar and create my own piece of software but with additional features.
This week, I did a first pass reading for a total of 6 papers and visited the writing center for the annotated bibliography. Some of them turned out to be less relevant to the topic than they had appeared, but they were mostly useful and interesting. For one of my journals, the authors introduced the eigen-values and associated them with our facial features which I thought was fascinating. A lot of the methods that I thought were either subjective or determined by consensus actually involved detailed algorithms and logical processes. I look forward to reading them more in depth.
This week, I created annotated bibliographies for papers pertaining to each of my three ideas. I also further considered the possibilities for updating my project ideas – I’m likely to not go with the copyright detection idea. If I go with the visualization of music idea, I’m likely to turn it on its head and deal with the audiation of visual art. I’m still fairly content with the rhythm game idea as it is in its current state.
This week, I browsed through chiefly ACM and Google Scholar to find five papers on each of my topics. I additionally met with Charlie to go through my three ideas, getting some helpful insights about my projects and what is being looked for in the capstone project, as well as generating some ideas based off of the original three. The Copyright Detection idea in particular is likely to be cut – it’s a field already rich with research and work done, making it harder to come up with something better or wholly original.
Summary of Updates:
This week I spent a lot of time trying to refine my last two ideas to make it more suitable for a capstone and tried finding enough research articles about all three ideas. I struggled with finding related articles to my second idea so I came up with a fourth idea that was easy to research. Below I’ll put all four of my ideas that I have right now.
First Idea: Automatically Detecting Fake News
- Create a database of media of different opinions that are credible with truthful information.
- Using Machine Learning, find/develop an algorithm that when given a medium could accurately determine whether it is credible and truthful.
- Create an app/website/browser extension as a user interface where you can manually input the medium you want to check, or follow certain keywords and be alerted when new credible content with those keywords has been created, or tell you when a medium you are currently looking at is not credible and link you to credible sources of both opinions.
Second Idea: First Responder’s Tech
- Disaster relief first responders, like 911 operators, would benefit from a tech upgrade as well and they actually use open source software.
- I would like to set up a text system for those responders because you can handle more texts at once than you can calls and sometimes it’s easier to text for help.
- With texts, it would also be easier to automatically send location data to the responders
- I would like to create a technology that uses the phone’s barometric pressure sensor and gps location tech to be able to tell first responders exactly where and what floor a person is on.
Third Idea: Secure Paperless Voting Machines
- The paperless voting machines that we have at this very moment have been proven to have many flaws and vulnerabilities and are easy to hack into. Given the interference of the Russian in the last presidential election, this is extremely concerning.
- I would like to create a paperless voting method that would allow people unable to securely cast their ballot and have the capabilities to have a secure virtual caucus for those unable to make it to the physical location.
- I would make sure all data is encrypted to the highest standards we have at the moment and that all vulnerabilities noted in our current methods and efforts are addressed.
Fourth Idea: Virtual Recreation of Ancient Sites
- world in an immersive environment without having to damage the sites and spend money building on top of the actual ancient stones and what not
- People should be able to walk into places and into rooms
- People should be able to hear sounds as well
- Perhaps incorporate smell???
- People should be able to walk up stairs!
- There should be an interactive element that will give you facts and info about a specific thing that you touched or selected
- This could be used as an educational tool in classes and could be used by museums so that they could give back the original pieces and works to the countries that can safely house them and use the VR with recreations
I am still exploring my new ideas.
Idea 1 Title: AI Assistant for Safety Driving
Description: My idea is to use facial recognition to detect fatigue driving or dangerous driving. A small camera will be placed in front of the driver and the AI Assistant will be a mobile phone APP. For fatigue driving, the camera detects behaviors like closing eyes for long time, yarning, frequently rubbing eyes with hand, etc. For other dangerous driving behaviors, the camera can detect behaviors like playing mobile phone, turning back to chat, smoking, etc. After detecting dangerous driving behaviors, the APP will do a series of operations according to the dangerous behavior: giving an alarm, play refresh songs, report the location of the nearest rest stop. Actually there are already research and real products available now, but most of them are commercialized. They are expensive and big-sized. What makes my idea different is that I am going to develop it as a cheap and handy daily life tool. And I probably can enhance the accuracy of detection under dark light or special environment. The main tech is facial recognition and there are many open sources online.
Idea 2 Title: Voice Print Recognition Identifier
Description: Facial recognition is very popular recently, but many people do not notice voice print recognition. VPR is cheaper and more convenient because it only needs one microphone. Voice can tell many information like age, gender, emotion, environment, and etc. Wearing glasses or makeup might affect the performance of facial recognition, but a mature human being’s voice is stable. There are already research and real application on voice print recognition. For example, WeChat user uses voice recognition to log in. But I can innovate my idea on the application, for example, combine it with bank customer service. When a user steals other’s password and calls Chase to do money transaction, the VPR can detect if the voiceprint matches the bank account owner. Or use it to fight crime, home robots, etc. The VPR tech is already exist. I need to innovate from other aspects. But the two primary usages are: speaker identification or speaker verification. Voice print is a spectrogram of voice, forming from wavelength, frequency, intensity and other hundreds of characteristics. I can differentiate people from differences in purity of voice, resonance ways, average pitch characteristic, voice range.
The popular used spectrogram characteristics are: MFCC, PLP, FBank, D-vector by Google, Deep feature, Bottleneck feature, Tandem feature. Existing models: GMM-UBM, JFA, GMM-UBM i-vector, DNN i-vector, Supervised-UBM i-vector. Scoring algorithm: SVM, PLDA, LDA, Cosine Distance.
(But i don’t know if the scope is too big)
Idea 3 Title: Using OCR to grade hand-written homework
Description: Take a picture of the hand-written homework and use OCR tech to extract the content. (OCR is a technology converting printed text into editable text.) Then check if the algebra calculation is correct. Mark the correctness in the picture near the problem. There are already APP like that available online. But it only checks for algebra calculation, and it is not that accurate. It requires good hand-writing. I can expand my project to enhance the OCR tech, or make it check for simple English homework as well. Especially some parents in non english speaking countries cannot speak english to help their children. And probably I can expand the idea to a mobile phone APP that can summarize and analyze the grades data: collect incorrect problems, analyze which category did incorrectly the most, and etc.
I have finished buying the parts to make the headset and i have finished setting up a unity environment, but haven’t been able to run tests there yet.
I have read some literature on all three of my ideas, and was pleasantly surprised to see that there was a lot of academic research potential in the ideas that I thought were a little dry in research, which was my main concern with my ideas. The most exciting idea I got was adding a food recommendation system for my food ordering app, where the user can get either a healthy or a tasty recommendation depending on their preferences.
For my second idea, I realized I need to research the intersection of human-computer interaction and pedagogy.
For my third idea, there is a lot to research about indexing a large number of text documents, where the corpora grows, and there is a space constraint. I can also look into text summarization techniques and metadata extraction from the documents.
Name of Your Project Computational complexity of the ‘Park Puzzle’
What research topic/question your project is going to address? The Park Puzzle is a game that involves splitting a n*n grid into different colored continuous ‘parks’. A solution to a park puzzle involves marking the position of a tree in each park such that no two trees share a row or a column. The research topic involves exploring various algorithms for solving the park puzzle, and determining some bounds on the computational complexity of the puzzle. The most involved version of this project could be to prove those bounds. (The initial, intuitive hypothesis is that the park puzzle cannot be solved in polynomial time)
Update: I am looking into the proofs of NP-completeness of other pencil puzzle problems to gain an understanding of the techniques involved in proving such complexity bounds.
Name of Your Project Fast Integer multiplication
What research topic/question your project is going to address? There have been significant improvements in algorithms for fast multiplication of integers within the last decade (see Fast integer Multiplication by Anandya et al), approaching O(nlogn), which use modular arithmetic and computation on p-adics The research project would be to explore the theory behind these algorithms and verify their results.
Update: I am looking into the minimal mathematical background required to make sense of how the algorithms work, without necessarily getting into all of the details.
Idea#3 Iterative solvers for systems of Linear Equations
Description Large and sparse systems of linear equations appear in the solution to some partial differential equations by finite difference methods, and in general many day-to-day problem areas such as in engineering and scientific computing. As such the theory of developing methods for solving such large systems, and proving their correctness is an area of active research. in this project I would like to test the effectiveness of various iterative methods at finding solutions to systems linear equations with various characteristics by taking an experimental approach.
Update: no update for this idea
Idea: Breakdown of Mathematical Proofs Using Natural Language Processing
I wish to use natural language processing to break down a bachelor-level mathematical proof into the components that make up a proof (e.g. given knowledge, statements, calculations). This will be shown in a GUI so that the user can clearly see the build-up of their proof. Without constraints of time, some more technologies could be applied (machine learning) to give feedback to the user of the strength of their proof.
No update to this idea.
Character Description Generation from Novels
I want to use natural language processing to allow a user to enter a character’s name (fictional or non-fictional) in whatever PDF they please, and have the algorithm generate a list of facts/short descriptive paragraph about the character. This could be useful when doing novel summaries, studying history and remembering the roles of persons, etc.
This idea is new and came after a discussion with Charlie that my previous ideas were not too interesting ideas (ie. not much room for expansion, not much coding).
Photo to LaTex Generation
This idea came while looking over some of my math notes and wanting to digitalize them. After looking online, the technology does not seem to be there, with some very primitive attempts. I would like to attempt myself at allowing someone to digitalize their math notes in LaTex form. It would require symbolic and natural language processing, potentially machine learning upon more investigation.
This is also a new idea, again discussed with Charlie after doubts about the previous idea.
I further developed my ideas and added more details to each of them. Then I talked to Xunfei about modifications and got more suggestions from her. We looked through the research paper I found on Google Scholars together and searched for related articles. After the meeting, I went back to find some more articles. I think I now know what to choose as my final idea.
As suggested by Charlie, I chose a new direction for my third idea:
Name of Your Project? Geosocial Weight Algorithm
What research topic/question your project is going to address? Lots of social networks, including, for example, one that I created with my classmates last year, use geolocation services to sort data. Our social network uses location to not only sort and tag data, but to filter what is available to the user.
What technology will be used in your project? Location APIs, React Native (assuming that I build on to the pre-existing app we have built), Geosocial algorithms.
What software and hardware will be needed for your project? Requirements might include access to Apple devices for development and testing, such as a mac/macbook (could use one in hopper, or another lab), and some version of iPhone for testing, if making a mobile app is the most effective.
How are you planning to implement? Assuming I continue to work on the same basic network that we already began to set up, my implementation would continue in react native. We already have the basic framework of a social networking app, but I think it would be a good foundation on which to apply my ideas for geosocial algorithms
How is your project different from others? What’s new in your project? My main idea for this is to create an algorithm which can efficiently update the viewable radius of a post based on the feedback from other users in the area. This algorithm would have to take many things into account, such as the population density, user density, post density, and how much positive and negative feedback should affect the geographic radius of each post.
What’s the difficulties of your project? What problems you might encounter during your project? Programming in react-native can be a bit of a challenge. This algorithm could be very complex, so it might get tricky to work with.
After updating my ideas, I picked out 5-7 articles for each of my ideas, quickly finding an abundance of relevant material. I also set up a meeting at the writing center this weekend to discuss how to get started writing, and I will start reading into the articles I found soon. I plan to try and meet with Charlie again soon to discuss my strategy moving forward, and to talk about the quality of the sources I aquired this week.
For this week, I updated my 3 proposal ideas and met with Charlie to review and seek his advice. Prior to this meeting, I found my 5 articles for each and used these discuss potential problems if each idea was used.
Post meeting w/ Charlie, I read a little deeper into the articles and now have a better idea on which proposal I may want to further investigate. Before I ead into the articles, I read the two PDF’s on Moodle about reading research papers and tried to implement this into my first and second pass of each article.
My first idea stays the same, making an online food ordering system that works over wifi. I need to think about how to differentiate my product, what is new with this, and how to test it.
A web platform aimed at helping people learn/practice math. Using a template, it would generate math questions, and allow people to try to solve them. It would also provide step by step solutions for people. This would be especially applicable to calculus. There are APIs that help with this, at least some open source.
A time tracker app for TAs, who are currently logging in their hours on a Google spreadsheet. This site could allow TAs to clock-in and clock-out, take attendance, and also show them how much they worked in each pay period. The app would also update the currently existing Google sheet with the new logs. This can also be packaged in a way that others can also use it.
- Name of Your Project? Corner Finder AI
- What research topic/question your project is going to address? Can deep learning and/or neural networks be used over a set of imagery to identify rectilinear structures which might be undetectable to the human eye (in other words, structures that are buried, or very obfuscated).
- What technology will be used in your project? Drones and drone imagery, as well as machine learning concepts like deep learning with training data and neural networks.
- What software and hardware will be needed for your project? I would like to make this project so that it can be used with the data the IFS trip collected from Iceland this summer. This also gives me an advantage with testing/learning data, as we created quite a stockpile of the kind of drone imagery I would like to analyse.
- How are you planning to implement? I plan to use python, or another language to create the software for this, possibly with support from some libraries (depending on language).
- How is your project different from others? What’s new in your project? From my first hand experience and preliminary research, it seems that most image analysis like this is done through direct processing of data, rather than by training an AI. The scripts we have used are mostly filtering the data that is there, rather than analyzing patterns. I think that this different approach might help us find things we wouldn’t find otherwise.
- What’s the difficulties of your project? What problems you might encounter during your project? The main difficulty will likely be the complicated nature of this kind of software. I have a medium level of experience with AI concepts, but something at this level will undoubtedly get murky with details eventually.
- Name of Your Project? Scatter Box
- What research topic/question your project is going to address? Is there an analysis based solution for organizing sounds based on their parameters? Can it be applied to music creation / production / organization? Can it be applied to scientific data? Can it be made efficient enough to be practical? Can it be abstracted to be broader; perform the same analysis over images, or text. In many ways, what I want this to be is a user-friendly, abstract K-nearest-neighbor classifier.
- What technology will be used in your project? Audio Manipulation algorithms, maybe other analysis algorithms for things other than audio (pictures, text), 3D data plotting.
- What software and hardware will be needed for your project? Libraries that include processing ability like Numpy and Pyaudio, 3d mesh plot output (matplotlib has a simple version of this).
- How are you planning to implement? I plan to use python, or another language to create the software for this, possibly with support from some libraries (depending on language).
- How is your project different from others? What’s new in your project? There are many software options for organizing and evaluating sounds. What makes this project stand out is the idea of visualizing the data in terms of the actual parameters of the audio. These parameters could be anything from audio length, to pitch or frequency, and more could be added throughout the course of the project, as testing made it clearer which were the most beneficial to the purpose of the project.
- What’s the difficulties of your project? What problems you might encounter during your project? There could be problems with processing efficiency, as well as with the actual output of the program. I haven’t worked much with programming visuals, let alone 3d visuals.
- Name of Your Project? Earlham CS Info Tool / Resource Concatenation Tool
- What research topic/question your project is going to address? The CS department at Earlham has a huge range of resources available to students through the cs.earlham.edu website. Many of these resources, however, are somewhat buried. Taking myself for example, I didn’t find out about some of the available resources, such as the CS wiki, until after I was a Junior CS major.
- What software and hardware will be needed for your project? Requirements might include access to Apple devices for development and testing, such as a mac/MacBook (could use one in hopper, or another lab), and some version of iPhone for testing if making a mobile app is the most effective.
- How are you planning to implement? I plan to use a platform-agnostic framework, such as React Native, to provide as broad of support as possible to students.
- How is your project different from others? What’s new in your project? All the information to provide students with the help they need is there, but it needs to be connected in a simple, front-facing user interface. My project would provide a single, unified way to search the information made available to students in CS. The project would (in theory) be able to search all the resources, including the CS website, as well as the CS wiki, portfolio page, and any other resources. From a single search bar, the app would be able to return a list, sorted by relevance or most recent, of all the matching information from each one of these resources. My side goal for this project is to create a more abstract version of the concept, capable of applying the same combined search to a user-defined set of web pages and resources.
- What’s the difficulties of your project? What problems you might encounter during your project? I have little experience with crawling web pages for data, which could be a big part of the implementation of the project.
- My first idea stays the same. Some items Xunfei helped me consider is what questions administrators would be prompted to answer, and how to moderate student answers for reliability.
- My second idea is to build a simulation of Earlham’s energy usage. I would research energy simulations and if visualizations of energy usage have been created. My visualization would show how adding renewable energy sources would affect the power grid. It may also show how improving the energy efficiency of certain buildings would save energy on campus.
- My third idea is to research climate simulations. My research would be focused on how simulations have changed over time, and the ways in which they could be made more accessible to the wider public. Visualizations of climate simulations may be a useful tool for the media and to inform policy makers.