Week 2

I have spent some time thinking about how to break the timeline down in more detail. I have met with Charlie and decided that the program should take a batch of images as input rather than a video. The next step is to learn more about the photography side of things.

CS 488 – Week 2

This week I looked at some papers on the most recent image classification models to build on for my dataset. I encountered some challenges while reading those papers, since some of the terms were hard to understand. Next week, I will continue to work on the image dataset and model.

CS 388 Idea 2

  1. Name of Project

Visual representation of a nation’s development level

  • What research topic/question your project is going to address?

The goal of this project is to use the available World Bank data to evaluate different development metrics for each nation. I then want to use visualization tools to communicate the results effectively to an interested audience. The visuals will change as the indicators for the countries change, so the website would be a ‘live image’.

  • What technology will be used in your project?

APIs, data visualization tools like Tableau or Python, and statistical tools to calculate the indicators and compare them between nations.

  • What software and hardware will be needed for your project?

Python, SQL, JSON. Possibly a database management system to store the data. Tableau for visuals.

  • How are you planning to implement?

I want to pull the data from various sources, such as the World Bank website, using an API and load it into a database. Using this data, I want to calculate and compare the indicators of development for various countries. The output of these calculations would then be visualized live on a website, and the visuals would change whenever the World Bank dataset changes.

  • How is your project different from others? What’s new in your project?

I want to create a live version of this. I found a few websites that visualize or tabulate these metrics, but they are hard to interpret for people who are not well informed about the topics involved. I want to make my website intuitive enough that people with different levels of experience can read and interpret the data easily.

  • What are the difficulties of your project? What problems might you encounter during your project?

The problem I anticipate is figuring out how to keep the database where I store my data up to date, so that any change in the World Bank data is reflected on the website right away without manual intervention. I will have to learn methods, hopefully readily available ones, that make this possible.
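
A rough sketch of how the load-and-refresh step might look. It assumes the JSON layout of the World Bank Indicators API (v2), a two-element list of paging metadata followed by the records. The table name, the helper names, and the sample values are hypothetical placeholders, not real figures, and the URL pattern should be checked against the official documentation.

```python
import json
import sqlite3

# Assumed URL pattern of the World Bank Indicators API (v2).
API_URL = "https://api.worldbank.org/v2/country/{country}/indicator/{indicator}?format=json"


def build_url(country, indicator):
    """Build the request URL for one country and one indicator."""
    return API_URL.format(country=country, indicator=indicator)


def parse_records(payload):
    """Turn a JSON response into (country, indicator, year, value) rows."""
    _meta, records = json.loads(payload)
    rows = []
    for rec in records or []:
        if rec["value"] is None:  # skip years with no reported value
            continue
        rows.append((rec["countryiso3code"], rec["indicator"]["id"],
                     int(rec["date"]), float(rec["value"])))
    return rows


def upsert(conn, rows):
    """Insert rows, replacing any row with the same country/indicator/year."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS indicators ("
        "country TEXT, indicator TEXT, year INTEGER, value REAL, "
        "PRIMARY KEY (country, indicator, year))"
    )
    conn.executemany("INSERT OR REPLACE INTO indicators VALUES (?, ?, ?, ?)", rows)
    conn.commit()


# Made-up payload in the assumed response shape (placeholder country and values).
SAMPLE = json.dumps([
    {"page": 1, "pages": 1, "per_page": 50, "total": 2},
    [
        {"indicator": {"id": "NY.GDP.PCAP.CD", "value": "GDP per capita (current US$)"},
         "country": {"id": "XX", "value": "Exampleland"},
         "countryiso3code": "XXX", "date": "2019", "value": 1234.5},
        {"indicator": {"id": "NY.GDP.PCAP.CD", "value": "GDP per capita (current US$)"},
         "country": {"id": "XX", "value": "Exampleland"},
         "countryiso3code": "XXX", "date": "2018", "value": None},
    ],
])

conn = sqlite3.connect(":memory:")
upsert(conn, parse_records(SAMPLE))
# In a real run the payload would come from something like
# urllib.request.urlopen(build_url("XXX", "NY.GDP.PCAP.CD")).read()
```

Because the table is keyed on country, indicator, and year, running the same load again on a schedule overwrites changed values instead of duplicating them, which is one simple way to keep the site current without manual steps.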

CS 488 – Week 2

Due to a lack of usable datasets, and after talking to my advisor and instructor, I decided to modify my project to focus on readability and sentiment instead. I researched papers on readability and sentiment this past week and have started writing code in Python (Keras). My goal for next week is to have some working code for a trained network that produces more readable code. I still need to look a bit more into what counts as readable when it comes to marketing material.

CS 488 – Update – Week 2

  • This week I recovered all of the data sets I found last semester that were on my other computer. I then downloaded and extracted the data.
  • I also chose to set up my own Developer SQL database on my laptop so that I can keep my training data and the user data in one accessible place.
  • Because I wasn’t able to have my mentor meeting last week, I wasn’t sure where to begin with all the work I’ve set up. So I’ve decided to go back through all of my notes on the research papers I have read and create a giant spreadsheet detailing the tools used, features used, classification methods used, whether the dataset or the code was available, and if I’ve contacted the authors of these papers for more info.
    • This will help me figure out how I’ll need to create the learning loop to not forget any feature or method.
    • This will also help me show my advisor exactly what was done in previous work and what I have to build on.

CS388 – Week 2 – Second Idea

  1. Name of Your Project

Driver Drowsiness Detection Using Deep Neural Networks

  • What research topic/question your project is going to address?

Driving while sleepy or tired is one of the main causes of traffic accidents. One solution might be a device in the car that monitors the driver’s behavior and facial expressions and sounds an alarm if the driver starts to fall asleep.

  • What technology will be used in your project?

Dataset of facial expressions (images and videos)

  • What software and hardware will be needed for your project?

Python, PyTorch (or Keras)

  • How are you planning to implement?

Build a pipeline that first applies image processing techniques to improve the quality of the images, then trains a model (using neural networks) to detect and locate the face in each image, and finally builds a model (also using deep neural networks) to classify the behaviors and facial expressions.

  • How is your project different from others? What’s new in your project?

Most related projects track the driver’s eyes to see whether they are closed. I am considering checking eye movements as well as other behaviors, such as yawning or nodding off, in order to improve classification performance.

  • What are the difficulties of your project? What problems might you encounter during your project?

There might not be a large enough dataset for me to use.
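
The three-stage pipeline in this proposal (enhance the image, locate the face, classify the behavior) could be wired together roughly as follows. This is only a structural sketch: the class name, the stage names, the labels, and the stub stages are hypothetical, and in the real project the detection and classification stages would be trained neural networks (for example in PyTorch or Keras) rather than the placeholders used here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]          # grayscale frame as rows of 0-255 pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right)


@dataclass
class DrowsinessPipeline:
    enhance: Callable[[Image], Image]              # stage 1: image processing
    detect_face: Callable[[Image], Optional[Box]]  # stage 2: face localization
    classify: Callable[[Image], str]               # stage 3: behavior classification

    def run(self, frame: Image) -> str:
        improved = self.enhance(frame)
        box = self.detect_face(improved)
        if box is None:
            return "no_face"  # nothing to classify in this frame
        top, left, bottom, right = box
        face = [row[left:right] for row in improved[top:bottom]]  # crop to the face
        return self.classify(face)


# Placeholder stages so the sketch runs end to end.
def identity(img):
    return img


def whole_frame(img):
    return (0, 0, len(img), len(img[0])) if img and img[0] else None


def always_alert(face):
    return "alert"


pipeline = DrowsinessPipeline(enhance=identity, detect_face=whole_frame, classify=always_alert)
print(pipeline.run([[10, 20], [30, 40]]))
```

Keeping each stage behind a plain callable means the enhancement step or either model can be swapped out and compared without touching the rest of the pipeline.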

CS 488 – Week 2 – Update

This past week, I started phase 1 of my project, testing the physical security of the network. Along with starting this phase, I began writing the Google survey that will be used with the social engineering experiment. I also ordered the hardware needed for the social engineering test. I have not encountered any obstacles. This next week I will continue to use Wireshark to test the physical network using both a wireless and an Ethernet adapter.

CS488 – Week 2

In the past week, I have spent most of my capstone time organizing my project and testing some options for the machine learning component. I have been working with the fast.ai and ImageAI Python packages, trying to set up some groundwork for when I have data ready.

I have also organized all the algorithms that I want to try, at least until I can compare some results (after I see the results, I may opt to implement more).

My hope for the next week is to make progress on acquiring training data with drones, or at least narrow down where I might want to survey.

CS488 – Week 2

I forked MLMAN, a PyTorch model that achieved the second-highest validation accuracy on the FewRel dataset for semantic relation extraction. Running locally with a useful number of iterations, it took too long to train, so I will train the model on hopper and save it there to fetch for local use. With this saved model, I hope to start pre-processing sentences and feeding them into it for validation.

CS488-Week2-Update

Over this week I finished the non-contiguous IFF and OR gadgets. However, after meeting with Igor, I came to the conclusion that there does not seem to be a way to put these two gadgets together effectively. We also concluded that in most cases it is not particularly difficult to find a gadget for non-contiguous parks if one already knows the equivalent gadget for contiguous parks. Since I have reached a dead end, over the next week I am going to try one promising new direction, and hopefully be close to proving the result for non-contiguous parks within the next two weeks.

CS 488 – Week 2 – Updates

My project is to collect and study the Facebook Reactions and comments on posts by U.S. politicians to see if bias exists based on the gender of the politician. On Charlie’s advice, I have decided to focus my project on the 2020 Senate races. The 2020 Presidential election doesn’t have enough candidates to be a good sample size. The 2020 House races would likely have a wide variety of candidate strategies based on the district, many districts with no competition, and fewer voters per race. By contrast, the Senate races have enough candidates to be a good sample size, while also having more voters per race, meaning there should be more Facebook Pages with enough user activity to be used in my dataset.

This week I found sources for the Senate races, created a spreadsheet for candidates, and decided which columns should be in the spreadsheet. I am filling out the sheet first for races where the filing deadline for the primary has passed. Next, I plan to learn how to access the Facebook API using the Facebook SDK Python library, and to collect sample data for candidates I have already added to the spreadsheet.

CS 488 – Week 2 – Updates

I decided to change my modeling method to neural networks. I read a paper called “Text-Independent Speaker Verification Using 3D Convolutional Neural Networks” and checked the authors’ resources on GitHub. I tried to run their demo, but the required packages couldn’t be installed on my laptop. I probably need to request a place to run it on the CS cluster from the SysAdmins. I also found other similar resources on GitHub. My next step is to run them with test files. I also had the first weekly meeting with my advisor Xunfei to discuss the timeline and future plans.

CS 488 – Week 2

I made a visualization (plot) displaying ingredient composition similarity between different products and skin types. I attached two drop-down options for users to select from product categories and skin types. I also attached labels to the graph so that it displays the product’s name, brand, price, and rank. 

CS 488 – Week 2

I am working on the login and sign-up system and reading existing papers that discuss such systems. Sign-up will be via Zimbra only, since using Facebook and other applications could lead to fraudulent accounts. The home page should be ready by next week.

CS388 – Week 1 – First Idea

  1. Name of Your Project

Detect and Translate Chinese text in images

  • What research topic/question your project is going to address?

Lately, many translator applications have introduced a feature that scans a document or a photo containing text, detects the text, and translates it into another language.

Many of these applications perform well with neat, clear handwriting or high-quality images, but less well with cursive handwriting or low-quality images. My research goal is to improve detection performance in these cases.

  • What technology will be used in your project?

Chinese – English Dictionary API

  • What software and hardware will be needed for your project?

Python, PyTorch, MATLAB

  • How are you planning to implement?

Build a pipeline that first enhances the quality of the image data using image processing techniques, then feeds the data to a deep neural network model (possibly a CNN) to detect the Chinese characters, and finally connects to a dictionary API to translate the text into English.

  • How is your project different from others? What’s new in your project?

Current applications do not perform very well on low-quality images, so my goal is to find solutions to this limitation of translation apps.

  • What are the difficulties of your project? What problems might you encounter during your project?

I did some experiments and found that even big apps like Google Translate still have trouble detecting untidy handwriting. It could therefore be very challenging to achieve my research goal.
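
As one small illustration of the image-enhancement stage mentioned in this proposal, here is a sketch of linear contrast stretching on a grayscale image. The choice of technique is an assumption made for illustration (the proposal does not name a specific method), and the function name and the nested-list image format are hypothetical; a real pipeline would more likely use an image library.

```python
def stretch_contrast(image, low=0, high=255):
    """Linearly rescale pixels so the darkest maps to `low` and the brightest to `high`."""
    pixels = [p for row in image for p in row]
    if not pixels:  # empty image: nothing to do
        return []
    darkest, brightest = min(pixels), max(pixels)
    if brightest == darkest:  # flat image: no contrast to stretch
        return [row[:] for row in image]
    scale = (high - low) / (brightest - darkest)
    return [[round(low + (p - darkest) * scale) for p in row] for row in image]


# A washed-out 2x2 patch whose values only span 100..150.
print(stretch_contrast([[100, 110], [120, 150]]))  # → [[0, 51], [102, 255]]
```

Spreading a narrow range of gray levels over the full 0 to 255 range can make faint strokes easier to separate from the background before the image reaches the character detector.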

CS 488 – Week 1

I went through the project again because it has been a while since I took CS 388 last spring. I downloaded the dataset and started doing some data manipulation and preprocessing. I will start looking at models for the image dataset next week.

CS 488 – Week 1 – Update

This week I worked on setting up Keras and completed a course on deep learning using Keras (Learn Keras: Build 4 Deep Learning Applications). As I prepared to implement the project, one of the significant challenges I encountered was finding an appropriate dataset to train my neural network. Since my project aims to make a business’s marketing material more engaging, I need a labeled dataset that gives a clear definition of what counts as engaging and what counts as non-engaging. After some research and talking to my advisor and the instructor, I am now also looking for datasets that are labeled by reading level (for example, hard to read or easy to read). The main goal for next week is to have a concrete dataset that I can train my neural network with.

CS 488 – Week 1 Update

In the past week, I loaded the data, extracted ingredients from products, and made a document-term matrix containing product names and ingredient composition. I plan to visualize ingredient similarity between products this week. I haven’t faced many obstacles yet, but I want to finish things earlier than planned to allow some time for future obstacles.
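
A minimal sketch of the document-term matrix step described above, with one row per product and one column per ingredient. The product names and ingredient lists are made-up placeholders, the function names are hypothetical, and cosine similarity is an assumed choice of similarity measure that the post itself does not specify.

```python
import math


def document_term_matrix(products):
    """products maps a product name to its ingredient list.

    Returns (names, vocab, matrix), where matrix[i][j] is 1 if product
    names[i] contains ingredient vocab[j] and 0 otherwise.
    """
    names = sorted(products)
    vocab = sorted({ing.strip().lower() for ings in products.values() for ing in ings})
    column = {term: j for j, term in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in names]
    for i, name in enumerate(names):
        for ing in products[name]:
            matrix[i][column[ing.strip().lower()]] = 1
    return names, vocab, matrix


def cosine_similarity(a, b):
    """Cosine of the angle between two ingredient vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Placeholder products, not taken from the real dataset.
products = {
    "Cream A": ["Water", "Glycerin", "Niacinamide"],
    "Cream B": ["water", "glycerin"],
    "Oil C": ["Squalane"],
}
names, vocab, matrix = document_term_matrix(products)
print(vocab)
print(round(cosine_similarity(matrix[0], matrix[1]), 3))  # Cream A vs Cream B
```

Products that share most of their ingredients score close to 1 and products with nothing in common score 0, which gives a number that can be plotted directly.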

CS 488 – Week 1 – Update

  • I bought a new computer over the break because my older one was unreliable and crashed unexpectedly from time to time. So I spent this first week setting up the computer and downloading the tools that I believe I’ll be using.
  • I also have spent a lot of time hunting down the data sets from the research papers that I have read and have a collection of over 22 different fake news data sets.
  • I created my presentation slides which helped me think about the project in a different way since I need to think about how to explain things in a way that will make sense to everyone and not just myself.
  • Finally, I chose my adviser and set up a meeting time and shared notes space but we were unable to meet this week since she will be at a conference.

CS488 – Week 1 Update

This week I created the presentation for Wednesday, which helped clarify my current goal after the work I did over break. I have found some new datasets and model repositories online, which I will present to my advisor to figure out which best suits my project. I have also tried to break down my timeline in more detail following the selection of a module for the following month, and I have set personal project goals. I researched some libraries for GUI implementations and am currently leaning towards Electron (JavaScript) or PyQt5 (Python).

CS 488 – Week 1 – Update

This week was mainly for refreshing myself on the details of my project. I finalized Charlie as my advisor for 488 and set up a weekly meeting time with him. I also completed the three-slide PowerPoint in preparation for the presentation in the joint 388/488 class. I adjusted my timeline and plan to start the first phase of my project on Monday. I did not have any obstacles this week. Within this next week I plan to start the physical testing phase of my project.

CS488 – Week 1 Update

This week has been mostly organizational for me. I found some more resources on GitHub that I want to try to make use of, and I worked on my design plan for implementation. I talked with Igor about technologies I can use, and what I might need to use them effectively.

The main obstacle right now is the amount of structure that my project requires, which is why I am taking my time to create a solid plan for how things will connect to one another.

Next week, as my design becomes concrete, I will start coding different segments of my project, using some of the preliminary work I have done as a guide.

CS 488 – Week 1 – Updates

First of all, I chose Xunfei as my advisor; she was also my advisor last semester. We decided on our weekly meeting time. I have read some new papers and decided to change my modeling method from GMM-UBM to neural networks, combined with i-vectors or x-vectors. I have found related code on GitHub for deep neural networks and convolutional neural networks for speaker verification. GMM-UBM is one of the most classical and dominant methods for speaker verification, but its accuracy decreases as the number of users increases. Nowadays there are newer methods that perform better, such as deep neural networks and convolutional neural networks. This change to my project might be more challenging because I am using a new method that probably has fewer resources. But I really want to raise the accuracy of speaker verification above 90%.

CS 488 – Week 1

I am getting familiar with Android Studio. As per my timeline, the first step in the application is to implement the login system. The aim is to decide by the end of this week whether to use Firebase and SQL or only SQL. I have to speak to Charlie about this. I revised my project through the first presentation and submitted the advisor form. Next week, work on the application should begin!