Senior Capstone, Looking Back

Abstract

An ever-present problem in United States politics and governance is unfairly political congressional redistricting, often referred to as gerrymandering. One proposed method for countering gerrymandering is to use software to create nonpartisan, unbiased congressional district maps, and some researchers have already done work along these lines. This project seeks to be a tool for creating congressional maps while adjusting the weights of the various factors it takes into account, and for evaluating those maps by using the Monte Carlo method to simulate thousands of elections to see how ‘fair’ the maps are.

Software Architecture Diagram

As shown in the figure above, this software creates a congressional district map from pre-existing datasets (census data and voting history) and user-defined factor weights. The new map is then evaluated for fairness with a Monte Carlo simulation of thousands of elections. The census data, which includes race/ethnicity, income, age, gender, geographical location (urban, suburban, or rural), and educational attainment, is used both for the user-defined factor weighting and for determining the likelihood of voting for either party (Republican or Democrat). The voting history is a precinct-by-precinct record of Congressional races and is weighted heavily in the election simulation.
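
As a rough illustration of the evaluation step, the sketch below simulates many elections over a fixed set of districts. The per-district win probabilities, the function name `simulate_elections`, and the choice of expected seat count as the summary statistic are all assumptions made for this example; the project's actual model and fairness measure are not described here.

```python
import random

def simulate_elections(win_probs, n_trials=10000, seed=0):
    """Monte Carlo sketch: mean number of seats won by one party.

    win_probs holds hypothetical per-district probabilities that the
    party wins; each district is treated as an independent draw.
    """
    rng = random.Random(seed)
    total_seats = 0
    for _ in range(n_trials):
        # One simulated election: count the districts the party wins.
        total_seats += sum(1 for p in win_probs if rng.random() < p)
    return total_seats / n_trials

# Hypothetical five-district map (values are illustrative only).
districts = [0.9, 0.7, 0.5, 0.3, 0.1]
mean_seats = simulate_elections(districts)
print(f"expected seats: {mean_seats:.2f} of {len(districts)}")
```

In a sketch like this, a map could be compared against others by how far the expected seat share drifts from the overall vote share, although that particular comparison is only one possible choice.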

Research Paper

The current version of my research paper can be found here.

Software Demonstration Video

A demonstration of my software can be found here.

CS488 Update

In the past week, I worked on modifying my diagram and finished generating results. I also tried to draw meaningful insights from the validation results. After chatting with Xunfei, I finalized my poster and thought about elements to add to my paper.

CS488 – Week 15

This week I prepared my poster for the final submission while also reworking my paper for the final submission next week. Last week I received feedback regarding user interaction, so this week I researched the best way to build a GUI for my software.

CS 488 – Week 14

I completed the second draft of my paper, which required changes to the diagram and to the motivation for the project. The final draft will have all the required and suggested materials. I am working on the final version of the poster, which requires similar changes. The application is currently in testing. QR code generation, scanning, and the user login/logout process are all set. The aim now is to improve the return-books feature by adding an additional security layer to prevent book theft. The librarian gets a list of books that are away from the library on the homepage, and the student is charged a fine if a book is returned late.

CS 488 – Week 13

I have been wrapping up my project and finishing the final paper and the poster. The obstacle remains the same: the model gives very erratic outputs for sentences that are not in its vocabulary but works well with sentences that it has seen before. I will keep working on this and try to figure out a way to make it work, but this goal might be outside the scope of the project given the current workload and time.

CS488 – Week 14 – Update

During the past week, I submitted the second version of my paper. Since then, I have continued working on the final parts of the paper, which include finishing the social engineering results and making the recommended changes to certain images. Using the feedback I received, I have also started making changes to my senior poster, which is due Sunday.

CS488 – Week 14

This week I submitted the second draft of my paper, which required producing a lot of results as well as time to write. I graphed my model's prediction trends across different kinds of foreign data, which also showed how the model was performing. I received great feedback from my advisor on the poster and paper and will look to improve both this week for the final submission.

Week of April 20th

This last week I tried to connect my C++ code with my Python code. I failed because I couldn’t convert from cv::Mat to a NumPy array; there were several problems with that. Instead, I will save the output of my C++ code as a JPG and read the JPG in Python. This is still considerably faster than using Python for the image processing part.

CS 488 – Week 13

The second video focused primarily on the student application. As of now, a user can search for books by category, such as sports, academics, or fiction. They can also issue books by scanning the QR code. The homepage of the student app needs some work; it will show which book is currently issued and give a brief history of the user and their activities in the library. In the coming week, I will work on the second draft of my poster, and I hope to complete the project and work on testing.

CS 488 – Week 12

I worked on finishing the frontend, wrapping up my project, and figuring out which metrics to use to measure and compare my models. I also worked on the poster for my project. This upcoming week I will wrap up my final paper and add the description of my second model to it.

Week of April 13th

I finished the first draft of the poster. While doing that, I needed to get the accuracies for my AI model and my image processing algorithm overall.

My AI accuracy was around 90% for the testing set. However, I realized my initial idea for measuring image processing accuracy was flawed. I have been working on some new, improved ideas.

CS488 – Week 13

This week we had the first draft of the poster due, which meant producing and visualizing a lot of results from my project. With this motivation, I compared my predictions across very different data (news articles, fictional novels, etc.) and was also able to produce a confusion matrix that showed just how accurate my model was. This coming week I want to transfer these results into the next paper draft and continue with the user flow of the software.

CS 488 Update

I’ve mostly been working on making the poster and validating results. I found out that my original plan for validation would not work as nicely, so I will discuss with Xunfei to figure out what I can do. My diagram also needs to be tweaked.

CS 488 – Week 12

I downloaded some past posters to understand how to present my project idea through a poster. I am working on the second draft of the paper. I am also finishing up the admin app and am on track to complete all the features of the student app by the end of this week.

CS488 – Update

I couldn’t do much due to comps, but I collected all the information from participants needed to conduct the validation process. I also ran their results through my software and recorded the output. This week, I will work on extracting relevant reviews from Sephora to evaluate the efficiency of my method.

CS488 – Week 12

This week I further improved the pre-processing of sentences so that they are cleaner and easier to read on output. I then downloaded some project posters from previous years to help with designing my own and have already completed half of it. Working on the poster showed me that I now need more results to present, particularly on the accuracy of my model outside of its dataset.

CS 488 – Week 11

I have finished the front end of the project, and am trying to wrap it up. One obstacle that I am facing is that my project proposed to have human testing, which will not be possible due to the current situation. I will be working on the poster in the next few days.

CS 488 – Update – April 6th

This past week I mainly worked on the first draft of my poster. It was much easier to complete since I have the majority of my project finished. In the coming week, I plan to continue my testing with my virtual servers, Kali and Metasploitable. Luckily, I have not encountered any obstacles when trying to use these two for testing. I also plan to continue work on the second draft of my paper. 

CS 488 – Week 11

I submitted the second software video and am finishing up the first version of the poster. I also received feedback on the first draft of my paper from Charlie and discussed the paper and the feedback with him. I got valuable suggestions to work on and improve in the second draft.

CS 488 Update

In the past week, I mainly worked on implementing a simple interface for my program. I decided to take text inputs for skin type and beauty effects and use a button that returns the recommended products when clicked. To test my program, I collected more input data from participants. I will be using them in the upcoming week to validate my methods.

CS488 – Week 11

During the past couple of weeks, I made some good progress on my project. I now have a functioning driver file that the user will run to train or validate a model, or to predict from their input data. This has allowed me to tie up different parts of my software into one functioning project. I also have predictions working (big step) and am currently working on the best way for the user to view their results.

CS 488 – Week 9

This past week I worked on my other model, which uses both engaging/readable and non-engaging/readable advertisements. I found that this model does not perform well because of the lack of parallel data and parallel sentence structures between the two kinds of sentences. For the next week, I will work on refining my initial model as well as writing the frontend for the project.

Week of 23 March

I parallelized some of the Python code.

I also rewrote the color processing code in C++. The C++ code for individually changing each pixel is 600 times faster.

I read up on how to make images of food look better and added more algorithms. I want to improve the algorithms and add more of them. They are currently not hooked up to the rest of the code.

Next I need to link the C++ code to my Python code and find a way to generate meaningful variables to pass to the processing algorithms. I might rewrite and benchmark the cropping-and-retargeting code in C++ as well.

CS 488 – Week 9

I am working on the logout procedure and on pulling up books from the database correctly when the QR code is scanned. The profile pages of the librarian and the student need some design elements, since users can view the profile page to see the history of books issued. I am planning to work on the second video in the coming week.

CS488 – Week 8 – Updates

The feature extraction module is now finished and ready to use, but I am still stuck on the modeling module. The model I am using is called VGGVox, which is available on Keras. I am stuck on input pre-processing. The error surfaces in a call to BatchNormalization(), which normalizes the activations of the previous layer at each batch. However, the issue is not in this function itself; it is in some deep layers of TensorFlow’s internal functions, which I cannot modify. I am not sure exactly which step is wrong.

Project Description: Gender Bias Detection Using Facebook Reactions

Gender bias on Facebook might be measured by analyzing the difference in reactions to posts by women and by men. My project studies bias on the Facebook pages of United States politicians using Facebook Reactions and post comments. Specifically, I am focusing on politicians running for US Senate in 2020. Data is being collected from the politicians’ Facebook pages using a crawler and will be stored in a database.

The data will be analyzed by performing sentiment analysis on the comments and using an entropy function on the reactions for each post. The comment analysis is both focused on whether a comment contains more negative or positive words, and if it contains more personal or professional related words. My hypothesis is that female politicians may have comments directed at them that are both more negative, and more focused on personal issues. I am using an entropy function on the reactions to each post to measure how divided the reactions are. Related work used an entropy function on reactions to measure the controversy of a post. My hypothesis is that, in general, posts by female politicians will be more controversial than posts by male politicians.
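
To make the idea of measuring how divided reactions are more concrete, here is a minimal sketch that assumes the entropy function is Shannon entropy over the share of each reaction type; the post does not say which entropy function is used. The function name `reaction_entropy` and the reaction counts are hypothetical.

```python
import math

def reaction_entropy(counts):
    """Shannon entropy (in bits) of a post's reaction distribution.

    counts maps a reaction name to how many times it was used.
    Higher values mean the reactions are more evenly divided.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for n in counts.values():
        if n > 0:  # zero-count reactions contribute nothing
            p = n / total
            entropy -= p * math.log2(p)
    return entropy

# Hypothetical reaction counts for two posts.
unanimous = {"like": 100, "love": 0, "angry": 0}
divided = {"like": 50, "love": 0, "angry": 50}
print(reaction_entropy(unanimous), reaction_entropy(divided))
```

Under this assumption, a post whose reactions are all of one type scores zero, while a post split evenly between two reaction types scores one bit.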

CS 488 – Succinct description

My project aims to develop models where we can predict the risk of having cancer based on both numerical data and image data. After training all the models, they will be analyzed to see which has the best accuracy and possible ideas to improve the accuracy. After that we can decide the best model to use if we want to predict the risk of having cancer.

CS 488 – Update – Week 8

Since I uploaded the architecture design last week, this week I will go back to posting the normal updates here. I have been slowly working on the second model that I will compare my initial model to. I have not faced any obstacles yet except the learning curve that comes with Keras, but since Keras is well documented, it does not take me much time to figure out something that I am stuck on. In the upcoming week, I will keep working on the second model and plan to have it finished by the end of spring break.

Week 7

 I fixed the problem where all returned images look the same. I started looking into how to parallelize the code and learning photography techniques to make images better. 

I have also made a more presentable diagram, which I will also use in my paper and poster.

CS488 – Elevator Pitch

Information informs our entire lives. Information shapes public opinion which shapes things like public policy, elections, the health and safety of the public, and more. No one is above the harm that can come from misinformation, which is why we need to fight against its spread.

Fake News as an area of research is relatively new and so some of the aspects are not very well researched, and new aspects to research pop up. Some existing problems in this research are that all of the solutions to these aspects are made in isolation, therefore no one solution can be used to find all instances of fake news, and that most solutions do not have an accessible, comprehensive user platform to disseminate their solution to the people.

The solution I will provide is a functional model of a user platform that demonstrates how an engaging and accessible one-stop shop for fake news detection can work. It allows the user to interact in many different ways that require different levels of effort, and it can scale to include many different automatic detection methods.

CS 488 – Week 8 Update

In the past week, I finished implementing non content-based filtering, which recommends products based on the user’s skin type and desired beauty effects. I was able to apply the concept of TF-IDF to judge which ingredients are heavily related to each beauty effect. Now that all my methods are working, I will add widgets to the Python notebook to create a simple interface so that I don’t need to change the input each time. I will also start revising the paper and validating my method.
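
As an illustration of how TF-IDF can highlight ingredients tied to a particular beauty effect, the sketch below treats each effect as a "document" made of ingredient names. The data, the function name `tf_idf`, and the plain `tf * log(N / df)` weighting are assumptions for this example, not the project's actual data or formula.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return {doc_name: {term: score}} using tf * log(N / df).

    docs maps a document name to its list of terms. A term that
    appears in every document gets a score of zero.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for terms in docs.values():
        doc_freq.update(set(terms))  # count each term once per document
    scores = {}
    for name, terms in docs.items():
        term_counts = Counter(terms)
        scores[name] = {
            term: (count / len(terms)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        }
    return scores

# Hypothetical ingredient lists pooled by beauty effect.
effects = {
    "hydrating": ["water", "glycerin", "hyaluronic acid", "glycerin"],
    "brightening": ["water", "vitamin c", "niacinamide"],
}
scores = tf_idf(effects)
top_hydrating = max(scores["hydrating"], key=scores["hydrating"].get)
print(top_hydrating)
```

An ingredient such as water, which shows up under every effect, is weighted down to zero, while ingredients concentrated in one effect rise to the top.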

CS488 – Elevator Pitch

My project is about extracting features from images. Using low-cost collection techniques such as satellite imagery or drone surveys, a database of positive and negative cases can be created. Additional information will be extrapolated from each image in the database using a combination of modern algorithms and combined back into a single image as different colored layers of a JPEG image. These processed images, the goal of which is to provide as much information as possible, are used to train a machine learning model. Hypothetically, the additional information provided by the edge detection algorithms will enhance the accuracy and reliability of the machine learning model, reducing the need for expensive surveying equipment.

CS488 – Week 8

This week I focused more on refining my idea and how it would flow for a user, which then helped me to create a flow diagram for this week. During this process, I realized some flows in my code were inefficient, so I changed the flow of information through certain functions to match up with my flow diagram.

I created a validation function to test a loaded model and also an argument parser to make it easier to pass values for different and important variables into the code.
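
For readers unfamiliar with argument parsers, here is a small sketch using Python's standard `argparse` module. The mode names, option names, and defaults are all hypothetical and chosen only for illustration; they are not the project's actual interface.

```python
import argparse

def build_parser():
    """Build a command-line parser for a hypothetical model driver."""
    parser = argparse.ArgumentParser(
        description="Hypothetical driver for training or validating a model"
    )
    # Positional argument selecting what the driver should do.
    parser.add_argument("mode", choices=["train", "validate", "predict"])
    # Optional values that would otherwise be hard-coded variables.
    parser.add_argument("--model-path", default="model.h5",
                        help="where to save or load the model")
    parser.add_argument("--epochs", type=int, default=5,
                        help="number of training epochs")
    return parser

# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["validate", "--model-path", "saved_model.h5"])
print(args.mode, args.model_path, args.epochs)
```

Passing values this way means a run can be changed from the command line without editing the code.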

CS488 – Week 7 – Updates

I developed the feature extraction module for my project, and it is working. It now converts a voice input file (.wav) into a sequence of acoustic feature vectors. I tested it with my own voice. Two recordings of my voice produce two very different sequences of vectors, but I don’t think we can tell much by looking at these numbers; they are just lists of numbers derived from the .wav file. I am still having bugs in my modeling module. I followed Charlie’s suggestion to learn TensorFlow from the basics. I built and trained a model with one of TF’s datasets, and it worked. But this is just a basic try. I will keep looking at it.

CS488 – Week 6 – Updates

Last week CS was down, so I couldn’t post my Week 6 updates. I finally finished setting up the environment for my modeling module code. I am using the VGGVox models, which were created by the same authors as the dataset I am using. I almost gave up on this resource because it is written in Matlab, which I have never used before, but then I found a Python resource that showed me how to import the model. However, I am still hitting bugs when running it. The error says that true_fn and false_fn have different data types. I traced the error and found that it is raised in TensorFlow’s internal files, which I cannot modify, but I don’t know at which step I am passing data incorrectly.

Elevator Pitch

My senior project is to develop a technology that provides higher performance and security for target applications. It is called a unikernel, which is an optimized library operating system. A unikernel consists of the minimum set of components that a target application requires from a complete operating system. Unikernels are lightweight and offer stronger isolation than containers. In the near future, they will be the trend in runtime environments for applications in many fields, such as cloud computing and high-performance computing.

CS488 – Week 7 Updates

In the past week, I worked on creating a survey to take inputs for content-based filtering, modified the skin type test questions, and obtained some responses. I also worked on implementing non-content-based filtering using TF-IDF which I am struggling with. I will be meeting with Xunfei on Thursday and try to finish this part as soon as possible.

CS 488 – Elevator Pitch

My project aims to create a skincare product recommender system based on the user’s skin type and the ingredient composition of a product. The main component of the project is content-based filtering, and the secondary component is non content-based filtering. For content-based filtering, a user provides his or her skin type and selects a skincare product from sephora.com. The system then identifies the chemical components of products and uses cosine similarity to recommend products that have similar ingredient compositions. Five recommendations for each product category are then returned to the user. Non content-based filtering lets users skip selecting a product if they lack knowledge or have not found a product they like. Instead, a user provides his or her skin type and desired beauty effect to obtain the top 5 product recommendations across all 6 categories.
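
The content-based step can be sketched roughly as follows, assuming each product is represented as a binary vector over the ingredient vocabulary. That representation, the product names, and the helper names (`ingredient_vector`, `cosine_similarity`, `recommend`) are assumptions made for illustration.

```python
import math

def ingredient_vector(ingredients, vocabulary):
    """Binary vector: 1.0 where the product contains the ingredient."""
    present = set(ingredients)
    return [1.0 if item in present else 0.0 for item in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(chosen, catalog, k=5):
    """Return up to k other products ranked by ingredient similarity."""
    vocabulary = sorted({i for items in catalog.values() for i in items})
    target = ingredient_vector(catalog[chosen], vocabulary)
    ranked = sorted(
        ((cosine_similarity(target, ingredient_vector(items, vocabulary)), name)
         for name, items in catalog.items() if name != chosen),
        reverse=True,
    )
    return [name for _, name in ranked[:k]]

# Hypothetical catalog; real ingredient lists would be much longer.
catalog = {
    "cream A": ["water", "glycerin", "shea butter"],
    "cream B": ["water", "glycerin", "squalane"],
    "toner C": ["water", "witch hazel", "alcohol"],
}
recommendations = recommend("cream A", catalog)
print(recommendations)
```

Products that share more ingredients with the chosen one rank higher, which matches the idea of recommending similar ingredient compositions.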

CS488 – Week 7

This week I began creating a model using the Keras Python library. I have been training it on the SemEval-2010 Task 8 dataset, reaching around 90% training accuracy over 5 epochs and 60-65% validation accuracy. I was able to save and reload the model successfully.

I will be working on increasing the accuracy of this model in the coming week before applying it outside of its dataset.

CS488 – Elevator Pitch

My project aims to see how applicable semantic relation extraction models are outside of their dataset. Semantic relations are how we draw knowledge and facts from a text. No two texts are the same, and when we do research we usually look for these relationships regarding the subjects in the text that are important to us. I want to see if a normal user can use state-of-the-art semantic models outside of their dataset to decrease the time needed to find specific knowledge about any entity in an unstructured text.

CS 488 – Week 7 – Elevator Pitch

My project aims to develop a reproducible penetration test that can help secure a large network. Tests will come from three different avenues: physical testing, technical testing, and social engineering. The results from these tests will be put together in a final report and given to the appropriate people, who can make changes as needed.

CS 488 – Week 6

— Elevator Pitch — 

My project aims to use a sequence to sequence encoder-decoder model to make text-based advertisements more engaging and readable. This will help businesses get an edge over their competitors by attracting new customers as well as retaining their existing customers by making sure that their advertisements are readable and engaging to their target audience. This will be done through the analysis of pre-existing advertisements which will then be used to train the model on how to restructure sentences to make them more readable and engaging. 

CS 488 – Week 7

My project is an application used in a library to issue and return books using QR codes. The app is primarily intended for college libraries. Using personal smartphones, users can scan the QR code and check out books, which reduces human work and the average time spent in the library. Users can also search for any book in the library and learn basic information rapidly.

The login data and book data are stored in a Firebase database. The librarian application involves 3-4 staff users who can manage the flow of books through the app and who are notified when fraudulent activity takes place.

CS 488 – Update – Week 6

  • This week I focused on getting part of the PoliticalNews data set from Castello et al. to work with Weka, to see if I can recreate the results of their classification methods
    • Downloaded a tool to combine Excel files into one sheet without data loss, then manually added headers and an extra column denoting which entries were fake and which were real
    • But Weka still won’t load the data so that I can test it
  • Next week I will focus on making smaller versions of the data set to see which features are the issue for Weka and on testing features individually; I will also look into Keras as a machine learning tool and see what kind of testing can be done

CS488 – Week 6

This week I was trying to run two different TensorFlow models with checkpoints; however, I could not get the checkpoints to work, which is key to my project. After a discussion with Dave, I have decided to implement a simple model myself using the Keras library, since it is more abstracted and well documented, so it shouldn’t take too much time. I will be aiming for a minimum of 50% accuracy with my model on the SemEval-2010 Task 8 dataset, which I think is the best dataset choice for this task.

Following this implementation, I then want to start testing my model outside its dataset.

Week 6

I did not manage to get the foreground (the food) with zero user interaction. I have achieved pretty good success at picking just the food with minimal user interaction. I will try it with a pure white background next.

I can now successfully manipulate individual contour areas.

For next week, I will (finally) work on learning more about food photography. 


I have most of the standard functionality working. Pretty much the only untouched coding is the resizing.