Senior Capstone

Finding correlation between fake news and corresponding sentiment analysis

Abstract

Detection of misinformation has become increasingly important in the past few years. A significant amount of work has been done on fake news detection using natural language processing tools combined with various filtering algorithms. However, these studies did not examine any possible connection between the tone of a news item and its validity. To investigate whether such a correlation exists, my project addresses the potential role that the sentiment associated with a news item plays in identifying its validity. I perform sentiment analysis on tweets through natural language processing and use neural networks to train the model and test its accuracy.

Final Paper

Final Software Delivery

Final Software Diagram

Senior Capstone – Wildfire Simulation Program

Abstract

With the increase in the number of forest fires worldwide, especially in the western United States, there is an urgent need for a reliable fire propagation model to aid firefighting and to save lives and resources. Wildfire spread simulation is used to predict possible fire behavior, which is essential for fire management and training. This paper proposes an agent-based model that simulates wildfire using a cellular automata approach. The proposed model incorporates a machine learning technique to calculate the ignition probability automatically, without the need to manually adjust the input data for a specific location.

Software architecture diagram

The program includes three large components: the input service, the simulation model, and the output service.

The input service processes user inputs. The inputs include the diffusion coefficient, ambient temperature, ignition temperature, burn temperature, matrix size, and a wood density data set. Any input the user does not provide is set to a default value. The wood density data set can also be auto-generated if a real-world data set is not provided.
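As a rough illustration of how an input service like this could handle defaults, here is a minimal Python sketch. The class name `SimulationInputs`, the field names, every default value, and the uniform random generator for wood density are assumptions made for this example; they are not taken from the project's code.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SimulationInputs:
    """Hypothetical container for the inputs listed above.

    All default values are illustrative assumptions, not the project's defaults.
    """
    diffusion_coefficient: float = 0.2
    ambient_temperature: float = 25.0
    ignition_temperature: float = 300.0
    burn_temperature: float = 800.0
    matrix_size: int = 50
    wood_density: Optional[List[List[float]]] = None
    seed: int = 0  # only used when the wood density grid is auto-generated

    def __post_init__(self) -> None:
        n = self.matrix_size
        if self.wood_density is None:
            # No real-world data set supplied: generate a random grid in [0, 1].
            rng = random.Random(self.seed)
            self.wood_density = [
                [rng.uniform(0.0, 1.0) for _ in range(n)] for _ in range(n)
            ]
        elif len(self.wood_density) != n or any(
            len(row) != n for row in self.wood_density
        ):
            raise ValueError("wood_density must be a matrix_size x matrix_size grid")
```

Calling `SimulationInputs()` with no arguments yields a fully populated configuration, while `SimulationInputs(matrix_size=2, wood_density=[[1.0, 0.5], [0.2, 0.0]])` keeps the supplied grid and fills in only the remaining values.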

The simulation model is the most important component of the program. It consists of two main parts: the fire temperature simulation service and the wood density simulation service. As the names suggest, the fire temperature simulation service is responsible for computing how fire temperature changes throughout the simulation. The wood density simulation service computes how the wood density of the locations described in the input changes as fire passes through them.
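To make the split between the two services concrete, here is a minimal sketch of a single time step on a square grid. It is not the project's model: it replaces the machine-learned ignition probability with a fixed ignition threshold, treats everything outside the grid as being at ambient temperature, and uses a simple explicit diffusion update (which generally needs a diffusion coefficient of at most 0.25 to stay stable on a unit grid). The function names, the `burn_rate` parameter, and all default values are assumptions made for this example.

```python
from typing import List, Tuple

Grid = List[List[float]]

# Offsets of the four orthogonal neighbours of a cell.
NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def is_burning(temp: float, wood: float, ignition: float) -> bool:
    """A cell burns while it is at or above ignition temperature and has fuel."""
    return temp >= ignition and wood > 0.0


def next_temperature(temp: Grid, wood: Grid, diffusion: float,
                     ambient: float, ignition: float, burn: float) -> Grid:
    """Fire temperature service: burning cells are held at the burn
    temperature, all other cells exchange heat with their neighbours."""
    n = len(temp)
    new_temp = [row[:] for row in temp]
    for i in range(n):
        for j in range(n):
            if is_burning(temp[i][j], wood[i][j], ignition):
                new_temp[i][j] = burn
                continue
            total = 0.0
            for di, dj in NEIGHBOURS:
                ni, nj = i + di, j + dj
                inside = 0 <= ni < n and 0 <= nj < n
                total += temp[ni][nj] if inside else ambient
            # Discrete diffusion: move towards the neighbour average.
            new_temp[i][j] = temp[i][j] + diffusion * (total - 4.0 * temp[i][j])
    return new_temp


def next_wood_density(temp: Grid, wood: Grid, ignition: float,
                      burn_rate: float) -> Grid:
    """Wood density service: burning cells lose a fixed amount of fuel."""
    return [
        [max(0.0, w - burn_rate) if is_burning(t, w, ignition) else w
         for t, w in zip(temp_row, wood_row)]
        for temp_row, wood_row in zip(temp, wood)
    ]


def step(temp: Grid, wood: Grid, diffusion: float = 0.2, ambient: float = 25.0,
         ignition: float = 150.0, burn: float = 800.0,
         burn_rate: float = 0.1) -> Tuple[Grid, Grid]:
    """Advance both grids by one time step, reading only the current state."""
    return (
        next_temperature(temp, wood, diffusion, ambient, ignition, burn),
        next_wood_density(temp, wood, ignition, burn_rate),
    )
```

Both services read only the current grids and return new ones, so the order in which they run within a step does not matter. Once a cell's wood density reaches zero it stops acting as a heat source and cools back toward the ambient temperature.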

The final component, the output service, creates a graph at each time step and combines the graphs into a GIF file. The GIF lets users visualize how the fire spreads given the initial inputs.

Design Overview
Simulation Model

Link to the final version of the paper

https://drive.google.com/file/d/1d4UQhkRWoYSxDWYb5SY-lca6QLaZFJMZ/view?usp=sharing

Link to the final version of the software demonstration video (hosted on YouTube)

https://youtu.be/u7L3QIGLRFg

Senior Capstone

Sarcasm Detection Using Neural Nets

Abstract

Over the last decade, researchers have come to realize that sarcasm detection is more than just another natural language task such as sentiment analysis. Problems such as human error and long processing times arise because earlier researchers manually crafted the features used to detect sarcasm. To limit these problems, researchers moved away from models based on hand-crafted features and turned to neural networks to predict sarcasm. To understand sarcasm, one needs some background information on the topic and common shared knowledge, and one must also be present in the context in which the sarcastic statement is made. With this in mind, introducing visual aspects of a conversation would help improve the accuracy of a sarcasm prediction model.

Paper
Software Demo Video
Software Architecture Diagram

Senior Capstone, Looking Back

Abstract

An ever-present problem in politics and governance in the United States is unfairly partisan congressional redistricting, often referred to as gerrymandering. One proposed method for removing gerrymandering is to use software to create nonpartisan, unbiased congressional district maps, and some researchers have already done work along these lines. This project seeks to be a tool with which one can create congressional maps while adjusting the weights of the various factors it takes into account, and then evaluate these maps by using the Monte Carlo method to simulate thousands of elections to see how ‘fair’ the maps are.

Software Architecture Diagram

As shown in the figure above, this software creates a congressional district map based on pre-existing datasets (census and voting history) as well as user-defined factor weighting. The map is then evaluated with a Monte Carlo method that simulates thousands of elections to assess its fairness. The census data, which includes race/ethnicity, income, age, gender, geographical location (urban, suburban, or rural), and educational attainment, is used both for the user-defined factor weighting and for determining the likelihood of voting for either party (Republican or Democrat). The voting history is a precinct-by-precinct record of Congressional races and carries heavy weight in the election simulation.
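The Monte Carlo evaluation step can be illustrated with a small Python sketch. This is a simplified stand-in, not the project's method: it assumes each district has already been reduced to a single expected Democratic vote share (in the project this would come from the census and voting history data), and it perturbs those shares with a shared national swing plus independent district noise. The function names, the noise model, and the standard deviations are all assumptions made for this example.

```python
import random
from typing import List


def simulate_elections(district_dem_share: List[float],
                       n_simulations: int = 10000,
                       national_sd: float = 0.03,
                       district_sd: float = 0.05,
                       seed: int = 0) -> List[int]:
    """Return a histogram: index k holds the number of simulated
    elections in which Democrats won exactly k districts."""
    rng = random.Random(seed)
    seat_counts = [0] * (len(district_dem_share) + 1)
    for _ in range(n_simulations):
        # One swing shared by every district in this simulated election.
        swing = rng.gauss(0.0, national_sd)
        seats = 0
        for share in district_dem_share:
            # Independent district-level noise on top of the shared swing.
            if share + swing + rng.gauss(0.0, district_sd) > 0.5:
                seats += 1
        seat_counts[seats] += 1
    return seat_counts


def expected_seats(seat_counts: List[int]) -> float:
    """Average number of districts won across all simulated elections."""
    total = sum(seat_counts)
    return sum(k * count for k, count in enumerate(seat_counts)) / total
```

One possible fairness check, again only an assumption here, is to compare the expected seat share, `expected_seats(counts) / number_of_districts`, with the average vote share across districts. A large gap between the two suggests that the map favors one party.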

Research Paper

The current version of my research paper can be found here.

Software Demonstration Video

A demonstration of my software can be found here.

Senior Capstone

An Integrated Model for Offline Handwritten Chinese Character Recognition Based on Convolutional Neural Networks

Abstract

Optical Character Recognition (OCR) is an important technology in computer vision and pattern recognition that recognizes text embedded in images. Although OCR achieves high accuracy for languages with alphabet-based writing systems, its performance on handwritten Chinese text is poor due to the complexity of the Chinese writing system. To improve the accuracy rate, this paper proposes an integrated OCR model for Chinese handwriting that combines existing methods in the pre-processing, recognition, and post-processing phases.

Paper

Software Demo Video

Software Architecture Diagram

Phi Nguyen – Senior Capstone

Abstract

Modern internet architecture faces the challenge of services centralized by big tech companies, which capitalize on users’ information. Most well-known chat services currently depend on a third-party server that stores the users’ conversations. We also face the challenges of regulation and government authorization. To solve these problems, we propose a peer-to-peer architecture for video chat that keeps conversations private to the people involved.

Paper link: https://drive.google.com/file/d/1RLIgvjrtHJrrZsttzxtxxppFCcdex85x/view?usp=sharing

Demo link: https://www.youtube.com/watch?v=9AcGwwbwukY