Final Proposal: https://drive.google.com/file/d/1G8MKPq0wwN7Z83wQYncKorOK_4xJpni6/view?usp=sharing
Presentation: https://drive.google.com/file/d/1UC4GmRP-SzGlXY010vyLMKU9AB8n6Y7C/view?usp=sharing
Final Proposal: https://drive.google.com/file/d/1G8MKPq0wwN7Z83wQYncKorOK_4xJpni6/view?usp=sharing
Presentation: https://drive.google.com/file/d/1UC4GmRP-SzGlXY010vyLMKU9AB8n6Y7C/view?usp=sharing
Pitch 1: Website that creates tangible, engaging visualizations of probabilities.
Humans are poor at conceptualizing probability. I notice that one common reason anti-vaxxers use for refusing the vaccine is that they’d rather take their chances with catching covid than get the shot and come down with complications, which is far rarer than the chance of dying from covid. The objective of this project would be to create a website with high user engagement that would inform users of the odds of certain events happening using tangible, engaging visualizations. The user would be able to select from a list of events to compare. Visualizations that represent a dangerous event that is likely to occur will look visually more dangerous than those that are less likely to occur.
I have found some potentially useful datasets to use
Questions
Graphics
Some sources I found that may be useful/interesting:
Pitch 2: Formality indicator for Japanese text
Google translate and Deepl don’t really give users the option to indicate what context a user needs to use Japanese in. One key thing that distinguishes different styles of speech or written communication is the choice in vocabulary. On Jisho, an online Japanese dictionary, some words are marked with what context they are used in. i.e. colloquial, slang, sonkeigo,(Honorific or respectful language). My proposed project would use web scraping from online dictionaries to gather data about which words are used in which context and would tell the user what level of formality the text is in and who it would be appropriate to say/send to.
Ana sent me a list of tools that may be useful. Sudachipy seems promising
Other sources
Pitch 3: Visualization of cause of death
I feel like death statistics are inherently somewhat dehumanizing, and I want to use recent death statistics to create a more human representation. The user would be able to enter certain demographics, such as location, age, sex, etc. This would be pretty similar to pitch one, but I would want to represent each person with something that people would empathize with. Perhaps instead of using the data outright, I would gather patterns from the data to create a fake population. I’m not sure how I might do this
Questions
Problems
I have found several sources for cause of death data on kaggle, but not all are completely recent, which is ideally I am looking for. I did find recent data specifically for Brazil. The CDC has some data as well, I just need to figure out how to get at it.
Pitch #1
The use of computing resources allows the processing of biological data and computational analysis. However in order to conver this data into useful information requires the us of a large number oftools, parameters, and dynamically changing reference data. As a result workflow managers such asSnake and OpenWDL were created to make these workflow scalable, repeatable and shareable. However, many of these workflow managers offer ambiguity toward creating workflows often lacking the specificity many other workflows require. I plan on creating bioinformatics workflow in which can be specified to particular workflows.
Bioshake: A Haskell EDSL for Bioinformatics workflows
Justin Bedő. 2015. Experiences with workflows for automating data-intensive bioinformatics – biology direct. (August 2015). Retrieved January 9, 2022 from https://biologydirect.biomedcentral.com/articles/10.1186/s13062-015-0071-8
Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers
Laura Wratten, Andreas Wilm, and Jonathan Göke. 2021. Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers. (September 2021). Retrieved January 9, 2022 from https://www.nature.com/articles/s41592-021-01254-9
Paper highlights the key features of workflow manager and comapares commonly used approaches for bioinformatics workflows.
A versatile workflow to integrate RNA-seq genomic and transcriptomic data into mechanistic models of signaling pathways
Martín Garrido-Rodriguez et al. 2021. A versatile workflow to integrate RNA-seq genomic and transcriptomic data into mechanistic models of signaling pathways. (February 2021). Retrieved January 9, 2022 from
MIGNON is used for the analysis of RNA-Seq experiments. Moreover, it provides a framework for the integration of transcriptomic and genomic data based on a mechanistic model of signaling pathway activities that allows for a biological interpretation of the results, including profiling of cell activity. Entire pipeline was developed using the Workflow Descriptions Language (OpenWDL). All the steps of the pipeline were wrapped into WDL tasks that were designed to be executed on an independent unit of containerized software by using docker containers, which prevent deployment issues.Paper is an excellent source of seeing how WDL performs as workflow management language and the various problems that can occur from it.
Planning Bioinformatics workflows using an expert system.
Xiaoling Chen and Jeffrey T. Chang. 2017. Planning Bioinformatics workflows using an expert system. (January 2017). Retrieved January 9, 2022 from https://academic.oup.com/bioinformatics/article/33/8/1210/2801462?login=true
ACLIMATISE: Automated Generation of Tool Definitions for bioinformatics workflows.
Michael Milton and Natalie Thorne. 2020. ACLIMATISE: Automated Generation of Tool Definitions for bioinformatics workflows. (December 2020). Retrieved January 9, 2022 from https://academic.oup.com/bioinformatics/article/36/22-23/5556/6039117?login=true
Paper presents aCLImatise which is a utility for automatically generating tool definitions compatible with bioinformatics workflow languages, by parsing command-line help output. This utility can be used withing our workflow to create tool definitions.Workflow definitions must be customized according to the use-case, however tool definitions simply describe a piece of software, and are therefore not coupled to a single workflow or context this aCLImatise will not have a hindrance on workflow creations.
SciPipe: A workflow library for agile development of complex and dynamic bioinformatics pipelines.
Samuel Lampa, Martin Dahlö Jonathan Alvarsson, and Ola Spjuth. 2019. SciPipe: A workflow library for agile development of complex and dynamic bioinformatics pipelines. (April 2019). Retrieved January 9, 2022 from https://academic.oup.com/gigascience/article/8/5/giz044/5480570?login=true
Using rapid prototyping to choose a bioinformatics workflow management system
Michael J. Jackson, Edward Wallace, and Kostas Kavoussanakis. 2020. Using rapid prototyping to choose a bioinformatics Workflow Management System. (January 2020). Retrieved January 9, 2022, from https://www.biorxiv.org/content/10.1101/2020.08.04.236208v1.abstract
Pitch #2
New technologies have been evolving to aid life within the home. Video door bells, cameras and smart devices make many tasks much simpler than they use to be. However, the threat of security and ensuring that those with malicious intent are unable to hack and harm your home network has also increased, a failure in security could expose all of your personal information. As a result of this many organizations provide VPN services that have been developed as a means to protect people from the dangers of malicious hackers and malware. However, these same VPNS come with some faults such as higher cost and limitations as dictated by the provider , and the fact that paid services place you in the hands of the operator and its various cloud/network providers with no certainty that these providers will not snoop around in your data.
A VPN server that a user can host on there local machine solves all of these aforementioned problems with the added benefit of the user being able to securly access and maintain there home network.The server will be held in a virtual machine and will allow the user to be in complete control of it and its functions. This will increase efficiency of the VPN as the user no longer has to go through the network of the provider. My goal is to automate and open-source this process creating an easy launchable VPN server an average user can easily launch and use to maintain access to their home network.While at the same time being capable being edited and changed by the user for more robust security. I seek to compare this to similar paid services identifying which is more secure for the user.
What is a VPN?
Paul Ferguson and Geoff Huston . 1998. What is a VPN. (April 1998). Retrieved January 7, 2022 from https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.169.7689&rep=rep1&type=pdf
Paper defines what a VPN is. Further describes different types of VPN’s such as Network Layer VPN’s how they are constructed and the underlying protocols and techniques used create one. Breaks down the various VPN’s in accordance to the TCP/IP protocol. Describes VPN concepts such as Controlled route leading and Tunnelling. Overall this paper is a good source for understanding the basics of what a VPN is aswell aas the types, and procedures to setup one.
Implementation and analysis ipsec-vpn on cisco asa firewall using gns3 network simulator
Dwi Ely Kurniawan1, Hamdani Arif1, N. Nelmiawati1, Ahmad Hamim Tohari1, and Maidel Fani1. 2019. Implementation and analysis ipsec-vpn on cisco asa firewall using gns3 network simulator. (March 2019). Retrieved January 8, 2022 from https://iopscience.iop.org/article/10.1088/1742-6596/1175/1/012031/meta
This paper provides an example of constructing VPN and testing it using a virtual setting in which is a similar approach in which I am thinking of using. It is built using GNS3 network simulator software and virtual Cisco ASA Firewall. The result shows that VPN network connectivity is strongly influenced by the hardware used as well as depend on Internet bandwidth provided by Internet Service Provider (ISP). In addition to the security testing result shows that IPSec-based VPN can provide security against Man in the Middle (MitM) attacks. However, the VPN still has weaknesses against network attacks such as Denial of Service (DoS) that causes the VPN server can no longer serve VPN client and become crashes.
Enhancing security and privacy in local area network with TORVPN using Raspberry Pi as access point
Mohamad AfiqHakimi Rosli. 2019. Ehancing security and privacy in local area network with TORVPN using Raspberry Pi as access point . (October 2019). Retrieved January 8, 2022 from https://ir.uitm.edu.my/id/eprint/26068/
Provides another method of utilizing VPN servers to protect one’s local network.
Involves the Tor routing technique providing an extra layer of anonymity and encryption.
Although this approach requires the use of Rasberry pie for its implementation it would eliminate the need for installation and configuration of software while also making such services accessible to others.
A Remote Access Security Model based on Vulnerability Management
Samuel Ndichu, Sylvester McOyowo, and Henry Wekesa. 2020. A remote access security model based on … – MECS press. (October 2020). Retrieved January 11, 2022 from https://www.mecs-press.org/ijitcs/ijitcs-v12-n5/IJITCS-V12-N5-3.pdf
Client-Side Vulnerabilities in Commercial VPN’s
Bui Thanh, Rao Siddharth, Antikainen Markku, and Aura Tuomas. 2019. Client-side vulnerabilities in commercial vpns | springerlink. (November 2019). Retrieved January 11, 2022 from https://link.springer.com/chapter/10.1007/978-3-030-35055-0_7
Beyond the VPN: Practical Client Identity in an Internet with Widespread IP Address Sharing
Yu Liu and Craig A. Shue. 2021. Beyond the VPN: Practical client identity in an internet with widespread IP address sharing. (January 2021). Retrieved January 10, 2022 from https://ieeexplore.ieee.org/abstract/document/9314846
Research on network security of VPN technology
Zhiwei Xu and Jie Ni. 2021. Research on network security of VPN Technology. (May 2021). Retrieved January 11, 2022 from https://ieeexplore.ieee.org/abstract/document/9418865
Idea #1
Using machine learning to identify if a webpage is malicious, sometimes websites are blacklisted and that’s how they are identified as malicious but its cumbersome to do that for every website and constant new sites, use ML to identify malicious sites based on keyword density and improve upon existing methods. Other factors that could be used to identify the malicious website are URL length, website age, country of origin. Identifying the most important features to use for ML will be key to the project. A nuance I could add would be to identify the type of attack associated with the URL and rank its severity. Short URLs are a way that malicious attackers attempt to circumvent detection. Being able to expand short URLs in order to extract features could allow for current tools to be more effective.
Idea #2
Calculate expected goals of premier league soccer teams. Expected goals is commonly used as a predictor to help analysts identify skillful players and predict the winning team. There is a mass of datasets to use and techniques that could be analyzed for efficacy and improved upon. A possible nuance I could add is comparing expected goals of a player to their wages or expected goals to the teams’ total wage bill to find efficient teams.
Idea #3
Using machine learning to identify network attacks specifically DOS attacks. Most current methods use huge and cumbersome MIB databases. I would explore more efficient and less time and resource-consuming methods for classifying the data and identifying anomalies within network traffic. Data can be classified by where it comes from, to help determine if it may be malicious. There is less specific research on this topic as most of it is specific to a domain or the data is private.
Idea 1 : Maze Generation
I would like to examine maze generation algorithms for the purpose of generating more challenging domains for search algorithms to solve. Creating domains that challenge existing search algorithms can assist in the development of more robust search algorithms that can avoid certain pitfalls of existing algorithms. Additionally, mazes are widely understood and have efficient state changes, which can allow for more algorithm based examinations in the future. I would like to develop a system for rating the “hardness” of a given maze, as well as creating a maze generation algorithm that can generate mazes that have a higher or lower “hardness” rating.
Idea 2 : Cave Generation Using Cellular Automata
I would like to experiment with using cellular automata to generate cave structures. This has applications in procedural level generation for video games, artistic potential, and depending on the techniques used, it could also be useful in real world geological simulations. There has already been some work done in the area which gives ample room for extension and exploration. Most of the materials seem to be focused on 2D maze generation, so it may be fruitful to focus on generating 3D structures, or extending the existing techniques from 2D CA to 3D.
Idea 3 : Origami Crease Pattern Generation Tool
I would like to build an application that generates an origami crease pattern from a 3D model, with a stretch goal being to implement 3D scanning from a camera. Folding techniques used in origami are seeing more and more uses in engineering contexts. The new Webb telescope for instance used origami for its heat shield and mirrors. Being able to generate crease patterns based on a source model may allow for more widespread use of similar techniques. This project will involve image analysis and potentially machine learning.
Maze generation and solving:
Web application security:
Pitch #1
We all at some point have received that suspicious message stating that we are being watched or an annoying pop up in which insists that our devices are riddled with virus’s. I seek to find out how often and by what measure are people being trully attacked on there smart devices. As many smart devices do not offer robust cyber security systems they are more vulnerable to attack than other devices like computers. This software will provide an insight into the presence of hackers and malware on smart devices gathering data on the types of attacks to be wary of.
Pitch #2
New technologies have been evolving to aid life within the home. Video door bells, cameras and smart devices make many tasks much simpler than they use to be. However, the threat of security and ensuring that those with malicious intent are unable to hack and harm your home network has also increased, a failure in security could expose all of your personal information. As a result of this many organizations provide VPN services that have been developed as a means to protect people from the dangers of malicious hackers and malware. However, these same VPNS come with some faults such as higher cost and limitations as dictated by the provider , and the fact that paid services place you in the hands of the operator and its various cloud/network providers with no certainty that these providers will not snoop around in your data.
A VPN server that a user can host on there local machine solves all of these aforementioned problems with the added benefit of the user being able to securly access and maintain there home network.The server will be held in a virtual machine and will allow the user to be in complete control of it and its functions. This will increase efficiency of the VPN as the user no longer has to go through the network of the provider. My goal is to automate and open-source this process creating an easy launchable VPN server an average user can easily launch and use to maintain access to their home network.While at the same time being capable being edited and changed by the user for more robust security. I seek to compare this to similar paid services identifying which is more secure for the user.
Pitch #3
Many have been contacted by a scam caller and while most have the common sense to recognize the scam being played, thousands of Americans fall victim to such scams and end up paying a huge price for there mistake. While many assume these numbers mainly stem from the elderly, research has shown that people likey to fall for scams are broad in age group with the elderly being scammed for more money and the youth being scammed more frequently. To address this issue I seek to create a real time speech and text recognition answering bot that is capable of answering on phone calls from unknown numbers and through certain verbal ques will be able to deduce weather or not the person on the other end is scammer or not. With this bot I will be able to gather data on the most common types of scams and improve upon existing scam blocker software.