Phi Nguyen – Senior Capstone

Abstract

Modern internet architecture faces the challenge of centralized services run by big tech companies, which capitalize on users’ information. Most well-known chat services today depend on a third-party server that stores the users’ conversations. We also have to face the challenges of regulation and government authorization. To address these problems, we propose a peer-to-peer architecture for video chat that keeps the conversation private to the people involved in it.

Paper: https://drive.google.com/file/d/1RLIgvjrtHJrrZsttzxtxxppFCcdex85x/view?usp=sharing

Demo: https://www.youtube.com/watch?v=9AcGwwbwukY

Phi – Three Pitches

First Idea: Peer-to-peer chat webapp

This app allows two (or more) people to connect with each other instantly and directly and to exchange messages without any third-party authorization. It operates entirely on a peer-to-peer network established among the users. This addresses the privacy and tracking concerns of users who want to chat privately without worrying about their data being watched or sniffed. The mechanism is fairly simple: a user creates a meeting, and anyone with the URL can join it without the hassle of logging in or being authorized.

Second Idea: Onion routing through an overlay network

This idea uses existing IP addresses as identifiers to build a peer-to-peer overlay network. Instead of sending bare requests to the server, I want to wrap each request in layers of protection so that the data stays safe and cannot be sniffed. I want to build client-side software that handles data encapsulation and identification in order to join the network.
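The layered wrapping behind onion routing can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the relay names, the keys, the "DEST" marker, and the `wrap`/`peel` helpers are all hypothetical, and the XOR keystream is a toy stand-in for a real authenticated cipher, so it is not secure.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher (NOT secure): XOR data with a SHA-256 counter keystream.

    Applying it twice with the same key restores the original bytes.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message: bytes, route: list[tuple[str, bytes]]) -> tuple[str, bytes]:
    """Sender side: build the onion from the inside out.

    Each layer holds the name of the next hop plus the still-encrypted inner
    packet, so a relay only learns where to forward, not the final content.
    """
    packet = message
    next_hop = "DEST"  # hypothetical marker for the final recipient
    for name, key in reversed(route):
        packet = keystream_xor(key, next_hop.encode() + b"|" + packet)
        next_hop = name
    return next_hop, packet  # name of the first relay and the full onion


def peel(packet: bytes, key: bytes) -> tuple[str, bytes]:
    """Relay side: remove exactly one layer with this relay's own key."""
    plain = keystream_xor(key, packet)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop.decode(), inner


if __name__ == "__main__":
    # Hypothetical three-relay route with made-up keys.
    route = [("relay-a", b"key-a"), ("relay-b", b"key-b"), ("relay-c", b"key-c")]
    hop, packet = wrap(b"hello", route)
    for _, key in route:
        hop, packet = peel(packet, key)
        print("forward to:", hop)
    print("delivered:", packet)
```

A real implementation would also need key exchange with each relay and integrity protection on every layer, which this sketch leaves out.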

Third Idea: Stocks movement based on Social media

This is an AWS Lambda serverless service that makes requests to Twitter to track how often a stock name is mentioned and how that frequency correlates with the stock’s movement in the market. The technology could be a couple of Python scripts that make the requests and analyze the data, possibly graphing it or comparing it with the closing price of a given stock. This does not attempt to predict a future price; it simply checks whether there is a correlation between a price movement and the frequency with which the stock’s name is mentioned on social media.
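The comparison step described above could look roughly like the following Python sketch. It assumes the daily mention counts and closing prices have already been collected by some other script; the numbers below are made up for illustration, and the helper names `daily_changes` and `pearson` are hypothetical.

```python
from math import sqrt


def daily_changes(closes):
    """Turn a list of closing prices into day-over-day fractional changes."""
    return [(today - prev) / prev for prev, today in zip(closes, closes[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up sample data: mentions per day and closing prices
    # (one extra price so that each day has a change value).
    mentions = [120, 340, 90, 410, 200]
    closes = [100.0, 101.0, 104.0, 103.0, 107.0, 108.0]
    moves = [abs(change) for change in daily_changes(closes)]
    print("correlation:", pearson(mentions, moves))
```

Using the absolute size of each move is one possible choice; correlating against the signed change instead would ask whether mentions go with rises specifically.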

UPDATE: After some talks with Charlie, the first two ideas can be combined into one. They would require more work and be harder to iterate on, but they are a great way to learn the fundamentals of networks. The last one would be a great way to learn about AWS and its services, which is valuable in its own right.

Fourth Idea

During my internship this summer, I had many good experiences coding and working with web servers. But I had one bad experience: I had to scrape information from the web for a week. It was a necessary job, since we had to gather data for research purposes, but it was tedious, to say the least. At the time I just wished there were better web-scraping technology out there to help me deal with that nightmare.

I was able to write some software using a library, but that did not make my life any less miserable. So I am thinking that I might want to make a web scraper that is more general yet deals with information at a higher level. It could handle almost any kind of information and any type of page, whether the content is static or loaded via AJAX. And it could gather the common types of information that people search for: names, phone numbers, emails, statistics, etc.
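As a rough illustration of gathering common fields from a page, here is a minimal Python sketch that pulls email addresses and phone numbers out of static HTML using only the standard library. The patterns are assumptions (the phone pattern only covers US-style numbers), the names `TextExtractor` and `extract_contacts` are hypothetical, and content loaded via AJAX would need a browser-driven approach that this sketch does not cover.

```python
import re
from html.parser import HTMLParser

# Assumed, deliberately simple patterns; real-world data needs broader ones.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def extract_contacts(html: str) -> dict:
    """Return the unique emails and phone numbers found in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return {
        "emails": sorted(set(EMAIL_RE.findall(text))),
        "phones": sorted(set(PHONE_RE.findall(text))),
    }


if __name__ == "__main__":
    page = "<p>Contact: jane@example.com or (765) 555-0100</p>"
    print(extract_contacts(page))
```

Working on extracted text rather than raw markup keeps addresses that only appear inside scripts or styles out of the results.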

Week 10 Updates

In the past week, I have:

  • Finished a simple Hangman game in Elixir, which I coded up over the weekend. It can be found here: https://github.com/hungphi98/Hangman_Elixir. I learned a lot while making this tiny game: decoupled design; BEAM processes and concurrency; a distributed client side; and a couple of Elixir-native concepts: Supervisor, Agent, and Application. I think these lessons will be extremely valuable once I start implementing my idea.
  • Started reading three books at the same time: two by Tanenbaum (Distributed Systems and Computer Networks) and the Handbook of Peer-to-Peer Networking. I find it much more useful to learn from books than from papers. They give me a foundation of knowledge, and I don’t feel starved the way I do when reading papers. Some of the things I have picked up so far:
    • Scaling techniques
    • Basics of network security
    • Requirements and implications of a P2P network
  • Started looking seriously at how blockchains are implemented. I have learned the basic protocol and have a rough idea of how it works. I have coded a basic blockchain using Flask but am too shy to upload it to GitHub since the code is extremely messy. I might clean it up, but probably not.
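The hash-linking idea at the core of a basic blockchain can be sketched as follows. This is not the Flask code mentioned above, which is not published; it is a minimal Python illustration with hypothetical helper names, and it leaves out proof of work, networking, and consensus entirely.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash the block's contents (everything except its own stored hash)."""
    payload = json.dumps(
        {key: block[key] for key in ("index", "data", "prev_hash")},
        sort_keys=True,
    ).encode()
    return hashlib.sha256(payload).hexdigest()


def make_block(index: int, data: str, prev_hash: str) -> dict:
    """Create a block that records the hash of the block before it."""
    block = {"index": index, "data": data, "prev_hash": prev_hash}
    block["hash"] = block_hash(block)
    return block


def append_block(chain: list, data: str) -> None:
    """Add a new block that points at the current tip of the chain."""
    tip = chain[-1]
    chain.append(make_block(tip["index"] + 1, data, tip["hash"]))


def is_valid(chain: list) -> bool:
    """Check every block's own hash and its link to the previous block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


if __name__ == "__main__":
    chain = [make_block(0, "genesis", "0" * 64)]
    append_block(chain, "alice pays bob 5")
    append_block(chain, "bob pays carol 2")
    print("valid before tampering:", is_valid(chain))
    chain[1]["data"] = "alice pays bob 500"
    print("valid after tampering:", is_valid(chain))
```

Because each block stores the hash of its predecessor, changing any earlier block breaks the check unless every later block is recomputed as well.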

CS388 – Week 9 Update

This week I have:

  • Learned about P2P and the different protocols used to implement it
  • Learned about the history of P2P and the different architectures for implementing a fully functional P2P file-sharing application. It is quite colorful, actually.
  • Made a rough draft of the outline proposal
  • Learned about design patterns in Elixir. I am still very, very new to the language. The backend architecture of Elixir is actually quite different from all the web server backends I have seen.
    • The code is really, really tight: there should be no room for redundancy, ever. Good Elixir code should be decoupled as much as possible so that it can scale later.
    • It is perfect for real-time processing and concurrency handling, which makes it an absolutely perfect choice for this project. Fun fact: Elixir runs on the Erlang VM, and Erlang is what helped WhatsApp become the WhatsApp we know today. At times WhatsApp gained 1 million new users a day, and the scaling power of Erlang enabled instantaneous communication with little to no node failure. AT&T reportedly also used Erlang as a backbone for its telecommunication systems. In Erlang, your application keeps running while it is being updated! (That’s why you can make phone calls while AT&T updates its software in the background. Mind-blowing!)
    • Supervision trees automatically restart failing processes, which helps guarantee availability.
    • Running on the Erlang VM means that Elixir code is compiled into bytecode that runs on the BEAM VM. Erlang processes are implemented entirely by the VM and have no connection to OS processes or OS threads. So even if you are running an Erlang system with over ten million processes, it is still only one OS process with one scheduler thread per core, and those processes stay completely isolated from your actual OS. Amazing!
    • There are more wonderful things about Elixir, but I will stop ranting here.
  • I am trying to scale my current implementation of a baby distributed hash table (still at a very early and primitive stage) to more nodes, but there are some unexpected bugs that I have to study further. The distributed hash table will become a crucial part of querying in my application later, along with the Merkle tree.
  • I am studying the architecture of Napster, the killer P2P application that appeared around 2000 and paved the way for P2P. However, I would like to improve on it, or rather experiment with a slightly modified architecture: eliminating the delegator/gateway node so that my system is completely decentralized and has no single point of failure. I am not sure whether this is possible. I might not have time to implement it in the near future, but it might be good to keep in mind.
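To make the distributed hash table idea from the list above concrete, here is a minimal single-process Python sketch of how a key can be mapped to the node responsible for it using consistent hashing. It is not the Elixir implementation described in this post; the node names and the `HashRing` and `ToyDHT` classes are hypothetical, and real DHTs add routing between nodes, replication, and handling of nodes joining or leaving.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a node name or a key to a position on the hash ring."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class HashRing:
    """Assign each key to the first node at or after its ring position."""

    def __init__(self, node_names):
        self._ring = sorted((ring_position(name), name) for name in node_names)
        if not self._ring:
            raise ValueError("at least one node is required")
        self._positions = [position for position, _ in self._ring]

    def node_for(self, key: str) -> str:
        index = bisect.bisect_left(self._positions, ring_position(key))
        # Wrap around to the first node when the key hashes past the last one.
        return self._ring[index % len(self._ring)][1]


class ToyDHT:
    """In-memory stand-in: each 'node' is just a dict in the same process."""

    def __init__(self, node_names):
        self.ring = HashRing(node_names)
        self.stores = {name: {} for name in node_names}

    def put(self, key: str, value) -> None:
        self.stores[self.ring.node_for(key)][key] = value

    def get(self, key: str):
        return self.stores[self.ring.node_for(key)].get(key)


if __name__ == "__main__":
    dht = ToyDHT(["node-a", "node-b", "node-c"])
    dht.put("song.mp3", "held by peer 10.0.0.5")
    print(dht.ring.node_for("song.mp3"), "->", dht.get("song.mp3"))
```

Since every peer can compute `node_for` on its own, lookups do not need a central index server, which is the property that matters for removing a single point of failure.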

Next week:

  • Continue studying Elixir. The more I learn about it, the more I fall in love with this language. It is so elegant, yet it does so much.
  • Continue researching applications that have P2P architectures. I have already seen some “grid architecture”, which is slightly different from P2P, but I have not taken a deep dive into it yet, so it is one of the things to look at. I have also looked at some apps other than Napster, and I wonder why most of them were implemented for Windows.
  • Think over the architecture I currently have in mind and its purpose. Most P2P apps have faced legal challenges regarding copyright. I guess I have to repurpose my app so that no such issue arises if I decide to go hardcore and actually deploy it.