Eternal File System

The Eternal File System combines several existing techniques. By integrating four of them (distributed hash tables, BitTorrent, the Git versioning protocol, and the SFS self-certifying file system), Protocol Labs created a peer-to-peer hypermedia protocol intended to build a faster, safer, and more open next-generation Internet. The goal is a global file storage system that stays permanently available on the Internet and in which data can be kept indefinitely.

  • Distributed Hash Table (DHT) is a distributed storage and addressing technology. The key idea is to maintain one large hash table that indexes files across the whole network, with entries of the form <Key,Value>. The Key is usually a hash derived from the file (it can also be the file name or a description of the file content), while the Value is the IP address of the node that stores the file. To look up a file, a client only needs to supply the Key to obtain the address of the storage node from the table.
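The <Key,Value> lookup described above can be sketched in a few lines of Python. This is a toy illustration, not the DHT used by any real network: the class name ToyDHT, the use of SHA-256, and the rule that the entry lives on the node whose hashed address is closest to the key by XOR distance (a Kademlia-style convention) are all assumptions made for the example.

```python
import hashlib


def key_for(content: bytes) -> str:
    """Derive the table Key from file content (SHA-256 is an assumption)."""
    return hashlib.sha256(content).hexdigest()


class ToyDHT:
    """Hypothetical in-memory DHT: each node holds one slice of the table."""

    def __init__(self, node_addresses):
        # One local <Key,Value> table per participating node.
        self.nodes = {addr: {} for addr in node_addresses}

    def _responsible_node(self, key: str) -> str:
        # Assumption: the entry is kept by the node whose hashed address
        # is closest to the key under XOR distance.
        k = int(key, 16)

        def distance(addr: str) -> int:
            node_id = int(hashlib.sha256(addr.encode()).hexdigest(), 16)
            return node_id ^ k

        return min(self.nodes, key=distance)

    def put(self, key: str, storage_address: str) -> None:
        """Record which address stores the file identified by key."""
        self.nodes[self._responsible_node(key)][key] = storage_address

    def get(self, key: str):
        """Return the storage address for key, or None if unknown."""
        return self.nodes[self._responsible_node(key)].get(key)


dht = ToyDHT(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
file_key = key_for(b"hello world")
dht.put(file_key, "10.0.0.9:4001")
print(dht.get(file_key))
```

Only the Key is needed at query time: both put and get derive the responsible node from the Key itself, so no central index is consulted.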

  • BitTorrent (BT) is a content distribution protocol. It uses peer-to-peer technology to help users share large files more efficiently and to reduce the load on centralized servers. In a BitTorrent network, each user uploads and downloads data at the same time. The owner of the file sends it to one or more users, who in turn forward it to other users, and users keep exchanging the portions of the file they hold until every download is complete. Because downloaders are also uploaders, the bandwidth cost is spread across all participants instead of falling on a single download server, which greatly increases the average download speed.

  • Ethanim extends BitTorrent by supporting requests for data blocks across files (BT itself can only transfer data on a per-file basis). It also adds a credit and billing system that incentivizes nodes to share. The result is Bitswap, a data block exchange protocol.

  • Each node calculates a debt ratio for every peer, based on the data it has sent to and received from that peer:

r=bytesSent/(bytesRecv+1)

where r is the debt ratio, bytesSent is the number of bytes sent to the debtor node, and bytesRecv is the number of bytes received from the debtor node. The +1 in the denominator avoids division by zero when nothing has been received yet. The more data has been sent to the debtor node and the less has been received from it, the higher the debt ratio.

  • The debt ratio is then used to calculate the probability that data will be sent to the debtor node:

P=1−1/(1+exp(6−3r))
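The two formulas above can be evaluated directly. The following is a minimal Python sketch; the function names debt_ratio and send_probability are chosen for this example and are not taken from any Bitswap implementation.

```python
import math


def debt_ratio(bytes_sent: int, bytes_recv: int) -> float:
    """r = bytesSent / (bytesRecv + 1), as given in the text."""
    return bytes_sent / (bytes_recv + 1)


def send_probability(r: float) -> float:
    """P = 1 - 1 / (1 + exp(6 - 3r)), as given in the text."""
    return 1 - 1 / (1 + math.exp(6 - 3 * r))


# Print how the probability falls as the debt ratio grows.
for r in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(f"r = {r:.1f}  ->  P = {send_probability(r):.4f}")
```

The curve is a sigmoid centered at r = 2, where P is exactly 0.5. Below that point the other node is served almost always, and above it the probability drops off quickly, so a node that takes much more than it gives is soon ignored.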

  • Git versioning protocol: Ethanim uses Git to solve the problem of data distribution and versioning. Transferring or modifying large files often runs into storage or transfer pressure, and Git is well suited to iterating on versions. Git stores files in parts, calculates the hash of each part, and uses these hashes to build a directed acyclic graph (DAG) of the file. The root node of the DAG is the hash of the file. The benefits of this strategy are clear: to modify a file, only a few nodes in the graph need to change; sharing a file is equivalent to sharing its graph; and to transfer all the files, the parts can be downloaded and merged according to the hash values in the graph.
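The hash-linked structure described above can be illustrated with a two-level graph: leaf nodes are hashes of file parts, and the root is a hash over the list of leaf hashes. This is a simplified sketch, not Git's actual object format; the function name build_dag, the use of SHA-256, and the 4-byte chunk size are assumptions made to keep the example small.

```python
import hashlib


def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_dag(content: bytes, chunk_size: int = 4):
    """Split content into parts and link them under one root hash.

    Returns (root, leaves, blocks): the root hash, the ordered list of
    leaf hashes, and a hash -> part mapping acting as the block store.
    """
    parts = [content[i:i + chunk_size] for i in range(0, len(content), chunk_size)]
    leaves = [sha(p) for p in parts]
    blocks = dict(zip(leaves, parts))
    # The root depends on every leaf hash, so it identifies the whole file.
    root = sha("".join(leaves).encode())
    return root, leaves, blocks


def reassemble(leaves, blocks) -> bytes:
    """Merge downloaded parts back into the file, in leaf order."""
    return b"".join(blocks[h] for h in leaves)


root_v1, leaves_v1, blocks_v1 = build_dag(b"aaaabbbbcccc")
root_v2, leaves_v2, blocks_v2 = build_dag(b"aaaaBBBBcccc")

# Count the parts whose hash differs between the two versions.
changed = [i for i, (a, b) in enumerate(zip(leaves_v1, leaves_v2)) if a != b]
print("changed parts:", changed)
print("root changed:", root_v1 != root_v2)
```

Editing the middle of the file changes one leaf and the root; the other parts keep their hashes and do not need to be stored or transferred again.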

  • SFS (Self-Certifying File System) was proposed as a design for file systems shared across the Internet. Its goal is to place all SFS systems worldwide under a single namespace. In SFS, sharing a file becomes very simple: only the file name needs to be provided.

  • Ethanim uses the SFS self-certifying file system to make file sharing convenient and to provide trusted authentication.

  • The usual approach to trusted authentication is for every server to generate a public/private key pair. When a client sends a file to a server, the client first encrypts it with the server's public key, and the server decrypts it with its private key on receipt. One problem remains: how do all clients obtain the server's public key?

  • SFS takes a different approach: it embeds the public key information in the file name, which is then called a "self-certifying file name". This makes it unnecessary to implement key management inside the file system.
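The idea of a self-certifying file name can be sketched as follows. The example is loosely modeled on the /sfs/Location:HostID path layout, but it is a simplification under stated assumptions: the helper names are hypothetical, the HostID is taken as a plain SHA-1 hex digest of the location and key, and the key is a placeholder byte string instead of a real public key.

```python
import hashlib
import hmac


def host_id(location: str, public_key: bytes) -> str:
    """Fingerprint binding a server location to its public key (assumed form)."""
    return hashlib.sha1(location.encode() + public_key).hexdigest()


def self_certifying_path(location: str, public_key: bytes, rest: str) -> str:
    """Build a path that carries the key fingerprint inside the name."""
    return f"/sfs/{location}:{host_id(location, public_key)}/{rest}"


def verify_server(path: str, presented_key: bytes) -> bool:
    """Check that the key a server presents matches the name in the path."""
    host_part = path.split("/")[2]            # "location:hostid"
    location, expected = host_part.rsplit(":", 1)
    actual = host_id(location, presented_key)
    return hmac.compare_digest(expected, actual)


path = self_certifying_path("files.example.org", b"server-public-key", "docs/readme.txt")
print(path)
print(verify_server(path, b"server-public-key"))   # genuine server
print(verify_server(path, b"attacker-key"))        # impostor
```

Because the name itself says which key to expect, a client that has the path can authenticate the server without consulting a separate key directory.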
