
DISTRIBUTED SOCIAL NETWORK. WEB-SCALE SOCIAL PEER-TO-PEER PROTOCOL | by Solcial | Jan, 2022

  • Solcial is being built to provide users with an experience as close as possible to Web2 social media platforms, while taking advantage of Web3 technologies.
  • One obstacle that any blockchain-based platform faces is user fees. To eliminate user fees for non-token-related operations while maintaining privacy and consensus, Solcial is building a peer-to-peer (P2P) layer, anchored in Solana and built on top of libp2p, that delivers content (feeds, likes, etc.) through self-hosting and peer replication.
  • We have also developed a special type of node, called a Beacon Node, that speaks the same external protocol as a typical desktop or mobile app but adds optimizations designed to handle tens of thousands of concurrent peers. Beacons are essentially a set of pinning services that preserve users’ content when creators go offline and make it available to other users. All user content is stored on IPFS and accessed through the P2P layer without relying on censorable servers or gateways.

This is the first post in a series of notes describing how the inner workings of the Solcial protocol give it unique censorship resistance properties while still maintaining reasonable data exchange performance.

This article explains how the peer-to-peer network works. Although it is intended for a more technical audience, we hope that less technical Solcial followers will still find value in the technology behind the protocol.

Nodes communicate over the public internet, forming an overlay network specific to Solcial using a modified version of the HyparView¹ protocol.

Each node maintains a list of active peers of size Log2(N)+1, where N is the estimated size of the entire network. The value of N is read from the blockchain by counting the user accounts signed up to date, and it is updated periodically as the number of users grows. Connections to active peers are stable, bidirectional and persistent; these are the nodes a peer communicates with directly during gossip or any other p2p activity. Each node also maintains a list of passive peers of size 6*(Log2(N) + 1), used as backups in case an active peer disconnects or becomes unresponsive. The underlying protocol for connections between active peers is QUIC² with NOISE encryption, all powered by libp2p³.
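To make the view sizes concrete, here is a minimal Python sketch of the two formulas above. Rounding Log2(N) down to an integer is our assumption for the example (the formula itself does not fix a rounding rule), and the function name `view_sizes` is purely illustrative.

```python
def view_sizes(network_size: int, passive_factor: int = 6) -> tuple[int, int]:
    """Return (active, passive) view sizes for an estimated network size N."""
    if network_size < 1:
        raise ValueError("network_size must be at least 1")
    # For N >= 1, floor(log2(N)) + 1 equals the bit length of N.
    # Rounding down is an assumption made for this sketch.
    active = network_size.bit_length()
    passive = passive_factor * active
    return active, passive


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        active, passive = view_sizes(n)
        print(f"N={n}: active={active}, passive={passive}")
```

Under this rounding, even a network of a million accounts needs only 20 persistent connections per node, with 120 backup peers in the passive view.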

NAT traversal is achieved using one of three methods, chosen according to the network circumstances of the peer. The first method is to use IPv6 instead of IPv4 whenever possible, because in that case there is no need for NAT. However, according to statistics, under 40% of the internet is on IPv6.

The fallback mechanism is AutoNAT, which uses the identify protocol of libp2p to ask other nodes in the network which address they see for our machine. Because the connection was initiated by the peer behind the NAT, that address is already mapped by its router to the target machine. There are circumstances in which this does not work; in that case we fall back to the third option, which is the most reliable but also the most expensive.

The third option is to use the libp2p circuit relay protocol, which works much like TURN: an intermediate node relays packets between two peers behind NATs. This role can be assumed by beacon nodes and by peers with public IPs that opt into being relays with configured limits. More on beacon nodes later in this post.
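The three-step fallback can be summarized as a simple decision cascade. The sketch below is only an illustration of the ordering described above; the `NetworkProbe` fields and the strategy names are hypothetical and do not correspond to any libp2p API.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    DIRECT_IPV6 = "direct-ipv6"            # no NAT involved
    OBSERVED_ADDRESS = "observed-address"  # address reported back by other peers
    CIRCUIT_RELAY = "circuit-relay"        # packets relayed by a third node


@dataclass
class NetworkProbe:
    # Hypothetical probe results, gathered by the client before dialing.
    has_public_ipv6: bool
    observed_address_dialable: bool


def choose_strategy(probe: NetworkProbe) -> Strategy:
    """Pick the cheapest option that works, in the order given in the text."""
    if probe.has_public_ipv6:
        return Strategy.DIRECT_IPV6
    if probe.observed_address_dialable:
        return Strategy.OBSERVED_ADDRESS
    return Strategy.CIRCUIT_RELAY
```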

Peer Identity

Peers are identified by their public key in the form of a multihash, which is equal to the public key of their Solana wallet address registered with the Solcial on-chain contract. The only way to assume a given public peer id is to possess the secret key of the corresponding ED25519 keypair.

This keypair is also used to establish encrypted channels for all message passing between peers, and to verify the authenticity of messages that claim to be produced by the owner of the wallet.

Overlay membership

Peers whose public keys are not registered on chain are not allowed to join the HyparView network overlay and are therefore automatically excluded from the Solcial peer-to-peer network.

Every peer wanting to join the Solcial p2p network sends a JOIN message to one of the bootstrap nodes (some of those bootstrap nodes are also beacon nodes) on the /solcial/public topic. The introduction of topics to HyparView is one of our modifications to the original paper.

Any node receiving a JOIN request from a peer first verifies that the multihash exists on chain and is associated with a Solcial account. It then sends a FORWARDJOIN message, with the random walk length set to 3, to one of its active peers. That peer in turn propagates the message to all of its active peers, and so on, increasing the current hop number by one each time until it reaches 3 hops. This is a mitigation for DDoS attacks against bootstrap nodes, as well as an effective decentralization technique to spread and randomize the connectivity between nodes as much as possible.

Any node that is within 3 hops of the initial JOIN receiver sends a NEIGHBOR message to the initial JOIN requestor, establishing an active persistent connection between them.
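The join flow can be sketched as a bounded breadth-first walk over the active views. This is a simplified model, not the protocol implementation: it floods from the contact node itself rather than first picking a single active peer, it counts the contact node (hop 0) among the NEIGHBOR senders, and the on-chain check is replaced by a lookup in a plain set. All function and parameter names are hypothetical.

```python
from collections import deque


def nodes_within_hops(active_view: dict, start: str, max_hops: int = 3) -> dict:
    """Breadth-first walk returning every reached peer with its hop distance."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # the random walk length is exhausted on this branch
        for peer in sorted(active_view.get(node, ())):
            if peer not in hops:
                hops[peer] = hops[node] + 1
                queue.append(peer)
    return hops


def handle_join(active_view: dict, registered: set, contact: str,
                joiner: str, max_hops: int = 3) -> list:
    """Return the peers that would send NEIGHBOR to `joiner` ([] if rejected)."""
    if joiner not in registered:  # stand-in for the on-chain account lookup
        return []
    return sorted(nodes_within_hops(active_view, contact, max_hops))


if __name__ == "__main__":
    view = {"boot": {"a"}, "a": {"boot", "b"}, "b": {"a", "c"},
            "c": {"b", "d"}, "d": {"c"}}
    print(handle_join(view, {"newpeer"}, "boot", "newpeer"))
```

In the chain-shaped example, peer `d` sits four hops from the bootstrap node and therefore never learns about the joiner.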

Every 15 seconds, each node sends a SHUFFLE message to a random active peer. The message carries a random sample of the sender’s active and passive peers, and its random walk length is set to 4. This ensures that all peers in the network always have fresh information about other peers, which they store in their passive view for network repair purposes whenever one of their active peers becomes unresponsive or explicitly disconnects.
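A shuffle round could look roughly like the sketch below. The sample sizes (`k_active`, `k_passive`), the message layout and the random eviction policy are assumptions chosen for illustration; only the walk length of 4 comes from the description above.

```python
import random


def build_shuffle(active: set, passive: set, rng: random.Random,
                  k_active: int = 3, k_passive: int = 4, ttl: int = 4) -> dict:
    """Build a SHUFFLE payload from random samples of both views."""
    sample = rng.sample(sorted(active), min(k_active, len(active)))
    sample += rng.sample(sorted(passive), min(k_passive, len(passive)))
    return {"type": "SHUFFLE", "ttl": ttl, "peers": sample}


def merge_into_passive(self_id: str, active: set, passive: list,
                       received: list, capacity: int,
                       rng: random.Random) -> list:
    """Add newly learned peers to the passive view, evicting at random if full."""
    merged = list(passive)
    for peer in received:
        if peer == self_id or peer in active or peer in merged:
            continue  # never store ourselves, active peers or duplicates
        if len(merged) >= capacity:
            merged.pop(rng.randrange(len(merged)))
        merged.append(peer)
    return merged


if __name__ == "__main__":
    rng = random.Random(7)
    msg = build_shuffle({"a", "b", "c", "d"}, {"p1", "p2", "p3", "p4", "p5"}, rng)
    print(msg["ttl"], len(msg["peers"]))
```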

Efficient Gossiping

Only active peers participate in message dissemination. Active peers form an overlay network over the public internet, and there exist cycles in the connectivity graph between nodes. To gossip messages efficiently we need, first, to minimize duplicate message propagation between peers and, second, to find the best route for messages.

This is achieved with the Epidemic Broadcast Trees algorithm, which forms a minimum spanning tree between peers by splitting each node’s peers into an eager-push group and a lazy-push group. Eager-push peers are forwarded every message as soon as the node receives it.

Lazy-push peers receive, every 500ms, a batch of the message IDs seen by the current node. Whenever a node notices a message ID in a lazy-push batch that it has not received from its upstream parent node, it rebuilds the connection by sending a GRAFT message, which restores the connection’s eager-push status. This is the repair mechanism for the minimum spanning tree in case one of the nodes goes down and the graph becomes disjoint. GRAFT messages restore full network connectivity.

Whenever a message received from an active node turns out to be a duplicate of a previously received message, the connection to that node is considered a cycle in the graph and is demoted to lazy-push by sending a PRUNE message.

Another case for pruning an active connection and replacing it with one from the lazy list arises when a message announced through a lazy node has traveled fewer hops, by a margin of 4, than the same message delivered by the eager active peer. This rule makes up the broadcast tree optimization algorithm.
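The eager/lazy bookkeeping can be sketched as a small state machine. This is a simplified single-node model under several assumptions: messages are identified by plain strings, "sending" just appends to an `outbox` list, a GRAFT is issued immediately instead of after a timeout, and the hop-count optimization is left out. The class name and the `IDS` message label are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PlumtreeNode:
    eager: set = field(default_factory=set)        # peers sent full messages at once
    lazy: set = field(default_factory=set)         # peers sent only batched IDs
    seen: set = field(default_factory=set)         # IDs of messages already received
    pending_ids: list = field(default_factory=list)  # IDs awaiting the next batch
    outbox: list = field(default_factory=list)     # (peer, kind, ids) tuples "sent"

    def on_gossip(self, sender: str, msg_id: str) -> None:
        if msg_id in self.seen:
            # Duplicate: this link closes a cycle, so demote it and send PRUNE.
            self.eager.discard(sender)
            self.lazy.add(sender)
            self.outbox.append((sender, "PRUNE", (msg_id,)))
            return
        self.seen.add(msg_id)
        self.lazy.discard(sender)
        self.eager.add(sender)
        for peer in sorted(self.eager - {sender}):
            self.outbox.append((peer, "GOSSIP", (msg_id,)))
        self.pending_ids.append(msg_id)

    def flush_lazy(self) -> None:
        """Called by a timer (every 500ms in the text) to announce seen IDs."""
        if not self.pending_ids:
            return
        batch = tuple(self.pending_ids)
        self.pending_ids.clear()
        for peer in sorted(self.lazy):
            self.outbox.append((peer, "IDS", batch))

    def on_lazy_ids(self, sender: str, ids) -> None:
        missing = tuple(i for i in ids if i not in self.seen)
        if missing:
            # The tree is broken upstream: promote this link and send GRAFT.
            self.lazy.discard(sender)
            self.eager.add(sender)
            self.outbox.append((sender, "GRAFT", missing))
```

A node that starts with eager peers `a` and `b` and lazy peer `c` will demote `b` after receiving the same message from both `a` and `b`, and will promote `c` as soon as `c` announces an ID the node has never seen.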

Messages gossiped through the p2p protocol described in the previous sections are IPFS Bitswap messages with CIDs of content produced by Solcial users. To better understand how Bitswap and IPFS are used in Solcial, let’s first describe the structure of a user profile.

Each user account on the network has something called a profile index. This index points to CIDs of the latest versions of their content. The index is an Eventually Consistent Sequence CRDT that forms an immutable log of operations performed by the account.

The profile index can be thought of as a linked list in which each element points to the CID of the previous element and is accepted by peers as the next element in the sequence only if it is signed by the private key of the account owner. A high-level logical example of a feed post encoded in this format looks like this:

{
  "author": "12D3KooWSoeYKbpkb5UoL2T5eiomWRHdxR9cPC4tk11gKU89fFwT",
  "prev": "QmYtUc4iTCbbfVSDNKvtQqrfyezPPnFvE33wFmutw9PBBk",
  "action": "append-feed-post",
  "timestamp": "2022-01-11T15:58Z",
  "params": {
    "content": "QmV8cfu6n4NT5xRr2AHdKxFMTZEJrA44qgrBCr739BN9Wb",
    "enckey": "z2DhMLJmV8kNQm6zeWUrXQKtmzoh6YkKHSRxVSibscDQ7nq"
  },
  "signature": "2Lpnvt23H6qHswCNPmwCCUSas7YNP[…]jV1dC9qdNPR4zDqsCuBX"
}

An entry like this has its own CID and is broadcast as an IPFS PB-DAG¹⁰ object to the /solcial/content HyparView overlay topic. The entry is linked to its previous entry and to any additional content, so by recursively querying linked CIDs users can retrieve the entire account history and content.
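The linked-list structure of the profile index can be illustrated with a small hash-linked log. This sketch makes two loud simplifications: the identifier is a SHA-256 hex digest of canonical JSON, which is only a stand-in for a real IPFS CID, and entries are left unsigned. The in-memory `store` dictionary plays the role of the content-addressed storage, and all function names are hypothetical.

```python
import hashlib
import json


def entry_id(entry: dict) -> str:
    """Stand-in for a CID: SHA-256 over canonical JSON (not a real IPFS CID)."""
    blob = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()


def append_entry(store: dict, head, author: str, action: str, params: dict) -> str:
    """Append an entry that points at the previous head; return the new head."""
    entry = {"author": author, "prev": head, "action": action, "params": params}
    cid = entry_id(entry)
    store[cid] = entry
    return cid


def history(store: dict, head) -> list:
    """Follow `prev` links from the newest entry back, then return oldest first."""
    entries = []
    cursor = head
    while cursor is not None:
        entry = store[cursor]
        entries.append(entry)
        cursor = entry["prev"]
    entries.reverse()
    return entries


if __name__ == "__main__":
    store = {}
    head = append_entry(store, None, "alice", "append-feed-post", {"content": "c1"})
    head = append_entry(store, head, "alice", "append-feed-post", {"content": "c2"})
    print([e["params"]["content"] for e in history(store, head)])
```

Because each identifier covers the `prev` field, rewriting an old entry changes the identifier of every entry after it, which is what makes the log effectively immutable.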

Any receiving peer first validates this object by checking that the signature of the content matches the public key of the author. If signature verification fails, the sending peer is permanently banned by the current node for violating the protocol. If it succeeds, the content is propagated to all other peers interested in content authored by this account.
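The validate-then-forward rule can be sketched as a small gate in front of the gossip layer. The signature check is injected as a callable because real ED25519 verification needs a cryptography library; the toy verifier in the demo is a placeholder and not a cryptographic check. The class name, the `interested` map and the choice to stop forwarding to banned peers are assumptions made for this example.

```python
from typing import Callable


class ContentGate:
    def __init__(self, verify: Callable[[dict], bool], interested: dict):
        self.verify = verify          # placeholder for ED25519 verification
        self.interested = interested  # author -> set of peers following them
        self.banned = set()           # peers banned by this node

    def on_entry(self, sender: str, entry: dict) -> list:
        """Return the peers to forward `entry` to; empty if it is rejected."""
        if sender in self.banned:
            return []
        if not self.verify(entry):
            self.banned.add(sender)   # permanent ban for a protocol violation
            return []
        targets = self.interested.get(entry["author"], set())
        return sorted(targets - {sender} - self.banned)


if __name__ == "__main__":
    # Toy verifier: accepts entries whose signature field is the string "valid".
    gate = ContentGate(lambda e: e.get("signature") == "valid",
                       {"alice": {"bob", "carol", "mallory"}})
    print(gate.on_entry("bob", {"author": "alice", "signature": "valid"}))
    print(gate.on_entry("mallory", {"author": "alice", "signature": "forged"}))
```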

When such a CID is received by a peer, it is added to the log of operations of the author. The sum of all operations performed by the author constitutes the current state of the profile.

By default, each node pins and seeds its own account’s content and its friends’ content. As part of our infrastructure we are building special peers that we call beacon nodes. They behave like regular user peers, except that they are interested in everyone’s content and serve as a kind of Solcial-specific IPFS pinning service for all content in case all current seeders are offline. Think of them as nodes that are everyone’s friend. The network can function without them, but they provide an extra layer of content availability.

Read Access

Everyone on the Solcial p2p overlay can query, download and provide any user content by broadcasting IWANT and IHAVE Bitswap messages to the /solcial/content topic with the CID of the top-level root index of the account, and then recursively requesting all linked CIDs.

The initial sync of an account, or fetching an account’s most recent entries, is achieved by broadcasting a HEAD message to the /solcial/sync topic; peers that are seeders (including beacon nodes) respond with the CID of the latest content they are aware of. The translation between user handles and user public keys is done by querying the Solana blockchain.

The previous entry field can be used to resolve conflicting HEADs and decide on the most recent entry.
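One way to use the previous-entry field for this is to pick the candidate HEAD whose ancestor chain contains every other candidate. The sketch below assumes all referenced entries are already available locally in a `store` dictionary keyed by CID and returns `None` when the candidates sit on diverging branches; how such a fork should be handled is not covered here. Function names are hypothetical.

```python
def ancestors(store: dict, cid) -> list:
    """Return `cid` followed by all of its ancestors, newest first."""
    chain = []
    cursor = cid
    while cursor is not None:
        chain.append(cursor)
        cursor = store[cursor]["prev"]
    return chain


def latest_head(store: dict, candidates: list):
    """Pick the candidate that has every other candidate in its history."""
    unique = set(candidates)
    for cid in sorted(unique):
        if unique <= set(ancestors(store, cid)):
            return cid
    return None  # candidates are on diverging branches


if __name__ == "__main__":
    store = {"c1": {"prev": None}, "c2": {"prev": "c1"}, "c3": {"prev": "c2"}}
    print(latest_head(store, ["c2", "c3", "c1", "c3"]))
```

A seeder that answers with a stale HEAD simply reports an ancestor of the newest entry, so it never wins the comparison.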

Write Access

To be able to write to an account log the user must possess the private key of the ED25519 keypair that corresponds to their account ID. By having that key, the user is able to generate a valid signature of an entry that they are appending to the log that will not be rejected by other peers on the network.

Privileged Content Access

Some content posted by users is intended only for a select group of recipients, such as Tier-1 or Tier-2 subscription feeds. Those posts are first encrypted using a symmetric AES-256 encryption key. The hash of that key is attached to the content metadata in the original post.

The key is disseminated to a random subset of eligible peers over the NOISE encrypted channel established during the libp2p handshake. The post author may choose to allow beacon nodes to participate in this key exchange scheme as well, which makes the process near-instant for all users, but they don’t have to if they consider their content highly sensitive. When an eligible user wants to decipher the content, they send a GETKEY message to the /solcial/keyexchange overlay topic with the hash of the encryption key.

All online peers in possession of that key check whether the requesting peer is eligible to receive it by querying the blockchain and verifying that the user holds the tokens required for accessing the content. After a successful verification, the key is transmitted to the requesting peer over a direct QUIC/NOISE connection that bypasses the gossip protocol.
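From the point of view of a peer holding keys, handling a GETKEY request could look like the sketch below. SHA-256 as the key hash is an assumption (the hash function is not named above), the on-chain token check is replaced by an injected `is_eligible` callback, and returning the key from a function stands in for sending it over the direct encrypted connection. The class and method names are hypothetical.

```python
import hashlib
from typing import Callable, Optional


class KeyHolder:
    def __init__(self, is_eligible: Callable[[str, str], bool]):
        self.keys = {}                  # key hash (hex) -> raw key bytes
        self.is_eligible = is_eligible  # stand-in for the blockchain query

    def store(self, key: bytes) -> str:
        """Remember a content key and return the hash it is requested by."""
        digest = hashlib.sha256(key).hexdigest()  # hash choice is an assumption
        self.keys[digest] = key
        return digest

    def on_getkey(self, requester: str, key_hash: str) -> Optional[bytes]:
        """Return the key only if we hold it and the requester is eligible."""
        key = self.keys.get(key_hash)
        if key is None:
            return None
        if not self.is_eligible(requester, key_hash):
            return None
        return key


if __name__ == "__main__":
    holder = KeyHolder(lambda peer, key_hash: peer == "subscriber")
    key_hash = holder.store(bytes(range(32)))  # a 32-byte (256-bit) key
    print(holder.on_getkey("subscriber", key_hash) is not None)
    print(holder.on_getkey("stranger", key_hash) is None)
```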

Platform support

This peer-to-peer protocol for censorship-resistant content is available only for desktop and mobile clients, not over the web interface, because of browser limitations. Browsers are inherently designed as consumers of content hosted on servers, and their security model forbids accepting arbitrary incoming connections from other machines.

The underlying technical stack is implemented in Rust on top of libp2p, with bindings to React Native and Tauri¹¹.
