Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
100 changes: 100 additions & 0 deletions obip-0009.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
<pre>
OBIP: 9
Title: Data sharing via Pubsub
Author: Chris Pacia <chris@ob1.io>
Discussions-To: Github Issues
Status: Draft
Type: Standards Track
Created: 02/28/2018
Copyright: MIT
</pre>

## Abstract
This obip defines a new data sharing protocol using IPFS pubsub to ensure persistence of content when users are offline.

## Motivation
In basic IPFS content only persists when users go offline if other users have viewed/downloaded the content and those users remain online. Thus IPFS provides no hard garanutees that OpenBazaar stores will remain available when a user goes offline. Currently the protocol tries to remedy this problem by defining a list of "data peers" in the config file and
by pushing content to them on each publish to ensure persistence. The downside to this approach is the sharing of data is limited only to those peers found in the config file. While users remain free to change the default list of data peers, in practice very few do which elevates the default peers to the status of critcal infrastructure.

By creating a pubsub channel for user content anyone on the network is free to subscribe to the channel and receive updates to all new content published on the network. They are then free to sort through that content and decide what they want to re-seed.

## Specification

#### Protocol IDs
| Network | ID |
| ------------- |:-------------:|
| Mainnet | /openbazaar/floodsub/1.0.0 |
| Testnet | /openbazaar/floodsub/testnet/1.0.0 |

#### Channel ID

The pubsub channel ID is `zb2rhezUFMJAfLymEQwukCy5f57hmMUpKDzLu58kaEwtZV5cB` which is the CIDv1 (type `Raw`) of the sha256 multihash of []byte("floodsub:usercontent").

### Message Format
```protobuf
syntax = "proto2";

message RPC {
repeated SubOpts subscriptions = 1;
repeated Message publish = 2;

message SubOpts {
optional bool subscribe = 1; // subscribe or unsubcribe
optional string topicid = 2;
}
}

message Message {
optional bytes from = 1;
optional bytes data = 2;
optional bytes seqno = 3;
repeated string topicIDs = 4;
}
```

#### Subscribe

To `subscribe` to the content sharing channel a node MUST:

1) Make a `GET_PROVIDERS` query in the DHT using the channel ID as the key.
2) Pick at least eight of the returned providers and open a connection to them.
3) Both peers should exchange`hello` packets upon connection.
4) Set self as a provider for the channel ID in the DHT using the `ADD_PROVIDERS` command.

The `hello` packet contains only the list of topic subscriptions. Currently the only topic for this channel is `rawblocks`.

#### Publish

1) Prior to updating its root directory the node should fetch a list of CIDs in its graph.
2) After updating the root directory the node should calculate the diff between old graph and new graph.
3) Make a `GET_PROVIDERS` query in the DHT using the channel ID as the key.
4) Pick at least eight of the returned providers and open a connection to them. Check the `hello` packet for each to make sure they are subscribed to the `rawblocks` topic.
5) For each CID in the diff publish a `Message` to each peer using the corresponding raw block data as `Message.data` with a topic of `rawblocks`.
6) Disconnect from each peer.

#### Message Relay

Upon receving new `RPC` message the node should relay it to all other peers who are subscribed to the `rawblocks` topic.

#### Unsubscribing

Since it's not possible to remove yourself as a subscriber from the DHT prior to provider expiration, nodes should respond to any incoming `hello` packets with a `hello` packet with the `rawblocks` subscription set to false.

## CID or Raw Block?

This spec calls for publishers to share the full raw block data in the channel. A possibly more efficient alternative would be to only share the CIDs for each block and let the subscribing peers decide whether to download the full block via IPFS. However, if it turns out that most subscribers would prefer to download each block then this strategy could end up being less effient as it requires another round trip and DHT crawl per CID.

Additionally, publishers may be behind restricted NATs which may make establishing a connection with them difficult and prevent subscribers from acquring the data. By pushing the raw blocks directly we can get around NAT traversal issues.

## Future Extensions

We may wish to create a new topic which serializes the published data differently. For example we might want to specify the data type or originating peer such that subscribers can better determine if they wish to re-seed the block.

## References

https://github.com/ipld/cid

https://github.com/libp2p/go-floodsub

https://github.com/multiformats/go-multihash