diff --git a/obip-0009.md b/obip-0009.md new file mode 100644 index 0000000..d6e85bf --- /dev/null +++ b/obip-0009.md @@ -0,0 +1,100 @@ +
+OBIP: 9 +Title: Data sharing via Pubsub +Author: Chris Pacia+ +## Abstract +This obip defines a new data sharing protocol using IPFS pubsub to ensure persistence of content when users are offline. + +## Motivation +In basic IPFS content only persists when users go offline if other users have viewed/downloaded the content and those users remain online. Thus IPFS provides no hard garanutees that OpenBazaar stores will remain available when a user goes offline. Currently the protocol tries to remedy this problem by defining a list of "data peers" in the config file and +by pushing content to them on each publish to ensure persistence. The downside to this approach is the sharing of data is limited only to those peers found in the config file. While users remain free to change the default list of data peers, in practice very few do which elevates the default peers to the status of critcal infrastructure. + +By creating a pubsub channel for user content anyone on the network is free to subscribe to the channel and receive updates to all new content published on the network. They are then free to sort through that content and decide what they want to re-seed. + +## Specification + +#### Protocol IDs +| Network | ID | +| ------------- |:-------------:| +| Mainnet | /openbazaar/floodsub/1.0.0 | +| Testnet | /openbazaar/floodsub/testnet/1.0.0 | + +#### Channel ID + +The pubsub channel ID is `zb2rhezUFMJAfLymEQwukCy5f57hmMUpKDzLu58kaEwtZV5cB` which is the CIDv1 (type `Raw`) of the sha256 multihash of []byte("floodsub:usercontent"). + +### Message Format +```protobuf +syntax = "proto2"; + +message RPC { + repeated SubOpts subscriptions = 1; + repeated Message publish = 2; + + message SubOpts { + optional bool subscribe = 1; // subscribe or unsubcribe + optional string topicid = 2; + } +} + +message Message { + optional bytes from = 1; + optional bytes data = 2; + optional bytes seqno = 3; + repeated string topicIDs = 4; +} +``` + +#### Subscribe + +To `subscribe` to the content sharing channel a node MUST: + +1) Make a `GET_PROVIDERS` query in the DHT using the channel ID as the key. +2) Pick at least eight of the returned providers and open a connection to them. +3) Both peers should exchange`hello` packets upon connection. +4) Set self as a provider for the channel ID in the DHT using the `ADD_PROVIDERS` command. + +The `hello` packet contains only the list of topic subscriptions. Currently the only topic for this channel is `rawblocks`. + +#### Publish + +1) Prior to updating its root directory the node should fetch a list of CIDs in its graph. +2) After updating the root directory the node should calculate the diff between old graph and new graph. +3) Make a `GET_PROVIDERS` query in the DHT using the channel ID as the key. +4) Pick at least eight of the returned providers and open a connection to them. Check the `hello` packet for each to make sure they are subscribed to the `rawblocks` topic. +5) For each CID in the diff publish a `Message` to each peer using the corresponding raw block data as `Message.data` with a topic of `rawblocks`. +6) Disconnect from each peer. + +#### Message Relay + +Upon receving new `RPC` message the node should relay it to all other peers who are subscribed to the `rawblocks` topic. + +#### Unsubscribing + +Since it's not possible to remove yourself as a subscriber from the DHT prior to provider expiration, nodes should respond to any incoming `hello` packets with a `hello` packet with the `rawblocks` subscription set to false. + +## CID or Raw Block? + +This spec calls for publishers to share the full raw block data in the channel. A possibly more efficient alternative would be to only share the CIDs for each block and let the subscribing peers decide whether to download the full block via IPFS. However, if it turns out that most subscribers would prefer to download each block then this strategy could end up being less effient as it requires another round trip and DHT crawl per CID. + +Additionally, publishers may be behind restricted NATs which may make establishing a connection with them difficult and prevent subscribers from acquring the data. By pushing the raw blocks directly we can get around NAT traversal issues. + +## Future Extensions + +We may wish to create a new topic which serializes the published data differently. For example we might want to specify the data type or originating peer such that subscribers can better determine if they wish to re-seed the block. + +## References + +https://github.com/ipld/cid + +https://github.com/libp2p/go-floodsub + +https://github.com/multiformats/go-multihash ++Discussions-To: Github Issues +Status: Draft +Type: Standards Track +Created: 02/28/2018 +Copyright: MIT +