Specify the IPNI federation protocol #27
Specify the initial IPNI federation protocol which aims to achieve eventually consistent index records across a collaborating set of nodes. The federation protocol consists of four fundamental steps: Initialization, Periodic snapshot taking, Exchange of snapshots and Reconciliation. The protocol takes advantage of the immutability of advertisements exposed by each provider to resolve conflicts across indexers. The specification lists a set of APIs exposed by a participating indexer in order to enable the implementation of the federation protocol.
willscott
left a comment
The basic snapshot proposal here looks generally good with one comment
| The Inter-Planetary Network Indexer (IPNI) offers a routing system that enables mass advertisement of content addressed | ||
| data and lookup performance in order of milliseconds. | ||
| This is achieved by a design where a single IPNI instance strives to maintain full network state knowledge. | ||
| Many IPNI instances can be instantiated across the globe to offer fast local ingestion and lookup of advertised content. |
Another goal of federation may be to allocate portions of the index content to different indexers. I think this should be talked about - even if only to say why we do not do it. People will want to index some stuff for some time, but not everything forever, and will want to be part of a federation for some of the other features.
Having local fast lookup is a good goal, but does that apply to everything? It may not be necessary to be fast and local for all storage providers, but just the highly accessed ones. This means we may want to consider a strategy that allows some subset of service providers, and even epochs within those providers' chains, to be assigned to various indexers. Infrequently accessed storage provider content can be indexed by fewer indexers.
Such a strategy may also be useful so that indexers can choose which providers and epochs are the best to index and not have to index everything, even from specific provider(s).
Though useful, this feature seems sufficiently complicated to need its own specification?
Aside from that, I think what's documented in the current spec doesn't technically limit that behaviour. For example, there is nothing that would stop a group of collaborating indexers from agreeing to index only advertisements up to depth 1000 within the last week. The federation protocol described here would still work as long as all the participating nodes only include providers with such criteria.
WDYT @gammazero ?
I suppose that is the purpose of this spec - to make indexing consistent between indexers that cover identical index content. It means that if indexer-1 covers providers A and B, indexer-2 covers B and C, and indexer-3 covers C and A, these cannot be in the same federation even though they can check each other's consistency. Or, are they each part of 2 federations?
I can see index allocation, and incentivization as completely different specs.
> I suppose that is the purpose of this spec
It all boils down to the words "Collaborating IPNI indexers".
The current spec aggregates provider lists across all indexers. This means one indeed cannot have an indexer that only indexes a proportion of providers instead of all of them.
In the case you described, at least 2 of those indexers need to agree on the heuristic for selecting providers. Right now that heuristic in cid.contact is simply "All that you can get your hands on but not the ones you have not seen for a week". If indexer-1, 2 and 3 follow the current heuristic and are part of the same federation, then they should all end up with A, B and C in their providers list.
The specification of what that heuristic is, IMO, is beyond the scope of this spec. I think we can both imagine cases where it can grow quite complicated, and could potentially lead to a lot of head scratching for users trying to look things up.
Considering the spec does not prohibit the federation of a set of indexers that agree upon the same heuristic to select providers, my vote would be to keep the scope small and aim for a federation of cid.contact-like indexers going first.
In that world, one could use DNS names to form separate federations across indexers that only care about "hot off the wire" records. For example, today.cid.contact could be a new indexer part of a separate federation of indexers that only index ads published within the last 24 hours.
| for content propagation. As multiple IPNI instances exist across the globe, the potential for discrepancies in the state | ||
| of these chains is undeniable. It becomes essential, therefore, to devise strategies that can reconcile these | ||
| discrepancies effectively. | ||
|
|
Is there a use case to check that indexers are indexing what they claim to index? An indexer may try to only index popular content, dropping data it does not get any hits on after some amount of time. Another indexer (or index checker) could prove that an indexer is not indexing some content that it should be, and that could result in some penalty or reputation damage.
I think there certainly is but checking indexers' behaviour is an independent problem: it applies to a single index too, not just a cluster of federated ones.
Adding that to the federation protocol itself seems like possible future extension. But I am not sure if it is a blocker for having a federation.
| advertisement head. | ||
|
|
||
| While traversing advertisement chains, instances should prefer pulling ad chain from eachother's mirror instead of going | ||
| directly to the provider where possible. |
This seems unfair to the faster indexer, because all the other indexers will pull from the mirror of the fastest. Maybe we need a strategy that strongly encourages other indexers to add content from others' mirrors to their own.
For example, an indexer cannot pull advertisement data directly from another mirror, but instead must sync some portion of the mirror and then get the advertisements from the local mirror after syncing. This forces the receiving indexer to have a copy of the data in its own mirror.
The original source indexer may then redirect others, for a short time, to the destination indexer for the synced content. Or, maybe there is some check that if one indexer copies advertisement data from another indexer's mirror, the indexer that received the data subsequently serves it from its mirror.
I think in practice, the fastest indexer would/should have rate limiting in place, and other indexers should scatter traffic across multiple sources simply to pull ads from different providers in parallel.
The key point I am hoping to get across with this sentence is to say: try going to mirrors first before going to providers. That doesn't mean the same mirror all the time.
The "fastest indexer" would also depend on where the providers and indexers physically are. I expect that the indexer network would naturally grow around the organisational structure of its providers. For example, if there is a heavy presence of providers in Australia, it is likely that there would be more than one indexer in Australia to cope with the demand. At this point I find myself heavily speculating, which is probably a hint that we might be overthinking it.
Another factor here is the loosely defined concept of Mirror itself: there is no specification yet that documents the API for a mirror. We are in the process of figuring out its economics, rate limits etc.
So I think we would probably be OK revisiting this once there is more data.
Thoughts?
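To illustrate what "mirrors first, but not the same mirror all the time" could look like, here is a rough sketch. The function name `candidate_sources` and the example host names are made up for illustration; since there is no mirror API spec yet, this only shows the ordering of sources, not how to fetch from them.

```python
import random
from typing import List


def candidate_sources(mirrors: List[str], provider: str, rng=random) -> List[str]:
    """Return sources to try in order: mirrors in random order, then the provider.

    Shuffling per call scatters load across mirrors instead of always
    hitting the same (fastest) one. The provider is the last resort.
    """
    shuffled = list(mirrors)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled + [provider]


# Hypothetical addresses, for illustration only.
mirrors = ["mirror-a.example", "mirror-b.example", "mirror-c.example"]
order = candidate_sources(mirrors, "provider-p1.example")
```

Rate limiting on the serving side would still be needed; this only covers the pulling side.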
Is there anything that encourages one indexer to help another, when those indexers are not operated by the same organization? I guess just that it allows the other indexer to supply the ad data and reduces load on the first.
In other blockchain indexing ecosystems, the indexers are interested in handling as many queries as they can for the data they index - so they do not want to help other indexers. In the IPNI ecosystem, it appears that indexers are interested in doing as little of the required work as possible - so want other indexers to handle as many queries as possible.
For now, I am OK with allowing indexers to share their mirror as much or as little as possible, with whatever limits they see fit, and specifying this at some later time.
Captured this in #29
aschmahmann
left a comment
Great to see the interest here 🙏. Apologies for the late review and hope it's of use.
| A snapshot consists of: | ||
|
|
||
| * **Epoch**: a monotonically increasing value that represents the time at which snapshot was formed. | ||
| * **Vector Clock**: the monotonically increasing vector clock value that corresponds to the IPNI instance. |
This doesn't seem like a vector clock since it's an integer and not a vector. Is the idea that despite there being no communication around the vector clock of all IPNI instances in the federation this is something nodes are tracking locally to understand how far behind they are?
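For reference, a minimal sketch of what an actual vector clock would look like, as opposed to a single integer. The instance IDs and the helper names `merge` and `happened_before` are illustrative assumptions and do not come from the spec.

```python
from typing import Dict

# Hypothetical representation: instance ID -> that instance's local counter.
VectorClock = Dict[str, int]


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, applied when one instance learns another's clock."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    """True if a is causally before b: a <= b in every slot and a != b."""
    keys = a.keys() | b.keys()
    no_slot_ahead = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    some_slot_behind = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_slot_ahead and some_slot_behind


a = {"indexer-1": 3, "indexer-2": 1}
b = {"indexer-1": 2, "indexer-2": 4}

# Neither clock dominates the other, so the two states are concurrent.
# A single integer cannot express this: two integers are always ordered.
concurrent = not happened_before(a, b) and not happened_before(b, a)
```

If the value in the snapshot is only ever one integer per instance, it behaves like a local logical counter rather than a vector clock.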
| **Resolution Strategy:** | ||
| This represents a divergence in the advertisement chain, and careful reconciliation is required: | ||
|
|
||
| 1. Instance A and B determine which advertisement is the later one in the chain of advertisements provided by P1. |
It seems like #30 would make life a lot easier here. However, IMO it's also worth calling out what should happen if the advertisement chains diverge (e.g. P1 forks their advertisement chain and gives different information to A and B in order to mess with their reputations).
A couple ways to do this:
- Nice: Just choose one based on some criteria (e.g. based on sequence number and then lexicographical multihash ... although if you're worrying about keeping history, since the best chain could flip-flop between multiple, it could be annoying)
- Harsh: Keep both signed advertisements around and propagate to the network as an "everyone just delete this provider, they don't play by the rules". Might require keeping those two records around for a while to make sure all the providers know to remove them. Would require some support here in the sync protocol to enable communicating about this.
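A sketch of the "nice" option, assuming each head carries a sequence number (as proposed in #30) and a multihash. The `Head` type and `pick_head` are hypothetical names, and the tie-break direction (smallest multihash wins) is an arbitrary assumption; the only requirement is that every instance applies the same rule.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Head:
    seq: int          # hypothetical per-provider sequence number
    multihash: bytes  # multihash of the head advertisement


def pick_head(candidates: Iterable[Head]) -> Head:
    """Deterministically choose one head among competing ones.

    Highest sequence number wins; ties are broken by the lexicographically
    smallest multihash, so all instances converge on the same choice no
    matter which fork they saw first.
    """
    return min(candidates, key=lambda h: (-h.seq, h.multihash))


# Two forks of P1's chain at the same height, seen by different instances.
# The multihash bytes are placeholders, not real digests.
seen_by_a = Head(seq=10, multihash=b"\x12\x20" + b"\xaa" * 32)
seen_by_b = Head(seq=10, multihash=b"\x12\x20" + b"\x0f" * 32)
chosen = pick_head([seen_by_a, seen_by_b])
```

This does not address the flip-flop concern: if the provider keeps extending both forks, the chosen chain can still change each time one fork gets ahead.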
| * **Strong consistency**: While the IPNI federation protocol aims to ensure data consistency and availability, it does | ||
| not aim to atomically replicate every piece of indexed data across all nodes. For example, it is possible for some | ||
| records to remain partially available on one node until deletions are propagate through the network. |
Is this statement the same as the "real-time synchronization" non-goal, or is this saying that it's actually ok if the federation never reaches consistency (e.g. it's ok for a single federation to include ipni.china that only hold Chinese providers, and ipni.us that only hold US providers)?
Note: I get that technically the HTTP APIs here don't really care, but in practice a federation (e.g. the /mainnet federation) would have to care about this.
| } | ||
| ``` | ||
|
|
||
| ### GET `/ipni/v1/fed/{CID}` |
Are there any expectations on how frequently this is meant to be called and in what scenarios? Given that calling this on one IPNI instance doesn't tell you anything about the others it's a little confusing why and how many other IPNIs you'd ask for /fed/{CID} vs asking the many providers for /ad/head (assuming /ad/head came with the sequence number)? I can see why you'd rather ask the federation nodes vs the providers, but if you have to ask multiple federation nodes to gain confidence then you're also adding overhead.
| #### Case 3: A provider is known by both but with different head advertisements | ||
|
|
||
| **Scenario:** | ||
| Instance A and Instance B both know Provider P1, but they have different latest advertisements (heads) for P1's chain. |
As I understand it today it's unspecified what the conditions are for keeping and serving a provider record. IIUC cid.contact tries to contact providers once a day and being unresponsive for some number of those will result in eviction.
If those rules are inconsistent across the federation, or certain providers are simply inaccessible by certain IPNI instances (e.g. Chinese providers inaccessible by US IPNI instances), it could lead to unexpected results. Whether here or in a document describing the behavior for /mainnet this should be specified. Maybe it's as simple as saying a provider is banned unless any member of the federation vouches for them being reachable, or maybe some "vote" is required to ban them.
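The two eviction rules floated above could be sketched as follows. The function names, the shape of the reachability reports, and the `quorum` threshold are all assumptions for illustration; nothing here is specified anywhere yet.

```python
from typing import Dict


def evict_unless_vouched(reachable: Dict[str, bool]) -> bool:
    """Rule 1: evict only if no federation member can reach the provider."""
    if not reachable:
        return False  # no reports at all: do not evict on missing data
    return not any(reachable.values())


def evict_by_vote(reachable: Dict[str, bool], quorum: float = 0.5) -> bool:
    """Rule 2: evict if more than `quorum` of the members report unreachable."""
    if not reachable:
        return False
    unreachable = sum(1 for ok in reachable.values() if not ok)
    return unreachable / len(reachable) > quorum


# Hypothetical reports: member name -> could it reach the provider?
reports = {"ipni.us": False, "ipni.china": True, "ipni.eu": False}

keep_under_vouching = not evict_unless_vouched(reports)  # one member vouches
evicted_under_vote = evict_by_vote(reports)              # 2 of 3 cannot reach it
```

The same reports give opposite outcomes under the two rules, which is why the choice probably needs to be written down per federation.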
| This section clarifies aspects that the IPNI federation protocol will not address in its current iteration. | ||
| Here are the non-goals for the IPNI federation protocol: | ||
|
|
||
| * **Permissionless Membership**: This specification assumes a permissioned network, where the membership is controlled |
Given this is setting the stage for a permissioned network it seems like there should be a few more benefits to clients given they're easier to achieve than in a permissionless environment.
While it may be the case that this protocol can be meant for background sync (and helping fleets of nodes accumulate partial IPNI state such that an overlay network can compute the full state with a TBD protocol), my understanding from the motivation is that this is meant to help clients be able to use >1 indexer and evolve beyond the current state where in practice cid.contact is hard coded everywhere.
To that end some items that would be helpful include:
- How can I find the set of participants to query that are supposed to have the full IPNI state (given a root of trust like some bootstrap clients, a domain name, a blockchain identifier, ...)?
- At least for /mainnet some document that helps with rules/expectations for that network would be great. For example:
  - What are the expectations around initial ingestion time/rates (e.g. I understand there are concerns around people advertising long chains of tiny advertisements, here might be a place where those expectations can be set)
  - Expectations around whether every "full node" should actually reach the same state. e.g. is choosing to exclude a provider grounds for expulsion?
  - Expectations around how slow of a sync is slow enough to report an incident
- A description of how a basic client that's able to pull from multiple endpoints could work (if this is too rough to describe in text that's not great, but then a code reference implementation would be helpful)
|
|
||
| * **Epoch**: a monotonically increasing value that represents the time at which snapshot was formed. | ||
| * **Vector Clock**: the monotonically increasing vector clock value that corresponds to the IPNI instance. | ||
| * **Provider to Ingest State Map**: a map of provider ID to the latest advertisement CID processed by the IPNI |
Does "processed" mean there's an expectation here that if a client sees "height 10" that all of the advertisements up to height 10 will be queryable (by clients or other IPNI nodes), or does that mean they've simply heard about the latest CID/height? Similarly, must height have been validated?
Note: I get that you may not want to be prescriptive here around caching / loadbalancing strategies (e.g. could a client end up with inconsistent state due to a load balancer routing the data to different backends rather than being sticky), either way it'd probably be helpful to call this out for any client implementers.
Yes, "processed" means that the advertised content should be queryable, otherwise that status has little useful external meaning. At the very least it should mean that the indexer has pulled the index content from the provider and the corresponding advertisements are available from the indexer's mirror. I prefer it to mean the content is also queryable.
| consistently synchronized across all instances. Future enhancements may introduce additional layers of optimization or | ||
| fault-tolerance, but the essence will remain rooted in these foundational principles. | ||
|
|
||
| ### Snapshot Exchange |
As @hsanjuan mentioned elsewhere you might consider if a more prescriptive syncing scheme (e.g. having IPNI nodes post CRDT updates to a shared kv-store like ipfs-cluster uses https://github.com/ipfs/go-ds-crdt) would help you here.
This could reduce the amount of data emitted by snapshots as well as help with the synchronization among many nodes rather than leaving the "how to find the latest state" work up to the individual instances.
|
Notes from going over:
|
|
|
||
| * **Epoch**: An epoch in the context of the IPNI federation protocol refers to a specific time period or iteration of | ||
| the protocol's operation. It can be thought of as a version or generation number that increments each time there's a | ||
| major operation or update, aiding in synchronization efforts across instances. |
Who determines what an epoch consists of? Different IPNI instances may have different rates of progress for different index providers, depending on network location, bandwidth limits, etc. If any single IPNI instance tries to declare an epoch based on its view of indexing progress, then other IPNI instances may never be able to reach consistency because they will always be ahead for some index providers.
If an epoch is a representation of an indexer's state (snapshot ID), other indexers will not know how to reach consistency by ingesting data from index providers, because it is necessary to know what data from each provider to ingest (how far along advertisement chain). If consistency is reached by exchanging records until there is an empty diff, then the indexer's data will be out of sync with the indexing progress for some index providers and further progress may cause an invalid data set.
I think this means an epoch is a set of agreed-on progress points for all index providers in the federation, that each IPNI node can independently determine. A distance from the root of each provider's advertisement chain could serve this purpose. A date may not be a good example unless the IPNI nodes have access to a shared clock.
this sounds like consensus on a vector clock?
Different epochs can contain different sets of index providers, but an epoch has a specific set of index providers associated with it. The creator of an epoch defines the set of index providers, and the progress points (epoch synchronization points) for each index provider.
An epoch needs to be uniquely identified if there are multiple creators of epochs. That could be the epoch data itself or a hash or CID of the epoch data. If not using the data itself, then different IPNI instances will need to be able to resolve an epoch identifier to the identified epoch data.
An epoch should contain the ID (hash) of the previous epoch.
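As a rough sketch of the idea, an epoch could be a map of provider ID to progress point plus the previous epoch's ID, identified by a hash of its canonical encoding. The function `epoch_id`, the JSON encoding, and the use of a SHA-256 hex digest in place of a CID are all assumptions made to keep the example self-contained.

```python
import hashlib
import json
from typing import Dict, Optional


def epoch_id(progress: Dict[str, int], prev: Optional[str]) -> str:
    """Derive an epoch identifier from the epoch data itself.

    progress: provider ID -> distance from the root of that provider's
              advertisement chain (the epoch synchronization point).
    prev:     identifier of the previous epoch, or None for the first one.
    """
    # Canonical encoding: sorted keys and fixed separators, so every
    # instance hashes exactly the same bytes for the same epoch data.
    payload = json.dumps(
        {"prev": prev, "progress": progress},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


first = epoch_id({"P1": 100, "P2": 40}, prev=None)
second = epoch_id({"P1": 150, "P2": 40}, prev=first)
```

Because the ID is derived from the data, any instance that holds the epoch data can verify the ID independently, and the `prev` link chains epochs together.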
|
|
||
| * **Reconciliation**: Reconciliation is the process of ensuring that two or more IPNI instances have consistent index | ||
| records. It involves comparing the content of instances, identifying discrepancies, and making updates so that all | ||
| participating instances have the same content. |
In order to reach consistency, IPNI nodes would need to reach the same epoch. To reach an epoch an IPNI node needs to reach a pre-arranged point of progress for each index provider, but not progress beyond that point, until the pre-arranged point for all other index providers is reached.
Reconciliation requires IPNI nodes to wait for each other to reach the same point, defined as the epoch, in the ingestion of index data. I think Reconciliation should be a non-goal, and without Reconciliation, consistency is only achieved when all IPNI nodes have finished processing data from all index providers and there have been no updates from any index providers.
> Reconciliation requires IPNI nodes to wait for each other to reach the same point
Not necessarily. The ingest velocity is a separate thing from what the lookup returns in response to a given CID.
> but not progress beyond that point, until the pre-arranged point for all other index providers is reached.
is it possible for new entries to be added for a cid, but with a timestamp, such that responses to queries at an earlier epoch are then filtered back out, and as such the response is consistent for that epoch even when the node has moved past it?
I am thinking about the case where something in a later epoch deletes items from an earlier epoch, making those items no longer queryable. Would that prevent reaching consensus at that earlier epoch?
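One way to picture both the filtering idea and the deletion concern: if entries record the epoch in which they were added, and deletions are kept as tombstones rather than physical removals, a node that has moved on can still answer as of an earlier epoch. This is only a sketch under that assumption; the `Entry` fields and `lookup_at` are hypothetical, and it says nothing about the storage cost of retaining tombstones.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    provider: str
    added_epoch: int
    # Tombstone: the epoch in which the entry was deleted, if any.
    removed_epoch: Optional[int] = None


def lookup_at(entries: List[Entry], epoch: int) -> List[str]:
    """Providers visible for a CID as of `epoch`, even if the node is past it."""
    return [
        e.provider
        for e in entries
        if e.added_epoch <= epoch
        and (e.removed_epoch is None or e.removed_epoch > epoch)
    ]


# Entries for one CID on a node that has already ingested up to epoch 5.
entries = [
    Entry("P1", added_epoch=1),
    Entry("P2", added_epoch=2, removed_epoch=4),
    Entry("P3", added_epoch=5),
]

at_epoch_3 = lookup_at(entries, 3)  # P2 still visible, P3 not yet added
at_epoch_5 = lookup_at(entries, 5)  # P2 deleted, P3 added
```

If deletions physically remove the entry instead, the answer for epoch 3 can no longer be reconstructed, which is the case raised above.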
| * **Snapshot**: A snapshot refers to a captured state of the index records at a particular point in time consisting of | ||
| the providers list known by the indexer, the latest processed advertisement CID for each provider, an epoch number and | ||
| link to the previous snapshot. IPNI instances can share snapshots with each other to quickly update and reconcile | ||
| their content. |
Should there be a strict correspondence between a snapshot and an epoch?
If no (indicated by the original description), then a snapshot is the index data at the time the snapshot was created. Specifically, it is the difference between the index data for this snapshot and that of the previous snapshot. The snapshot contains the items listed in the original description: providers list, latest processed advertisement CID for each provider, an epoch ID of the most recent past epoch, and link to the previous snapshot.
If yes, then there is a 1:1 relationship between a snapshot and an epoch. The snapshot is the index data for the epoch associated with the snapshot. More specifically, it is the _difference_ of index data between this snapshot and the previous. The snapshot is identified by the epoch it is associated with. The previous snapshot can be identified by the ID of the previous epoch, which is contained in the current epoch data. The advertisement CID for each provider is also part of the epoch data.
Having a strict correspondence between epoch and snapshot means that the data for an epoch can be retrieved from anywhere it is hosted. Any indexer can create a snapshot, and thereby validate other indexers' snapshots. Consistency can be achieved when syncing by snapshot.
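To make the snapshot contents concrete, here is a sketch of a snapshot as described in the quoted definition (providers, latest processed advertisement CID per provider, epoch, link to previous) together with a comparison of two snapshots. The `Snapshot` and `Diff` types, the field names, and the placeholder CID strings are assumptions, not the wire format from the spec.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Snapshot:
    epoch: int
    providers: Dict[str, str]       # provider ID -> latest processed ad CID
    previous: Optional[str] = None  # link to the previous snapshot, if any


@dataclass
class Diff:
    only_local: Set[str] = field(default_factory=set)
    only_remote: Set[str] = field(default_factory=set)
    different_head: Set[str] = field(default_factory=set)


def diff(local: Snapshot, remote: Snapshot) -> Diff:
    """Classify providers by how two snapshots disagree about them."""
    l, r = local.providers, remote.providers
    return Diff(
        only_local=set(l.keys() - r.keys()),
        only_remote=set(r.keys() - l.keys()),
        # Known by both but with different heads ("Case 3" in the spec).
        different_head={p for p in l.keys() & r.keys() if l[p] != r[p]},
    )


# Placeholder CID strings, for illustration only.
a = Snapshot(epoch=7, providers={"P1": "cid-a1", "P2": "cid-x"})
b = Snapshot(epoch=7, providers={"P1": "cid-b1", "P3": "cid-y"})
d = diff(a, b)
```

An empty `Diff` is what "consistent" would mean under this reading; with a strict 1:1 epoch correspondence, it would only be meaningful to compare snapshots taken for the same epoch.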
gammazero
left a comment
Added some comments/questions around the definition of epoch.
gammazero
left a comment
Need to resolve some of the questions before merging this.