From 4f84a9c049c79a9e7a052d2faf271fc6357e265a Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Fri, 23 Apr 2021 19:54:38 -0500 Subject: [PATCH 01/43] first commit --- implementation_details.md | 153 ++++++++++++++++++++++++ network_replication.md | 244 ++++++++++++++++++++++++++++++++++++++ replication_concepts.md | 129 ++++++++++++++++++++ rfcs/DELETEME.md | 1 - template.md | 73 ------------ 5 files changed, 526 insertions(+), 74 deletions(-) create mode 100644 implementation_details.md create mode 100644 network_replication.md create mode 100644 replication_concepts.md delete mode 100644 rfcs/DELETEME.md delete mode 100644 template.md diff --git a/implementation_details.md b/implementation_details.md new file mode 100644 index 00000000..eaecf726 --- /dev/null +++ b/implementation_details.md @@ -0,0 +1,153 @@ + +# Implementation Details +## Delta Compression +TBD + +## Area of Interest +TBD + +## "Clock" Synchronization +Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. For some reason, lots of people arrive at the idea that clients should estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step. + +That's overcomplicating it. What we really care about is: How much time passes between when the server receives my input and when that input is consumed? If the server simply tells clients how long their inputs are waiting in its buffer, the clients can use that information to converge on the correct lead. + +```rust +if received_newer_server_update: + // an exponential moving average is a simple smoothing filter + smoothed_age = (31 / 32) * smoothed_age + (1 / 32) * age + + // too late -> positive error -> speed up + // too early -> negative error -> slow down + error = target_age - smoothed_age + + // reset accumulator + accumulated_correction = 0.0 + + +time_dilation = remap(error + accumulated_correction, -max_error, max_error, -0.1, 0.1) +accumulated_correction += time_dilation * simulation_timestep + +tick_cost = (1.0 + time_dilation) * fixed_delta_time +``` + +If its inputs are arriving too early, a client can temporarily run fewer ticks each second to relax its lead. For example, a client simulating 10% slower would shrink their lead by 1 tick for every 10. + +Interpolation is the same. All that matters is the interval between received packets and how it varies. You want the interpolation delay to be as small as possible. + +```rust +if received_newer_server_update: + // an exponential moving average is simple smoothing filter + smoothed_delay = (31 / 32) * smoothed_delay + (1 / 32) * delay + smoothed_jitter = (31 / 32) * smoothed_jitter + (1 / 32) * abs(smoothed_delay - delay) + + target_interp_delay = smoothed_delay + (2.0 * smoothed_jitter); + smoothed_interp_delay = (31 / 32) * smoothed_interp_delay + (1 / 32) * (latest_snapshot_time - interp_time); + + // too early -> positive error -> slow down + // too late -> negative error -> speed up + error = -(target_interp_delay - smoothed_interp_delay) + + // reset accumulator + accumulated_correction = 0.0 + + +time_dilation = remap(error + accumulated_correction, -max_error, max_error, -0.1, 0.1) +accumulated_correction += time_dilation * time.delta_seconds() + +interp_time += (1.0 + time_dilation) * delta_time +interp_time = max(interp_time, predicted_time - max_lag_comp) +``` + +The key idea here is that simplifying the client-server relationship makes the problem easier. 
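Both snippets above lean on a `remap` helper that isn't defined anywhere. A minimal sketch, assuming it only clamps and linearly rescales the error into the allowed dilation range:

```rust
/// Hypothetical helper implied by the snippets above (not part of any existing API):
/// linearly rescale `value` from [in_min, in_max] to [out_min, out_max],
/// clamping first so the time dilation never exceeds the +/-10% bounds.
fn remap(value: f32, in_min: f32, in_max: f32, out_min: f32, out_max: f32) -> f32 {
    let t = ((value - in_min) / (in_max - in_min)).clamp(0.0, 1.0);
    out_min + t * (out_max - out_min)
}
```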
You *could* have the server apply inputs whenever they arrive, rolling back if necessary, but that would only complicate things. If the server never accepts late inputs and never changes its pace, no one needs to coordinate. + +## Lag Compensation +Lag compensation mainly deals with colliders. To avoid weird outcomes, lag compensation must run after all motion and physics systems. + +Once again, people get weird ideas about having the server estimate what interpolated state the client was looking at based on their RTT. Once again, that guesswork is unnecessary. + +Clients can just tell the server what they were looking at by bundling the interpolated tick numbers and the blend value inside the input payloads. + +``` + +tick number (predicted) +tick number (interpolated from) +tick number (interpolated to) +interpolation blend value + +``` +With this information, the server can reconstruct *exactly* what each client saw. + +Lag compensation goes like this: +1. Queue projectile spawns, tagged with the shooter interpolation data. +2. Restore all colliders to the earliest interpolated moment. +3. Replay forward back to the current tick, spawning the projectiles at the appropriate time and registering hits. + +After that's done, any surviving projectiles will exist in the correct time. The process is the same for raycasts weapons. + +Overwatch allows defensive abilities to mitigate lag-compensated shots. This is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. +[Link](https://youtu.be/W3aieHjyNvw?t=2492) + +Overwatch also reduces the number of intersection tests by calculating the movement envelope of each entity, the "sum" of its bounding volumes over the full lag compensation window, and only rewinding characters whose movement envelopes intersect projectiles. +[Link](https://youtu.be/W3aieHjyNvw?t=2226) + +For clients with very high ping, their interpolated time will lag too far behind their predicted time. Generally, you won't want to favor the shooter past a certain limit (e.g. 250ms), so those clients will have to extrapolate the difference. Not extrapolating is also valid, but then lagging clients would abruptly have to start leading their targets. +[Link](https://youtu.be/W3aieHjyNvw?t=2347) + +This limit is the only relation between the predicted time and the interpolated time. They're otherwise decoupled. + +## Smooth Rendering +Whenever clients receive an update with new remote entities, those entities shouldn't be rendered until that update is interpolated. + +Is an exponential decay enough for smooth error correction or are there better algorithms? + +## Prediction <-> Interpolation +TBD + +Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. How though? The obvious implementation is to literally fork the latest authoritative state. If this copy ends up being too expensive, we probably use copy-on-write. + +Clients should predict the entities driven by their input, the entities they spawn (until confirmed), and any entities mutated as a result the first two. I think that should cover it. Predicting *everything* would be a compile-time choice. + +So we can predict with component granularity, the question is: How do we shift things between predicted and interpolated? Current idea is have things be interpolated by default (reset upon receiving a server update) and then using specialized change detection `DerefMut` magic to produce `Predicted`. 
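As a rough illustration of that `DerefMut` idea, the sketch below uses an invented wrapper type and field names (not an existing Bevy API): a replicated component starts out as plain interpolated server state and flags itself as predicted the first time a gameplay system takes a mutable reference, with the flag cleared whenever an authoritative update overwrites the value.

```rust
// Illustrative sketch only; `Replicated<T>` and `predicted` are made-up names.
use std::ops::{Deref, DerefMut};

struct Replicated<T> {
    value: T,
    // Reset to false whenever a server update overwrites `value`.
    predicted: bool,
}

impl<T> Deref for Replicated<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.value
    }
}

impl<T> DerefMut for Replicated<T> {
    // Taking `&mut` is treated as "this client changed it locally", mirroring
    // how Bevy's change detection flags a component as mutated on DerefMut.
    fn deref_mut(&mut self) -> &mut T {
        self.predicted = true;
        &mut self.value
    }
}
```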
+ +All systems that handle "predictable" interactions (pushing a button, putting an item in your inventory) should run *before* physics. Everything should run before rendering. + +Cameras need a little special treatment. Inputs to the view rotation need to be accumulated at the render rate and re-applied just before rendering. + +Should UI be allowed to reference predicted state or only verified state? + +Events are tricky. We'll need some events that only trigger on authoritative changes and others that trigger on predicted changes with follow-up events on confirmed or cancelled later. + +We'll need something like the latter to handle sounds and particle effects. Those shouldn't be duplicated during rollbacks and should be faded out if mispredicted. + +## Predicting Entity Creation +This requires some special consideration. + +The naive solution is to have clients spawn dummy entities. When an update that confirms the result arrives, clients can simply destroy the dummy and spawn the true entity. This is a frankly poor solution because it prevents clients from smoothly blending these entities from predicted time into interpolated time. It won't look right. + +A better solution is for the server to assign each networked entity a global ID that the spawning client can predict and map to its local instance. + +- The simplest form of this would be an incrementing index whose upper bits are fixed to match the spawning player's ID. This is my recommendation. + +- Alternatively, PRNGs could be used to generate shared keys for pairing global and local IDs. Rather than predict the global ID, the client would predict the shared key. Server updates that confirm the predicted entity would include both its global ID and the shared key, which the client can then use to pair the IDs. This method adds complexity but bypasses the previous method's implicit entity limit. + +- An extreme potential solution would be to somehow bake global IDs directly into the memory allocation. If memory layouts are mirrored, relative pointers become global IDs, which don't need to be explicitly written into packets. This would save 4-8 bytes per entity before compression. + +## Unconditional Rollbacks +Every article on "rollback netcode" and "client-side prediction and server reconciliation" encourages having clients compare their predicted state to the authoritative state and reconciling *if* they mispredicted. But how do you actually detect a mispredict? + +I thought of two methods while I was writing this: + +1. Unordered scan looking for first difference. +2. Ordered scan to compute checksum and compare. + +The first option has an unpredictable speed. The second option requires a fixed walk of the game state (checksums *are* probably worth having even if only for debugging non-determinism). There may be options I didn't consider, but the point I'm trying to make is that detecting changes among large numbers of entities isn't cheap. + +Let's consider a simpler default: + +3. Always rollback and re-simulate. + +Now, you might be thinking, "Isn't that wasteful?" + +*If* gives a false sense of security. If I make a game and claim it can rollback 250ms, that basically should mean *any* 250ms. With no stuttering. If clients *always* rollback and re-sim, it'll be easier to profile and optimize for that. As a bonus, clients can immediately toss old predicted states. + +Constant rollbacks may sound expensive, but there were games with rollback running on the original Playstation 20+ years ago. 
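A minimal sketch of that unconditional loop on the client side, with placeholder types and function names rather than a real API: restore the newest verified state, then replay the buffered local inputs up to the currently predicted tick.

```rust
// Everything here is illustrative; `GameState`, `PlayerInputs`, and
// `simulate_tick` stand in for the real save/restore and simulation entry points.
#[derive(Clone)]
struct GameState {
    tick: u64,
    // ... replicable component data
}

struct PlayerInputs;

fn simulate_tick(state: &mut GameState, _inputs: &PlayerInputs) {
    // Advance one fixed timestep.
    state.tick += 1;
}

fn client_fixed_update(
    verified: &GameState,                 // latest authoritative state from the server
    input_buffer: &[(u64, PlayerInputs)], // locally buffered inputs, sorted by tick
    predicted_tick: u64,
) -> GameState {
    // 1. Unconditionally restore the verified state; no mispredict check.
    let mut state = verified.clone();
    // 2. Re-apply every buffered input from the verified tick up to the
    //    tick the client is currently predicting.
    for (tick, inputs) in input_buffer {
        if *tick >= state.tick && *tick < predicted_tick {
            simulate_tick(&mut state, inputs);
        }
    }
    state
}
```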
\ No newline at end of file diff --git a/network_replication.md b/network_replication.md new file mode 100644 index 00000000..115b74a6 --- /dev/null +++ b/network_replication.md @@ -0,0 +1,244 @@ +# Feature Name: `networked-replication` + +## Summary + +This RFC describes an implementation of engine features for developing networked games. Its main focus is replication and its key interest is providing these systems transparently (i.e. minimal, if any, networking boilerplate). + +## Motivation + +Networking is unequivocally the most lacking feature in all general-purpose game engines thus far. "Networking" is actually a pretty loaded term. This RFC focuses on **replication**, the part of networking that deals with simulation behavior and the only one that directly involves the ECS. + +Most engines provide "low level" connectivity—virtual connections, optionally reliable UDP channels, rooms—and stop there. Those are not very useful without "high level" replication features—prediction, reconciliation, lag compensation, area of interest management, etc. + +> The goal of replication is to ensure that all of the players in the game have a consistent model of the game state. Replication is the absolute minimum problem which all networked games have to solve in order to be functional, and all other problems in networked games ultimately follow from it. - [Mikola Lysenko](https://0fps.net/2014/02/10/replication-in-networked-games-overview-part-1/) + +Bevy has an opportunity to become one of the first, if not *the* first, open game engines to offer a plug-and-play networking API. + +Among Godot, Unity, and Unreal, only Unreal provides any of these built-in (and dogfooded in Fortnite). + +IMO the absence of built-in replication systems leads many to conclude that every multiplayer game must need its own unique solution. This is not true. While the exact replication "strategy" depends of the game, all of them—lockstep, rollback, client-side prediction with server reconciliation—pull from the same bag of tricks. Their differences can be captured with simple configuration options. Really, only *massive* multiplayer games require custom solutions. + +In general, I think that building *up* from the socket layer leads to the wrong intuition about what "networking" is. If you start from what the simulation needs and design *down*, all the reasoning is clearer and routing becomes an implementation detail. + +What I hope to explore in this RFC is: +- How do game design and networking constrain each other? +- How do these constraints affect user decisions? +- What should developing a networked game look like in Bevy? 
+ +## Guide-level explanation + +[TBD](../blob/main/replication_concepts.md) + +### Example: "Networked" Components + +```rust +#[derive(Replicate)] +struct NetworkTransform { + #[replicate(precision=0.001)] + translation: Vec3, + #[replicate(precision=0.01)] + rotation: Quat, + #[replicate(precision=0.1)] + scale: Vec3, +} + +#[derive(Replicate)] +struct Health { + #[replicate(precision=0.1, range=(0.0, 100.0))] + hp: f32, +} +``` + +### Example: "Networked" Systems +```rust +fn check_zero_health(mut query: Query<(&Health, &mut NetworkTransform)>){ + for (health, mut transform) in query.iter_mut() { + if health.hp <= 0.0 { + transform.translation = Vec3::ZERO; + } + } +} +``` + +### Example: "Networked" App +```rust +#[derive(Debug, Hash, PartialEq, Eq, Clone, SystemLabel)] +pub enum NetworkLabel { + Input, + Gameplay, + Physics, + LagComp +} + +fn main() { + App::build() + .add_plugins(DefaultPlugins) + .add_plugins(NetworkPlugins) + + // Add the fixed update state. + .add_state(AppState::NetworkFixedUpdate) + .run_criteria(FixedTimestep::step(1.0 / 60.0)) + + // Add our game systems: + .add_system_set( + SystemSet::new() + .label(NetworkLabel::Input) + .before(NetworkLabel::Gameplay) + .with_system(sample_inputs.system()) + ) + .add_system_set( + SystemSet::on_update(AppState::NetworkFixedUpdate) + .label(NetworkLabel::Gameplay) + .before(NetworkLabel::Physics) + .with_system(check_zero_health.system()) + // ... Most user systems would go here. + ) + #[server_only] + .add_system_set( + SystemSet::on_update(AppState::NetworkFixedUpdate) + .label(NetworkLabel::LagComp) + .after(NetworkLabel::Physics) + // ... + ) + // ... + .run(); +} +``` + +### Example Configuration Options +``` +- players: 32, +- max networked entities: 1024, +- replication strategy: snapshots, + - mode: listen server, + - simulation tick rate: 60Hz + - client send interval: 1 tick, + - server send interval: 2 ticks, + - client input delay: 0, + - server input delay: 2 ticks, + - prediction: local-only, + - rollback window: 250ms, + - min interpolation delay: 32ms, + - lag compensation: true + - compensation window: 200ms +``` + + +## Reference-level explanation + +[TBD](../blob/main/implementation_details.md) + +### Macros +- Adds `[repr(C)]` +- Float quantization and compression +- Conditional compilation of client and server logic +- Identifying components for snapshot generation + +### Saving and Restoring Game State +Requirements +- Replicable components must only be mutated in `NetworkFixedUpdate`. +- World needs to reserve a range of entity IDs and track metadata for them separately. +- Networked entities must be spawned as such. You cannot spawn a non-networked entity and "network it" later, at least not without some kind of RPC. + +Saving +- At the end of every fixed update, iterate the `Added`, `Changed`, and `Removed` for all replicable components and duplicate them to an isolated copy. +- This isolated copy would be a collection of sparse sets, for just the replicable components. Tables would be rebuilt when restoring. + +Packets +- For snapshots, also compute the changes as a XOR and copy that into a ring buffer of patches. XOR the latest patch with the earlier patches to bring them up-to-date. Finally, write the packets and pass them to the protocol layer. +- For eventual consistency, we need some metadata. Entities accrue send priority over time. We can use the magnitude of the changes (addition or removal would be largest magnitude) as the base amount to accrue. 
We can then run a bipartite AABB sweep-and-prune followed by a radial distance test to prioritize the entities physically inside each client's areas of interest. Then any user-defined prioritization rules could run. Finally, write the packets and pass them to the protocol layer. + +Restoring +- TBD + +### NetworkFixedUpdate +Clients +1. Poll for received updates. +2. Update simulation and interpolation timescales. +3. Sample and send inputs to server. +4. Rollback and re-sim (if received new update). +5. Simulate predicted tick. + +Server +1. Poll for received inputs. +2. Sample buffered inputs. +3. Simulate authoritative tick. +4. Duplicate state changes to copy. +5. Send updates to clients. + +Everything aside from the simulation steps can be generated automatically. + +### Networking Modes +- listen server + - client and server instances on same machine + - single player = listen server with dummy socket / no connections +- dedicated server +- relay + - for managing deterministic and client-authoritative games + - clock reference, input validation, interest management, etc. but no simulation + +## Drawbacks +- Possibly cursed macro magic. +- Writes to World directly. +- Seemingly limited to components that implement `Clone` and `Serialize`. + +## Rationale and alternatives +- Why is this design the best in the space of possible designs? + +Networking is a widely misunderstood problem domain and multiplayer often enters the conversation too far into development. The proposed implementation or interface should suffice for most games while minimizing design friction—users need only annotate gameplay-related components and systems, put those systems in `NetworkFixedUpdate`, and configure some settings. + +- What other designs have been considered and what is the rationale for not choosing them? + +Replication always boils down to sending inputs or state, so the space of alternative designs includes different choices for the end-user interface and different implementations of save/restore functions. + +Frankly, given the abundance of confusion surrounding networking, polluting the API with "networked" variants of structs and systems (aside from Transform, Rigidbody, etc.) would just make life harder for everybody, both game developers and Bevy maintainers. + +People who want to make multiplayer games want to focus on designing their game and not worry about how to implement prediction, how to serialize their game, how to keep packets under MTU, etc. All of that should just work. I think the ease of macro annotations is worth any increase in compile times when networking features are enabled. + +From their description, ["subworlds"](https://github.com/bevyengine/rfcs/pull/16) seem promising for generating snapshots and performing rollbacks, but the proposal needs more details on interop with the ECS and performance (so does this one lol). + +- What is the impact of not doing this? + +Without committing to support these features early, Bevy risks ending up like Unity, whose built-in features were too non-deterministic for the first kind of replication and whose only working solutions for the second kind appeared years later as [subscription-based](blank "monthly, per concurrent user, even when self-hosted") third-party plugins and couldn't integrate deeply enough to be transparent (at least not without duplicating parts of the engine). + +- Why is this important to implement as a feature of Bevy itself, rather than an ecosystem crate? 
+ +I strongly doubt that fast, efficient, and transparent replication features can be implemented without directly manipulating a World and its component storages. + +## Unresolved questions +- What can't be serialized? +- What is the correct amount of isolation between replicable and non-replicable game state? Are we sure non-networked entities can't *become* networked through the addition of replicable components? +- Can we provide lints for undefined behavior like mutating networked state outside of `NetworkFixedUpdate`? +- How should UI widgets interact with networked state? Exclusively poll verified data? +- How should we deal with predicting and reconciling events and FX—animations, audio, particles? +- Do rollbacks break change detection? +- Can we replicate animations exactly without explicitly sending animation parameters? +- When sending partial state updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed? + +## Future possibilities +- With some game state diffing tools, these replication systems could help detect non-determinism in other parts of the engine. + +- Much like how Unreal has Fortnite, it would help immensenly if Bevy had an official collection of multiplayer samples to dogfood these features. + +- Beyond replication, Bevy need only provide one good default for protocol and IO for the sake of completeness. I recommend dividing responsibilities as shown below to make it easy for developers to swap them with the [many](https://partner.steamgames.com/doc/features/multiplayer) [robust](https://developer.microsoft.com/en-us/games/solutions/multiplayer/) [platform](https://dev.epicgames.com/docs/services/en-US/Overview/index.html) [SDKs](https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html). + + **replication** + - save and restore + - prediction + - serialization and compression + - interest management, prioritization, level-of-detail + - smooth rendering + - lag compensation + - statistics + + **protocol** + - (N)ACKs and reliability + - channels + - connection authentication and management + - encryption + - statistics + + **I/O** + - send, recv, poll, etc. + +Replication addresses all the underlying ECS interop, so it should be settled first. diff --git a/replication_concepts.md b/replication_concepts.md new file mode 100644 index 00000000..38ba40e5 --- /dev/null +++ b/replication_concepts.md @@ -0,0 +1,129 @@ +# Replication +Abstractly, you can think of a game as a pure function that accepts an initial state and player inputs and generates a new state. +```rust +let state[n+1] = simulate(&state[n], &inputs[n]); +``` +Fundamentally, if several players want to perform a synchronized simulation over a network, they have basically two options: + +**Active replication** +- Send their inputs to each other and independently and deterministically simulate the game. +- also called lockstep, state-machine synchronization, and "determinism" + +**Passive replication** +- Send their inputs to a single machine (the server) who simulates the game and broadcasts updates back. +- also called client-server, primary-backup, master-slave, and "state transfer" + +In other words, players can either run the "real" game or follow it. + +Although the distributed computing terminology is probably more useful, for the rest of this RFC, I'll refer to active and passive replication as determinism and state transfer, respectively. They're more commonly used in the gamedev context. 
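To make the two options concrete, here is a rough sketch in Rust-flavored pseudocode; the types and function names are placeholders, not anything this RFC defines.

```rust
// Placeholder types for illustration only.
struct GameState;
struct Input;

fn simulate(_state: &GameState, _inputs: &[Input]) -> GameState {
    // Deterministic step: same state + same inputs => same result on every machine.
    GameState
}

// Active replication: every peer exchanges inputs and runs the full simulation.
fn active_step(state: &GameState, local: Input, remote: Vec<Input>) -> GameState {
    let mut inputs = remote;
    inputs.push(local);
    simulate(state, &inputs)
}

// Passive replication: clients send inputs to the server and simply adopt
// whatever authoritative state the server broadcasts back.
fn passive_step(_stale_local: &GameState, authoritative: GameState) -> GameState {
    authoritative
}
```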
+ +## Why determinism? +Determinism is straightforward. It's basically local multiplayer but with really long, sometimes ocean-spanning controller cables. The netcode is virtually independent from the gameplay code, it simply supplies the inputs. + +Determinism has low infrastructure costs, both in terms of bandwith and server hardware. All steady-state network traffic is input, which is not only small but also compresses well. (Note that as player count increases, there *is* a crossover point where state transfer becomes more efficient). Likewise, as the game runs completely on the clients, there's no need to rent powerful servers. Relays are still handy for efficiently managing rooms and scaling to higher player counts, but those could be cheap VPS instances. + +Determinism is also tamperproof. It's impossible to do anything like speedhack or teleport as running these exploits would simply cause cheaters to desync. On the other hand, determinism inherently leaks all information. + +The biggest strength of determinism is also its biggest limitation: every client must run the *entire* game. While this works well for games with thousands of micro-managed entities like *Starcraft 2*, you won't be seeing games with expansive worlds like *Genshin Impact* networked this way any time soon. + +## Why state transfer? +Determinism is awesome when it fits but it's generally unavailable. Neither Godot nor Unity nor Unreal can make this guarantee for large parts of their engines, particularly physics. + +Whenever you can't have or don't want bit-perfect determinism, you use state transfer. + +The idea behind state transfer is the concept of **authority**. It's essentially ownership in Rust. Those who own any state are responsible for broadcasting up-to-date information about it. I sometimes see authority divided into *input* authority (control permission) and *state* authority (write permission), but usually authority means state authority. + +The server usually owns everything, but authority is very flexible. In games like *Destiny* and *Fall Guys*, clients own their movement state. Other games even trust clients to confirm hits. Distributing authority like this adds complexity and obviously leaves the door wide open for cheaters, but sometimes it's necessary. In VR, it makes sense to let clients claim and relinquish authority over interactable objects. + +## Why not messaging patterns? +The only other strategy you really see used for replication is messaging. Like RPCs or remote events. Not sure why, but it's what most people try the first time. + +Take chess for example. Instead of sending polled player inputs or the state of the chessboard, you could just send the moves like "white, e2 to e4," etc. + +Here's the issue. Messages are tightly coupled to their game's logic. They can't be generalized. Chess is simple—one turn, one event—but what about an FPS? What messages would it need? How many? When and where would those messages need be sent and received? + +If those messages have cascading effects, they can only be sent reliable, ordered. How can you build prediction and reconciliation when you can't drop a packet? +```rust +let mut s = state[n]; +for message in queue.iter() { + s.apply(&message); +} + +// The key thing to note is that state[n+1] +// cannot be correct unless all messages were +// applied and applied in the right order. 
+*state[n+1] = s; +``` + +Messages are the right tool for when you really do want explicit request-response interactions or for global alerts like players joining or leaving. They just aren't good for replication. They encourage poor ergonomics, with send and receive calls littered everywhere. Even if you collect and send messages in batches, they don't compress as well as inputs or state. + +# Latency +Networking a game simulation so that players who live in different locations can play together is an unintuitive problem. No matter how we physically connect their computers, they most likely won't be able to exchange data within one simulation step. + +## Lockstep +The simplest form of online multiplayer is lockstep. All clients simply block until they have everything needed to execute the next simulation step. This delay is fine for most turn-based games but feels awful for real-time games. + +## Local Input Delay +A partial solution is for each client to delay the local player input for some number of simulation steps, trading a small amount of responsiveness for more time to receive remote info. Doing this also reduces the perceived latency between players. Under stable network conditions, the game will run smoothly, but it still stutters when the window is missed. + +> determinism + lockstep + local input delay = delay-based netcode + +## Predict and Reconcile +A more elegant way to hide the input latency is local prediction. + +Instead of blocking, clients can substitute any missing information with reasonable guesses (often reusing the previous value) and just run the simulation. Guessing removes the need to wait, removing perceived input lag, but what if the guesses are wrong? + +Well, what a client can do later is restore its simulation to the last verified state and redo the mispredicted steps with the correct info. + +This retroactive correction is called **rollback** or **reconciliation** and with a high simulation rate and good visual smoothing, it's practically invisible. Adding local input delay reduces the amount of rollback. + +> determinism + predict-rollback + local input delay (optional) = rollback netcode + +Once again, determinism is an all or nothing deal. If you predict, you predict everything. + +State transfer has the flexibility to predict only *some* things, letting you offload expensive systems onto the server. Games like *Rocket League* still predict everything, including other clients (the server re-distributes their inputs along with game state so that this is more accurate). However, most games choose not to do this. It's more common for clients to predict only what they control and interact with. + +# Consistency +## Smooth Rendering and Lag Compensation +Predicting only *some* things adds implementation complexity. + +When clients predict everything, they produce renderable state at a fixed pace. Now, anything that isn't predicted must be rendered using data received from the server. The problem is that server updates are sent over a lossy, unreliable internet that disrupts any consistent spacing between packets. This means clients need to buffer incoming server updates long enough to have two authoritative updates to interpolate most of the time. + +Gameplay-wise, not predicting everything also divides entities between two points in time: a predicted time and an interpolated time. Clients see themselves in the future and everything else in the past. 
Because players demand a WYSIWYG experience, the server must compensate for this "remote lag" by allowing certain things, mainly projectiles, to interact with the past. + +Visually, we'll often have to blend between extrapolated and authoritative data. Simply interpolating between two authoritative updates is incorrect. The visual state can and will accrue errors, but that's what we want. Those can be tracked and smoothly reduced (to some near-zero threshold, then cleared). + +If it's unclear: Hard snap the actual game state to reconcile but softly blend the view. + +# Bandwidth +## How much can we fit into each packet? +Not a lot. + +You can't send arbitrarily large packets over the internet. The information superhighway has load limits. The conservative, almost universally supported "maximum transmissible unit" or MTU is 1280 bytes. Accounting for IP and UDP headers and some connection metadata, you realistically can send ~1200 bytes of game data per packet. + +If you significantly exceed this, some random stop along the way will delay the packet and break it up into fragments. + +[Fragmentation](https://packetpushers.net/ip-fragmentation-in-detail/) [sucks](https://blog.cloudflare.com/ip-fragmentation-is-broken) because it multiplies the likelihood of the overall packet being lost (all fragments have to arrive to read the full packet). Getting fragmented along the way is even worse because of the added delay. It's okay if the sender manually fragments their packet (like 2 or 3) *upfront*, although the higher loss does limit simulation rate, just don't rely on the internet to do it. + +## Okay, but that doesn't seem like much? +Well, there are two more reasons not to yeet giant 100kB packets across the network: +- Bandwidth costs are the lion's share of hosting expenses. +- Many players still have limited bandwidth. + +So unless we limit everyone to <20Hz tick rates, our only options are: +- Send smaller things. +- Send fewer things. + +### Snapshots +Alright then, state transfer. The most obvious strategy is to send full **snapshots**. All we can do with these is make them smaller (i.e. quantize floats, then compress everything). + +Fortunately, snapshots are very compressible. An extremely popular idea called **delta compression** is to send each client a diff (often with further compression on top) of the current snapshot and the latest one they acknowledged receiving. Clients can then use these to patch their existing snapshots into the current one. + +The server can fragment payloads as a last resort. + +### Eventual Consistency +When snapshots fail or hidden information is needed, the best alternative is to prioritize sending each client the state most relevant to them. This technique is commonly called **eventual consistency**. + +Determining relevance is often called **interest management** or **area of interest**. Each granular piece of state is given a "send priority" that accumulates over time and resets when sent. How quickly priority accumulates for different things is up to the developer, though physical proximity and visual salience usually have the most influence. + +Eventual consistency can be combined with delta compression, but I wouldn't recommend it. It's just too much bookkeeping. Unlike snapshots, the server would have to track the latest received state for each *item* on each client separately and create diffs for each client separately. 
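A minimal sketch of the priority-accumulator idea described above, with illustrative names and sizes: every replicated item accrues priority over time, the packet is filled highest-priority-first, and whatever gets written has its priority reset.

```rust
// Illustrative only; field names, rates, and sizes are placeholders.
struct ReplicationEntry {
    priority: f32,     // accumulated send priority
    accrual_rate: f32, // how fast this piece of state gains priority
}

fn fill_packet(entries: &mut [ReplicationEntry], dt: f32, budget_bytes: usize) {
    // 1. Everything accrues priority over time (proximity, visual salience, and
    //    change magnitude would scale `accrual_rate` in a real system).
    for e in entries.iter_mut() {
        e.priority += e.accrual_rate * dt;
    }
    // 2. Send the highest-priority items that fit under the packet budget
    //    (assumes priorities are never NaN).
    entries.sort_by(|a, b| b.priority.partial_cmp(&a.priority).unwrap());
    let mut remaining = budget_bytes;
    for e in entries.iter_mut() {
        let size = 32; // placeholder serialized size of this item
        if size <= remaining {
            remaining -= size;
            // 3. Reset priority once the item is written into the packet.
            e.priority = 0.0;
        }
    }
}
```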
\ No newline at end of file diff --git a/rfcs/DELETEME.md b/rfcs/DELETEME.md deleted file mode 100644 index 2b210d2f..00000000 --- a/rfcs/DELETEME.md +++ /dev/null @@ -1 +0,0 @@ -Dummy file for git, please delete once the first RFC is merged. diff --git a/template.md b/template.md deleted file mode 100644 index 8e7bc765..00000000 --- a/template.md +++ /dev/null @@ -1,73 +0,0 @@ -# Feature Name: (fill me in with a unique ident, `my_awesome_feature`) - -## Summary - -One paragraph explanation of the feature. - -## Motivation - -Why are we doing this? What use cases does it support? - -## Guide-level explanation - -Explain the proposal as if it was already included in the engine and you were teaching it to another Bevy user. That generally means: - -- Introducing new named concepts. -- Explaining the feature, ideally through simple examples of solutions to concrete problems. -- Explaining how Bevy users should *think* about the feature, and how it should impact the way they use Bevy. It should explain the impact as concretely as possible. -- If applicable, provide sample error messages, deprecation warnings, or migration guidance. -- If applicable, explain how this feature compares to similar existing features, and in what situations the user would use each one. - -## Reference-level explanation - -This is the technical portion of the RFC. Explain the design in sufficient detail that: - -- Its interaction with other features is clear. -- It is reasonably clear how the feature would be implemented. -- Corner cases are dissected by example. - -The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work. - -## Drawbacks - -Why should we *not* do this? - -## Rationale and alternatives - -- Why is this design the best in the space of possible designs? -- What other designs have been considered and what is the rationale for not choosing them? -- What is the impact of not doing this? -- Why is this important to implement as a feature of Bevy itself, rather than an ecosystem crate? - -## \[Optional\] Prior art - -Discuss prior art, both the good and the bad, in relation to this proposal. -This can include: - -- Does this feature exist in other libraries and what experiences have their community had? -- Papers: Are there any published papers or great posts that discuss this? - -This section is intended to encourage you as an author to think about the lessons from other tools and provide readers of your RFC with a fuller picture. - -Note that while precedent set by other engines is some motivation, it does not on its own motivate an RFC. - -## Unresolved questions - -- What parts of the design do you expect to resolve through the RFC process before this gets merged? -- What parts of the design do you expect to resolve through the implementation of this feature before the feature PR is merged? -- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? - -## \[Optional\] Future possibilities - -Think about what the natural extension and evolution of your proposal would -be and how it would affect Bevy as a whole in a holistic way. -Try to use this section as a tool to more fully consider other possible -interactions with the engine in your proposal. - -This is also a good place to "dump ideas", if they are out of scope for the -RFC you are writing but otherwise related. 
- -Note that having something written down in the future-possibilities section -is not a reason to accept the current or a future RFC; such notes should be -in the section on motivation or rationale in this or subsequent RFCs. -If a feature or change has no direct value on its own, expand your RFC to include the first valuable feature that would build on it. From de08dd5b286ef7e4e07bd37c87eccba802cc62aa Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Fri, 23 Apr 2021 19:57:01 -0500 Subject: [PATCH 02/43] Update network_replication.md fixed links --- network_replication.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/network_replication.md b/network_replication.md index 115b74a6..fc76a6b3 100644 --- a/network_replication.md +++ b/network_replication.md @@ -27,7 +27,7 @@ What I hope to explore in this RFC is: ## Guide-level explanation -[TBD](../blob/main/replication_concepts.md) +[TBD](../main/replication_concepts.md) ### Example: "Networked" Components @@ -126,7 +126,7 @@ fn main() { ## Reference-level explanation -[TBD](../blob/main/implementation_details.md) +[TBD](../main/implementation_details.md) ### Macros - Adds `[repr(C)]` From 883af5155d1f6cfef254c1a9e204fd7c32da19e5 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Fri, 23 Apr 2021 21:29:08 -0500 Subject: [PATCH 03/43] Update network_replication.md Updating from Discord feedback. --- network_replication.md | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/network_replication.md b/network_replication.md index fc76a6b3..32c8aee3 100644 --- a/network_replication.md +++ b/network_replication.md @@ -12,9 +12,7 @@ Most engines provide "low level" connectivity—virtual connections, optionally > The goal of replication is to ensure that all of the players in the game have a consistent model of the game state. Replication is the absolute minimum problem which all networked games have to solve in order to be functional, and all other problems in networked games ultimately follow from it. - [Mikola Lysenko](https://0fps.net/2014/02/10/replication-in-networked-games-overview-part-1/) -Bevy has an opportunity to become one of the first, if not *the* first, open game engines to offer a plug-and-play networking API. - -Among Godot, Unity, and Unreal, only Unreal provides any of these built-in (and dogfooded in Fortnite). +Bevy has an opportunity to become one of the first open game engines to offer a plug-and-play networking API. Among Godot, Unity, and Unreal, only Unreal provides any of these built-in (and dogfooded in Fortnite). IMO the absence of built-in replication systems leads many to conclude that every multiplayer game must need its own unique solution. This is not true. While the exact replication "strategy" depends of the game, all of them—lockstep, rollback, client-side prediction with server reconciliation—pull from the same bag of tricks. Their differences can be captured with simple configuration options. Really, only *massive* multiplayer games require custom solutions. @@ -123,7 +121,6 @@ fn main() { - compensation window: 200ms ``` - ## Reference-level explanation [TBD](../main/implementation_details.md) @@ -141,8 +138,9 @@ Requirements - Networked entities must be spawned as such. You cannot spawn a non-networked entity and "network it" later, at least not without some kind of RPC. 
Saving -- At the end of every fixed update, iterate the `Added`, `Changed`, and `Removed` for all replicable components and duplicate them to an isolated copy. +- At the end of every fixed update, iterate `Changed` and `Removed` for all replicable components and duplicate them to an isolated copy. - This isolated copy would be a collection of sparse sets, for just the replicable components. Tables would be rebuilt when restoring. +- (From their description, [subworlds](https://github.com/bevyengine/rfcs/pull/16) seem like they could also be used for generating snapshots and performing rollbacks, but I need more details. Might be a lot of overhead.) Packets - For snapshots, also compute the changes as a XOR and copy that into a ring buffer of patches. XOR the latest patch with the earlier patches to bring them up-to-date. Finally, write the packets and pass them to the protocol layer. @@ -185,7 +183,7 @@ Everything aside from the simulation steps can be generated automatically. ## Rationale and alternatives - Why is this design the best in the space of possible designs? -Networking is a widely misunderstood problem domain and multiplayer often enters the conversation too far into development. The proposed implementation or interface should suffice for most games while minimizing design friction—users need only annotate gameplay-related components and systems, put those systems in `NetworkFixedUpdate`, and configure some settings. +Networking is a widely misunderstood problem domain. The proposed implementation should suffice for most games while minimizing design friction—users need only annotate gameplay-related components and systems, put those systems in `NetworkFixedUpdate`, and configure some settings. - What other designs have been considered and what is the rationale for not choosing them? @@ -195,11 +193,9 @@ Frankly, given the abundance of confusion surrounding networking, polluting the People who want to make multiplayer games want to focus on designing their game and not worry about how to implement prediction, how to serialize their game, how to keep packets under MTU, etc. All of that should just work. I think the ease of macro annotations is worth any increase in compile times when networking features are enabled. -From their description, ["subworlds"](https://github.com/bevyengine/rfcs/pull/16) seem promising for generating snapshots and performing rollbacks, but the proposal needs more details on interop with the ECS and performance (so does this one lol). - - What is the impact of not doing this? -Without committing to support these features early, Bevy risks ending up like Unity, whose built-in features were too non-deterministic for the first kind of replication and whose only working solutions for the second kind appeared years later as [subscription-based](blank "monthly, per concurrent user, even when self-hosted") third-party plugins and couldn't integrate deeply enough to be transparent (at least not without duplicating parts of the engine). +Without committing to support these features early, Bevy risks ending up like Unity, whose built-in features were too non-deterministic for the first kind of replication and whose only working solutions for the second are paid third-party plugins that couldn't integrate deeply enough to be transparent (at least not without duplicating parts of the engine). - Why is this important to implement as a feature of Bevy itself, rather than an ecosystem crate? 
From 4c5b8860d9ff8c0abb4c953c775452bf75b4f354 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Fri, 23 Apr 2021 22:10:21 -0500 Subject: [PATCH 04/43] fixed some grammar errors --- implementation_details.md | 32 ++++++++++++++++---------------- network_replication.md | 12 ++++++------ 2 files changed, 22 insertions(+), 22 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index eaecf726..48ddba64 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -7,7 +7,7 @@ TBD TBD ## "Clock" Synchronization -Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. For some reason, lots of people arrive at the idea that clients should estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step. +Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. For some reason, people frequently arrive at the idea that clients should estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step. That's overcomplicating it. What we really care about is: How much time passes between when the server receives my input and when that input is consumed? If the server simply tells clients how long their inputs are waiting in its buffer, the clients can use that information to converge on the correct lead. @@ -52,7 +52,7 @@ if received_newer_server_update: time_dilation = remap(error + accumulated_correction, -max_error, max_error, -0.1, 0.1) -accumulated_correction += time_dilation * time.delta_seconds() +accumulated_correction += time_dilation * delta_time interp_time += (1.0 + time_dilation) * delta_time interp_time = max(interp_time, predicted_time - max_lag_comp) @@ -61,9 +61,9 @@ interp_time = max(interp_time, predicted_time - max_lag_comp) The key idea here is that simplifying the client-server relationship makes the problem easier. You *could* have the server apply inputs whenever they arrive, rolling back if necessary, but that would only complicate things. If the server never accepts late inputs and never changes its pace, no one needs to coordinate. ## Lag Compensation -Lag compensation mainly deals with colliders. To avoid weird outcomes, lag compensation must run after all motion and physics systems. +Lag compensation mainly deals with colliders. To avoid weird outcomes, lag compensation needs to run after all motion and physics systems. -Once again, people get weird ideas about having the server estimate what interpolated state the client was looking at based on their RTT. Once again, that guesswork is unnecessary. +Again, people get weird ideas about having the server estimate what interpolated state the client was looking at based on their RTT. Again, that kind of guesswork is unnecessary. Clients can just tell the server what they were looking at by bundling the interpolated tick numbers and the blend value inside the input payloads. @@ -78,16 +78,16 @@ interpolation blend value With this information, the server can reconstruct *exactly* what each client saw. Lag compensation goes like this: -1. Queue projectile spawns, tagged with the shooter interpolation data. +1. Queue projectile spawns, tagged with their shooter's interpolation data. 2. Restore all colliders to the earliest interpolated moment. -3. 
Replay forward back to the current tick, spawning the projectiles at the appropriate time and registering hits. +3. Replay forward to the current tick, spawning the projectiles at the appropriate times and registering hits. -After that's done, any surviving projectiles will exist in the correct time. The process is the same for raycasts weapons. +After that's done, any surviving projectiles will exist in the correct time. The process is the same for raycast weapons. -Overwatch allows defensive abilities to mitigate lag-compensated shots. This is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. +*Overwatch* allows defensive abilities to mitigate lag-compensated shots. This is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. [Link](https://youtu.be/W3aieHjyNvw?t=2492) -Overwatch also reduces the number of intersection tests by calculating the movement envelope of each entity, the "sum" of its bounding volumes over the full lag compensation window, and only rewinding characters whose movement envelopes intersect projectiles. +*Overwatch* also reduces the number of intersection tests by calculating the movement envelope of each entity, the "sum" of its bounding volumes over the full lag compensation window, and only rewinding characters whose movement envelopes intersect projectiles. [Link](https://youtu.be/W3aieHjyNvw?t=2226) For clients with very high ping, their interpolated time will lag too far behind their predicted time. Generally, you won't want to favor the shooter past a certain limit (e.g. 250ms), so those clients will have to extrapolate the difference. Not extrapolating is also valid, but then lagging clients would abruptly have to start leading their targets. @@ -103,11 +103,11 @@ Is an exponential decay enough for smooth error correction or are there better a ## Prediction <-> Interpolation TBD -Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. How though? The obvious implementation is to literally fork the latest authoritative state. If this copy ends up being too expensive, we probably use copy-on-write. +Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. One obvious implementation is to literally fork the latest authoritative state. If copying the full state ends up being too expensive, we can probably use a copy-on-write layer. -Clients should predict the entities driven by their input, the entities they spawn (until confirmed), and any entities mutated as a result the first two. I think that should cover it. Predicting *everything* would be a compile-time choice. +Clients should predict the entities driven by their input, the entities they spawn (until confirmed), and any entities mutated as a result of the first two. I think that should cover it. Predicting *everything* would be a compile-time choice. -So we can predict with component granularity, the question is: How do we shift things between predicted and interpolated? Current idea is have things be interpolated by default (reset upon receiving a server update) and then using specialized change detection `DerefMut` magic to produce `Predicted`. +So we can predict with component granularity, but how do we shift things between prediction and interpolation? 
My current idea is for everything to default to interpolation (reset upon receiving a server update) and then use specialized change detection `DerefMut` magic to produce `Predicted`. All systems that handle "predictable" interactions (pushing a button, putting an item in your inventory) should run *before* physics. Everything should run before rendering. @@ -115,14 +115,14 @@ Cameras need a little special treatment. Inputs to the view rotation need to be Should UI be allowed to reference predicted state or only verified state? -Events are tricky. We'll need some events that only trigger on authoritative changes and others that trigger on predicted changes with follow-up events on confirmed or cancelled later. +Events are tricky. We'll need some events that only trigger on authoritative changes and others that trigger on predicted changes with follow-up confirmed or cancelled events. We'll need something like the latter to handle sounds and particle effects. Those shouldn't be duplicated during rollbacks and should be faded out if mispredicted. ## Predicting Entity Creation This requires some special consideration. -The naive solution is to have clients spawn dummy entities. When an update that confirms the result arrives, clients can simply destroy the dummy and spawn the true entity. This is a frankly poor solution because it prevents clients from smoothly blending these entities from predicted time into interpolated time. It won't look right. +The naive solution is to have clients spawn dummy entities. When an update that confirms the result arrives, clients can simply destroy the dummy and spawn the true entity. IMO this is a poor solution because it prevents clients from smoothly blending these entities from predicted time into interpolated time. It won't look right. A better solution is for the server to assign each networked entity a global ID that the spawning client can predict and map to its local instance. @@ -130,7 +130,7 @@ A better solution is for the server to assign each networked entity a global ID - Alternatively, PRNGs could be used to generate shared keys for pairing global and local IDs. Rather than predict the global ID, the client would predict the shared key. Server updates that confirm the predicted entity would include both its global ID and the shared key, which the client can then use to pair the IDs. This method adds complexity but bypasses the previous method's implicit entity limit. -- An extreme potential solution would be to somehow bake global IDs directly into the memory allocation. If memory layouts are mirrored, relative pointers become global IDs, which don't need to be explicitly written into packets. This would save 4-8 bytes per entity before compression. +- A more extreme solution would be to somehow bake global IDs directly into the memory allocation. If memory layouts are mirrored, relative pointers become global IDs, which don't need to be explicitly written into packets. This would save 4-8 bytes per entity before compression. ## Unconditional Rollbacks Every article on "rollback netcode" and "client-side prediction and server reconciliation" encourages having clients compare their predicted state to the authoritative state and reconciling *if* they mispredicted. But how do you actually detect a mispredict? @@ -148,6 +148,6 @@ Let's consider a simpler default: Now, you might be thinking, "Isn't that wasteful?" -*If* gives a false sense of security. If I make a game and claim it can rollback 250ms, that basically should mean *any* 250ms. 
With no stuttering. If clients *always* rollback and re-sim, it'll be easier to profile and optimize for that. As a bonus, clients can immediately toss old predicted states. +*If* gives a false sense of security. If I make a game and claim it can rollback 250ms, that basically should mean *any* 250ms, with no stuttering. If clients *always* rollback and re-sim, it'll be easier to profile and optimize for that. As a bonus, clients can immediately toss old predicted states. Constant rollbacks may sound expensive, but there were games with rollback running on the original Playstation 20+ years ago. \ No newline at end of file diff --git a/network_replication.md b/network_replication.md index 32c8aee3..5d7165b1 100644 --- a/network_replication.md +++ b/network_replication.md @@ -6,13 +6,13 @@ This RFC describes an implementation of engine features for developing networked ## Motivation -Networking is unequivocally the most lacking feature in all general-purpose game engines thus far. "Networking" is actually a pretty loaded term. This RFC focuses on **replication**, the part of networking that deals with simulation behavior and the only one that directly involves the ECS. +Networking is unequivocally the most lacking feature in all general-purpose game engines thus far. "Networking" is actually a pretty loaded term. This RFC focuses on *replication*, the part of networking that deals with simulation behavior and the only one that directly involves the ECS. Most engines provide "low level" connectivity—virtual connections, optionally reliable UDP channels, rooms—and stop there. Those are not very useful without "high level" replication features—prediction, reconciliation, lag compensation, area of interest management, etc. > The goal of replication is to ensure that all of the players in the game have a consistent model of the game state. Replication is the absolute minimum problem which all networked games have to solve in order to be functional, and all other problems in networked games ultimately follow from it. - [Mikola Lysenko](https://0fps.net/2014/02/10/replication-in-networked-games-overview-part-1/) -Bevy has an opportunity to become one of the first open game engines to offer a plug-and-play networking API. Among Godot, Unity, and Unreal, only Unreal provides any of these built-in (and dogfooded in Fortnite). +Bevy has an opportunity to become one of the first open game engines to offer a truly plug-and-play networking API. Among Godot, Unity, and Unreal, only Unreal provides something like that built-in (the Replication Graph plugin they dogfooded in Fortnite). IMO the absence of built-in replication systems leads many to conclude that every multiplayer game must need its own unique solution. This is not true. While the exact replication "strategy" depends of the game, all of them—lockstep, rollback, client-side prediction with server reconciliation—pull from the same bag of tricks. Their differences can be captured with simple configuration options. Really, only *massive* multiplayer games require custom solutions. 
@@ -127,9 +127,9 @@ fn main() { ### Macros - Adds `[repr(C)]` -- Float quantization and compression -- Conditional compilation of client and server logic - Identifying components for snapshot generation +- Implement float quantization and compression +- Conditional compilation of client and server logic ### Saving and Restoring Game State Requirements @@ -195,7 +195,7 @@ People who want to make multiplayer games want to focus on designing their game - What is the impact of not doing this? -Without committing to support these features early, Bevy risks ending up like Unity, whose built-in features were too non-deterministic for the first kind of replication and whose only working solutions for the second are paid third-party plugins that couldn't integrate deeply enough to be transparent (at least not without duplicating parts of the engine). +It'll only grow more difficult to add these features as time goes on. Take Unity for example. Its built-in features are too non-deterministic and its only working solutions for state transfer are paid third-party assets. Thus far, said assets cannot integrate deeply enough to be transparent (at least not without custom memory management and duplicating parts of the engine). - Why is this important to implement as a feature of Bevy itself, rather than an ecosystem crate? @@ -218,7 +218,7 @@ I strongly doubt that fast, efficient, and transparent replication features can - Beyond replication, Bevy need only provide one good default for protocol and IO for the sake of completeness. I recommend dividing responsibilities as shown below to make it easy for developers to swap them with the [many](https://partner.steamgames.com/doc/features/multiplayer) [robust](https://developer.microsoft.com/en-us/games/solutions/multiplayer/) [platform](https://dev.epicgames.com/docs/services/en-US/Overview/index.html) [SDKs](https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html). - **replication** + **replication** (this RFC) - save and restore - prediction - serialization and compression From c2fe4d0421aec55d5afc1d7fcc4864c4fe9dbe7a Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 24 Apr 2021 05:22:00 -0500 Subject: [PATCH 05/43] Update network_replication.md --- network_replication.md | 56 ++++++++++++++++++++++-------------------- 1 file changed, 30 insertions(+), 26 deletions(-) diff --git a/network_replication.md b/network_replication.md index 5d7165b1..ce05f55b 100644 --- a/network_replication.md +++ b/network_replication.md @@ -6,17 +6,19 @@ This RFC describes an implementation of engine features for developing networked ## Motivation -Networking is unequivocally the most lacking feature in all general-purpose game engines thus far. "Networking" is actually a pretty loaded term. This RFC focuses on *replication*, the part of networking that deals with simulation behavior and the only one that directly involves the ECS. +Networking is unequivocally the most lacking feature in all general-purpose game engines. -Most engines provide "low level" connectivity—virtual connections, optionally reliable UDP channels, rooms—and stop there. Those are not very useful without "high level" replication features—prediction, reconciliation, lag compensation, area of interest management, etc. +Bevy has an opportunity to be among the first open game engines to provide a truly plug-and-play networking API. 
This RFC focuses on *replication*, the part of networking that deals with simulation behavior and the only one that directly involves the ECS. > The goal of replication is to ensure that all of the players in the game have a consistent model of the game state. Replication is the absolute minimum problem which all networked games have to solve in order to be functional, and all other problems in networked games ultimately follow from it. - [Mikola Lysenko](https://0fps.net/2014/02/10/replication-in-networked-games-overview-part-1/) -Bevy has an opportunity to become one of the first open game engines to offer a truly plug-and-play networking API. Among Godot, Unity, and Unreal, only Unreal provides something like that built-in (the Replication Graph plugin they dogfooded in Fortnite). +Most engines provide "low level" connectivity—virtual connections, optionally reliable UDP channels, rooms—and stop there. Those are not very useful to developers without "high level" replication features—prediction, reconciliation, lag compensation, interest management, etc. -IMO the absence of built-in replication systems leads many to conclude that every multiplayer game must need its own unique solution. This is not true. While the exact replication "strategy" depends of the game, all of them—lockstep, rollback, client-side prediction with server reconciliation—pull from the same bag of tricks. Their differences can be captured with simple configuration options. Really, only *massive* multiplayer games require custom solutions. +Among Godot, Unity, and Unreal, only Unreal provides [any of these](https://docs.unrealengine.com/en-US/InteractiveExperiences/Networking/ReplicationGraph/index.html) built-in. -In general, I think that building *up* from the socket layer leads to the wrong intuition about what "networking" is. If you start from what the simulation needs and design *down*, all the reasoning is clearer and routing becomes an implementation detail. +IMO the broader absence of these systems leads many to conclude that every multiplayer game must need its own unique solution. This is not true. While the exact replication "strategy" depends of the game, all of them—lockstep, rollback, client-side prediction with server reconciliation, etc.—pull from the same bag of tricks. Their differences can be captured with simple configuration options. Really, only *massive* multiplayer games require custom solutions. + +In general, I think that building *up* from the socket layer leads to the wrong intuition about what "networking" is. If you start from what the simulation needs and design *down*, defining the problem is easier and routing becomes an implementation detail. What I hope to explore in this RFC is: - How do game design and networking constrain each other? @@ -49,6 +51,8 @@ struct Health { ### Example: "Networked" Systems ```rust +// No networking boilerplate. Just swap components. +// Same code runs on client and server. 
fn check_zero_health(mut query: Query<(&Health, &mut NetworkTransform)>){ for (health, mut transform) in query.iter_mut() { if health.hp <= 0.0 { @@ -105,20 +109,20 @@ fn main() { ### Example Configuration Options ``` -- players: 32, -- max networked entities: 1024, -- replication strategy: snapshots, - - mode: listen server, - - simulation tick rate: 60Hz - - client send interval: 1 tick, - - server send interval: 2 ticks, - - client input delay: 0, - - server input delay: 2 ticks, - - prediction: local-only, - - rollback window: 250ms, - - min interpolation delay: 32ms, - - lag compensation: true - - compensation window: 200ms +players: 32, +max networked entities: 1024, +replication strategy: snapshots, + mode: listen server, + simulation tick rate: 60Hz + client send interval: 1 tick, + server send interval: 2 ticks, + client input delay: 0, + server input delay: 2 ticks, + prediction: local-only, + rollback window: 250ms, + min interpolation delay: 32ms, + lag compensation: true + compensation window: 200ms ``` ## Reference-level explanation @@ -126,9 +130,9 @@ fn main() { [TBD](../main/implementation_details.md) ### Macros -- Adds `[repr(C)]` -- Identifying components for snapshot generation -- Implement float quantization and compression +- Add `[repr(C)]` +- Networked state identification +- Float quantization and compression - Conditional compilation of client and server logic ### Saving and Restoring Game State @@ -139,8 +143,8 @@ Requirements Saving - At the end of every fixed update, iterate `Changed` and `Removed` for all replicable components and duplicate them to an isolated copy. -- This isolated copy would be a collection of sparse sets, for just the replicable components. Tables would be rebuilt when restoring. -- (From their description, [subworlds](https://github.com/bevyengine/rfcs/pull/16) seem like they could also be used for generating snapshots and performing rollbacks, but I need more details. Might be a lot of overhead.) +- This isolated copy would be a collection of `SpareSet`, for just the replicable components. Tables would be rebuilt when restoring. +- (From [their RFC](https://github.com/bevyengine/rfcs/pull/16), `SubWorlds` seem like they might be usable for snapshot generation and rollbacks, but I need more details. AFAIK, they only address the "reserve a range of entities with separate metadata" requirement.) Packets - For snapshots, also compute the changes as a XOR and copy that into a ring buffer of patches. XOR the latest patch with the earlier patches to bring them up-to-date. Finally, write the packets and pass them to the protocol layer. @@ -177,7 +181,7 @@ Everything aside from the simulation steps can be generated automatically. ## Drawbacks - Possibly cursed macro magic. -- Writes to World directly. +- Writes to `World` directly. - Seemingly limited to components that implement `Clone` and `Serialize`. ## Rationale and alternatives @@ -189,7 +193,7 @@ Networking is a widely misunderstood problem domain. The proposed implementation Replication always boils down to sending inputs or state, so the space of alternative designs includes different choices for the end-user interface and different implementations of save/restore functions. -Frankly, given the abundance of confusion surrounding networking, polluting the API with "networked" variants of structs and systems (aside from Transform, Rigidbody, etc.) would just make life harder for everybody, both game developers and Bevy maintainers. 
+Frankly, given the abundance of confusion surrounding networking, polluting the API with "networked" variants of structs and systems (aside from `Transform`, `Rigidbody`, etc.) would just make life harder for everybody, both game developers and Bevy maintainers. People who want to make multiplayer games want to focus on designing their game and not worry about how to implement prediction, how to serialize their game, how to keep packets under MTU, etc. All of that should just work. I think the ease of macro annotations is worth any increase in compile times when networking features are enabled. From 779d1f5c5100c3af3068faa12e2430e7c7ff778d Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 24 Apr 2021 05:22:33 -0500 Subject: [PATCH 06/43] changed file name to match feature name --- network_replication.md => networked_replication.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename network_replication.md => networked_replication.md (100%) diff --git a/network_replication.md b/networked_replication.md similarity index 100% rename from network_replication.md rename to networked_replication.md From 69714f352189a9e537526ebee3a8ddc4c7345b9c Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 24 Apr 2021 05:31:06 -0500 Subject: [PATCH 07/43] Update networked_replication.md --- networked_replication.md | 34 ++++++++++++++++------------------ 1 file changed, 16 insertions(+), 18 deletions(-) diff --git a/networked_replication.md b/networked_replication.md index ce05f55b..c1456286 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -27,7 +27,7 @@ What I hope to explore in this RFC is: ## Guide-level explanation -[TBD](../main/replication_concepts.md) +[Link to some fundamental concepts.](../main/replication_concepts.md) ### Example: "Networked" Components @@ -109,30 +109,30 @@ fn main() { ### Example Configuration Options ``` -players: 32, -max networked entities: 1024, -replication strategy: snapshots, - mode: listen server, +players: 32 +max networked entities: 1024 +replication strategy: snapshots + mode: listen server simulation tick rate: 60Hz - client send interval: 1 tick, - server send interval: 2 ticks, - client input delay: 0, - server input delay: 2 ticks, - prediction: local-only, - rollback window: 250ms, - min interpolation delay: 32ms, + client send interval: 1 tick + server send interval: 2 ticks + client input delay: 0 + server input delay: 2 ticks + prediction: local-only + rollback window: 250ms + min interpolation delay: 32ms lag compensation: true compensation window: 200ms ``` ## Reference-level explanation -[TBD](../main/implementation_details.md) +[Link to some implementation details.](../main/implementation_details.md) ### Macros - Add `[repr(C)]` -- Networked state identification -- Float quantization and compression +- Identification of networked state for snapshots +- Quantization and range compression - Conditional compilation of client and server logic ### Saving and Restoring Game State @@ -220,7 +220,7 @@ I strongly doubt that fast, efficient, and transparent replication features can - Much like how Unreal has Fortnite, it would help immensenly if Bevy had an official collection of multiplayer samples to dogfood these features. -- Beyond replication, Bevy need only provide one good default for protocol and IO for the sake of completeness. 
I recommend dividing responsibilities as shown below to make it easy for developers to swap them with the [many](https://partner.steamgames.com/doc/features/multiplayer) [robust](https://developer.microsoft.com/en-us/games/solutions/multiplayer/) [platform](https://dev.epicgames.com/docs/services/en-US/Overview/index.html) [SDKs](https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html). +- Beyond replication, Bevy need only provide one good default for protocol and IO for the sake of completeness. I recommend dividing responsibilities as shown below to make it easy for developers to swap them with the [many](https://partner.steamgames.com/doc/features/multiplayer) [robust](https://developer.microsoft.com/en-us/games/solutions/multiplayer/) [platform](https://dev.epicgames.com/docs/services/en-US/Overview/index.html) [SDKs](https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html). Replication addresses all the underlying ECS interop, so it should be settled first. **replication** (this RFC) - save and restore @@ -240,5 +240,3 @@ I strongly doubt that fast, efficient, and transparent replication features can **I/O** - send, recv, poll, etc. - -Replication addresses all the underlying ECS interop, so it should be settled first. From 21c7e95089e846632b962755afdbe8eb3f231b4b Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 24 Apr 2021 07:54:24 -0500 Subject: [PATCH 08/43] clarified some prediction stuff in implementation_details.md --- implementation_details.md | 37 ++++++++++++++++++++++--------------- replication_concepts.md | 2 +- 2 files changed, 23 insertions(+), 16 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 48ddba64..23e87d7a 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -84,40 +84,47 @@ Lag compensation goes like this: After that's done, any surviving projectiles will exist in the correct time. The process is the same for raycast weapons. -*Overwatch* allows defensive abilities to mitigate lag-compensated shots. This is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. -[Link](https://youtu.be/W3aieHjyNvw?t=2492) +There's a lot to learn from *Overwatch* here. -*Overwatch* also reduces the number of intersection tests by calculating the movement envelope of each entity, the "sum" of its bounding volumes over the full lag compensation window, and only rewinding characters whose movement envelopes intersect projectiles. -[Link](https://youtu.be/W3aieHjyNvw?t=2226) +*Overwatch* [allows defensive abilities to mitigate lag-compensated shots](https://youtu.be/W3aieHjyNvw?t=2492). AFAIK this is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. -For clients with very high ping, their interpolated time will lag too far behind their predicted time. Generally, you won't want to favor the shooter past a certain limit (e.g. 250ms), so those clients will have to extrapolate the difference. Not extrapolating is also valid, but then lagging clients would abruptly have to start leading their targets. -[Link](https://youtu.be/W3aieHjyNvw?t=2347) +*Overwatch* also [finds the movement envelope of each entity](https://youtu.be/W3aieHjyNvw?t=2226), the "sum" of its bounding volumes over the full lag compensation window, to reduce the number of intersection tests, only rewinding characters whose movement envelopes intersect projectiles. 
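
As a rough illustration of that broad-phase idea (my sketch, not something from the talk): union each entity's buffered hitboxes over the lag compensation window into one envelope, then only rewind entities whose envelope overlaps a projectile's swept volume. The `Aabb` type and function names below are invented for the example.

```rust
#[derive(Clone, Copy)]
struct Aabb {
    min: [f32; 3],
    max: [f32; 3],
}

impl Aabb {
    // Smallest box containing both inputs.
    fn union(self, other: Aabb) -> Aabb {
        let mut min = self.min;
        let mut max = self.max;
        for i in 0..3 {
            min[i] = min[i].min(other.min[i]);
            max[i] = max[i].max(other.max[i]);
        }
        Aabb { min, max }
    }

    fn overlaps(&self, other: &Aabb) -> bool {
        (0..3).all(|i| self.min[i] <= other.max[i] && self.max[i] >= other.min[i])
    }
}

/// `history` holds one bounding volume per tick of the lag compensation window.
fn movement_envelope(history: &[Aabb]) -> Option<Aabb> {
    history.iter().copied().reduce(Aabb::union)
}

/// Cheap pre-test: skip the per-tick rewind entirely if the projectile's swept
/// volume never touches the entity's envelope.
fn needs_rewind(history: &[Aabb], projectile_sweep: &Aabb) -> bool {
    movement_envelope(history)
        .map_or(false, |envelope| envelope.overlaps(projectile_sweep))
}
```
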
+ +For clients with very high ping, their interpolated time will lag too far behind their predicted time. You generally don't want to favor the shooter past a certain limit (e.g. 250ms), so [those clients have to extrapolate the difference](https://youtu.be/W3aieHjyNvw?t=2347). Not extrapolating is also valid, but then lagging clients would abruptly have to start leading their targets. This limit is the only relation between the predicted time and the interpolated time. They're otherwise decoupled. ## Smooth Rendering Whenever clients receive an update with new remote entities, those entities shouldn't be rendered until that update is interpolated. +Cameras need a little special treatment. Inputs to the view rotation need to be accumulated at the render rate and re-applied just before rendering. + Is an exponential decay enough for smooth error correction or are there better algorithms? ## Prediction <-> Interpolation -TBD - Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. One obvious implementation is to literally fork the latest authoritative state. If copying the full state ends up being too expensive, we can probably use a copy-on-write layer. Clients should predict the entities driven by their input, the entities they spawn (until confirmed), and any entities mutated as a result of the first two. I think that should cover it. Predicting *everything* would be a compile-time choice. -So we can predict with component granularity, but how do we shift things between prediction and interpolation? My current idea is for everything to default to interpolation (reset upon receiving a server update) and then use specialized change detection `DerefMut` magic to produce `Predicted`. - -All systems that handle "predictable" interactions (pushing a button, putting an item in your inventory) should run *before* physics. Everything should run before rendering. +I said entities, but we can predict with component granularity. The million-dollar question is how to shift things between prediction and interpolation. My current idea is for everything to default to interpolation (reset upon receiving a server update) and then use specialized change detection `DerefMut` magic. -Cameras need a little special treatment. Inputs to the view rotation need to be accumulated at the render rate and re-applied just before rendering. +``` +Predicted +PredictAdded +PredictRemoved +Confirmed +ConfirmAdded +ConfirmRemoved +Canceled +CancelAdded +CancelRemoved +``` -Should UI be allowed to reference predicted state or only verified state? +With these, we can generate events that only trigger on authoritative changes and events that trigger on predicted changes to be confirmed or cancelled later. The latter are necessary for handling sounds and particle effects. Those shouldn't be duplicated during rollbacks and should be faded out if mispredicted. -Events are tricky. We'll need some events that only trigger on authoritative changes and others that trigger on predicted changes with follow-up confirmed or cancelled events. +All systems that handle "predictable" interactions (pushing a button, putting an item in your inventory) should run *before* physics. Everything in `NetworkFixedUpdate` should run before rendering. -We'll need something like the latter to handle sounds and particle effects. Those shouldn't be duplicated during rollbacks and should be faded out if mispredicted. +Should UI be allowed to reference predicted state or only verified state? 
## Predicting Entity Creation This requires some special consideration. diff --git a/replication_concepts.md b/replication_concepts.md index 38ba40e5..b7e4eead 100644 --- a/replication_concepts.md +++ b/replication_concepts.md @@ -31,7 +31,7 @@ Determinism is awesome when it fits but it's generally unavailable. Neither Godo Whenever you can't have or don't want bit-perfect determinism, you use state transfer. -The idea behind state transfer is the concept of **authority**. It's essentially ownership in Rust. Those who own any state are responsible for broadcasting up-to-date information about it. I sometimes see authority divided into *input* authority (control permission) and *state* authority (write permission), but usually authority means state authority. +The key idea behind state transfer is the concept of **authority**. It's essentially ownership in Rust. Those who own state are responsible for broadcasting up-to-date information about it. I sometimes see authority divided into *input* authority (control permission) and *state* authority (write permission), but usually authority means state authority. The server usually owns everything, but authority is very flexible. In games like *Destiny* and *Fall Guys*, clients own their movement state. Other games even trust clients to confirm hits. Distributing authority like this adds complexity and obviously leaves the door wide open for cheaters, but sometimes it's necessary. In VR, it makes sense to let clients claim and relinquish authority over interactable objects. From 485eadf3fd5ed8e7cffa6148f492cf02062d2ac0 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 24 Apr 2021 10:46:41 -0500 Subject: [PATCH 09/43] added a doc-like explanation pretty lazy, but it's something --- networked_replication.md | 25 +++++++++++++++++++------ 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/networked_replication.md b/networked_replication.md index c1456286..538e06bd 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -29,6 +29,20 @@ What I hope to explore in this RFC is: [Link to some fundamental concepts.](../main/replication_concepts.md) +> Please treat all terms like determinism, state transfer, snapshots, and eventual consistency as placeholders. We could easily label them differently. + +Bevy aims to make developing networked games as simple as possible. There isn't a "one size fits all" replication strategy that works for every game, but Bevy provides those it has under one API. + +First think about your game and consider which form of replication might fit best. Players can *either* send their inputs to each other (or through a relay) and independently and deterministically simulate the game *or* they can send their inputs to a single machine (the server) who simulates the game and sends back updated game state. + +> Honestly, Bevy could put something like a questionnaire in the docs. Genre and player count pretty much choose the replication strategy for you. + +Next, determine which components and systems affect the global simulation state and tag them accordingly. Usually adding `#[derive(Replicate)]` to all replicable components is enough. You can additionally decorate gameplay logic and systems with `#[client]` or `#[server]` for conditional compilation. + +Lastly, add these simulation systems to the `NetworkedFixedUpdate` app state. Bevy will take care of all state rollback, serialization, and compression internally. 
Other than that, you're free to write your game as if it were local multiplayer. + +> This guide is pretty lazy lol, but that's the gist of it. + ### Example: "Networked" Components ```rust @@ -185,11 +199,11 @@ Everything aside from the simulation steps can be generated automatically. - Seemingly limited to components that implement `Clone` and `Serialize`. ## Rationale and alternatives -- Why is this design the best in the space of possible designs? +> Why is this design the best in the space of possible designs? Networking is a widely misunderstood problem domain. The proposed implementation should suffice for most games while minimizing design friction—users need only annotate gameplay-related components and systems, put those systems in `NetworkFixedUpdate`, and configure some settings. -- What other designs have been considered and what is the rationale for not choosing them? +> What other designs have been considered and what is the rationale for not choosing them? Replication always boils down to sending inputs or state, so the space of alternative designs includes different choices for the end-user interface and different implementations of save/restore functions. @@ -197,11 +211,11 @@ Frankly, given the abundance of confusion surrounding networking, polluting the People who want to make multiplayer games want to focus on designing their game and not worry about how to implement prediction, how to serialize their game, how to keep packets under MTU, etc. All of that should just work. I think the ease of macro annotations is worth any increase in compile times when networking features are enabled. -- What is the impact of not doing this? +> What is the impact of not doing this? It'll only grow more difficult to add these features as time goes on. Take Unity for example. Its built-in features are too non-deterministic and its only working solutions for state transfer are paid third-party assets. Thus far, said assets cannot integrate deeply enough to be transparent (at least not without custom memory management and duplicating parts of the engine). -- Why is this important to implement as a feature of Bevy itself, rather than an ecosystem crate? +> Why is this important to implement as a feature of Bevy itself, rather than an ecosystem crate? I strongly doubt that fast, efficient, and transparent replication features can be implemented without directly manipulating a World and its component storages. @@ -217,9 +231,8 @@ I strongly doubt that fast, efficient, and transparent replication features can ## Future possibilities - With some game state diffing tools, these replication systems could help detect non-determinism in other parts of the engine. - - Much like how Unreal has Fortnite, it would help immensenly if Bevy had an official collection of multiplayer samples to dogfood these features. - +- Bevy's future editor could automate most of the configuration and annotation. - Beyond replication, Bevy need only provide one good default for protocol and IO for the sake of completeness. I recommend dividing responsibilities as shown below to make it easy for developers to swap them with the [many](https://partner.steamgames.com/doc/features/multiplayer) [robust](https://developer.microsoft.com/en-us/games/solutions/multiplayer/) [platform](https://dev.epicgames.com/docs/services/en-US/Overview/index.html) [SDKs](https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html). Replication addresses all the underlying ECS interop, so it should be settled first. 
**replication** (this RFC) From 5aa233ce8f39f968b478e4d9dc074c67df6cd487 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 24 Apr 2021 10:56:01 -0500 Subject: [PATCH 10/43] fixed some typos --- networked_replication.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/networked_replication.md b/networked_replication.md index 538e06bd..253a6ad7 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -27,7 +27,7 @@ What I hope to explore in this RFC is: ## Guide-level explanation -[Link to some fundamental concepts.](../main/replication_concepts.md) +[Link to my explanation of important replication concepts.](../main/replication_concepts.md) > Please treat all terms like determinism, state transfer, snapshots, and eventual consistency as placeholders. We could easily label them differently. @@ -39,7 +39,7 @@ First think about your game and consider which form of replication might fit bes Next, determine which components and systems affect the global simulation state and tag them accordingly. Usually adding `#[derive(Replicate)]` to all replicable components is enough. You can additionally decorate gameplay logic and systems with `#[client]` or `#[server]` for conditional compilation. -Lastly, add these simulation systems to the `NetworkedFixedUpdate` app state. Bevy will take care of all state rollback, serialization, and compression internally. Other than that, you're free to write your game as if it were local multiplayer. +Lastly, add these simulation systems to the `NetworkFixedUpdate` app state. Bevy will take care of all state rollback, serialization, and compression internally. Other than that, you're free to write your game as if it were local multiplayer. > This guide is pretty lazy lol, but that's the gist of it. @@ -141,7 +141,7 @@ replication strategy: snapshots ## Reference-level explanation -[Link to some implementation details.](../main/implementation_details.md) +[Link to more in-depth implementation details.](../main/implementation_details.md) ### Macros - Add `[repr(C)]` From df61d3a2d07340d646a22b6d28f0a58c75c85735 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sun, 25 Apr 2021 07:13:06 -0500 Subject: [PATCH 11/43] Added a network mode table, revised some wording, and a note on "player" vs "connection" --- implementation_details.md | 14 +++++++++++--- networked_replication.md | 40 ++++++++++++++++++++++++--------------- replication_concepts.md | 2 +- 3 files changed, 37 insertions(+), 19 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 23e87d7a..04b9b609 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -3,9 +3,17 @@ ## Delta Compression TBD -## Area of Interest +## Interest Management TBD +## RPC +RPCs are best for sending global alerts and any gameplay mechanics you explicitly want modeled as request-reply (or one-way) interactions. They can be reliable or unreliable. + +TBD + +## Clients are not players... +I know I've been using the terms somewhat interchangeably, but `Player` and `Connection` should be separate tokens. No reason to force one player per connection in the engine API. Having `Player` be its own thing makes it easier to do stuff like replace leaving players with bots. + ## "Clock" Synchronization Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. 
For some reason, people frequently arrive at the idea that clients should estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step. @@ -115,7 +123,7 @@ PredictRemoved Confirmed ConfirmAdded ConfirmRemoved -Canceled +Cancelled CancelAdded CancelRemoved ``` @@ -133,7 +141,7 @@ The naive solution is to have clients spawn dummy entities. When an update that A better solution is for the server to assign each networked entity a global ID that the spawning client can predict and map to its local instance. -- The simplest form of this would be an incrementing index whose upper bits are fixed to match the spawning player's ID. This is my recommendation. +- The simplest form of this would be an incrementing generational index whose upper bits are fixed to match the spawning player's ID. This is my recommendation. Basically, reuse Entity and reserve some of the upper bits in the ID. - Alternatively, PRNGs could be used to generate shared keys for pairing global and local IDs. Rather than predict the global ID, the client would predict the shared key. Server updates that confirm the predicted entity would include both its global ID and the shared key, which the client can then use to pair the IDs. This method adds complexity but bypasses the previous method's implicit entity limit. diff --git a/networked_replication.md b/networked_replication.md index 253a6ad7..d5c15796 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -14,7 +14,7 @@ Bevy has an opportunity to be among the first open game engines to provide a tru Most engines provide "low level" connectivity—virtual connections, optionally reliable UDP channels, rooms—and stop there. Those are not very useful to developers without "high level" replication features—prediction, reconciliation, lag compensation, interest management, etc. -Among Godot, Unity, and Unreal, only Unreal provides [any of these](https://docs.unrealengine.com/en-US/InteractiveExperiences/Networking/ReplicationGraph/index.html) built-in. +Among Godot, Unity, and Unreal, only Unreal provides [some](https://youtu.be/JOJP0CvpB8w) [features](https://www.unrealengine.com/en-US/tech-blog/replication-graph-overview-and-proper-replication-methods) built-in. IMO the broader absence of these systems leads many to conclude that every multiplayer game must need its own unique solution. This is not true. While the exact replication "strategy" depends of the game, all of them—lockstep, rollback, client-side prediction with server reconciliation, etc.—pull from the same bag of tricks. Their differences can be captured with simple configuration options. Really, only *massive* multiplayer games require custom solutions. @@ -31,7 +31,7 @@ What I hope to explore in this RFC is: > Please treat all terms like determinism, state transfer, snapshots, and eventual consistency as placeholders. We could easily label them differently. -Bevy aims to make developing networked games as simple as possible. There isn't a "one size fits all" replication strategy that works for every game, but Bevy provides those it has under one API. +Bevy aims to make developing networked games as simple as possible. There isn't one replication strategy that works best for every game, but Bevy provides those it has under one API. First think about your game and consider which form of replication might fit best. 
Players can *either* send their inputs to each other (or through a relay) and independently and deterministically simulate the game *or* they can send their inputs to a single machine (the server) who simulates the game and sends back updated game state. @@ -109,7 +109,7 @@ fn main() { .with_system(check_zero_health.system()) // ... Most user systems would go here. ) - #[server_only] + #[server] .add_system_set( SystemSet::on_update(AppState::NetworkFixedUpdate) .label(NetworkLabel::LagComp) @@ -152,17 +152,24 @@ replication strategy: snapshots ### Saving and Restoring Game State Requirements - Replicable components must only be mutated in `NetworkFixedUpdate`. -- World needs to reserve a range of entity IDs and track metadata for them separately. +- `World` needs to reserve a range of entity IDs and track metadata for them separately. - Networked entities must be spawned as such. You cannot spawn a non-networked entity and "network it" later, at least not without some kind of RPC. Saving - At the end of every fixed update, iterate `Changed` and `Removed` for all replicable components and duplicate them to an isolated copy. - This isolated copy would be a collection of `SpareSet`, for just the replicable components. Tables would be rebuilt when restoring. -- (From [their RFC](https://github.com/bevyengine/rfcs/pull/16), `SubWorlds` seem like they might be usable for snapshot generation and rollbacks, but I need more details. AFAIK, they only address the "reserve a range of entities with separate metadata" requirement.) +- (From [their RFC](https://github.com/bevyengine/rfcs/pull/16), "sub-worlds" seem like they might be usable for snapshot generation and rollbacks, but I need more details. AFAIK, they only address the "reserve a range of entities with separate metadata" requirement.) Packets -- For snapshots, also compute the changes as a XOR and copy that into a ring buffer of patches. XOR the latest patch with the earlier patches to bring them up-to-date. Finally, write the packets and pass them to the protocol layer. -- For eventual consistency, we need some metadata. Entities accrue send priority over time. We can use the magnitude of the changes (addition or removal would be largest magnitude) as the base amount to accrue. We can then run a bipartite AABB sweep-and-prune followed by a radial distance test to prioritize the entities physically inside each client's areas of interest. Then any user-defined prioritization rules could run. Finally, write the packets and pass them to the protocol layer. +- Snapshots will use delta compression. + - We'll keep a ring buffer of patches for the last N snapshots. + - Whenever we duplicate changes to the isolated copy, also compute `copy ^ changes` and push this "patch" into a ring buffer. Importantly, XOR this new patch with the earlier patches to bring them up-to-date. (All this XOR'ing should produce many chains of zero bits.) + - Finally, compress whichever patches (run-length encoding, variable-byte encoding, etc.) clients need and pass them to the protocol layer. +- Eventual consistency will use interest management. + - Entities accrue send priority over time. Maybe we can use the magnitude of component changes (addition or removal would be largest magnitude) as the base amount to accrue. + - Users-defined rules for gameplay relevancy would run. + - For physical entities, we can use collision detection to prioritize the entities inside each client's area of interest. 
+ - Finally, write the payload for each client and pass them to the protocol layer. Restoring - TBD @@ -184,14 +191,17 @@ Server Everything aside from the simulation steps can be generated automatically. -### Networking Modes -- listen server - - client and server instances on same machine - - single player = listen server with dummy socket / no connections -- dedicated server -- relay - - for managing deterministic and client-authoritative games - - clock reference, input validation, interest management, etc. but no simulation +### Network Modes +| Mode | Playable? | Authoritative? | Open to connections? | +| :--- | :---: | :---: | :---: | +| Client | ✓ | ✗ | ✗ | +| Standalone | ✓ | ✓ | ✗ | +| Listen Server | ✓ | ✓ | ✓ | +| Dedicated Server | ✗ | ✓ | ✓ | +| Relay | ✗ | ✗ | ✓ | + +- Listen servers have client and server instances on the same machine. Standalone is a listen server configured with a dummy socket and closed to connections. +- Relays are for managing deterministic and client-authoritative games. They can do "clock" synchronization, input validation, interest management, etc. Just no simulation. ## Drawbacks - Possibly cursed macro magic. diff --git a/replication_concepts.md b/replication_concepts.md index b7e4eead..8cafc128 100644 --- a/replication_concepts.md +++ b/replication_concepts.md @@ -55,7 +55,7 @@ for message in queue.iter() { *state[n+1] = s; ``` -Messages are the right tool for when you really do want explicit request-response interactions or for global alerts like players joining or leaving. They just aren't good for replication. They encourage poor ergonomics, with send and receive calls littered everywhere. Even if you collect and send messages in batches, they don't compress as well as inputs or state. +Messages are the right tool for when you really do want explicit request-reply interactions or for global alerts like players joining or leaving. They just aren't good for general replication. They encourage poor ergonomics, with send and receive calls littered everywhere. Even if you collect and send messages in batches, they don't compress as well as inputs or state. # Latency Networking a game simulation so that players who live in different locations can play together is an unintuitive problem. No matter how we physically connect their computers, they most likely won't be able to exchange data within one simulation step. From 4b3dac6fe58136d8f4d830f48c7135775e9b5bf3 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sun, 25 Apr 2021 07:24:57 -0500 Subject: [PATCH 12/43] shortened some bulleted text because I didn't like the indents --- networked_replication.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/networked_replication.md b/networked_replication.md index d5c15796..04814484 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -162,11 +162,11 @@ Saving Packets - Snapshots will use delta compression. - - We'll keep a ring buffer of patches for the last N snapshots. - - Whenever we duplicate changes to the isolated copy, also compute `copy ^ changes` and push this "patch" into a ring buffer. Importantly, XOR this new patch with the earlier patches to bring them up-to-date. (All this XOR'ing should produce many chains of zero bits.) - - Finally, compress whichever patches (run-length encoding, variable-byte encoding, etc.) clients need and pass them to the protocol layer. + - We'll keep a ring buffer of patches for the last `N` snapshots. 
+ - Whenever we duplicate changes to the isolated copy, also compute `copy xor changes` as the latest patch and push it into the ring buffer. Update the earlier patches by xor`ing them with the new patch. + - Finally, compress whichever patches clients need and pass them to the protocol layer. - Eventual consistency will use interest management. - - Entities accrue send priority over time. Maybe we can use the magnitude of component changes (addition or removal would be largest magnitude) as the base amount to accrue. + - Entities accrue send priority over time. Maybe we can use the magnitude of component changes as the base amount to accrue. - Users-defined rules for gameplay relevancy would run. - For physical entities, we can use collision detection to prioritize the entities inside each client's area of interest. - Finally, write the payload for each client and pass them to the protocol layer. From 1db785c8e511bb9b5ff2b1a0e072f496f17f625c Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sun, 25 Apr 2021 07:39:55 -0500 Subject: [PATCH 13/43] more minor changes --- networked_replication.md | 2 +- replication_concepts.md | 15 ++++++++------- 2 files changed, 9 insertions(+), 8 deletions(-) diff --git a/networked_replication.md b/networked_replication.md index 04814484..4e0b01df 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -163,7 +163,7 @@ Saving Packets - Snapshots will use delta compression. - We'll keep a ring buffer of patches for the last `N` snapshots. - - Whenever we duplicate changes to the isolated copy, also compute `copy xor changes` as the latest patch and push it into the ring buffer. Update the earlier patches by xor`ing them with the new patch. + - Whenever we duplicate changes to the isolated copy, also compute `copy xor changes` as the latest patch and push it into the ring buffer. Update the earlier patches by xor'ing them with the new patch. - Finally, compress whichever patches clients need and pass them to the protocol layer. - Eventual consistency will use interest management. - Entities accrue send priority over time. Maybe we can use the magnitude of component changes as the base amount to accrue. diff --git a/replication_concepts.md b/replication_concepts.md index 8cafc128..8adf636a 100644 --- a/replication_concepts.md +++ b/replication_concepts.md @@ -1,7 +1,7 @@ # Replication Abstractly, you can think of a game as a pure function that accepts an initial state and player inputs and generates a new state. ```rust -let state[n+1] = simulate(&state[n], &inputs[n]); +*state[n+1] = simulate(&state[n], &inputs[n]); ``` Fundamentally, if several players want to perform a synchronized simulation over a network, they have basically two options: @@ -36,13 +36,13 @@ The key idea behind state transfer is the concept of **authority**. It's essenti The server usually owns everything, but authority is very flexible. In games like *Destiny* and *Fall Guys*, clients own their movement state. Other games even trust clients to confirm hits. Distributing authority like this adds complexity and obviously leaves the door wide open for cheaters, but sometimes it's necessary. In VR, it makes sense to let clients claim and relinquish authority over interactable objects. ## Why not messaging patterns? -The only other strategy you really see used for replication is messaging. Like RPCs or remote events. Not sure why, but it's what most people try the first time. 
+The only other strategy you really see used for replication is messaging. Like RPCs or remote events. Not sure why, but it's what I most often see people try the first time. Take chess for example. Instead of sending polled player inputs or the state of the chessboard, you could just send the moves like "white, e2 to e4," etc. Here's the issue. Messages are tightly coupled to their game's logic. They can't be generalized. Chess is simple—one turn, one event—but what about an FPS? What messages would it need? How many? When and where would those messages need be sent and received? -If those messages have cascading effects, they can only be sent reliable, ordered. How can you build prediction and reconciliation when you can't drop a packet? +If those messages have cascading effects, they can only be sent reliable, ordered. ```rust let mut s = state[n]; for message in queue.iter() { @@ -54,8 +54,9 @@ for message in queue.iter() { // applied and applied in the right order. *state[n+1] = s; ``` +How do you even build prediction and reconciliation out of this? -Messages are the right tool for when you really do want explicit request-reply interactions or for global alerts like players joining or leaving. They just aren't good for general replication. They encourage poor ergonomics, with send and receive calls littered everywhere. Even if you collect and send messages in batches, they don't compress as well as inputs or state. +Messages are the right tool for when you really do want explicit request-reply interactions or for global alerts like players joining or leaving. They just don't cut it as a general tool. Even if you were to avoid littering send and receive calls everywhere (i.e., collect and send in batches), messages don't compress as well as inputs or state. # Latency Networking a game simulation so that players who live in different locations can play together is an unintuitive problem. No matter how we physically connect their computers, they most likely won't be able to exchange data within one simulation step. @@ -83,8 +84,10 @@ Once again, determinism is an all or nothing deal. If you predict, you predict e State transfer has the flexibility to predict only *some* things, letting you offload expensive systems onto the server. Games like *Rocket League* still predict everything, including other clients (the server re-distributes their inputs along with game state so that this is more accurate). However, most games choose not to do this. It's more common for clients to predict only what they control and interact with. -# Consistency +# Visual Consistency +**tl;dr**: Hard snap the simulation state and subtly blend the view. Time travel if needed. ## Smooth Rendering and Lag Compensation + Predicting only *some* things adds implementation complexity. When clients predict everything, they produce renderable state at a fixed pace. Now, anything that isn't predicted must be rendered using data received from the server. The problem is that server updates are sent over a lossy, unreliable internet that disrupts any consistent spacing between packets. This means clients need to buffer incoming server updates long enough to have two authoritative updates to interpolate most of the time. @@ -93,8 +96,6 @@ Gameplay-wise, not predicting everything also divides entities between two point Visually, we'll often have to blend between extrapolated and authoritative data. Simply interpolating between two authoritative updates is incorrect. 
The visual state can and will accrue errors, but that's what we want. Those can be tracked and smoothly reduced (to some near-zero threshold, then cleared). -If it's unclear: Hard snap the actual game state to reconcile but softly blend the view. - # Bandwidth ## How much can we fit into each packet? Not a lot. From 4306eb10f4ff49647ee612c5bb7017395558c895 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sun, 25 Apr 2021 18:14:22 -0500 Subject: [PATCH 14/43] added some things, revised some things --- implementation_details.md | 20 +++++----- networked_replication.md | 80 +++++++++++++++++++++------------------ 2 files changed, 52 insertions(+), 48 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 04b9b609..aab2a513 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -109,12 +109,12 @@ Cameras need a little special treatment. Inputs to the view rotation need to be Is an exponential decay enough for smooth error correction or are there better algorithms? -## Prediction <-> Interpolation +## Prediction ⟷ Interpolation Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. One obvious implementation is to literally fork the latest authoritative state. If copying the full state ends up being too expensive, we can probably use a copy-on-write layer. Clients should predict the entities driven by their input, the entities they spawn (until confirmed), and any entities mutated as a result of the first two. I think that should cover it. Predicting *everything* would be a compile-time choice. -I said entities, but we can predict with component granularity. The million-dollar question is how to shift things between prediction and interpolation. My current idea is for everything to default to interpolation (reset upon receiving a server update) and then use specialized change detection `DerefMut` magic. +I said entities, but we can predict with component granularity. The million-dollar question is how to shift things between prediction and interpolation. My current idea is for everything to default to interpolation (reset upon receiving a server update) and then use specialized change detection `DerefMut` magic to flag local predictions. ``` Predicted @@ -128,9 +128,9 @@ CancelAdded CancelRemoved ``` -With these, we can generate events that only trigger on authoritative changes and events that trigger on predicted changes to be confirmed or cancelled later. The latter are necessary for handling sounds and particle effects. Those shouldn't be duplicated during rollbacks and should be faded out if mispredicted. +With these, we can explicitly opt-out of funneling non-predicted components through expensive systems. We can also generate events that only trigger on authoritative changes and events that trigger on predicted changes to be confirmed or cancelled later. The latter are necessary for handling sounds and particle effects. Those shouldn't be duplicated during rollbacks and should be faded out if mispredicted. -All systems that handle "predictable" interactions (pushing a button, putting an item in your inventory) should run *before* physics. Everything in `NetworkFixedUpdate` should run before rendering. +All systems that handle "predictable" interactions (pushing a button, putting an item in your inventory) should probably run *before* the expensive stuff (physics, path-planning). Rendering should come after `NetworkFixedUpdate`. 
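
To make the flagging idea above a bit more concrete, here's a minimal sketch that uses ordinary Bevy change detection as a stand-in for the specialized `DerefMut` machinery described earlier. `Predicted` is a hypothetical marker (the Confirmed/Cancelled variants would work the same way), `Health` mirrors the RFC's example component, and the query style assumes the same Bevy 0.5-era API as the RFC's other examples.

```rust
use bevy::prelude::*;

// Hypothetical stand-ins for this sketch.
struct Health { hp: f32 }
struct Predicted;

// Runs at the end of a predicted tick: any entity whose `Health` was mutated
// this tick (caught by change detection) gets flagged as predicted until a
// later server update confirms or resets it.
fn flag_predicted_health(mut commands: Commands, changed: Query<Entity, Changed<Health>>) {
    for entity in changed.iter() {
        commands.entity(entity).insert(Predicted);
    }
}
```
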
Should UI be allowed to reference predicted state or only verified state? @@ -139,11 +139,11 @@ This requires some special consideration. The naive solution is to have clients spawn dummy entities. When an update that confirms the result arrives, clients can simply destroy the dummy and spawn the true entity. IMO this is a poor solution because it prevents clients from smoothly blending these entities from predicted time into interpolated time. It won't look right. -A better solution is for the server to assign each networked entity a global ID that the spawning client can predict and map to its local instance. +A better solution is for the server to assign each networked entity a global ID (`NetworkID`) that the spawning client can predict and map to its local instance. -- The simplest form of this would be an incrementing generational index whose upper bits are fixed to match the spawning player's ID. This is my recommendation. Basically, reuse Entity and reserve some of the upper bits in the ID. +- The simplest form of this would be an incrementing generational index whose upper bits are fixed to match the spawning player's ID. This is my recommendation. Basically, reuse `Entity` and reserve some of the upper bits in the ID. -- Alternatively, PRNGs could be used to generate shared keys for pairing global and local IDs. Rather than predict the global ID, the client would predict the shared key. Server updates that confirm the predicted entity would include both its global ID and the shared key, which the client can then use to pair the IDs. This method adds complexity but bypasses the previous method's implicit entity limit. +- Alternatively, PRNGs could be used to generate shared keys (called "prediction keys" in some places) for pairing global and local IDs. Rather than predict the global ID, the client would predict the shared key. Server updates that confirm the predicted entity would include both its global ID and the shared key, which the client can then use to pair the IDs. This method adds complexity but bypasses the previous method's implicit entity limit. - A more extreme solution would be to somehow bake global IDs directly into the memory allocation. If memory layouts are mirrored, relative pointers become global IDs, which don't need to be explicitly written into packets. This would save 4-8 bytes per entity before compression. @@ -161,8 +161,6 @@ Let's consider a simpler default: 3. Always rollback and re-simulate. -Now, you might be thinking, "Isn't that wasteful?" +Now, if you're thinking that's wasteful, the "if mispredicted" gives you a false sense of security. If I make a game and claim it can rollback 250ms, that basically should mean *any* 250ms, with no stuttering. If clients *always* rollback and re-sim, it'll be easier to profile and optimize for that. As a bonus, clients never need to store old predicted states. -*If* gives a false sense of security. If I make a game and claim it can rollback 250ms, that basically should mean *any* 250ms, with no stuttering. If clients *always* rollback and re-sim, it'll be easier to profile and optimize for that. As a bonus, clients can immediately toss old predicted states. - -Constant rollbacks may sound expensive, but there were games with rollback running on the original Playstation 20+ years ago. \ No newline at end of file +Constant rollbacks may sound expensive, but there were games with rollback running on the original Playstation over 20 years ago. 
\ No newline at end of file diff --git a/networked_replication.md b/networked_replication.md index 4e0b01df..5637c78b 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -132,7 +132,7 @@ replication strategy: snapshots server send interval: 2 ticks client input delay: 0 server input delay: 2 ticks - prediction: local-only + prediction: true rollback window: 250ms min interpolation delay: 32ms lag compensation: true @@ -144,52 +144,58 @@ replication strategy: snapshots [Link to more in-depth implementation details.](../main/implementation_details.md) ### Macros -- Add `[repr(C)]` -- Identification of networked state for snapshots -- Quantization and range compression -- Conditional compilation of client and server logic +- `#[derive(Replicate)]` for identification and serialization; also adds `[repr(C)]` +- `#[replicate(precision=?)]` for quantization +- `#[replicate(range=(?, ?))]` for range compression +- `#[client]` and `#[server]` for conditional compilation ### Saving and Restoring Game State + Requirements -- Replicable components must only be mutated in `NetworkFixedUpdate`. -- `World` needs to reserve a range of entity IDs and track metadata for them separately. -- Networked entities must be spawned as such. You cannot spawn a non-networked entity and "network it" later, at least not without some kind of RPC. + +- Replicable components should only be mutated inside `NetworkFixedUpdate`. +- Networked entities will have a global `NetworkID` component at minimum. +- Clients shouldn't try to "network" local entities through the addition of replicable components since those entities do not exist on the server. +- `World` should reserve an `Entity` ID range and track metadata for it separately. + - [Sub-worlds](https://github.com/bevyengine/rfcs/pull/16) seem like a potential candidate for this. Saving -- At the end of every fixed update, iterate `Changed` and `Removed` for all replicable components and duplicate them to an isolated copy. -- This isolated copy would be a collection of `SpareSet`, for just the replicable components. Tables would be rebuilt when restoring. -- (From [their RFC](https://github.com/bevyengine/rfcs/pull/16), "sub-worlds" seem like they might be usable for snapshot generation and rollbacks, but I need more details. AFAIK, they only address the "reserve a range of entities with separate metadata" requirement.) - -Packets -- Snapshots will use delta compression. - - We'll keep a ring buffer of patches for the last `N` snapshots. - - Whenever we duplicate changes to the isolated copy, also compute `copy xor changes` as the latest patch and push it into the ring buffer. Update the earlier patches by xor'ing them with the new patch. - - Finally, compress whichever patches clients need and pass them to the protocol layer. -- Eventual consistency will use interest management. +- At the end of each fixed update, server iterates `Changed` and `Removed` for all replicable components and duplicates them to an isolated collection of `SpareSet`. + - You could pass this "read-only" copy to another thread to do the remaining work. + - Tables would be rebuilt when restoring. + +Preparing Packets +- Snapshots (full state updates) will use delta compression. + - Server keeps a ring buffer of patches for the last `N` snapshots. + - Server computes the latest patch by xor'ing the copy and the latest changes (before applying them) and pushes it into the ring buffer. The servers also updates the earlier patches by xor'ing them with the latest patch. 
(The xor'ing is basically a pre-compression step that produces long zero chains with high probability.) + - Server compresses whichever patches clients need and hands them off to the protocol layer. (The same patch can be sent to multiple clients, so it scales pretty well.) + +- Eventual consistency (partial state updates) will use interest management. - Entities accrue send priority over time. Maybe we can use the magnitude of component changes as the base amount to accrue. - - Users-defined rules for gameplay relevancy would run. - - For physical entities, we can use collision detection to prioritize the entities inside each client's area of interest. - - Finally, write the payload for each client and pass them to the protocol layer. + - Server runs users-defined rules for gameplay relevance. + - Server runs collision detection to prioritize physical entities inside each client's area of interest. + - Server writes the payload for each client and hands them off to the protocol layer. Restoring -- TBD +- At the beginning of each fixed update, the client decompresses the received update and writes its changes to the appropriate `SparseSet` collection (several will be buffered). +- Client then uses this updated collection to write the prediction copy that has all the tables and non-replicable components. ### NetworkFixedUpdate Clients -1. Poll for received updates. +1. Iterate received server updates. 2. Update simulation and interpolation timescales. 3. Sample and send inputs to server. -4. Rollback and re-sim (if received new update). +4. Rollback and re-sim *if* a new update was received. 5. Simulate predicted tick. Server -1. Poll for received inputs. +1. Iterate received client inputs. 2. Sample buffered inputs. 3. Simulate authoritative tick. 4. Duplicate state changes to copy. -5. Send updates to clients. +5. Send updated state to clients. -Everything aside from the simulation steps can be generated automatically. +Everything aside from the simulation steps could be auto-generated. ### Network Modes | Mode | Playable? | Authoritative? | Open to connections? | @@ -204,8 +210,8 @@ Everything aside from the simulation steps can be generated automatically. - Relays are for managing deterministic and client-authoritative games. They can do "clock" synchronization, input validation, interest management, etc. Just no simulation. ## Drawbacks -- Possibly cursed macro magic. -- Writes to `World` directly. +- Lots of potentially cursed macro magic. +- Direct writes to `World`. - Seemingly limited to components that implement `Clone` and `Serialize`. ## Rationale and alternatives @@ -223,29 +229,29 @@ People who want to make multiplayer games want to focus on designing their game > What is the impact of not doing this? -It'll only grow more difficult to add these features as time goes on. Take Unity for example. Its built-in features are too non-deterministic and its only working solutions for state transfer are paid third-party assets. Thus far, said assets cannot integrate deeply enough to be transparent (at least not without custom memory management and duplicating parts of the engine). +It'll only grow more difficult to add these features as time goes on. Take Unity for example. Its built-in features are too non-deterministic and its only working solutions for state transfer are paid third-party assets. Thus far, said assets cannot integrate deeply enough to be transparent (at least not without substituting parts of the engine). 
> Why is this important to implement as a feature of Bevy itself, rather than an ecosystem crate? -I strongly doubt that fast, efficient, and transparent replication features can be implemented without directly manipulating a World and its component storages. +I strongly doubt that fast, efficient, and transparent replication features can be implemented without directly manipulating a `World` and its component storages. ## Unresolved questions -- What can't be serialized? -- What is the correct amount of isolation between replicable and non-replicable game state? Are we sure non-networked entities can't *become* networked through the addition of replicable components? +- What components and resources can't be serialized? +- Is there a better way to isolate replicable and non-replicable entities? - Can we provide lints for undefined behavior like mutating networked state outside of `NetworkFixedUpdate`? - How should UI widgets interact with networked state? Exclusively poll verified data? -- How should we deal with predicting and reconciling events and FX—animations, audio, particles? -- Do rollbacks break change detection? +- How should we deal with correcting mispredicted events and FX? +- Does rolling back break existing change detection or events? - Can we replicate animations exactly without explicitly sending animation parameters? - When sending partial state updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed? ## Future possibilities -- With some game state diffing tools, these replication systems could help detect non-determinism in other parts of the engine. +- With some tool to visualize game state diffs, these replication systems could help detect non-determinism in other parts of the engine. - Much like how Unreal has Fortnite, it would help immensenly if Bevy had an official collection of multiplayer samples to dogfood these features. - Bevy's future editor could automate most of the configuration and annotation. - Beyond replication, Bevy need only provide one good default for protocol and IO for the sake of completeness. I recommend dividing responsibilities as shown below to make it easy for developers to swap them with the [many](https://partner.steamgames.com/doc/features/multiplayer) [robust](https://developer.microsoft.com/en-us/games/solutions/multiplayer/) [platform](https://dev.epicgames.com/docs/services/en-US/Overview/index.html) [SDKs](https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html). Replication addresses all the underlying ECS interop, so it should be settled first. - **replication** (this RFC) + **replication** ← this RFC - save and restore - prediction - serialization and compression From dfec5c2aecbf53942567e2fdb0ea3f9d6b593c3e Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sun, 25 Apr 2021 18:23:41 -0500 Subject: [PATCH 15/43] replaced some I/O term usage --- networked_replication.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/networked_replication.md b/networked_replication.md index 5637c78b..64a06c7b 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -184,7 +184,7 @@ Restoring Clients 1. Iterate received server updates. 2. Update simulation and interpolation timescales. -3. Sample and send inputs to server. +3. Sample inputs and push them to send buffer. 4. Rollback and re-sim *if* a new update was received. 5. Simulate predicted tick. 
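For illustration, the client half of this loop might be wired together roughly as sketched below. Every type and method name is a placeholder (no such API exists yet); the sketch only shows the intended ordering of the five steps.

```rust
// Placeholder types; nothing here is an existing Bevy API.
#[derive(Clone)]
struct ServerUpdate { tick: u32 }

#[derive(Clone, Default)]
struct PlayerInput;

#[derive(Default)]
struct ClientNet {
    received: Vec<ServerUpdate>,
    send_buffer: Vec<(u32, PlayerInput)>,
}

#[derive(Default)]
struct ClientSim { predicted_tick: u32 }

impl ClientSim {
    fn update_timescales(&mut self, _newest: Option<&ServerUpdate>) {}
    fn sample_input(&self) -> PlayerInput { PlayerInput }
    fn rollback_and_resimulate(&mut self, _update: &ServerUpdate) {}
    fn simulate_predicted_tick(&mut self, _input: &PlayerInput) { self.predicted_tick += 1; }
}

fn client_network_fixed_update(net: &mut ClientNet, sim: &mut ClientSim) {
    // 1. Iterate received server updates, keeping the newest one.
    let newest = net.received.drain(..).last();
    // 2. Update simulation and interpolation timescales.
    sim.update_timescales(newest.as_ref());
    // 3. Sample inputs and push them to the send buffer.
    let input = sim.sample_input();
    net.send_buffer.push((sim.predicted_tick, input.clone()));
    // 4. Rollback and re-sim *if* a new update was received.
    if let Some(update) = &newest {
        sim.rollback_and_resimulate(update);
    }
    // 5. Simulate the predicted tick.
    sim.simulate_predicted_tick(&input);
}
```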
@@ -193,7 +193,7 @@ Server 2. Sample buffered inputs. 3. Simulate authoritative tick. 4. Duplicate state changes to copy. -5. Send updated state to clients. +5. Push client updates to send buffer. Everything aside from the simulation steps could be auto-generated. @@ -247,9 +247,9 @@ I strongly doubt that fast, efficient, and transparent replication features can ## Future possibilities - With some tool to visualize game state diffs, these replication systems could help detect non-determinism in other parts of the engine. -- Much like how Unreal has Fortnite, it would help immensenly if Bevy had an official collection of multiplayer samples to dogfood these features. +- Much like how Unreal has Fortnite, Bevy could have an official (or community-curated) collection of multiplayer samples to dogfood these features. - Bevy's future editor could automate most of the configuration and annotation. -- Beyond replication, Bevy need only provide one good default for protocol and IO for the sake of completeness. I recommend dividing responsibilities as shown below to make it easy for developers to swap them with the [many](https://partner.steamgames.com/doc/features/multiplayer) [robust](https://developer.microsoft.com/en-us/games/solutions/multiplayer/) [platform](https://dev.epicgames.com/docs/services/en-US/Overview/index.html) [SDKs](https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html). Replication addresses all the underlying ECS interop, so it should be settled first. +- Beyond replication, Bevy need only provide one good default for protocol and I/O for the sake of completeness. I recommend dividing crates at least to the extent shown below to make it easy for developers to swap the low-level stuff with [whatever](https://partner.steamgames.com/doc/features/multiplayer) [alternatives](https://developer.microsoft.com/en-us/games/solutions/multiplayer/) [they](https://dev.epicgames.com/docs/services/en-US/Overview/index.html) [want](https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html). Replication addresses all the underlying ECS interop, so it should be settled first. **replication** ← this RFC - save and restore From a79fb40f95058f12b8c83985401a31136acb27e0 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sun, 25 Apr 2021 18:40:53 -0500 Subject: [PATCH 16/43] fixed a typo "sparesets" --- networked_replication.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/networked_replication.md b/networked_replication.md index 64a06c7b..ecf6912d 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -160,7 +160,7 @@ Requirements - [Sub-worlds](https://github.com/bevyengine/rfcs/pull/16) seem like a potential candidate for this. Saving -- At the end of each fixed update, server iterates `Changed` and `Removed` for all replicable components and duplicates them to an isolated collection of `SpareSet`. +- At the end of each fixed update, server iterates `Changed` and `Removed` for all replicable components and duplicates them to an isolated collection of `SparseSet`. - You could pass this "read-only" copy to another thread to do the remaining work. - Tables would be rebuilt when restoring. @@ -247,7 +247,7 @@ I strongly doubt that fast, efficient, and transparent replication features can ## Future possibilities - With some tool to visualize game state diffs, these replication systems could help detect non-determinism in other parts of the engine. 
-- Much like how Unreal has Fortnite, Bevy could have an official (or community-curated) collection of multiplayer samples to dogfood these features. +- Much like how Unreal has Fortnite, Bevy could have an official (or curated) collection of multiplayer samples to dogfood these features. - Bevy's future editor could automate most of the configuration and annotation. - Beyond replication, Bevy need only provide one good default for protocol and I/O for the sake of completeness. I recommend dividing crates at least to the extent shown below to make it easy for developers to swap the low-level stuff with [whatever](https://partner.steamgames.com/doc/features/multiplayer) [alternatives](https://developer.microsoft.com/en-us/games/solutions/multiplayer/) [they](https://dev.epicgames.com/docs/services/en-US/Overview/index.html) [want](https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html). Replication addresses all the underlying ECS interop, so it should be settled first. From 236fea41c2397f6d35c132ae1be3788115206167 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Mon, 26 Apr 2021 06:47:18 -0500 Subject: [PATCH 17/43] reverted arrows to ASCII; added another example --- implementation_details.md | 2 +- networked_replication.md | 63 +++++++++++++++++++++++++-------------- 2 files changed, 42 insertions(+), 23 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index aab2a513..e6421c14 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -109,7 +109,7 @@ Cameras need a little special treatment. Inputs to the view rotation need to be Is an exponential decay enough for smooth error correction or are there better algorithms? -## Prediction ⟷ Interpolation +## Prediction <-> Interpolation Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. One obvious implementation is to literally fork the latest authoritative state. If copying the full state ends up being too expensive, we can probably use a copy-on-write layer. Clients should predict the entities driven by their input, the entities they spawn (until confirmed), and any entities mutated as a result of the first two. I think that should cover it. Predicting *everything* would be a compile-time choice. diff --git a/networked_replication.md b/networked_replication.md index ecf6912d..b24cce8e 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -29,20 +29,18 @@ What I hope to explore in this RFC is: [Link to my explanation of important replication concepts.](../main/replication_concepts.md) -> Please treat all terms like determinism, state transfer, snapshots, and eventual consistency as placeholders. We could easily label them differently. +> Please consider terms like determinism, state transfer, snapshots, etc. as placeholders. Bevy aims to make developing networked games as simple as possible. There isn't one replication strategy that works best for every game, but Bevy provides those it has under one API. First think about your game and consider which form of replication might fit best. Players can *either* send their inputs to each other (or through a relay) and independently and deterministically simulate the game *or* they can send their inputs to a single machine (the server) who simulates the game and sends back updated game state. -> Honestly, Bevy could put something like a questionnaire in the docs. 
Genre and player count pretty much choose the replication strategy for you. +> Bevy could have a questionnaire in the docs. Genre and player count basically choose for you. Next, determine which components and systems affect the global simulation state and tag them accordingly. Usually adding `#[derive(Replicate)]` to all replicable components is enough. You can additionally decorate gameplay logic and systems with `#[client]` or `#[server]` for conditional compilation. Lastly, add these simulation systems to the `NetworkFixedUpdate` app state. Bevy will take care of all state rollback, serialization, and compression internally. Other than that, you're free to write your game as if it were local multiplayer. -> This guide is pretty lazy lol, but that's the gist of it. - ### Example: "Networked" Components ```rust @@ -65,17 +63,30 @@ struct Health { ### Example: "Networked" Systems ```rust -// No networking boilerplate. Just swap components. -// Same code runs on client and server. fn check_zero_health(mut query: Query<(&Health, &mut NetworkTransform)>){ for (health, mut transform) in query.iter_mut() { if health.hp <= 0.0 { - transform.translation = Vec3::ZERO; + *transform.translation = Vec3::ZERO; } } } ``` +```rust +fn update_player_velocity(mut q: Query<(&Player, &mut Rigidbody)>) { + for (player, mut rigidbody) in q.iter_mut() { + // DerefMut flags these rigidbodies as predicted on the client. + *rigidbody.velocity = player.move_direction * player.move_speed; + } +} + +fn expensive_physics_calculation(mut q: Query<(&mut Rigidbody), Predicted>) { + for rigidbody in q.iter_mut() { + // Do stuff with only the predicted rigidbodies... + } +} +``` + ### Example: "Networked" App ```rust #[derive(Debug, Hash, PartialEq, Eq, Clone, SystemLabel)] @@ -143,28 +154,36 @@ replication strategy: snapshots [Link to more in-depth implementation details.](../main/implementation_details.md) -### Macros +### What is required? +- Guaranteed `ComponentId` stability. +- Make `World` able to reserve an `Entity` ID range and track metadata for it separately. + - If merged, [#16](https://github.com/bevyengine/rfcs/pull/16) could probably be used to handle this cleanly. +- Ideally, `RunCriteria` would support nested loops. I'm pretty sure this isn't an actual blocker, but the workaround feels a little hacky. + +### Primitives - `#[derive(Replicate)]` for identification and serialization; also adds `[repr(C)]` - `#[replicate(precision=?)]` for quantization - `#[replicate(range=(?, ?))]` for range compression - `#[client]` and `#[server]` for conditional compilation - -### Saving and Restoring Game State - -Requirements - +- `NetworkID` component for GUID +- `NetworkFixedUpdate` app state +- Specialized change detection filters + - `Predicted` + - `Confirmed` + - `Cancelled` + - `Added` and `Removed` variants for each also + +### Best Practices +- Networked entities will have the `NetworkID` component at minimum. Could be auto-added. - Replicable components should only be mutated inside `NetworkFixedUpdate`. -- Networked entities will have a global `NetworkID` component at minimum. -- Clients shouldn't try to "network" local entities through the addition of replicable components since those entities do not exist on the server. -- `World` should reserve an `Entity` ID range and track metadata for it separately. - - [Sub-worlds](https://github.com/bevyengine/rfcs/pull/16) seem like a potential candidate for this. +- Clients shouldn't try to "network" local entities by just adding replicable components. 
Those entities do not exist on the server. Client-authoritative is a different story. -Saving +### Saving Game State - At the end of each fixed update, server iterates `Changed` and `Removed` for all replicable components and duplicates them to an isolated collection of `SparseSet`. - You could pass this "read-only" copy to another thread to do the remaining work. - Tables would be rebuilt when restoring. -Preparing Packets +### Preparing Server Packets - Snapshots (full state updates) will use delta compression. - Server keeps a ring buffer of patches for the last `N` snapshots. - Server computes the latest patch by xor'ing the copy and the latest changes (before applying them) and pushes it into the ring buffer. The servers also updates the earlier patches by xor'ing them with the latest patch. (The xor'ing is basically a pre-compression step that produces long zero chains with high probability.) @@ -176,7 +195,7 @@ Preparing Packets - Server runs collision detection to prioritize physical entities inside each client's area of interest. - Server writes the payload for each client and hands them off to the protocol layer. -Restoring +### Restoring Game State - At the beginning of each fixed update, the client decompresses the received update and writes its changes to the appropriate `SparseSet` collection (several will be buffered). - Client then uses this updated collection to write the prediction copy that has all the tables and non-replicable components. @@ -246,12 +265,12 @@ I strongly doubt that fast, efficient, and transparent replication features can - When sending partial state updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed? ## Future possibilities -- With some tool to visualize game state diffs, these replication systems could help detect non-determinism in other parts of the engine. +- With some tool to visualize game state diffs, these replication systems could help detect non-determinism in other parts of the engine. - Much like how Unreal has Fortnite, Bevy could have an official (or curated) collection of multiplayer samples to dogfood these features. - Bevy's future editor could automate most of the configuration and annotation. - Beyond replication, Bevy need only provide one good default for protocol and I/O for the sake of completeness. I recommend dividing crates at least to the extent shown below to make it easy for developers to swap the low-level stuff with [whatever](https://partner.steamgames.com/doc/features/multiplayer) [alternatives](https://developer.microsoft.com/en-us/games/solutions/multiplayer/) [they](https://dev.epicgames.com/docs/services/en-US/Overview/index.html) [want](https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html). Replication addresses all the underlying ECS interop, so it should be settled first. 
- **replication** ← this RFC + **replication** <- this RFC - save and restore - prediction - serialization and compression From efaf47fecd761e8c99b924a3abd5c5ddde14fca0 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Mon, 26 Apr 2021 07:43:08 -0500 Subject: [PATCH 18/43] some rewording more neutral tone in the implementation details --- implementation_details.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index e6421c14..eb664010 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -15,9 +15,9 @@ TBD I know I've been using the terms somewhat interchangeably, but `Player` and `Connection` should be separate tokens. No reason to force one player per connection in the engine API. Having `Player` be its own thing makes it easier to do stuff like replace leaving players with bots. ## "Clock" Synchronization -Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. For some reason, people frequently arrive at the idea that clients should estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step. +Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. People often try to have clients estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step, but that's overly complex. -That's overcomplicating it. What we really care about is: How much time passes between when the server receives my input and when that input is consumed? If the server simply tells clients how long their inputs are waiting in its buffer, the clients can use that information to converge on the correct lead. +What we really care about is: How much time passes between when the server receives my input and when that input is consumed? If the server simply tells clients how long their inputs are waiting in its buffer, the clients can use that information to converge on the correct lead. ```rust if received_newer_server_update: @@ -71,7 +71,7 @@ The key idea here is that simplifying the client-server relationship makes the p ## Lag Compensation Lag compensation mainly deals with colliders. To avoid weird outcomes, lag compensation needs to run after all motion and physics systems. -Again, people get weird ideas about having the server estimate what interpolated state the client was looking at based on their RTT. Again, that kind of guesswork is unnecessary. +Again, people often imagine having the server estimate what interpolated state the client was looking at based on their RTT, but we can solve this problem without any guesswork. Clients can just tell the server what they were looking at by bundling the interpolated tick numbers and the blend value inside the input payloads. @@ -103,19 +103,20 @@ For clients with very high ping, their interpolated time will lag too far behind This limit is the only relation between the predicted time and the interpolated time. They're otherwise decoupled. ## Smooth Rendering +Rendering should come after `NetworkFixedUpdate`. + Whenever clients receive an update with new remote entities, those entities shouldn't be rendered until that update is interpolated. Cameras need a little special treatment. Inputs to the view rotation need to be accumulated at the render rate and re-applied just before rendering. 
+We'll also need to distinguish instant motion from integrated motion when interpolating. Moving an entity by modifying `transform.translation` and `rigidbody.velocity` should look different. + Is an exponential decay enough for smooth error correction or are there better algorithms? ## Prediction <-> Interpolation Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. One obvious implementation is to literally fork the latest authoritative state. If copying the full state ends up being too expensive, we can probably use a copy-on-write layer. -Clients should predict the entities driven by their input, the entities they spawn (until confirmed), and any entities mutated as a result of the first two. I think that should cover it. Predicting *everything* would be a compile-time choice. - -I said entities, but we can predict with component granularity. The million-dollar question is how to shift things between prediction and interpolation. My current idea is for everything to default to interpolation (reset upon receiving a server update) and then use specialized change detection `DerefMut` magic to flag local predictions. - +My current idea to shift components between prediction and interpolation is to default to interpolated (reset upon receiving a server update) and then use specialized change detection `DerefMut` magic to flag as predicted. ``` Predicted PredictAdded @@ -127,10 +128,9 @@ Cancelled CancelAdded CancelRemoved ``` +Everything is predicted by default, but users can opt-out by filtering on `Predicted`. In the more conservative cases, clients would predict the entities driven by their input, the entities they spawn (until confirmed), and any entities mutated as a result of the first two. Systems with filtered queries (i.e. physics, path-planning) should typically run last. -With these, we can explicitly opt-out of funneling non-predicted components through expensive systems. We can also generate events that only trigger on authoritative changes and events that trigger on predicted changes to be confirmed or cancelled later. The latter are necessary for handling sounds and particle effects. Those shouldn't be duplicated during rollbacks and should be faded out if mispredicted. - -All systems that handle "predictable" interactions (pushing a button, putting an item in your inventory) should probably run *before* the expensive stuff (physics, path-planning). Rendering should come after `NetworkFixedUpdate`. +We can also use these filters to generate events that only trigger on authoritative changes and events that trigger on predicted changes to be confirmed or cancelled later. The latter are necessary for handling sounds and particle effects. Those shouldn't be duplicated during rollbacks and should be faded out if mispredicted. Should UI be allowed to reference predicted state or only verified state? 
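To make the intended use of the filters above concrete, here is an illustrative snippet in the same style as the earlier query examples. `Predicted`, `PredictAdded`, and `Cancelled` are the proposed filters and do not exist yet; `PathPlan` and `HitEffect` are made-up components, so this is not compilable code.

```rust
// Illustrative only; the filter types below are proposed, not implemented.

// Keep an expensive system from touching non-predicted entities.
fn plan_paths(mut query: Query<&mut PathPlan, Predicted>) {
    for mut plan in query.iter_mut() {
        // re-plan only for entities the client is actually predicting
    }
}

// Start FX for predicted hits, and fade them out if the prediction is
// later cancelled by an authoritative update.
fn hit_effects(
    started: Query<&HitEffect, PredictAdded>,
    cancelled: Query<&HitEffect, Cancelled>,
) {
    for _fx in started.iter() { /* play sound, spawn particles */ }
    for _fx in cancelled.iter() { /* fade out; don't replay during rollbacks */ }
}
```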
From b4450bd043240c3ec91b30d48a2c1586ef8b0e1c Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Mon, 26 Apr 2021 07:53:56 -0500 Subject: [PATCH 19/43] reorganized implementaton_details.md --- implementation_details.md | 118 +++++++++++++++++++------------------- 1 file changed, 59 insertions(+), 59 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index eb664010..2e404006 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -1,18 +1,7 @@ # Implementation Details -## Delta Compression -TBD - -## Interest Management -TBD - -## RPC -RPCs are best for sending global alerts and any gameplay mechanics you explicitly want modeled as request-reply (or one-way) interactions. They can be reliable or unreliable. - -TBD - -## Clients are not players... -I know I've been using the terms somewhat interchangeably, but `Player` and `Connection` should be separate tokens. No reason to force one player per connection in the engine API. Having `Player` be its own thing makes it easier to do stuff like replace leaving players with bots. +## `Connection` != `Player` +I know I've been using the terms "client" and "player" somewhat interchangeably, but `Connection` and `Player` should be separate tokens. There's no benefit in forcing one player per connection. Having `Player` be its own thing makes it easier to do stuff like replace leaving players with bots. ## "Clock" Synchronization Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. People often try to have clients estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step, but that's overly complex. @@ -68,51 +57,6 @@ interp_time = max(interp_time, predicted_time - max_lag_comp) The key idea here is that simplifying the client-server relationship makes the problem easier. You *could* have the server apply inputs whenever they arrive, rolling back if necessary, but that would only complicate things. If the server never accepts late inputs and never changes its pace, no one needs to coordinate. -## Lag Compensation -Lag compensation mainly deals with colliders. To avoid weird outcomes, lag compensation needs to run after all motion and physics systems. - -Again, people often imagine having the server estimate what interpolated state the client was looking at based on their RTT, but we can solve this problem without any guesswork. - -Clients can just tell the server what they were looking at by bundling the interpolated tick numbers and the blend value inside the input payloads. - -``` - -tick number (predicted) -tick number (interpolated from) -tick number (interpolated to) -interpolation blend value - -``` -With this information, the server can reconstruct *exactly* what each client saw. - -Lag compensation goes like this: -1. Queue projectile spawns, tagged with their shooter's interpolation data. -2. Restore all colliders to the earliest interpolated moment. -3. Replay forward to the current tick, spawning the projectiles at the appropriate times and registering hits. - -After that's done, any surviving projectiles will exist in the correct time. The process is the same for raycast weapons. - -There's a lot to learn from *Overwatch* here. - -*Overwatch* [allows defensive abilities to mitigate lag-compensated shots](https://youtu.be/W3aieHjyNvw?t=2492). AFAIK this is simple to do. 
If a player activates any defensive bonus, just apply it to all their buffered hitboxes. - -*Overwatch* also [finds the movement envelope of each entity](https://youtu.be/W3aieHjyNvw?t=2226), the "sum" of its bounding volumes over the full lag compensation window, to reduce the number of intersection tests, only rewinding characters whose movement envelopes intersect projectiles. - -For clients with very high ping, their interpolated time will lag too far behind their predicted time. You generally don't want to favor the shooter past a certain limit (e.g. 250ms), so [those clients have to extrapolate the difference](https://youtu.be/W3aieHjyNvw?t=2347). Not extrapolating is also valid, but then lagging clients would abruptly have to start leading their targets. - -This limit is the only relation between the predicted time and the interpolated time. They're otherwise decoupled. - -## Smooth Rendering -Rendering should come after `NetworkFixedUpdate`. - -Whenever clients receive an update with new remote entities, those entities shouldn't be rendered until that update is interpolated. - -Cameras need a little special treatment. Inputs to the view rotation need to be accumulated at the render rate and re-applied just before rendering. - -We'll also need to distinguish instant motion from integrated motion when interpolating. Moving an entity by modifying `transform.translation` and `rigidbody.velocity` should look different. - -Is an exponential decay enough for smooth error correction or are there better algorithms? - ## Prediction <-> Interpolation Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. One obvious implementation is to literally fork the latest authoritative state. If copying the full state ends up being too expensive, we can probably use a copy-on-write layer. @@ -147,6 +91,51 @@ A better solution is for the server to assign each networked entity a global ID - A more extreme solution would be to somehow bake global IDs directly into the memory allocation. If memory layouts are mirrored, relative pointers become global IDs, which don't need to be explicitly written into packets. This would save 4-8 bytes per entity before compression. +## Smooth Rendering +Rendering should come after `NetworkFixedUpdate`. + +Whenever clients receive an update with new remote entities, those entities shouldn't be rendered until that update is interpolated. + +Cameras need a little special treatment. Inputs to the view rotation need to be accumulated at the render rate and re-applied just before rendering. + +We'll also need to distinguish instant motion from integrated motion when interpolating. Moving an entity by modifying `transform.translation` and `rigidbody.velocity` should look different. + +Is an exponential decay enough for smooth error correction or are there better algorithms? + +## Lag Compensation +Lag compensation mainly deals with colliders. To avoid weird outcomes, lag compensation needs to run after all motion and physics systems. + +Again, people often imagine having the server estimate what interpolated state the client was looking at based on their RTT, but we can solve this problem without any guesswork. + +Clients can just tell the server what they were looking at by bundling the interpolated tick numbers and the blend value inside the input payloads. 
+ +``` + +tick number (predicted) +tick number (interpolated from) +tick number (interpolated to) +interpolation blend value + +``` +With this information, the server can reconstruct *exactly* what each client saw. + +Lag compensation goes like this: +1. Queue projectile spawns, tagged with their shooter's interpolation data. +2. Restore all colliders to the earliest interpolated moment. +3. Replay forward to the current tick, spawning the projectiles at the appropriate times and registering hits. + +After that's done, any surviving projectiles will exist in the correct time. The process is the same for raycast weapons. + +There's a lot to learn from *Overwatch* here. + +*Overwatch* [allows defensive abilities to mitigate lag-compensated shots](https://youtu.be/W3aieHjyNvw?t=2492). AFAIK this is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. + +*Overwatch* also [finds the movement envelope of each entity](https://youtu.be/W3aieHjyNvw?t=2226), the "sum" of its bounding volumes over the full lag compensation window, to reduce the number of intersection tests, only rewinding characters whose movement envelopes intersect projectiles. + +For clients with very high ping, their interpolated time will lag too far behind their predicted time. You generally don't want to favor the shooter past a certain limit (e.g. 250ms), so [those clients have to extrapolate the difference](https://youtu.be/W3aieHjyNvw?t=2347). Not extrapolating is also valid, but then lagging clients would abruptly have to start leading their targets. + +This limit is the only relation between the predicted time and the interpolated time. They're otherwise decoupled. + ## Unconditional Rollbacks Every article on "rollback netcode" and "client-side prediction and server reconciliation" encourages having clients compare their predicted state to the authoritative state and reconciling *if* they mispredicted. But how do you actually detect a mispredict? @@ -163,4 +152,15 @@ Let's consider a simpler default: Now, if you're thinking that's wasteful, the "if mispredicted" gives you a false sense of security. If I make a game and claim it can rollback 250ms, that basically should mean *any* 250ms, with no stuttering. If clients *always* rollback and re-sim, it'll be easier to profile and optimize for that. As a bonus, clients never need to store old predicted states. -Constant rollbacks may sound expensive, but there were games with rollback running on the original Playstation over 20 years ago. \ No newline at end of file +Constant rollbacks may sound too expensive, but there were games with rollback running on the original Playstation over 20 years ago. + +## Delta Compression +TBD + +## Interest Management +TBD + +## Messages +Messages are best for sending global alerts and any gameplay mechanics you explicitly want modeled as request-reply (or one-way) interactions. They can be reliable or unreliable. 
+ +TBD \ No newline at end of file From 7d3d0c73bb53d837e5537c59f66eeb5896153736 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Mon, 26 Apr 2021 19:45:50 -0500 Subject: [PATCH 20/43] edit lag comp description --- implementation_details.md | 26 ++++++++++---------------- 1 file changed, 10 insertions(+), 16 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 2e404006..0e91051a 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -1,7 +1,7 @@ # Implementation Details ## `Connection` != `Player` -I know I've been using the terms "client" and "player" somewhat interchangeably, but `Connection` and `Player` should be separate tokens. There's no benefit in forcing one player per connection. Having `Player` be its own thing makes it easier to do stuff like replace leaving players with bots. +I know I've been using the terms "client" and "player" somewhat interchangeably, but `Connection` and `Player` should be separate tokens. There's no benefit in forcing one player per connection. Having `Player` be its own thing makes it easier to do stuff like online splitscreen, temporarily fill team slots with bots, etc. ## "Clock" Synchronization Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. People often try to have clients estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step, but that's overly complex. @@ -103,11 +103,11 @@ We'll also need to distinguish instant motion from integrated motion when interp Is an exponential decay enough for smooth error correction or are there better algorithms? ## Lag Compensation -Lag compensation mainly deals with colliders. To avoid weird outcomes, lag compensation needs to run after all motion and physics systems. +Lag compensation deals with colliders. To avoid weird outcomes, lag compensation needs to run after all motion and physics systems. -Again, people often imagine having the server estimate what interpolated state the client was looking at based on their RTT, but we can solve this problem without any guesswork. +Again, people often imagine having the server estimate what interpolated state the client was looking at based on their RTT, but we can resolve this without any guesswork. -Clients can just tell the server what they were looking at by bundling the interpolated tick numbers and the blend value inside the input payloads. +Clients can just tell the server what they were looking at by bundling the interpolated tick numbers and the blend value inside the input payloads. With this information, the server can reconstruct *exactly* what each client saw. ``` @@ -117,24 +117,18 @@ tick number (interpolated to) interpolation blend value ``` -With this information, the server can reconstruct *exactly* what each client saw. -Lag compensation goes like this: -1. Queue projectile spawns, tagged with their shooter's interpolation data. -2. Restore all colliders to the earliest interpolated moment. -3. Replay forward to the current tick, spawning the projectiles at the appropriate times and registering hits. - -After that's done, any surviving projectiles will exist in the correct time. The process is the same for raycast weapons. +So there are two ways to go about the actual compensation: +- Compensate upfront by bringing new projectiles into the present (similar to a rollback). 
+- Compensate over time ("amortized"), constantly testing projectiles against the history buffer. There's a lot to learn from *Overwatch* here. -*Overwatch* [allows defensive abilities to mitigate lag-compensated shots](https://youtu.be/W3aieHjyNvw?t=2492). AFAIK this is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. - -*Overwatch* also [finds the movement envelope of each entity](https://youtu.be/W3aieHjyNvw?t=2226), the "sum" of its bounding volumes over the full lag compensation window, to reduce the number of intersection tests, only rewinding characters whose movement envelopes intersect projectiles. +*Overwatch* shows that [time is just another collision dimension](https://youtu.be/W3aieHjyNvw?t=2226). Basically, you can broadphase test against the entire collider history at once (with the amortized method). -For clients with very high ping, their interpolated time will lag too far behind their predicted time. You generally don't want to favor the shooter past a certain limit (e.g. 250ms), so [those clients have to extrapolate the difference](https://youtu.be/W3aieHjyNvw?t=2347). Not extrapolating is also valid, but then lagging clients would abruptly have to start leading their targets. +*Overwatch* [allows defensive abilities to mitigate compensated projectiles](https://youtu.be/W3aieHjyNvw?t=2492). AFAIK this is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. -This limit is the only relation between the predicted time and the interpolated time. They're otherwise decoupled. +For clients with too-high ping, their interpolation will lag far behind their prediction. If you only compensate up to a limit (e.g. 200ms), [those clients will have to extrapolate the difference](https://youtu.be/W3aieHjyNvw?t=2347). Doing nothing is also valid, but lagging clients would abruptly have to start leading their targets. ## Unconditional Rollbacks Every article on "rollback netcode" and "client-side prediction and server reconciliation" encourages having clients compare their predicted state to the authoritative state and reconciling *if* they mispredicted. But how do you actually detect a mispredict? From 943b336e0e695b6be1fdc33ac979fd918437bfb0 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Tue, 27 Apr 2021 10:24:26 -0500 Subject: [PATCH 21/43] re-did user-facing explanation, moved stuff around --- implementation_details.md | 2 +- networked_replication.md | 303 +++++++++++++++----------------------- replication_concepts.md | 6 +- 3 files changed, 125 insertions(+), 186 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 0e91051a..21a81f25 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -155,6 +155,6 @@ TBD TBD ## Messages -Messages are best for sending global alerts and any gameplay mechanics you explicitly want modeled as request-reply (or one-way) interactions. They can be reliable or unreliable. +Messages are best for sending global alerts and any gameplay mechanics you explicitly want modeled as request-reply (or one-way) interactions. They can be unreliable or reliable. You can also postmark messages to be executed on a certain tick like inputs. That can only be best effort, though. 
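While the details are still TBD, a message would carry roughly the information sketched below. These types are a hypothetical illustration of the description above, not a proposed API.

```rust
// Hypothetical sketch only.
enum Delivery {
    Unreliable,
    Reliable,
}

struct Message<T> {
    payload: T,
    delivery: Delivery,
    /// Optional postmark: execute on this tick, best effort. If the tick has
    /// already passed when the message arrives, it simply runs late.
    execute_at_tick: Option<u32>,
}
```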
TBD \ No newline at end of file diff --git a/networked_replication.md b/networked_replication.md index b24cce8e..cc7e1a40 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -2,50 +2,38 @@ ## Summary -This RFC describes an implementation of engine features for developing networked games. Its main focus is replication and its key interest is providing these systems transparently (i.e. minimal, if any, networking boilerplate). +This RFC proposes an implementation of engine features for developing networked games. It abstracts away the (mostly irrelevant) low-level transport details to focus on high-level *replication* features, with key interest in providing them transparently (i.e. minimal, if any, networking boilerplate). ## Motivation -Networking is unequivocally the most lacking feature in all general-purpose game engines. +Networking is unequivocally the most lacking feature in all general-purpose game engines. -Bevy has an opportunity to be among the first open game engines to provide a truly plug-and-play networking API. This RFC focuses on *replication*, the part of networking that deals with simulation behavior and the only one that directly involves the ECS. +While most engines provide low-level connectivity—virtual connections, optionally reliable UDP channels, rooms—almost none of them ([except][1] [Unreal][2]) provide high-level *replication* features like prediction, interest management, or lag compensation, which are necessary for most networked multiplayer games. -> The goal of replication is to ensure that all of the players in the game have a consistent model of the game state. Replication is the absolute minimum problem which all networked games have to solve in order to be functional, and all other problems in networked games ultimately follow from it. - [Mikola Lysenko](https://0fps.net/2014/02/10/replication-in-networked-games-overview-part-1/) +This broad absence of first-class replication features stifles creative ambition and feeds into an idea that every multiplayer game needs its own unique implementation. Certainly, there are idiomatic "strategies" for different genres, but all of them—lockstep, rollback, client-side prediction with server reconciliation—pull from the same bag of tricks. Their differences can be captured in a short list of configuration options. Really, only *massive* multiplayer games require custom solutions. -Most engines provide "low level" connectivity—virtual connections, optionally reliable UDP channels, rooms—and stop there. Those are not very useful to developers without "high level" replication features—prediction, reconciliation, lag compensation, interest management, etc. - -Among Godot, Unity, and Unreal, only Unreal provides [some](https://youtu.be/JOJP0CvpB8w) [features](https://www.unrealengine.com/en-US/tech-blog/replication-graph-overview-and-proper-replication-methods) built-in. - -IMO the broader absence of these systems leads many to conclude that every multiplayer game must need its own unique solution. This is not true. While the exact replication "strategy" depends of the game, all of them—lockstep, rollback, client-side prediction with server reconciliation, etc.—pull from the same bag of tricks. Their differences can be captured with simple configuration options. Really, only *massive* multiplayer games require custom solutions. - -In general, I think that building *up* from the socket layer leads to the wrong intuition about what "networking" is. 
If you start from what the simulation needs and design *down*, defining the problem is easier and routing becomes an implementation detail. +Bevy's ECS opens up the possibility of providing a near-seamless, generalized networking API. What I hope to explore in this RFC is: -- How do game design and networking constrain each other? -- How do these constraints affect user decisions? -- What should developing a networked game look like in Bevy? +- What game design choices and constraints does networking add? +- How does ECS make networking easier to implement? +- What should developing a networked multiplayer game in Bevy look like? -## Guide-level explanation +## User-facing Explanation [Link to my explanation of important replication concepts.](../main/replication_concepts.md) -> Please consider terms like determinism, state transfer, snapshots, etc. as placeholders. - -Bevy aims to make developing networked games as simple as possible. There isn't one replication strategy that works best for every game, but Bevy provides those it has under one API. - -First think about your game and consider which form of replication might fit best. Players can *either* send their inputs to each other (or through a relay) and independently and deterministically simulate the game *or* they can send their inputs to a single machine (the server) who simulates the game and sends back updated game state. - -> Bevy could have a questionnaire in the docs. Genre and player count basically choose for you. +Bevy's aim here is to make writing local and networked multiplayer games indistinguishable, with minimal added boilerplate. Having an exact simulation timeline simplifies this problem, thus the core of this unified approach is a fixed timestep—`NetworkFixedUpdate`. -Next, determine which components and systems affect the global simulation state and tag them accordingly. Usually adding `#[derive(Replicate)]` to all replicable components is enough. You can additionally decorate gameplay logic and systems with `#[client]` or `#[server]` for conditional compilation. +As a user, you only have to annotate your gameplay-related components and systems, add those systems to `NetworkFixedUpdate` (currently would be an `AppState`), and configure a few simulation settings to get up and running. That's it! Bevy will transparently handle separating, reconciling, serializing, and compressing the networked state for you. (Those systems can be exposed for advanced users, but non-interested users need not concern themselves.) -Lastly, add these simulation systems to the `NetworkFixedUpdate` app state. Bevy will take care of all state rollback, serialization, and compression internally. Other than that, you're free to write your game as if it were local multiplayer. +> Game design should (mostly) drive networking choices. Future documentation could feature a questionnaire to guide users to the correct configuration options for their game. Genre and player count are generally enough to decide. -### Example: "Networked" Components +The core primitive here is the `Replicate` trait. All instances of components and resources that implement this trait will be automatically detected and synchronized over the network. Simply adding a `#[derive(Replicate)]` should be enough in most cases. 
```rust #[derive(Replicate)] -struct NetworkTransform { +struct Transform { #[replicate(precision=0.001)] translation: Vec3, #[replicate(precision=0.01)] @@ -56,128 +44,114 @@ struct NetworkTransform { #[derive(Replicate)] struct Health { - #[replicate(precision=0.1, range=(0.0, 100.0))] - hp: f32, + #[replicate(range=(0, 1000))] + hp: u32, } ``` +By default, both client and server will run every system you add to `NetworkFixedUpdate`. If you want systems or code snippets to run exclusively on one or the other, you can annotate them with `#[client]` or `#[server]` for the compiler. -### Example: "Networked" Systems ```rust -fn check_zero_health(mut query: Query<(&Health, &mut NetworkTransform)>){ - for (health, mut transform) in query.iter_mut() { - if health.hp <= 0.0 { - *transform.translation = Vec3::ZERO; - } +#[server] +fn ball_movement_system( + mut ball_query: Query<(&Ball, &mut Transform)>) +{ + for (ball, mut transform) in ball_query.iter_mut() { + transform.translation += ball.velocity * FIXED_TIMESTEP; } } ``` +For more nuanced runtime cases—say, an expensive movement system that should only process the local player entity on clients—you can use the `Predicted` query filter. If you need an explicit request or notification, you can use `Message` variants. ```rust -fn update_player_velocity(mut q: Query<(&Player, &mut Rigidbody)>) { +fn update_player_velocity( + mut q: Query<(&Player, &mut Rigidbody)>) +{ for (player, mut rigidbody) in q.iter_mut() { // DerefMut flags these rigidbodies as predicted on the client. - *rigidbody.velocity = player.move_direction * player.move_speed; + *rigidbody.velocity = player.aim_direction * player.movement_speed * FIXED_TIMESTEP; } } -fn expensive_physics_calculation(mut q: Query<(&mut Rigidbody), Predicted>) { +fn expensive_physics_calculation( + mut q: Query<(&mut Rigidbody), Predicted>) +{ for rigidbody in q.iter_mut() { // Do stuff with only the predicted rigidbodies... } } ``` -### Example: "Networked" App +``` +TODO: Message Example +``` + +Bevy can configure an `App` to operate in several different network modes. + +| Mode | Playable? | Authoritative? | Open to connections? | +| :--- | :---: | :---: | :---: | +| Client | ✓ | ✗ | ✗ | +| Standalone | ✓ | ✓ | ✗ | +| Listen Server | ✓ | ✓ | ✓ | +| Dedicated Server | ✗ | ✓ | ✓ | +| Relay | ✗ | ✗ | ✓ | + +
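Until the `TODO` below is filled in, here is a rough sketch of what the app setup might look like, adapted from the example in an earlier revision of this section. `NetworkPlugins`, `NetworkMode`, and `AppState::NetworkFixedUpdate` are placeholders rather than existing APIs, and the systems are the ones from the examples above.

```rust
// Rough sketch only; plugin, resource, and state names are placeholders.
fn main() {
    App::build()
        .add_plugins(DefaultPlugins)
        .add_plugins(NetworkPlugins)
        // Pick one of the modes from the table above.
        .insert_resource(NetworkMode::ListenServer)
        .add_state(AppState::NetworkFixedUpdate)
        .add_system_set(
            SystemSet::on_update(AppState::NetworkFixedUpdate)
                .with_run_criteria(FixedTimestep::step(1.0 / 60.0))
                .with_system(ball_movement_system.system())
                .with_system(update_player_velocity.system()),
        )
        .run();
}
```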
+ ```rust -#[derive(Debug, Hash, PartialEq, Eq, Clone, SystemLabel)] -pub enum NetworkLabel { - Input, - Gameplay, - Physics, - LagComp -} +// TODO: Example App configuration. +``` -fn main() { - App::build() - .add_plugins(DefaultPlugins) - .add_plugins(NetworkPlugins) - - // Add the fixed update state. - .add_state(AppState::NetworkFixedUpdate) - .run_criteria(FixedTimestep::step(1.0 / 60.0)) - - // Add our game systems: - .add_system_set( - SystemSet::new() - .label(NetworkLabel::Input) - .before(NetworkLabel::Gameplay) - .with_system(sample_inputs.system()) - ) - .add_system_set( - SystemSet::on_update(AppState::NetworkFixedUpdate) - .label(NetworkLabel::Gameplay) - .before(NetworkLabel::Physics) - .with_system(check_zero_health.system()) - // ... Most user systems would go here. - ) - #[server] - .add_system_set( - SystemSet::on_update(AppState::NetworkFixedUpdate) - .label(NetworkLabel::LagComp) - .after(NetworkLabel::Physics) - // ... - ) - // ... - .run(); +## Implementation Strategy +[Link to more in-depth implementation details (more of an idea dump atm).](../main/implementation_details.md) + +### What is required? +- In order for servers to send state to clients, `ComponentId` should be stable. +- `World` should be able to reserve an `Entity` ID range, with separate metadata. + - If merged, [#16](https://github.com/bevyengine/rfcs/pull/16) can probably be used to handle this cleanly. +- The ECS scheduler should support nested loops. + - (I'm pretty sure this isn't an actual blocker, but the workaround feels a little hacky.) +- Replicable components should only be mutated inside `NetworkFixedUpdate`. +- Networked entities should have a `NetworkID` component at minimum. Could be auto-added. +- Adding replicable components to non-networked entities should be avoided unless client-authoritative. + +### The Replicate Trait +```rust +// TODO +impl Replicate for T { + ... } ``` -### Example Configuration Options -``` -players: 32 -max networked entities: 1024 -replication strategy: snapshots - mode: listen server - simulation tick rate: 60Hz - client send interval: 1 tick - server send interval: 2 ticks - client input delay: 0 - server input delay: 2 ticks - prediction: true - rollback window: 250ms - min interpolation delay: 32ms - lag compensation: true - compensation window: 200ms +### Specialized Change Detection +```rust +// TODO +// Predicted +// Confirmed +// Cancelled +// Added and Removed variants for each also ``` -## Reference-level explanation +### Rollback via Run Criteria +```rust +// TODO +``` -[Link to more in-depth implementation details.](../main/implementation_details.md) +### NetworkFixedUpdate +Clients +1. Iterate received server updates. +2. Update simulation and interpolation timescales. +3. Sample inputs and push them to send buffer. +4. Rollback and re-sim *if* a new update was received. +5. Simulate predicted tick. -### What is required? -- Guaranteed `ComponentId` stability. -- Make `World` able to reserve an `Entity` ID range and track metadata for it separately. - - If merged, [#16](https://github.com/bevyengine/rfcs/pull/16) could probably be used to handle this cleanly. -- Ideally, `RunCriteria` would support nested loops. I'm pretty sure this isn't an actual blocker, but the workaround feels a little hacky. 
- -### Primitives -- `#[derive(Replicate)]` for identification and serialization; also adds `[repr(C)]` -- `#[replicate(precision=?)]` for quantization -- `#[replicate(range=(?, ?))]` for range compression -- `#[client]` and `#[server]` for conditional compilation -- `NetworkID` component for GUID -- `NetworkFixedUpdate` app state -- Specialized change detection filters - - `Predicted` - - `Confirmed` - - `Cancelled` - - `Added` and `Removed` variants for each also - -### Best Practices -- Networked entities will have the `NetworkID` component at minimum. Could be auto-added. -- Replicable components should only be mutated inside `NetworkFixedUpdate`. -- Clients shouldn't try to "network" local entities by just adding replicable components. Those entities do not exist on the server. Client-authoritative is a different story. +Server +1. Iterate received client inputs. +2. Sample buffered inputs. +3. Simulate authoritative tick. +4. Duplicate state changes to copy. +5. Push client updates to send buffer. +Everything aside from the simulation steps could be auto-generated. ### Saving Game State - At the end of each fixed update, server iterates `Changed` and `Removed` for all replicable components and duplicates them to an isolated collection of `SparseSet`. - You could pass this "read-only" copy to another thread to do the remaining work. @@ -199,92 +173,53 @@ replication strategy: snapshots - At the beginning of each fixed update, the client decompresses the received update and writes its changes to the appropriate `SparseSet` collection (several will be buffered). - Client then uses this updated collection to write the prediction copy that has all the tables and non-replicable components. -### NetworkFixedUpdate -Clients -1. Iterate received server updates. -2. Update simulation and interpolation timescales. -3. Sample inputs and push them to send buffer. -4. Rollback and re-sim *if* a new update was received. -5. Simulate predicted tick. - -Server -1. Iterate received client inputs. -2. Sample buffered inputs. -3. Simulate authoritative tick. -4. Duplicate state changes to copy. -5. Push client updates to send buffer. - -Everything aside from the simulation steps could be auto-generated. - -### Network Modes -| Mode | Playable? | Authoritative? | Open to connections? | -| :--- | :---: | :---: | :---: | -| Client | ✓ | ✗ | ✗ | -| Standalone | ✓ | ✓ | ✗ | -| Listen Server | ✓ | ✓ | ✓ | -| Dedicated Server | ✗ | ✓ | ✓ | -| Relay | ✗ | ✗ | ✓ | - -- Listen servers have client and server instances on the same machine. Standalone is a listen server configured with a dummy socket and closed to connections. -- Relays are for managing deterministic and client-authoritative games. They can do "clock" synchronization, input validation, interest management, etc. Just no simulation. - ## Drawbacks - Lots of potentially cursed macro magic. - Direct writes to `World`. - Seemingly limited to components that implement `Clone` and `Serialize`. -## Rationale and alternatives -> Why is this design the best in the space of possible designs? +## Rationale and Alternatives +### Why *this* design? Networking is a widely misunderstood problem domain. The proposed implementation should suffice for most games while minimizing design friction—users need only annotate gameplay-related components and systems, put those systems in `NetworkFixedUpdate`, and configure some settings. -> What other designs have been considered and what is the rationale for not choosing them? 
- -Replication always boils down to sending inputs or state, so the space of alternative designs includes different choices for the end-user interface and different implementations of save/restore functions. - -Frankly, given the abundance of confusion surrounding networking, polluting the API with "networked" variants of structs and systems (aside from `Transform`, `Rigidbody`, etc.) would just make life harder for everybody, both game developers and Bevy maintainers. +Polluting the API with "networked" variants of structs and systems (aside from `Transform`, `Rigidbody`, etc.) would just make life harder for everybody, both game developers and Bevy maintainers. IMO the ease of macro annotations is worth any increase in compile times when networking features are enabled. -People who want to make multiplayer games want to focus on designing their game and not worry about how to implement prediction, how to serialize their game, how to keep packets under MTU, etc. All of that should just work. I think the ease of macro annotations is worth any increase in compile times when networking features are enabled. - -> What is the impact of not doing this? +### Why should Bevy provide this? +People who want to make multiplayer games want to focus on designing their game and not worry about how to implement prediction, how to serialize their game, how to keep packets under MTU, etc. Having these come built-in would be a huge selling point. +### Why not wait until Bevy is more mature? It'll only grow more difficult to add these features as time goes on. Take Unity for example. Its built-in features are too non-deterministic and its only working solutions for state transfer are paid third-party assets. Thus far, said assets cannot integrate deeply enough to be transparent (at least not without substituting parts of the engine). -> Why is this important to implement as a feature of Bevy itself, rather than an ecosystem crate? - +### Why does this need to involve `bevy_ecs`? I strongly doubt that fast, efficient, and transparent replication features can be implemented without directly manipulating a `World` and its component storages. -## Unresolved questions +## Unresolved Questions - What components and resources can't be serialized? - Is there a better way to isolate replicable and non-replicable entities? - Can we provide lints for undefined behavior like mutating networked state outside of `NetworkFixedUpdate`? +- Does rolling back break existing change detection or events? +- When sending partial state updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed? - How should UI widgets interact with networked state? Exclusively poll verified data? - How should we deal with correcting mispredicted events and FX? -- Does rolling back break existing change detection or events? - Can we replicate animations exactly without explicitly sending animation parameters? -- When sending partial state updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed? -## Future possibilities -- With some tool to visualize game state diffs, these replication systems could help detect non-determinism in other parts of the engine. +## Future Possibilities + +- With some tools to visualize game state diffs, these replication systems could help detect non-determinism in other parts of the engine. 
- Much like how Unreal has Fortnite, Bevy could have an official (or curated) collection of multiplayer samples to dogfood these features. - Bevy's future editor could automate most of the configuration and annotation. -- Beyond replication, Bevy need only provide one good default for protocol and I/O for the sake of completeness. I recommend dividing crates at least to the extent shown below to make it easy for developers to swap the low-level stuff with [whatever](https://partner.steamgames.com/doc/features/multiplayer) [alternatives](https://developer.microsoft.com/en-us/games/solutions/multiplayer/) [they](https://dev.epicgames.com/docs/services/en-US/Overview/index.html) [want](https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html). Replication addresses all the underlying ECS interop, so it should be settled first. - - **replication** <- this RFC - - save and restore - - prediction - - serialization and compression - - interest management, prioritization, level-of-detail - - smooth rendering - - lag compensation - - statistics - - **protocol** - - (N)ACKs and reliability - - channels - - connection authentication and management - - encryption - - statistics - - **I/O** - - send, recv, poll, etc. +- Replication addresses all the underlying ECS interop, so it should be settled first. But beyond replication, Bevy need only provide one good default for protocol and I/O for the sake of completeness. I recommend dividing crates at least to the extent shown below to make it easy for developers to swap the low-level transport with [whatever][3] [alternatives][4] [they][5] [want][7]. + +| `bevy_net_replication` | `bevy_net_protocol` | `bevy_net_io` | +| -- | -- | -- | +|
  • save and restore
  • prediction
  • serialization
  • interest management
  • visual error correction
  • statistics (high-level)
|
  • (N)ACKs
  • reliability
  • virtual connections
  • channels
  • encryption
  • statistics (low-level)
|
  • send
  • recv
  • poll
| + + +[1]: https://youtu.be/JOJP0CvpB8w "Unreal Networking Features" +[2]: https://www.unrealengine.com/en-US/tech-blog/replication-graph-overview-and-proper-replication-methods "Unreal Replication Graph Plugin" +[3]: https://github.com/quinn-rs/quinn +[4]: https://partner.steamgames.com/doc/features/multiplayer +[5]: https://developer.microsoft.com/en-us/games/solutions/multiplayer/ +[6]: https://dev.epicgames.com/docs/services/en-US/Overview/index.html +[7]: https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html \ No newline at end of file diff --git a/replication_concepts.md b/replication_concepts.md index 8adf636a..533ab330 100644 --- a/replication_concepts.md +++ b/replication_concepts.md @@ -1,4 +1,8 @@ # Replication +> The goal of replication is to ensure that all of the players in the game have a consistent model of the game state. Replication is the absolute minimum problem which all networked games have to solve in order to be functional, and all other problems in networked games ultimately follow from it. - [Mikola Lysenko][1] + +[1]: https://0fps.net/2014/02/10/replication-in-networked-games-overview-part-1/ + Abstractly, you can think of a game as a pure function that accepts an initial state and player inputs and generates a new state. ```rust *state[n+1] = simulate(&state[n], &inputs[n]); @@ -127,4 +131,4 @@ When snapshots fail or hidden information is needed, the best alternative is to Determining relevance is often called **interest management** or **area of interest**. Each granular piece of state is given a "send priority" that accumulates over time and resets when sent. How quickly priority accumulates for different things is up to the developer, though physical proximity and visual salience usually have the most influence. -Eventual consistency can be combined with delta compression, but I wouldn't recommend it. It's just too much bookkeeping. Unlike snapshots, the server would have to track the latest received state for each *item* on each client separately and create diffs for each client separately. \ No newline at end of file +Eventual consistency can be combined with delta compression, but I wouldn't recommend it. It's just too much bookkeeping. Unlike snapshots, the server would have to track the latest received state for each *item* on each client separately and create diffs for each client separately. From bec485c0258d3783f1fc1775886c8a5c230d130a Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Wed, 28 Apr 2021 13:12:12 -0500 Subject: [PATCH 22/43] some tweaks --- implementation_details.md | 4 +-- networked_replication.md | 2 +- replication_concepts.md | 66 +++++++++++++++++++++------------------ 3 files changed, 37 insertions(+), 35 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 21a81f25..c600d5e7 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -144,9 +144,7 @@ Let's consider a simpler default: 3. Always rollback and re-simulate. -Now, if you're thinking that's wasteful, the "if mispredicted" gives you a false sense of security. If I make a game and claim it can rollback 250ms, that basically should mean *any* 250ms, with no stuttering. If clients *always* rollback and re-sim, it'll be easier to profile and optimize for that. As a bonus, clients never need to store old predicted states. - -Constant rollbacks may sound too expensive, but there were games with rollback running on the original Playstation over 20 years ago. 
+Now, you may think that's wasteful, but I would say "if mispredicted" gives you a false sense of security. Mispredictions can occur at any time, *especially* during long-lasting complex physics interactions. It's much easier to profile and optimize for your worst-case if clients *always* rollback and re-sim. It's also more memory-efficient, since clients never need to store old predicted states. ## Delta Compression TBD diff --git a/networked_replication.md b/networked_replication.md index cc7e1a40..ab4acee1 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -213,7 +213,7 @@ I strongly doubt that fast, efficient, and transparent replication features can | `bevy_net_replication` | `bevy_net_protocol` | `bevy_net_io` | | -- | -- | -- | -|
  • save and restore
  • prediction
  • serialization
  • interest management
  • visual error correction
  • statistics (high-level)
|
  • (N)ACKs
  • reliability
  • virtual connections
  • channels
  • encryption
  • statistics (low-level)
|
  • send
  • recv
  • poll
| +|
  • save and restore
  • prediction
  • serialization
  • interest management
  • visual error correction
  • lag compensation
  • statistics (high-level)
|
  • (N)ACKs
  • reliability
  • virtual connections
  • channels
  • encryption
  • statistics (low-level)
|
  • send
  • recv
  • poll
| [1]: https://youtu.be/JOJP0CvpB8w "Unreal Networking Features" diff --git a/replication_concepts.md b/replication_concepts.md index 533ab330..042a5a32 100644 --- a/replication_concepts.md +++ b/replication_concepts.md @@ -1,46 +1,43 @@ # Replication > The goal of replication is to ensure that all of the players in the game have a consistent model of the game state. Replication is the absolute minimum problem which all networked games have to solve in order to be functional, and all other problems in networked games ultimately follow from it. - [Mikola Lysenko][1] -[1]: https://0fps.net/2014/02/10/replication-in-networked-games-overview-part-1/ +--- Abstractly, you can think of a game as a pure function that accepts an initial state and player inputs and generates a new state. ```rust -*state[n+1] = simulate(&state[n], &inputs[n]); +let new_state = simulate(&state, &inputs); ``` Fundamentally, if several players want to perform a synchronized simulation over a network, they have basically two options: -**Active replication** -- Send their inputs to each other and independently and deterministically simulate the game. -- also called lockstep, state-machine synchronization, and "determinism" - -**Passive replication** -- Send their inputs to a single machine (the server) who simulates the game and broadcasts updates back. -- also called client-server, primary-backup, master-slave, and "state transfer" +- Send their inputs to each other and independently and deterministically simulate the game. + -
also known as: active replication, lockstep, state-machine synchronization, determinism
+- Send their inputs to a single machine (the server) who simulates the game and broadcasts updates back. + -
also known as: passive replication, client-server, primary-backup, state transfer
In other words, players can either run the "real" game or follow it. -Although the distributed computing terminology is probably more useful, for the rest of this RFC, I'll refer to active and passive replication as determinism and state transfer, respectively. They're more commonly used in the gamedev context. +For the rest of this RFC, I'll refer to them as determinism and state transfer, respectively. I just think they're the most literal terminology. ## Why determinism? -Determinism is straightforward. It's basically local multiplayer but with really long, sometimes ocean-spanning controller cables. The netcode is virtually independent from the gameplay code, it simply supplies the inputs. +Deterministic multiplayer is basically local multiplayer but with *really* long controller cables. The netcode simply supplies the gameplay code with inputs. They're basically decoupled. Determinism has low infrastructure costs, both in terms of bandwith and server hardware. All steady-state network traffic is input, which is not only small but also compresses well. (Note that as player count increases, there *is* a crossover point where state transfer becomes more efficient). Likewise, as the game runs completely on the clients, there's no need to rent powerful servers. Relays are still handy for efficiently managing rooms and scaling to higher player counts, but those could be cheap VPS instances. -Determinism is also tamperproof. It's impossible to do anything like speedhack or teleport as running these exploits would simply cause cheaters to desync. On the other hand, determinism inherently leaks all information. +Determinism is also tamperproof. It's impossible to do anything like speedhack or teleport as running these exploits would simply cause cheaters to desync. On the other hand, determinism inherently suffers from total information leakage. -The biggest strength of determinism is also its biggest limitation: every client must run the *entire* game. While this works well for games with thousands of micro-managed entities like *Starcraft 2*, you won't be seeing games with expansive worlds like *Genshin Impact* networked this way any time soon. +That every client must run the *entire* world is also determinism's biggest limit. While this works well for games with thousands of micro-managed entities like *Starcraft 2*, you won't be seeing games with expansive worlds like *Genshin Impact* networked this way any time soon. ## Why state transfer? Determinism is awesome when it fits but it's generally unavailable. Neither Godot nor Unity nor Unreal can make this guarantee for large parts of their engines, particularly physics. -Whenever you can't have or don't want bit-perfect determinism, you use state transfer. +Whenever you can't have or don't want determinism, you should use state transfer. -The key idea behind state transfer is the concept of **authority**. It's essentially ownership in Rust. Those who own state are responsible for broadcasting up-to-date information about it. I sometimes see authority divided into *input* authority (control permission) and *state* authority (write permission), but usually authority means state authority. +Its main underlying idea is **authority**, which is just like ownership in Rust. Those who own state are responsible for broadcasting up-to-date information about it. I sometimes see authority divided into *input* authority (control permission) and *state* authority (write permission), but usually authority means state authority. 
The server usually owns everything, but authority is very flexible. In games like *Destiny* and *Fall Guys*, clients own their movement state. Other games even trust clients to confirm hits. Distributing authority like this adds complexity and obviously leaves the door wide open for cheaters, but sometimes it's necessary. In VR, it makes sense to let clients claim and relinquish authority over interactable objects. ## Why not messaging patterns? -The only other strategy you really see used for replication is messaging. Like RPCs or remote events. Not sure why, but it's what I most often see people try the first time. +The only other strategy you really see used for replication is messaging. RPCs. I actually see these most often in the free asset space. (I guess it's the go-to pattern outside of games?) Take chess for example. Instead of sending polled player inputs or the state of the chessboard, you could just send the moves like "white, e2 to e4," etc. @@ -58,32 +55,35 @@ for message in queue.iter() { // applied and applied in the right order. *state[n+1] = s; ``` -How do you even build prediction and reconciliation out of this? - -Messages are the right tool for when you really do want explicit request-reply interactions or for global alerts like players joining or leaving. They just don't cut it as a general tool. Even if you were to avoid littering send and receive calls everywhere (i.e., collect and send in batches), messages don't compress as well as inputs or state. +Messages are great for when you want explicit request-reply interactions and global alerts like players joining or leaving. They just don't cut it as a replication mechanism for real-time games. Even if you avoided send and receive calls everywhere (i.e., collect and send in batches), messages don't compress as well as inputs or state. # Latency -Networking a game simulation so that players who live in different locations can play together is an unintuitive problem. No matter how we physically connect their computers, they most likely won't be able to exchange data within one simulation step. +Networking is hard because we want to let players who live in different countries play together *at the same time*, something that special relativity tells us is [strictly impossible][2]... unless we cheat. -## Lockstep -The simplest form of online multiplayer is lockstep. All clients simply block until they have everything needed to execute the next simulation step. This delay is fine for most turn-based games but feels awful for real-time games. +### Lockstep +The simplest solution is to concede to the universe with grace and have players stall until they've received whatever data they need to execute the next simulation step. Blocking is fine for most turn-based games but it just doesn't cut it for real-time games. + +### Adding Local Input Delay +The first trick we can pull is have each player delay their own input for a bit, trading responsiveness for more time to receive the incoming data. -## Local Input Delay -A partial solution is for each client to delay the local player input for some number of simulation steps, trading a small amount of responsiveness for more time to receive remote info. Doing this also reduces the perceived latency between players. Under stable network conditions, the game will run smoothly, but it still stutters when the window is missed. +Our brains are pretty lenient about this, so we can actually *reduce* the latency between players. 
Two players in a 1v1 match actually could experience simultaneity if each delayed their input by half the round-trip time. + +This trick has powered the RTS genre for decades. With a large enough input delay and a stable connection, the game will run smoothly. However, there's still a problem because the game stutters whenever the window is missed. This leads to the next trick. > determinism + lockstep + local input delay = delay-based netcode -## Predict and Reconcile -A more elegant way to hide the input latency is local prediction. +### Predict-Reconcile +Instead of blocking, what if players just guess the missing data and keep going? Doing that would let us avoid stuttering, but then we'd have to deal with guessing incorrectly. -Instead of blocking, clients can substitute any missing information with reasonable guesses (often reusing the previous value) and just run the simulation. Guessing removes the need to wait, removing perceived input lag, but what if the guesses are wrong? +Well, when the player finally has that missing remote data, what they can do is restore their simulation to the previous verified state, update it with the received data, and then re-predict the remaining steps. -Well, what a client can do later is restore its simulation to the last verified state and redo the mispredicted steps with the correct info. +This retroactive correction is called **rollback** or **reconciliation**, and it ensures that players never desync *too much*. Honestly, it's practically invisible with a high tick rate and good visual smoothing. (Apparently it's been around since [1996][3].) -This retroactive correction is called **rollback** or **reconciliation** and with a high simulation rate and good visual smoothing, it's practically invisible. Adding local input delay reduces the amount of rollback. +With prediction, input delay is no longer needed, but it's still useful. Reducing latency reduces how many steps players need to re-simulate. -> determinism + predict-rollback + local input delay (optional) = rollback netcode +> determinism + predict-rollback + local input delay (optional) = rollback netcode +### Selective Prediction Once again, determinism is an all or nothing deal. If you predict, you predict everything. State transfer has the flexibility to predict only *some* things, letting you offload expensive systems onto the server. Games like *Rocket League* still predict everything, including other clients (the server re-distributes their inputs along with game state so that this is more accurate). However, most games choose not to do this. It's more common for clients to predict only what they control and interact with. @@ -131,4 +131,8 @@ When snapshots fail or hidden information is needed, the best alternative is to Determining relevance is often called **interest management** or **area of interest**. Each granular piece of state is given a "send priority" that accumulates over time and resets when sent. How quickly priority accumulates for different things is up to the developer, though physical proximity and visual salience usually have the most influence. -Eventual consistency can be combined with delta compression, but I wouldn't recommend it. It's just too much bookkeeping. Unlike snapshots, the server would have to track the latest received state for each *item* on each client separately and create diffs for each client separately. +Eventual consistency can be combined with delta compression, but I wouldn't recommend it. 
Many AAA games have done it, but IMO it's just too much bookkeeping. Unlike snapshots, the server would have to track the latest received state for each *item* on each client separately and create diffs for each client separately. + +[1]: https://0fps.net/2014/02/10/replication-in-networked-games-overview-part-1/ +[2]: https://en.wikipedia.org/wiki/Relativity_of_simultaneity +[3]: https://en.wikipedia.org/wiki/Client-side_prediction \ No newline at end of file From e4a71b6c48c39f7078e7e423076de1dcaec681a2 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 1 May 2021 07:15:27 -0500 Subject: [PATCH 23/43] moved snapshot and interest management stuff into implementation_details.md Worded more generically until I have a working implementation. --- implementation_details.md | 28 ++++++++++---- networked_replication.md | 77 +++++++++++++++++++-------------------- replication_concepts.md | 25 +++++++------ 3 files changed, 72 insertions(+), 58 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index c600d5e7..4f62881a 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -85,7 +85,7 @@ The naive solution is to have clients spawn dummy entities. When an update that A better solution is for the server to assign each networked entity a global ID (`NetworkID`) that the spawning client can predict and map to its local instance. -- The simplest form of this would be an incrementing generational index whose upper bits are fixed to match the spawning player's ID. This is my recommendation. Basically, reuse `Entity` and reserve some of the upper bits in the ID. +- The simplest form of this would be an incrementing generational index whose upper bits are fixed to match the spawning player's ID. This is my recommendation. Basically, reuse `Entity` and reserve some of the upper bits in its ID. - Alternatively, PRNGs could be used to generate shared keys (called "prediction keys" in some places) for pairing global and local IDs. Rather than predict the global ID, the client would predict the shared key. Server updates that confirm the predicted entity would include both its global ID and the shared key, which the client can then use to pair the IDs. This method adds complexity but bypasses the previous method's implicit entity limit. @@ -130,6 +130,8 @@ There's a lot to learn from *Overwatch* here. For clients with too-high ping, their interpolation will lag far behind their prediction. If you only compensate up to a limit (e.g. 200ms), [those clients will have to extrapolate the difference](https://youtu.be/W3aieHjyNvw?t=2347). Doing nothing is also valid, but lagging clients would abruptly have to start leading their targets. +When a player is parented to another entity, which they have no control over (e.g. the player is a passenger in a vehicle), the non-predicted movement of that parent must be rewound during compensation to spawn any projectiles fired by the player in the correct location. + ## Unconditional Rollbacks Every article on "rollback netcode" and "client-side prediction and server reconciliation" encourages having clients compare their predicted state to the authoritative state and reconciling *if* they mispredicted. But how do you actually detect a mispredict? @@ -146,13 +148,25 @@ Let's consider a simpler default: Now, you may think that's wasteful, but I would say "if mispredicted" gives you a false sense of security. 
Mispredictions can occur at any time, *especially* during long-lasting complex physics interactions. It's much easier to profile and optimize for your worst-case if clients *always* rollback and re-sim. It's also more memory-efficient, since clients never need to store old predicted states. -## Delta Compression -TBD - -## Interest Management -TBD +## Delta-Compressed Snapshots +- The server keeps an incrementally updated copy of the networked state. + - Components are stored with their global ID instead of the local ID. +- The server keeps a ring buffer of "patches" for the last `N` snapshots. +- At the end of every `NetworkFixedUpdate`, the server iterates `Changed` and `Removed`, then: + - Generates the latest patch as the copy `xor` changes. + - Applies the changes to the copy and pushes the latest patch into the ring buffer. + - `Xors` older patches with the latest patch to update them. +- The server reads the needed patches as `&[u8]` (or `&[u64]`) and compresses them using run-length encoding (RLE) or similar. + - No "serialization" needed. If networked DSTs are stored in their own heap allocation, we can literally send the bits. `rkyv` is a good reference (relative pointers). +- Pass compressed payloads to protocol layer. +- Protocol and I/O layers do whatever they do and send the packet. + +## Interest Managed Updates +TODO ## Messages +TODO + Messages are best for sending global alerts and any gameplay mechanics you explicitly want modeled as request-reply (or one-way) interactions. They can be unreliable or reliable. You can also postmark messages to be executed on a certain tick like inputs. That can only be best effort, though. -TBD \ No newline at end of file +The example I'm thinking of is buying items from an in-game vendor. The server doesn't simulate UI, but ideally we can write the message transaction in the same system. A macro might end up being the most ergonomic choice. \ No newline at end of file diff --git a/networked_replication.md b/networked_replication.md index ab4acee1..02711b7c 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -81,8 +81,8 @@ fn expensive_physics_calculation( } ``` -``` -TODO: Message Example +```rust +// TODO: Message Example ``` Bevy can configure an `App` to operate in several different network modes. @@ -104,16 +104,17 @@ Bevy can configure an `App` to operate in several different network modes. ## Implementation Strategy [Link to more in-depth implementation details (more of an idea dump atm).](../main/implementation_details.md) -### What is required? -- In order for servers to send state to clients, `ComponentId` should be stable. -- `World` should be able to reserve an `Entity` ID range, with separate metadata. - - If merged, [#16](https://github.com/bevyengine/rfcs/pull/16) can probably be used to handle this cleanly. +### Requirements +- `ComponentId` (and maybe the other `*Ids`) should be stable between clients and the server. +- Must have a means to isolate networked and non-networked state. + - `World` should be able to reserve an `Entity` ID range, with separate storage metadata. + - (If merged, [#16](https://github.com/bevyengine/rfcs/pull/16) could probably be used for this). + - Entities must be born (non-)networked. They cannot become (non-)networked. + - Networked entities must have a `NetworkID` component at minimum. + - Networked components and resources must only contain or reference networked data. + - Networked components must only be mutated inside `NetworkFixedUpdate`. 
- The ECS scheduler should support nested loops. - (I'm pretty sure this isn't an actual blocker, but the workaround feels a little hacky.) -- Replicable components should only be mutated inside `NetworkFixedUpdate`. -- Networked entities should have a `NetworkID` component at minimum. Could be auto-added. -- Adding replicable components to non-networked entities should be avoided unless client-authoritative. - ### The Replicate Trait ```rust // TODO @@ -125,15 +126,21 @@ impl Replicate for T { ### Specialized Change Detection ```rust // TODO -// Predicted -// Confirmed +// Predicted (+ Added and Removed variants) +// Set when mutated by client. Cleared when mutated by server update. +// Confirmed (+ Added and Removed variants) +// Set when mutated by server update. Cleared when mutated by client. // Cancelled -// Added and Removed variants for each also +// ???? ``` ### Rollback via Run Criteria ```rust -// TODO +/* +TODO +The "outer" loop is the number of fixed update steps as determined by the fixed timestep accumulator. +The "inner" loop is the number of steps to re-simulate. +*/ ``` ### NetworkFixedUpdate @@ -153,25 +160,18 @@ Server Everything aside from the simulation steps could be auto-generated. ### Saving Game State -- At the end of each fixed update, server iterates `Changed` and `Removed` for all replicable components and duplicates them to an isolated collection of `SparseSet`. - - You could pass this "read-only" copy to another thread to do the remaining work. - - Tables would be rebuilt when restoring. +- At the end of each fixed update, server iterates `Changed` and `Removed` for all replicable components and duplicates them to an isolated copy. + - Could pass this copy to another thread to do the serialization and compression. + - This copy has no `Table`, those would be rebuilt by the client. ### Preparing Server Packets -- Snapshots (full state updates) will use delta compression. - - Server keeps a ring buffer of patches for the last `N` snapshots. - - Server computes the latest patch by xor'ing the copy and the latest changes (before applying them) and pushes it into the ring buffer. The servers also updates the earlier patches by xor'ing them with the latest patch. (The xor'ing is basically a pre-compression step that produces long zero chains with high probability.) - - Server compresses whichever patches clients need and hands them off to the protocol layer. (The same patch can be sent to multiple clients, so it scales pretty well.) - -- Eventual consistency (partial state updates) will use interest management. - - Entities accrue send priority over time. Maybe we can use the magnitude of component changes as the base amount to accrue. - - Server runs users-defined rules for gameplay relevance. - - Server runs collision detection to prioritize physical entities inside each client's area of interest. - - Server writes the payload for each client and hands them off to the protocol layer. +- Snapshots (full state updates) will use delta compression and manual fragmentation. +- Eventual consistency (partial state updates) will use interest management. +- Both will most likely use the same data structure. ### Restoring Game State -- At the beginning of each fixed update, the client decompresses the received update and writes its changes to the appropriate `SparseSet` collection (several will be buffered). -- Client then uses this updated collection to write the prediction copy that has all the tables and non-replicable components. 
+- At the beginning of each fixed update, the client decodes the received update and generates the latest authoritative state. +- Client then uses this state to write its local prediction copy that has all the tables and non-replicable components. ## Drawbacks - Lots of potentially cursed macro magic. @@ -192,28 +192,25 @@ People who want to make multiplayer games want to focus on designing their game It'll only grow more difficult to add these features as time goes on. Take Unity for example. Its built-in features are too non-deterministic and its only working solutions for state transfer are paid third-party assets. Thus far, said assets cannot integrate deeply enough to be transparent (at least not without substituting parts of the engine). ### Why does this need to involve `bevy_ecs`? -I strongly doubt that fast, efficient, and transparent replication features can be implemented without directly manipulating a `World` and its component storages. +I strongly doubt that fast, efficient, and transparent replication features can be implemented without directly manipulating a `World` and its component storages. We may need to allocate memory for networked data separately. ## Unresolved Questions -- What components and resources can't be serialized? -- Is there a better way to isolate replicable and non-replicable entities? - Can we provide lints for undefined behavior like mutating networked state outside of `NetworkFixedUpdate`? -- Does rolling back break existing change detection or events? -- When sending partial state updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed? -- How should UI widgets interact with networked state? Exclusively poll verified data? -- How should we deal with correcting mispredicted events and FX? -- Can we replicate animations exactly without explicitly sending animation parameters? +- Do rollbacks break change detection or events? +- ~~When sending partial state updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed?~~ Already solved by generational indexes. +- How should UI widgets interact with networked state? React to events? Exclusively poll verified data? +- How should we handle correcting mispredicted events and FX? +- Can we replicate animations exactly without explicitly sending animation data? ## Future Possibilities - - With some tools to visualize game state diffs, these replication systems could help detect non-determinism in other parts of the engine. - Much like how Unreal has Fortnite, Bevy could have an official (or curated) collection of multiplayer samples to dogfood these features. - Bevy's future editor could automate most of the configuration and annotation. - Replication addresses all the underlying ECS interop, so it should be settled first. But beyond replication, Bevy need only provide one good default for protocol and I/O for the sake of completeness. I recommend dividing crates at least to the extent shown below to make it easy for developers to swap the low-level transport with [whatever][3] [alternatives][4] [they][5] [want][7]. -| `bevy_net_replication` | `bevy_net_protocol` | `bevy_net_io` | +| `bevy::net::replication` | `bevy::net::protocol` | `bevy::net::io` | | -- | -- | -- | -|
  • save and restore
  • prediction
  • serialization
  • interest management
  • visual error correction
  • lag compensation
  • statistics (high-level)
|
  • (N)ACKs
  • reliability
  • virtual connections
  • channels
  • encryption
  • statistics (low-level)
|
  • send
  • recv
  • poll
| +|
  • save and restore
  • prediction
  • serialization
  • delta compression
  • interest management
  • visual error correction
  • lag compensation
  • statistics (high-level)
|
  • (N)ACKs
  • reliability
  • virtual connections
  • channels
  • encryption
  • statistics (low-level)
|
  • send
  • recv
  • poll
| [1]: https://youtu.be/JOJP0CvpB8w "Unreal Networking Features" diff --git a/replication_concepts.md b/replication_concepts.md index 042a5a32..d7bbde1e 100644 --- a/replication_concepts.md +++ b/replication_concepts.md @@ -25,7 +25,7 @@ Determinism has low infrastructure costs, both in terms of bandwith and server h Determinism is also tamperproof. It's impossible to do anything like speedhack or teleport as running these exploits would simply cause cheaters to desync. On the other hand, determinism inherently suffers from total information leakage. -That every client must run the *entire* world is also determinism's biggest limit. While this works well for games with thousands of micro-managed entities like *Starcraft 2*, you won't be seeing games with expansive worlds like *Genshin Impact* networked this way any time soon. +That every client must run the *entire* world is also determinism's biggest limit. While this works well for games with thousands of micro-managed entities like *Starcraft 2*, you won't be seeing games with expansive worlds like *Genshin Impact* networked this way anytime soon. ## Why state transfer? Determinism is awesome when it fits but it's generally unavailable. Neither Godot nor Unity nor Unreal can make this guarantee for large parts of their engines, particularly physics. @@ -61,7 +61,7 @@ Messages are great for when you want explicit request-reply interactions and glo Networking is hard because we want to let players who live in different countries play together *at the same time*, something that special relativity tells us is [strictly impossible][2]... unless we cheat. ### Lockstep -The simplest solution is to concede to the universe with grace and have players stall until they've received whatever data they need to execute the next simulation step. Blocking is fine for most turn-based games but it just doesn't cut it for real-time games. +The simplest solution is to concede to the universe with grace and have players stall until they've received whatever data they need to execute the next simulation step. Blocking is fine for most turn-based games but simply doesn't cut it for real-time games. ### Adding Local Input Delay The first trick we can pull is have each player delay their own input for a bit, trading responsiveness for more time to receive the incoming data. @@ -70,9 +70,9 @@ Our brains are pretty lenient about this, so we can actually *reduce* the latenc This trick has powered the RTS genre for decades. With a large enough input delay and a stable connection, the game will run smoothly. However, there's still a problem because the game stutters whenever the window is missed. This leads to the next trick. -> determinism + lockstep + local input delay = delay-based netcode +> determinism + lockstep + local input delay = "delay-based netcode" -### Predict-Reconcile +### Predict-Rollback Instead of blocking, what if players just guess the missing data and keep going? Doing that would let us avoid stuttering, but then we'd have to deal with guessing incorrectly. Well, when the player finally has that missing remote data, what they can do is restore their simulation to the previous verified state, update it with the received data, and then re-predict the remaining steps. @@ -81,16 +81,19 @@ This retroactive correction is called **rollback** or **reconciliation**, and it With prediction, input delay is no longer needed, but it's still useful. Reducing latency reduces how many steps players need to re-simulate. 
-> determinism + predict-rollback + local input delay (optional) = rollback netcode +> determinism + predict-rollback + local input delay (optional) = "rollback netcode" ### Selective Prediction -Once again, determinism is an all or nothing deal. If you predict, you predict everything. +Determinism is an all or nothing deal. If you predict, you predict everything. + +State transfer has the flexibility to predict only *some* things, letting you offload expensive computations onto the server. There *are* client-server games like *Rocket League* who still predict everything (FWIW deterministic predict-rollback would have been a better fit), including other clients—the server redistributes inputs along with game state to reduce error. However, most often clients only predict what they control directly. -State transfer has the flexibility to predict only *some* things, letting you offload expensive systems onto the server. Games like *Rocket League* still predict everything, including other clients (the server re-distributes their inputs along with game state so that this is more accurate). However, most games choose not to do this. It's more common for clients to predict only what they control and interact with. # Visual Consistency -**tl;dr**: Hard snap the simulation state and subtly blend the view. Time travel if needed. -## Smooth Rendering and Lag Compensation + +Real quick, always hard snap the simulation state. If clients do any blending, it's entirely visual. Yes, this does mean that entities may appear in different positions from where they should be. On the other hand, we have to honor this inaccurate view to keep players happy. + +### Smooth Rendering and Lag Compensation Predicting only *some* things adds implementation complexity. @@ -101,7 +104,7 @@ Gameplay-wise, not predicting everything also divides entities between two point Visually, we'll often have to blend between extrapolated and authoritative data. Simply interpolating between two authoritative updates is incorrect. The visual state can and will accrue errors, but that's what we want. Those can be tracked and smoothly reduced (to some near-zero threshold, then cleared). # Bandwidth -## How much can we fit into each packet? +### How much can we fit into each packet? Not a lot. You can't send arbitrarily large packets over the internet. The information superhighway has load limits. The conservative, almost universally supported "maximum transmissible unit" or MTU is 1280 bytes. Accounting for IP and UDP headers and some connection metadata, you realistically can send ~1200 bytes of game data per packet. @@ -110,7 +113,7 @@ If you significantly exceed this, some random stop along the way will delay the [Fragmentation](https://packetpushers.net/ip-fragmentation-in-detail/) [sucks](https://blog.cloudflare.com/ip-fragmentation-is-broken) because it multiplies the likelihood of the overall packet being lost (all fragments have to arrive to read the full packet). Getting fragmented along the way is even worse because of the added delay. It's okay if the sender manually fragments their packet (like 2 or 3) *upfront*, although the higher loss does limit simulation rate, just don't rely on the internet to do it. -## Okay, but that doesn't seem like much? +### Okay, but that doesn't seem like much? Well, there are two more reasons not to yeet giant 100kB packets across the network: - Bandwidth costs are the lion's share of hosting expenses. - Many players still have limited bandwidth. 
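To make that budget concrete, here is a rough back-of-the-envelope figure (assuming the ~1200 bytes of usable payload mentioned above and one packet per tick at 60Hz; the numbers are purely illustrative):

```rust
// Illustrative arithmetic only: payload size and send rate are assumptions.
const PAYLOAD_BYTES: u64 = 1200; // usable game data per packet (stays under MTU)
const SEND_RATE_HZ: u64 = 60;    // one packet per simulation tick

fn main() {
    let per_client_bps = PAYLOAD_BYTES * SEND_RATE_HZ * 8; // 576,000 bits/s
    let per_client_kbps = per_client_bps / 1000;           // 576 kbps per client
    let upstream_mbps_32_players = per_client_bps * 32 / 1_000_000; // ~18 Mbps total
    println!(
        "{} kbps per client, ~{} Mbps upstream for 32 clients",
        per_client_kbps, upstream_mbps_32_players
    );
}
```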
From cdd0586a5cbea12477607801fa96ef616068e6cf Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 1 May 2021 15:38:31 -0500 Subject: [PATCH 24/43] lints --- implementation_details.md | 24 ++++++++++++++---- networked_replication.md | 48 +++++++++++++++++++++++++----------- replication_concepts.md | 51 +++++++++++++++++++++++++++------------ 3 files changed, 88 insertions(+), 35 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 4f62881a..6c88e310 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -1,9 +1,12 @@ # Implementation Details + ## `Connection` != `Player` + I know I've been using the terms "client" and "player" somewhat interchangeably, but `Connection` and `Player` should be separate tokens. There's no benefit in forcing one player per connection. Having `Player` be its own thing makes it easier to do stuff like online splitscreen, temporarily fill team slots with bots, etc. ## "Clock" Synchronization + Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. People often try to have clients estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step, but that's overly complex. What we really care about is: How much time passes between when the server receives my input and when that input is consumed? If the server simply tells clients how long their inputs are waiting in its buffer, the clients can use that information to converge on the correct lead. @@ -58,10 +61,12 @@ interp_time = max(interp_time, predicted_time - max_lag_comp) The key idea here is that simplifying the client-server relationship makes the problem easier. You *could* have the server apply inputs whenever they arrive, rolling back if necessary, but that would only complicate things. If the server never accepts late inputs and never changes its pace, no one needs to coordinate. ## Prediction <-> Interpolation + Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. One obvious implementation is to literally fork the latest authoritative state. If copying the full state ends up being too expensive, we can probably use a copy-on-write layer. My current idea to shift components between prediction and interpolation is to default to interpolated (reset upon receiving a server update) and then use specialized change detection `DerefMut` magic to flag as predicted. -``` + +```rust Predicted PredictAdded PredictRemoved @@ -72,6 +77,7 @@ Cancelled CancelAdded CancelRemoved ``` + Everything is predicted by default, but users can opt-out by filtering on `Predicted`. In the more conservative cases, clients would predict the entities driven by their input, the entities they spawn (until confirmed), and any entities mutated as a result of the first two. Systems with filtered queries (i.e. physics, path-planning) should typically run last. We can also use these filters to generate events that only trigger on authoritative changes and events that trigger on predicted changes to be confirmed or cancelled later. The latter are necessary for handling sounds and particle effects. Those shouldn't be duplicated during rollbacks and should be faded out if mispredicted. @@ -79,6 +85,7 @@ We can also use these filters to generate events that only trigger on authoritat Should UI be allowed to reference predicted state or only verified state? 
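As a rough sketch of how systems might consume these filters (the `Predicted`/`Confirmed`/`Cancelled` query filters are the hypothetical markers described above, and `PlayerInput`, `HitImpact`, and the FX helpers are invented for illustration; none of this is an existing Bevy API):

```rust
// Sketch only: every type and helper below is hypothetical.
const MOVE_SPEED: f32 = 8.0;

// Gameplay: an expensive system can restrict itself to locally predicted
// state by filtering on `Predicted`.
fn predicted_movement(mut query: Query<(&PlayerInput, &mut Rigidbody), Predicted<Rigidbody>>) {
    for (input, mut body) in query.iter_mut() {
        body.velocity = input.move_direction * MOVE_SPEED;
    }
}

// Presentation: spawn hit effects only once the server confirms them, so
// rollbacks can't duplicate them, and fade out anything that was cancelled.
fn hit_fx(
    confirmed: Query<&HitImpact, Confirmed<HitImpact>>,
    cancelled: Query<&HitImpact, Cancelled<HitImpact>>,
) {
    for impact in confirmed.iter() {
        spawn_impact_fx(impact);
    }
    for impact in cancelled.iter() {
        fade_out_impact_fx(impact);
    }
}
```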
## Predicting Entity Creation + This requires some special consideration. The naive solution is to have clients spawn dummy entities. When an update that confirms the result arrives, clients can simply destroy the dummy and spawn the true entity. IMO this is a poor solution because it prevents clients from smoothly blending these entities from predicted time into interpolated time. It won't look right. @@ -92,6 +99,7 @@ A better solution is for the server to assign each networked entity a global ID - A more extreme solution would be to somehow bake global IDs directly into the memory allocation. If memory layouts are mirrored, relative pointers become global IDs, which don't need to be explicitly written into packets. This would save 4-8 bytes per entity before compression. ## Smooth Rendering + Rendering should come after `NetworkFixedUpdate`. Whenever clients receive an update with new remote entities, those entities shouldn't be rendered until that update is interpolated. @@ -103,13 +111,14 @@ We'll also need to distinguish instant motion from integrated motion when interp Is an exponential decay enough for smooth error correction or are there better algorithms? ## Lag Compensation + Lag compensation deals with colliders. To avoid weird outcomes, lag compensation needs to run after all motion and physics systems. Again, people often imagine having the server estimate what interpolated state the client was looking at based on their RTT, but we can resolve this without any guesswork. Clients can just tell the server what they were looking at by bundling the interpolated tick numbers and the blend value inside the input payloads. With this information, the server can reconstruct *exactly* what each client saw. -``` +```plaintext tick number (predicted) tick number (interpolated from) @@ -119,6 +128,7 @@ interpolation blend value ``` So there are two ways to go about the actual compensation: + - Compensate upfront by bringing new projectiles into the present (similar to a rollback). - Compensate over time ("amortized"), constantly testing projectiles against the history buffer. @@ -133,6 +143,7 @@ For clients with too-high ping, their interpolation will lag far behind their pr When a player is parented to another entity, which they have no control over (e.g. the player is a passenger in a vehicle), the non-predicted movement of that parent must be rewound during compensation to spawn any projectiles fired by the player in the correct location. ## Unconditional Rollbacks + Every article on "rollback netcode" and "client-side prediction and server reconciliation" encourages having clients compare their predicted state to the authoritative state and reconciling *if* they mispredicted. But how do you actually detect a mispredict? I thought of two methods while I was writing this: @@ -140,7 +151,7 @@ I thought of two methods while I was writing this: 1. Unordered scan looking for first difference. 2. Ordered scan to compute checksum and compare. -The first option has an unpredictable speed. The second option requires a fixed walk of the game state (checksums *are* probably worth having even if only for debugging non-determinism). There may be options I didn't consider, but the point I'm trying to make is that detecting changes among large numbers of entities isn't cheap. +The first option has an unpredictable speed. The second option requires a fixed walk of the game state (checksums *are* probably worth having even if only for debugging non-determinism). 
There may be options I didn't consider, but the point I'm trying to make is that detecting changes among large numbers of entities isn't cheap. Let's consider a simpler default: @@ -149,6 +160,7 @@ Let's consider a simpler default: Now, you may think that's wasteful, but I would say "if mispredicted" gives you a false sense of security. Mispredictions can occur at any time, *especially* during long-lasting complex physics interactions. It's much easier to profile and optimize for your worst-case if clients *always* rollback and re-sim. It's also more memory-efficient, since clients never need to store old predicted states. ## Delta-Compressed Snapshots + - The server keeps an incrementally updated copy of the networked state. - Components are stored with their global ID instead of the local ID. - The server keeps a ring buffer of "patches" for the last `N` snapshots. @@ -161,12 +173,14 @@ Now, you may think that's wasteful, but I would say "if mispredicted" gives you - Pass compressed payloads to protocol layer. - Protocol and I/O layers do whatever they do and send the packet. -## Interest Managed Updates +## Interest-Managed Updates + TODO ## Messages + TODO Messages are best for sending global alerts and any gameplay mechanics you explicitly want modeled as request-reply (or one-way) interactions. They can be unreliable or reliable. You can also postmark messages to be executed on a certain tick like inputs. That can only be best effort, though. -The example I'm thinking of is buying items from an in-game vendor. The server doesn't simulate UI, but ideally we can write the message transaction in the same system. A macro might end up being the most ergonomic choice. \ No newline at end of file +The example I'm thinking of is buying items from an in-game vendor. The server doesn't simulate UI, but ideally we can write the message transaction in the same system. A macro might end up being the most ergonomic choice. diff --git a/networked_replication.md b/networked_replication.md index 02711b7c..0a8f2a45 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -6,7 +6,7 @@ This RFC proposes an implementation of engine features for developing networked ## Motivation -Networking is unequivocally the most lacking feature in all general-purpose game engines. +Networking is unequivocally the most lacking feature in all general-purpose game engines. While most engines provide low-level connectivity—virtual connections, optionally reliable UDP channels, rooms—almost none of them ([except][1] [Unreal][2]) provide high-level *replication* features like prediction, interest management, or lag compensation, which are necessary for most networked multiplayer games. @@ -15,6 +15,7 @@ This broad absence of first-class replication features stifles creative ambition Bevy's ECS opens up the possibility of providing a near-seamless, generalized networking API. What I hope to explore in this RFC is: + - What game design choices and constraints does networking add? - How does ECS make networking easier to implement? - What should developing a networked multiplayer game in Bevy look like? @@ -29,7 +30,7 @@ As a user, you only have to annotate your gameplay-related components and system > Game design should (mostly) drive networking choices. Future documentation could feature a questionnaire to guide users to the correct configuration options for their game. Genre and player count are generally enough to decide. -The core primitive here is the `Replicate` trait. 
All instances of components and resources that implement this trait will be automatically detected and synchronized over the network. Simply adding a `#[derive(Replicate)]` should be enough in most cases. +The core primitive here is the `Replicate` trait. All instances of components and resources that implement this trait will be automatically detected and synchronized over the network. Simply adding a `#[derive(Replicate)]` should be enough in most cases. ```rust #[derive(Replicate)] @@ -48,6 +49,7 @@ struct Health { hp: u32, } ``` + By default, both client and server will run every system you add to `NetworkFixedUpdate`. If you want systems or code snippets to run exclusively on one or the other, you can annotate them with `#[client]` or `#[server]` for the compiler. ```rust @@ -62,6 +64,7 @@ fn ball_movement_system( ``` For more nuanced runtime cases—say, an expensive movement system that should only process the local player entity on clients—you can use the `Predicted` query filter. If you need an explicit request or notification, you can use `Message` variants. + ```rust fn update_player_velocity( mut q: Query<(&Player, &mut Rigidbody)>) @@ -89,22 +92,22 @@ Bevy can configure an `App` to operate in several different network modes. | Mode | Playable? | Authoritative? | Open to connections? | | :--- | :---: | :---: | :---: | -| Client | ✓ | ✗ | ✗ | -| Standalone | ✓ | ✓ | ✗ | -| Listen Server | ✓ | ✓ | ✓ | +| Client | ✓ | ✗ | ✗ | +| Standalone | ✓ | ✓ | ✗ | +| Listen Server | ✓ | ✓ | ✓ | | Dedicated Server | ✗ | ✓ | ✓ | | Relay | ✗ | ✗ | ✓ | -
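The example `App` configuration below is still marked TODO. In the meantime, here is one possible shape for it: `ReplicationPlugin`, `NetworkConfig`, and `NetworkMode` are names invented for this sketch, and the fields simply mirror the knobs discussed in this RFC (mode, tick rate, send intervals, prediction and rollback window, lag compensation window).

```rust
// Hypothetical sketch only; none of these networking types exist yet.
use bevy::prelude::*;

fn main() {
    App::build()
        .add_plugins(DefaultPlugins)
        // One plugin wires up NetworkFixedUpdate and the replication systems.
        .add_plugin(ReplicationPlugin)
        // One resource captures the per-game choices.
        .insert_resource(NetworkConfig {
            mode: NetworkMode::ListenServer,
            max_players: 32,
            simulation_tick_rate_hz: 60,
            server_send_interval_ticks: 2,
            prediction: true,
            rollback_window_ms: 250,
            lag_compensation_window_ms: 200,
        })
        .run();
}
```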
- ```rust // TODO: Example App configuration. ``` ## Implementation Strategy + [Link to more in-depth implementation details (more of an idea dump atm).](../main/implementation_details.md) ### Requirements + - `ComponentId` (and maybe the other `*Ids`) should be stable between clients and the server. - Must have a means to isolate networked and non-networked state. - `World` should be able to reserve an `Entity` ID range, with separate storage metadata. @@ -115,7 +118,9 @@ Bevy can configure an `App` to operate in several different network modes. - Networked components must only be mutated inside `NetworkFixedUpdate`. - The ECS scheduler should support nested loops. - (I'm pretty sure this isn't an actual blocker, but the workaround feels a little hacky.) + ### The Replicate Trait + ```rust // TODO impl Replicate for T { @@ -124,6 +129,7 @@ impl Replicate for T { ``` ### Specialized Change Detection + ```rust // TODO // Predicted (+ Added and Removed variants) @@ -135,7 +141,8 @@ impl Replicate for T { ``` ### Rollback via Run Criteria -```rust + +```rust /* TODO The "outer" loop is the number of fixed update steps as determined by the fixed timestep accumulator. @@ -144,7 +151,9 @@ The "inner" loop is the number of steps to re-simulate. ``` ### NetworkFixedUpdate + Clients + 1. Iterate received server updates. 2. Update simulation and interpolation timescales. 3. Sample inputs and push them to send buffer. @@ -152,6 +161,7 @@ Clients 5. Simulate predicted tick. Server + 1. Iterate received client inputs. 2. Sample buffered inputs. 3. Simulate authoritative tick. @@ -159,21 +169,26 @@ Server 5. Push client updates to send buffer. Everything aside from the simulation steps could be auto-generated. + ### Saving Game State + - At the end of each fixed update, server iterates `Changed` and `Removed` for all replicable components and duplicates them to an isolated copy. - Could pass this copy to another thread to do the serialization and compression. - This copy has no `Table`, those would be rebuilt by the client. ### Preparing Server Packets + - Snapshots (full state updates) will use delta compression and manual fragmentation. - Eventual consistency (partial state updates) will use interest management. - Both will most likely use the same data structure. ### Restoring Game State + - At the beginning of each fixed update, the client decodes the received update and generates the latest authoritative state. - Client then uses this state to write its local prediction copy that has all the tables and non-replicable components. ## Drawbacks + - Lots of potentially cursed macro magic. - Direct writes to `World`. - Seemingly limited to components that implement `Clone` and `Serialize`. @@ -181,42 +196,47 @@ Everything aside from the simulation steps could be auto-generated. ## Rationale and Alternatives ### Why *this* design? + Networking is a widely misunderstood problem domain. The proposed implementation should suffice for most games while minimizing design friction—users need only annotate gameplay-related components and systems, put those systems in `NetworkFixedUpdate`, and configure some settings. Polluting the API with "networked" variants of structs and systems (aside from `Transform`, `Rigidbody`, etc.) would just make life harder for everybody, both game developers and Bevy maintainers. IMO the ease of macro annotations is worth any increase in compile times when networking features are enabled. ### Why should Bevy provide this? 
-People who want to make multiplayer games want to focus on designing their game and not worry about how to implement prediction, how to serialize their game, how to keep packets under MTU, etc. Having these come built-in would be a huge selling point. + +People who want to make multiplayer games want to focus on designing their game and not worry about how to implement prediction, how to serialize their game, how to keep packets under MTU, etc. Having these come built-in would be a huge selling point. ### Why not wait until Bevy is more mature? + It'll only grow more difficult to add these features as time goes on. Take Unity for example. Its built-in features are too non-deterministic and its only working solutions for state transfer are paid third-party assets. Thus far, said assets cannot integrate deeply enough to be transparent (at least not without substituting parts of the engine). ### Why does this need to involve `bevy_ecs`? + I strongly doubt that fast, efficient, and transparent replication features can be implemented without directly manipulating a `World` and its component storages. We may need to allocate memory for networked data separately. ## Unresolved Questions -- Can we provide lints for undefined behavior like mutating networked state outside of `NetworkFixedUpdate`? + +- Can we provide lints for undefined behavior like mutating networked state outside of `NetworkFixedUpdate`? - Do rollbacks break change detection or events? - ~~When sending partial state updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed?~~ Already solved by generational indexes. - How should UI widgets interact with networked state? React to events? Exclusively poll verified data? -- How should we handle correcting mispredicted events and FX? +- How should we handle correcting mispredicted events and FX? - Can we replicate animations exactly without explicitly sending animation data? ## Future Possibilities + - With some tools to visualize game state diffs, these replication systems could help detect non-determinism in other parts of the engine. - Much like how Unreal has Fortnite, Bevy could have an official (or curated) collection of multiplayer samples to dogfood these features. - Bevy's future editor could automate most of the configuration and annotation. - Replication addresses all the underlying ECS interop, so it should be settled first. But beyond replication, Bevy need only provide one good default for protocol and I/O for the sake of completeness. I recommend dividing crates at least to the extent shown below to make it easy for developers to swap the low-level transport with [whatever][3] [alternatives][4] [they][5] [want][7]. -| `bevy::net::replication` | `bevy::net::protocol` | `bevy::net::io` | +| `bevy::net::replication` | `bevy::net::protocol` | `bevy::net::io` | | -- | -- | -- | |
  • save and restore
  • prediction
  • serialization
  • delta compression
  • interest management
  • visual error correction
  • lag compensation
  • statistics (high-level)
|
  • (N)ACKs
  • reliability
  • virtual connections
  • channels
  • encryption
  • statistics (low-level)
|
  • send
  • recv
  • poll
| - [1]: https://youtu.be/JOJP0CvpB8w "Unreal Networking Features" [2]: https://www.unrealengine.com/en-US/tech-blog/replication-graph-overview-and-proper-replication-methods "Unreal Replication Graph Plugin" [3]: https://github.com/quinn-rs/quinn [4]: https://partner.steamgames.com/doc/features/multiplayer [5]: https://developer.microsoft.com/en-us/games/solutions/multiplayer/ [6]: https://dev.epicgames.com/docs/services/en-US/Overview/index.html -[7]: https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html \ No newline at end of file +[7]: https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-intro.html diff --git a/replication_concepts.md b/replication_concepts.md index d7bbde1e..a69e9340 100644 --- a/replication_concepts.md +++ b/replication_concepts.md @@ -1,15 +1,18 @@ # Replication + > The goal of replication is to ensure that all of the players in the game have a consistent model of the game state. Replication is the absolute minimum problem which all networked games have to solve in order to be functional, and all other problems in networked games ultimately follow from it. - [Mikola Lysenko][1] ---- +## Simulation Behavior Abstractly, you can think of a game as a pure function that accepts an initial state and player inputs and generates a new state. + ```rust let new_state = simulate(&state, &inputs); ``` -Fundamentally, if several players want to perform a synchronized simulation over a network, they have basically two options: -- Send their inputs to each other and independently and deterministically simulate the game. +If several players want to perform a synchronized simulation over a network, they have basically two options: + +- Send their inputs to each other and independently and deterministically simulate the game. -
also known as active replication, lockstep, state-machine synchronization, determinism
- Send their inputs to a single machine (the server), which simulates the game and broadcasts updates back. -
also known as passive replication, client-server, primary-backup, state transfer
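To make the contrast concrete, here's a rough sketch of what each option does every tick. `GameState`, `Input`, and the helper functions are invented for illustration, not part of any proposed API.

```rust
// Illustrative sketch only — these types and functions are stand-ins.
struct GameState {
    frame: u64,
}

struct Input {
    buttons: u8,
}

// Shared by both options: advancing the simulation by one step.
fn simulate(state: &GameState, _inputs: &[Input]) -> GameState {
    GameState { frame: state.frame + 1 }
}

// Option 1 (determinism): peers exchange inputs and each runs the full simulation.
// Only works if `simulate` is bit-for-bit identical on every machine.
fn deterministic_peer_tick(state: &mut GameState, local: Input, mut remote: Vec<Input>) {
    remote.push(local);
    *state = simulate(state, &remote);
}

// Option 2 (state transfer): clients send their inputs up and copy the server's
// authoritative result back down.
fn state_transfer_client_tick(state: &mut GameState, authoritative: GameState) {
    *state = authoritative;
}
```

Note the wire-level difference: the first option only ships inputs, the second has to ship state itself.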
@@ -18,7 +21,8 @@ In other words, players can either run the "real" game or follow it. For the rest of this RFC, I'll refer to them as determinism and state transfer, respectively. I just think they're the most literal terminology. -## Why determinism? +### Why determinism? + Deterministic multiplayer is basically local multiplayer but with *really* long controller cables. The netcode simply supplies the gameplay code with inputs. They're basically decoupled. Determinism has low infrastructure costs, both in terms of bandwith and server hardware. All steady-state network traffic is input, which is not only small but also compresses well. (Note that as player count increases, there *is* a crossover point where state transfer becomes more efficient). Likewise, as the game runs completely on the clients, there's no need to rent powerful servers. Relays are still handy for efficiently managing rooms and scaling to higher player counts, but those could be cheap VPS instances. @@ -27,7 +31,8 @@ Determinism is also tamperproof. It's impossible to do anything like speedhack o That every client must run the *entire* world is also determinism's biggest limit. While this works well for games with thousands of micro-managed entities like *Starcraft 2*, you won't be seeing games with expansive worlds like *Genshin Impact* networked this way anytime soon. -## Why state transfer? +### Why state transfer? + Determinism is awesome when it fits but it's generally unavailable. Neither Godot nor Unity nor Unreal can make this guarantee for large parts of their engines, particularly physics. Whenever you can't have or don't want determinism, you should use state transfer. @@ -36,7 +41,8 @@ Its main underlying idea is **authority**, which is just like ownership in Rust. The server usually owns everything, but authority is very flexible. In games like *Destiny* and *Fall Guys*, clients own their movement state. Other games even trust clients to confirm hits. Distributing authority like this adds complexity and obviously leaves the door wide open for cheaters, but sometimes it's necessary. In VR, it makes sense to let clients claim and relinquish authority over interactable objects. -## Why not messaging patterns? +### Why not messaging patterns? + The only other strategy you really see used for replication is messaging. RPCs. I actually see these most often in the free asset space. (I guess it's the go-to pattern outside of games?) Take chess for example. Instead of sending polled player inputs or the state of the chessboard, you could just send the moves like "white, e2 to e4," etc. @@ -44,6 +50,7 @@ Take chess for example. Instead of sending polled player inputs or the state of Here's the issue. Messages are tightly coupled to their game's logic. They can't be generalized. Chess is simple—one turn, one event—but what about an FPS? What messages would it need? How many? When and where would those messages need be sent and received? If those messages have cascading effects, they can only be sent reliable, ordered. + ```rust let mut s = state[n]; for message in queue.iter() { @@ -55,15 +62,19 @@ for message in queue.iter() { // applied and applied in the right order. *state[n+1] = s; ``` + Messages are great for when you want explicit request-reply interactions and global alerts like players joining or leaving. They just don't cut it as a replication mechanism for real-time games. 
Even if you avoided send and receive calls everywhere (i.e., collect and send in batches), messages don't compress as well as inputs or state. -# Latency +## Latency + Networking is hard because we want to let players who live in different countries play together *at the same time*, something that special relativity tells us is [strictly impossible][2]... unless we cheat. ### Lockstep + The simplest solution is to concede to the universe with grace and have players stall until they've received whatever data they need to execute the next simulation step. Blocking is fine for most turn-based games but simply doesn't cut it for real-time games. - + ### Adding Local Input Delay + The first trick we can pull is have each player delay their own input for a bit, trading responsiveness for more time to receive the incoming data. Our brains are pretty lenient about this, so we can actually *reduce* the latency between players. Two players in a 1v1 match actually could experience simultaneity if each delayed their input by half the round-trip time. @@ -73,6 +84,7 @@ This trick has powered the RTS genre for decades. With a large enough input dela > determinism + lockstep + local input delay = "delay-based netcode" ### Predict-Rollback + Instead of blocking, what if players just guess the missing data and keep going? Doing that would let us avoid stuttering, but then we'd have to deal with guessing incorrectly. Well, when the player finally has that missing remote data, what they can do is restore their simulation to the previous verified state, update it with the received data, and then re-predict the remaining steps. @@ -84,18 +96,18 @@ With prediction, input delay is no longer needed, but it's still useful. Reducin > determinism + predict-rollback + local input delay (optional) = "rollback netcode" ### Selective Prediction -Determinism is an all or nothing deal. If you predict, you predict everything. -State transfer has the flexibility to predict only *some* things, letting you offload expensive computations onto the server. There *are* client-server games like *Rocket League* who still predict everything (FWIW deterministic predict-rollback would have been a better fit), including other clients—the server redistributes inputs along with game state to reduce error. However, most often clients only predict what they control directly. +Determinism is an all or nothing deal. If you predict, you predict everything. +State transfer has the flexibility to predict only *some* things, letting you offload expensive computations onto the server. There *are* client-server games like *Rocket League* who still predict everything (FWIW deterministic predict-rollback would have been a better fit), including other clients—the server redistributes inputs along with game state to reduce error. However, most often clients only predict what they control directly. -# Visual Consistency +## Visual Consistency Real quick, always hard snap the simulation state. If clients do any blending, it's entirely visual. Yes, this does mean that entities may appear in different positions from where they should be. On the other hand, we have to honor this inaccurate view to keep players happy. ### Smooth Rendering and Lag Compensation -Predicting only *some* things adds implementation complexity. +Predicting only *some* things adds implementation complexity. When clients predict everything, they produce renderable state at a fixed pace. Now, anything that isn't predicted must be rendered using data received from the server. 
The problem is that server updates are sent over a lossy, unreliable internet that disrupts any consistent spacing between packets. This means clients need to buffer incoming server updates long enough to have two authoritative updates to interpolate most of the time. @@ -103,33 +115,40 @@ Gameplay-wise, not predicting everything also divides entities between two point Visually, we'll often have to blend between extrapolated and authoritative data. Simply interpolating between two authoritative updates is incorrect. The visual state can and will accrue errors, but that's what we want. Those can be tracked and smoothly reduced (to some near-zero threshold, then cleared). -# Bandwidth +## Bandwidth + ### How much can we fit into each packet? + Not a lot. You can't send arbitrarily large packets over the internet. The information superhighway has load limits. The conservative, almost universally supported "maximum transmissible unit" or MTU is 1280 bytes. Accounting for IP and UDP headers and some connection metadata, you realistically can send ~1200 bytes of game data per packet. -If you significantly exceed this, some random stop along the way will delay the packet and break it up into fragments. +If you significantly exceed this, some random stop along the way will delay the packet and break it up into fragments. [Fragmentation](https://packetpushers.net/ip-fragmentation-in-detail/) [sucks](https://blog.cloudflare.com/ip-fragmentation-is-broken) because it multiplies the likelihood of the overall packet being lost (all fragments have to arrive to read the full packet). Getting fragmented along the way is even worse because of the added delay. It's okay if the sender manually fragments their packet (like 2 or 3) *upfront*, although the higher loss does limit simulation rate, just don't rely on the internet to do it. ### Okay, but that doesn't seem like much? + Well, there are two more reasons not to yeet giant 100kB packets across the network: + - Bandwidth costs are the lion's share of hosting expenses. - Many players still have limited bandwidth. So unless we limit everyone to <20Hz tick rates, our only options are: + - Send smaller things. - Send fewer things. ### Snapshots + Alright then, state transfer. The most obvious strategy is to send full **snapshots**. All we can do with these is make them smaller (i.e. quantize floats, then compress everything). Fortunately, snapshots are very compressible. An extremely popular idea called **delta compression** is to send each client a diff (often with further compression on top) of the current snapshot and the latest one they acknowledged receiving. Clients can then use these to patch their existing snapshots into the current one. -The server can fragment payloads as a last resort. +The server can fragment payloads as a last resort. ### Eventual Consistency + When snapshots fail or hidden information is needed, the best alternative is to prioritize sending each client the state most relevant to them. This technique is commonly called **eventual consistency**. Determining relevance is often called **interest management** or **area of interest**. Each granular piece of state is given a "send priority" that accumulates over time and resets when sent. How quickly priority accumulates for different things is up to the developer, though physical proximity and visual salience usually have the most influence. 
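A minimal sketch of that accumulate-and-reset bookkeeping might look like the following. All names and numbers are invented, and a real server would budget by bytes remaining in the packet rather than an entity count.

```rust
// Hypothetical per-client bookkeeping for interest management.
struct SendPriority {
    weight: f32,      // e.g. higher for nearby or visually salient entities
    accumulated: f32, // grows every tick the entity goes unsent
}

// Run once per tick: every unsent entity gains priority at its own rate.
fn accumulate(priorities: &mut [SendPriority]) {
    for p in priorities.iter_mut() {
        p.accumulated += p.weight;
    }
}

// Run when building a packet: take the highest-priority entities and reset them.
fn pick_and_reset(priorities: &mut [SendPriority], budget: usize) -> Vec<usize> {
    let mut order: Vec<usize> = (0..priorities.len()).collect();
    order.sort_by(|&a, &b| {
        priorities[b]
            .accumulated
            .partial_cmp(&priorities[a].accumulated)
            .unwrap()
    });
    order.truncate(budget);
    for &i in &order {
        priorities[i].accumulated = 0.0;
    }
    order
}
```

Entities the developer cares about (nearby, on-screen, loud) simply get larger weights, which is all the "physical proximity and visual salience" guidance above amounts to in code.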
@@ -138,4 +157,4 @@ Eventual consistency can be combined with delta compression, but I wouldn't reco [1]: https://0fps.net/2014/02/10/replication-in-networked-games-overview-part-1/ [2]: https://en.wikipedia.org/wiki/Relativity_of_simultaneity -[3]: https://en.wikipedia.org/wiki/Client-side_prediction \ No newline at end of file +[3]: https://en.wikipedia.org/wiki/Client-side_prediction From e3730b6c590f3091fd0e6b36206dfd31d20db165 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 1 May 2021 17:44:19 -0500 Subject: [PATCH 25/43] edited TODOs --- networked_replication.md | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/networked_replication.md b/networked_replication.md index 0a8f2a45..4ef68976 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -84,8 +84,8 @@ fn expensive_physics_calculation( } ``` -```rust -// TODO: Message Example +```plaintext +TODO: Message Example ``` Bevy can configure an `App` to operate in several different network modes. @@ -98,8 +98,8 @@ Bevy can configure an `App` to operate in several different network modes. | Dedicated Server | ✗ | ✓ | ✓ | | Relay | ✗ | ✗ | ✓ | -```rust -// TODO: Example App configuration. +```plaintext +TODO: Example App configuration. ``` ## Implementation Strategy @@ -130,24 +130,26 @@ impl Replicate for T { ### Specialized Change Detection -```rust -// TODO -// Predicted (+ Added and Removed variants) -// Set when mutated by client. Cleared when mutated by server update. -// Confirmed (+ Added and Removed variants) -// Set when mutated by server update. Cleared when mutated by client. -// Cancelled -// ???? +```plaintext +TODO + +Predicted +- Set when mutated by client. Cleared when mutated by server update. +Confirmed +- Set when mutated by server update. Cleared when mutated by client. +Cancelled +- Set when something was predicted but not confirmed. + +(Also Added and Removed variants) ``` ### Rollback via Run Criteria -```rust -/* +```plaintext TODO + The "outer" loop is the number of fixed update steps as determined by the fixed timestep accumulator. The "inner" loop is the number of steps to re-simulate. -*/ ``` ### NetworkFixedUpdate From bfdfa53ff8129401822a2e739cd552bf7d04aa11 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Mon, 3 May 2021 14:57:53 -0500 Subject: [PATCH 26/43] changed link wording more nitpicky edits --- networked_replication.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/networked_replication.md b/networked_replication.md index 4ef68976..7b7d7c65 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -22,7 +22,7 @@ What I hope to explore in this RFC is: ## User-facing Explanation -[Link to my explanation of important replication concepts.](../main/replication_concepts.md) +[Recommended reading on replication concepts.](../main/replication_concepts.md) Bevy's aim here is to make writing local and networked multiplayer games indistinguishable, with minimal added boilerplate. Having an exact simulation timeline simplifies this problem, thus the core of this unified approach is a fixed timestep—`NetworkFixedUpdate`. @@ -104,7 +104,7 @@ TODO: Example App configuration. 
## Implementation Strategy -[Link to more in-depth implementation details (more of an idea dump atm).](../main/implementation_details.md) +[See more in-depth implementation details (more of an idea dump atm).](../main/implementation_details.md) ### Requirements From c38824d1c3abf25fbd42243f8ac06865a0a4441e Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Mon, 3 May 2021 15:18:50 -0500 Subject: [PATCH 27/43] clarification on tick-based simulation and visually smoothing predictions errors --- implementation_details.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 6c88e310..4a6f0fe0 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -7,9 +7,11 @@ I know I've been using the terms "client" and "player" somewhat interchangeably, ## "Clock" Synchronization -Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. People often try to have clients estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step, but that's overly complex. +Using a fixed rate, tick-based simulation simplifies how we need to think about time. It's like scrubbing a timeline, from one "frame" to the next. The key point is that everyone follows the same sequence. Clients may be simulating different points on the timeline, but tick 480 refers is the same simulation step for everyone. -What we really care about is: How much time passes between when the server receives my input and when that input is consumed? If the server simply tells clients how long their inputs are waiting in its buffer, the clients can use that information to converge on the correct lead. +Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. People often try to have clients estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step, but that's too indirect IMO. + +What we really care about is: How much time passes between when the server receives my input and when that input is consumed? If the server just tells me—in its update for tick N—how long my input for tick N sat in its buffer, I can use that information to converge on the correct lead. ```rust if received_newer_server_update: @@ -108,7 +110,7 @@ Cameras need a little special treatment. Inputs to the view rotation need to be We'll also need to distinguish instant motion from integrated motion when interpolating. Moving an entity by modifying `transform.translation` and `rigidbody.velocity` should look different. -Is an exponential decay enough for smooth error correction or are there better algorithms? +We'll need a special blending for predicted entities and entities transitioning between prediction and interpolation. "Projective velocity blending" seems to be a common way to smooth extrapolation errors, but I've also seen a simple exponential decay recommended. Are there better smoothing algorithms? (Any econ grads good with time-series?) 
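For reference, the exponential-decay option could be as simple as the sketch below. The names are illustrative, and this is only the most basic of the candidates mentioned above.

```rust
// Purely illustrative — a per-entity visual offset that decays toward zero.
struct VisualError {
    offset: [f32; 3], // rendered position minus corrected simulation position
}

// When a correction (rollback/snap) lands, fold the jump into the offset so the
// rendered position stays put for this frame.
fn absorb_correction(err: &mut VisualError, old_render_pos: [f32; 3], new_sim_pos: [f32; 3]) {
    for i in 0..3 {
        err.offset[i] = old_render_pos[i] - new_sim_pos[i];
    }
}

// Every render frame, decay the offset; the half-life (~0.1 s here) sets how fast
// the error melts away. Clear it once it's imperceptible.
fn rendered_position(err: &mut VisualError, sim_pos: [f32; 3], delta_seconds: f32) -> [f32; 3] {
    let decay = 0.5_f32.powf(delta_seconds / 0.1);
    let mut out = [0.0; 3];
    for i in 0..3 {
        err.offset[i] *= decay;
        if err.offset[i].abs() < 1e-4 {
            err.offset[i] = 0.0;
        }
        out[i] = sim_pos[i] + err.offset[i];
    }
    out
}
```

Projective velocity blending would instead extrapolate both the old and the corrected states by their velocities and blend between those projections over a short window, rather than decaying a raw positional offset.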
## Lag Compensation From 7a3b6eedcbd30ac142cf8f48f431bfb897eab1f6 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Mon, 3 May 2021 15:39:55 -0500 Subject: [PATCH 28/43] fixed typo --- implementation_details.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/implementation_details.md b/implementation_details.md index 4a6f0fe0..d6c96f67 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -7,7 +7,7 @@ I know I've been using the terms "client" and "player" somewhat interchangeably, ## "Clock" Synchronization -Using a fixed rate, tick-based simulation simplifies how we need to think about time. It's like scrubbing a timeline, from one "frame" to the next. The key point is that everyone follows the same sequence. Clients may be simulating different points on the timeline, but tick 480 refers is the same simulation step for everyone. +Using a fixed rate, tick-based simulation simplifies how we need to think about time. It's like scrubbing a timeline, from one "frame" to the next. The key point is that everyone follows the same sequence. Clients may be simulating different points on the timeline, but tick 480 is the same simulation step for everyone. Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. People often try to have clients estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step, but that's too indirect IMO. From b0075eec07c47857ee972e67ed648b77a35179c9 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 8 May 2021 16:20:03 -0500 Subject: [PATCH 29/43] added some details for interest management --- implementation_details.md | 18 +++++++++++------- networked_replication.md | 15 ++++++++------- 2 files changed, 19 insertions(+), 14 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index d6c96f67..33ea0138 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -136,7 +136,7 @@ So there are two ways to go about the actual compensation: There's a lot to learn from *Overwatch* here. -*Overwatch* shows that [time is just another collision dimension](https://youtu.be/W3aieHjyNvw?t=2226). Basically, you can broadphase test against the entire collider history at once (with the amortized method). +*Overwatch* shows that [we can treat time as another spatial dimension](https://youtu.be/W3aieHjyNvw?t=2226), so we can put the entire collider history in something like a BVH and test it all at once (with the amortized method). *Overwatch* [allows defensive abilities to mitigate compensated projectiles](https://youtu.be/W3aieHjyNvw?t=2492). AFAIK this is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. @@ -150,16 +150,16 @@ Every article on "rollback netcode" and "client-side prediction and server recon I thought of two methods while I was writing this: -1. Unordered scan looking for first difference. -2. Ordered scan to compute checksum and compare. +- Unordered scan looking for first difference. +- Ordered scan to compute checksum and compare. The first option has an unpredictable speed. The second option requires a fixed walk of the game state (checksums *are* probably worth having even if only for debugging non-determinism). There may be options I didn't consider, but the point I'm trying to make is that detecting changes among large numbers of entities isn't cheap. 
Let's consider a simpler default: -3. Always rollback and re-simulate. +- Always rollback and re-simulate. -Now, you may think that's wasteful, but I would say "if mispredicted" gives you a false sense of security. Mispredictions can occur at any time, *especially* during long-lasting complex physics interactions. It's much easier to profile and optimize for your worst-case if clients *always* rollback and re-sim. It's also more memory-efficient, since clients never need to store old predicted states. +Now, you may think that's wasteful, but I would say "if mispredicted" gives you a false sense of security. Mispredictions can occur at any time, *especially* during long-lasting complex physics interactions. Those would show up as CPU spikes. Instead, it's much easier to profile and optimize for your worst-case if clients *always* rollback and re-sim. It's also more memory-efficient, since clients never need to store old predicted states. ## Delta-Compressed Snapshots @@ -171,13 +171,17 @@ Now, you may think that's wasteful, but I would say "if mispredicted" gives you - Applies the changes to the copy and pushes the latest patch into the ring buffer. - `Xors` older patches with the latest patch to update them. - The server reads the needed patches as `&[u8]` (or `&[u64]`) and compresses them using run-length encoding (RLE) or similar. - - No "serialization" needed. If networked DSTs are stored in their own heap allocation, we can literally send the bits. `rkyv` is a good reference (relative pointers). + - No "serialization" needed. If networked DSTs (dynamically-sized types) are stored in their own heap allocation, we can literally send the bits. `rkyv` is a good reference (relative pointers). - Pass compressed payloads to protocol layer. - Protocol and I/O layers do whatever they do and send the packet. ## Interest-Managed Updates -TODO +- Uses the same latest copy + delta buffer data structure, but with additional metadata and filter "components" to track priority and relevance per-*player*. +- Changing components (note: `Transform` is special) adds to the send priority of their entities. Existing send priority doubles every tick. Importantly, entities that haven't changed won't accumulate priority. +- Server writes the entities relevant to each client in priority order, until the packet is full or all entities are written. The send priority of written entities are reset for that client. +- If the server is notified of packet loss, it checks the patch of the update that was lost and re-prioritizes its changed entities for that client. If the patch no longer exists, everything is prioritized. +- (Again, handling networked DSTs needs more consideration.) ## Messages diff --git a/networked_replication.md b/networked_replication.md index 7b7d7c65..e8afdc2a 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -98,6 +98,8 @@ Bevy can configure an `App` to operate in several different network modes. | Dedicated Server | ✗ | ✓ | ✓ | | Relay | ✗ | ✗ | ✓ | +We'll also need a mode similar to listen server for deterministic peers. + ```plaintext TODO: Example App configuration. ``` @@ -128,19 +130,18 @@ impl Replicate for T { } ``` -### Specialized Change Detection +### Special Query Filters ```plaintext TODO Predicted -- Set when mutated by client. Cleared when mutated by server update. +- Set when locally mutated. Cleared when mutated by authoritative update. Confirmed -- Set when mutated by server update. Cleared when mutated by client. 
-Cancelled -- Set when something was predicted but not confirmed. +- Set when mutated by authoritative update. Cleared when locally mutated. -(Also Added and Removed variants) +Predicting non-synchronized state such as sounds and particles is probably best realized through dispatching events, with follow-up confirmation / cancellation. +How to uniquely identify them is another question, though. ``` ### Rollback via Run Criteria @@ -219,7 +220,7 @@ I strongly doubt that fast, efficient, and transparent replication features can - Can we provide lints for undefined behavior like mutating networked state outside of `NetworkFixedUpdate`? - Do rollbacks break change detection or events? -- ~~When sending partial state updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed?~~ Already solved by generational indexes. +- ~~When sending interest-managed updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed?~~ Already solved by generational indexes. - How should UI widgets interact with networked state? React to events? Exclusively poll verified data? - How should we handle correcting mispredicted events and FX? - Can we replicate animations exactly without explicitly sending animation data? From e953da6d5b9dca6d7077710f645a7fe8afe54141 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 8 May 2021 16:27:25 -0500 Subject: [PATCH 30/43] more cleanup --- implementation_details.md | 19 +++++-------------- networked_replication.md | 3 --- 2 files changed, 5 insertions(+), 17 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 33ea0138..47dbf068 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -66,23 +66,14 @@ The key idea here is that simplifying the client-server relationship makes the p Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. One obvious implementation is to literally fork the latest authoritative state. If copying the full state ends up being too expensive, we can probably use a copy-on-write layer. -My current idea to shift components between prediction and interpolation is to default to interpolated (reset upon receiving a server update) and then use specialized change detection `DerefMut` magic to flag as predicted. - -```rust -Predicted -PredictAdded -PredictRemoved -Confirmed -ConfirmAdded -ConfirmRemoved -Cancelled -CancelAdded -CancelRemoved -``` +My current idea to shift components between prediction and interpolation is to default to interpolated (reset upon receiving a server update) and then use specialized `Predicted` and `Confirmed` query filters that piggyback off of reliable change detection. Everything is predicted by default, but users can opt-out by filtering on `Predicted`. In the more conservative cases, clients would predict the entities driven by their input, the entities they spawn (until confirmed), and any entities mutated as a result of the first two. Systems with filtered queries (i.e. physics, path-planning) should typically run last. -We can also use these filters to generate events that only trigger on authoritative changes and events that trigger on predicted changes to be confirmed or cancelled later. The latter are necessary for handling sounds and particle effects. 
Those shouldn't be duplicated during rollbacks and should be faded out if mispredicted. +Since sounds and particles require special consideration, they're probably best realized through dispatching events to be handled *outside* `NetworkFixedUpdate`. We can use these query filters to generate events that only trigger on authoritative changes and events that trigger on predicted changes to be confirmed or cancelled later. + +How to uniquely identify these events is another question, though. + Should UI be allowed to reference predicted state or only verified state? diff --git a/networked_replication.md b/networked_replication.md index e8afdc2a..191f7575 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -140,10 +140,7 @@ Predicted Confirmed - Set when mutated by authoritative update. Cleared when locally mutated. -Predicting non-synchronized state such as sounds and particles is probably best realized through dispatching events, with follow-up confirmation / cancellation. -How to uniquely identify them is another question, though. ``` - ### Rollback via Run Criteria ```plaintext From 5cae2c21594a7a2f5035b7926cebc47c8edd508c Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Fri, 28 May 2021 14:24:41 -0500 Subject: [PATCH 31/43] updated some notes; moved rest of impl strategy into the other doc I think I've figured out a satisfying technique for both full and interest-managed state synchronization. Just adding it here. --- implementation_details.md | 226 +++++++++++++++++++++++++------------- networked_replication.md | 96 ++-------------- 2 files changed, 157 insertions(+), 165 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 47dbf068..865a480a 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -1,53 +1,158 @@ # Implementation Details +## Requirements + +- `ComponentId` should be stable between clients and the server. +- Must isolate networked and non-networked state. + - Entities must be born (non-)networked. They cannot become (non-)networked. + - Networked entities must have a "network ID" component at minimum. + - Networked components and resources must only hold or reference networked data. + - Networked components must only be mutated inside `NetworkFixedUpdate`. + +## Wants + +- Ideally, `World` could reserve and split off a range of entities, with separate component storages. ([#16][1] could potentially be used for this). +- The ECS scheduler should support arbitrary cycles in the stage graph (or equivalent). Want ergonomic support for nested loops. + +## Storage + +The server maintains a storage resource containing a full copy of the latest networked state as well as a ring buffer of deltas (for the last `N` snapshots). Both are updated lazily using Bevy's built-in change detection. + +```plaintext + delta ringbuf copy of latest + v v +[(0^8), (1^8), (2^8), (3^8), (4^8), (5^8), (6^8), (7^8)] [8] + ^ + latest delta +``` + +At the end of every tick, the server zeroes the space for the newest delta, then iterates `Changed` and `Removed`: + +- Generate the newest delta by xor'ing the changes with the stored copy. +- Update the rest of the ring buffer by xor'ing the older deltas with the newest. +- Write the changes to the stored copy. + +This structure is the same for both delta-compressed snapshots and interest-managed updates. It's all pre-allocated when the resource is initialized. 
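As a sanity check on that end-of-tick pass, here's a toy version that operates on whole-state byte blobs instead of per-component storage. Sizes and names are placeholders, not the intended layout.

```rust
// Toy model of the latest-copy + delta ring buffer.
// Real storage would be per-component and pre-allocated.
const STATE_SIZE: usize = 8;
const HISTORY: usize = 4;

struct ReplicationBuffer {
    latest: [u8; STATE_SIZE],            // copy of the latest networked state
    deltas: [[u8; STATE_SIZE]; HISTORY], // deltas[k] patches the snapshot from k+1 ticks ago up to `latest`
}

impl ReplicationBuffer {
    fn end_of_tick(&mut self, new_state: &[u8; STATE_SIZE]) {
        // 1. Newest delta = changes xor'd with the stored copy.
        let mut newest = [0u8; STATE_SIZE];
        for i in 0..STATE_SIZE {
            newest[i] = new_state[i] ^ self.latest[i];
        }
        // 2. Older deltas must now patch up to the *new* latest state,
        //    which is one more xor with the newest delta.
        for slot in self.deltas.iter_mut() {
            for i in 0..STATE_SIZE {
                slot[i] ^= newest[i];
            }
        }
        // 3. Push the newest delta into the ring (the oldest falls off)
        //    and write the changes to the stored copy.
        self.deltas.rotate_right(1);
        self.deltas[0] = newest;
        self.latest = *new_state;
    }
}
```

A client whose last acknowledged snapshot is k+1 ticks old just xors `deltas[k]` into its own copy to reach the latest state.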
+ +**TODO**: Components and resources that allocate on the heap (DSTs) won't be supported at first. The solution is most likely going to be backing this resource with its own memory region (something like `bumpalo` but smarter). + +### Full Updates + +(a.k.a. delta-compressed snapshots) + +For delta-compression, the server just compresses whichever deltas clients need using some variant of run-length encoding, such as [this Simple8b + RLE variant][2] (licensed under Apache 2.0). If the compressed payload is too large, the server chops it into fragments. No unnecessary work. The server only compresses deltas that are going to be sent and the same compressed payload can be sent to any number of clients. + +### Interest-Managed Updates + +(a.k.a. eventual consistency) + +For interest management, the server needs some extra metadata to know what to send each player. + +```rust +struct InterestMetadata { + priority: [Vec; P], + relevance: SparseSet; P]>, + location: Vec, + within_aoi: [Vec; P], +} +``` + +Each entity has a per-player send priority that's just the age of its oldest undelivered change. Entities that don't change won't accumulate priority. + +For checking if an entity is within a player's area of interest, I'm looking into a sort-and-sweep (sweep-and-prune) using Morton-encoded coordinates as the broad-phase algorithm, followed by a simple sphere radius test. Results will be stored in an array for each player. Alternatives like grids and potentially visible sets (PVS) can be added later. + +Relevance will be tracked per component (per player). Relevance will be set by changed detection and certain rules, while other rules can clear the relevance (TBD). This rule-based filtering seems likely to involve relations. + +Once the metadata has been updated, the server sorts each results array in priority order and writes the relevant components of those entities until the packet is full or all relevant entities have been written. + +When the server sends a packet, it remembers the priorities for each included entity (well, for their indexes). Their current priority and relevances are then cleared. Later, if the server is notified that some previously sent packets were probably lost, it can restore all their priorities (plus the number of ticks since they were first sent). + +For restoring the relevance of an entity's components, there are two cases. If the relevant patch matching its age is still around, the server will use it as a reference and only mark the changed components as relevant. If not, the server will mark all of its components as relevant. + +## Replicate Trait + +```rust +pub unsafe trait Replicate { + fn quantize(&mut self); +} + +unsafe impl Replicate for NetworkTransform {} +``` + +## How to rollback? + +TODO + +```plaintext +The "outer" loop is the number of fixed update steps as determined by the fixed timestep accumulator. +The "inner" loop is the number of steps to re-simulate. +``` + +## Unconditional Rollbacks + +Every article on "rollback netcode" and "client-side prediction and server reconciliation" encourages having clients compare their predicted state to the authoritative state and reconciling *if* they mispredicted, but well... How do you actually detect a mispredict? + +AFAIK, there are two ways to do it: + +- Iterate both copies and look for the first difference. +- Have both server and client compute a checksum for their copy and have the client compare them. + +The first option has an unpredictable speed. 
The second option requires an ordered scan of the entire networked game state. Checksums *are* worth having for deterministic desync detection, but that can be deferred. The point I'm trying to make is that detecting state differences isn't cheap (especially once DSTs are involved). + +Let's consider a simpler default: + +- Always rollback and re-simulate when you receive a new update. + +This might seem wasteful, but think about it. If-then just hides performance problems from you. Heavy rollback scenarios will exist regardless. You can't prevent clients from running into them. Mispredictions are *especially* likely during heavier computations like physics. Just have clients always rollback and re-sim. It's easier to profile and optimize your worst-case. It's also more memory-efficient, since clients never need to store old predicted states. + ## `Connection` != `Player` -I know I've been using the terms "client" and "player" somewhat interchangeably, but `Connection` and `Player` should be separate tokens. There's no benefit in forcing one player per connection. Having `Player` be its own thing makes it easier to do stuff like online splitscreen, temporarily fill team slots with bots, etc. +I know I've been using the terms "client" and "player" somewhat interchangeably, but `Connection` and `Player` should be separate tokens. There's no benefit in forcing one player per connection. Having `Player` be its own thing makes it easier to do stuff like online splitscreen, temporarily substituting vacancies with bots, etc. ## "Clock" Synchronization Using a fixed rate, tick-based simulation simplifies how we need to think about time. It's like scrubbing a timeline, from one "frame" to the next. The key point is that everyone follows the same sequence. Clients may be simulating different points on the timeline, but tick 480 is the same simulation step for everyone. -Ideally, clients predict ahead by just enough to have their inputs reach the server right before they're needed. People often try to have clients estimate the clock time on the server (with some SNTP handshake) and use that to schedule the next simulation step, but that's too indirect IMO. +Ideally, clients predict ahead by just enough to have their inputs for each tick reach the server right before it simulates that tick. A commonly discussed strategy is to have clients estimate the clock time on the server (through some SNTP handshake) and use that to schedule their next simulation step, but IMO that's too indirect. What we really care about is: How much time passes between when the server receives my input and when that input is consumed? If the server just tells me—in its update for tick N—how long my input for tick N sat in its buffer, I can use that information to converge on the correct lead. 
```rust if received_newer_server_update: // an exponential moving average is a simple smoothing filter - smoothed_age = (31 / 32) * smoothed_age + (1 / 32) * age + avg_age = (31 / 32) * avg_age + (1 / 32) * age // too late -> positive error -> speed up // too early -> negative error -> slow down - error = target_age - smoothed_age + error = target_age - avg_age // reset accumulator accumulated_correction = 0.0 time_dilation = remap(error + accumulated_correction, -max_error, max_error, -0.1, 0.1) -accumulated_correction += time_dilation * simulation_timestep +accumulated_correction += time_dilation * fixed_delta_time -tick_cost = (1.0 + time_dilation) * fixed_delta_time +cost_of_one_tick = (1.0 + time_dilation) * fixed_delta_time ``` If its inputs are arriving too early, a client can temporarily run fewer ticks each second to relax its lead. For example, a client simulating 10% slower would shrink their lead by 1 tick for every 10. -Interpolation is the same. All that matters is the interval between received packets and how it varies. You want the interpolation delay to be as small as possible. +Interpolation is the same. You want the interpolation delay to be as small as possible. All that matters is the interval between received packets and how it varies (or maybe the number of buffered snapshots ahead of your current interpolation time). ```rust if received_newer_server_update: // an exponential moving average is simple smoothing filter - smoothed_delay = (31 / 32) * smoothed_delay + (1 / 32) * delay - smoothed_jitter = (31 / 32) * smoothed_jitter + (1 / 32) * abs(smoothed_delay - delay) + avg_delay = (31 / 32) * avg_delay + (1 / 32) * delay + avg_jitter = (31 / 32) * avg_jitter + (1 / 32) * abs(avg_delay - delay) - target_interp_delay = smoothed_delay + (2.0 * smoothed_jitter); - smoothed_interp_delay = (31 / 32) * smoothed_interp_delay + (1 / 32) * (latest_snapshot_time - interp_time); + target_interp_delay = avg_delay + (2.0 * avg_jitter); + avg_interp_delay = (31 / 32) * avg_interp_delay + (1 / 32) * (latest_snapshot_recv_time - interp_time); // too early -> positive error -> slow down // too late -> negative error -> speed up - error = -(target_interp_delay - smoothed_interp_delay) + error = -(target_interp_delay - avg_interp_delay) // reset accumulator accumulated_correction = 0.0 @@ -60,56 +165,53 @@ interp_time += (1.0 + time_dilation) * delta_time interp_time = max(interp_time, predicted_time - max_lag_comp) ``` -The key idea here is that simplifying the client-server relationship makes the problem easier. You *could* have the server apply inputs whenever they arrive, rolling back if necessary, but that would only complicate things. If the server never accepts late inputs and never changes its pace, no one needs to coordinate. +The key idea here is that simplifying the client-server relationship is more efficient and has less problems. If you followed the Source engine model described [here][3], the server would have to apply inputs whenever they arrive, meaning the server also has to rollback and it also must deal with weird ping-related issues (see the lag compensation section in [this article][4]). If the server never accepts late inputs and never changes its pace, no one needs to coordinate. ## Prediction <-> Interpolation -Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. One obvious implementation is to literally fork the latest authoritative state. 
If copying the full state ends up being too expensive, we can probably use a copy-on-write layer. +Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. Current plan is to just copy the latest authoritative state. If this ends up being too expensive (or when DSTs are supported), we can probably use a copy-on-write layer. -My current idea to shift components between prediction and interpolation is to default to interpolated (reset upon receiving a server update) and then use specialized `Predicted` and `Confirmed` query filters that piggyback off of reliable change detection. +To shift components between prediction and interpolation, we can default to either. When remote entities are interpolated by default, most entities will reset to interpolated when modified by a server update. We can then use specialized `Predicted` and `Confirmed` (equivalent to `Not(Predicted)`) query filters to address the two separately. These will piggyback off of Bevy's built-in reliable change detection. -Everything is predicted by default, but users can opt-out by filtering on `Predicted`. In the more conservative cases, clients would predict the entities driven by their input, the entities they spawn (until confirmed), and any entities mutated as a result of the first two. Systems with filtered queries (i.e. physics, path-planning) should typically run last. +Systems will predict by default, but users can opt-out with the `Predicted` filter. Systems with filtered queries (i.e. physics, path-planning) should typically run last. Clients should always predict entities driven by their input and entities whose spawns haven't been confirmed. -Since sounds and particles require special consideration, they're probably best realized through dispatching events to be handled *outside* `NetworkFixedUpdate`. We can use these query filters to generate events that only trigger on authoritative changes and events that trigger on predicted changes to be confirmed or cancelled later. +Since sounds and particles require special consideration, they're probably best realized through dispatching events to be handled *outside* `NetworkFixedUpdate`. We can use these query filters to generate events that only trigger on authoritative changes and events that trigger on predicted changes to be confirmed or cancelled later. How to uniquely identify these events is another question, though. - Should UI be allowed to reference predicted state or only verified state? ## Predicting Entity Creation This requires some special consideration. -The naive solution is to have clients spawn dummy entities. When an update that confirms the result arrives, clients can simply destroy the dummy and spawn the true entity. IMO this is a poor solution because it prevents clients from smoothly blending these entities from predicted time into interpolated time. It won't look right. +The naive solution is to have clients spawn dummy entities so that when an update that confirms the result arrives, they'll simply destroy the dummy and spawn the true entity. IMO this is a poor solution because it prevents clients from smoothly blending predicted spawns to their authoritative location. Snapping won't look right. -A better solution is for the server to assign each networked entity a global ID (`NetworkID`) that the spawning client can predict and map to its local instance. 
+A better solution is for the server to assign each networked entity a global ID that the spawning client can predict and map to its local instance. There are 3 variants that I know of: -- The simplest form of this would be an incrementing generational index whose upper bits are fixed to match the spawning player's ID. This is my recommendation. Basically, reuse `Entity` and reserve some of the upper bits in its ID. +1. Use an incrementing generational index (reuse `Entity`) and fix its upper bits to match the ID of the spawning player. This is the simplest method and my recommendation. -- Alternatively, PRNGs could be used to generate shared keys (called "prediction keys" in some places) for pairing global and local IDs. Rather than predict the global ID, the client would predict the shared key. Server updates that confirm the predicted entity would include both its global ID and the shared key, which the client can then use to pair the IDs. This method adds complexity but bypasses the previous method's implicit entity limit. +2. Use PRNGs to generate shared keys (I've seen these dubbed "prediction keys") for pairing local and global IDs. Rather than predict the global ID directly, clients predict the shared keys. Server updates that confirm a predicted entity would include both its global ID and the shared key. Once acknowledged, later updates can include just the global ID. This method is more complicated but does not share the previous method's implicit entity limit. -- A more extreme solution would be to somehow bake global IDs directly into the memory allocation. If memory layouts are mirrored, relative pointers become global IDs, which don't need to be explicitly written into packets. This would save 4-8 bytes per entity before compression. +3. Bake it into the memory layout. If the layout and order of the snapshot storage is identical on all machines, array indexes and relative pointers can double as global IDs. They wouldn't need to be explicitly written into packets, potentially reducing packet size by 4-8 bytes per entity (before compression). However, we'd probably end up separately including a generation anyway to not confuse destroyed entities with new ones. ## Smooth Rendering Rendering should come after `NetworkFixedUpdate`. -Whenever clients receive an update with new remote entities, those entities shouldn't be rendered until that update is interpolated. +Whenever clients receive an update with new remote entities, those entities shouldn't be rendered until that update is interpolated, likely done through adding a marker component. -Cameras need a little special treatment. Inputs to the view rotation need to be accumulated at the render rate and re-applied just before rendering. +Cameras need a little special treatment. Look inputs need to be accumulated at the render rate and re-applied to the camera just before rendering. -We'll also need to distinguish instant motion from integrated motion when interpolating. Moving an entity by modifying `transform.translation` and `rigidbody.velocity` should look different. +We'll also need to distinguish instant motion from integrated motion when interpolating. Moving an entity by modifying `transform.translation` should teleport and moving by integrating `rigidbody.velocity` should look smooth. -We'll need a special blending for predicted entities and entities transitioning between prediction and interpolation. 
"Projective velocity blending" seems to be a common way to smooth extrapolation errors, but I've also seen a simple exponential decay recommended. Are there better smoothing algorithms? (Any econ grads good with time-series?) +We'll need a special blending for predicted entities and entities transitioning between prediction and interpolation. "Projective velocity blending" seems to be a common way to smooth extrapolation errors, but I've also seen a simple exponential decay recommended. There may be better smoothing filters. ## Lag Compensation -Lag compensation deals with colliders. To avoid weird outcomes, lag compensation needs to run after all motion and physics systems. +Lag compensation deals with colliders and needs to run after all motion and physics systems. All positions have to be settled or you'll get unexpected results. -Again, people often imagine having the server estimate what interpolated state the client was looking at based on their RTT, but we can resolve this without any guesswork. - -Clients can just tell the server what they were looking at by bundling the interpolated tick numbers and the blend value inside the input payloads. With this information, the server can reconstruct *exactly* what each client saw. +It seems like a common strategy is to have the server estimate what interpolated state the client was looking at based on their RTT, but we can resolve this without any guesswork. Clients can just tell the server what they were looking at by bundling their interpolation parameters along with their inputs. With this information, the server can reconstruct what each client saw with *perfect* accuracy. ```plaintext @@ -120,64 +222,34 @@ interpolation blend value ``` -So there are two ways to go about the actual compensation: +So there are two ways to do the actual compensation: - Compensate upfront by bringing new projectiles into the present (similar to a rollback). - Compensate over time ("amortized"), constantly testing projectiles against the history buffer. There's a lot to learn from *Overwatch* here. -*Overwatch* shows that [we can treat time as another spatial dimension](https://youtu.be/W3aieHjyNvw?t=2226), so we can put the entire collider history in something like a BVH and test it all at once (with the amortized method). - -*Overwatch* [allows defensive abilities to mitigate compensated projectiles](https://youtu.be/W3aieHjyNvw?t=2492). AFAIK this is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. - -For clients with too-high ping, their interpolation will lag far behind their prediction. If you only compensate up to a limit (e.g. 200ms), [those clients will have to extrapolate the difference](https://youtu.be/W3aieHjyNvw?t=2347). Doing nothing is also valid, but lagging clients would abruptly have to start leading their targets. - -When a player is parented to another entity, which they have no control over (e.g. the player is a passenger in a vehicle), the non-predicted movement of that parent must be rewound during compensation to spawn any projectiles fired by the player in the correct location. - -## Unconditional Rollbacks - -Every article on "rollback netcode" and "client-side prediction and server reconciliation" encourages having clients compare their predicted state to the authoritative state and reconciling *if* they mispredicted. But how do you actually detect a mispredict? 
- -I thought of two methods while I was writing this: +*Overwatch* shows that [we can treat time as another spatial dimension][5], so we can put the entire collider history in something like a BVH and test it all at once (with the amortized method). -- Unordered scan looking for first difference. -- Ordered scan to compute checksum and compare. +For clients with too-high ping, their interpolation will lag far behind their prediction. If you only compensate up to a limit (e.g. 200ms), [those clients will have to extrapolate the difference][6]. Doing nothing is also valid, but lagging clients would abruptly have to start leading their targets. -The first option has an unpredictable speed. The second option requires a fixed walk of the game state (checksums *are* probably worth having even if only for debugging non-determinism). There may be options I didn't consider, but the point I'm trying to make is that detecting changes among large numbers of entities isn't cheap. +*Overwatch* [allows defensive abilities to mitigate compensated projectiles][7]. AFAIK this is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. -Let's consider a simpler default: - -- Always rollback and re-simulate. - -Now, you may think that's wasteful, but I would say "if mispredicted" gives you a false sense of security. Mispredictions can occur at any time, *especially* during long-lasting complex physics interactions. Those would show up as CPU spikes. Instead, it's much easier to profile and optimize for your worst-case if clients *always* rollback and re-sim. It's also more memory-efficient, since clients never need to store old predicted states. - -## Delta-Compressed Snapshots +When a player is parented to another entity, which they have no control over (e.g. the player is a passenger in a vehicle), the non-predicted movement of that parent must be rewound during compensation to spawn any projectiles fired by the player in the correct location. See [here.][8] -- The server keeps an incrementally updated copy of the networked state. - - Components are stored with their global ID instead of the local ID. -- The server keeps a ring buffer of "patches" for the last `N` snapshots. -- At the end of every `NetworkFixedUpdate`, the server iterates `Changed` and `Removed`, then: - - Generates the latest patch as the copy `xor` changes. - - Applies the changes to the copy and pushes the latest patch into the ring buffer. - - `Xors` older patches with the latest patch to update them. -- The server reads the needed patches as `&[u8]` (or `&[u64]`) and compresses them using run-length encoding (RLE) or similar. - - No "serialization" needed. If networked DSTs (dynamically-sized types) are stored in their own heap allocation, we can literally send the bits. `rkyv` is a good reference (relative pointers). -- Pass compressed payloads to protocol layer. -- Protocol and I/O layers do whatever they do and send the packet. - -## Interest-Managed Updates - -- Uses the same latest copy + delta buffer data structure, but with additional metadata and filter "components" to track priority and relevance per-*player*. -- Changing components (note: `Transform` is special) adds to the send priority of their entities. Existing send priority doubles every tick. Importantly, entities that haven't changed won't accumulate priority. -- Server writes the entities relevant to each client in priority order, until the packet is full or all entities are written. 
The send priority of written entities are reset for that client. -- If the server is notified of packet loss, it checks the patch of the update that was lost and re-prioritizes its changed entities for that client. If the patch no longer exists, everything is prioritized. -- (Again, handling networked DSTs needs more consideration.) - -## Messages +## Messages (RPCs) TODO -Messages are best for sending global alerts and any gameplay mechanics you explicitly want modeled as request-reply (or one-way) interactions. They can be unreliable or reliable. You can also postmark messages to be executed on a certain tick like inputs. That can only be best effort, though. +Messages are good for sending global alerts and any gameplay mechanics you explicitly want modeled as requests. They can be unreliable or reliable. You can also postmark messages to be processed on a certain tick like inputs. That can only be best effort, though. The example I'm thinking of is buying items from an in-game vendor. The server doesn't simulate UI, but ideally we can write the message transaction in the same system. A macro might end up being the most ergonomic choice. + +[1]: https://github.com/bevyengine/rfcs/pull/16 +[2]: https://github.com/lemire/FastPFor/blob/master/headers/simple8b_rle. +[3]: https://developer.valvesoftware.com/wiki/Source_Multiplayer_Networking +[4]: https://www.ea.com/games/apex-legends/news/servers-netcode-developer-deep-dive +[5]: https://youtu.be/W3aieHjyNvw?t=2226 "Tim Ford explains Overwatch's hit registration" +[6]: https://youtu.be/W3aieHjyNvw?t=2347 "Tim Ford explains Overwatch's lag comp. limits" +[7]: https://youtu.be/W3aieHjyNvw?t=2492 "Tim Ford explains Overwatch's lag comp. mitigation" +[8]: https://alontavor.github.io/AdvancedLatencyCompensation/ \ No newline at end of file diff --git a/networked_replication.md b/networked_replication.md index 191f7575..ec33b704 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -26,11 +26,11 @@ What I hope to explore in this RFC is: Bevy's aim here is to make writing local and networked multiplayer games indistinguishable, with minimal added boilerplate. Having an exact simulation timeline simplifies this problem, thus the core of this unified approach is a fixed timestep—`NetworkFixedUpdate`. -As a user, you only have to annotate your gameplay-related components and systems, add those systems to `NetworkFixedUpdate` (currently would be an `AppState`), and configure a few simulation settings to get up and running. That's it! Bevy will transparently handle separating, reconciling, serializing, and compressing the networked state for you. (Those systems can be exposed for advanced users, but non-interested users need not concern themselves.) +As a user, you only have to annotate your gameplay-related components and systems, add those systems to `NetworkFixedUpdate`, and configure a few simulation settings to get up and running. That's it! Bevy will transparently handle separating, reconciling, serializing, and compressing the networked state for you. (Those systems can be exposed for advanced users, but non-interested users need not concern themselves.) > Game design should (mostly) drive networking choices. Future documentation could feature a questionnaire to guide users to the correct configuration options for their game. Genre and player count are generally enough to decide. -The core primitive here is the `Replicate` trait. 
All instances of components and resources that implement this trait will be automatically detected and synchronized over the network. Simply adding a `#[derive(Replicate)]` should be enough in most cases. +The core primitive here is the `Replicate` trait. All instances of components and resources that implement this trait will be automatically registered and synchronized over the network. Simply adding a `#[derive(Replicate)]` should be enough in most cases. ```rust #[derive(Replicate)] @@ -45,7 +45,6 @@ struct Transform { #[derive(Replicate)] struct Health { - #[replicate(range=(0, 1000))] hp: u32, } ``` @@ -105,93 +104,14 @@ TODO: Example App configuration. ``` ## Implementation Strategy - -[See more in-depth implementation details (more of an idea dump atm).](../main/implementation_details.md) -### Requirements - -- `ComponentId` (and maybe the other `*Ids`) should be stable between clients and the server. -- Must have a means to isolate networked and non-networked state. - - `World` should be able to reserve an `Entity` ID range, with separate storage metadata. - - (If merged, [#16](https://github.com/bevyengine/rfcs/pull/16) could probably be used for this). - - Entities must be born (non-)networked. They cannot become (non-)networked. - - Networked entities must have a `NetworkID` component at minimum. - - Networked components and resources must only contain or reference networked data. - - Networked components must only be mutated inside `NetworkFixedUpdate`. -- The ECS scheduler should support nested loops. - - (I'm pretty sure this isn't an actual blocker, but the workaround feels a little hacky.) - -### The Replicate Trait - -```rust -// TODO -impl Replicate for T { - ... -} -``` - -### Special Query Filters - -```plaintext -TODO - -Predicted -- Set when locally mutated. Cleared when mutated by authoritative update. -Confirmed -- Set when mutated by authoritative update. Cleared when locally mutated. - -``` -### Rollback via Run Criteria - -```plaintext -TODO - -The "outer" loop is the number of fixed update steps as determined by the fixed timestep accumulator. -The "inner" loop is the number of steps to re-simulate. -``` - -### NetworkFixedUpdate - -Clients - -1. Iterate received server updates. -2. Update simulation and interpolation timescales. -3. Sample inputs and push them to send buffer. -4. Rollback and re-sim *if* a new update was received. -5. Simulate predicted tick. - -Server - -1. Iterate received client inputs. -2. Sample buffered inputs. -3. Simulate authoritative tick. -4. Duplicate state changes to copy. -5. Push client updates to send buffer. - -Everything aside from the simulation steps could be auto-generated. - -### Saving Game State - -- At the end of each fixed update, server iterates `Changed` and `Removed` for all replicable components and duplicates them to an isolated copy. - - Could pass this copy to another thread to do the serialization and compression. - - This copy has no `Table`, those would be rebuilt by the client. - -### Preparing Server Packets - -- Snapshots (full state updates) will use delta compression and manual fragmentation. -- Eventual consistency (partial state updates) will use interest management. -- Both will most likely use the same data structure. - -### Restoring Game State - -- At the beginning of each fixed update, the client decodes the received update and generates the latest authoritative state. -- Client then uses this state to write its local prediction copy that has all the tables and non-replicable components. 
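To make the step outline above a bit more concrete, here is a minimal client-side sketch. Every name in it (`ServerUpdate`, `Client::simulate`, and so on) is a hypothetical stand-in for illustration, not an existing Bevy or RFC API.

```rust
// Hypothetical sketch of one client-side NetworkFixedUpdate step.
// The types only mirror the outline above; none of them exist in Bevy.
struct ServerUpdate { tick: u64 /*, authoritative state ... */ }
struct Input { tick: u64 /*, buttons, axes ... */ }

struct Client {
    predicted_tick: u64,
    send_buffer: Vec<Input>,
    latest_confirmed: Option<ServerUpdate>,
}

impl Client {
    fn network_fixed_update(&mut self, received: Vec<ServerUpdate>) {
        // 1. Iterate received server updates, keeping the newest one.
        let newest = received.into_iter().max_by_key(|u| u.tick);

        // 2. Update the simulation/interpolation timescales here (see the
        //    "clock" synchronization notes) so the input lead stays small.

        // 3. Sample local inputs and push them to the send buffer.
        self.send_buffer.push(Input { tick: self.predicted_tick });

        // 4. If a new authoritative update arrived, roll back to it and
        //    re-simulate every tick between it and the predicted tick.
        if let Some(update) = newest {
            let resume_from = update.tick;
            self.latest_confirmed = Some(update);
            for tick in resume_from + 1..self.predicted_tick {
                self.simulate(tick); // re-simulation
            }
        }

        // 5. Simulate the new predicted tick.
        self.simulate(self.predicted_tick);
        self.predicted_tick += 1;
    }

    fn simulate(&mut self, _tick: u64) {
        // run the gameplay systems for one fixed timestep
    }
}
```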
+[See here for a big idea dump.](../main/implementation_details.md) (Hopefully, I can clean this up later.) ## Drawbacks -- Lots of potentially cursed macro magic. -- Direct writes to `World`. -- Seemingly limited to components that implement `Clone` and `Serialize`. +- Serialization strategy is `unsafe` (might be possible to do it entirely with safe Rust, idk). +- Macros might be gnarly. +- At first, only POD components and resources will be supported. DST support will come later. ## Rationale and Alternatives @@ -211,12 +131,12 @@ It'll only grow more difficult to add these features as time goes on. Take Unity ### Why does this need to involve `bevy_ecs`? -I strongly doubt that fast, efficient, and transparent replication features can be implemented without directly manipulating a `World` and its component storages. We may need to allocate memory for networked data separately. +For better encapsulation, I'd prefer if multiple world functionality and nested loops were standard ECS features. Nesting an outer fixed timestep loop and an inner rollback loop doesn't seem possible without a custom stage or scheduler right now. ## Unresolved Questions - Can we provide lints for undefined behavior like mutating networked state outside of `NetworkFixedUpdate`? -- Do rollbacks break change detection or events? +- ~~Do rollbacks break change detection or events?~~ As long as we're careful to update the appropriate change ticks, it should be okay. - ~~When sending interest-managed updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed?~~ Already solved by generational indexes. - How should UI widgets interact with networked state? React to events? Exclusively poll verified data? - How should we handle correcting mispredicted events and FX? From 3db9cdfd2fe72f257c4b18d3e589147839972da0 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Fri, 28 May 2021 14:27:00 -0500 Subject: [PATCH 32/43] fixed link typo --- implementation_details.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/implementation_details.md b/implementation_details.md index 865a480a..91351101 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -246,7 +246,7 @@ Messages are good for sending global alerts and any gameplay mechanics you expli The example I'm thinking of is buying items from an in-game vendor. The server doesn't simulate UI, but ideally we can write the message transaction in the same system. A macro might end up being the most ergonomic choice. [1]: https://github.com/bevyengine/rfcs/pull/16 -[2]: https://github.com/lemire/FastPFor/blob/master/headers/simple8b_rle. 
+[2]: https://github.com/lemire/FastPFor/blob/master/headers/simple8b_rle.h [3]: https://developer.valvesoftware.com/wiki/Source_Multiplayer_Networking [4]: https://www.ea.com/games/apex-legends/news/servers-netcode-developer-deep-dive [5]: https://youtu.be/W3aieHjyNvw?t=2226 "Tim Ford explains Overwatch's hit registration" From b34786f6f97413f603e57c567b58ce02c92fd94d Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 29 May 2021 08:26:52 -0500 Subject: [PATCH 33/43] events still open problem, fixed some inconsistent term usage, bulleted lists --- implementation_details.md | 90 ++++++++++++++++++++++++++++----------- networked_replication.md | 5 ++- 2 files changed, 67 insertions(+), 28 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 91351101..9c670490 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -24,51 +24,84 @@ The server maintains a storage resource containing a full copy of the latest net v v [(0^8), (1^8), (2^8), (3^8), (4^8), (5^8), (6^8), (7^8)] [8] ^ - latest delta + newest delta ``` +This structure is pre-allocated when the resource is initialized and is the same for both full and interest-managed updates. + At the end of every tick, the server zeroes the space for the newest delta, then iterates `Changed` and `Removed`: -- Generate the newest delta by xor'ing the changes with the stored copy. -- Update the rest of the ring buffer by xor'ing the older deltas with the newest. -- Write the changes to the stored copy. +- Generating the newest delta by xor'ing the changes with the stored copy. +- Updating the rest of the ring buffer by xor'ing the older deltas with the newest. +- Writing the changes to the stored copy. -This structure is the same for both delta-compressed snapshots and interest-managed updates. It's all pre-allocated when the resource is initialized. +TODO -**TODO**: Components and resources that allocate on the heap (DSTs) won't be supported at first. The solution is most likely going to be backing this resource with its own memory region (something like `bumpalo` but smarter). +- See if we can store data in the snapshots without struct padding. +- Support components and resources that allocate on the heap (DSTs). Won't be possible at first but the solution most likely will be backing this resource with its own memory region (something like `bumpalo` but smarter). That will be important for deterministic desync detection as well. -### Full Updates +### Full Updates (a.k.a. delta-compressed snapshots) -For delta-compression, the server just compresses whichever deltas clients need using some variant of run-length encoding, such as [this Simple8b + RLE variant][2] (licensed under Apache 2.0). If the compressed payload is too large, the server chops it into fragments. No unnecessary work. The server only compresses deltas that are going to be sent and the same compressed payload can be sent to any number of clients. +For delta-compression, the server just compresses whichever deltas clients need using some variant of run-length encoding (currently looking at [Simple8b + RLE][2]). If the compressed payload is too large, the server will split it into fragments. There's no unnecessary work either. The server only compresses deltas that are going to be sent and the same compressed payload can be sent to any number of clients. ### Interest-Managed Updates (a.k.a. 
eventual consistency) -For interest management, the server needs some extra metadata to know what to send each player. +Eventual consistency isn't inherently reliant on prioritization and filtering, but they're essential for the optimal player experience. + +If we can't send everything, we should prioritize what players want to know. They want live updates on objects that are close or occupy a big chunk of their FOV. They want to know about their teammates or projectiles they've fired, even if those are far away. The server has to make the most of each packet. + +Similarly, game designers often want to hide certain information from certain players. Limiting the amount of hidden information that can be exploited by cheaters is often crucial to a game's long-term health. Battle royale players, for example, don't need and probably shouldn't even have their opponents' inventory data. In practice, the guards aren't be perfect (e.g. *Valorant's* Fog of War not preventing wallhacks), but something is better than nothing. + +Anyway, to do all this interest management, the server needs to track some extra metadata. ```rust struct InterestMetadata { priority: [Vec; P], relevance: SparseSet; P]>, - location: Vec, - within_aoi: [Vec; P], + position: Vec>, + within_scope: [Vec; P], } ``` -Each entity has a per-player send priority that's just the age of its oldest undelivered change. Entities that don't change won't accumulate priority. +This metadata tracks a few things: + +- the age of each entity's oldest undelivered change, per player + +This is used as the send priority value so that the server only sends something when it changes. I think it's a better core idea than assigning entities arbitrary update frequencies. + +- the relevance of each component, per entity, per player + +This controls which components are sent. By default, change detection will mark a component as relevant for everybody, and then some form of rule-based filtering (maybe [entity relations][10]) can be used to selectively omit or force delivery. -For checking if an entity is within a player's area of interest, I'm looking into a sort-and-sweep (sweep-and-prune) using Morton-encoded coordinates as the broad-phase algorithm, followed by a simple sphere radius test. Results will be stored in an array for each player. Alternatives like grids and potentially visible sets (PVS) can be added later. +(Might need second age value that accounts for irrelevant changes.) -Relevance will be tracked per component (per player). Relevance will be set by changed detection and certain rules, while other rules can clear the relevance (TBD). This rule-based filtering seems likely to involve relations. +- the position of every networked entity (if available) -Once the metadata has been updated, the server sorts each results array in priority order and writes the relevant components of those entities until the packet is full or all relevant entities have been written. +Checking if an entity is inside someone's area of interest is just an application of collision detection. "Intangible" entities that lack a transform will auto-pass this check. I don't see a need for ray, distance, and shape queries where BVH or spatial partitioning structures excel, so I'm looking into doing something like a [sweep-and-prune][9] with Morton-encoding that might have better performance, since it's basically just sorting this array. -When the server sends a packet, it remembers the priorities for each included entity (well, for their indexes). 
Their current priority and relevances are then cleared. Later, if the server is notified that some previously sent packets were probably lost, it can restore all their priorities (plus the number of ticks since they were first sent). +Alternatives like grids and potentially visible sets (PVS) can be explored and added later. -For restoring the relevance of an entity's components, there are two cases. If the relevant patch matching its age is still around, the server will use it as a reference and only mark the changed components as relevant. If not, the server will mark all of its components as relevant. +- results of the AOI intersection tests, per player + +Self-explanatory. Once the other metadata has been updated, the server sorts this array in priority order and writes the relevant components of these entities until the packet is full or all relevant entities have been written. + +*So what do we do if packets are lost?* + +Whenever the server sends a packet, it remembers the priorities of the included entities (well, of their row indexes), then zeroes their priority and relevance. Later, if the server is notified that some previously sent packets were probably lost, it can pull this info and restore all the priorities (plus the however many ticks have passed). + +For restoring the relevance of an entity's components, there are two cases. If the delta matching the entity's age still exists, the server can use that as a reference and only flag its changed components. All get flagged otherwise. + +*So how 'bout those edge cases?* + +Unfortunately, the most generalized strategy comes with its own headaches. + +- What do should client do when it loses the first update for an entity? + +TBD ## Replicate Trait @@ -77,7 +110,7 @@ pub unsafe trait Replicate { fn quantize(&mut self); } -unsafe impl Replicate for NetworkTransform {} +unsafe impl Replicate for T {} ``` ## How to rollback? @@ -189,23 +222,25 @@ The naive solution is to have clients spawn dummy entities so that when an updat A better solution is for the server to assign each networked entity a global ID that the spawning client can predict and map to its local instance. There are 3 variants that I know of: -1. Use an incrementing generational index (reuse `Entity`) and fix its upper bits to match the ID of the spawning player. This is the simplest method and my recommendation. +1. Use an incrementing generational index (reuse `Entity`) and fix its upper bits to match the ID of the spawning player. 2. Use PRNGs to generate shared keys (I've seen these dubbed "prediction keys") for pairing local and global IDs. Rather than predict the global ID directly, clients predict the shared keys. Server updates that confirm a predicted entity would include both its global ID and the shared key. Once acknowledged, later updates can include just the global ID. This method is more complicated but does not share the previous method's implicit entity limit. -3. Bake it into the memory layout. If the layout and order of the snapshot storage is identical on all machines, array indexes and relative pointers can double as global IDs. They wouldn't need to be explicitly written into packets, potentially reducing packet size by 4-8 bytes per entity (before compression). However, we'd probably end up separately including a generation anyway to not confuse destroyed entities with new ones. +3. Bake it into the memory layout. 
If the layout and order of the snapshot storage is identical on all machines, array indexes and relative pointers can double as global IDs. They wouldn't need to be explicitly written into packets, potentially reducing packet size by 4-8 bytes per entity (before compression). However, we'd probably end up wanting generations anyway to not confuse destroyed entities with new ones. + +I recommend 1 as it's the simplest method. Bandwidth and CPU resources would run out long before the reduced entity ranges does. My current strategy is a mix of 1 and 3. ## Smooth Rendering -Rendering should come after `NetworkFixedUpdate`. +Rendering should happen after `NetworkFixedUpdate`. -Whenever clients receive an update with new remote entities, those entities shouldn't be rendered until that update is interpolated, likely done through adding a marker component. +Whenever clients receive an update with new remote entities, those entities shouldn't be rendered until that update is interpolated. We can do this through a marker component or with a field in the render transform. -Cameras need a little special treatment. Look inputs need to be accumulated at the render rate and re-applied to the camera just before rendering. +Cameras need some special treatment. Look inputs need to be accumulated at the render rate and re-applied to the predicted camera rotation just before rendering. -We'll also need to distinguish instant motion from integrated motion when interpolating. Moving an entity by modifying `transform.translation` should teleport and moving by integrating `rigidbody.velocity` should look smooth. +We'll also need some way for developers to declare their intent that a motion should be instant instead of smoothly interpolated. Since it needs to work for remote entities as well, maybe this just has to be a bool on the networked transform. -We'll need a special blending for predicted entities and entities transitioning between prediction and interpolation. "Projective velocity blending" seems to be a common way to smooth extrapolation errors, but I've also seen a simple exponential decay recommended. There may be better smoothing filters. +We'll need a special blending for predicted entities and entities transitioning between prediction and interpolation. [Projective velocity blending][11] seems like the de facto standard method for smoothing extrapolation errors, but I've also seen simple exponential decays used. There may be better smoothing filters. ## Lag Compensation @@ -252,4 +287,7 @@ The example I'm thinking of is buying items from an in-game vendor. The server d [5]: https://youtu.be/W3aieHjyNvw?t=2226 "Tim Ford explains Overwatch's hit registration" [6]: https://youtu.be/W3aieHjyNvw?t=2347 "Tim Ford explains Overwatch's lag comp. limits" [7]: https://youtu.be/W3aieHjyNvw?t=2492 "Tim Ford explains Overwatch's lag comp. 
mitigation" -[8]: https://alontavor.github.io/AdvancedLatencyCompensation/ \ No newline at end of file +[8]: https://alontavor.github.io/AdvancedLatencyCompensation/ +[9]: https://github.com/mattleibow/jitterphysics/wiki/Sweep-and-Prune +[10]: https://github.com/bevyengine/rfcs/pull/18 +[11]: https://www.researchgate.net/publication/293809946_Believable_Dead_Reckoning_for_Networked_Games \ No newline at end of file diff --git a/networked_replication.md b/networked_replication.md index ec33b704..e5b43d58 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -136,8 +136,9 @@ For better encapsulation, I'd prefer if multiple world functionality and nested ## Unresolved Questions - Can we provide lints for undefined behavior like mutating networked state outside of `NetworkFixedUpdate`? -- ~~Do rollbacks break change detection or events?~~ As long as we're careful to update the appropriate change ticks, it should be okay. -- ~~When sending interest-managed updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed?~~ Already solved by generational indexes. +- ~~Will rollbacks break change detection?~~ As long as we're careful to update the appropriate change ticks, it should be okay. +- Will rollbacks break events? +- ~~When sending interest-managed updates, how should we deal with weird stuff like there being references to entities that haven't been spawned or have been destroyed?~~ I believe this is solved by using generational indexes for the network IDs. - How should UI widgets interact with networked state? React to events? Exclusively poll verified data? - How should we handle correcting mispredicted events and FX? - Can we replicate animations exactly without explicitly sending animation data? From e5cb8b86d3a6b5967e227a16808c4e7eb5d41e6d Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 29 May 2021 08:48:56 -0500 Subject: [PATCH 34/43] more typos and grammar Sorry to all my English teachers. Also moved the player vs. client thing back to the top. The distinction is important for interest management. --- implementation_details.md | 27 ++++++++++++++------------- networked_replication.md | 2 +- 2 files changed, 15 insertions(+), 14 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 9c670490..91e7f425 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -15,6 +15,11 @@ - Ideally, `World` could reserve and split off a range of entities, with separate component storages. ([#16][1] could potentially be used for this). - The ECS scheduler should support arbitrary cycles in the stage graph (or equivalent). Want ergonomic support for nested loops. + +## `Connection` != `Player` + +I know I've been using the terms "client" and "player" somewhat interchangeably, but `Connection` and `Player` should be separate tokens. There's no benefit in forcing one player per connection. Having `Player` be its own thing makes it easier to do stuff like online splitscreen, temporarily substituting vacancies with bots, etc. + ## Storage The server maintains a storage resource containing a full copy of the latest networked state as well as a ring buffer of deltas (for the last `N` snapshots). Both are updated lazily using Bevy's built-in change detection. @@ -50,11 +55,11 @@ For delta-compression, the server just compresses whichever deltas clients need (a.k.a. 
eventual consistency) -Eventual consistency isn't inherently reliant on prioritization and filtering, but they're essential for the optimal player experience. +Eventual consistency isn't inherently reliant on prioritization and filtering, but they're essential for an optimal player experience. If we can't send everything, we should prioritize what players want to know. They want live updates on objects that are close or occupy a big chunk of their FOV. They want to know about their teammates or projectiles they've fired, even if those are far away. The server has to make the most of each packet. -Similarly, game designers often want to hide certain information from certain players. Limiting the amount of hidden information that can be exploited by cheaters is often crucial to a game's long-term health. Battle royale players, for example, don't need and probably shouldn't even have their opponents' inventory data. In practice, the guards aren't be perfect (e.g. *Valorant's* Fog of War not preventing wallhacks), but something is better than nothing. +Similarly, game designers often want to hide certain information from certain players. Limiting the amount of hidden information that gets leaked and exploited by cheaters is often crucial to a game's long-term health. Battle royale players, for example, don't need and probably shouldn't even have their opponents' inventory data. In practice, the guards are never perfect (e.g. *Valorant's* Fog of War not preventing wallhacks), but something is better than nothing. Anyway, to do all this interest management, the server needs to track some extra metadata. @@ -67,17 +72,17 @@ struct InterestMetadata { } ``` -This metadata tracks a few things: +This metadata contains a few things: - the age of each entity's oldest undelivered change, per player -This is used as the send priority value so that the server only sends something when it changes. I think it's a better core idea than assigning entities arbitrary update frequencies. +This is used as the send priority value. I think having the server only send something when it has changed is better than assigning entities arbitrary update frequencies. - the relevance of each component, per entity, per player -This controls which components are sent. By default, change detection will mark a component as relevant for everybody, and then some form of rule-based filtering (maybe [entity relations][10]) can be used to selectively omit or force delivery. +This controls which components are sent. By default, change detection will mark components as relevant for everybody, and then some form of rule-based filtering (maybe [entity relations][10]) can be used to selectively omit or force-include them. -(Might need second age value that accounts for irrelevant changes.) +(Might need a second age value that accounts for irrelevant changes.) - the position of every networked entity (if available) @@ -87,13 +92,13 @@ Alternatives like grids and potentially visible sets (PVS) can be explored and a - results of the AOI intersection tests, per player -Self-explanatory. Once the other metadata has been updated, the server sorts this array in priority order and writes the relevant components of these entities until the packet is full or all relevant entities have been written. +Self-explanatory. Once the other metadata has been updated, the server sorts this array in priority order and writes the relevant data until the player's packet is full or everything gets written. 
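As a rough sketch of that fill loop, under the assumption that priority is just the age of the oldest undelivered change and that each candidate's encoded size is already known, it could look something like this (all names are illustrative):

```rust
// Illustrative only: write the most "overdue" relevant entities first,
// stop when the packet budget is exhausted, then reset what was sent.
struct Candidate {
    row: usize,    // index into the interest metadata
    priority: u32, // age of the oldest undelivered relevant change
    size: usize,   // encoded size of its relevant components, in bytes
}

fn fill_packet(mut candidates: Vec<Candidate>, budget: usize) -> Vec<usize> {
    // Highest priority (oldest change) first.
    candidates.sort_by(|a, b| b.priority.cmp(&a.priority));

    let mut remaining = budget;
    let mut written = Vec::new();
    for c in candidates {
        if c.size > remaining {
            continue; // or break, if entities must be written strictly in order
        }
        remaining -= c.size;
        written.push(c.row); // serialize the relevant components here
    }
    // The caller records `written` for this packet so priority and relevance
    // can be restored if the packet is later reported lost.
    written
}
```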
*So what do we do if packets are lost?* -Whenever the server sends a packet, it remembers the priorities of the included entities (well, of their row indexes), then zeroes their priority and relevance. Later, if the server is notified that some previously sent packets were probably lost, it can pull this info and restore all the priorities (plus the however many ticks have passed). +Whenever the server sends a packet, it remembers the priorities of the included entities (actually their row indexes), then resets their priority and relevance. Later, if the server is notified that some previously sent packets were probably lost, it can pull this info and restore the priorities (plus the however many ticks have passed). -For restoring the relevance of an entity's components, there are two cases. If the delta matching the entity's age still exists, the server can use that as a reference and only flag its changed components. All get flagged otherwise. +For restoring the relevance of an entity's components, there are two cases. If the delta matching the entity's age still exists, the server can use that as a reference and only flag its changed components. Otherwise, they all get flagged. *So how 'bout those edge cases?* @@ -139,10 +144,6 @@ Let's consider a simpler default: This might seem wasteful, but think about it. If-then just hides performance problems from you. Heavy rollback scenarios will exist regardless. You can't prevent clients from running into them. Mispredictions are *especially* likely during heavier computations like physics. Just have clients always rollback and re-sim. It's easier to profile and optimize your worst-case. It's also more memory-efficient, since clients never need to store old predicted states. -## `Connection` != `Player` - -I know I've been using the terms "client" and "player" somewhat interchangeably, but `Connection` and `Player` should be separate tokens. There's no benefit in forcing one player per connection. Having `Player` be its own thing makes it easier to do stuff like online splitscreen, temporarily substituting vacancies with bots, etc. - ## "Clock" Synchronization Using a fixed rate, tick-based simulation simplifies how we need to think about time. It's like scrubbing a timeline, from one "frame" to the next. The key point is that everyone follows the same sequence. Clients may be simulating different points on the timeline, but tick 480 is the same simulation step for everyone. diff --git a/networked_replication.md b/networked_replication.md index e5b43d58..d51720b2 100644 --- a/networked_replication.md +++ b/networked_replication.md @@ -105,7 +105,7 @@ TODO: Example App configuration. ## Implementation Strategy -[See here for a big idea dump.](../main/implementation_details.md) (Hopefully, I can clean this up later.) +[See here for a big idea dump.](../main/implementation_details.md) (Hopefully I can clean this up later.) 
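One detail from those implementation notes, restoring send priorities after probable packet loss, can be sketched roughly like this; the sequence-number bookkeeping and the names are assumptions for illustration only.

```rust
use std::collections::HashMap;

// Hypothetical bookkeeping: per sent packet, remember which rows were
// included and how old their changes were at the time.
struct SentRecord {
    tick: u64,
    rows_and_ages: Vec<(usize, u32)>,
}

struct PriorityTracker {
    in_flight: HashMap<u64, SentRecord>, // keyed by packet sequence number
    age: Vec<u32>,                       // current per-row send priority
}

impl PriorityTracker {
    fn on_sent(&mut self, seq: u64, record: SentRecord) {
        for &(row, _) in &record.rows_and_ages {
            self.age[row] = 0; // reset priority for what was just sent
        }
        self.in_flight.insert(seq, record);
    }

    fn on_acked(&mut self, seq: u64) {
        self.in_flight.remove(&seq); // delivered, nothing to restore
    }

    fn on_probably_lost(&mut self, seq: u64, current_tick: u64) {
        if let Some(record) = self.in_flight.remove(&seq) {
            let elapsed = (current_tick - record.tick) as u32;
            for (row, old_age) in record.rows_and_ages {
                // Restore the old priority plus the ticks that have passed,
                // unless something newer already bumped it higher.
                self.age[row] = self.age[row].max(old_age + elapsed);
            }
        }
    }
}
```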
## Drawbacks From 18110694ca134ead1deb43b603a293a208f901bc Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 29 May 2021 09:08:33 -0500 Subject: [PATCH 35/43] sorry to whoever gets pinged with these, must fix all typos --- implementation_details.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 91e7f425..c430a192 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -76,11 +76,11 @@ This metadata contains a few things: - the age of each entity's oldest undelivered change, per player -This is used as the send priority value. I think having the server only send something when it has changed is better than assigning entities arbitrary update frequencies. +This is used as the send priority value. I think having the server only send something when it has changed is better than assigning entities arbitrary update frequencies. For players on the same client, their send priorities are identical. - the relevance of each component, per entity, per player -This controls which components are sent. By default, change detection will mark components as relevant for everybody, and then some form of rule-based filtering (maybe [entity relations][10]) can be used to selectively omit or force-include them. +This controls which components are sent. By default, change detection will mark components as relevant for everybody, and then some form of rule-based filtering (maybe [entity relations][10]) can be used to selectively omit or force-include them. For players on the same client, their relevances are merged with an OR. (Might need a second age value that accounts for irrelevant changes.) @@ -92,11 +92,11 @@ Alternatives like grids and potentially visible sets (PVS) can be explored and a - results of the AOI intersection tests, per player -Self-explanatory. Once the other metadata has been updated, the server sorts this array in priority order and writes the relevant data until the player's packet is full or everything gets written. +Self-explanatory. For players on the same client, their results are merged with an OR. Once the other metadata has been updated, the server sorts this array in priority order and writes the relevant data until the client's packet is full or everything gets written. *So what do we do if packets are lost?* -Whenever the server sends a packet, it remembers the priorities of the included entities (actually their row indexes), then resets their priority and relevance. Later, if the server is notified that some previously sent packets were probably lost, it can pull this info and restore the priorities (plus the however many ticks have passed). +Whenever the server sends a packet, it remembers the priorities of the included entities (actually their row indexes), then resets their priority and relevance. Later, if the server is notified that some previously sent packets were probably lost, it can pull this info and restore the priorities (plus however many ticks have passed). For restoring the relevance of an entity's components, there are two cases. If the delta matching the entity's age still exists, the server can use that as a reference and only flag its changed components. Otherwise, they all get flagged. @@ -104,7 +104,7 @@ For restoring the relevance of an entity's components, there are two cases. If t Unfortunately, the most generalized strategy comes with its own headaches. 
-- What do should client do when it loses the first update for an entity? +- What should a client do when it misses the first update for an entity? Is it OK to spawn an entity with incomplete information? If not, how does the client know when it's safe? TBD @@ -229,7 +229,7 @@ A better solution is for the server to assign each networked entity a global ID 3. Bake it into the memory layout. If the layout and order of the snapshot storage is identical on all machines, array indexes and relative pointers can double as global IDs. They wouldn't need to be explicitly written into packets, potentially reducing packet size by 4-8 bytes per entity (before compression). However, we'd probably end up wanting generations anyway to not confuse destroyed entities with new ones. -I recommend 1 as it's the simplest method. Bandwidth and CPU resources would run out long before the reduced entity ranges does. My current strategy is a mix of 1 and 3. +I recommend 1 as it's the simplest method. Bandwidth and CPU resources would run out long before the reduced entity ranges do. My current strategy is a mix of 1 and 3. ## Smooth Rendering @@ -271,13 +271,13 @@ For clients with too-high ping, their interpolation will lag far behind their pr *Overwatch* [allows defensive abilities to mitigate compensated projectiles][7]. AFAIK this is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. -When a player is parented to another entity, which they have no control over (e.g. the player is a passenger in a vehicle), the non-predicted movement of that parent must be rewound during compensation to spawn any projectiles fired by the player in the correct location. See [here.][8] +When a player is the child of another, uncontrolled entity (e.g. the player is a passenger in a vehicle), the non-predicted movement of that parent entity must be rewound during lag compensation, so that any projectiles fired by the player spawn in the correct location. See [here.][8] ## Messages (RPCs) TODO -Messages are good for sending global alerts and any gameplay mechanics you explicitly want modeled as requests. They can be unreliable or reliable. You can also postmark messages to be processed on a certain tick like inputs. That can only be best effort, though. +Messages are good for sending global alerts and any gameplay mechanics you explicitly want modeled as requests. They can be unreliable ("Hey, spawn this particle effect!") or reliable. ("I want to buy 4 medkits.") You can also postmark messages to be processed on a certain tick like inputs. That can only be best effort, though. The example I'm thinking of is buying items from an in-game vendor. The server doesn't simulate UI, but ideally we can write the message transaction in the same system. A macro might end up being the most ergonomic choice. From 0a2e656248d4d50bc719845b3bc8c7924ca15945 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Tue, 1 Jun 2021 13:11:37 -0500 Subject: [PATCH 36/43] added words on type ids and edited interest management section again --- implementation_details.md | 100 +++++++++++++++++++++++++------------- 1 file changed, 66 insertions(+), 34 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index c430a192..2cffb584 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -3,22 +3,39 @@ ## Requirements -- `ComponentId` should be stable between clients and the server. -- Must isolate networked and non-networked state. 
- - Entities must be born (non-)networked. They cannot become (non-)networked. - - Networked entities must have a "network ID" component at minimum. - - Networked components and resources must only hold or reference networked data. - - Networked components must only be mutated inside `NetworkFixedUpdate`. +Type identifiers (in some form) need to match between all connected machines. Options so far: + +- `TypeId` + - [debatable stability][14] + - no additional mapping needed (just reuse the world's) +- ["`StableTypeId`"][13] + - currently unavailable + - no additional mapping needed (just reuse the world's) +- `ComponentId` + - fragile, requires networked components and resources registered first and in a fixed order (for all relevant worlds) + - no mapping needed +- `unique_type_id` + - uses ordering described in a `types.toml` file + - needs mapping between these and `ComponentId` +- `type_uuid` + - not an index + - needs mapping between these and `ComponentId` ## Wants -- Ideally, `World` could reserve and split off a range of entities, with separate component storages. ([#16][1] could potentially be used for this). -- The ECS scheduler should support arbitrary cycles in the stage graph (or equivalent). Want ergonomic support for nested loops. +- If it were possible to split off a range of entities into a [sub-world][12] (with its own, separate component storages), that would remove a layer of indirection and allow reusing the `World` API more directly. +- If the ECS scheduler supported arbitrary cycles in the stage graph (or the "stageless" equivalent), I imagine it'd be possible to make the nested loop criteria completely transparent to the end user (no custom stages needed, perhaps). +## Practices users should follow or they'll have UB + +- Entities must be spawned (non-)networked. They cannot become (non-)networked. +- Networked entities must be spawned with a "network ID" component at minimum. +- (Non-)networked components and resources should only hold or reference (non-)networked data. +- Networked components should only be mutated inside `NetworkFixedUpdate`. ## `Connection` != `Player` -I know I've been using the terms "client" and "player" somewhat interchangeably, but `Connection` and `Player` should be separate tokens. There's no benefit in forcing one player per connection. Having `Player` be its own thing makes it easier to do stuff like online splitscreen, temporarily substituting vacancies with bots, etc. +I know I've been using the terms "client" and "player" somewhat interchangeably, but `Connection` and `Player` should be separate tokens. There's no reason the engine should limit things to one player per connection. Having `Player` be its own thing makes it easier to do stuff like online splitscreen, temporarily substituting vacancies with bots, etc. Likewise, a `Connection` should be a platform-agnostic handle. ## Storage @@ -42,7 +59,7 @@ At the end of every tick, the server zeroes the space for the newest delta, then TODO -- See if we can store data in the snapshots without struct padding. +- Store/serialize networked data without struct padding so we're not wasting bandwidth. - Support components and resources that allocate on the heap (DSTs). Won't be possible at first but the solution most likely will be backing this resource with its own memory region (something like `bumpalo` but smarter). That will be important for deterministic desync detection as well. 
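For intuition, here is a toy sketch of the xor'd delta ring buffer described in this storage section, simplified to a single fixed-size byte blob and an explicit list of changed bytes (both stand-ins for the real per-component storage):

```rust
// Toy model: the networked state is one fixed-size byte blob. Each slot in
// `deltas` holds some_past_snapshot XOR `copy`, so reconstructing an old
// snapshot is a single xor against the latest state.
const STATE_BYTES: usize = 64;
const HISTORY: usize = 8;

struct DeltaBuffer {
    copy: [u8; STATE_BYTES],              // latest networked state
    deltas: [[u8; STATE_BYTES]; HISTORY], // each slot: past snapshot ^ copy
    head: usize,                          // slot that receives the newest delta
}

impl DeltaBuffer {
    fn new() -> Self {
        Self { copy: [0; STATE_BYTES], deltas: [[0; STATE_BYTES]; HISTORY], head: 0 }
    }

    /// `changes` = (byte offset, new value) pairs gathered from change detection.
    fn end_of_tick(&mut self, changes: &[(usize, u8)]) {
        // 1. Zero the space for the newest delta (overwriting the oldest slot).
        self.deltas[self.head] = [0; STATE_BYTES];

        // 2. Newest delta = changes XOR the stored copy.
        for &(offset, new_value) in changes {
            self.deltas[self.head][offset] = new_value ^ self.copy[offset];
        }

        // 3. Fold the newest delta into the older slots so they stay
        //    relative to the (about to be updated) copy.
        let newest = self.deltas[self.head];
        for slot in 0..HISTORY {
            if slot != self.head {
                for i in 0..STATE_BYTES {
                    self.deltas[slot][i] ^= newest[i];
                }
            }
        }

        // 4. Write the changes into the stored copy.
        for &(offset, new_value) in changes {
            self.copy[offset] = new_value;
        }

        self.head = (self.head + 1) % HISTORY;
    }
}
```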
### Full Updates @@ -59,52 +76,60 @@ Eventual consistency isn't inherently reliant on prioritization and filtering, b If we can't send everything, we should prioritize what players want to know. They want live updates on objects that are close or occupy a big chunk of their FOV. They want to know about their teammates or projectiles they've fired, even if those are far away. The server has to make the most of each packet. -Similarly, game designers often want to hide certain information from certain players. Limiting the amount of hidden information that gets leaked and exploited by cheaters is often crucial to a game's long-term health. Battle royale players, for example, don't need and probably shouldn't even have their opponents' inventory data. In practice, the guards are never perfect (e.g. *Valorant's* Fog of War not preventing wallhacks), but something is better than nothing. +Similarly, game designers often want to hide certain information from certain players. Limiting the amount of hidden information that gets leaked and exploited by cheaters is often crucial to a game's long-term health. Battle-royale players, for example, don't need and probably shouldn't even have their opponents' inventory data. In practice, these barriers are never perfect (e.g. *Valorant's* Fog of War not preventing wallhacks), but something is better than nothing. Anyway, to do all this interest management, the server needs to track some extra metadata. ```rust -struct InterestMetadata { - priority: [Vec; P], - relevance: SparseSet; P]>, - position: Vec>, - within_scope: [Vec; P], +struct InterestMetadata { + // P: players, E: entities, C: components + position: [Option<(usize, SpatialIndex)>; 2 * E], + within_aoi: [[Option<(usize, f32)>; E]; P], + relevance: [[BitSet; C]; P] + age: [[[usize; C]; E]; P] + priority: [[usize; E]; P] } ``` This metadata contains a few things: -- the age of each entity's oldest undelivered change, per player +- position: the min and max AABB coordinates of every networked entity (if available) +- within_aoi: the entities that are inside the area of interest of this player -This is used as the send priority value. I think having the server only send something when it has changed is better than assigning entities arbitrary update frequencies. For players on the same client, their send priorities are identical. +Checking if an entity is inside someone's area of interest (AOI) is just an application of collision detection. AOI results are used to filter information at the entity-level. "Intangible" entities that lack a transform will auto-pass this check. For players on the same client, their results are merged with an OR. -- the relevance of each component, per entity, per player +I don't see a need for ray, distance, and shape queries where BVH or spatial partitioning structures excel, so I'm looking into doing something like a [sweep-and-prune][9] (SAP) with Morton-encoding. Since SAP is essentially just sorting the array, I imagine it might have better performance. Alternatives like grids and potentially visible sets (PVS) can be explored and added later. -This controls which components are sent. By default, change detection will mark components as relevant for everybody, and then some form of rule-based filtering (maybe [entity relations][10]) can be used to selectively omit or force-include them. For players on the same client, their relevances are merged with an OR. 
+- relevance: whether or not a component value should be sent to this player -(Might need a second age value that accounts for irrelevant changes.) +Relevance is used to filter information at the component-level. By default, change detection will mark components as relevant for everybody, and then some form of rule-based filtering (maybe [entity relations][10]) can be used to selectively omit or force-include them. When sent, a component's relevance is reset to false. For players on the same client, their relevances are merged with an OR. -- the position of every networked entity (if available) +- age: time (in ticks) each component has had a pending change since it was last sent to this player -Checking if an entity is inside someone's area of interest is just an application of collision detection. "Intangible" entities that lack a transform will auto-pass this check. I don't see a need for ray, distance, and shape queries where BVH or spatial partitioning structures excel, so I'm looking into doing something like a [sweep-and-prune][9] with Morton-encoding that might have better performance, since it's basically just sorting this array. +Age serves as the basis for send priority. The send priority of an entity is simply the age of its oldest relevant component. When an entity is sent, the ages of its relevant components are reset to zero. Age is tracked per component to be more robust against the receiving client having packet loss. For players on the same client, this field will be identical. -Alternatives like grids and potentially visible sets (PVS) can be explored and added later. +I think having the server only send undelivered changes and prioritizing the oldest ones is better than assigning entities arbitrary update frequencies. That said, I'll be testing a "zero-cost" alternative where I just count the number of due relevant components per entity. Unlike age, I believe this wouldn't require storing any sent packet metadata (see below). -- results of the AOI intersection tests, per player +- priority: how important it is to send the current state of each entity to this player -Self-explanatory. For players on the same client, their results are merged with an OR. Once the other metadata has been updated, the server sorts this array in priority order and writes the relevant data until the client's packet is full or everything gets written. +Priority is the end result of combining the above metadata. After filtering entities to those within a player's area of interest and filtering their components to only what that player needs to know, the server sorts the interesting entities in priority order and writes each one until the client's packet is full or they're all written. *So what do we do if packets are lost?* -Whenever the server sends a packet, it remembers the priorities of the included entities (actually their row indexes), then resets their priority and relevance. Later, if the server is notified that some previously sent packets were probably lost, it can pull this info and restore the priorities (plus however many ticks have passed). +Whenever the server sends a packet, it remembers how old the included entities (actually their row indexes) were, then resets the age and relevance of their components. Later, if the server is notified that some previously sent packets were probably lost, it can pull this info and restore the components to a more indicative age (plus however many ticks have passed). -For restoring the relevance of an entity's components, there are two cases. 
If the delta matching the entity's age still exists, the server can use that as a reference and only flag its changed components. Otherwise, they all get flagged. +For restoring the relevance of an entity's components, there are two cases. If the delta matching the stored age still exists, the server can use that as a reference and only mark its changed components. Otherwise, they all get marked. *So how 'bout those edge cases?* Unfortunately, the most generalized strategy comes with its own headaches. -- What should a client do when it misses the first update for an entity? Is it OK to spawn an entity with incomplete information? If not, how does the client know when it's safe? +- What should a client do when it misses the first update for a entity? Is it OK to spawn a entity with incomplete information? If not, how does the client know when it's safe? + +AFAIK this is only a problem for "kinded" entities that have archetype invariants. I'm thinking two potential solutions: + +1. Have the client spawn new remote entities with any missing components in their invariants set to a default value. +2. Have the server redundantly send the full invariants for a new interesting entity until that information has been delivered once. TBD @@ -122,6 +147,8 @@ unsafe impl Replicate for T {} TODO +probably a custom schedule/stage + ```plaintext The "outer" loop is the number of fixed update steps as determined by the fixed timestep accumulator. The "inner" loop is the number of steps to re-simulate. @@ -142,7 +169,7 @@ Let's consider a simpler default: - Always rollback and re-simulate when you receive a new update. -This might seem wasteful, but think about it. If-then just hides performance problems from you. Heavy rollback scenarios will exist regardless. You can't prevent clients from running into them. Mispredictions are *especially* likely during heavier computations like physics. Just have clients always rollback and re-sim. It's easier to profile and optimize your worst-case. It's also more memory-efficient, since clients never need to store old predicted states. +This might seem wasteful, but think about it. If-then is really an anti-pattern that just hides performance problems from you. Mispredictions will exist regardless of this choice, and they're *especially* likely during heavier computations like physics. Having clients always rollback and re-sim makes it easier to profile and optimize your worst-case. It's also more memory-efficient, since clients never need to store old predicted states. ## "Clock" Synchronization @@ -275,11 +302,13 @@ When a player is the child of another, uncontrolled entity (e.g. the player is a ## Messages (RPCs) -TODO +Messages are good for sending global alerts and any gameplay mechanics where raw inputs aren't expressive enough. For example, buying items bulk from an in-game menu. The server won't simulate UI, so it'd probably be simplest if the client sent a reliable message describing what they want. The "reply" in this example would be implicit in the received state. + +Messages can be reliable. They can also be postmarked to be processed on a certain tick like inputs. That can only be best effort (i.e. tick N or earliest), though. -Messages are good for sending global alerts and any gameplay mechanics you explicitly want modeled as requests. They can be unreliable ("Hey, spawn this particle effect!") or reliable. ("I want to buy 4 medkits.") You can also postmark messages to be processed on a certain tick like inputs. That can only be best effort, though. 
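As a purely hypothetical illustration of such a message (the real shape is still undecided, as noted below), a reliable, tick-postmarked request might carry something like this:

```rust
// Purely illustrative: a reliable request postmarked for a specific tick.
// Nothing here is an existing API; it only shows the shape of the data.
enum Reliability {
    Unreliable,
    Reliable,
}

struct BuyItems {
    vendor: u64, // network ID of the vendor entity
    item: u32,
    quantity: u16,
}

struct Message<T> {
    payload: T,
    reliability: Reliability,
    // Best effort: execute on this tick, or the earliest tick after it
    // that the server can still accommodate.
    postmark: Option<u64>,
}

fn buy_medkits(predicted_tick: u64) -> Message<BuyItems> {
    Message {
        payload: BuyItems { vendor: 17, item: 4, quantity: 4 },
        reliability: Reliability::Reliable,
        postmark: Some(predicted_tick),
    }
}
```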
+I don't really know what these should look like yet. A macro might be the most ergonomic choice, if it means a message can be completely defined in its relevant system. -The example I'm thinking of is buying items from an in-game vendor. The server doesn't simulate UI, but ideally we can write the message transaction in the same system. A macro might end up being the most ergonomic choice. +TODO [1]: https://github.com/bevyengine/rfcs/pull/16 [2]: https://github.com/lemire/FastPFor/blob/master/headers/simple8b_rle.h @@ -291,4 +320,7 @@ The example I'm thinking of is buying items from an in-game vendor. The server d [8]: https://alontavor.github.io/AdvancedLatencyCompensation/ [9]: https://github.com/mattleibow/jitterphysics/wiki/Sweep-and-Prune [10]: https://github.com/bevyengine/rfcs/pull/18 -[11]: https://www.researchgate.net/publication/293809946_Believable_Dead_Reckoning_for_Networked_Games \ No newline at end of file +[11]: https://www.researchgate.net/publication/293809946_Believable_Dead_Reckoning_for_Networked_Games +[12]: https://github.com/bevyengine/rfcs/pull/16#issuecomment-849878777 +[13]: https://github.com/bevyengine/bevy/issues/32 +[14]: https://github.com/bevyengine/bevy/issues/32#issuecomment-821510244 \ No newline at end of file From ffd9eac88b5db5697a942713e0e7e320782e4de4 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Sat, 5 Jun 2021 15:03:10 -0500 Subject: [PATCH 37/43] small clarifications trying to be exact with terminology --- implementation_details.md | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 2cffb584..3bf29f05 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -228,25 +228,23 @@ interp_time = max(interp_time, predicted_time - max_lag_comp) The key idea here is that simplifying the client-server relationship is more efficient and has less problems. If you followed the Source engine model described [here][3], the server would have to apply inputs whenever they arrive, meaning the server also has to rollback and it also must deal with weird ping-related issues (see the lag compensation section in [this article][4]). If the server never accepts late inputs and never changes its pace, no one needs to coordinate. -## Prediction <-> Interpolation +## Predicted <-> Interpolated -Clients can't directly modify the authoritative state, but they should be able to predict whatever they want locally. Current plan is to just copy the latest authoritative state. If this ends up being too expensive (or when DSTs are supported), we can probably use a copy-on-write layer. +Clients can't directly make persistent changes the authoritative state, but they're allowed to do whatever they want locally. Current plan is to just copy the latest authoritative state. If this ends up being too expensive (or when DSTs are supported), a copy-on-write layer is another option. -To shift components between prediction and interpolation, we can default to either. When remote entities are interpolated by default, most entities will reset to interpolated when modified by a server update. We can then use specialized `Predicted` and `Confirmed` (equivalent to `Not(Predicted)`) query filters to address the two separately. These will piggyback off of Bevy's built-in reliable change detection. +We want to shift components between being predicted (extrapolated) and being interpolated. Either could be default. 
If interpolation is default, entities would reset to interpolated when modified by a server update. Users could then use specialized `Predicted` and `Confirmed` (equivalent to `Not(Predicted)`) query filters to address the two groups separately. I think these can piggyback off of Bevy's built-in reliable change detection. -Systems will predict by default, but users can opt-out with the `Predicted` filter. Systems with filtered queries (i.e. physics, path-planning) should typically run last. Clients should always predict entities driven by their input and entities whose spawns haven't been confirmed. +Systems would generate predictions by default, but users can opt-out with the `Predicted` filter to only process entities that have already been mutated by an earlier system. Users would typically use these filters for heavier logic like physics. Clients will naturally predict any entities driven by their input and any spawned by their input (until confirmed by the server). Since sounds and particles require special consideration, they're probably best realized through dispatching events to be handled *outside* `NetworkFixedUpdate`. We can use these query filters to generate events that only trigger on authoritative changes and events that trigger on predicted changes to be confirmed or cancelled later. How to uniquely identify these events is another question, though. -Should UI be allowed to reference predicted state or only verified state? - ## Predicting Entity Creation This requires some special consideration. -The naive solution is to have clients spawn dummy entities so that when an update that confirms the result arrives, they'll simply destroy the dummy and spawn the true entity. IMO this is a poor solution because it prevents clients from smoothly blending predicted spawns to their authoritative location. Snapping won't look right. +The naive solution is to have clients spawn dummy entities so that when an update that confirms the result arrives, they'll simply destroy the dummy and spawn the true entity. IMO this is a poor solution because it prevents clients from smoothly blending errors in the predicted spawn's rendered transform. Snapping its visuals wouldn't look right. A better solution is for the server to assign each networked entity a global ID that the spawning client can predict and map to its local instance. There are 3 variants that I know of: @@ -268,7 +266,7 @@ Cameras need some special treatment. Look inputs need to be accumulated at the r We'll also need some way for developers to declare their intent that a motion should be instant instead of smoothly interpolated. Since it needs to work for remote entities as well, maybe this just has to be a bool on the networked transform. -We'll need a special blending for predicted entities and entities transitioning between prediction and interpolation. [Projective velocity blending][11] seems like the de facto standard method for smoothing extrapolation errors, but I've also seen simple exponential decays used. There may be better smoothing filters. +While most visual interpolation is linear, we'll want another blend for quickly but smoothly correcting visual misprediction errors, which can occur for entities that are or just stopped being predicted. [Projective velocity blending][11] seems like the de facto standard method for these, but I've also seen simple exponential decays used. There may be better smoothing filters. 
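
For a concrete feel of the exponential-decay option, here's a minimal sketch (1D for brevity; the struct, method names, and `DECAY_RATE` constant are all made up for illustration). The idea is to keep rendering from where the entity used to be and bleed the leftover offset toward zero every frame instead of snapping.

```rust
// Hypothetical error-smoothing sketch, not part of the design.
struct SmoothedAxis {
    // Leftover visual offset between what was last rendered and the re-simulated value.
    visual_error: f32,
}

impl SmoothedAxis {
    // Call when a rollback/re-sim snaps the simulated value from `old` to `new`.
    fn absorb_correction(&mut self, old: f32, new: f32) {
        self.visual_error += old - new;
    }

    // Call every render frame; returns the value to actually draw.
    fn render_value(&mut self, simulated: f32, dt: f32) -> f32 {
        const DECAY_RATE: f32 = 12.0; // made-up tuning constant
        self.visual_error *= (-DECAY_RATE * dt).exp();
        simulated + self.visual_error
    }
}
```

Projective velocity blending extends the same idea by blending velocities as well, which tends to hold up better for fast-moving objects.
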
## Lag Compensation From 93fb316269b87d85f16e24e422fed4d6f9c25fb2 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Thu, 8 Jul 2021 11:54:48 -0500 Subject: [PATCH 38/43] Revised time sync and other sections Been working on a proof of concept for this and found some mistakes in the rushed pseudocode I wrote in the time sync section. Made some revisions, but realized some other stuff was out of data too. --- implementation_details.md | 307 ++++++++++++++++++++++++++------------ 1 file changed, 210 insertions(+), 97 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 3bf29f05..5e87b544 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -23,15 +23,15 @@ Type identifiers (in some form) need to match between all connected machines. Op ## Wants -- If it were possible to split off a range of entities into a [sub-world][12] (with its own, separate component storages), that would remove a layer of indirection and allow reusing the `World` API more directly. -- If the ECS scheduler supported arbitrary cycles in the stage graph (or the "stageless" equivalent), I imagine it'd be possible to make the nested loop criteria completely transparent to the end user (no custom stages needed, perhaps). +- Ability to split-off a range of entities into a [sub-world][12] with its own, separate component storages. This would remove a layer of indirection, bringing back the convenience of the `World` interface. +- Scheduler support for arbitrary cycles in the stage graph (or "stageless" equivalent). I believe this boils down to arranging stages (or labels) in a hierarchical FSM (state chart) or behavior tree. ## Practices users should follow or they'll have UB - Entities must be spawned (non-)networked. They cannot become (non-)networked. - Networked entities must be spawned with a "network ID" component at minimum. - (Non-)networked components and resources should only hold or reference (non-)networked data. -- Networked components should only be mutated inside `NetworkFixedUpdate`. +- Networked components should only be mutated inside the fixed update. ## `Connection` != `Player` @@ -39,7 +39,11 @@ I know I've been using the terms "client" and "player" somewhat interchangeably, ## Storage -The server maintains a storage resource containing a full copy of the latest networked state as well as a ring buffer of deltas (for the last `N` snapshots). Both are updated lazily using Bevy's built-in change detection. +IMO a fast, data-agnostic networking solution is impossible without the ability to handle things on the bit-level. Memcpy and integer compression are orders of magnitude faster than deep serialization and DEFLATE. + +To that end, the each snapshot should be a pre-allocated memory arena. The core storage resource would then basically amount to a ring buffer of arenas. + +On the server, this resource would hold a ring buffer of deltas (for the last `N` snapshots) with 0th delta being a full copy of the latest networked state. On the client these would all be considered snapshots. ```plaintext delta ringbuf copy of latest @@ -49,28 +53,34 @@ The server maintains a storage resource containing a full copy of the latest net newest delta ``` -This structure is pre-allocated when the resource is initialized and is the same for both full and interest-managed updates. +This architecture has a lot of advantages. It can be pre-allocated when the resource is initialized and it's the same for all replication modes. I.e. 
no storage differences between input determinism, full state transfer, or interest-managed state transfer. -At the end of every tick, the server zeroes the space for the newest delta, then iterates `Changed` and `Removed`: +These storages can be lazily updated using Bevy's built-in change detection. At the end of every tick, the server zeroes the space for the newest delta, then iterates `Changed` and `Removed`: - Generating the newest delta by xor'ing the changes with the stored copy. - Updating the rest of the ring buffer by xor'ing the older deltas with the newest. - Writing the changes to the stored copy. +Even if change detection became optional, I don't think much speed would be lost if we had to scan two snapshots for bitwise differences. + TODO - Store/serialize networked data without struct padding so we're not wasting bandwidth. -- Support components and resources that allocate on the heap (DSTs). Won't be possible at first but the solution most likely will be backing this resource with its own memory region (something like `bumpalo` but smarter). That will be important for deterministic desync detection as well. +- Support components and resources that allocate on the heap (DSTs). Won't be possible at first but the solution most likely will be backing this resource with its own memory arena (something like `bumpalo` but smarter). That will be important for deterministic desync detection as well. -### Full Updates +### Input Determinism -(a.k.a. delta-compressed snapshots) +In deterministic games, the server bundles received inputs and re-distributes them back to the clients. Clients generate their own snapshots locally whenever they have a full set inputs for a tick. Only one snapshot is needed. Clients also send checksum values to the server, that the server can use to detect desyncs. -For delta-compression, the server just compresses whichever deltas clients need using some variant of run-length encoding (currently looking at [Simple8b + RLE][2]). If the compressed payload is too large, the server will split it into fragments. There's no unnecessary work either. The server only compresses deltas that are going to be sent and the same compressed payload can be sent to any number of clients. +### Full State Transfer -### Interest-Managed Updates +(aka. delta-compressed snapshots) -(a.k.a. eventual consistency) +For delta-compression, the server just compresses whichever deltas clients need using some variant of run-length encoding (currently looking at [Simple8b + RLE][2]). If the compressed payload is too large, the server will split it into fragments. Overall, this is a very lightweight replication method because the server only needs to compress deltas that are going to be sent and the same compressed payload can be sent to any number of clients. + +### Interest-Managed State Transfer + +(aka. eventual consistency) Eventual consistency isn't inherently reliant on prioritization and filtering, but they're essential for an optimal player experience. @@ -81,60 +91,63 @@ Similarly, game designers often want to hide certain information from certain pl Anyway, to do all this interest management, the server needs to track some extra metadata. 
```rust -struct InterestMetadata { - // P: players, E: entities, C: components - position: [Option<(usize, SpatialIndex)>; 2 * E], - within_aoi: [[Option<(usize, f32)>; E]; P], - relevance: [[BitSet; C]; P] - age: [[[usize; C]; E]; P] - priority: [[usize; E]; P] +struct InterestMetadata { + changed: Vec, + relevant: Vec, + lost: Vec, + priority: Vec>, } ``` -This metadata contains a few things: +Essentially, the server wants to send clients all the data that: -- position: the min and max AABB coordinates of every networked entity (if available) -- within_aoi: the entities that are inside the area of interest of this player +- belongs entities they're interested in AND +- has changed since they were last received AND +- is currently relevant to them (i.e. they're allowed to know) -Checking if an entity is inside someone's area of interest (AOI) is just an application of collision detection. AOI results are used to filter information at the entity-level. "Intangible" entities that lack a transform will auto-pass this check. For players on the same client, their results are merged with an OR. +I'm gonna gloss over how to check if an entity is inside someone's area of interest (AOI). It's just an application of collision detection. You'll create some interest regions for each client and write from entities that fall within them. -I don't see a need for ray, distance, and shape queries where BVH or spatial partitioning structures excel, so I'm looking into doing something like a [sweep-and-prune][9] (SAP) with Morton-encoding. Since SAP is essentially just sorting the array, I imagine it might have better performance. Alternatives like grids and potentially visible sets (PVS) can be explored and added later. +I don't see a need for ray, distance, and shape queries where BVH or spatial partitioning structures excel, so I'm looking into doing something like a [sweep-and-prune][9] (SAP) with Morton-encoding. Since SAP is essentially sorting an array, I imagine it might be faster. Alternatives like grids and potentially visible sets (PVS) can be explored and added later. -- relevance: whether or not a component value should be sent to this player +Those entities get written in the order of their oldest relevant changes. That way that the most pertinent information gets through if not everything can fit. -Relevance is used to filter information at the component-level. By default, change detection will mark components as relevant for everybody, and then some form of rule-based filtering (maybe [entity relations][10]) can be used to selectively omit or force-include them. When sent, a component's relevance is reset to false. For players on the same client, their relevances are merged with an OR. +Entities that the server always wants clients to know about or those that clients themselves always want to know about can be written without going through these checks or bandwidth constraints. -- age: time (in ticks) each component has had a pending change since it was last sent to this player +So how is that metadata tracked? -Age serves as the basis for send priority. The send priority of an entity is simply the age of its oldest relevant component. When an entity is sent, the ages of its relevant components are reset to zero. Age is tracked per component to be more robust against the receiving client having packet loss. For players on the same client, this field will be identical. +The **changed** field is simply an array of change ticks. 
We could reuse Bevy's built-in change tracking, but ideally each *word* in the arena would be tracked separately (would enable much better compression). These change ticks serve as the basis for send priority.

The **relevant** field tracks whether or not a component should be sent to a certain client. This is how information is filtered at the component-level. By default, new changes would mark components as relevant for everybody, and then some form of filter rules (maybe [entity relations][10]) could selectively modify those. If a component value is sent, its relevance is reset to `false`.

The **lost** field has a bit per change tick, per client. Whenever the server sends a packet, it jots down somewhere which entities were in it and their priorities. Later, if the server is notified that a packet was probably lost, it can pull this info and set the lost bits.

If the delta matching the stored priority still exists, the server can use that as a reference to only set a minimal number of lost bits for an entity. Otherwise, all its lost bits would be set. Similarly, the lost bits would be cleared on send.

Honestly, I believe this is pretty good, but I'm still looking for something that's more accurate while using fewer bits (if possible).

The **priority** field just stores the end result of combining the other metadata. This array gets sorted and the `Some(entity)` entries are written in that order, until the client's packet is full or they're all written.

I think having the server only send undelivered changes and prioritizing the oldest ones is better than assigning entities arbitrary update frequencies.

### Interest Management Edge Cases

Unfortunately, the most generalized strategy comes with its own headaches.

- What should a client do when it misses the first update for an entity? Is it OK to spawn an entity with incomplete information? If not, how does the client know when it's safe?

AFAIK this is only a problem for "kinded" entities that have archetype invariants (aka spawn info).
I'm thinking two potential solutions: +AFAIK this is only a problem for "kinded" entities that have archetype invariants (aka spawn info). I'm thinking two potential solutions: 1. Have the client spawn new remote entities with any missing components in their invariants set to a default value. 2. Have the server redundantly send the full invariants for a new interesting entity until that information has been delivered once. -TBD +I think #2 is the better solution. + +TBD, I think there are more of these. ## Replicate Trait +TBD + ```rust pub unsafe trait Replicate { fn quantize(&mut self); @@ -145,15 +158,15 @@ unsafe impl Replicate for T {} ## How to rollback? -TODO - -probably a custom schedule/stage +There are two loops over the same chain of logic. ```plaintext -The "outer" loop is the number of fixed update steps as determined by the fixed timestep accumulator. -The "inner" loop is the number of steps to re-simulate. +The first loop is for re-simulating older ticks. +The second loop is for executing the newly-accumulated ticks. ``` +I think a nice solution would be to queue stages/labels using a hierarchical FSM (state chart) or behavior tree. Looping would stop being a special edge case and `ShouldRun` could reduce to a `bool`. The main thing is that transitions cannot be completely determined in the middle of a stage / label. The final decision has to be deferred to the end or there will be conflicts. + ## Unconditional Rollbacks Every article on "rollback netcode" and "client-side prediction and server reconciliation" encourages having clients compare their predicted state to the authoritative state and reconciling *if* they mispredicted, but well... How do you actually detect a mispredict? @@ -171,78 +184,167 @@ Let's consider a simpler default: This might seem wasteful, but think about it. If-then is really an anti-pattern that just hides performance problems from you. Mispredictions will exist regardless of this choice, and they're *especially* likely during heavier computations like physics. Having clients always rollback and re-sim makes it easier to profile and optimize your worst-case. It's also more memory-efficient, since clients never need to store old predicted states. -## "Clock" Synchronization +## Time Synchronization -Using a fixed rate, tick-based simulation simplifies how we need to think about time. It's like scrubbing a timeline, from one "frame" to the next. The key point is that everyone follows the same sequence. Clients may be simulating different points on the timeline, but tick 480 is the same simulation step for everyone. +Networked applications have to deal with relativity. Clocks will drift. Some router between you and the game server will randomly go offline. Someone in your house will start streaming Netflix. Et cetera. The slightest change in latency (i.e. distance) between two clocks will cause them to shift out of phase. -Ideally, clients predict ahead by just enough to have their inputs for each tick reach the server right before it simulates that tick. A commonly discussed strategy is to have clients estimate the clock time on the server (through some SNTP handshake) and use that to schedule their next simulation step, but IMO that's too indirect. +So how do two computers even agree on *when* something happened? -What we really care about is: How much time passes between when the server receives my input and when that input is consumed? 
If the server just tells me—in its update for tick N—how long my input for tick N sat in its buffer, I can use that information to converge on the correct lead. +It'd be really easy to answer that question if there was an *absolute* time reference. Luckily, we can make one. See, there are [two kinds of time][15]—plain ol' **wall-clock time** and **game time**—and we have complete control over the latter. The basic idea is pretty simple: Use a fixed timestep simulation and number the ticks in order. Doing that gives us a timeline of discrete moments that everyone shares (i.e. Tick 742 is the same in-game moment for everyone). + +With this shared timeline strategy, clients can either: + +- (Input Sync) Try to simulate ticks at the same wall-clock time. +- (State Sync) Try to have their inputs reach the server at same wall-clock time. + +When doing state sync, I'd recommend against having clients try to simulate ticks at the same time. To accomodate inputs arriving at different times, the server itself would have to rollback and resimulate or we'd have to change the strategy. For example, [Source engine][3] games (AFAIK) simulate the movement of each player at their individual send rates and *then* simulate the world at the regular tick rate. However, doing things their way makes having lower ping a technical advantage (search "lag compensation" in this [this article][4]), which I assume is the reason why ~~melee is bad~~ trading kills is so rare in Source engine games. + +### A Relatively Fixed Timestep + +Fixed timesteps are typically implemented as a kind of currency exchange. The time that elapsed since the previous frame is deposited in an accumulator and converted into simulation steps according to the exchange rate (tick rate). ```rust -if received_newer_server_update: - // an exponential moving average is a simple smoothing filter - avg_age = (31 / 32) * avg_age + (1 / 32) * age +pub struct Accumulator { + accum: f64, + ticks: usize, +} - // too late -> positive error -> speed up - // too early -> negative error -> slow down - error = target_age - avg_age +impl Accumulator { + pub fn add_time(&mut self, time: f64, timestep: f64) { + self.accum += time; + while self.accum >= timestep { + self.accum -= timestep; + self.ticks += 1; + } + } + + pub fn ticks(&self) -> usize { + self.ticks + } + + pub fn overtick_percentage(&self, timestep: f64) -> f64 { + self.accum / timestep + } + + pub fn consume_tick(&mut self) -> Option { + let remaining = self.ticks.checked_sub(1); + remaining + } + + pub fn consume_ticks(&mut self) -> Option { + let ticks = if self.ticks > 0 { Some(self.ticks) } else { None }; + self.ticks = 0; + ticks + } +} +``` + +Here's how it's typically used. Notice the time dilation. It's changing the time->tick exchange rate to produce more or fewer simulation steps per unit time. Note that this time dilation only affects the simulation rate. Inside the systems running in the fixed update, you should use the normal fixed timestep for the value of dt. - // reset accumulator - accumulated_correction = 0.0 +```rust +// Determine the exchange rate. +let x = (1.0 * time_dilation); +// Accrue all the time that has elapsed since last frame. +accumulator.add_time(time.delta_seconds(), x * timestep); -time_dilation = remap(error + accumulated_correction, -max_error, max_error, -0.1, 0.1) -accumulated_correction += time_dilation * fixed_delta_time +for step in 0..accumulator.consume_ticks() { + /* ... 
*/ +} -cost_of_one_tick = (1.0 + time_dilation) * fixed_delta_time +// Calculate the blend alpha for rendering simulated objects. +let alpha = accumulator.overtick_percentage(x * timestep); ``` -If its inputs are arriving too early, a client can temporarily run fewer ticks each second to relax its lead. For example, a client simulating 10% slower would shrink their lead by 1 tick for every 10. +Ideally, clients simulate any given tick ahead by just enough so their inputs reach the server right before it does. + +One way I've seen people try to do this is to have clients estimate the wall-clock time on the server (using an SNTP handshake or similar) and from that schedule their next tick. That does work, but IMO it's too inaccurate. What we really care about is how much time passes between the server receiving an input and consuming it. That's what we want to control. The server can measure these wait times exactly and include them in the corresponding snapshot headers. Then clients can use those measurements to modify their tick rate and adjust their lead. -Interpolation is the same. You want the interpolation delay to be as small as possible. All that matters is the interval between received packets and how it varies (or maybe the number of buffered snapshots ahead of your current interpolation time). +For example, if its inputs are arriving too late (too early), a client can temporarily simulate more (less) frequently to converge on the correct lead. ```rust -if received_newer_server_update: - // an exponential moving average is simple smoothing filter - avg_delay = (31 / 32) * avg_delay + (1 / 32) * delay - avg_jitter = (31 / 32) * avg_jitter + (1 / 32) * abs(avg_delay - delay) +if received_newer_server_update { + avg_input_wait_time = (1.0 - a) * avg_input_wait_time + a * latest_input_wait_time; + // Negate here because I'm scaling the timestep and not the rate. + // i.e. 110% tick rate => 90% timestep + error = -(target_input_wait_time - avg_input_wait_time); +} + +// Anything we hear back from the server is always a round-trip old. +// We want to drive this feedback loop with a more up-to-date estimate +// to avoid overshoot / oscillation. +// Obviously, it's impossible for the client to know the current wait time. +// But it can fancy a guess by assuming every adjustment it made since +// the latest received update succeeded. +predicted_error = error; +for tick in (recv_tick..curr_tick) { + predicted_error += ringbuf[tick % ringbuf.len()]; +} + +// This is basically just a proportional controller. +time_dilation = (predicted_error - min_error) * (max_dilation - min_dilation) / (max_error - min_error); +time_dilation = time_dilation.clamp(min_dilation, max_dilation); + +// Store the new adjustment in the ring buffer. +*ringbuf[curr_tick % ringbuf.len()] = time_dilation * timestep; +``` - target_interp_delay = avg_delay + (2.0 * avg_jitter); - avg_interp_delay = (31 / 32) * avg_interp_delay + (1 / 32) * (latest_snapshot_recv_time - interp_time); - - // too early -> positive error -> slow down - // too late -> negative error -> speed up - error = -(target_interp_delay - avg_interp_delay) +Interpolating received snapshots should be very similar. What we're interested in is the remaining time left in the snapshot buffer. You want to always have at least one snapshot ahead of the current "playback" time (so the client always has something to interpolate to). - // reset accumulator - accumulated_correction = 0.0 +## Predict or Delay? 
+The higher a client's ping, the more ticks they'll need to resim. Depending on the game, that might be too expensive to support all the clients in your target audience. -time_dilation = remap(error + accumulated_correction, -max_error, max_error, -0.1, 0.1) -accumulated_correction += time_dilation * delta_time +In those cases, we can trade more input delay for fewer resim ticks. Essentially, there are three meaningful moments in the round-trip of an input: -interp_time += (1.0 + time_dilation) * delta_time -interp_time = max(interp_time, predicted_time - max_lag_comp) +1. When the inputs are sent. +2. When the true simulation tick happens (conceptually). +3. When resulting update is received. + +```plaintext +0 <-+-+-+-+-+-+-+-+-+-+-+-> t + sim + / \ + / \ + / \ + / \ + / \ + send recv + |<-- RTT -->| ``` -The key idea here is that simplifying the client-server relationship is more efficient and has less problems. If you followed the Source engine model described [here][3], the server would have to apply inputs whenever they arrive, meaning the server also has to rollback and it also must deal with weird ping-related issues (see the lag compensation section in [this article][4]). If the server never accepts late inputs and never changes its pace, no one needs to coordinate. +This gives us a few options for time sync: + +1. **No rollback and adaptive input delay**, preferred for games with prediction disabled +2. **Limited rollback and adaptive input delay**, preferred for games using input determinism with prediction enabled +3. **Unlimited rollback and no input delay**, preferred for games using state transfer with prediction enabled +4. **Unlimited rollback with fixed input delay** as an alternative to #2 (for games allergic to variable input delay) + +"Adaptive input delay" here means "fixed input delay with more as needed." + +I think method #2 is best explaiend as a sequence of fallbacks. Clients first add a fixed amount of input delay. If that's more than their current RTT, they won't need to rollback. If that isn't enough, the client will rollback, but only up to a limit. If even the combination of fixed input delay and limited rollback doesn't cover RTT, more input delay will be added to fill the remainder. + +Method #3 is preferred for games that use state transfer because adding input delay would negatively impact the accuracy of server-side lag compensation. ## Predicted <-> Interpolated -Clients can't directly make persistent changes the authoritative state, but they're allowed to do whatever they want locally. Current plan is to just copy the latest authoritative state. If this ends up being too expensive (or when DSTs are supported), a copy-on-write layer is another option. +When the server has full authority, clients cannot directly write persistent changes to the authoritative state. However, it's perfectly okay for them to do whatever they want locally. That's all client-side prediction really is—local changes. Clients can just copy the latest authoritative state as a starting point. -We want to shift components between being predicted (extrapolated) and being interpolated. Either could be default. If interpolation is default, entities would reset to interpolated when modified by a server update. Users could then use specialized `Predicted` and `Confirmed` (equivalent to `Not(Predicted)`) query filters to address the two groups separately. I think these can piggyback off of Bevy's built-in reliable change detection. 
+We can also shift components between being predicted (extrapolated) and being interpolated. Either could be default. If interpolation is default, entities would reset to interpolated when modified by a server update. Users could then use specialized `Predicted` and `Confirmed` query filters to address the two separately. These can piggyback off of Bevy's built-in reliable change detection. -Systems would generate predictions by default, but users can opt-out with the `Predicted` filter to only process entities that have already been mutated by an earlier system. Users would typically use these filters for heavier logic like physics. Clients will naturally predict any entities driven by their input and any spawned by their input (until confirmed by the server). +This means systems predict by default, but users can opt-out with the `Predicted` filter to only process components that have already been mutated by an earlier system. Clients will naturally predict any entities driven by their input and any spawned by their input (until confirmed by the server). -Since sounds and particles require special consideration, they're probably best realized through dispatching events to be handled *outside* `NetworkFixedUpdate`. We can use these query filters to generate events that only trigger on authoritative changes and events that trigger on predicted changes to be confirmed or cancelled later. +## Predicted FX Events -How to uniquely identify these events is another question, though. +Sounds and particles need special consideration, since they can be predicted but are also typically handled outside of the fixed update. -## Predicting Entity Creation +We'll need events that can be confirmed or cancelled. The main requirement is tagging them with a unique identifier. Maybe hashing together the tick number, system ID, and entity ID would suffice. -This requires some special consideration. +TBD + +## Predicted Spawns + +This too requires special consideration. The naive solution is to have clients spawn dummy entities so that when an update that confirms the result arrives, they'll simply destroy the dummy and spawn the true entity. IMO this is a poor solution because it prevents clients from smoothly blending errors in the predicted spawn's rendered transform. Snapping its visuals wouldn't look right. @@ -258,7 +360,7 @@ I recommend 1 as it's the simplest method. Bandwidth and CPU resources would run ## Smooth Rendering -Rendering should happen after `NetworkFixedUpdate`. +Rendering should happen later in the frame, sometime after the fixed update. Whenever clients receive an update with new remote entities, those entities shouldn't be rendered until that update is interpolated. We can do this through a marker component or with a field in the render transform. @@ -266,16 +368,16 @@ Cameras need some special treatment. Look inputs need to be accumulated at the r We'll also need some way for developers to declare their intent that a motion should be instant instead of smoothly interpolated. Since it needs to work for remote entities as well, maybe this just has to be a bool on the networked transform. -While most visual interpolation is linear, we'll want another blend for quickly but smoothly correcting visual misprediction errors, which can occur for entities that are or just stopped being predicted. [Projective velocity blending][11] seems like the de facto standard method for these, but I've also seen simple exponential decays used. There may be better smoothing filters. 
+While most visual interpolation is linear, we'll want another blend for quickly but smoothly correcting visual misprediction errors, which can occur for entities that are or just stopped being predicted. [Projective velocity blending][11] seems like the de facto standard method for these, but I've also seen simple exponential decays used. There may be better error correction methods. ## Lag Compensation Lag compensation deals with colliders and needs to run after all motion and physics systems. All positions have to be settled or you'll get unexpected results. -It seems like a common strategy is to have the server estimate what interpolated state the client was looking at based on their RTT, but we can resolve this without any guesswork. Clients can just tell the server what they were looking at by bundling their interpolation parameters along with their inputs. With this information, the server can reconstruct what each client saw with *perfect* accuracy. +Similar to inputs, I've seen people try to have the server estimate which snapshots each client was interpolating based on their ping, but we can easily do better than that. Clients can just tell the server directly by sending their interpolation parameters along with their inputs. With this information, the server knows to do with *perfect* accuracy. No guesswork necessary. ```plaintext - + tick number (predicted) tick number (interpolated from) tick number (interpolated to) @@ -286,27 +388,37 @@ interpolation blend value So there are two ways to do the actual compensation: - Compensate upfront by bringing new projectiles into the present (similar to a rollback). -- Compensate over time ("amortized"), constantly testing projectiles against the history buffer. +- Compensate over time ("amortized"), constantly testing projectiles against a history buffer. There's a lot to learn from *Overwatch* here. -*Overwatch* shows that [we can treat time as another spatial dimension][5], so we can put the entire collider history in something like a BVH and test it all at once (with the amortized method). +*Overwatch* shows that [we can treat time as another spatial dimension][5], so we can put the entire collider history in something like a BVH and test it all at once (the amortized method). Essentially, you'd generate a bounding box for each collider that surrounds all of its historical poses and then test projectiles for hits against those first (broad-phase), then test those hits against bounding boxes blended between two snapshots (optional mid-phase), then the precise geometry blended between two snapshots (narrow-phase). For clients with too-high ping, their interpolation will lag far behind their prediction. If you only compensate up to a limit (e.g. 200ms), [those clients will have to extrapolate the difference][6]. Doing nothing is also valid, but lagging clients would abruptly have to start leading their targets. -*Overwatch* [allows defensive abilities to mitigate compensated projectiles][7]. AFAIK this is simple to do. If a player activates any defensive bonus, just apply it to all their buffered hitboxes. +You'd constrain the playback time like below and then run some extrapolation logic pre-update. -When a player is the child of another, uncontrolled entity (e.g. the player is a passenger in a vehicle), the non-predicted movement of that parent entity must be rewound during lag compensation, so that any projectiles fired by the player spawn in the correct location. 
See [here.][8] +```rust +playback_time = playback_time.max((curr_tick * timestep) - max_lag_compensation); +``` -## Messages (RPCs) +*Overwatch* [allows defensive abilities to mitigate compensated projectiles][7]. AFAIK this is simple to do. If a player activates any defensive bonus, just apply it to all their buffered colliders. -Messages are good for sending global alerts and any gameplay mechanics where raw inputs aren't expressive enough. For example, buying items bulk from an in-game menu. The server won't simulate UI, so it'd probably be simplest if the client sent a reliable message describing what they want. The "reply" in this example would be implicit in the received state. +When a player is the child of another, uncontrolled entity (e.g. the player is a passenger in a vehicle), the non-predicted movement of that parent entity must be rewound during lag compensation, so that any projectiles fired by the player spawn in the correct location. [See here.][8] -Messages can be reliable. They can also be postmarked to be processed on a certain tick like inputs. That can only be best effort (i.e. tick N or earliest), though. +## Messages (RPCs and events you can send!) -I don't really know what these should look like yet. A macro might be the most ergonomic choice, if it means a message can be completely defined in its relevant system. +Sometimes raw inputs aren't expressive enough. Examples include choosing a loadout and buying items from an in-game menu. Mispredicts aren't acceptable in these cases, however servers don't typically simulate UI. -TODO +So there's need for a dedicated type of optionally reliable message for text/UI-based and "send once" gameplay interactions. Similarly, global alerts from the server shouldn't clutter the game state. + +These messages can optionally be postmarked to be processed on a certain tick like inputs, but that can only be best effort (i.e. tick N or earliest). + +And while I gave examples of "requests" using these messages, those don't have to receive explicit replies. If the server confirms your purchased items, those would just appear in your inventory in a later snapshot. + +IDK what these should look like yet. A macro might be the most ergonomic choice, if it means a message can be defined in its relevant system. 
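
Purely to make the ergonomics concrete, here's one shape this could take; every name below (`NetMessage`, `Reliability`, `BuyItems`, the postmark hook) is hypothetical and only sketches the kind of thing a macro might generate.

```rust
// Hypothetical sketch of a gameplay message definition. Nothing here is settled.
#[derive(Debug, Clone)]
enum Reliability {
    Unreliable,
    Reliable,
}

// A "buy items" request, defined right next to the shop system that uses it.
#[derive(Debug, Clone)]
struct BuyItems {
    item_id: u32,
    count: u16,
}

// The macro could boil down to implementing a small trait like this.
trait NetMessage {
    const RELIABILITY: Reliability;
    // Optional postmark: "process on tick N or as early as possible after it."
    fn postmark(&self) -> Option<u64> {
        None
    }
}

impl NetMessage for BuyItems {
    const RELIABILITY: Reliability = Reliability::Reliable;
}
```

The reply stays implicit, as described above: if the purchase goes through, the items simply show up in the client's inventory in a later snapshot.
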
+ +TBD [1]: https://github.com/bevyengine/rfcs/pull/16 [2]: https://github.com/lemire/FastPFor/blob/master/headers/simple8b_rle.h @@ -321,4 +433,5 @@ TODO [11]: https://www.researchgate.net/publication/293809946_Believable_Dead_Reckoning_for_Networked_Games [12]: https://github.com/bevyengine/rfcs/pull/16#issuecomment-849878777 [13]: https://github.com/bevyengine/bevy/issues/32 -[14]: https://github.com/bevyengine/bevy/issues/32#issuecomment-821510244 \ No newline at end of file +[14]: https://github.com/bevyengine/bevy/issues/32#issuecomment-821510244 +[15]: https://johnaustin.io/articles/2019/fix-your-unity-timestep \ No newline at end of file From f39fc9f36f1e54737361f348adae76acf9ce1a1d Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Thu, 8 Jul 2021 12:09:28 -0500 Subject: [PATCH 39/43] formatting/typos --- implementation_details.md | 34 +++++++++++----------------------- 1 file changed, 11 insertions(+), 23 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 5e87b544..19486080 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -39,11 +39,9 @@ I know I've been using the terms "client" and "player" somewhat interchangeably, ## Storage -IMO a fast, data-agnostic networking solution is impossible without the ability to handle things on the bit-level. Memcpy and integer compression are orders of magnitude faster than deep serialization and DEFLATE. +IMO a fast, data-agnostic networking solution is impossible without the ability to handle things on the bit-level. Memcpy and integer compression are orders of magnitude faster than deep serialization and DEFLATE. To that end, each snapshot should be a pre-allocated memory arena. The core storage resource would then basically amount to a ring buffer of these arenas. -To that end, the each snapshot should be a pre-allocated memory arena. The core storage resource would then basically amount to a ring buffer of arenas. - -On the server, this resource would hold a ring buffer of deltas (for the last `N` snapshots) with 0th delta being a full copy of the latest networked state. On the client these would all be considered snapshots. +On the server, this would translate to a ring buffer of deltas (for the last `N` snapshots) along with a full copy of the latest networked state. On the client this would hold a bunch of snapshots. ```plaintext delta ringbuf copy of latest @@ -66,7 +64,7 @@ Even if change detection became optional, I don't think much speed would be lost TODO - Store/serialize networked data without struct padding so we're not wasting bandwidth. -- Support components and resources that allocate on the heap (DSTs). Won't be possible at first but the solution most likely will be backing this resource with its own memory arena (something like `bumpalo` but smarter). That will be important for deterministic desync detection as well. +- Components and resources that allocate on the heap (backed by the arena) may have some issues with the interest management send strategy. First, finding all of an entity's heap allocations is its own problem. Then, writing partial heap information could invalidate existing data on the client. ### Input Determinism @@ -109,9 +107,7 @@ I'm gonna gloss over how to check if an entity is inside someone's area of inter I don't see a need for ray, distance, and shape queries where BVH or spatial partitioning structures excel, so I'm looking into doing something like a [sweep-and-prune][9] (SAP) with Morton-encoding. 
Since SAP is essentially sorting an array, I imagine it might be faster. Alternatives like grids and potentially visible sets (PVS) can be explored and added later. -Those entities get written in the order of their oldest relevant changes. That way that the most pertinent information gets through if not everything can fit. - -Entities that the server always wants clients to know about or those that clients themselves always want to know about can be written without going through these checks or bandwidth constraints. +Those entities get written in the order of their oldest relevant changes. That way that the most pertinent information gets through if not everything can fit. Entities that the server always wants clients to know about or those that clients themselves always want to know about can be written without going through these checks or bandwidth constraints. So how is that metadata tracked? @@ -119,15 +115,11 @@ The **changed** field is simply an array of change ticks. We could reuse Bevy's The **relevant** field tracks whether or not a component should be sent to a certain client. This is how information is filtered at the component-level. By default, new changes would mark components as relevant for everybody, and then some form of filter rules (maybe [entity relations][10]) could selectively modify those. If a component value is sent, its relevance is reset to `false`. -The **lost** field has bit per change tick, per client. Whenever the server sends a packet, it jots down somewhere which entities were in it and their priorities. Later, if the server is notified that a packet was probably lost, it can pull this info and set the lost bits. - -If the delta matching the stored priority still exists, the server can use that as a reference to only set a minimal amount of lost bits for an entity. Otherwise, all its lost bits would be set. Similarly, on send the lost bit would be cleared. +The **lost** field has bit per change tick, per client. Whenever the server sends a packet, it jots down somewhere which entities were in it and their priorities. Later, if the server is notified that a packet was probably lost, it can pull this info and set the lost bits. If the delta matching the stored priority still exists, the server can use that as a reference to only set a minimal amount of lost bits for an entity. Otherwise, all its lost bits would be set. Similarly, on send the lost bit would be cleared. Honestly, I believe this is pretty good, but I'm still looking for something that's more accurate while using fewer bits (if possible). -The **priority** field just stores the end result of combining the other metadata. This array gets sorted and the `Some(entity)` are written in that order, until the client's packet is full or they're all written. - -I think having the server only send undelivered changes and prioritizing the oldest ones is better than assigning entities arbitrary update frequencies. +The **priority** field just stores the end result of combining the other metadata. This array gets sorted and the `Some(entity)` are written in that order, until the client's packet is full or they're all written. I think having the server only send undelivered changes and prioritizing the oldest ones is better than assigning entities arbitrary update frequencies. ### Interest Management Edge Cases @@ -190,12 +182,12 @@ Networked applications have to deal with relativity. Clocks will drift. Some rou So how do two computers even agree on *when* something happened? 
-It'd be really easy to answer that question if there was an *absolute* time reference. Luckily, we can make one. See, there are [two kinds of time][15]—plain ol' **wall-clock time** and **game time**—and we have complete control over the latter. The basic idea is pretty simple: Use a fixed timestep simulation and number the ticks in order. Doing that gives us a timeline of discrete moments that everyone shares (i.e. Tick 742 is the same in-game moment for everyone). +It'd be really easy to answer that question if there was an *absolute* time reference. Luckily, we can make one. See, there are [two kinds of time][15]—plain ol' **wall-clock time** and **game time**—and we have complete control over the latter. The basic idea is pretty simple: Use a fixed timestep simulation and number the ticks in order. Doing that gives us a timeline of discrete moments that everyone can share (i.e. Tick 742 is the same in-game moment for everyone). With this shared timeline strategy, clients can either: -- (Input Sync) Try to simulate ticks at the same wall-clock time. -- (State Sync) Try to have their inputs reach the server at same wall-clock time. +- Try to simulate ticks at the same wall-clock time. +- Try to have their inputs reach the server at same wall-clock time. When doing state sync, I'd recommend against having clients try to simulate ticks at the same time. To accomodate inputs arriving at different times, the server itself would have to rollback and resimulate or we'd have to change the strategy. For example, [Source engine][3] games (AFAIK) simulate the movement of each player at their individual send rates and *then* simulate the world at the regular tick rate. However, doing things their way makes having lower ping a technical advantage (search "lag compensation" in this [this article][4]), which I assume is the reason why ~~melee is bad~~ trading kills is so rare in Source engine games. @@ -258,9 +250,7 @@ let alpha = accumulator.overtick_percentage(x * timestep); Ideally, clients simulate any given tick ahead by just enough so their inputs reach the server right before it does. -One way I've seen people try to do this is to have clients estimate the wall-clock time on the server (using an SNTP handshake or similar) and from that schedule their next tick. That does work, but IMO it's too inaccurate. What we really care about is how much time passes between the server receiving an input and consuming it. That's what we want to control. The server can measure these wait times exactly and include them in the corresponding snapshot headers. Then clients can use those measurements to modify their tick rate and adjust their lead. - -For example, if its inputs are arriving too late (too early), a client can temporarily simulate more (less) frequently to converge on the correct lead. +One way I've seen people try to do this is to have clients estimate the wall-clock time on the server (using an SNTP handshake or similar) and from that schedule their next tick. That does work, but IMO it's too inaccurate. What we really care about is how much time passes between the server receiving an input and consuming it. That's what we want to control. The server can measure these wait times exactly and include them in the corresponding snapshot headers. Then clients can use those measurements to modify their tick rate and adjust their lead. E.g. if its inputs are arriving too late (too early), a client can briefly simulate more (less) frequently to converge on the correct lead. 
```rust if received_newer_server_update { @@ -293,9 +283,7 @@ Interpolating received snapshots should be very similar. What we're interested i ## Predict or Delay? -The higher a client's ping, the more ticks they'll need to resim. Depending on the game, that might be too expensive to support all the clients in your target audience. - -In those cases, we can trade more input delay for fewer resim ticks. Essentially, there are three meaningful moments in the round-trip of an input: +The higher a client's ping, the more ticks they'll need to resim. Depending on the game, that might be too expensive to support all the clients in your target audience. In those cases, we can trade more input delay for fewer resim ticks. Essentially, there are three meaningful moments in the round-trip of an input: 1. When the inputs are sent. 2. When the true simulation tick happens (conceptually). From 17d349020c1b81b47a0bd384b0fb87e69296b3a8 Mon Sep 17 00:00:00 2001 From: Joy <51241057+maniwani@users.noreply.github.com> Date: Fri, 9 Jul 2021 10:21:15 -0500 Subject: [PATCH 40/43] added snapshot interpolation back to time sync section --- implementation_details.md | 73 ++++++++++++++++++++++++++++++++------- 1 file changed, 60 insertions(+), 13 deletions(-) diff --git a/implementation_details.md b/implementation_details.md index 19486080..10e7797e 100644 --- a/implementation_details.md +++ b/implementation_details.md @@ -134,7 +134,7 @@ AFAIK this is only a problem for "kinded" entities that have archetype invariant I think #2 is the better solution. -TBD, I think there are more of these. +TBD, I'm sure there are more of these. ## Replicate Trait @@ -184,12 +184,12 @@ So how do two computers even agree on *when* something happened? It'd be really easy to answer that question if there was an *absolute* time reference. Luckily, we can make one. See, there are [two kinds of time][15]—plain ol' **wall-clock time** and **game time**—and we have complete control over the latter. The basic idea is pretty simple: Use a fixed timestep simulation and number the ticks in order. Doing that gives us a timeline of discrete moments that everyone can share (i.e. Tick 742 is the same in-game moment for everyone). -With this shared timeline strategy, clients can either: +With this shared timeline strategy, clients have two, mutually exclusive options: - Try to simulate ticks at the same wall-clock time. - Try to have their inputs reach the server at same wall-clock time. -When doing state sync, I'd recommend against having clients try to simulate ticks at the same time. To accomodate inputs arriving at different times, the server itself would have to rollback and resimulate or we'd have to change the strategy. For example, [Source engine][3] games (AFAIK) simulate the movement of each player at their individual send rates and *then* simulate the world at the regular tick rate. However, doing things their way makes having lower ping a technical advantage (search "lag compensation" in this [this article][4]), which I assume is the reason why ~~melee is bad~~ trading kills is so rare in Source engine games. +When using state transfer, I'd recommend against having clients try to simulate ticks at the same time. To accomodate inputs arriving at different times, the server itself would have to rollback and resimulate or you'd have to change the strategy. For example, [Source engine][3] games (AFAIK) simulate the movement of each player at their individual send rates and *then* simulate the world at the regular tick rate. 
However, doing things their way makes having lower ping a technical advantage (search "lag compensation" in this [this article][4]), which I assume is the reason why ~~melee is bad~~ trading kills is so rare in Source engine games. ### A Relatively Fixed Timestep @@ -231,7 +231,7 @@ impl Accumulator { } ``` -Here's how it's typically used. Notice the time dilation. It's changing the time->tick exchange rate to produce more or fewer simulation steps per unit time. Note that this time dilation only affects the simulation rate. Inside the systems running in the fixed update, you should use the normal fixed timestep for the value of dt. +Here's how it's typically used. Notice the time dilation. It's changing the time->tick exchange rate to produce more or fewer simulation steps per unit time. Just so you know, this time dilation should only affect the tick rate. Inside the systems running in the fixed update, you should always use the normal fixed timestep for the value of dt. ```rust // Determine the exchange rate. @@ -253,13 +253,18 @@ Ideally, clients simulate any given tick ahead by just enough so their inputs re One way I've seen people try to do this is to have clients estimate the wall-clock time on the server (using an SNTP handshake or similar) and from that schedule their next tick. That does work, but IMO it's too inaccurate. What we really care about is how much time passes between the server receiving an input and consuming it. That's what we want to control. The server can measure these wait times exactly and include them in the corresponding snapshot headers. Then clients can use those measurements to modify their tick rate and adjust their lead. E.g. if its inputs are arriving too late (too early), a client can briefly simulate more (less) frequently to converge on the correct lead. ```rust -if received_newer_server_update { - avg_input_wait_time = (1.0 - a) * avg_input_wait_time + a * latest_input_wait_time; - // Negate here because I'm scaling the timestep and not the rate. +if received_newer_server_update { + /* ... updates packet statistics ... */ + // measurements of the input wait time and input arrival delta come from the server + target_input_wait_time = max(timestep, avg_input_arrival_delta + safety_factor * input_arrival_dispersion) + + // I'm negating here because I'm scaling the timestep and not the tick rate. // i.e. 110% tick rate => 90% timestep error = -(target_input_wait_time - avg_input_wait_time); } +// This logic executes every tick. + // Anything we hear back from the server is always a round-trip old. // We want to drive this feedback loop with a more up-to-date estimate // to avoid overshoot / oscillation. @@ -279,7 +284,48 @@ time_dilation = time_dilation.clamp(min_dilation, max_dilation); *ringbuf[curr_tick % ringbuf.len()] = time_dilation * timestep; ``` -Interpolating received snapshots should be very similar. What we're interested in is the remaining time left in the snapshot buffer. You want to always have at least one snapshot ahead of the current "playback" time (so the client always has something to interpolate to). +Interpolating received snapshots is very similar. What we're interested in is the remaining time left in the snapshot buffer. You want to always have at least one snapshot ahead of the current "playback" time (so the client always has something to interpolate to). + +```rust +if received_newer_server_update { + /* ... updates packet statistics ... 
-Interpolating received snapshots should be very similar. What we're interested in is the remaining time left in the snapshot buffer. You want to always have at least one snapshot ahead of the current "playback" time (so the client always has something to interpolate to).
+Interpolating received snapshots is very similar. What we're interested in is the remaining time left in the snapshot buffer. You want to always have at least one snapshot ahead of the current "playback" time (so the client always has something to interpolate to).
+
+```rust
+if received_newer_server_update {
+    /* ... updates packet statistics ... */
+    target_interpolation_delay = max(server_send_interval, avg_update_arrival_delta + safety_factor * update_arrival_dispersion);
+    time_since_last_update = 0.0;
+}
+
+// This logic executes every frame.
+
+// Calculate the current interpolation.
+// Network conditions are assumed to be constant between updates.
+current_interpolation_delay = last_update_received_time + time_since_last_update - playback_time;
+
+// I'm negating here because I'm scaling time and not frequency.
+// i.e. 110% freq => 90% time
+error = -(target_interpolation_delay - current_interpolation_delay);
+time_dilation = (error - min_error) * (max_dilation - min_dilation) / (max_error - min_error);
+time_dilation = time_dilation.clamp(min_dilation, max_dilation);
+
+playback_time += time.delta_seconds() * (1.0 + time_dilation);
+time_since_last_update += time.delta_seconds();
+
+// Determine the two snapshots and blend alpha.
+let i = buf.partition_point(|&snapshot| (snapshot.tick as f32 * timestep) < playback_time);
+let (from, to, blend) = if i == 0 {
+    // Current playback time is behind all buffered snapshots.
+    (buf[0].tick, buf[0].tick, 0.0)
+} else if i == buf.len() {
+    // Current playback time is ahead of all buffered snapshots.
+    // Here, I'm just clamping to the latest, but you could extrapolate instead.
+    (buf[buf.len() - 1].tick, buf[buf.len() - 1].tick, 0.0)
+} else {
+    let a = buf[i-1].tick;
+    let b = buf[i].tick;
+    let blend = ((b as f32 * timestep) - playback_time) / ((b - a) as f32 * timestep);
+    (a, b, blend)
+}
+
+// Go forth and (s)lerp.
+```

## Predict or Delay?

@@ -304,15 +350,16 @@ The higher a client's ping, the more ticks they'll need to resim. Depending on t

This gives us a few options for time sync:

1. **No rollback and adaptive input delay**, preferred for games with prediction disabled
-2. **Limited rollback and adaptive input delay**, preferred for games using input determinism with prediction enabled
-3. **Unlimited rollback and no input delay**, preferred for games using state transfer with prediction enabled
-4. **Unlimited rollback with fixed input delay** as an alternative to #2 (for games allergic to variable input delay)
+2. **Bounded rollback and adaptive input delay**, preferred for games using input determinism with prediction enabled
+3. **Unbounded rollback and no/fixed input delay**, preferred for games using state transfer with prediction enabled

"Adaptive input delay" here means "fixed input delay with more as needed."

-I think method #2 is best explaiend as a sequence of fallbacks. Clients first add a fixed amount of input delay. If that's more than their current RTT, they won't need to rollback. If that isn't enough, the client will rollback, but only up to a limit. If even the combination of fixed input delay and limited rollback doesn't cover RTT, more input delay will be added to fill the remainder.
+Method #1 basically tries to ensure packets are always received before they're needed by the simulation. The client will add as much input delay as needed to avoid stalling.
+
+Method #2 is best explained as a sequence of fallbacks. Clients first add a fixed amount of input delay. If that's more than their current RTT, they won't need to rollback. If that isn't enough, the client will rollback, but only up to a limit. If even the combination of fixed input delay and maximum rollback doesn't cover RTT, more input delay will be added to fill the remainder.
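That fallback sequence is simple to write down. A minimal sketch (illustrative only; the function and parameter names are made up, and everything is measured in whole ticks):

```rust
/// How many ticks of rollback a client actually needs: whatever the fixed
/// input delay doesn't cover, capped at the rollback budget.
fn rollback_ticks(rtt: u32, fixed_delay: u32, max_rollback: u32) -> u32 {
    rtt.saturating_sub(fixed_delay).min(max_rollback)
}

/// Whatever neither the fixed delay nor the rollback budget covers gets
/// filled with additional input delay.
fn extra_input_delay(rtt: u32, fixed_delay: u32, max_rollback: u32) -> u32 {
    rtt.saturating_sub(fixed_delay + max_rollback)
}
```

E.g. with a fixed delay of 2 ticks and a rollback budget of 5 ticks, a 10-tick RTT would roll back 5 ticks and add 3 more ticks of input delay.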
-Method #3 is preferred for games that use state transfer because adding input delay would negatively impact the accuracy of server-side lag compensation.
+Method #3 is preferred for games that use state transfer. Adding input delay would negatively impact the accuracy of server-side lag compensation, so it should almost always be set to zero in those cases. Unlike method #2, games that use input determinism might prefer a constant input delay even if it means their game can stutter.

## Predicted <-> Interpolated


From ae03f018d7a15296c70ccb3ce257378ef2454a30 Mon Sep 17 00:00:00 2001
From: Joy <51241057+maniwani@users.noreply.github.com>
Date: Fri, 9 Jul 2021 10:24:18 -0500
Subject: [PATCH 41/43] forgot the header

---
 implementation_details.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/implementation_details.md b/implementation_details.md
index 10e7797e..479b04d0 100644
--- a/implementation_details.md
+++ b/implementation_details.md
@@ -284,6 +284,8 @@ time_dilation = time_dilation.clamp(min_dilation, max_dilation);

*ringbuf[curr_tick % ringbuf.len()] = time_dilation * timestep;
```

+### Snapshot Interpolation
+
Interpolating received snapshots is very similar. What we're interested in is the remaining time left in the snapshot buffer. You want to always have at least one snapshot ahead of the current "playback" time (so the client always has something to interpolate to).

```rust

From 627fb218c83ede367078c0390fd5e0a54f315c74 Mon Sep 17 00:00:00 2001
From: Joy <51241057+maniwani@users.noreply.github.com>
Date: Fri, 9 Jul 2021 10:27:20 -0500
Subject: [PATCH 42/43] algebruh mistake

---
 implementation_details.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/implementation_details.md b/implementation_details.md
index 479b04d0..89d5da9c 100644
--- a/implementation_details.md
+++ b/implementation_details.md
@@ -322,7 +322,7 @@ let (from, to, blend) = if i == 0 {
} else {
    let a = buf[i-1].tick;
    let b = buf[i].tick;
-    let blend = ((b as f32 * timestep) - playback_time) / ((b - a) as f32 * timestep);
+    let blend = (playback_time - (a as f32 * timestep)) / ((b - a) as f32 * timestep);
    (a, b, blend)
}

From e6db84179906fa01db766ecef6c7145ac73b7185 Mon Sep 17 00:00:00 2001
From: Joy <51241057+maniwani@users.noreply.github.com>
Date: Fri, 9 Jul 2021 10:40:46 -0500
Subject: [PATCH 43/43] fixed typo

snapshot time is the time stored in the snapshot, not the time when it
was received
---
 implementation_details.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/implementation_details.md b/implementation_details.md
index 89d5da9c..7a508a61 100644
--- a/implementation_details.md
+++ b/implementation_details.md
@@ -299,7 +299,7 @@ if received_newer_server_update {

// Calculate the current interpolation.
// Network conditions are assumed to be constant between updates.
-current_interpolation_delay = last_update_received_time + time_since_last_update - playback_time;
+current_interpolation_delay = (latest_snapshot_tick * timestep) + time_since_last_update - playback_time;

// I'm negating here because I'm scaling time and not frequency.
// i.e. 110% freq => 90% time
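As a quick sanity check of those last two fixes, here's a tiny sketch with made-up numbers (not from the patches) showing that the corrected lines behave as intended:

```rust
fn main() {
    let timestep = 1.0_f64 / 60.0;

    // Patch 42: blend measures progress from the older snapshot `a` toward `b`.
    let (a, b) = (100u32, 102u32);
    let playback_time = 100.5 * timestep;
    let blend = (playback_time - (a as f64 * timestep)) / ((b - a) as f64 * timestep);
    assert!((blend - 0.25).abs() < 1e-9);

    // Patch 43: the interpolation delay is anchored to the tick stored in the
    // snapshot, not the wall-clock time the snapshot happened to arrive.
    let latest_snapshot_tick = 104u32;
    let time_since_last_update = 0.25 * timestep;
    let current_interpolation_delay =
        (latest_snapshot_tick as f64 * timestep) + time_since_last_update - playback_time;
    assert!((current_interpolation_delay - 3.75 * timestep).abs() < 1e-9);
}
```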