Question about API design, regarding transactions #11

@insyri

Description


TL;DR: I'd like to build modular repository functions that support transactions. I read the Transactions doc and understand that the project focuses on I/O batching rather than transaction semantics, and recommends Lua scripts for atomic work. Before I invest in a fork or an in-house wrapper, I'd appreciate help understanding why transactions were avoided at the API level, along with any concrete pitfalls you encountered. This would help me choose which path to pursue.


Hi there,

Before I ask my question, I'd just like to say that I really appreciate the amount of documentation you put into this project. After finding it, I had a relatively frictionless experience checking whether the library fits my engineering goals: I could look over the whole library and understand the heart of RedPipe through both the Rationale page in the docs and the source code. It's very approachable and accessible, which is a refreshing departure from conventional documentation styles and helped me become comfortable with the library quickly. Thank you!

My goal is to write many small functions that can be composed into a single pipeline or executed as a transaction when needed. Take an atomic transfer as an example: check balance, then debit + credit + ledger entry. In code, that might look like:

class WalletRepo:
    # init ...
    def credit(self, user: str, amount: int, ledger_entry: str, pipe: Pipeline):
        ...

    def debit(self, user: str, amount: int, ledger_entry: str, pipe: Pipeline):
        ...

    def transfer(self, from_user: str, to_user: str, amount: int, ledger_entry: str, pipe: Pipeline):
        ...

    def balance(self, user: str, pipe: Pipeline):
        ...

    def write_ledger_entry(self, user: str, entry: str, pipe: Pipeline):
        ...

Then, in an entry-point function, I can compose these methods onto a single pipeline, as in the sketch below.
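
Something like this is what I'm picturing (the entry-point function and wiring are hypothetical; I'm assuming redpipe.pipeline() as the pipeline constructor, per the docs):

import redpipe

def transfer_entry_point(repo: WalletRepo, src: str, dst: str, amount: int):
    # One logical operation, one round-trip: every repo method queues
    # its commands onto the same pipeline and hands back a Future.
    with redpipe.pipeline() as pipe:
        repo.debit(src, amount, f'transfer to {dst}', pipe)
        repo.credit(dst, amount, f'transfer from {src}', pipe)
        new_balance = repo.balance(dst, pipe)
        pipe.execute()
    return new_balance  # the Future proxies the resolved value after execute()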

I like RedPipe's deferred Future model for composability; however, RedPipe states that its focus is I/O efficiency via batching, that rich transactional support is not a core goal, and that Lua scripts should be used instead. (Citation) For my goals, I need true atomic work (e.g., check-and-mutate across multiple keys). I'm interested in either creating my own fork of RedPipe or writing a smaller in-house library focused on modular functions with transactions. Before doing either, I'd like to understand the struggles that led you to leave transactional support out of RedPipe.
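
For reference, the Lua route looks roughly like this in plain redis-py (the script body and key names are mine, just to illustrate the atomic check-and-mutate shape):

import redis

r = redis.Redis()

# Atomically check the balance and, only if sufficient, debit it and
# append a ledger entry. The whole script runs as one atomic unit.
debit_if_sufficient = r.register_script("""
local balance = tonumber(redis.call('GET', KEYS[1]) or '0')
if balance < tonumber(ARGV[1]) then
    return -1  -- insufficient funds; nothing is mutated
end
redis.call('DECRBY', KEYS[1], ARGV[1])
redis.call('RPUSH', KEYS[2], ARGV[2])
return balance - tonumber(ARGV[1])
""")

result = debit_if_sufficient(keys=['wallet:alice', 'ledger:alice'],
                             args=[25, 'debit 25 for order 42'])

This works, but it gives up exactly the composability I'm after: the logic lives in one script string rather than in small Python functions.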

My question is: What API design components / architectural reasons of RedPipe make it difficult for transactional support?

First, here is my estimate of why it could be difficult. A natural starting point is the structure of MULTI/EXEC (transactional) pipelines, which differs from that of normal pipelines. Normal pipelines (those without MULTI/EXEC) just buffer commands client-side and send them as one batch, which the server runs immediately on arrival: the core idea is simply to reduce round-trips. Transactional pipelines mix immediate and deferred execution, which is difficult to track throughout the code unless the library detects it programmatically; otherwise the programmer becomes responsible for it, and the abstraction and ergonomic benefits of the library are lost. Consider the two pipeline types:

Usual pipelines

  1. Client buffers commands locally.
  2. Sends them in one batch to Redis.
  3. Redis executes them sequentially but non-atomically, returning results in order.

Pipelines do not guarantee atomicity between commands. They only reduce round-trips.
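
In redis-py terms (RedPipe sits on top of redis-py, so this is the underlying behavior; key names are illustrative):

import redis

r = redis.Redis()
pipe = r.pipeline(transaction=False)  # plain pipeline: no MULTI/EXEC
pipe.incrby('wallet:alice', 10)
pipe.incrby('wallet:bob', 5)
results = pipe.execute()  # one round-trip; results arrive in command order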

Transactional pipelines

  1. WATCH [keys...] --> Marks keys for optimistic locking. If any watched key changes before EXEC, the transaction aborts.
  2. Any commands sent before MULTI are executed immediately (not queued).
  3. MULTI --> Starts a transaction block.
  4. Subsequent commands are queued, not executed.
  5. EXEC --> Executes all queued commands atomically. If any watched key changed, the transaction is aborted.
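
The classic check-and-set loop in redis-py exercises all five steps (key names are illustrative):

import redis

r = redis.Redis()

def debit_if_sufficient(user: str, amount: int) -> bool:
    with r.pipeline() as pipe:
        while True:
            try:
                pipe.watch(f'wallet:{user}')                    # step 1: optimistic lock
                balance = int(pipe.get(f'wallet:{user}') or 0)  # step 2: runs immediately
                if balance < amount:
                    pipe.unwatch()
                    return False
                pipe.multi()                                    # step 3: start queueing
                pipe.decrby(f'wallet:{user}', amount)           # step 4: queued, not run
                pipe.execute()                                  # step 5: atomic EXEC
                return True
            except redis.WatchError:
                continue                                        # watched key changed; retry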

RedPipe wraps each pipelined command's result in a Future object, which improves ergonomics: the library maintains an internal stack of queued commands and maps the results back to their respective Future references after execution. This lets many modular functions share a pipeline while the programmer keeps handles to the return values they care about post-execution. The system presupposes that every value resolves in the future, never immediately; for a robust system, the library would also need to make that distinction visible to the programmer.
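
To make sure we're talking about the same mechanism, here is a toy sketch of the idea as I understand it (my own code, not RedPipe's internals):

class Future:
    """Placeholder for a result that exists only after execute()."""
    def __init__(self):
        self._value = None
        self._resolved = False

    def set(self, value):
        self._value, self._resolved = value, True

    @property
    def result(self):
        if not self._resolved:
            raise RuntimeError('pipeline has not executed yet')
        return self._value

class MiniPipeline:
    def __init__(self, client):
        self._client = client
        self._stack = []  # (command name, args, Future) in call order

    def get(self, key):
        future = Future()
        self._stack.append(('get', (key,), future))  # defer; never execute now
        return future

    def execute(self):
        pipe = self._client.pipeline(transaction=False)
        for name, args, _ in self._stack:
            getattr(pipe, name)(*args)
        # Results come back in order, so they map 1:1 onto the stack.
        for (_, _, future), value in zip(self._stack, pipe.execute()):
            future.set(value)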

In transactional pipelines, this presupposition collapses: the pipeline could be in either step 2 (run immediately, not queued) or step 4 (queued, the default behavior). This complicates the use of Future objects, and it would be easy to end up building a high-level obstacle instead of a high-level abstraction. The core issue is mixing immediate execution semantics with deferred/Future semantics.
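
Concretely, in the redis-py terms above: after WATCH, a read has to produce a real value right away because control flow depends on it, which is exactly what a deferred Future cannot do.

import redis

r = redis.Redis()
with r.pipeline() as pipe:
    pipe.watch('wallet:alice')
    # Immediate mode: this must return actual bytes now, since the
    # branch below depends on it. A Future-returning wrapper would
    # have nothing to resolve it with until EXEC.
    balance = int(pipe.get('wallet:alice') or 0)
    if balance >= 25:
        pipe.multi()
        # Queued mode: from here on, the same calls return placeholders.
        pipe.decrby('wallet:alice', 25)
        pipe.execute()
    else:
        pipe.unwatch()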

Also, I can see why generic transaction support on Redis Cluster is effectively impossible: MULTI/EXEC only applies when every key hashes to the same slot, so cross-slot limitations make generic multi-key transactions hard to design for in any wrapper. That alone is a practical reason for a library to avoid complex transaction APIs.
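
(For completeness: the usual cluster workaround is hash tags, which pin related keys to one slot, but that pushes a data-modeling constraint onto every caller, which a generic wrapper can't assume. Key names here are illustrative.)

# Only the substring inside {} is hashed, so these keys always share a
# slot and may appear together in a MULTI/EXEC or a Lua script.
r.set('{wallet:alice}:balance', 100)
r.rpush('{wallet:alice}:ledger', 'opening balance 100')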

In the end, I was hoping to get your feedback on what made transactions cumbersome to implement and whether my commentary is along the lines of your thinking. Were there specific race conditions or sources of complexity in the codebase that made you avoid transactions? Did you consider adding a 'transactional pipeline' mode (perhaps one where Futures can be either immediate or queued)?

Thank you
