Skip to content

Timed-out calls can result in out-of-order responses for subsequent calls #100

@AmanTallarium

Description

@AmanTallarium

We've been investigating some mismatched responses being returned from NodeJS.call/3, which I think I've traced back to how timeouts are handled in NodeJS.Worker.

I've created a test which can reproduce the issue we're seeing in a fork of this library: https://github.com/revelrylabs/elixir-nodejs/compare/master...AmanTallarium:elixir-nodejs:out-of-order-responses?expand=1

The result of the above test when running locally is:

 1) test overriding call timeout timed out calls don't result in out-of-order responses (NodeJS.Test)
     test/nodejs_test.exs:184
     Assertion with == failed
     code:  assert expected == actual
     left:  [1, 2, 3, 4, 5, 6, 7, 8, 9, "Call timed out.", 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
     right: [1, 2, 3, 4, 5, 6, 7, 8, 9, "Call timed out.", 11, 12, 10, 13, 14, 15, 16, 17, 18, 19]
     stacktrace:
       test/nodejs_test.exs:197: (test)

Note the 10 appearing after 12 in the actual result, when it should have been discarded due to the timeout of the 10th execution.

Essentially, if a given invocation to NodeJS.call times out, it is possible that a future invocation to NodeJS.call can receive a result intended to be received by the first call. This can mean an entire sequence of calls can get a off-by-one error.

I'm not sure what the ideal resolution for this would be. I've had the some ideas, but would love some feedback before I attempt to implement any of these:

  1. On a timeout in NodeJS.Worker.get_response/2, kill the current GenServer so that poolboy restarts it with a clean slate, and (hopefully) no chance of reading old responses. This feels a bit icky though.
  2. Attach some sort of "invocation id" to each message sent over the port, and only handle response messages with the corresponding ID. This is similar to the current @prefix is being used to ignore unintended output on stdout, and is similar to how many multiplexed protocols work.

I'm sure someone will have a better idea on how to resolve this, and I'd be happy to help with any implementation if desired.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions