Power of the RPC Abstraction

When one first encounters the concept of Remote Procedure Calls (RPCs), it seems obvious. What’s the big deal? We can understand the power of the RPC abstraction only if we try to implement distributed systems without using RPCs. And this is what usually happens in the first assignment in CS 425 — Distributed Systems.

The assignment involves implementing a PingAck system for failure detection. Many students (including myself) were unaware of RPC libraries and built the system from scratch.

Very likely, each node ends up with two threads/goroutines: one that sends the Pings and one that receives the Acks.
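That hand-rolled design might look something like the following sketch (Python over UDP sockets; the port number and the message format are made up for illustration, not taken from any assignment):

```python
# A minimal sketch of the "from scratch" approach: one thread sends Pings,
# another receives (and answers) messages. Port and wire format are illustrative.
import socket
import threading
import time

PORT = 9000  # hypothetical port every node listens on

def send_pings(peers, sock):
    """Periodically send a Ping to every peer."""
    while True:
        for ip in peers:
            sock.sendto(b"PING", (ip, PORT))
        time.sleep(1)

def receive_acks(sock):
    """Answer incoming Pings and report received Acks, forever."""
    while True:
        data, addr = sock.recvfrom(1024)
        if data == b"PING":
            sock.sendto(b"ACK", addr)  # acknowledge the ping
        elif data == b"ACK":
            print(f"{addr[0]} is alive")

def main(peers):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", PORT))
    threading.Thread(target=receive_acks, args=(sock,), daemon=True).start()
    send_pings(peers, sock)
```

Notice that the sending and receiving logic are completely decoupled: correlating a particular Ping with its Ack, and timing out on missing Acks, is extra bookkeeping you would have to write yourself.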

Now imagine if this were not a distributed setting and you had to do something like PingAck in a single program — i.e. Ping (check something) and get an Ack (a return value): how would you implement it?

You would do it as a function. Right? You just pass an input to a function and it returns a value. The compiler/OS handles everything else for you.

So then why can’t you implement PingAck like that in a distributed setting? As just a function PingAck(node_ip) that returns Ack or times out? Why do you need two threads/goroutines, one for giving your input and one for receiving the return value?

Remote Procedure Calls (RPCs) are exactly that. In layman's terms, an implementation of RPC is the boilerplate code you would need to implement your distributed PingAck as just:

Ack = PingAck(node_ip)

if Ack == "":
    print("Node Dead")
else:
    print("Node Alive")

RPCs are complex pieces of software (for several reasons). One of the most popular implementations is gRPC from Google, available in many languages, including C++ and Go.
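gRPC itself takes some setup (Protobuf definitions, code generation), so as a stand-in, here is a minimal sketch of the same idea using Python's built-in xmlrpc library. The function names and the use of xmlrpc instead of gRPC are my own choices for illustration:

```python
# The RPC library generates all the networking boilerplate, so PingAck
# really does become a plain function call.
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def ping():
    """The remote procedure: just acknowledge."""
    return "ACK"

def serve(port=0):
    """Start the RPC server in a background thread; returns the server."""
    server = SimpleXMLRPCServer(("127.0.0.1", port), logRequests=False)
    server.register_function(ping, "ping")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def ping_ack(node_ip, port):
    """Call ping() on a remote node; '' means the node looks dead.

    A real failure detector would also set a timeout on the call.
    """
    try:
        proxy = ServerProxy(f"http://{node_ip}:{port}")
        return proxy.ping()
    except OSError:
        return ""
```

The two threads have not disappeared; the library runs them for you, which is exactly the point of the abstraction.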

Note: “Why can’t I do something in a distributed setting exactly like how I would do it in a single-machine setting” is a powerful line of thinking. E.g. if you applied that to Python ML programs, you would end up with the famous Ray Project.

When RPC Is Not a Good Choice

In a later assignment, students had to send large files across nodes. Because they had been using RPCs until then, using the streaming feature of gRPC seemed like an obvious choice.

But the question is: “Are RPCs the right choice for sending large files over the network?” Likely not. Quoting from [1]: “The whole point of gRPC and Protobuf is to optimize serialization. When transferring files you don’t need serialization.”

  1. RPC protocols are designed to “package function calls into binaries efficiently such that they can be executed at any machine running the protocol.”
  2. When sending files, you are just sending data — there is not much “structure” to that data to deserve such packaging (a.k.a. serialization).
  3. So to send files you just do it like the rest of the internet usually does: simple HTTP servers. (There might be other/better choices — if you find out, let me know!)