RPC: Remote Procedure Call

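What is RPC?

Remote Procedure Call (RPC) is a protocol for requesting a service from a program on a remote computer over the network, without needing to understand the underlying network technology. It lets a program on one machine call a subroutine on another machine without being aware that the subroutine is remote. In other words, RPC is an abstraction for communication: it hides the message-passing nature of network communication from the user.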

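How RPC works

An RPC proceeds as follows:

1. The service consumer (client) invokes the service as if it were a local call.
2. The client stub receives the call and packs the method, arguments, and so on into a message that can be sent over the network. This is called marshalling.
3. The client stub locates the service address and sends the message to the server. The transport may be connection-oriented or connectionless.
4. The server stub receives the message and decodes it. This is called unmarshalling.
5. The server stub calls the local service according to the decoded result.
6. The local service executes and returns the result to the server stub.
7. The server stub packs the result into a message and sends it to the consumer.
8. The client stub receives the message and decodes it.
9. The service consumer obtains the final result; the return value is placed on the local process's stack.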

marshalling & unmarshalling

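Marshalling: serialization, that is, converting a data structure into a transmittable format (e.g. {2, 42, 50.00}).
Unmarshalling: parsing the received data, e.g. extracting the account ID from bytes 1-4 of msg.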
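As a concrete illustration of the two definitions, here is a minimal Python sketch using the standard `struct` module. The wire layout is an assumption chosen to match the byte offsets mentioned in these notes (byte 0 is the request type, bytes 1-4 the account ID, bytes 5-12 the amount); the names `DEPOSIT_FORMAT`, `marshal_deposit`, and `unmarshal_deposit` are hypothetical.

```python
import struct

# Assumed layout: byte 0 = request type, bytes 1-4 = account ID,
# bytes 5-12 = amount. "!" selects network byte order with no padding.
DEPOSIT_FORMAT = "!BId"  # uint8, uint32, float64

def marshal_deposit(account: int, amount: float) -> bytes:
    """Turn the in-memory values {2, account, amount} into bytes."""
    return struct.pack(DEPOSIT_FORMAT, 2, account, amount)

def unmarshal_deposit(msg: bytes) -> tuple:
    """Recover (type, account, amount) from the received bytes."""
    return struct.unpack(DEPOSIT_FORMAT, msg)

msg = marshal_deposit(42, 50.00)
print(len(msg))                # 13
print(unmarshal_deposit(msg))  # (2, 42, 50.0)
```

A fixed binary layout like this is only one option; text formats such as JSON trade compactness for readability.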

client() {
   s = socket(UDP)
   msg = {2, 42, 50.00}           // marshalling: type 2 (deposit), account 42, amount 50.00
   send(s -> server_address, msg)

   response = receive(s)
   check response == "OK"

   msg = {1, 42}                  // type 1 (balance query), account 42
   send(s -> server_address, msg)
   response = receive(s)

   print "balance is " + response
}

server() {
   s = socket(UDP)
   bind s to port 1024
   while (1) {
      msg, client_addr = receive(s)
      type = byte 0 of msg
      if (type == 1) {                  // balance query
          account = bytes 1-4 of msg    // unmarshalling
          result = balance(account)
          send(s -> client_addr, result)
      } else if (type == 2) {           // deposit
          account = bytes 1-4 of msg
          amount = bytes 5-12 of msg
          deposit(account, amount)
          send(s -> client_addr, "OK")
      }
   }
}
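The pseudocode above can be turned into a small runnable Python sketch with the standard `socket` and `struct` modules. Several details are assumptions made for illustration and are not part of the original: the binary layout (type in byte 0, account in bytes 1-4, amount in bytes 5-12), the in-memory `accounts` table, the 2-second client timeout, binding to an OS-chosen loopback port instead of port 1024, and a `serve` function that answers a fixed number of requests rather than looping forever.

```python
import socket
import struct
import threading

BALANCE, DEPOSIT = 1, 2          # request type codes from the pseudocode
accounts = {42: 0.0}             # hypothetical in-memory account table

def handle(msg: bytes) -> bytes:
    """Unmarshal one request, execute it, and marshal the reply."""
    kind = msg[0]
    if kind == BALANCE:
        (account,) = struct.unpack("!I", msg[1:5])
        return struct.pack("!d", accounts[account])
    if kind == DEPOSIT:
        account, amount = struct.unpack("!Id", msg[1:13])
        accounts[account] += amount
        return b"OK"
    return b"ERR"

def serve(sock: socket.socket, requests: int) -> None:
    """Answer a fixed number of datagrams, then return."""
    for _ in range(requests):
        msg, client_addr = sock.recvfrom(1024)
        sock.sendto(handle(msg), client_addr)

def client(server_addr) -> float:
    """Deposit 50.00 into account 42, then query and return the balance."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(2.0)        # assumption: give up instead of blocking forever
        s.sendto(struct.pack("!BId", DEPOSIT, 42, 50.00), server_addr)
        reply, _ = s.recvfrom(1024)
        if reply != b"OK":
            raise RuntimeError("deposit failed")
        s.sendto(struct.pack("!BI", BALANCE, 42), server_addr)
        reply, _ = s.recvfrom(1024)
        return struct.unpack("!d", reply)[0]

def demo() -> float:
    """Run server and client in one process over the loopback interface."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server_sock.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    worker = threading.Thread(target=serve, args=(server_sock, 2), daemon=True)
    worker.start()
    try:
        return client(server_sock.getsockname())
    finally:
        worker.join(timeout=2.0)
        server_sock.close()
```

Calling `print("balance is", demo())` runs one deposit and one balance query. Note how much of this code is message plumbing rather than banking logic; hiding that plumbing is the job of the stubs.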

stub

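In an RPC mechanism, a stub is a proxy that hides the low-level details of a remote call, so that the remote call looks just like a local one.

Client stub
- Packs the arguments of the function call and converts them into a format suitable for network transmission (serialization/marshalling).
- Sends the data to the server and waits for the server's response.
- Parses the server's response (deserialization/unmarshalling) and returns the result to the client application.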

deposit_stub(int account, float amount) {
   // marshall request type + arguments into buffer
   // send request to server
   // wait for reply
   // decode response
   // return result
}
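The commented steps in `deposit_stub` could be filled in roughly as follows. This is a sketch under assumptions: the transport is modelled as a hypothetical callable that sends request bytes and returns the reply bytes, the message layout (type byte, 4-byte account, 8-byte amount) is illustrative, and `make_deposit_stub` and `fake_transport` are names invented for this example.

```python
import struct
from typing import Callable

# Hypothetical transport: sends the request bytes and blocks until the
# reply bytes arrive. A real one would wrap a socket.
Transport = Callable[[bytes], bytes]

DEPOSIT = 2  # request type code

def make_deposit_stub(transport: Transport) -> Callable[[int, float], bool]:
    def deposit(account: int, amount: float) -> bool:
        # marshal request type + arguments into a buffer
        request = struct.pack("!BId", DEPOSIT, account, amount)
        # send request to server and wait for the reply
        reply = transport(request)
        # decode the response and return the result
        return reply == b"OK"
    return deposit

# Usage with a fake transport that records what was sent.
sent = []

def fake_transport(request: bytes) -> bytes:
    sent.append(request)
    return b"OK"

deposit = make_deposit_stub(fake_transport)
print(deposit(42, 50.00))  # True
print(len(sent[0]))        # 13
```

The caller sees an ordinary function, `deposit(42, 50.00)`; nothing in the call site reveals that a message was built and sent.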

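Server stub
- Receives the client's request and parses the data (deserialization).
- Calls the actual server-side logic (e.g. a database query or a deposit operation).
- Packs the result and sends it back to the client.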

loop:
   wait for command
   decode and unpack request parameters
   call procedure
   build reply message containing results
   send reply
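One iteration of the loop above (decode, call, build reply) might look like the following Python sketch. The dispatch table `PROCEDURES`, the `accounts` dictionary, and the message layout are assumptions made for illustration; waiting for the command and sending the reply are left to whatever transport delivers the bytes.

```python
import struct

accounts = {42: 100.0}   # hypothetical account table

def balance(account: int) -> float:
    return accounts[account]

def deposit(account: int, amount: float) -> str:
    accounts[account] += amount
    return "OK"

# request type -> (argument format, procedure, reply encoder)
PROCEDURES = {
    1: ("!I", balance, lambda result: struct.pack("!d", result)),
    2: ("!Id", deposit, lambda result: result.encode()),
}

def server_stub(request: bytes) -> bytes:
    """Handle one request: decode, call the procedure, build the reply."""
    arg_format, procedure, encode = PROCEDURES[request[0]]
    args = struct.unpack(arg_format, request[1:])  # decode and unpack parameters
    result = procedure(*args)                      # call procedure
    return encode(result)                          # build reply message

print(server_stub(struct.pack("!BId", 2, 42, 50.0)))                     # b'OK'
print(struct.unpack("!d", server_stub(struct.pack("!BI", 1, 42)))[0])    # 150.0
```

Keeping the procedures (`balance`, `deposit`) free of any networking code is the point of the stub: they can be written and tested as ordinary local functions.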

RPC semantics

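Semantics = meaning: the semantics describe the expected behavior of the code. That is, after calling RPC_deposit(server, 42, $50.00);, what should we expect? The RPC layer must be able to confirm that the request was executed; otherwise the call's semantics are violated.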

RPC semantics #1: at-least-once RPC

The request is executed at least once, but may be executed multiple times (if the client times out and retries).

Procedure is executed on the server one or multiple times
Client retries request until it gets a response
When is it useful?
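- If operations are idempotent (i.e., same effect whether done once or multiple times), e.g. checking a balance
- Idempotence: the operation produces the same result no matter how many times it is executed
When is it not useful?
- If operations are not idempotent, e.g. a deposit, which every retry would apply again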
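The difference between idempotent and non-idempotent operations under at-least-once retries can be seen in a small simulation. Everything here is hypothetical: `LossyServer` models a server whose first few replies are lost in transit (a lost reply is represented by `None`), and `call_at_least_once` is an illustrative retry loop, not a real library function.

```python
class LossyServer:
    """Hypothetical server whose first `drop_replies` replies are lost."""

    def __init__(self, drop_replies: int) -> None:
        self.balance = 0.0
        self.drop_replies = drop_replies

    def _reply(self, value):
        if self.drop_replies > 0:
            self.drop_replies -= 1
            return None              # reply lost: the client sees a timeout
        return value

    def deposit(self, amount: float):
        self.balance += amount       # the procedure runs on every attempt
        return self._reply("OK")

    def get_balance(self):
        return self._reply(self.balance)

def call_at_least_once(procedure, *args, max_attempts: int = 5):
    """Retry until a reply arrives (bounded here so the sketch terminates)."""
    for _ in range(max_attempts):
        reply = procedure(*args)
        if reply is not None:
            return reply
    raise TimeoutError("no reply after retries")

server = LossyServer(drop_replies=1)
call_at_least_once(server.deposit, 50.0)
print(server.balance)                          # 100.0: the deposit ran twice

server.drop_replies = 1
print(call_at_least_once(server.get_balance))  # 100.0: retrying a read is harmless
```

The first request was executed even though its reply was lost, so the retry deposits a second time. The balance query can be repeated freely because it does not change server state.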

RPC semantics #2: at-most-once RPC

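The request is executed at most once, but may not be executed at all (if the network drops packets, the client never receives a result).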

Procedure is either executed once or not executed at all
How to ensure at-most-once?
- Client includes a unique ID in every request (how to generate this?)
- Server keeps a history of requests it has already answered, their ID, and the result
- If duplicate, server resends result (and no re-execution)
- When can the server garbage collect RPC history?
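The duplicate-detection scheme listed above can be sketched in a few lines of Python. This is a minimal illustration, not a complete RPC server: `AtMostOnceServer` is a hypothetical class, the history is an unbounded in-memory dictionary (so it corresponds to never garbage collecting), and `uuid.uuid4()` is just one possible answer to how the client might generate a unique ID.

```python
import uuid

class AtMostOnceServer:
    """Hypothetical server that remembers every request it has answered."""

    def __init__(self) -> None:
        self.balance = 0.0
        self.history = {}   # request ID -> saved reply

    def deposit(self, request_id: str, amount: float) -> str:
        if request_id in self.history:
            # Duplicate: resend the saved result without re-executing.
            return self.history[request_id]
        self.balance += amount
        reply = "OK"
        self.history[request_id] = reply
        return reply

server = AtMostOnceServer()
request_id = str(uuid.uuid4())     # assumption: random IDs are unique enough
server.deposit(request_id, 50.0)
server.deposit(request_id, 50.0)   # client retry of the same request
print(server.balance)              # 50.0
```

The retry carries the same ID as the original request, so the server answers it from `history` and the deposit is applied only once.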

Garbage Collection Strategy

Option 1: Never?
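Option 2 (high concurrency):
- Unique client IDs: each client has a unique ID, so requests from different clients are not confused.
- Per-client RPC sequence numbers: each client's RPCs carry a sequence number, e.g. RPC_Seq = 10, 11, 12...
- Client includes "seen all replies <= X" with every RPC: this tells the server that it can discard the saved replies for those older RPCs.
- Advantages:
  - The server can discard old RPCs based on client feedback, reducing storage requirements.
  - The client can send multiple requests concurrently, without waiting for the previous one to complete.
Option 3 (single-threaded):
- Only allow the client one outstanding RPC at a time.
- Arrival of seq+1 allows the server to discard all <= seq: the server stores only the most recent RPC, and once it receives seq+1 it can safely discard seq and everything earlier.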
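Option 2 could be sketched as follows. The class and method names are hypothetical, sequence numbers are assumed to start at 1 (so `seen_up_to=0` means nothing has been acknowledged yet), and returning `"STALE"` for a delayed duplicate of an already acknowledged request is a design choice made for this sketch rather than something these notes specify.

```python
from collections import defaultdict

class GarbageCollectingServer:
    """Hypothetical at-most-once server that prunes its reply history."""

    def __init__(self) -> None:
        self.balance = 0.0
        self.replies = defaultdict(dict)  # client ID -> {seq: saved reply}
        self.acked = defaultdict(int)     # client ID -> highest acknowledged X

    def deposit(self, client_id: str, seq: int, seen_up_to: int, amount: float) -> str:
        # The client has seen all replies <= seen_up_to and will not retry them.
        self.acked[client_id] = max(self.acked[client_id], seen_up_to)
        history = self.replies[client_id]
        for old_seq in [s for s in history if s <= self.acked[client_id]]:
            del history[old_seq]
        if seq <= self.acked[client_id]:
            return "STALE"               # late duplicate of an acknowledged RPC
        if seq in history:
            return history[seq]          # duplicate: resend, do not re-execute
        self.balance += amount
        history[seq] = "OK"
        return "OK"

server = GarbageCollectingServer()
server.deposit("client-a", seq=1, seen_up_to=0, amount=10.0)
server.deposit("client-a", seq=2, seen_up_to=0, amount=20.0)  # two RPCs outstanding
server.deposit("client-a", seq=2, seen_up_to=0, amount=20.0)  # retry, served from history
print(server.balance)                      # 30.0
server.deposit("client-a", seq=3, seen_up_to=2, amount=5.0)
print(sorted(server.replies["client-a"]))  # [3]
```

Option 3 is the special case in which the client always sends `seen_up_to = seq - 1`, so the server never holds more than one saved reply per client.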

RPC semantics #3: exactly-once RPC

At-most-once + client retries until success
However, this is not possible in general: if the server stays unreachable, the retries never succeed.