This page documents habits and styles I've found useful when working with gRPC and Protocol Buffers.
Use the google.rpc.Status message to report errors back to clients – this type is special-cased by the gRPC library for your language (e.g. grpc-go has "google.golang.org/grpc/status"). The message can carry arbitrary sub-messages in its details field, so servers can offer basic error messages to all clients and structured errors to clients that can handle them.
See google/rpc/code.proto for details on the meaning of each error code, and the Google Cloud Error Model for good advice on how to write error messages.
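Here's a minimal grpc-go sketch of a handler that attaches a structured detail to its error status. The Frobnicate method, its message types, and the quota values are hypothetical; the status and errdetails packages are real.
import (
    "context"

    "google.golang.org/genproto/googleapis/rpc/errdetails"
    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/status"
)

func (s *server) Frobnicate(ctx context.Context, req *FrobnicateRequest) (*FrobnicateResponse, error) {
    // Basic clients will see the code and message; structured clients
    // can unpack the QuotaFailure detail via status.FromError.
    st := status.New(codes.ResourceExhausted, "out of frobnication quota")
    st, err := st.WithDetails(&errdetails.QuotaFailure{
        Violations: []*errdetails.QuotaFailure_Violation{{
            Subject:     "project:12345",
            Description: "daily frobnication limit reached",
        }},
    })
    if err != nil {
        // WithDetails only fails if a detail message can't be marshaled.
        return nil, status.Error(codes.Internal, "failed to attach error details")
    }
    return nil, st.Err()
}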
Server-side handlers should always propagate deadlines. Clients should almost always set deadlines. Prefer deadlines to timeouts, because the meaning of an absolute timestamp is less ambiguous than a relative time when working across a network boundary.
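In grpc-go, for example, the deadline travels in a context.Context. A sketch, where the FooClient stub, the Bar method, and the backend field are hypothetical:
import (
    "context"
    "time"
)

// Client: bound the whole call with an absolute deadline.
func callBar(ctx context.Context, client FooClient, req *BarRequest) (*BarResponse, error) {
    ctx, cancel := context.WithDeadline(ctx, time.Now().Add(5*time.Second))
    defer cancel()
    return client.Bar(ctx, req)
}

// Server: pass the incoming context to outgoing calls, so the
// caller's deadline also bounds the backend RPC.
func (s *server) Bar(ctx context.Context, req *BarRequest) (*BarResponse, error) {
    return s.backend.Bar(ctx, req)
}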
Depending on your implementation library, it may be possible to define default timeouts in the service schema. Don't do this – the schema author cannot predict what behavior will be appropriate for all implementations or users.
Always represent and store gRPC addresses as a full string, following the URL-like syntax used by gRPC Name Resolution. Restrictive formats like "IP+port tuple" will annoy users who want to run your code as part of a larger framework or integration test, which may have its own ideas about network addresses.
Let addresses be set in a command-line flag or config file, so users can configure them without having to patch your binary. Do this even if you're really really sure the entire world wants to run your service on port 80.
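A grpc-go sketch using the standard flag package – the flag name and default value are just suggestions:
import (
    "flag"
    "log"

    "google.golang.org/grpc"
)

var serverAddr = flag.String("server_address", "dns:///localhost:8000",
    "Address of the Foo server, in gRPC name-resolution syntax.")

func main() {
    flag.Parse()
    // grpc.Dial accepts any target the name resolver understands,
    // including "dns:///", "unix://", and plain host:port strings.
    conn, err := grpc.Dial(*serverAddr, grpc.WithInsecure())
    if err != nil {
        log.Fatalf("failed to dial %q: %v", *serverAddr, err)
    }
    defer conn.Close()
    // ... construct client stubs from conn ...
}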
gRPC supports uni-directional and bi-directional message streams. Use streams if the amount of data being transferred is potentially large, or if the other side can meaningfully process data before the input has been fully received. For example, a service offering a SHA256 method could hash the input chunks as they arrive, then send back the final digest when the client closes the request stream.
Streaming is more efficient than sending a separate RPC for each chunk, but less efficient than a single RPC with all chunks in a repeated field. The overhead of streaming can be minimized by using a batched message type.
service Foo {
  rpc MyStream(FooRequest) returns (stream MyStreamItem);
}

message MyStreamItem {
  repeated MyStreamValue values = 1;
}

message MyStreamValue {
  // ... fields for each logical value
}
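A grpc-go server handler for this schema might batch values like the following sketch, where the foopb package name and the lookupValues helper are hypothetical:
// MyStream groups logical values into batched stream messages,
// amortizing per-message overhead.
func (s *server) MyStream(req *foopb.FooRequest, stream foopb.Foo_MyStreamServer) error {
    const batchSize = 100
    batch := &foopb.MyStreamItem{}
    for _, value := range s.lookupValues(req) {
        batch.Values = append(batch.Values, value)
        if len(batch.Values) == batchSize {
            if err := stream.Send(batch); err != nil {
                return err
            }
            batch = &foopb.MyStreamItem{}
        }
    }
    // Flush the final partial batch, if any.
    if len(batch.Values) > 0 {
        return stream.Send(batch)
    }
    return nil
}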
WARNING: In some implementations (e.g. grpc-go), the stream handles are not thread-safe even if the client stub is. Interacting with a stream handle from multiple threads may cause unpredictable behavior, including silent message corruption.
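If multiple goroutines need to send on the same stream, serialize access yourself – one way to do it, reusing the hypothetical foopb types from the sketch above:
import "sync"

// lockedStream serializes Send calls so multiple goroutines can
// safely share one stream handle.
type lockedStream struct {
    mu     sync.Mutex
    stream foopb.Foo_MyStreamServer
}

func (ls *lockedStream) Send(item *foopb.MyStreamItem) error {
    ls.mu.Lock()
    defer ls.mu.Unlock()
    return ls.stream.Send(item)
}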
Each method in your service should have its own Request and Response messages.
service Foo {
  rpc Bar(BarRequest) returns (BarResponse);
}

message BarRequest { ... }
message BarResponse { ... }
Don't use the same message for multiple methods unless they're literally implementing the same operation with a different API style (e.g. unary and streaming variants accepting the same request). Even then, prefer a distinct type for the part of the API that may vary.
service Foo {
  rpc Bar(BarRequest) returns (BarResponse);
  rpc BarStream(BarRequest) returns (stream BarResponseStreamItem);
}

message BarRequest { ... }
message BarResponse { ... }
message BarResponseStreamItem { ... }
WARNING: Do not use google.protobuf.Empty as a request or response type – the example in the API documentation of google/protobuf/empty.proto is an anti-pattern. If you use Empty, then adding fields to your request or response later will be a breaking API change for all clients and servers.
Use a package name that includes your project name, your company name (if applicable), and the Semantic Versioning major version. The exact format depends on personal taste – popular formats include reverse domain name notation as used in Java, or $COMPANY.$PROJECT as used by core gRPC types.
com.mycompany.my_project.v1
com.mycompany.MyProject.v1
mycompany.my_project.v1
API versions that are not fully stabilized should have a version suffix like v1alpha, v2beta1, or v3test – see the Kubernetes API versioning policy for more thorough guidance.
Protobuf package names are used in generated code, so try to avoid name components that are commonly used for built-in types or keywords (like return or void). This is especially important when generating C++, which (as of protobuf 3.6) does not have a FileOption to override the default namespace name calculation.
Try to structure your proto files' on-disk layout so that import paths match the package name: types in mycompany.my_project.v1 should be imported with import "mycompany/my_project/v1/some_file.proto". This is not required by the Protobuf toolchain, but it does help humans remember what to type.
Note that Bazel's built-in proto_library() rule doesn't currently support adjusting the import paths (bazelbuild/bazel#3867). Until that feature is implemented, you'll need to either write your own proto_library in Starlark, or simply put the .proto sources in the desired directory structure.
In large protobuf messages, it can be annoying to figure out which field number should be used for new fields. To simplify the life of future editors, add a comment at the end of your messages and enums.
message MyMessage {
  // ... lots of fields here ...
  // NEXT: 42
}
Enum symbol scoping follows old-style C/C++ rules, so that the defined names are not scoped to the enum name:
// symbol `FUN_LEVEL_HIGH` is of type `FunLevel`.
enum FunLevel {
  FUN_LEVEL_UNKNOWN = 0;
  FUN_LEVEL_LOW = 1;
  FUN_LEVEL_HIGH = 2;
  // NEXT: 3
}
This can be awkward for users accustomed to languages with more modern scoping rules. I like to wrap the enum in a message:
// symbol `FunLevel::HIGH` is of type `FunLevel::Enum`.
message FunLevel {
  enum Enum {
    UNKNOWN = 0;
    LOW = 1;
    HIGH = 2;
    // NEXT: 3
  }
}
If a field has been deleted, its field number must not be reused by future field additions. Enforce this with the reserved keyword. I always reserve both the field name and the number.
enum FunLevel {
  // removed -- too much fun
  reserved "FUN_LEVEL_EXCESSIVE"; reserved 10;
}

message MyMessage {
  reserved "crufty_old_field"; reserved 20;
}
Protobuf doesn't have a built-in generator for API documentation. Of the available options, protoc-gen-doc seems the most mature – see the protoc-gen-doc README for syntax and examples.
Protobuf doesn't have a built-in validation mechanism other than the required label in proto2 (removed in proto3). Lyft's protoc-gen-validate tool is the best solution I know of for this, though it's in early alpha and currently only supports Go.
In proto3, the ability to mark scalar fields (int32, string, etc.) as optional was removed. Scalar fields are now always present, holding a default "zero value" if not otherwise set. This can be frustrating when designing a schema for a system where "" and NULL are logically distinct values.
The official workaround is a set of "wrapper types", defined in google/protobuf/wrappers.proto, which are single-valued messages. Your schema can use .google.protobuf.Int32Value instead of int32 to regain optionality.
import "google/protobuf/wrappers.proto";
message MyMessage {
.google.protobuf.Int32Value some_field = 1;
}
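In generated Go code the wrapper field is a message pointer, so presence is a nil check. A sketch, assuming the MyMessage type above:
import "fmt"

func describeSomeField(msg *MyMessage) string {
    // A wrapper field generates as a message pointer, so "unset"
    // and "set to zero" are distinguishable.
    if msg.SomeField == nil {
        return "some_field is unset"
    }
    return fmt.Sprintf("some_field = %d", msg.SomeField.Value)
}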
Another approach is to wrap the scalar field in a oneof with no other choices. This forces even scalar fields to have explicit presence, and adds helper methods in generated code for detecting whether the field was set.
message MyMessage {
  oneof oneof_some_field {
    int32 some_field = 1;
  }
}
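In Go, protoc-gen-go compiles the oneof to an interface plus one wrapper struct per member, so a type assertion detects whether the field was set. A sketch, assuming the schema above:
// someFieldValue reports the value of some_field and whether it was
// set; MyMessage_SomeField is the generated oneof wrapper type.
func someFieldValue(msg *MyMessage) (int32, bool) {
    if f, ok := msg.GetOneofSomeField().(*MyMessage_SomeField); ok {
        return f.SomeField, true
    }
    return 0, false
}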
For a motivational lesson on the dangers of reusing field identifiers, see SEC administrative proceeding 3-15570 against Knight Capital, regarding the loss of $460 million USD in 45 minutes.