Rust and dynamically-sized thin pointers

One of Rust's notable differences from C is its requirement that all values have a defined size, which enables runtime bounds-checking and advanced analysis tooling such as Miri. For dynamically-sized types (DSTs) this requirement is implemented using thick pointers, such that each pointer to a dynamically-sized value is an (address, size) tuple.

Thick pointers are more convenient and easier to use correctly than the C idiom of passing around value sizes manually, but they have a performance drawback – each thick pointer takes up twice as many registers as a thin pointer, even when pointing to values for which the size can be trivially computed. This overhead is especially noticeable for code that processes packet-based network protocols, which can cause Rust code to underperform C in that niche.

This page is an accompaniment to RFC 3536, which proposes that Rust should support thin pointers to DSTs.

Dynamically-sized types in C

Before getting into Rust's treatment of DSTs, it's useful to have an idea of how C handles them. The C language has been in active industrial use for about 50 years, so many interesting designs have been explored and some have been incorporated into the language itself.

Arrays and strings

The simplest DSTs in C are dynamically-allocated arrays, which are simply contiguous sequences of values. The number of values cannot be computed from the array content, so the size (or length) must be passed around as a second parameter to any function that will operate on the array.

bool validate_utf16(uint16_t *values, size_t values_len) {
	/* size of `*values` is `values_len * sizeof(uint16_t)` */
}

Slightly more complex are arrays terminated by some sentinel value, such as NUL-terminated strings. The size of these values can be computed, but not in constant time – more importantly, the size of a value (1) can change and (2) can be lost. For example a C string can be truncated by writing '\x00' into the middle, and its size can be lost by overwriting the NUL terminator. This property makes traditional C string manipulation code difficult to reason about.

bool validate_utf8(const char *str) {
	/* This call might take a very long time, or even crash the process. */
	size_t str_len = strlen(str);
}
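The two failure modes described above (a size that changes, and a size that is lost) can also be reproduced from Rust, which is a rough way to see why sentinel termination is hard to reason about. The sketch below uses the standard library's `CStr::from_bytes_until_nul`; the helper names `c_strlen` and `sentinel_demo` are made up for illustration. Unlike C's `strlen()`, the Rust scan is bounded by the slice length, so the "lost" case yields `None` instead of reading out of bounds.

```rust
use std::ffi::CStr;

/// Length of the C string at the start of `buf`, found by scanning for the
/// first NUL byte (linear time). `None` means no terminator was found.
fn c_strlen(buf: &[u8]) -> Option<usize> {
    CStr::from_bytes_until_nul(buf)
        .ok()
        .map(|s| s.to_bytes().len())
}

/// Returns the computed length (original, after truncation, after the
/// terminator has been overwritten).
fn sentinel_demo() -> (Option<usize>, Option<usize>, Option<usize>) {
    let mut buf = *b"hello world\0";
    let original = c_strlen(&buf);

    // (1) The size can change: a NUL written into the middle truncates it.
    buf[5] = 0;
    let truncated = c_strlen(&buf);

    // (2) The size can be lost: overwrite the terminator itself.
    buf[5] = b' ';
    buf[11] = b'!';
    let lost = c_strlen(&buf);

    (original, truncated, lost)
}
```

Here `sentinel_demo()` evaluates to `(Some(11), Some(5), None)`.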

Due to the unpredictable behavior of C string manipulation code, modern systems programming languages generally avoid NUL-termination (or other inline sentinel values). Even in C, newly-written APIs tend to pass around the length explicitly (compare strcat() with strncat() and strlcat()). Many modern C codebases have adopted some form of thick pointer idiom for dynamically-sized values, for example GLib's GByteArray and GString.

struct GByteArray {
	uint8_t *data;
	unsigned int len;
};

struct GString {
	char *str;
	size_t len;
	size_t allocated_len;
};

Flexible array members

When working with low-level protocols it's common to encounter variable-length structures consisting of a fixed header followed by a blob of payload bytes. The header contains enough info to identify the layout of the payload, which can then be parsed with content-specific logic.

    byte
+----------+--------+--------+--------+--------+
|   [0..4) |      length (little-endian)       |
+----------+--------+--------+--------+--------+
|   [4..6) | request id (LE) |
+----------+--------+--------+
|        6 | opcode |
+----------+--------+
|        7 | flags  |
+----------+--------+--------+--------+--------+
| [8, ...) |               data                |
+----------+--------+--------+--------+--------+

Pointers to the buffer only need to contain the address of the first byte, because the buffer itself contains the length. This allows functions to have fewer parameters, improving performance due to reduced stack spills.

/* Parameters fit in registers on x86_64 with System V calling convention */
void packet_tee(
	  struct TeeOptions *options
	, struct Packet *packet
	, uint8_t *out_a, size_t out_a_len
	, uint8_t *out_b, size_t out_b_len
);
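For comparison, here is a minimal Rust sketch of reading the header layout from the diagram above out of a byte buffer. The names `Header` and `parse_packet` are illustrative, and the sketch assumes that the `length` field counts only the payload bytes that follow the 8-byte header; the diagram does not say whether the header itself is included.

```rust
/// Fixed 8-byte header, field for field as in the diagram.
#[derive(Debug, PartialEq, Eq)]
struct Header {
    length: u32,     // bytes [0..4), little-endian
    request_id: u16, // bytes [4..6), little-endian
    opcode: u8,      // byte 6
    flags: u8,       // byte 7
}

const HEADER_LEN: usize = 8;

/// Splits `buf` into the decoded header and its payload.
/// ASSUMPTION: `length` is the payload size, excluding the header.
/// Returns `None` if the buffer is too short for the header or the payload.
fn parse_packet(buf: &[u8]) -> Option<(Header, &[u8])> {
    let (head, rest) = (buf.get(..HEADER_LEN)?, buf.get(HEADER_LEN..)?);
    let header = Header {
        length: u32::from_le_bytes([head[0], head[1], head[2], head[3]]),
        request_id: u16::from_le_bytes([head[4], head[5]]),
        opcode: head[6],
        flags: head[7],
    };
    // The payload size comes from the buffer itself, not from the caller.
    let data = rest.get(..usize::try_from(header.length).ok()?)?;
    Some((header, data))
}
```

Note that the `&[u8]` parameter is itself a thick pointer; the point of the sketch is only that the payload length is recoverable from the first eight bytes.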

Traditionally this sort of layout would be implemented by placing a placeholder array at the end of a struct, either of length 0 or length 1. Examples exist from many long-lived codebases, including both Windows and Linux.

/* https://source.winehq.org/git/wine.git/blob/wine-9.10:/include/winnt.h */
#define ANYSIZE_ARRAY   1
typedef struct _TOKEN_GROUPS {
    DWORD GroupCount;
    SID_AND_ATTRIBUTES Groups[ANYSIZE_ARRAY];
} TOKEN_GROUPS, *PTOKEN_GROUPS;

/* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/inotify.h?h=v5.6 */
struct inotify_event {
	__s32   wd;             /* watch descriptor */
	__u32   mask;           /* watch mask */
	__u32   cookie;         /* cookie to synchronize two events */
	__u32   len;            /* length (including nulls) of name */
	char    name[0];        /* stub for possible name */
};

/* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/soundcard.h?h=v5.6 */
struct sysex_info {
	short key;              /* Use SYSEX_PATCH or MAUI_PATCH here */
#define SYSEX_PATCH	_PATCHKEY(0x05)
#define MAUI_PATCH	_PATCHKEY(0x06)
	short device_no;        /* Synthesizer number */
	int len;                /* Size of the sysex data in bytes */
	unsigned char data[1];  /* Sysex data starts here */
};

This idiom was so widespread that C99 incorporated it into the language as "flexible array members", which use a slightly different syntax to distinguish them from an actual fixed-size array.

As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. [...] when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed;

The Linux kernel has been migrating to C99 flexible array members over time.

/* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/inotify.h?h=v6.9 */
struct inotify_event {
	__s32   wd;             /* watch descriptor */
	__u32   mask;           /* watch mask */
	__u32   cookie;         /* cookie to synchronize two events */
	__u32   len;            /* length (including nulls) of name */
	char    name[];         /* stub for possible name */
};

Pointers to incomplete types

C permits structures to be declared without being defined, which is widely used to reduce transitive includes of header files. If the header never defines the struct then it is an "incomplete type" – pointers to incomplete types may be used like any other pointer, but the type itself (or any value of that type) cannot be inspected. Incomplete types are frequently used by C programmers to exclude implementation details from the public API.

struct SomeStruct;

struct SomeStruct *some_struct_new(void);
void some_struct_free(struct SomeStruct *);

An interesting consequence of incomplete types is that the pointers don't have to actually point to anything. C programmers are used to thinking of pointers and integers as basically interchangeable[0], and many extant C libraries will do things like stuff metadata into the low bits of an aligned pointer[1].
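To make the low-bit trick concrete, the following sketch tags an 8-byte-aligned pointer with a one-bit flag and masks it off again before dereferencing. It is written in Rust for consistency with the rest of this page, but it mirrors the C idiom by round-tripping through an integer; `Node`, `tag`, `untag` and `tagging_demo` are hypothetical names.

```rust
/// Hypothetical library type with 8-byte alignment, so the low three bits
/// of any valid pointer to it are zero.
#[repr(align(8))]
struct Node {
    value: u32,
}

const TAG_MASK: usize = 0b111;

/// Packs a one-bit flag into the otherwise unused low bits of the address.
fn tag(ptr: *mut Node, flag: bool) -> usize {
    let addr = ptr as usize;
    debug_assert_eq!(addr & TAG_MASK, 0, "pointer must be 8-byte aligned");
    addr | usize::from(flag)
}

/// Recovers the real pointer and the flag from a tagged handle.
fn untag(handle: usize) -> (*mut Node, bool) {
    ((handle & !TAG_MASK) as *mut Node, handle & 1 == 1)
}

fn tagging_demo() -> (u32, bool) {
    let raw = Box::into_raw(Box::new(Node { value: 42 }));
    let handle = tag(raw, true);
    // The handle itself is not a valid pointer: bit 0 is set.
    let (ptr, flag) = untag(handle);
    // SAFETY: `ptr` is the address returned by `Box::into_raw` above, and
    // ownership is taken back exactly once.
    let node = unsafe { Box::from_raw(ptr) };
    (node.value, flag)
}
```

A handle like this cannot be given a Rust reference type, because the tagged value does not point at a `Node`.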

Dynamically-sized types in Rust

One of Rust's core design goals is to address the security and correctness problems inherent in dynamically-allocated values without a known size. Every Rust value has a known size and alignment, and the type system prevents dynamically-sized values from being used in contexts that require a statically-known size.

The core::mem module provides functions to inspect the size and alignment of values at runtime, and this ability is heavily used in the dynamic allocation subsystem (Box, etc). The existence of these functions implies that every value behind a Rust reference has a known size and alignment[2].

// Every `Sized` type has a statically-known size and alignment.
pub const fn align_of<T>() -> usize
pub const fn size_of<T>() -> usize

// Every referenced value has a known size and alignment.
pub fn align_of_val<T: ?Sized>(val: &T) -> usize
pub fn size_of_val<T: ?Sized>(val: &T) -> usize
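As a quick illustration of the `_val` functions, the sketch below asks for the size and alignment of a few dynamically-sized values. The function name `dst_sizes` is made up, and the alignment of 2 for `u16` is an assumption that holds on mainstream targets.

```rust
use core::mem::{align_of_val, size_of_val};

/// Returns (size, alignment) for a `[u16]`, a `str`, and an empty `[u8]`.
fn dst_sizes() -> [(usize, usize); 3] {
    let words: &[u16] = &[1, 2, 3];
    let text: &str = "héllo"; // 'é' is two bytes in UTF-8
    let empty: &[u8] = &[];
    [
        (size_of_val(words), align_of_val(words)),
        (size_of_val(text), align_of_val(text)),
        (size_of_val(empty), align_of_val(empty)),
    ]
}
```

The result is `[(6, 2), (6, 1), (0, 1)]`: in each case the size is derived from the length stored in the reference, not from the bytes being pointed at.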

Slices and strings

Where C has dynamically-sized arrays and NUL-terminated strings, Rust has the slice and str types. Pointers to these types are an (address, length) tuple, making it impossible to accidentally lose track of how large any particular value is.

This also extends to values that contain a slice or string. The Rust equivalent to C's flexible array members uses roughly the same syntax, but the pointers to such a type will contain the size of the dynamically-sized field. A Rust binding to the Linux kernel's inotify API might contain code like this:

use core::ffi::c_char;

// A `*const InotifyEvent` will be twice the size of a `*const ()`, due
// to storing the length of `name`.
#[repr(C)]
struct InotifyEvent {
	wd: i32,        // watch descriptor
	mask: u32,      // watch mask
	cookie: u32,    // cookie to synchronize two events
	len: u32,       // length (including nulls) of name
	name: [c_char], // stub for possible name
}

Note that the length in a *const InotifyEvent pointer is redundant with the length in the InotifyEvent::len field – removing that redundancy is the goal of this page.
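The doubling in pointer width can be checked directly. The sketch below repeats the struct definition so that it is self-contained and compares pointer sizes; `pointer_widths` is an illustrative name, and the exact representation of thick pointers is how current rustc behaves rather than something this page can promise for every future compiler.

```rust
use core::ffi::c_char;
use core::mem::size_of;

#[allow(dead_code)]
#[repr(C)]
struct InotifyEvent {
    wd: i32,
    mask: u32,
    cookie: u32,
    len: u32,
    name: [c_char], // unsized tail: makes the whole struct a DST
}

/// Returns the width in bytes of a thin pointer, `&[u8]`, `&str`,
/// and `*const InotifyEvent`.
fn pointer_widths() -> (usize, usize, usize, usize) {
    (
        size_of::<*const ()>(),
        size_of::<&[u8]>(),
        size_of::<&str>(),
        size_of::<*const InotifyEvent>(),
    )
}
```

On a 64-bit target this gives `(8, 16, 16, 16)`, so every `&InotifyEvent` argument occupies two registers even though `len` is already stored in the value.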

External types

Rust programmers who want to write bindings to C APIs that use incomplete types as opaque handles run into a problem. Rust doesn't have the concept of an incomplete type, which means there's no good way to declare external FFI functions with type-safe pointer types.

extern "C" {
    fn some_struct_new() -> *mut SomeStruct;
    fn some_struct_free(_: *mut SomeStruct);
}

// In C this is a forward declaration, but in Rust this is declaring a
// statically-sized empty structure.
//
// SomeStruct would implement `Sized`, and thus could be (erroneously)
// passed directly as a function parameter, used as an array item type, etc.
struct SomeStruct;

// This definition of `SomeStruct` wouldn't implement Sized, but because
// Rust uses thick pointers for dynamically-sized types the `extern`
// function declarations would have the wrong ABI.
struct SomeStruct {
	_opaque: [u8],
}

// An uninhabited enum implies that `*const SomeStruct` will never point
// to a valid value, which in turn implies that creating a `&SomeStruct`
// reference is undefined behavior.
enum SomeStruct {}

One proposed solution to this is RFC 1861, which allows declaring external types. These types do double-duty as both incomplete types and flexible array members[3].

#![feature(extern_types)]
extern "C" {
	type SomeStruct;
	type FlexibleArrayU8;
}

// These function declarations have the correct ABI.
extern "C" {
    fn some_struct_new() -> *mut SomeStruct;
    fn some_struct_free(_: *mut SomeStruct);
}

// `HasFlexibleArray` is `!Sized` and `*const HasFlexibleArray` is a thin pointer.
struct HasFlexibleArray {
	len: u32,
	data: FlexibleArrayU8,
}

Unfortunately, combining these properties into a single feature has led to conflicts when the semantics of C's incomplete types and flexible array members diverge. For example, an incomplete type doesn't have a known alignment[4], which means that it's incoherent for a reference to an extern type to exist.

// This is fine as long as the flexible array member really is a `[u8]`.
fn align_of_HasFlexibleArray(x: &HasFlexibleArray) {
	println!("static alignment: {}", core::mem::align_of::<HasFlexibleArray>());
	println!("dynamic alignment: {}", core::mem::align_of_val(x));
}

// What should this function print?
fn align_of_SomeStruct(x: &SomeStruct) {
	println!("alignment: {}", core::mem::align_of_val(x));
}

There are three relatively straightforward solutions to this problem, but they all require decoupling incomplete types from flexible array members.

The first option is to simply forbid extern types from becoming references or struct fields. I say "simply" here because it's easy to describe the goal, but the implications for the language semantics are tricky. It also might just be deferring the solution until later, because core::mem::align_of_val_raw() may stabilize one day, and then that function's behavior when passed a pointer to a value with an extern type would be difficult to specify.

A second option is to introduce a pointer-ish type that isn't actually a pointer. It would be the same size as a pointer and propagate provenance, but couldn't be directly converted to a pointer or reference. The guarantees this type would provide are weaker than actual pointers within the Rust aliasing model, since there's no way to tell what the C library stuffed into its bits. Semantically this might be similar to WebAssembly references, which are similarly opaque.

use core::ffi::Incomplete;
struct SomeStruct;

extern "C" {
    fn some_struct_new() -> Incomplete<SomeStruct>;
    fn some_struct_free(_: Incomplete<SomeStruct>);
}

The final option is to just give up on representing C incomplete types in the Rust language. FFI bindings to libraries that use incomplete types would be required to define their own wrapper types, with whatever semantics make sense for that library's data model. There would no longer be any way to identify SomeStruct within the Rust type system, only the handle type – which, being a pointer, is Sized.

use core::ptr::NonNull;

#[repr(transparent)]
struct SomeStructRef { ptr: NonNull<()> }

extern "C" {
    fn some_struct_new() -> SomeStructRef;
    fn some_struct_free(_: SomeStructRef);
}
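A runnable sketch of this third option follows. To avoid linking against a real C library, the two API functions are implemented in Rust as stand-ins; `Hidden` is a hypothetical placeholder for whatever the library keeps behind the handle, and only `SomeStructRef` would be visible to users of the binding.

```rust
use core::ptr::NonNull;

/// Opaque handle. The pointee type cannot be named by users of the binding.
#[repr(transparent)]
pub struct SomeStructRef {
    ptr: NonNull<()>,
}

/// Stand-in for the C library's private data (an assumption for this sketch).
struct Hidden {
    _counter: u32,
}

/// Stand-in for the C function of the same name.
extern "C" fn some_struct_new() -> SomeStructRef {
    let leaked: &mut Hidden = Box::leak(Box::new(Hidden { _counter: 0 }));
    SomeStructRef { ptr: NonNull::from(leaked).cast::<()>() }
}

/// Stand-in for the C function of the same name; consumes the handle.
extern "C" fn some_struct_free(handle: SomeStructRef) {
    // SAFETY: the handle was created by `some_struct_new` from a leaked
    // `Box<Hidden>` and is consumed by value, so it is freed exactly once.
    drop(unsafe { Box::from_raw(handle.ptr.as_ptr().cast::<Hidden>()) });
}
```

Because the wrapper is `#[repr(transparent)]` over `NonNull`, it is exactly one pointer wide, and `Option<SomeStructRef>` is as well, which is convenient for C functions that return NULL on failure.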

Whatever the future of extern types looks like, my hope is that decoupling them from unsized thin pointers will improve the roadmap clarity for both features.

Proposal for !Sized thin pointers

The fundamental goal of this proposal is performance optimization. It should be possible to write a library in pure Rust that performs processing of (for example) IP packets with performance equivalent to a well-optimized C implementation of the same logic. Achieving that goal may require writing Rust that is non-idiomatic, including the use of unsafe in cases where a less performant solution would be able to use safe APIs.

As a starting point, consider a dynamically-sized value as being equivalent to a union with lots of array members:

#[repr(C)]
pub struct Packet {
	data_len_le: u16,
	data: [u8],
}

// IS EQUIVALENT TO

#[repr(C)]
pub struct Packet {
	data_len_le: u16,
	data: PacketData,
}
#[repr(C)]
union PacketData {
	len_n0: [u8; 0],
	len_n1: [u8; 1],
	len_n2: [u8; 2],
	// ...
	len_max: [u8; u16::MAX as usize],
}

In such a type it is possible to safely obtain the address of Packet::data, but all members of the data array are potentially uninitialized and/or beyond the bounds of the allocated object. The only safe operation on PacketData is to interpret it as a 0-sized array – further conversion requires unsafe code to assert the validity of the data length.

impl Packet {
	pub fn data(&self) -> &[u8] {
		let data_ptr = self.data.as_ptr();
		let data_len = usize::from(u16::from_le(self.data_len_le));
		// SAFETY: The `Packet` type must ensure that the data length is valid.
		unsafe { core::slice::from_raw_parts(data_ptr, data_len) }
	}
}

impl PacketData {
	fn as_ptr(&self) -> *const u8 {
		// SAFETY: 0 is always a valid array length
		core::ptr::from_ref(unsafe { &self.len_n0 }).cast::<u8>()
	}
	fn as_mut_ptr(&mut self) -> *mut u8 {
		// SAFETY: 0 is always a valid array length
		core::ptr::from_mut(unsafe { &mut self.len_n0 }).cast::<u8>()
	}
}

The above code works in that rustc generates the correct assembly, but it doesn't work semantically because Packet implements Sized. That also means that size_of_val() will return incorrect results.
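The size mismatch is easy to observe. The sketch below trims the union down to its smallest and largest members (which is enough to fix its size), marks both types `#[repr(C)]` so the layout is predictable, and compares the size the compiler reports with the size the header describes. `reported_vs_logical` is an illustrative name.

```rust
use core::mem::{size_of, size_of_val};

#[allow(dead_code)]
#[repr(C)]
union PacketData {
    len_n0: [u8; 0],
    len_max: [u8; u16::MAX as usize],
}

#[repr(C)]
struct Packet {
    data_len_le: u16,
    data: PacketData,
}

/// Returns (size reported by the compiler, size implied by the header).
fn reported_vs_logical() -> (usize, usize) {
    // A packet whose header claims three bytes of payload.
    let packet = Packet {
        data_len_le: 3u16.to_le(),
        data: PacketData { len_max: [0; u16::MAX as usize] },
    };
    let logical = size_of::<u16>() + usize::from(u16::from_le(packet.data_len_le));
    (size_of_val(&packet), logical)
}
```

The result is `(65538, 5)`: the compiler always reports the size of the largest union member plus header and padding, regardless of what `data_len_le` says.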

There needs to be a way to tell the compiler that Packet is a special kind of unsized type, one that doesn't need thick pointers. That hint also needs to be wired into some user-provided code that can report the size, otherwise size_of_val() can't work.

As a rough sketch, imagine a #[repr(thin_unsized)] attribute that marked a DST type as being a thin-pointer type, and required that type to implement a companion trait ThinUnsized.

// Safety:
// * The reported size of a value must not change during its lifetime.
unsafe trait core::marker::ThinUnsized {
	// Safety:
	// * The pointer must be correctly aligned and point to an initialized value.
	unsafe fn size_of_val_raw(ptr: *const Self) -> usize;
}

#[repr(C, thin_unsized)]
pub struct Packet {
	data_len_le: u16,
	data: [u8],
}

unsafe impl ThinUnsized for Packet {
	unsafe fn size_of_val_raw(ptr: *const Self) -> usize {
		let data_len = u16::from_le(ptr.cast::<u16>().read());
		core::mem::size_of::<u16>() + usize::from(data_len)
	}
}
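Neither the attribute nor the trait above exists today, so that sketch cannot be compiled. As a rough approximation on stable Rust, the same idea can be emulated with a one-pointer-wide handle type that computes its size from the header on demand. Everything in the block below (`PacketRef`, its methods, the validation in `new`) is a hypothetical stand-in rather than the proposed API, and it gives up the ability to write `&Packet`.

```rust
use core::marker::PhantomData;
use core::mem::size_of;

const HEADER_LEN: usize = size_of::<u16>();

/// Hypothetical stand-in for `&'a Packet`: a single pointer, no stored length.
#[derive(Clone, Copy)]
pub struct PacketRef<'a> {
    ptr: *const u8,
    _buf: PhantomData<&'a [u8]>,
}

impl<'a> PacketRef<'a> {
    /// Checks that `buf` contains the header and all the payload it claims.
    pub fn new(buf: &'a [u8]) -> Option<Self> {
        let header = buf.get(..HEADER_LEN)?;
        let data_len = usize::from(u16::from_le_bytes([header[0], header[1]]));
        if buf.len() - HEADER_LEN < data_len {
            return None;
        }
        Some(Self { ptr: buf.as_ptr(), _buf: PhantomData })
    }

    /// Counterpart of the sketched `size_of_val_raw`: header plus payload.
    pub fn size_of_val(self) -> usize {
        // SAFETY: `new` verified that HEADER_LEN bytes are readable, and
        // `[u8; 2]` has alignment 1 so any address is suitably aligned.
        let header = unsafe { self.ptr.cast::<[u8; 2]>().read() };
        HEADER_LEN + usize::from(u16::from_le_bytes(header))
    }

    /// The payload bytes, with the length recovered from the header.
    pub fn data(self) -> &'a [u8] {
        let len = self.size_of_val() - HEADER_LEN;
        // SAFETY: `new` verified that `len` bytes follow the header, and the
        // shared borrow of the buffer for 'a keeps them valid and unchanged.
        unsafe { core::slice::from_raw_parts(self.ptr.add(HEADER_LEN), len) }
    }
}
```

The shared borrow held for `'a` is what keeps the reported size stable in this emulation; the proposed trait would instead make that an explicit safety requirement.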

Rust permits DSTs to be used as the final field of a struct, in which case that struct also becomes a DST. Thin-pointer DSTs would behave similarly, with no manual implementation of ThinUnsized required (or permitted).

// ContainsPacket is !Sized and implements ThinUnsized by adding a fixed
// overhead to the result of `<Packet as ThinUnsized>::size_of_val_raw()`.
pub struct ContainsPacket {
	some_data: u32,
	packet: Packet,
}

Interactions with Mutex and Box

Originally I was unsure whether a thin-pointer DST should have an immutable size. One could imagine, for example, a buffer that could be used to assemble an IP packet by incrementally appending data to the payload. However, discussion on the RFC revealed that the proposed new behavior would only be compatible with existing semantics if the reported size was the same during the entire lifetime of the object.

The first case to consider is Box, which can store unsized values (e.g. Box<[u8]>). When the Box is dropped the value's size is used to construct a Layout for deallocation. If the size of a Packet could change then Box<Packet> would be unsound, and introducing a new ?Trait bound to signify values with immutable sizes is impractical for numerous reasons.
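The dependency on the value's size can be seen with `Layout::for_value`, which computes a layout from a reference in roughly the way `Box` does when it deallocates an unsized value. This is a small sketch with an illustrative helper name (`dealloc_layout`), not a description of the exact code path inside the standard library.

```rust
use std::alloc::Layout;

/// Layout that would be handed to the allocator when a `Box<[u8]>` holding
/// `len` bytes is freed.
fn dealloc_layout(len: usize) -> Layout {
    let boxed: Box<[u8]> = vec![0u8; len].into_boxed_slice();
    // The size comes from the length stored in the thick pointer. For a
    // thin-pointer DST it would have to come from the value itself.
    Layout::for_value(&*boxed)
}
```

For example, `dealloc_layout(10)` is a layout with size 10 and alignment 1. If the size of the pointee could change between allocation and drop, the layout passed to the allocator would no longer match the original allocation.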

Secondly, it is possible to call size_of_val() on a &Mutex<T>. If the size of a value within a Mutex could change, it would be unsafe to compute that size without holding the lock, which implies size_of_val() might deadlock or panic.
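The sketch below shows the existing behavior that has to be preserved: `size_of_val()` on a `&Mutex<[u8]>` succeeds while the lock is held elsewhere, because today the size is derived from the pointer metadata alone. The function name `size_while_locked` is illustrative.

```rust
use std::mem::{size_of, size_of_val};
use std::sync::Mutex;

/// Returns (size of the unsized mutex measured while it is locked,
/// size of the equivalent statically-sized mutex).
fn size_while_locked() -> (usize, usize) {
    let sized = Mutex::new([0u8; 4]);
    // Unsizing coercion: the reference now carries the length 4 as metadata.
    let unsized_ref: &Mutex<[u8]> = &sized;
    let _guard = unsized_ref.lock().unwrap();
    // This neither blocks nor panics, since the protected data is never read.
    (size_of_val(unsized_ref), size_of::<Mutex<[u8; 4]>>())
}
```

Both numbers are equal. With a thin-pointer DST inside the mutex, the size would have to be read from the protected data, which is only sound without the lock if that size can never change.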

Thus, one of the safety preconditions for ThinUnsized must be that the reported size does not change during the value's lifetime.

[0] The C99 language spec itself sorta guarantees this property via the existence of uintptr_t, and to make matters worse there's a huge number of C libraries out there that assume size_t and uintptr_t are equivalent for all purposes. I won't get into the code that assumes pointer-sized unsigned int and/or unsigned long, which is unfortunately common in many older codebases.

[1] If a type has an alignment of 8 (for example), then any pointer to a value of that type will have its lower 3 bits unset. These bits can be used to attach metadata, such as whether the value is dynamically or statically allocated. API functions that need to dereference the pointer will mask off the low bits first.

[2] Although the requirement that all referenced values have a known size and alignment is (to my knowledge) not formally documented, it is baked into the language in various ways. My working assumption is that such a requirement will become part of a future Rust language spec, should such a document ever be written.

[3] The Rust compiler is already using extern types to implement flexible array members in its own code. See OpaqueListContents in compiler/rustc_middle/src/ty/list.rs.

[4] Incomplete types don't even necessarily have a knowable alignment. It's valid for a C library to return values of different types from some_struct_new(), as long as all of those types start with a SomeStruct.