Improved UNIX socket networking in QEMU 7.2

QEMU 7.2 quietly introduced two new network backends, -netdev dgram and -netdev stream. Unlike the older -netdev socket, these new backends directly support AF_UNIX socket addresses without the need for an intermediate wrapper tool.

The situation up until now

QEMU has a -netdev socket network backend, which will send/receive Ethernet frames via TCP (the connect= and listen= modes) or UDP (the mcast= and udp= modes). This functionality isn't well documented, and its intended use appears to be as a sort of simple network hub for hosts that can't use a TAP device[1].

$ qemu-system-x86_64 --help
[...]
-netdev socket,id=str[,fd=h][,listen=[host]:port][,connect=host:port]
                configure a network backend to connect to another network
                using a socket connection
-netdev socket,id=str[,fd=h][,mcast=maddr:port[,localaddr=addr]]
                configure a network backend to connect to a multicast maddr and port
                use 'localaddr=addr' to specify the host address to send packets from
-netdev socket,id=str[,fd=h][,udp=host:port][,localaddr=host:port]
                configure a network backend to connect to another network
                using an UDP tunnel

A less-obvious (and completely undocumented) behavior of -netdev socket is that (1) the fd= syntax is actually its own mutually-exclusive mode, and (2) it doesn't need to be the file descriptor of a TCP socket in particular. This means it's possible to coax QEMU into using a UNIX socket for its network backend, by connecting to the socket in a wrapper process before spawning QEMU. The wrapper doesn't have to be complex; see qemu-wrapper.c for a working example in 50 lines of C.
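
The wrapper trick depends only on file descriptor inheritance across exec, so it can also be sketched in Python. This is an illustration of the idea, not a replacement for qemu-wrapper.c. The helper names (connect_inheritable, socket_netdev_args, exec_qemu) and the netdev id net0 are hypothetical. Python marks new sockets as non-inheritable by default, so the descriptor has to be flagged explicitly before the exec.

```python
import os
import socket


def connect_inheritable(socket_path):
    """Connect to a UNIX stream socket and return an fd that survives exec."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(socket_path)
    # detach() transfers ownership of the fd to the caller, so it is not
    # closed when the Python socket object is garbage-collected.
    fd = sock.detach()
    os.set_inheritable(fd, True)
    return fd


def socket_netdev_args(fd, netdev_id="net0"):
    """Build the '-netdev socket' option pair for an already-open fd."""
    return ["-netdev", "socket,id=%s,fd=%d" % (netdev_id, fd)]


def exec_qemu(socket_path, qemu_path, qemu_args):
    """Connect, then replace this process with QEMU (does not return)."""
    fd = connect_inheritable(socket_path)
    argv = [qemu_path] + socket_netdev_args(fd) + list(qemu_args)
    os.execv(qemu_path, argv)
```

Unlike the C wrapper, this sketch does not assume a particular descriptor number; it writes whatever number the kernel assigned into the fd= option.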

The process that created the UNIX socket can do whatever it needs to with the raw Ethernet frames it receives, including acting as a switch or a VPN. If you don't already have a preferred usermode network library, I recommend Scapy as a comprehensive and beginner-friendly option. For a starting point, try using print-frames.py to log the network traffic of a Debian live CD.

New backends in QEMU 7.2

The QEMU 7.2 release adds two new network backends, -netdev dgram and -netdev stream. Although the related mailing list discussion[2] makes it clear that the new functionality exists to better support UNIX sockets, in classic QEMU fashion this minor detail has been left out of the documentation[3].

-netdev stream,id=str[,server=on|off],addr.type=inet,addr.host=host,addr.port=port[,to=maxport][,numeric=on|off][,keep-alive=on|off][,mptcp=on|off][,addr.ipv4=on|off][,addr.ipv6=on|off]
-netdev stream,id=str[,server=on|off],addr.type=unix,addr.path=path[,abstract=on|off][,tight=on|off]
-netdev stream,id=str[,server=on|off],addr.type=fd,addr.str=file-descriptor
                configure a network backend to connect to another network
                using a socket connection in stream mode.
-netdev dgram,id=str,remote.type=inet,remote.host=maddr,remote.port=port[,local.type=inet,local.host=addr]
-netdev dgram,id=str,remote.type=inet,remote.host=maddr,remote.port=port[,local.type=fd,local.str=file-descriptor]
                configure a network backend to connect to a multicast maddr and port
                use ``local.host=addr`` to specify the host address to send packets from
-netdev dgram,id=str,local.type=inet,local.host=addr,local.port=port[,remote.type=inet,remote.host=addr,remote.port=port]
-netdev dgram,id=str,local.type=unix,local.path=path[,remote.type=unix,remote.path=path]
-netdev dgram,id=str,local.type=fd,local.str=file-descriptor
                configure a network backend to connect to another network
                using an UDP tunnel

The -netdev stream backend works just like the pseudo-TCP example above, but QEMU connects to the UNIX socket itself, so no wrapper is required.
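
As a rough sketch, the option string for the UNIX-socket form of -netdev stream can be assembled from the fields listed in the --help output above. The helper name stream_unix_netdev_args, the id net0, and the example path are hypothetical. The sketch passes server=off explicitly, on the assumption that QEMU should connect to a socket that a program such as print-frames.py is already listening on.

```python
def stream_unix_netdev_args(socket_path, netdev_id="net0", server=False):
    """Build the '-netdev stream' option pair for a UNIX socket path."""
    if "," in socket_path:
        # QEMU option strings are comma-separated; this sketch does not
        # attempt to escape literal commas in the path.
        raise ValueError("commas in the socket path are not supported here")
    opts = [
        "stream",
        "id=" + netdev_id,
        "server=" + ("on" if server else "off"),
        "addr.type=unix",
        "addr.path=" + socket_path,
    ]
    return ["-netdev", ",".join(opts)]


if __name__ == "__main__":
    # Prints the two argv entries to splice into a QEMU command line.
    print(stream_unix_netdev_args("/tmp/qemu.sock"))
```

The rest of the QEMU command line, including a NIC device attached to the same netdev id, is left to the reader.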

The -netdev dgram backend is a bit different. Since datagrams are inherently unidirectional, frames sent to the host use a separate socket from frames sent to the guest. The receiving program also needs to be adjusted, because QEMU (reasonably) doesn't length-prefix datagrams.
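
The reason no length prefix is needed can be shown with a small self-contained sketch: a datagram socket preserves message boundaries, so each recv() call returns exactly one frame. The names recv_frame, demo, and MAX_FRAME_SIZE are hypothetical. A connected socketpair stands in for QEMU and the host program purely to keep the demo self-contained; the real setup uses two separately bound sockets, as described above.

```python
import socket

# Assumed upper bound on a single frame; anything longer would be truncated.
MAX_FRAME_SIZE = 65536


def recv_frame(sock):
    """Return one Ethernet frame; each datagram carries exactly one."""
    return sock.recv(MAX_FRAME_SIZE)


def demo():
    """Send two frames of different sizes and read them back one by one."""
    guest_side, host_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        frames = [b"\x01" * 60, b"\x02" * 1514]
        for frame in frames:
            guest_side.send(frame)
        # No length header is parsed here: the kernel keeps the two
        # datagrams separate, so two recv() calls yield two frames.
        return [recv_frame(host_side) for _ in frames]
    finally:
        guest_side.close()
        host_side.close()
```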

print-frames-dgram-arp.py is an expanded version of the earlier example. It waits for the VM to send an ARP request for address 192.168.100.101, then prints any frames received after that request. Within the VM I turned off Avahi (noisy), manually configured the network, and used Python to send a UDP packet.

Within the -netdev dgram flag, the value of local.path= is the socket address that the host will send frames to, and remote.path= is the socket address that the host will receive frames from.
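
Because the local/remote naming is from QEMU's point of view, it is easy to swap the two paths. A minimal sketch of a helper that takes the paths from the host program's point of view and emits the option string in the form shown in the --help output (the helper name dgram_unix_netdev_args, the id net0, and the example paths are hypothetical):

```python
def dgram_unix_netdev_args(host_send_path, host_recv_path, netdev_id="net0"):
    """Build the '-netdev dgram' option pair from the host program's view.

    host_send_path: address the host program sends frames to (local.path).
    host_recv_path: address the host program receives frames on (remote.path).
    """
    for path in (host_send_path, host_recv_path):
        if "," in path:
            # This sketch does not attempt to escape commas for QEMU.
            raise ValueError("commas in socket paths are not supported here")
    opts = [
        "dgram",
        "id=" + netdev_id,
        "local.type=unix",
        "local.path=" + host_send_path,
        "remote.type=unix",
        "remote.path=" + host_recv_path,
    ]
    return ["-netdev", ",".join(opts)]


if __name__ == "__main__":
    print(dgram_unix_netdev_args("/tmp/to-guest.sock", "/tmp/from-guest.sock"))
```

The argument order matches print-frames-dgram-arp.py, which takes the send path as its first argument and the receive path as its second.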

Despite my general crankiness about the docs coverage, I'm quite happy to see this functionality land. Native support for AF_UNIX datagrams is exciting (for a certain type of person) because it eliminates a lot of the complexity involved in wiring up QEMU with a userspace network stack. Using UNIX sockets means you don't need to worry about port conflicts, you don't need TAP (so the setup is sandbox-friendly), and the VM's network won't break if the packet processor restarts.

Appendix A: qemu-wrapper.c

Nothing fancy here: it creates a socket, connects it to a user-provided path, and then executes QEMU, which inherits the connected file descriptor.

/* Copyright (c) John Millikin <john@john-millikin.com> */
/* SPDX-License-Identifier: 0BSD */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

int main(int argc, char **argv) {
	int sock_fd, rc;
	char *sock_path;
	size_t sock_path_len;
	struct sockaddr_un sock_addr = {.sun_family = AF_UNIX};

	if (argc < 3) {
		fprintf(stderr, "Usage: %s <socket> <qemu> [args...]\n", argv[0]);
		return 1;
	}

	sock_path = argv[1];
	sock_path_len = strlen(sock_path);
	if (sock_path_len >= sizeof sock_addr.sun_path) {
		fprintf(stderr, "Socket path \"%s\" too long\n", sock_path);
		return 1;
	}
	memcpy(sock_addr.sun_path, sock_path, sock_path_len + 1);

	sock_fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (sock_fd == -1) {
		perror("Failed to create socket");
		return 1;
	}

	rc = connect(sock_fd, (struct sockaddr*)&sock_addr, sizeof sock_addr);
	if (rc == -1) {
		fprintf(stderr, "Failed to connect to socket \"%s\": ", sock_path);
		perror(NULL);
		return 1;
	}

	execv(argv[2], argv + 2);
	fprintf(stderr, "%s: ", argv[2]);
	perror(NULL);
	return 1;
}

Appendix B: print-frames.py

Reads Ethernet frames from a socket, then uses Scapy to parse and print them.

The expected format of the TCP stream doesn't seem to be documented. In my testing the Ethernet frames were always prefixed with their length as a big-endian 32-bit uint.

#!/usr/bin/python3
# Copyright (c) John Millikin <john@john-millikin.com>
# SPDX-License-Identifier: 0BSD
import os
import os.path
import socket
import struct
import sys

from scapy import all as scapy

socket_path = sys.argv[1]
if os.path.exists(socket_path):
    os.remove(socket_path)

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.bind(socket_path)
sock.listen(1)
conn, addr = sock.accept()
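

def recv_exact(conn, size):
	# recv() may return fewer bytes than requested, so loop until the
	# whole buffer has arrived. Returns None if the peer closes first.
	buf = b""
	while len(buf) < size:
		chunk = conn.recv(size - len(buf))
		if len(chunk) == 0:
			return None
		buf += chunk
	return buf


while True:
	# Each frame is prefixed with its length as a big-endian 32-bit uint.
	frame_len_buf = recv_exact(conn, 4)
	if frame_len_buf is None:
		break
	(frame_len,) = struct.unpack("!L", frame_len_buf)
	frame_buf = recv_exact(conn, frame_len)
	if frame_buf is None:
		break
	frame = scapy.Ether(frame_buf)
	print(repr(frame))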
	print("")
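
The length-prefixed framing described at the top of this appendix can also be sketched as a pair of pure functions, which is handy for testing a packet processor without a running VM. The names encode_frame and decode_frames are hypothetical, and the format is only what was observed in testing (a big-endian 32-bit length before each frame), not a documented contract. Using the same prefix when writing frames back to QEMU is likewise an assumption.

```python
import struct

# Big-endian 32-bit unsigned length, as observed on the stream socket.
HEADER = struct.Struct("!L")


def encode_frame(frame):
    """Prefix one Ethernet frame with its length."""
    return HEADER.pack(len(frame)) + frame


def decode_frames(buf):
    """Split a byte buffer into complete frames plus any leftover bytes."""
    frames = []
    offset = 0
    while len(buf) - offset >= HEADER.size:
        (frame_len,) = HEADER.unpack_from(buf, offset)
        end = offset + HEADER.size + frame_len
        if end > len(buf):
            break  # the frame body has not fully arrived yet
        frames.append(bytes(buf[offset + HEADER.size:end]))
        offset = end
    return frames, bytes(buf[offset:])
```

The leftover bytes returned by decode_frames are meant to be prepended to the next chunk read from the socket, so a frame split across two reads is reassembled on the following call.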

Appendix C: print-frames-dgram-arp.py

Similar to the above, but adjusted for unidirectional sockets and expanded to verify that sending frames (ARP responses) to the VM works as expected. Within the VM, ping 192.168.100.101 and watch the ICMP frames come through.

#!/usr/bin/python3
# Copyright (c) John Millikin <john@john-millikin.com>
# SPDX-License-Identifier: 0BSD
import os
import os.path
import socket
import sys

from scapy import all as scapy

send_socket_path = sys.argv[1]
recv_socket_path = sys.argv[2]
if os.path.exists(recv_socket_path):
    os.remove(recv_socket_path)

send_sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
recv_sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
recv_sock.bind(recv_socket_path)

ready = False
while True:
	frame = scapy.Ether(recv_sock.recv(9001))
	if ready:
		print(repr(frame))
		print("")

	if not isinstance(frame.payload, scapy.ARP):
		continue
	if frame.payload.op != 1: # who-has
		continue

	if frame.payload.pdst == "192.168.100.101":
		resp_bytes = scapy.raw((scapy.Ether(
			dst="52:54:00:12:34:56",
			src="52:54:00:12:34:ff",
		) / scapy.ARP(
			op=2, # is-at
			hwsrc="52:54:00:12:34:ff",
			psrc="192.168.100.101",
			hwdst="52:54:00:12:34:56",
			pdst="192.168.100.100",
		)))
		send_sock.sendto(resp_bytes, send_socket_path)
		ready = True

  1. https://wiki.qemu.org/Documentation/Networking#Socket

  2. [PATCH v7 00/14] qapi: net: add unix socket type support to netdev backend

  3. In case the reader thinks I'm being unfair by expecting --help output to have more detail, consider that the QEMU documentation page System Emulation » Invocation is the most complete reference I can find for QEMU's -netdev flags, and it doesn't even mention the dgram or stream network backends.
