Linux provides the KMS (Kernel Mode Setting) API to let applications query and configure display settings. It's used by Wayland compositors and other programs that need to configure the hardware directly. I found the C API a little verbose and hard to follow, so I made libdrm-ocaml, which lets us run commands interactively in a REPL.
We'll start by discovering what hardware is available and how it's currently configured, then configure a monitor to display a simple bitmap, and then finally render a 3D animation. The post should be a useful introduction to KMS even if you don't know OCaml.
(This post also appeared on Hacker News.)
Table of Contents
- Running it yourself
- Querying the current state
- Making changes
- 3D rendering
- Linux VTs
- Debugging
- Conclusions
Running it yourself
If you want to follow along, you'll need to install libdrm-ocaml and an interactive REPL like utop. With Nix, you can set everything up like this:
git clone https://github.com/talex5/libdrm-ocaml
cd libdrm-ocaml
nix develop
dune utop
You should see a utop # prompt, where you can enter OCaml expressions.
Use ;; to tell the REPL you've finished typing and it's time to evaluate, e.g.
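As a trivial illustration (the expression is just an arbitrary example), you could type the following at the prompt:

```ocaml
(* Typed at the utop prompt; the trailing ;; triggers evaluation. *)
let two = 1 + 1;;
```

utop should reply with something like `val two : int = 2`.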
Alternatively, you can install things using opam (OCaml's package manager):
opam install libdrm utop
utop
Then, at the utop prompt enter #require "libdrm";; (including the leading #).
Querying the current state
Before changing anything, we'll start by discovering what hardware is available.
I'll introduce the API as we go along, but you can check the API reference docs if you want more information.
Finding devices
To list available graphics devices:
libdrm scans the /dev/dri/ directory looking for devices.
It uses stat to find the device major and minor numbers and uses the virtual /sys filesystem to get information about each one.
This is a PCI device, and the information corresponds to the values from lspci, e.g.
$ lspci -nns 0:1:0.0
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI]
Baffin [Radeon RX 550 640SP / RX 560/560X] [1002:67ff] (rev ff)
Each graphics device can have a primary and a render node. The primary node gives full access to the device, including configuring monitors, while the render node just allows applications to render scenes to memory. In the last post I was using the render node to create a 3D image, and then sending it to the Wayland compositor for display. This time we'll be doing the display ourselves, so we need to open the primary node:
To check the driver version:
If you're familiar with the C API, this corresponds to the drmGetVersion function,
and Drm.Device.list corresponds to drmGetDevices2;
I reorganised things a bit to make better use of OCaml's modules.
Listing resources
Let's see what resources we've got to play with:
Note: The Kernel Mode Setting functions are in the Drm.Kms module.
The C API calls these functions drmMode*, but I found that confusing as
e.g. drmModeGetResources sounds like you're asking for the resources of a mode.
A CRTC is a CRT Controller, and typically controls a single monitor (known as a Cathode Ray Tube for historical reasons). Framebuffers provide image data to a CRTC (we create framebuffers as needed). Connectors correspond to physical connectors (e.g. where you plug in a monitor cable). An Encoder encodes data from the CRTC for a particular connector.
Resources diagram (simplified)
Connectors
To save a bit of typing, I'll create an alias for the Drm.Kms module:
You could also open Drm.Kms to avoid needing any prefix, but I'll keep using K for clarity.
To get details for the first connector (the head of the list):
This is DisplayPort connector 1 (usually called DP-1) and it's currently Connected.
The connector also says which modes are available on the connected monitor.
I was lucky in that the first connector was the one I'm using,
but really we should get all the connectors and filter them to find the connected ones.
List.map can be used to run get on each of them:
Then to filter:
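The map-then-filter pattern can be sketched without any hardware. This is only an illustration: the `connector` record, the `get` function and the IDs 73 and 76 are stand-ins invented for the example, not libdrm-ocaml's real types or values.

```ocaml
(* Stand-in types for illustration only; not libdrm-ocaml's real records. *)
type connection = Connected | Disconnected | Unknown_connection
type connector = { id : int; name : string; connection : connection }

(* Pretend lookup, playing the role of a per-ID "get" call. *)
let get id =
  match id with
  | 71 -> { id; name = "DP-1"; connection = Connected }
  | _ -> { id; name = Printf.sprintf "conn-%d" id; connection = Disconnected }

(* Hypothetical connector IDs, as might be listed in the resources. *)
let ids = [71; 73; 76]

(* Fetch every connector, then keep only the connected ones. *)
let connectors = List.map get ids
let connected = List.filter (fun c -> c.connection = Connected) connectors

(* List.hd raises if nothing is connected; fine for interactive use. *)
let c = List.hd connected
```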
We'll investigate c, the first connected one:
A note on IDs
In the libdrm C API, IDs are just integers. To avoid mix-ups, I made them distinct types in the OCaml API. For example, if you try to use an encoder ID as a connector ID:
Normally this is what you want, but for interactive use it's annoying that you can't just pass a plain integer. e.g.
You can get any kind of ID with Drm.Id.of_int (e.g. K.Connector.get dev (Drm.Id.of_int 71)),
but that's still a bit verbose, so you might prefer to (re)define a prefix operator for it, e.g.
(note: ! is the only single-character prefix operator available in OCaml)
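To show the general idea, here is a minimal sketch of distinct ID types built over plain integers with a phantom type parameter. The `Id` module below is a hypothetical stand-in written for this example and is not libdrm-ocaml's actual definition; only the name `of_int` is taken from the text above, and `describe_connector` is invented.

```ocaml
(* Hypothetical sketch: IDs are ints at runtime, but the phantom parameter
   ['a] makes connector and encoder IDs incompatible at compile time. *)
module Id : sig
  type 'a t
  val of_int : int -> 'a t   (* escape hatch: produces any kind of ID *)
  val to_int : 'a t -> int
end = struct
  type 'a t = int
  let of_int x = x
  let to_int x = x
end

type connector_id = [`Connector] Id.t
type encoder_id = [`Encoder] Id.t

(* Stand-in for a lookup that only accepts connector IDs. Passing an
   [encoder_id] here would be rejected by the type checker. *)
let describe_connector (id : connector_id) =
  Printf.sprintf "connector %d" (Id.to_int id)

(* Redefine the prefix operator [!] as shorthand for [Id.of_int]. *)
let ( ! ) = Id.of_int

let () = print_endline (describe_connector !71)
```

Note that redefining `!` shadows the usual dereference operator for `ref` cells, which is usually acceptable in a throwaway REPL session but not in a real program.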
Modes
Modes are shown in abbreviated form in the connector output. To see the full list:
Note: I annotated various pretty-printer functions with [@@ocaml.toplevel_printer],
which causes utop to use them by default to display values of the corresponding type.
For example, showing a list of modes uses this short summary form.
Displaying an individual mode shows all the information.
Here's the first mode:
Properties
Some resources can also have extra properties.
Use get_properties to fetch them:
Linux only returns a subset of the properties until you enable the atomic feature. Let's turn that on now:
(Module.(expr) is a short-hand that brings all of Module's symbols into scope for expr,
so we don't have to repeat the module name for both set and atomic)
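The same local-open syntax works with any module. A small example using the standard library's `List`:

```ocaml
(* [List.(...)] opens List only inside the parentheses, so [length] and
   [map] need no prefix there. *)
let n = List.(length (map succ [1; 2; 3]))

(* Equivalent, with the module name repeated: *)
let n' = List.length (List.map succ [1; 2; 3])

let () = assert (n = n')
```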
And getting the properties again, we now have an extra CRTC_ID,
telling us which controller this connector is currently using:
Encoders
The Linux documentation says:
Those are really just internal artifacts of the helper libraries used to implement KMS drivers. Besides that they make it unnecessarily more complicated for userspace to figure out which connections between a CRTC and a connector are possible, and what kind of cloning is supported, they serve no purpose in the userspace API. Unfortunately encoders have been exposed to userspace, hence can’t remove them at this point. Furthermore the exposed restrictions are often wrongly set by drivers, and in many cases not powerful enough to express the real restrictions.
OK. Well, let's take a look anyway:
Note: We need Option.get here because a connector might not have an encoder set yet.
Where the C API uses 0 to indicate no resource,
the OCaml API uses None to force us to think about that case.
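As a rough sketch of that convention, the conversion from a C-style raw ID to an option could look like this. `id_of_raw` is a hypothetical helper written for illustration, not a function from libdrm-ocaml.

```ocaml
(* Hypothetical helper: convert a C-style raw ID, where 0 means
   "no resource", into an option. *)
let id_of_raw (raw : int) : int option =
  if raw = 0 then None else Some raw

(* Pattern matching forces both cases to be handled. *)
let describe raw =
  match id_of_raw raw with
  | None -> "no encoder set"
  | Some id -> Printf.sprintf "encoder %d" id

let () = print_endline (describe 0)
```

`Option.get` skips the match and raises `Invalid_argument` on `None`, which is acceptable in a REPL session where you can see that the value is present.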
As the documentation says, the encoder is mainly useful to get the CRTC ID:
We could instead have got that directly from the connector using its properties:
CRT Controllers
An active CRTC has a mode set (presumably from the connector's list of supported modes), and a framebuffer with the image to be displayed.
If I keep calling Crtc.get, I see that it is sometimes showing framebuffer 93 and sometimes 94.
My Wayland compositor (Sway) updates one framebuffer while the other is being shown, then switches which one is displayed.
Framebuffers
My CRTC is currently displaying the contents of framebuffer 93:
A framebuffer has up to 4 framebuffer planes (not to be confused with CRTC planes; see later), each of which references a buffer object (also known as a BO and referenced with a GEM handle).
This framebuffer is using the XR24 format, where there is a single BO with 32 bits for each pixel
(8 for red, 8 green, 8 blue and 8 unused).
Some formats use e.g. a separate buffer for each component
(or a different part of the same buffer, using offset).
Modern graphics cards also support format modifiers, but my card is too old so I just get None.
Linux's drm_fourcc.h header file describes the various formats and modifiers.
Modifiers seem to be mainly used to specify the tiling.
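Format names such as XR24 are four-character codes. A minimal sketch of how such a code packs into a 32-bit number, following the scheme of the kernel header's `fourcc_code` macro (first character in the lowest byte); the OCaml function names here are my own, and the code assumes a 64-bit platform where `int` holds at least 32 bits:

```ocaml
(* Pack four ASCII characters into one integer, first character in the
   lowest byte. Assumes OCaml's int is wider than 32 bits. *)
let fourcc s =
  if String.length s <> 4 then invalid_arg "fourcc: need exactly 4 characters";
  let byte i = Char.code s.[i] in
  byte 0 lor (byte 1 lsl 8) lor (byte 2 lsl 16) lor (byte 3 lsl 24)

(* Inverse: recover the four characters from the packed code. *)
let to_string code =
  String.init 4 (fun i -> Char.chr ((code lsr (8 * i)) land 0xff))

let () = Printf.printf "XR24 = 0x%08x\n" (fourcc "XR24")
```

This prints `XR24 = 0x34325258`.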
I don't have permission to see the buffer object, so it appears as (handle = None).
The pitch is the number of bytes from one row to the next (also known as the stride).
Here, the 15360 is simply the width (3840) multiplied by the 4 bytes per pixel.
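The arithmetic can be written out as a short sketch. The function names are invented for the example, and it assumes rows are not padded (drivers may use a larger pitch than width × 4):

```ocaml
(* XR24: one 32-bit word per pixel, laid out as 0xXXRRGGBB
   (the top byte is unused). *)
let xr24 ~r ~g ~b = (r lsl 16) lor (g lsl 8) lor b

let bytes_per_pixel = 4

(* Bytes from the start of one row to the start of the next,
   assuming no row padding. *)
let pitch ~width = width * bytes_per_pixel

(* Byte offset of pixel (x, y) within the buffer. *)
let offset ~pitch ~x ~y = (y * pitch) + (x * bytes_per_pixel)

let () = Printf.printf "pitch = %d\n" (pitch ~width:3840)
```

For a 3840-pixel-wide mode this prints `pitch = 15360`, matching the value reported for the framebuffer.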
CRTC planes
In fact, Crtc.get is an old API that only covers the basic case of a single framebuffer.
In reality, a CRTC can combine multiple CRTC planes, which for some reason aren't returned with the other resources
and must be requested separately:
(note: you need to enable "atomic" mode before requesting planes; we already did that above)
A lot of these planes aren't being used (don't have a CRTC), which we can check for with a helper function:
Looks like Sway is using two planes at the moment:
More information is available as properties:
- Plane 52 is a Primary plane and is using framebuffer 93 (as we saw before).
- Plane 55 is a Cursor plane, using framebuffer 98 (and the AR24 format, with alpha/transparency).
A plane chooses which part of the frame buffer to show (SRC_X, SRC_Y, SRC_W and SRC_H)
and where it should appear on the screen (CRTC_X, CRTC_Y, CRTC_W and CRTC_H).
The source values are in 16.16 format (i.e. shifted left 16 bits).
Oddly, Plane.get returned crtc_x,crtc_y = 0,0 for both planes, but
the properties show the correct cursor location (CRTC_X = 3105; CRTC_Y = 1518;).
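Converting between whole pixels and 16.16 fixed point is just a shift. A small sketch (the function names are mine):

```ocaml
(* 16.16 fixed point: the upper bits hold the integer part and the low
   16 bits hold the fraction. *)
let to_fixed px = px lsl 16

(* Drops the fractional part. *)
let of_fixed v = v asr 16

let () = Printf.printf "SRC_W for 3840 pixels = %d\n" (to_fixed 3840)
```

So a source width of 3840 pixels appears in the properties as 251658240.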
Having the cursor on a separate plane avoids having to modify the main screen image whenever the mouse pointer moves. That is good for latency (especially if the GPU is busy rendering something else at the time) and for power consumption (the GPU can stay powered down), and it allows showing an application's buffer full screen without the compositor needing to modify that buffer.
You might also have some Overlay planes,
which can be useful for displaying video.
My graphics card seems to be too old for that.
Expanded resources diagram
Here's an expanded diagram showing some more possibilities:
- Some framebuffer formats take the input data from multiple buffers.
- A framebuffer can be shared by multiple CRTCs (perhaps with each plane showing a different part of it).
- A CRTC can have multiple planes (e.g. primary and cursor).
- A single CRTC can show the same image on multiple monitors.
Making changes
If I try turning off the CRTC (by setting the mode to None) from my desktop environment, it fails:
The reason is that I'm currently running a graphical desktop and Sway owns the device
(so my dev is not the DRM "master"):
That can be fixed by switching to a different VT (e.g. with Ctrl-Alt-F2) and running it there. However, this will result in a second problem: I won't be able to see what I'm doing!
If you have a second computer then you can SSH in and test things out from there, but for simplicity we'll leave the utop REPL at this point and write some programs instead.
For example, query.ml shows the information we discovered above:
dune exec -- ./examples/query.exe
Non-atomic mode setting
Linux provides two ways to configure modes: the old non-atomic API and the newer atomic one.
examples/nonatomic.ml contains a simple example of the older (but simpler) API.
It starts by finding a device (the first one with a primary node supporting KMS), then
finds all connected connectors (as we did above), and calls show_test_page on each one:
restoring_afterwards stores the current configuration, runs the callback,
and then puts things back to normal when that finishes (or you press Ctrl-C).
The program waits for 2 seconds after showing the test page before exiting.
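The save/run/restore shape can be sketched with the standard library's `Fun.protect`. This is not the example's actual implementation: the `~save` and `~restore` arguments are hypothetical stand-ins for whatever reads and re-applies the display configuration, and a mutable reference plays the role of the hardware state.

```ocaml
(* Hypothetical sketch: remember the current state, run [fn], and restore
   the state afterwards even if [fn] raises an exception. *)
let restoring_afterwards ~save ~restore fn =
  let saved = save () in
  Fun.protect ~finally:(fun () -> restore saved) fn

(* Demo with a mutable ref standing in for the display configuration. *)
let state = ref "desktop"

let () =
  restoring_afterwards
    ~save:(fun () -> !state)
    ~restore:(fun s -> state := s)
    (fun () -> state := "test page");
  assert (!state = "desktop")
```

For the cleanup to run on Ctrl-C in a sketch like this, the interrupt would need to arrive as an exception, for example by calling `Sys.catch_break true` at startup.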
show_test_page finds the CRTC (as we did above),
takes the first supported mode, creates a test framebuffer of that size,
and configures the CRTC to display it:
If the connector doesn't have a CRTC, we could find a suitable one and use that, but for simplicity the example just skips such connectors.
To run the example (switch away from any graphical desktop first or it won't work):
dune exec -- ./examples/nonatomic.exe
Dumb buffers
Typically the pixel data to be displayed comes from some complex rendering pipeline,
but Linux also provides dumb buffers for simple cases such as testing.
The Test_image.create function used above creates a dumb buffer with a test pattern:
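To give a feel for what filling such a buffer involves, here is a CPU-only sketch that writes a gradient in XR24 layout. It is not the `Test_image.create` code: a real dumb buffer is allocated by the kernel and memory-mapped, whereas plain `Bytes` stands in for that mapping here, and both the function name and the gradient pattern are made up for the example.

```ocaml
(* Fill a buffer with a red/green gradient, one little-endian 32-bit
   XR24 word (0xXXRRGGBB) per pixel. Assumes pitch = width * 4. *)
let make_test_pattern ~width ~height =
  let pitch = width * 4 in
  let buf = Bytes.create (pitch * height) in
  for y = 0 to height - 1 do
    for x = 0 to width - 1 do
      (* Red increases left to right, green top to bottom. *)
      let r = x * 255 / max 1 (width - 1) in
      let g = y * 255 / max 1 (height - 1) in
      let pixel = Int32.of_int ((r lsl 16) lor (g lsl 8) lor 0x80) in
      Bytes.set_int32_le buf ((y * pitch) + (x * 4)) pixel
    done
  done;
  buf
```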