drm/tegra NVIDIA Tegra GPU and display driver¶
NVIDIA Tegra SoCs support a set of display, graphics and video functions via the host1x controller. host1x supplies command streams, gathered from a push buffer provided directly by the CPU, to its clients via channels. Software, or blocks amongst themselves, can use syncpoints for synchronization.
Up until, but not including, Tegra124 (aka Tegra K1) the drm/tegra driver supports the built-in GPU, comprised of the gr2d and gr3d engines. Starting with Tegra124 the GPU is based on the NVIDIA desktop GPU architecture and supported by the drm/nouveau driver.
The drm/tegra driver supports NVIDIA Tegra SoC generations since Tegra20. It has three parts:
A host1x driver that provides infrastructure and access to the host1x services.
A KMS driver that supports the display controllers as well as a number of outputs, such as RGB, HDMI, DSI, and DisplayPort.
A set of custom userspace IOCTLs that can be used to submit jobs to the GPU and video engines via host1x.
Driver Infrastructure¶
The various host1x clients need to be bound together into a logical device in order to expose their functionality to users. The infrastructure that supports this is implemented in the host1x driver. When a driver is registered with the infrastructure it provides a list of compatible strings specifying the devices that it needs. The infrastructure creates a logical device and scan the device tree for matching device nodes, adding the required clients to a list. Drivers for individual clients register with the infrastructure as well and are added to the logical host1x device.
Once all clients are available, the infrastructure will initialize the logical device using a driver-provided function which will set up the bits specific to the subsystem and in turn initialize each of its clients.
Similarly, when one of the clients is unregistered, the infrastructure will destroy the logical device by calling back into the driver, which ensures that the subsystem specific bits are torn down and the clients destroyed in turn.
Host1x Infrastructure Reference¶
- struct host1x_client_ops
host1x client operations
Definition
struct host1x_client_ops {
int (*init)(struct host1x_client *client);
int (*exit)(struct host1x_client *client);
int (*suspend)(struct host1x_client *client);
int (*resume)(struct host1x_client *client);
};
Members
init
host1x client initialization code
exit
host1x client tear down code
suspend
host1x client suspend code
resume
host1x client resume code
- struct host1x_client
host1x client structure
Definition
struct host1x_client {
struct list_head list;
struct device *host;
struct device *dev;
struct iommu_group *group;
const struct host1x_client_ops *ops;
enum host1x_class class;
struct host1x_channel *channel;
struct host1x_syncpt **syncpts;
unsigned int num_syncpts;
struct host1x_client *parent;
unsigned int usecount;
struct mutex lock;
};
Members
list
list node for the host1x client
host
pointer to struct device representing the host1x controller
dev
pointer to struct device backing this host1x client
group
IOMMU group that this client is a member of
ops
host1x client operations
class
host1x class represented by this client
channel
host1x channel associated with this client
syncpts
array of syncpoints requested for this client
num_syncpts
number of syncpoints requested for this client
- struct host1x_driver
host1x logical device driver
Definition
struct host1x_driver {
struct device_driver driver;
const struct of_device_id *subdevs;
struct list_head list;
int (*probe)(struct host1x_device *device);
int (*remove)(struct host1x_device *device);
void (*shutdown)(struct host1x_device *device);
};
Members
driver
core driver
subdevs
table of OF device IDs matching subdevices for this driver
list
list node for the driver
probe
called when the host1x logical device is probed
remove
called when the host1x logical device is removed
shutdown
called when the host1x logical device is shut down
-
int host1x_device_init(struct host1x_device *device)¶
initialize a host1x logical device
Parameters
struct host1x_device * device
host1x logical device
Description
The driver for the host1x logical device can call this during execution of
its host1x_driver.probe
implementation to initialize each of its clients.
The client drivers access the subsystem specific driver data using the
host1x_client.parent
field and driver data associated with it (usually by
calling dev_get_drvdata()).
-
int host1x_device_exit(struct host1x_device *device)¶
uninitialize host1x logical device
Parameters
struct host1x_device * device
host1x logical device
Description
When the driver for a host1x logical device is unloaded, it can call this function to tear down each of its clients. Typically this is done after a subsystem-specific data structure is removed and the functionality can no longer be used.
-
int host1x_driver_register_full(struct host1x_driver *driver, struct module *owner)¶
register a host1x driver
Parameters
struct host1x_driver * driver
host1x driver
struct module * owner
owner module
Description
Drivers for host1x logical devices call this function to register a driver with the infrastructure. Note that since these drive logical devices, the registration of the driver actually triggers tho logical device creation. A logical device will be created for each host1x instance.
-
void host1x_driver_unregister(struct host1x_driver *driver)¶
unregister a host1x driver
Parameters
struct host1x_driver * driver
host1x driver
Description
Unbinds the driver from each of the host1x logical devices that it is bound to, effectively removing the subsystem devices that they represent.
-
int host1x_client_register(struct host1x_client *client)¶
register a host1x client
Parameters
struct host1x_client * client
host1x client
Description
Registers a host1x client with each host1x controller instance. Note that
each client will only match their parent host1x controller and will only be
associated with that instance. Once all clients have been registered with
their parent host1x controller, the infrastructure will set up the logical
device and call host1x_device_init()
, which will in turn call each client’s
host1x_client_ops.init
implementation.
-
int host1x_client_unregister(struct host1x_client *client)¶
unregister a host1x client
Parameters
struct host1x_client * client
host1x client
Description
Removes a host1x client from its host1x controller instance. If a logical device has already been initialized, it will be torn down.
Host1x Syncpoint Reference¶
-
u32 host1x_syncpt_id(struct host1x_syncpt *sp)¶
retrieve syncpoint ID
Parameters
struct host1x_syncpt * sp
host1x syncpoint
Description
Given a pointer to a struct host1x_syncpt, retrieves its ID. This ID is often used as a value to program into registers that control how hardware blocks interact with syncpoints.
-
u32 host1x_syncpt_incr_max(struct host1x_syncpt *sp, u32 incrs)¶
update the value sent to hardware
Parameters
struct host1x_syncpt * sp
host1x syncpoint
u32 incrs
number of increments
-
int host1x_syncpt_incr(struct host1x_syncpt *sp)¶
increment syncpoint value from CPU, updating cache
Parameters
struct host1x_syncpt * sp
host1x syncpoint
-
int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout, u32 *value)¶
wait for a syncpoint to reach a given value
Parameters
struct host1x_syncpt * sp
host1x syncpoint
u32 thresh
threshold
long timeout
maximum time to wait for the syncpoint to reach the given value
u32 * value
return location for the syncpoint value
-
struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client, unsigned long flags)¶
request a syncpoint
Parameters
struct host1x_client * client
client requesting the syncpoint
unsigned long flags
flags
Description
host1x client drivers can use this function to allocate a syncpoint for
subsequent use. A syncpoint returned by this function will be reserved for
use by the client exclusively. When no longer using a syncpoint, a host1x
client driver needs to release it using host1x_syncpt_free()
.
-
void host1x_syncpt_free(struct host1x_syncpt *sp)¶
free a requested syncpoint
Parameters
struct host1x_syncpt * sp
host1x syncpoint
Description
Release a syncpoint previously allocated using host1x_syncpt_request()
. A
host1x client driver should call this when the syncpoint is no longer in
use. Note that client drivers must ensure that the syncpoint doesn’t remain
under the control of hardware after calling this function, otherwise two
clients may end up trying to access the same syncpoint concurrently.
-
u32 host1x_syncpt_read_max(struct host1x_syncpt *sp)¶
read maximum syncpoint value
Parameters
struct host1x_syncpt * sp
host1x syncpoint
Description
The maximum syncpoint value indicates how many operations there are in queue, either in channel or in a software thread.
-
u32 host1x_syncpt_read_min(struct host1x_syncpt *sp)¶
read minimum syncpoint value
Parameters
struct host1x_syncpt * sp
host1x syncpoint
Description
The minimum syncpoint value is a shadow of the current sync point value in hardware.
-
u32 host1x_syncpt_read(struct host1x_syncpt *sp)¶
read the current syncpoint value
Parameters
struct host1x_syncpt * sp
host1x syncpoint
-
struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, unsigned int id)¶
obtain a syncpoint by ID
Parameters
struct host1x * host
host1x controller
unsigned int id
syncpoint ID
-
struct host1x_syncpt_base *host1x_syncpt_get_base(struct host1x_syncpt *sp)¶
obtain the wait base associated with a syncpoint
Parameters
struct host1x_syncpt * sp
host1x syncpoint
-
u32 host1x_syncpt_base_id(struct host1x_syncpt_base *base)¶
retrieve the ID of a syncpoint wait base
Parameters
struct host1x_syncpt_base * base
host1x syncpoint wait base
KMS driver¶
The display hardware has remained mostly backwards compatible over the various Tegra SoC generations, up until Tegra186 which introduces several changes that make it difficult to support with a parameterized driver.
Display Controllers¶
Tegra SoCs have two display controllers, each of which can be associated with zero or more outputs. Outputs can also share a single display controller, but only if they run with compatible display timings. Two display controllers can also share a single framebuffer, allowing cloned configurations even if modes on two outputs don’t match. A display controller is modelled as a CRTC in KMS terms.
On Tegra186, the number of display controllers has been increased to three. A display controller can no longer drive all of the outputs. While two of these controllers can drive both DSI outputs and both SOR outputs, the third cannot drive any DSI.
Windows¶
A display controller controls a set of windows that can be used to composite multiple buffers onto the screen. While it is possible to assign arbitrary Z ordering to individual windows (by programming the corresponding blending registers), this is currently not supported by the driver. Instead, it will assume a fixed Z ordering of the windows (window A is the root window, that is, the lowest, while windows B and C are overlaid on top of window A). The overlay windows support multiple pixel formats and can automatically convert from YUV to RGB at scanout time. This makes them useful for displaying video content. In KMS, each window is modelled as a plane. Each display controller has a hardware cursor that is exposed as a cursor plane.
Outputs¶
The type and number of supported outputs varies between Tegra SoC generations. All generations support at least HDMI. While earlier generations supported the very simple RGB interfaces (one per display controller), recent generations no longer do and instead provide standard interfaces such as DSI and eDP/DP.
Outputs are modelled as a composite encoder/connector pair.
RGB/LVDS¶
This interface is no longer available since Tegra124. It has been replaced by the more standard DSI and eDP interfaces.
HDMI¶
HDMI is supported on all Tegra SoCs. Starting with Tegra210, HDMI is provided by the versatile SOR output, which supports eDP, DP and HDMI. The SOR is able to support HDMI 2.0, though support for this is currently not merged.
DSI¶
Although Tegra has supported DSI since Tegra30, the controller has changed in several ways in Tegra114. Since none of the publicly available development boards prior to Dalmore (Tegra114) have made use of DSI, only Tegra114 and later are supported by the drm/tegra driver.
eDP/DP¶
eDP was first introduced in Tegra124 where it was used to drive the display panel for notebook form factors. Tegra210 added support for full DisplayPort support, though this is currently not implemented in the drm/tegra driver.
Userspace Interface¶
The userspace interface provided by drm/tegra allows applications to create GEM buffers, access and control syncpoints as well as submit command streams to host1x.
GEM Buffers¶
The DRM_IOCTL_TEGRA_GEM_CREATE
IOCTL is used to create a GEM buffer object
with Tegra-specific flags. This is useful for buffers that should be tiled, or
that are to be scanned out upside down (useful for 3D content).
After a GEM buffer object has been created, its memory can be mapped by an
application using the mmap offset returned by the DRM_IOCTL_TEGRA_GEM_MMAP
IOCTL.
Syncpoints¶
The current value of a syncpoint can be obtained by executing the
DRM_IOCTL_TEGRA_SYNCPT_READ
IOCTL. Incrementing the syncpoint is achieved
using the DRM_IOCTL_TEGRA_SYNCPT_INCR
IOCTL.
Userspace can also request blocking on a syncpoint. To do so, it needs to
execute the DRM_IOCTL_TEGRA_SYNCPT_WAIT
IOCTL, specifying the value of
the syncpoint to wait for. The kernel will release the application when the
syncpoint reaches that value or after a specified timeout.
Command Stream Submission¶
Before an application can submit command streams to host1x it needs to open a
channel to an engine using the DRM_IOCTL_TEGRA_OPEN_CHANNEL
IOCTL. Client
IDs are used to identify the target of the channel. When a channel is no
longer needed, it can be closed using the DRM_IOCTL_TEGRA_CLOSE_CHANNEL
IOCTL. To retrieve the syncpoint associated with a channel, an application
can use the DRM_IOCTL_TEGRA_GET_SYNCPT
.
After opening a channel, submitting command streams is easy. The application
writes commands into the memory backing a GEM buffer object and passes these
to the DRM_IOCTL_TEGRA_SUBMIT
IOCTL along with various other parameters,
such as the syncpoints or relocations used in the job submission.