Overcoming Software Limitations in AMD-based FPGA Video Designs


Introduction

In today’s market, video capture and display designs have become very common, especially in fields like the medical industry. While both FPGAs and traditional SoCs offer hardware IP that enables video capture, the former are a popular design choice given the ability to fully customize a video pipeline and the flexibility they offer in both the design phase and in production. With devices like AMD’s (formerly Xilinx) Zynq UltraScale+, companies can have the best of both the FPGA and SoC worlds, leveraging the customizability and flexibility of an FPGA while still retaining access to powerful Arm processors and tools that can drive the capture and further processing of video.

In this article, we will explore the common use case of capturing video on AMD’s FPGA-based SoCs and the importance of hardware design flexibility in the matter.

The Challenge

AMD provides a large suite of video IP, including video scalers, colorspace converters, and mixers, that enables the user to quickly construct complex video capture pipelines. Additionally, AMD provides a suite of drivers, both bare-metal and Linux, that help minimize software development effort and cost for a particular product.

Engineers often find it advantageous to leverage Linux in these designs, especially given the capability of frameworks like the V4L2 subsystem and GStreamer. While AMD provides powerful drivers that allow their video IP to fit within the Linux ecosystem, these drivers often target specific design use cases or carry software limitations. It is important to distinguish hardware capabilities from software limitations: just because the hardware is capable of a function does not mean that the existing software driver or video framework is. This can have major ramifications on time to market and overall non-recurring engineering cost for a product, especially if the designer is not familiar with the complexity of custom Linux driver development.

To help the user understand this on a deeper level, let’s examine three known design cases where the hardware supports the architecture but the Linux software does not:

Examples

#1 Gamma LUT IP

AMD’s Gamma LUT IP enables hardware-accelerated gamma correction of video streams. The IP supports both RGB and YUV 4:4:4 streams; however, as of this writing, the associated Linux V4L2 subdevice driver only supports RGB mode.
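To make the transform concrete, here is a minimal software sketch of the gamma correction that a lookup-table IP performs per color channel. This is illustrative only, not the driver API: the function name, the 8-bit depth, and the gamma value of 2.2 are assumptions chosen for the example.

```python
# Build an 8-bit gamma-correction lookup table in pure Python -- a software
# sketch of the per-channel transform a hardware gamma LUT applies.
# build_gamma_lut and the gamma=2.2 value are illustrative assumptions.
def build_gamma_lut(gamma: float, bits: int = 8) -> list[int]:
    max_val = (1 << bits) - 1
    # Normalize each code to [0, 1], apply the power curve, and rescale.
    return [round(max_val * ((i / max_val) ** (1.0 / gamma)))
            for i in range(max_val + 1)]

lut = build_gamma_lut(2.2)
# Applying the correction to a pixel is then a simple table lookup per channel.
corrected = [lut[p] for p in (0, 64, 128, 255)]
```

In hardware, the table is precomputed exactly like this and each pixel component is corrected with one memory lookup, which is why the IP can run at line rate.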

#2 Framebuffer DMAs

The framebuffer read/write IP from AMD provides a format-aware DMA engine able to move video data to and from memory. In the video capture case, we use the “write” flavor of the IP to capture video data to a buffer in memory. With this IP, AMD provides not only the DMA driver but also a composite V4L2 driver. This is extremely powerful: out of the box, the user can quickly stand up video capture while minimizing custom software development. However, users who want a YUV 4:4:4 memory format for the DMA will find that GStreamer does not currently support any of the YUV 4:4:4 memory formats the DMA engine can produce. While custom software could still support them, this is a severe limitation when the user wants GStreamer as the primary tool driving video capture.

#3 Video Mixer IP

The Video Mixer IP from AMD allows the mixing and blending of multiple video streams into a single output stream. Oftentimes it is desirable to use this on the video capture side, for example to mix multiple camera inputs into a single stream that Linux can capture. However, the mixer’s Linux driver is modeled as a DRM/KMS (display) driver: if used “out of the box” in Linux, it is intended for the display side and not directly as a video capture device.
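For readers less familiar with what “mixing and blending” means at the pixel level, the following sketch shows a simple global-alpha blend of one layer onto another, the kind of operation a hardware video mixer performs per pixel. The function name and the flat-list frame representation are assumptions made for the example; a real mixer also handles positioning, per-pixel alpha, and multiple layers.

```python
def alpha_blend(background: list[int], overlay: list[int], alpha: float) -> list[int]:
    """Blend two equally sized 8-bit pixel buffers: out = a*overlay + (1-a)*bg.
    This is an illustrative software model of global-alpha layer blending;
    the flat lists stand in for video frames."""
    assert len(background) == len(overlay)
    assert 0.0 <= alpha <= 1.0
    return [round(alpha * o + (1.0 - alpha) * b)
            for b, o in zip(background, overlay)]

bg = [0, 100, 200, 255]   # four background pixel samples
fg = [255, 50, 50, 0]     # four overlay pixel samples
mixed = alpha_blend(bg, fg, 0.5)
```

Doing this in the FPGA fabric for every pixel of several streams is exactly the workload that is cheap in hardware but expensive on a CPU, which is why losing easy software access to the IP on the capture side is painful.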

Solutions

These are only three specific examples; depending on the use case, the user may find others. While virtually anything is possible given enough customization, practically speaking, most designers are constrained by schedule, budget, and perhaps their depth of knowledge in the matter. Fortunately, with FPGAs the hardware is flexible. The design team should leverage that flexibility to their advantage whenever possible to maximize the reuse of existing software and tools.

For instance, in the case of example #1, if YUV 4:4:4-based gamma correction is a hard requirement but the software team is not able to customize the Linux driver, the hardware team could insert additional colorspace converter (CSC) IP to convert the video to RGB so it can be gamma corrected, then convert it back to YUV 4:4:4 downstream.
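The conversion the inserted CSC stages would perform can be sketched in software. The coefficients below are one common full-range BT.601 matrix, used here purely for illustration; the function names are made up for the example, and a real CSC IP is configured for whichever standard and range the pipeline actually uses.

```python
# Illustrative software model of an RGB <-> YUV colorspace conversion,
# using full-range BT.601 coefficients (an assumption for this sketch).
def rgb_to_yuv(r: int, g: int, b: int) -> tuple[int, int, int]:
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128
    return round(y), round(u), round(v)

def yuv_to_rgb(y: int, u: int, v: int) -> tuple[int, int, int]:
    r = y + 1.402 * (v - 128)
    g = y - 0.344 * (u - 128) - 0.714 * (v - 128)
    b = y + 1.772 * (u - 128)
    clamp = lambda x: max(0, min(255, round(x)))
    return clamp(r), clamp(g), clamp(b)
```

Note that a round trip through 8-bit YUV and back is not perfectly lossless (rounding costs a code or two per channel), which is worth weighing when inserting extra CSC stages purely to work around a software limitation.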

For example #2, the designer could consider leveraging CSC IP on the capture front end to convert YUV 4:4:4 to RGB, which GStreamer does support, before it lands in memory. The data could then be converted back to YUV 4:4:4 in a later hardware or software stage of the video output or processing path. Alternatively, the designer could consider whether YUV 4:2:2 would be an acceptable alternative, since GStreamer supports many more 4:2:2 formats.
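Why is 4:2:2 often an acceptable trade-off? It keeps full-resolution luma and halves only the horizontal chroma rate, to which the eye is less sensitive. A minimal sketch of the subsampling step, with a made-up function name and one scanline represented as flat lists:

```python
def subsample_444_to_422(y: list[int], u: list[int], v: list[int]):
    """Convert one scanline of 4:4:4 chroma to 4:2:2 by averaging each
    horizontal pair of chroma samples (an illustrative filter choice).
    Luma is untouched, so the detail the eye sees most is preserved.
    Assumes an even line width."""
    assert len(y) == len(u) == len(v) and len(u) % 2 == 0
    u422 = [(u[i] + u[i + 1]) // 2 for i in range(0, len(u), 2)]
    v422 = [(v[i] + v[i + 1]) // 2 for i in range(0, len(v), 2)]
    return y, u422, v422
```

The result carries two chroma samples for every four luma samples, cutting the chroma bandwidth (and the memory footprint of the captured buffer) in half relative to 4:4:4.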

#3 is a more complex case. Depending on interest, we may explore potential solutions for it in a future article.

Conclusion

Suffice it to say, this subject is complex. It is thus important that you:

  1. Educate yourself and your team on the various AMD IP, software drivers, and their system-level capabilities.
  2. Start a conversation with both the hardware and software teams early on in the design phase to identify any of these potential challenges.
  3. Be open to modifying design requirements where practical, weighing development cost against the impact on system resources and throughput.

Additionally, consider bringing in an expert like Cornersoft Solutions to your design conversation and planning early on. Doing so can potentially save your company time and money in the end. Having a knowledgeable partner like us as another tool in your arsenal can only benefit you and your team long term.

What have you found to be the most challenging aspect of your AMD-based video capture design? Leave a comment below, as we would love to discuss the topic some more with the community!
