AV1 Video Encode at 1W Per Stream

AMD this morning is announcing a new dedicated media accelerator and video encode card for the data center, the Alveo MA35D. The first Alveo to be released under the AMD brand, this card is the successor to an earlier line of Xilinx cards that AMD took on as part of its Xilinx acquisition, and it marks AMD’s entry into the market for dedicated video encoding cards. The latest-generation Alveo media accelerator adds encoding support for AV1 and 8K resolutions while quadrupling the maximum number of simultaneous video streams, promising a significant performance boost over the previous generation.

Like its predecessor, the Alveo U30, the MA35D is a pure video encode card designed for the data center. That is to say, its ASICs are designed specifically for real-time/interactive video encoding, with the former Xilinx team aiming to do that one thing very well. This design strategy contrasts sharply with competing products from Intel (GPU Flex Series) and NVIDIA (T4 & L4). Those products are GPU-based, and lean on the flexibility of their GPUs and integrated video encoders to serve as video encode cards, gaming cards, or whatever other role is assigned to them. The MA35D, by comparison, is a relatively straightforward product, focused solely on making video encoding as efficient as possible.

The Alveo MA35D is both new and familiar for AMD, as it belongs to a product line AMD inherited as part of its Xilinx acquisition and was consequently developed by the Adaptive & Embedded Computing Group. Previous data center video encoding products released by AMD were based on its GPU lineup. So while this is just the latest video encode card for the former Xilinx team, it is the first time AMD has launched a dedicated video encoding card like this, and it is a prime example of the new market opportunities AMD was looking for when acquiring Xilinx.

The card’s target market, like its predecessor’s, is the data center. AMD is chiefly going after customers that need to encode a large number of video streams in real time in a server environment, such as live streaming services and other interactive video services (Twitch, cloud gaming, video conferencing, etc.). Like AMD’s EPYC processors, this is a server part aimed at a very select group of companies.

Diving into the Alveo MA35D hardware itself, AMD is touting a significant generational upgrade over its predecessor. The Alveo U30 was an H.264 and H.265 encoding card that could encode up to 8 1080p streams, while the Alveo MA35D greatly extends this to 32 1080p streams. Meanwhile, support for the latest-generation AV1 codec has been added, joining the existing H.264 and H.265 options, and the maximum stream resolution has increased from 4K to 8K, a quadrupling in its own right.
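As a quick sanity check on the generational scaling described above, the short Python sketch below works out the two 4x figures. The stream counts are AMD's; the frame sizes are the standard UHD dimensions, which the article does not spell out and are assumed here.

```python
# Stream counts as quoted by AMD for each card.
U30_MAX_1080P_STREAMS = 8
MA35D_MAX_1080P_STREAMS = 32

# Assumption: "4K" and "8K" refer to the standard UHD frame sizes.
UHD_4K = (3840, 2160)
UHD_8K = (7680, 4320)


def pixels(resolution):
    """Return the number of pixels in one frame of the given (width, height)."""
    width, height = resolution
    return width * height


stream_scaling = MA35D_MAX_1080P_STREAMS / U30_MAX_1080P_STREAMS
pixel_scaling = pixels(UHD_8K) / pixels(UHD_4K)

print(f"Simultaneous 1080p streams: {stream_scaling:.0f}x the U30")
print(f"Pixels per frame, 8K vs. 4K: {pixel_scaling:.0f}x")
```

Both ratios come out to 4, which is why the jump in maximum resolution counts as a quadrupling separate from the jump in stream count.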

At the heart of the card is AMD’s unnamed video encoding ASIC, which the company calls a video processing unit (VPU). The MA35D includes two VPUs, each with its own 8GB pool of LPDDR5 memory and a PCIe 5.0 x4 connection to the host processor. The VPUs are built on a 5nm process, but curiously AMD isn’t disclosing the fab used, which leads me to suspect it’s Samsung’s 5nm process (ed: at this point, if someone is using TSMC, they’re usually bragging about it).

Internally, each VPU contains four video encoding blocks, plus various accessory blocks required to make it a fully functional chip. Two of the encoding blocks are full-featured and support H.264, H.265, and AV1, while the other two blocks are AV1-only. This highlights the computational complexity of the new codec. Other blocks in the VPU include a video decoder block for transcoding, a memory controller, a management controller, a bitrate scaler, a compositing engine, and a 22 TOPS throughput AI processor that further enhances the card’s video encoding quality.

As for the video encode blocks themselves, despite the overlapping similarities between this part and AMD’s GPU efforts, AMD’s engineers were quick to make clear that the VPU’s video encode block is a unique design and was not lifted from AMD’s GPU video encode block. It wouldn’t be surprising if AMD eventually unified its encoder IP across its product lines, but for this generation the Alveo MA35D’s VPU was already in development before the Xilinx acquisition closed, so the former Xilinx team finished what it started. This means the VPU has its own quirks, though the Alveo team is more than a little proud of having built what it sees as a better video encoder.

The VPU also marks the Alveo video encoder family’s move to a fully ASIC-based product. Xilinx is, of course, best known for its programmable FPGAs. The previous Alveo U30’s processor used hard logic for the video encode blocks, which were paired with a network of FPGA fabric, so that product was still a mix of ASIC and FPGA design. The VPU in the MA35D, on the other hand, is a bona fide ASIC with no FPGA elements, allowing the company to take full advantage of the power efficiency benefits of fixed-function logic in its dedicated product.

Power efficiency is another major improvement over the older U30 card, and something AMD sees as a key advantage over its competitors. The card’s official TDP is 50 Watts, but in practice AMD has found the card’s typical power consumption to be closer to 35 Watts, or just over 1W per 1080p60 stream. This is a 66% reduction in power consumption per stream compared to the U30, which drew a little over 3W per 1080p stream.
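The per-stream power figures can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the card power and stream count are AMD's numbers, while the U30 value of 3.2W per stream is an illustrative assumption standing in for "a little over 3W", not a published figure.

```python
# AMD's quoted figures for the MA35D.
MA35D_TDP_W = 50.0
MA35D_TYPICAL_W = 35.0
MA35D_STREAMS = 32

# Assumption: the article only says "a little over 3W" per stream for the U30.
U30_W_PER_STREAM = 3.2


def watts_per_stream(card_power_w, streams):
    """Card power divided evenly across all simultaneous streams."""
    return card_power_w / streams


def reduction_pct(old, new):
    """Percentage reduction going from the old value to the new one."""
    return (old - new) / old * 100.0


typical = watts_per_stream(MA35D_TYPICAL_W, MA35D_STREAMS)
at_tdp = watts_per_stream(MA35D_TDP_W, MA35D_STREAMS)
saving = reduction_pct(U30_W_PER_STREAM, typical)

print(f"Typical power: {typical:.2f} W per stream")
print(f"At full TDP:   {at_tdp:.4f} W per stream")
print(f"Reduction vs. assumed U30 figure: {saving:.0f}%")
```

At 35W across 32 streams the card lands at roughly 1.09W per stream, and with the assumed U30 figure the reduction works out to about 66%, in line with AMD's claim. Even at the full 50W TDP the card would stay near 1.5W per stream.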

Meanwhile, new to the Alveo MA35D and its VPUs is the AI acceleration block. Unlike on GPU-based products, this is not for general-purpose AI tasks such as image recognition. Rather, AMD uses the AI accelerator to feed additional data to the video encoders to further improve encoding quality. Rated for 22 TOPS of performance, the AI processor evaluates the stream frame by frame and uses that analysis to adjust the encoding parameters used by the rest of the chip.

With the AI processor used for both region-of-interest encoding and artifact detection, the MA35D can reach a given quality level at a lower bitrate than simpler video encoding strategies would need. Region-of-interest encoding lets the parts of the video that benefit most (text, faces, etc.) receive higher-quality encoding. Artifact detection, meanwhile, can catch when the encoder is being fed blocky or otherwise degraded images (which are very difficult to encode) and remove or correct them before the frame is sent for encoding.

Overall, AMD is making some fairly aggressive image quality claims for the Alveo MA35D. H.264 and H.265 image quality should be similar to the x264 Medium and x265 Medium presets respectively, while the card’s AV1 encode quality should be equivalent to AV1 Slow. These comparisons are based on VMAF scores and the settings required to achieve similar scores. Or, to frame things on a bitrate basis, AMD says that with AV1 the MA35D can offer the same image quality as the Alveo U30 in H.264 mode at 55% of the bitrate (a 1.8x efficiency improvement).
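To see how the 55% bitrate figure maps to the 1.8x efficiency claim, here is a small Python sketch. The 0.55 fraction is AMD's claim; the 6,000 kbps H.264 stream is a hypothetical example chosen purely for illustration.

```python
# AMD's claim: AV1 on the MA35D matches U30 H.264 quality at 55% of the bitrate.
AV1_BITRATE_FRACTION = 0.55


def efficiency_gain(bitrate_fraction):
    """Efficiency multiplier implied by needing only a fraction of the bitrate."""
    return 1.0 / bitrate_fraction


def equivalent_bitrate_kbps(h264_kbps, bitrate_fraction=AV1_BITRATE_FRACTION):
    """AV1 bitrate that, per the claim, matches the given H.264 bitrate."""
    return h264_kbps * bitrate_fraction


# Hypothetical example stream (not an AMD figure).
h264_kbps = 6000
av1_kbps = equivalent_bitrate_kbps(h264_kbps)
gain = efficiency_gain(AV1_BITRATE_FRACTION)

print(f"Efficiency gain: {gain:.2f}x")
print(f"{h264_kbps} kbps H.264 -> about {av1_kbps:.0f} kbps AV1")
```

The reciprocal of 0.55 is roughly 1.82, which AMD rounds to 1.8x. For a streaming service, that means the hypothetical 6,000 kbps stream would need about 3,300 kbps at the same claimed quality.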

Finally, alongside the MA35D’s video encoding capabilities, it’s interesting to note that the VPU’s management processor has moved from Arm to RISC-V. The U30’s processor used quad-core Cortex-A53 cores, while the MA35D VPU uses a pair of quad-core RISC-V processors, though AMD isn’t specifying which cores. The RISC-V architecture has been quietly displacing Arm in management controllers like these, and this is another example of that transition in action.

Even with two VPUs, the complete Alveo MA35D card is small enough to fit a half-height, half-length form factor. With its 50W TDP, the card is powered entirely through the PCIe slot and uses a PCIe x8 connector (which is split into x4 per VPU). And, as is common with data center accelerator cards, the MA35D is passively cooled.

According to AMD, the Alveo MA35D is currently sampling to partners. The company plans to start shipping in the third quarter of this year, with a list price of $1,595.
