VFWMACCBF16.VV

RISC-V VFWMACCBF16.VV Instruction Details

Instruction Manual · R-type

BF16 vector widening fused multiply-accumulate: multiply BF16 sources and accumulate into FP32 vd.

Instruction Syntax

vfwmaccbf16.vv vd, vs1, vs2, vm
Operand Breakdown
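Destination vd: vector register holding the FP32 accumulator; it is read as the addend and overwritten with the result.
Source vs1: vector register holding the first BF16 multiplicand.
Source vs2: vector register holding the second BF16 multiplicand.
Mask vm: optional mask operand (v0.t); when omitted, the instruction is unmasked.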
Zvfbfwma · Vector Operations

Instruction Behavior

VFWMACCBF16.VV performs a BF16 widening fused multiply-accumulate: each pair of 16-bit BF16 elements from vs1 and vs2 is multiplied, the unrounded product is added to the corresponding 32-bit FP32 accumulator element in vd, and the sum is rounded once according to frm and written back to vd. It is typically used in the multiply-accumulate inner loops of DNN matrix multiplication. The instruction is part of Zvfbfwma, which depends on Zfbfmin and Zvfbfmin.
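The per-element behavior described above can be sketched as a small software model. This is an illustrative reference only, not taken from the specification: the function names, the list-based register representation, and the mask handling (masked-off elements left undisturbed) are assumptions, and it models round-to-nearest-even only rather than the full set of frm modes or any exception flags.

```python
import struct

def bf16_to_float(bits: int) -> float:
    """Widen a 16-bit BF16 pattern to its value (BF16 is the upper half of FP32)."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]

def round_to_fp32(x: float) -> float:
    """Round a Python float (binary64) to FP32 using round-to-nearest-even."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        # A finite value beyond the FP32 range rounds to infinity under RNE.
        return float("inf") if x > 0 else float("-inf")

def vfwmaccbf16_vv(vd, vs1, vs2, vl, mask=None):
    """Hypothetical model: vd[i] = fp32(bf16(vs1[i]) * bf16(vs2[i]) + vd[i]).

    vd holds FP32 values as Python floats; vs1 and vs2 hold raw BF16 bit patterns.
    """
    out = list(vd)
    for i in range(vl):
        if mask is not None and not mask[i]:
            continue  # assumption: masked-off elements keep their old value
        # The product of two BF16 values is exact in binary64, so it is "unrounded".
        product = bf16_to_float(vs1[i]) * bf16_to_float(vs2[i])
        out[i] = round_to_fp32(product + out[i])
    return out
```

For example, with accumulators [1.0, 2.0] and BF16 inputs 1.0 × 3.0 and 2.0 × 0.5, the model adds 3.0 and 1.0 to the respective accumulators. The binary64 intermediate is a convenience of this sketch; it has not been validated as bit-exact against hardware.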

Quick Understanding & Search Notes

VFWMACCBF16.VV belongs to the RISC-V BF16 extensions; BF16 is a 16-bit FP format with 1 sign bit, 8 exponent bits, and 7 fraction bits.
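To make the 1/8/7 bit layout concrete, the following sketch splits a BF16 bit pattern into its fields and computes the value it encodes. The helper names decode_bf16 and bf16_value are hypothetical and exist only for illustration.

```python
def decode_bf16(bits: int):
    """Split a 16-bit BF16 pattern into (sign, exponent, fraction) fields."""
    sign = (bits >> 15) & 0x1       # 1 sign bit
    exponent = (bits >> 7) & 0xFF   # 8 exponent bits, bias 127 (same as FP32)
    fraction = bits & 0x7F          # 7 fraction bits
    return sign, exponent, fraction

def bf16_value(bits: int) -> float:
    """Return the numeric value encoded by a BF16 bit pattern."""
    sign, exponent, fraction = decode_bf16(bits)
    if exponent == 0xFF:
        if fraction:
            return float("nan")
        return float("-inf") if sign else float("inf")
    if exponent == 0:
        # Zero or subnormal: no implicit leading 1, fixed exponent of -126.
        magnitude = (fraction / 128.0) * 2.0 ** -126
    else:
        magnitude = (1.0 + fraction / 128.0) * 2.0 ** (exponent - 127)
    return -magnitude if sign else magnitude
```

Because the exponent field matches FP32, a BF16 value widens to FP32 exactly by appending 16 zero bits to the fraction.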

Widening multiply-accumulate treats sources as BF16 and the accumulator/result as FP32.
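Scalar BF16 values held in floating-point registers follow the RISC-V NaN-boxing rules; this matters for the scalar-operand form vfwmaccbf16.vf rather than for the vector operands of the .vv form.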

Common Usage Scenarios

Vector Operations

Element-wise multiply-accumulate over two BF16 vectors, for example «vfwmaccbf16.vv v4, v8, v12 # v4[fp32] += bf16(v8) * bf16(v12)».

Machine Learning

Accumulating BF16 products into FP32 in the inner loop of a matrix multiplication or convolution, for example «vfwmaccbf16.vv v4, v8, v12 # v4[fp32] += bf16(v8) * bf16(v12)».
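As a rough illustration of how the instruction fits a machine-learning inner loop, the sketch below models a strip-mined BF16 dot product: each strip corresponds to one vfwmaccbf16.vv over up to vlmax lanes of FP32 accumulators, followed by a separate final reduction. The names dot_bf16 and vlmax are hypothetical, the lane count is arbitrary, and only round-to-nearest-even is modelled.

```python
import struct

def bf16_to_float(bits: int) -> float:
    """Widen a BF16 bit pattern to its value via the FP32 encoding."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]

def to_fp32(x: float) -> float:
    """Round a Python float to FP32 (round-to-nearest-even); overflow gives infinity."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        return float("inf") if x > 0 else float("-inf")

def dot_bf16(a_bits, b_bits, vlmax=8):
    """Strip-mined dot product with per-lane FP32 accumulators (illustrative only)."""
    acc = [0.0] * vlmax
    for start in range(0, len(a_bits), vlmax):
        vl = min(vlmax, len(a_bits) - start)  # the last strip may be shorter
        for i in range(vl):  # models one vfwmaccbf16.vv, lane by lane
            product = bf16_to_float(a_bits[start + i]) * bf16_to_float(b_bits[start + i])
            acc[i] = to_fp32(product + acc[i])
    total = 0.0
    for lane in acc:  # final reduction of the lanes, done outside the inner loop
        total = to_fp32(total + lane)
    return total
```

Keeping the accumulators in FP32 is the point of the widening form: the BF16 inputs halve memory traffic while the running sums retain FP32 precision.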

Pre-Use Checklist

Syntax Check
  • Confirm the current instruction format is R-type.
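  • Confirm the operand order is vd, vs1, vs2, as in the example; multiply-accumulate instructions place vs1 before vs2, unlike most other vector arithmetic instructions.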
Semantic Check
  • Ensure the destination register usage is compatible with the calling convention.
  • Confirm this is not the lower-level form of a pseudo-instruction expansion.

Pitfalls / Common Confusions

SEW must be 16; other SEW encodings are reserved.
vs1/vs2 are 16-bit BF16, while vd is the 32-bit FP32 accumulator/result.
Requires Zvfbfwma, which depends on Zfbfmin and Zvfbfmin.
vd is both the FP32 accumulator input and output; unrounded BF16 products are added to the FP32 accumulator and rounded by frm.

FAQ

Does VFWMACCBF16.VV imply BF16 add/sub/mul/div support?

No. Zfbfmin/Zvfbfmin mainly provide BF16/FP32 conversion; Zvfbfwma provides widening multiply-accumulate.
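To illustrate what the BF16/FP32 conversions amount to numerically, here is a minimal sketch of narrowing FP32 to BF16 with round-to-nearest-even and widening it back. The function names are hypothetical, the input is assumed to already be an FP32-representable value, and other rounding modes and exception flags are not modelled.

```python
import struct

def fp32_to_bf16_bits(x: float) -> int:
    """Narrow an FP32 value to a BF16 bit pattern, round-to-nearest-even."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return 0x7FC0  # NaN input: return the canonical quiet NaN pattern
    # Add just under half an ULP of the kept half, plus 1 if the kept LSB is odd,
    # so that exact ties round to the even neighbour.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF

def bf16_bits_to_fp32(bits: int) -> float:
    """Widen a BF16 bit pattern to FP32; this direction is always exact."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]
```

Widening never loses information, whereas narrowing discards 16 fraction bits, which is why accumulation is done in FP32 and converted to BF16 only when results are stored.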

What is the SEW restriction for VFWMACCBF16.VV?

Vector BF16 instructions require SEW=16.