VFWMACCBF16.VF

RISC-V VFWMACCBF16.VF Instruction Details

Instruction Manual | R-type

BF16 vector widening fused multiply-accumulate: multiply BF16 sources and accumulate into FP32 vd.

Instruction Syntax

vfwmaccbf16.vf vd, rs1, vs2, vm
Operand Breakdown
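Destination vd: vector register group that holds the FP32 accumulator and receives the result.
Source rs1: floating-point (f) register holding the scalar BF16 operand.
Source vs2: vector register holding the 16-bit BF16 elements.
Mask vm: optional mask operand (written v0.t); omit it for unmasked execution.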
Zvfbfwma | Vector Operations

Instruction Behavior

VFWMACCBF16.VF performs vector-scalar BF16 widening fused multiply-accumulate: the scalar BF16 value in FPU register rs1 and 16-bit BF16 elements from vs2 are multiplied, the unrounded product is added to the corresponding 32-bit FP32 accumulator in vd, and the sum is rounded according to frm and written back to vd. It is part of Zvfbfwma, which depends on Zfbfmin and Zvfbfmin.
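The per-element behavior described above can be sketched as a small software model. This is a minimal illustration, not a bit-exact reference: the function names are hypothetical, the sum is formed in binary64 and then rounded to FP32 (standing in for the single rounding the instruction performs under round-to-nearest-even), masked-off elements are simply left unchanged, and overflow, NaN handling, and other frm modes are not modeled.

```python
import struct

def bf16_to_float(bits: int) -> float:
    """Widen a 16-bit BF16 pattern exactly (BF16 is the upper half of an FP32 pattern)."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]

def round_to_fp32(x: float) -> float:
    """Round a Python float (binary64) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def vfwmaccbf16_vf(vd, rs1_bits, vs2, mask=None):
    """Model vd[i] = fp32(bf16(rs1) * bf16(vs2[i]) + vd[i]) for active elements.

    vd is a list of FP32 accumulator values, rs1_bits is the scalar BF16
    bit pattern, vs2 is a list of BF16 bit patterns, mask is an optional
    list of booleans (True = element is active).
    """
    scalar = bf16_to_float(rs1_bits)
    out = list(vd)
    for i, elem_bits in enumerate(vs2):
        if mask is not None and not mask[i]:
            continue  # assumption: inactive elements keep their old value
        product = scalar * bf16_to_float(elem_bits)  # exact in binary64
        out[i] = round_to_fp32(product + vd[i])
    return out

# 0x4000 is BF16 2.0; 0x3F80, 0x4040, 0xBF00 are BF16 1.0, 3.0, -0.5.
print(vfwmaccbf16_vf([1.0, 0.5, 10.0], 0x4000, [0x3F80, 0x4040, 0xBF00]))
# prints [3.0, 6.5, 9.0]
```

The product is kept at full precision before the add, which is the "unrounded product" property the description refers to.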

Quick Understanding & Search Notes

VFWMACCBF16.VF belongs to the RISC-V BF16 extensions; BF16 is a 16-bit FP format with 1 sign bit, 8 exponent bits, and 7 fraction bits.

Widening multiply-accumulate treats sources as BF16 and the accumulator/result as FP32.
BF16 scalar inputs/results follow RISC-V NaN-boxing rules.
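The field layout and the NaN-boxing rule can be made concrete with a short sketch. The helper names are hypothetical, and the canonical NaN pattern 0x7FC0 (sign clear, exponent all ones, top fraction bit set) is stated here as an assumption that follows the usual RISC-V canonical-NaN convention.

```python
BF16_CANONICAL_NAN = 0x7FC0  # assumed canonical quiet NaN pattern for BF16

def bf16_fields(bits: int):
    """Split a BF16 pattern into (sign, 8-bit exponent, 7-bit fraction)."""
    bits &= 0xFFFF
    return bits >> 15, (bits >> 7) & 0xFF, bits & 0x7F

def box_bf16(bits: int, flen: int = 32) -> int:
    """NaN-box a BF16 pattern into an FLEN-bit f register value (upper bits all ones)."""
    upper_ones = (1 << (flen - 16)) - 1
    return (upper_ones << 16) | (bits & 0xFFFF)

def unbox_bf16(freg: int, flen: int = 32) -> int:
    """Read a BF16 operand from an FLEN-bit f register value.

    If the upper FLEN-16 bits are not all ones, the value is not a valid
    NaN-boxed BF16 and is treated as the canonical NaN.
    """
    freg &= (1 << flen) - 1
    upper_ones = (1 << (flen - 16)) - 1
    if (freg >> 16) != upper_ones:
        return BF16_CANONICAL_NAN
    return freg & 0xFFFF

print(bf16_fields(0x3FC0))            # BF16 1.5 -> (0, 127, 64)
print(hex(box_bf16(0x3FC0)))          # 0xffff3fc0
print(hex(unbox_bf16(0x00003FC0)))    # not boxed -> 0x7fc0
```

For this instruction, the practical consequence is that the scalar in rs1 should be produced by an operation that writes a properly boxed BF16 value; otherwise the multiply sees a NaN.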

Common Usage Scenarios

Vector Operations

Understand this scenario with real code like «vfwmaccbf16.vf v4, f0, v8 # v4[i] (FP32) += bf16(f0) * bf16(v8[i])».

Machine Learning

Kernels that take BF16 inputs and accumulate in FP32, such as dot products and matrix multiplies, use the same form: «vfwmaccbf16.vf v4, f0, v8 # v4[i] (FP32) += bf16(f0) * bf16(v8[i])».

Pre-Use Checklist

Syntax Check
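  • Confirm the encoding is the R-type-style vector-scalar floating-point form (OPFVF) under the OP-V major opcode.
  • Confirm the operand order matches the syntax above: vd, rs1, vs2, with the optional mask last. Multiply-accumulate forms place rs1 before vs2.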
Semantic Check
  • Ensure vd names a register group wide enough for the FP32 results (its EMUL is twice LMUL) and that its use is compatible with the calling convention.
  • Confirm this is not the lower-level form of a pseudo-instruction expansion.

Pitfalls / Common Confusions

SEW must be 16; other SEW encodings are reserved.
The scalar operand is a BF16 value in FPU register rs1; vector source vs2 is 16-bit BF16 and destination/accumulator vd is 32-bit FP32.
Zvfbfwma depends on both Zfbfmin and Zvfbfmin.
vd is both the FP32 accumulator input and output; unrounded BF16 products are added to the FP32 accumulator and rounded by frm.
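To see why the unrounded product matters, the following sketch compares the widening form with a hypothetical alternative that first rounds the product back to BF16. All helper names are illustrative, only round-to-nearest-even is modeled, and NaN and overflow are not handled.

```python
import struct

def bf16_to_float(bits: int) -> float:
    """Widen a BF16 pattern exactly."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]

def round_to_fp32(x: float) -> float:
    """Round a binary64 value to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def float_to_bf16_bits(x: float) -> int:
    """Round a value to BF16 with round-to-nearest-even (finite inputs only)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # bias so that truncation rounds to nearest even
    return (bits >> 16) & 0xFFFF

def widening_fma(a_bits: int, b_bits: int, acc: float) -> float:
    """Product kept exact, one rounding after the add (the behavior described above)."""
    return round_to_fp32(bf16_to_float(a_bits) * bf16_to_float(b_bits) + acc)

def narrow_then_add(a_bits: int, b_bits: int, acc: float) -> float:
    """Hypothetical contrast: product rounded to BF16 before the add."""
    product = bf16_to_float(a_bits) * bf16_to_float(b_bits)
    return round_to_fp32(bf16_to_float(float_to_bf16_bits(product)) + acc)

a = 0x3F81          # BF16 1.0078125 (1 + 1/128)
acc = -1.015625     # FP32 accumulator chosen to cancel the leading bits
print(widening_fma(a, a, acc))     # 6.103515625e-05, i.e. 2**-14
print(narrow_then_add(a, a, acc))  # 0.0
```

The exact product 1.01568603515625 needs more fraction bits than BF16 offers, so rounding it first discards the 2**-14 term that the FP32 accumulator would otherwise keep.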

FAQ

Does VFWMACCBF16.VF imply BF16 add/sub/mul/div support?

No. Zfbfmin and Zvfbfmin mainly provide conversion between BF16 and FP32, and Zvfbfwma adds only the widening multiply-accumulate. None of them provides BF16 add, subtract, multiply, or divide.

What is the SEW restriction for VFWMACCBF16.VF?

SEW must be 16: the sources are 16-bit BF16 elements and the results are 32-bit FP32 elements. Other SEW settings are reserved.
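A simulator or test harness might guard this restriction by decoding SEW from vtype before executing the instruction. The sketch below assumes the standard vtype layout from the vector extension (vlmul in bits 2:0, vsew in bits 5:3, SEW = 8 << vsew); the function names are hypothetical.

```python
def sew_from_vtype(vtype: int) -> int:
    """Decode SEW in bits from the vsew field (vtype bits 5:3)."""
    vsew = (vtype >> 3) & 0x7
    return 8 << vsew

def bf16_widening_allowed(vtype: int) -> bool:
    """True when the current SEW is 16, the only setting this instruction accepts."""
    return sew_from_vtype(vtype) == 16

E16_M1 = 0b0_0_001_000         # vsew=001 (SEW=16), vlmul=000 (LMUL=1)
E32_M1 = 0b0_0_010_000         # vsew=010 (SEW=32)
E16_M2_TA_MA = 0b1_1_001_001   # vma=1, vta=1, SEW=16, LMUL=2

print(bf16_widening_allowed(E16_M1))        # True
print(bf16_widening_allowed(E32_M1))        # False
print(bf16_widening_allowed(E16_M2_TA_MA))  # True
```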