In Verilog, this can be implemented using a generate loop:
    module array_multiplier #(parameter WIDTH = 4)(
        input  [WIDTH-1:0]   a, b,
        output [2*WIDTH-1:0] product
    );
        wire [WIDTH-1:0]   pp  [0:WIDTH-1];  // Partial products
        wire [2*WIDTH-1:0] acc [0:WIDTH];    // Running sums of shifted partial products

        assign acc[0] = {2*WIDTH{1'b0}};

        genvar i;
        generate
            for (i = 0; i < WIDTH; i = i + 1) begin : gen_pp
                // AND every bit of a with bit i of b
                assign pp[i] = a & {WIDTH{b[i]}};
                // Summation as a chain of adders (simplified); works for any WIDTH
                assign acc[i+1] = acc[i] + ({{WIDTH{1'b0}}, pp[i]} << i);
            end
        endgenerate

        assign product = acc[WIDTH];
    endmodule

The problem is speed. The partial products are all formed in parallel, so they cost only a single level of AND gates, but the additions that follow form a ripple-carry structure. For an N-bit multiplier, the critical path passes through one AND gate and then an adder chain with O(N) gate delays. For 32-bit numbers, this can become impractically slow.
When area is constrained (e.g., in an ASIC or a small FPGA), the sequential multiplier is the classic solution. Instead of building all logic at once, it reuses a single adder over multiple clock cycles.
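The idea of reusing one adder across several clock cycles can be sketched as a shift-and-add design. This is a minimal illustration, not a definitive implementation: the module name seq_multiplier, the handshake ports (start, busy, done), and the synchronous active-high reset are all assumptions introduced here. It also assumes a tool that supports the Verilog-2005 $clog2 system function.

```verilog
// Shift-and-add sequential multiplier (illustrative sketch).
// Module and port names are assumptions, not taken from the article.
module seq_multiplier #(parameter WIDTH = 4)(
    input                    clk,
    input                    rst,     // synchronous, active-high reset (assumed)
    input                    start,   // pulse high for one cycle to begin
    input      [WIDTH-1:0]   a, b,
    output reg [2*WIDTH-1:0] product,
    output reg               busy,
    output reg               done     // high for one cycle when product is valid
);
    reg [2*WIDTH-1:0]     mcand;   // multiplicand, shifted left each cycle
    reg [WIDTH-1:0]       mplier;  // multiplier, shifted right each cycle
    reg [$clog2(WIDTH):0] count;   // remaining iterations

    always @(posedge clk) begin
        done <= 1'b0;                          // default: done is a one-cycle pulse
        if (rst) begin
            busy    <= 1'b0;
            product <= {2*WIDTH{1'b0}};
        end else if (start && !busy) begin
            mcand   <= {{WIDTH{1'b0}}, a};     // load operands
            mplier  <= b;
            product <= {2*WIDTH{1'b0}};
            count   <= WIDTH;
            busy    <= 1'b1;
        end else if (busy) begin
            if (mplier[0])
                product <= product + mcand;    // the single shared adder
            mcand  <= mcand << 1;
            mplier <= mplier >> 1;
            count  <= count - 1'b1;
            if (count == 1) begin              // last iteration
                busy <= 1'b0;
                done <= 1'b1;
            end
        end
    end
endmodule
```

In this sketch a multiplication takes WIDTH clock cycles after the start pulse, so the design trades latency for area: one 2*WIDTH-bit adder and a few registers instead of a full array of adders.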
In the realm of digital design and computer architecture, the multiplier is a fundamental arithmetic circuit. From the simple act of adjusting a volume control to the complex matrix multiplications in a neural network accelerator, multiplication is a ubiquitous operation. However, for a hardware designer using Verilog, the journey of implementing a multiplier is a critical lesson in the trade-off between area, speed, and power. Unlike software, where the * operator is a high-level abstraction, in Verilog, it can represent anything from a massively parallel array of logic gates to a slow, sequential state machine.
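To make the point about the * operator concrete, here is a minimal behavioral sketch. The module name behavioral_multiplier is hypothetical. The code only describes the arithmetic; what hardware results depends on the synthesis tool and the target device (on many FPGAs, for example, the tool may map it to dedicated DSP blocks).

```verilog
// Behavioral multiplier: the source says what to compute, not how.
// The module name is hypothetical and used only for illustration.
module behavioral_multiplier #(parameter WIDTH = 4)(
    input  [WIDTH-1:0]   a, b,
    output [2*WIDTH-1:0] product
);
    // Operands are extended to the 2*WIDTH-bit width of the left-hand side
    // before multiplying, so the full product is kept.
    assign product = a * b;
endmodule
```

The source stays the same regardless of the resulting architecture, which is why a designer who needs control over area or timing often writes the structure out explicitly instead.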
Writing a multiplier in Verilog is therefore a lesson in disciplined design. It forces the engineer to think not just in code, but in clocks, gates, and data paths. It demonstrates that in hardware, there is no free lunch: speed, area, and power are an eternal triangle. Mastering the multiplier is the first step toward mastering the art of digital systems design.