ECDSA—模乘模块
- 2021 年 8 月 23 日
- 筆記
- ECDSA高性能硬件实现
如果a,b属于GF(P),则有乘法运算a*b=r (mod p),
其中r满足0<r<p-1,即a*b除以p的余数。该操作成为模p乘法。本模块输入两个数,完成两个数的模乘运算。
信号名 |
方向 |
位宽 |
端口定义 |
clk |
Input |
1 |
时钟 |
reset |
Input |
1 |
复位 |
multip_en |
Input |
1 |
模乘使能信号 |
a |
Input |
256 |
整数乘数a |
b |
Input |
256 |
整数乘数b |
product |
output |
256 |
模乘运算结果 |
done |
output |
1 |
模乘完成标识 |
算法:模乘运算算法
输入:a,b∈ [0,n-1], multip_en
输出:product, done
- 初始化product,product=0;
- 对于i从0到t-1,重复执行
2.1. 若bi=1,则product=product + a<<1;
- Product = product mod p;
- 返回product。
代码如下:
module mod_multiplier( input clk, reset, input [255:0] a, input [255:0] b, output reg [255:0] product, output reg done ); /* multiplication using bit shifting and adding */ parameter params_p=29; //parameter params_p = 256'hFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F; reg [510:0] a_in, count_in; reg [510:0] b_in,c_in; reg a_load, b_load, c_load, count_load; wire [510:0] a_out, count_out; wire [510:0] b_out, c_out; reg_256 #(511) a_reg(.clk(clk), .load(a_load), .data(a_in), .out(a_out)); reg_256 #(511) b_reg(.clk(clk), .load(b_load), .data(b_in), .out(b_out)); reg_256 #(511) c_reg(.clk(clk), .load(c_load), .data(c_in), .out(c_out)); reg_256 #(511) count(.clk(clk), .load(count_load), .data(count_in), .out(count_out)); reg [2:0] state, next_state; parameter init = 3'd0; parameter start = 3'd1; parameter shift_AB = 3'd2; parameter reduce_B = 3'd3; parameter reduce_C = 3'd4; parameter finish = 3'd5; always@(posedge clk) begin if(reset) state <= init; else state <= next_state; end always@(*) begin next_state = state; case(state) init: next_state = start; start: next_state = shift_AB; shift_AB: begin if((b_out << 1) >= {255'b0,params_p}) next_state = reduce_B; else next_state = reduce_C; end reduce_B: begin if(b_out >= {255'b0,params_p}) next_state = reduce_B; else next_state = reduce_C; end reduce_C: begin if(count_out == 8'd254) next_state = finish; else next_state = shift_AB; end finish: next_state = finish; default: ; endcase end always@(*) begin count_in = count_out; b_in = b_out; c_in = c_out; a_in = a_out; count_load = 1'b0; a_load = 1'b0; b_load = 1'b0; c_load = 1'b0; product = 256'b0; done = 1'b0; case(state) init: begin done = 1'b0; count_in = 8'b0; a_in = a; b_in = {2'b0, b}; c_in = 510'b0; count_load = 1'b1; a_load = 1'b1; b_load = 1'b1; c_load = 1'b1; end start: begin if(a[0] == 1) c_in = b_out; else c_in = 510'b0; c_load = 1'b1; end shift_AB: begin a_in = a_out >> 1; a_load = 1'b1; b_load = 1'b1; b_in = b_out << 1; end reduce_B: begin b_load = 1'b1; if(b_out >= {255'b0, params_p}) b_in = b_out - {255'b0,params_p}; else b_in = b_out; end reduce_C: begin c_load = 1'b1; if(a_out[0] == 1'b1) begin if((c_out + b_out) >= {255'b0, params_p}) c_in = (c_out + b_out) - {255'b0, params_p}; else c_in = c_out + b_out; end else begin if(c_out >= {255'b0, params_p}) c_in = c_out - params_p; else c_in = c_out; end count_in = count_out + 8'b01; count_load = 1'b1; end finish: begin if(c_out < params_p) begin done = 1'b1; product = c_out[255:0]; end else begin c_in = c_out - {255'b0,params_p}; c_load = 1'b1; done = 1'b0; end end default: ; endcase end endmodule
其中reg256.v模块为暂存模块,主要是暂时存储一下temp reg。
代码如下:
module reg_256 #(parameter size = 256) ( input clk, load, input [size-1:0] data, output reg [size-1:0] out ); always @ (posedge clk) begin if (load) out <= data; else out <= out; end endmodule
仿真采用一条简单的曲线先验证结果的正确性,再选用spec256k1曲线来测试模乘速率,
其参数如下所示,后面的模块都会采用这两条曲线做仿真验证。此次只需将parameter中的p进行修改即可切换曲线。
sepc256k1曲线参数
p = FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F a = 0 b =7 G_x = 79BE667E F9DCBBAC 55A06295 CE870B07 029BFCDB 2DCE28D9 59F2815B 16F81798 G_y = 483ADA77 26A3C465 5DA4FBFC 0E1108A8 FD17B448 A6855419 9C47D08F FB10D4B8 n = FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141 h = 01
testbeach如下所示:
`timescale 1ns/1ns module mod_multiplier_tb(); reg clk, reset; reg [255:0] a, b; wire [255:0] product; wire done; mod_multiplier mult0( .clk(clk), .reset(reset), .a(a), .b(b), .product(product), .done(done) ); always #5 clk = ~clk; initial begin clk = 0; reset = 1'b1; #6 reset = 1'b0; a = 12; b = 25; /* a = 256'h09; b = 256'hFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFE2F; */ /*#5200 $display("ad/p=%d",product); #20 $stop();*/ end endmodule
P=29时:选用的测试数据点为(12,25),12*25=300 mod 29 = 10(mod 29),结果正确。
P=256‘hFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F时,
选用测试点为:a = 256’h09; b = 256’hFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFE2F;
从时间上来看,运算速率看来跟数的大小无关,而是跟相对于p的大小有关。