#include <riscv_vector.h>
#include <stddef.h>
#include <stdio.h>
#include <math.h>
#include <time.h>

#define N 31

float input[N] = {-0.4325648115282207, -1.6655843782380970, 0.1253323064748307, 0.2876764203585489, -1.1464713506814637, 1.1909154656429988, 1.1891642016521031, -0.0376332765933176, 0.3272923614086541, 0.1746391428209245, -0.1867085776814394, 0.7257905482933027, -0.5883165430141887, 2.1831858181971011, -0.1363958830865957, 0.1139313135208096, 1.0667682113591888, 0.0592814605236053, -0.0956484054836690, -0.8323494636500225, 0.2944108163926404, -1.3361818579378040, 0.7143245518189522, 1.6235620644462707, -0.6917757017022868, 0.8579966728282626, 1.2540014216025324, -1.5937295764474768, -1.4409644319010200, 0.5711476236581780, -0.3998855777153632};
float output_golden[N] = { 1.7491401329284098, 0.1325982188803279, 0.3252281811989881, -0.7938091410349637, 0.3149236145048914, -0.5272704888029532, 0.9322666565031119, 1.1646643544607362, -2.0456694357357357, -0.6443728590041911, 1.7410657940825480, 0.4867684246821860, 1.0488288293660140, 1.4885752747099299, 1.2705014969484090, -1.8561241921210170, 2.1343209047321410, 1.4358467535865909, -0.9173023332875400, -1.1060770780029008, 0.8105708062681296, 0.6985430696369063, -0.4015827425012831, 1.2687512030669628, -0.7836083053674872, 0.2132664971465569, 0.7878984786088954, 0.8966819356782295, -0.1869172943544062, 1.0131816724341454, 0.2484350696132857};
float output[N] = { 1.7491401329284098, 0.1325982188803279, 0.3252281811989881, -0.7938091410349637, 0.3149236145048914, -0.5272704888029532, 0.9322666565031119, 1.1646643544607362, -2.0456694357357357, -0.6443728590041911, 1.7410657940825480, 0.4867684246821860, 1.0488288293660140, 1.4885752747099299, 1.2705014969484090, -1.8561241921210170, 2.1343209047321410, 1.4358467535865909, -0.9173023332875400, -1.1060770780029008, 0.8105708062681296, 0.6985430696369063, -0.4015827425012831, 1.2687512030669628, -0.7836083053674872, 0.2132664971465569, 0.7878984786088954, 0.8966819356782295, -0.1869172943544062, 1.0131816724341454, 0.2484350696132857};

/* Scalar reference: y[i] = a * x[i] + y[i]. */
void saxpy_golden(size_t n, const float a, const float *x, float *y) {
  for (size_t i = 0; i < n; ++i) {
    y[i] = a * x[i] + y[i];
  }
}

/* RVV version: each pass handles the l elements granted by vsetvl. */
void saxpy_vec(size_t n, const float a, const float *x, float *y) {
  size_t l;
  vfloat32m8_t vx, vy;

  for (; n > 0; n -= l) {
    l = __riscv_vsetvl_e32m8(n);
    vx = __riscv_vle32_v_f32m8(x, l);
    x += l;
    vy = __riscv_vle32_v_f32m8(y, l);
    vy = __riscv_vfmacc_vf_f32m8(vy, a, vx, l); /* vy = a * vx + vy */
    __riscv_vse32_v_f32m8(y, vy, l);
    y += l;
  }
}

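/* Nonzero when actual is within relative error relErr of reference.
   relErr also acts as the scale floor when reference is near zero. */
int fp_eq(float reference, float actual, float relErr) {
  float absErr = relErr * ((fabsf(reference) > relErr) ? fabsf(reference) : relErr);
  return fabsf(actual - reference) < absErr;
}
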
int main(void) {
  struct timespec start, end;

  /* Time the scalar reference. */
  clock_gettime(CLOCK_REALTIME, &start);
  saxpy_golden(N, 55.66f, input, output_golden);
  clock_gettime(CLOCK_REALTIME, &end);

  long seconds = end.tv_sec - start.tv_sec;
  long nanoseconds = end.tv_nsec - start.tv_nsec;
  if (start.tv_nsec > end.tv_nsec) { /* borrow one second */
    --seconds;
    nanoseconds += 1000000000;
  }
  printf("Elapsed time for scalar saxpy: %ld.%09ld seconds\n", seconds, nanoseconds);

  /* Time the RVV version. */
  clock_gettime(CLOCK_REALTIME, &start);
  saxpy_vec(N, 55.66f, input, output);
  clock_gettime(CLOCK_REALTIME, &end);

  seconds = end.tv_sec - start.tv_sec;
  nanoseconds = end.tv_nsec - start.tv_nsec;
  if (start.tv_nsec > end.tv_nsec) {
    --seconds;
    nanoseconds += 1000000000;
  }
  printf("Elapsed time for RVV saxpy: %ld.%09ld seconds\n", seconds, nanoseconds);

  /* Compare the two results element by element. */
  int pass = 1;
  for (int i = 0; i < N; i++) {
    if (!fp_eq(output_golden[i], output[i], 1e-6f)) {
      printf("failed, %f != %f\n", output_golden[i], output[i]);
      pass = 0;
    }
  }
  if (pass)
    printf("passed\n");

  return (pass == 0);
}