poly1305/asm/poly1305-x86_64.pl: add poly1305_blocks_vpmadd52_4x.

As hinted by its name new subroutine processes 4 input blocks in
parallel. It still operates on 256-bit registers and is just
another step toward full-blown AVX512IFMA procedure.

Reviewed-by: Rich Salz <rsalz@openssl.org>
1 file changed