arm64: optimize __asm_{flush, invalidate}_dcache_all
__asm_dcache_all can directly return to the caller of
__asm_{flush,invalidate}_dcache_all.
We do not have to waste x16 register here.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: York Sun <york.sun@nxp.com>