Accurate Sum and Dot Product with New Instruction for High-Precision Computing on ARMv8 Processor

The accumulation of rounding errors can lead to unreliable results. Therefore, accurate and efficient algorithms are required. A processor from the ARMv8 architecture has introduced new instructions for high-precision computation. We have redesigned and implemented accurate summation and the accurat...
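For orientation, the classic compensated algorithms that the full description uses as its baseline are built from error-free transformations. The block below is a minimal Python sketch of that baseline only. It assumes IEEE 754 double precision with the default round-to-nearest mode, and it uses neither the ARMv8 instruction nor the redesigned algorithms from the article. The function names (`two_sum`, `two_prod`, `sum2`, `dot2`) are illustrative, and whether the article uses exactly these variants (commonly attributed to Knuth, Dekker/Veltkamp, and Ogita, Rump and Oishi) is an assumption.

```python
def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and a + b == s + e exactly.

    Assumes round-to-nearest and no overflow.
    """
    s = a + b
    bv = s - a                      # the part of b that actually landed in s
    e = (a - (s - bv)) + (b - bv)   # what was lost to rounding
    return s, e


def split(a):
    """Veltkamp split of a double into high and low halves (a == hi + lo)."""
    c = 134217729.0 * a             # 2**27 + 1; assumes c does not overflow
    hi = c - (c - a)
    return hi, a - hi


def two_prod(a, b):
    """Dekker's TwoProduct: return (p, e) with p = fl(a * b) and a * b == p + e.

    Written without a fused multiply-add so it runs on any Python 3.
    """
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    e = al * bl - (((p - ah * bh) - al * bh) - ah * bl)
    return p, e


def sum2(xs):
    """Compensated summation: accumulate rounding errors separately, add once."""
    s, comp = xs[0], 0.0
    for x in xs[1:]:
        s, e = two_sum(s, x)
        comp += e
    return s + comp


def dot2(xs, ys):
    """Compensated dot product built from two_prod and two_sum."""
    s, comp = two_prod(xs[0], ys[0])
    for x, y in zip(xs[1:], ys[1:]):
        p, ep = two_prod(x, y)
        s, es = two_sum(s, p)
        comp += ep + es
    return s + comp


if __name__ == "__main__":
    data = [1.0, 1e100, 1.0, -1e100]     # heavy cancellation
    print("naive sum:", sum(data), "compensated:", sum2(data))

    a = 134217729.0                      # a * a is not exactly representable
    xs, ys = [a, 1.0, -a], [a, 1.0, a]
    naive_dot = 0.0
    for x, y in zip(xs, ys):
        naive_dot += x * y
    print("naive dot:", naive_dot, "compensated:", dot2(xs, ys))
```

Both demo inputs are chosen so that ordinary floating-point accumulation cancels away the small terms; comparing the printed values with `math.fsum` or exact `fractions.Fraction` arithmetic is a simple way to check the compensated results.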

Bibliographic Details
Main Authors:	Kaisen Xie, Qingfeng Lu, Hao Jiang, Hongxia Wang
Format:	Article
Language:	English
Published:	MDPI AG 2025-01-01
Series:	Mathematics
Subjects:	compensated precision; accurate summation; accurate dot product; error analysis; error-free transformation
Online Access:	https://www.mdpi.com/2227-7390/13/2/270

collection	DOAJ
description	The accumulation of rounding errors can lead to unreliable results, so accurate and efficient algorithms are required. A processor based on the ARMv8 architecture has introduced new instructions for high-precision computation. We have redesigned and implemented accurate summation and the accurate dot product. Compared with the classic compensated precision algorithms, the number of floating-point operations is reduced from $7n-5$ and $10n-5$ to $4n-2$ and $7n-2$, respectively. The error bounds of our accurate summation and dot product algorithms are proven to be $\gamma_{n-1}\gamma_{n}\,\mathrm{cond}+u$ and $\gamma_{n}\gamma_{n+1}\,\mathrm{cond}+u$, where 'cond' denotes the condition number, $\gamma_{n} = n \cdot u/(1 - n \cdot u)$, and $u$ denotes the relative rounding error unit. Our accurate summation and dot product achieved speedups of 1.69× and 1.14×, respectively, on a simulation platform. Numerical experiments also show that, under round-towards-zero mode, our algorithms are as accurate as the classic compensated precision algorithms.
id	doaj-art-bc40195db12741f183316ed0dd072240
institution	Kabale University
issn	2227-7390
spelling	doaj-art-bc40195db12741f183316ed0dd072240; 2025-01-24T13:39:58Z; eng; MDPI AG; Mathematics; 2227-7390; 2025-01-01; vol. 13, no. 2, art. 270; doi:10.3390/math13020270; Accurate Sum and Dot Product with New Instruction for High-Precision Computing on ARMv8 Processor; Kaisen Xie (0), Qingfeng Lu (1), Hao Jiang (2), Hongxia Wang (3); affiliations: (0), (1), (2) College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China; (3) College of Science, National University of Defense Technology, Changsha 410073, China; https://www.mdpi.com/2227-7390/13/2/270


Similar Items