Accurate Sum and Dot Product with New Instruction for High-Precision Computing on ARMv8 Processor
The accumulation of rounding errors can lead to unreliable results. Therefore, accurate and efficient algorithms are required. A processor from the ARMv8 architecture has introduced new instructions for high-precision computation. We have redesigned and implemented accurate summation and the accurat...
Main Authors: | Kaisen Xie, Qingfeng Lu, Hao Jiang, Hongxia Wang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2025-01-01 |
Series: | Mathematics |
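Subjects: | compensated precision; accurate summation; accurate dot product; error analysis; error-free transformation |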
Online Access: | https://www.mdpi.com/2227-7390/13/2/270 |
_version_ | 1832588086154887168 |
---|---|
author | Kaisen Xie Qingfeng Lu Hao Jiang Hongxia Wang |
collection | DOAJ |
description | The accumulation of rounding errors can lead to unreliable results. Therefore, accurate and efficient algorithms are required. A processor from the ARMv8 architecture has introduced new instructions for high-precision computation. We have redesigned and implemented accurate summation and the accurate dot product. The number of floating-point operations has been reduced from $7n-5$ and $10n-5$ to $4n-2$ and $7n-2$, compared with the classic compensated precision algorithms. It has been proven that our accurate summation and dot algorithms’ error bounds are $\gamma_{n-1}\gamma_{n}\,\mathrm{cond}+u$ and $\gamma_{n}\gamma_{n+1}\,\mathrm{cond}+u$, where ‘cond’ denotes the condition number, $\gamma_{n}=n\cdot u/(1-n\cdot u)$, and $u$ denotes the relative rounding error unit. Our accurate summation and dot product achieved a 1.69× speedup and a 1.14× speedup, respectively, on a simulation platform. Numerical experiments also illustrate that, under round-towards-zero mode, our algorithms are as accurate as the classic compensated precision algorithms. |
format | Article |
id | doaj-art-bc40195db12741f183316ed0dd072240 |
institution | Kabale University |
issn | 2227-7390 |
language | English |
publishDate | 2025-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Mathematics |
spelling | doaj-art-bc40195db12741f183316ed0dd072240; 2025-01-24T13:39:58Z; eng; MDPI AG; Mathematics; 2227-7390; 2025-01-01; 13; 2; 270; 10.3390/math13020270; Accurate Sum and Dot Product with New Instruction for High-Precision Computing on ARMv8 Processor; Kaisen Xie (0), Qingfeng Lu (1), Hao Jiang (2), Hongxia Wang (3); (0) College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China; (1) College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China; (2) College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China; (3) College of Science, National University of Defense Technology, Changsha 410073, China; https://www.mdpi.com/2227-7390/13/2/270; compensated precision; accurate summation; accurate dot product; error analysis; error-free transformation |
title | Accurate Sum and Dot Product with New Instruction for High-Precision Computing on ARMv8 Processor |
topic | compensated precision accurate summation accurate dot product error analysis error-free transformation |
url | https://www.mdpi.com/2227-7390/13/2/270 |
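
The description compares the new algorithms against "classic compensated precision algorithms" built on error-free transformations. As a rough illustration only, and assuming this refers to the widely used TwoSum-based compensated summation and dot product, a minimal Python sketch is given below. It is not the authors' ARMv8 implementation and does not use the new instruction. It assumes IEEE 754 double arithmetic in round-to-nearest mode with no overflow or underflow, and every function name is hypothetical.

```python
# Sketch of classic compensated summation and dot product.
# Assumptions (not from the record): round-to-nearest doubles,
# no overflow/underflow; all names here are illustrative only.

def two_sum(a, b):
    """Error-free transformation of a sum: a + b == s + e exactly."""
    s = a + b
    z = s - a
    e = (a - (s - z)) + (b - z)
    return s, e

def split(a):
    """Veltkamp split of a double into high and low halves (a == hi + lo)."""
    c = 134217729.0 * a  # factor 2**27 + 1 for 53-bit significands
    hi = c - (c - a)
    return hi, a - hi

def two_prod(a, b):
    """Error-free transformation of a product: a * b == p + e exactly."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    e = al * bl - (((p - ah * bh) - al * bh) - ah * bl)
    return p, e

def naive_sum(xs):
    """Plain left-to-right floating-point summation, for comparison."""
    s = 0.0
    for x in xs:
        s += x
    return s

def compensated_sum(xs):
    """Accumulate the rounding error of each addition and add it back at the end."""
    s, c = 0.0, 0.0
    for x in xs:
        s, e = two_sum(s, x)
        c += e
    return s + c

def compensated_dot(xs, ys):
    """Compensated dot product: track errors of both the products and the additions."""
    s, c = 0.0, 0.0
    for x, y in zip(xs, ys):
        p, ep = two_prod(x, y)
        s, es = two_sum(s, p)
        c += ep + es
    return s + c

if __name__ == "__main__":
    data = [1e16, 1.0, -1e16, 1.0]  # exact sum is 2
    print(naive_sum(data), compensated_sum(data))
```

With the ill-conditioned input in the example, the plain loop loses the first `1.0` to rounding, while the compensated version recovers it from the accumulated error term. The explicit `naive_sum` loop is used for comparison because the built-in `sum` in recent Python versions may already apply its own compensation to floats.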