Synchronization in graph analysis algorithms on the Partially Ordered Event‐Triggered Systems many‐core architecture

Abstract One of the key problems in designing and implementing graph analysis algorithms for distributed platforms is to find an optimal way of managing communication flows in the massively parallel processing network. Message‐passing and global synchronization are powerful abstractions in this rega...

Full description

Saved in:

Bibliographic Details
Main Authors:	Ashur Rafiev, Alex Yakovlev, Ghaith Tarawneh, Matthew F. Naylor, Simon W. Moore, David B. Thomas, Graeme M. Bragg, Mark L. Vousden, Andrew D. Brown
Format:	Article
Language:	English
Published:	Wiley 2022-03-01
Series:	IET Computers & Digital Techniques
Online Access:	https://doi.org/10.1049/cdt2.12041
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1832559619247964160
author	Ashur Rafiev Alex Yakovlev Ghaith Tarawneh Matthew F. Naylor Simon W. Moore David B. Thomas Graeme M. Bragg Mark L. Vousden Andrew D. Brown
author_facet	Ashur Rafiev Alex Yakovlev Ghaith Tarawneh Matthew F. Naylor Simon W. Moore David B. Thomas Graeme M. Bragg Mark L. Vousden Andrew D. Brown
author_sort	Ashur Rafiev
collection	DOAJ
description	Abstract One of the key problems in designing and implementing graph analysis algorithms for distributed platforms is to find an optimal way of managing communication flows in the massively parallel processing network. Message‐passing and global synchronization are powerful abstractions in this regard, especially when used in combination. This paper studies the use of a hardware‐implemented refutable global barrier as a design optimization technique aimed at unifying these abstractions at the API level. The paper explores the trade‐offs between the related overheads and performance factors on a message‐passing prototype machine with 49,152 RISC‐V threads distributed over 48 FPGAs (called the Partially Ordered Event‐Triggered Systems platform). Our experiments show that some graph applications favour synchronized communication, but the effect is hard to predict in general because of the interplay between multiple hardware and software factors. A classifier model is therefore proposed and implemented to perform such a prediction based on the application graph topology parameters: graph diameter, degree of connectivity, and reconvergence metric. The presented experimental results demonstrate that the correct choice of communication mode, granted by the new model‐driven approach, helps to achieve 3.22 times faster computation time on average compared to the baseline platform operation.
format	Article
id	doaj-art-46d70bd482054b00926b17b26650c27a
institution	Kabale University
issn	1751-8601 1751-861X
language	English
publishDate	2022-03-01
publisher	Wiley
record_format	Article
series	IET Computers & Digital Techniques
spelling	doaj-art-46d70bd482054b00926b17b26650c27a2025-02-03T01:29:40ZengWileyIET Computers & Digital Techniques1751-86011751-861X2022-03-01162-3718810.1049/cdt2.12041Synchronization in graph analysis algorithms on the Partially Ordered Event‐Triggered Systems many‐core architectureAshur Rafiev0Alex Yakovlev1Ghaith Tarawneh2Matthew F. Naylor3Simon W. Moore4David B. Thomas5Graeme M. Bragg6Mark L. Vousden7Andrew D. Brown8School of Engineering Newcastle University Newcastle upon Tyne UKSchool of Engineering Newcastle University Newcastle upon Tyne UKSchool of Engineering Newcastle University Newcastle upon Tyne UKComputer Architecture Group Cambridge University Cambridge UKComputer Architecture Group Cambridge University Cambridge UKDepartment of Electrical and Electronic Engineering Imperial College London London UKElectronics and Computer Science University of Southampton Southampton UKElectronics and Computer Science University of Southampton Southampton UKElectronics and Computer Science University of Southampton Southampton UKAbstract One of the key problems in designing and implementing graph analysis algorithms for distributed platforms is to find an optimal way of managing communication flows in the massively parallel processing network. Message‐passing and global synchronization are powerful abstractions in this regard, especially when used in combination. This paper studies the use of a hardware‐implemented refutable global barrier as a design optimization technique aimed at unifying these abstractions at the API level. The paper explores the trade‐offs between the related overheads and performance factors on a message‐passing prototype machine with 49,152 RISC‐V threads distributed over 48 FPGAs (called the Partially Ordered Event‐Triggered Systems platform). Our experiments show that some graph applications favour synchronized communication, but the effect is hard to predict in general because of the interplay between multiple hardware and software factors. A classifier model is therefore proposed and implemented to perform such a prediction based on the application graph topology parameters: graph diameter, degree of connectivity, and reconvergence metric. The presented experimental results demonstrate that the correct choice of communication mode, granted by the new model‐driven approach, helps to achieve 3.22 times faster computation time on average compared to the baseline platform operation.https://doi.org/10.1049/cdt2.12041
spellingShingle	Ashur Rafiev Alex Yakovlev Ghaith Tarawneh Matthew F. Naylor Simon W. Moore David B. Thomas Graeme M. Bragg Mark L. Vousden Andrew D. Brown Synchronization in graph analysis algorithms on the Partially Ordered Event‐Triggered Systems many‐core architecture IET Computers & Digital Techniques
title	Synchronization in graph analysis algorithms on the Partially Ordered Event‐Triggered Systems many‐core architecture
title_full	Synchronization in graph analysis algorithms on the Partially Ordered Event‐Triggered Systems many‐core architecture
title_fullStr	Synchronization in graph analysis algorithms on the Partially Ordered Event‐Triggered Systems many‐core architecture
title_full_unstemmed	Synchronization in graph analysis algorithms on the Partially Ordered Event‐Triggered Systems many‐core architecture
title_short	Synchronization in graph analysis algorithms on the Partially Ordered Event‐Triggered Systems many‐core architecture
title_sort	synchronization in graph analysis algorithms on the partially ordered event triggered systems many core architecture
url	https://doi.org/10.1049/cdt2.12041
work_keys_str_mv	AT ashurrafiev synchronizationingraphanalysisalgorithmsonthepartiallyorderedeventtriggeredsystemsmanycorearchitecture AT alexyakovlev synchronizationingraphanalysisalgorithmsonthepartiallyorderedeventtriggeredsystemsmanycorearchitecture AT ghaithtarawneh synchronizationingraphanalysisalgorithmsonthepartiallyorderedeventtriggeredsystemsmanycorearchitecture AT matthewfnaylor synchronizationingraphanalysisalgorithmsonthepartiallyorderedeventtriggeredsystemsmanycorearchitecture AT simonwmoore synchronizationingraphanalysisalgorithmsonthepartiallyorderedeventtriggeredsystemsmanycorearchitecture AT davidbthomas synchronizationingraphanalysisalgorithmsonthepartiallyorderedeventtriggeredsystemsmanycorearchitecture AT graemembragg synchronizationingraphanalysisalgorithmsonthepartiallyorderedeventtriggeredsystemsmanycorearchitecture AT marklvousden synchronizationingraphanalysisalgorithmsonthepartiallyorderedeventtriggeredsystemsmanycorearchitecture AT andrewdbrown synchronizationingraphanalysisalgorithmsonthepartiallyorderedeventtriggeredsystemsmanycorearchitecture

Synchronization in graph analysis algorithms on the Partially Ordered Event‐Triggered Systems many‐core architecture

Similar Items