Traces Synchronization in Distributed Networks

Bibliographic Details
Main Authors: Eric Clément, Michel Dagenais
Format: Article
Language: English
Published: Wiley 2009-01-01
Series: Journal of Computer Systems, Networks, and Communications
Online Access: http://dx.doi.org/10.1155/2009/190579
collection DOAJ
description This article proposes a novel approach for synchronizing, a posteriori, the detailed execution traces from several networked computers. It can be used to debug and investigate complex performance problems in systems where several computers exchange information. While the distributed system is under study, detailed execution traces are generated locally on each system using an efficient and accurate system-level tracer, LTTng. When tracing is finished, the individual traces are collected and analysed together. The messaging events in all the traces are then identified and correlated in order to estimate how the time offset between nodes evolves over time. The imprecision of the time offset computation, which stems from asymmetric network delays and operating system latency in message sending and receiving, is amortized through a linear least-squares fit over several messages covering a large time span. As a result, the clock offsets in a distributed system can be estimated to within the order of a microsecond, even with a relatively low volume of messages exchanged and with a very low impact on system execution. This accuracy is sufficient to properly order the events traced on the individual computers in the distributed system.
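The offset estimation outlined in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tuple layout of the matched messages, and the assumption that traffic is roughly balanced in both directions (so that one-way delays approximately cancel in the fit) are all hypothetical choices made for illustration. Each matched message yields one noisy sample of the offset of node B's clock relative to node A's clock, and a straight line is fitted through the samples by ordinary least squares.

```python
from typing import List, Tuple

Sample = Tuple[float, float]


def offset_samples(a_to_b: List[Sample], b_to_a: List[Sample]) -> List[Sample]:
    """Build (time on A's clock, observed B-minus-A clock difference) samples.

    a_to_b: (send time on A's clock, receive time on B's clock)
    b_to_a: (send time on B's clock, receive time on A's clock)
    Both layouts are assumptions of this sketch.
    """
    # A -> B messages overestimate the offset by the one-way delay.
    samples = [(t_a, t_b - t_a) for t_a, t_b in a_to_b]
    # B -> A messages underestimate the offset by the one-way delay.
    samples += [(t_a, t_b - t_a) for t_b, t_a in b_to_a]
    return samples


def fit_line(points: List[Sample]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:
        raise ValueError("samples must span more than one instant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_offset(a_to_b: List[Sample], b_to_a: List[Sample]) -> Tuple[float, float]:
    """Return (offset0, drift) such that B_clock ~= A_clock + offset0 + drift * A_clock."""
    return fit_line(offset_samples(a_to_b, b_to_a))


def to_a_clock(t_b: float, offset0: float, drift: float) -> float:
    """Map a timestamp taken on B's clock onto A's timeline."""
    return (t_b - offset0) / (1.0 + drift)
```

Once `offset0` and `drift` are known, every event timestamp from node B can be passed through `to_a_clock` so that events from both traces can be ordered on a single timeline. Spreading the fit over a long time span is what reduces the influence of individual delay variations on the estimate.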
id doaj-art-850cfee0b8064a7d8da35cd1f5a5ee6e
institution Kabale University
issn 1687-7381
1687-739X
affiliation Eric Clément, Michel Dagenais: Department of Computer Engineering, École Polytechnique de Montréal, P. O. Box 6079, Downtown, Montreal, QC, H3C 3A7, Canada