Abstract
This paper investigates the feasibility of the pragmatic implementation of distributed algorithms in real-world cyber-physical systems (CPSs). We detail how distributed (and parallel) computing systems can be realized using existing system-design software for CPS development. A series of experiments is devised and used to verify the information-exchange capacity of the computing nodes and the synchronous and real-time operational effectiveness of the overall distributed computing system development framework. Finally, an actual distributed computing system is realized and validated by executing a fundamental distributed consensus protocol.
1 Introduction
In recent years, researchers in academia and industry have taken an interest in cyber-physical systems (CPSs). Such systems integrate sensing, computation, control, and networking into physical objects and infrastructure, connecting them to the internet and to each other [1]. This integration and interconnectivity holds the potential to enable physically interactive and geographically distributed applications in medical systems [2], industrial systems [3], transportation systems [4], distributed environment monitoring systems [5], smart grids [6], and smart cities [7], to name a few.
A distinguishing characteristic of a CPS is its information-processing architecture, which dictates how actions, such as processing, memory storage, and inference, are carried out within the system. There are two primary architectural paradigms: centralized architecture [8–14] and distributed architecture [1,15]. In a centralized architecture, all information processing and decision-making tasks occur within a single processing unit. This approach is often characterized by a central controller or processor that gathers data from various sensors and devices, processes it, and makes decisions based on the collected information. On the other hand, we use the term “distributed” CPS (DCPS) to describe CPSs capable of distributed data processing, inter-processor communication, and distributed decision-making [16–22].
This technical brief addresses a challenge that DCPSs share with distributed computing systems: the development of real-time distributed algorithms that support coordinated task performance. Various distributed algorithms have been proposed [23–25], often validated through theoretical proofs or limited offline simulation [26,27]. However, practical implementations within real DCPS setups remain uncommon, and most deployments still rely on centralized architectures.
Existing DCPS research primarily focuses on introducing conceptual frameworks, mathematical models, or algorithms, with little attention to real-world implementation considerations [28–30]. The growing interest in DCPS applications within measurement, control, and dynamic systems calls for turnkey solutions based on existing technology, capable of acquiring sensory information and enabling distributed inference and automation (e.g., diagnostics and industrial automation). We approach DCPS design from a control engineering and industrial automation standpoint, leveraging computerware to materialize control system functionalities (data acquisition and instrument control) in a distributed manner.
First, we discuss distributed system synchronization, a fundamental design prerequisite for all real-time distributed systems, and we establish how existing software for the network time protocol (NTP) can be used to address this challenge. Second, we design and implement three experiments on a real-world three-node DCPS to validate the synchronous operation of the entire system as it executes a basic distributed average consensus algorithm.
This work serves as a quick reference for the development of a real-time DCPS for the implementation of distributed algorithms. We identify principal hardware and software architectures, modalities, and properties required for the realization of DCPS. This allows researchers and practitioners in dynamic systems and control systems to focus on algorithm development, bypassing implementation challenges. The presented experimental procedures establish foundational methods for validating synchronous DCPS operation, a crucial step before deploying task-specific distributed computing algorithms. Notably, this explicit experimental approach is currently lacking in the literature.
In Sec. 2, we define the key components, elements, and notions required to model and implement a DCPS. Section 3 provides details on clock synchronization. In Sec. 4 we detail the design of three experiments formulated to validate the synchronous operation of the nodes within a real-time DCPS. In Sec. 5, we describe the implementation of the experiments formulated in Sec. 4 using NI labview and a three-node DCPS. Results and concluding remarks are presented in Secs. 6 and 7, respectively.
2 Definitions and Terminology
In this section, we define the key components, elements, and notions to be taken into consideration to effectively design and implement a DCPS that can (1) acquire data from a real-world physical process or environment in real time, (2) perform distributed decision-making, and (3) control actuators based on computation results. To this end, we connect notions from control theory, computing, embedded systems, and communications. These elements and notions are classified into the physical layer and the cyber layer. The architectural framework of a DCPS is depicted in Fig. 1.
The Physical Layer
The physical layer consists of sensors, actuators, and computing devices to which sensors and actuators are interfaced. In this layer, data from a physical process—e.g., environmental, biological, or engineered—are collected by the sensors, and processed by the computing devices, which drive the actuators that act on the physical process.
Sensor: A device or module that measures a physical variable. The sensor creates a digital numerical value (signal), capturing the measurement of a physical variable, which can be processed by a computer.
Actuator: A component of a system that acts on the system, e.g., a motor that creates motion as a primary effect. It is responsible for inducing a change in the system.
Embedded System: A microprocessor-based computer hardware system with software that is designed to perform dedicated functions, either as an independent module or as part of a complex system. At its core is an integrated circuit designed to carry out information processing, typically the analysis of signals measured in real time, for real-time operation.
Node: An embedded system together with its connected sensors and actuators.
Neighborhood: The neighborhood of node i, denoted by $\mathcal{N}_i$, is the set of all nodes having a one-to-one wired or wireless communication link with node i.
The Cyber Layer
The cyber layer consists of the communication network, network protocols, and other cloud-based computation and storage resources available to the DCPS [31].
Network: In DCPSs, a network is a set or group of nodes that are interconnected through a communication channel. DCPS applications adopt a peer-to-peer model of inter-processor communication within a network where there is no distinction between servers and clients. Alternatively, a client-server model is used in many other CPS applications.
Communication Channel: A communication channel consists of the hardware and software infrastructure that serves as the medium to transmit information from one network device (node) to another. A channel can be wired or wireless. All communication channels in DCPSs are bidirectional. For example, the same channel allows node p to send a message to node q and vice versa.
Send Phase: The period it takes to prepare a data packet at a node and publish it through the communication channel to other nodes.
Receive Phase: The period it takes for a node to receive (read) data from its neighboring nodes and unpack it for its local computation.
Computation Phase: The period it takes to process (compute) information received from neighboring nodes.
3 Clock Synchronization
Coordinating nodes in a DCPS involves message exchange within the network. Precise knowledge of the global time is essential for node coordination, yet the accuracy with which embedded systems' clocks can be synchronized is limited. Simultaneous readings of the system clocks may yield differing local time values [32]. Consequently, a synchronization mechanism at each node is necessary to manage clock deviations [33]. A suitable approach employs NTP, which facilitates clock synchronization between networked systems. In this setup, each DCPS node acts as an NTP client, regularly polling an NTP time server for the current time. A comprehensive NTP reference is provided in Ref. [34]. Readily available software tools for realizing NTP synchronization include nettime and abouttime.
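As a concrete illustration of the NTP client role (not the mechanism used in our setup, which relies on the off-the-shelf utilities above), the following Python sketch, assuming the third-party ntplib package and a public time server, polls an NTP server and reports the estimated local clock offset.

```python
# Minimal sketch of an NTP client poll, assuming the ntplib package
# (pip install ntplib); our nodes use off-the-shelf NTP utilities instead.
import time
import ntplib

def estimate_clock_offset(server="pool.ntp.org"):
    """Query an NTP server and return the estimated local clock offset in seconds."""
    client = ntplib.NTPClient()
    response = client.request(server, version=3)
    # response.offset is the estimated difference between the server clock and
    # the local clock, already corrected for the round-trip delay.
    return response.offset

if __name__ == "__main__":
    offset = estimate_clock_offset()
    print(f"Local clock offset: {offset:.6f} s at {time.ctime()}")
```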
Hadzilacos and Toueg [35] offer a thorough definition of a synchronous distributed system: (1) processes exhibit known upper and lower execution time bounds; (2) transmitted messages adhere to known bounded delivery times; and (3) process local clocks have a defined drift rate from real time. Synchronization upholds these conditions, ensuring coordinated node operation in a DCPS.
4 Experiments for Testing System Concurrency and Synchronicity
This section describes three experiments formulated to validate whether the nodes in a DCPS operate synchronously in real time. The three experiments were designed based on the three indicators of synchronous computing system operation proposed by Hadzilacos and Toueg [35]. A synchronous distributed algorithm at each node is prescribed to go through three sequential phases, namely, the send, receive, and computation phases, during each execution period, as shown in Fig. 2. All nodes in the network are expected to go through each of the three phases concurrently during each execution period [36].
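As a language-agnostic illustration of this three-phase structure (our actual implementation uses labview functions, described in Sec. 5), the following Python sketch shows one node's execution loop with send, receive, and computation phases, logging the send-phase timestamps used in the experiments below; the communication and computation functions are placeholders.

```python
# Sketch of one node's synchronous execution loop; the send/receive/compute
# functions are placeholders for the actual inter-node communication and
# processing (realized with labview network streams in our setup).
import time

PERIOD = 0.1  # execution period in seconds (100 ms, as in experiment 1)

def send_to_neighbors(value):
    pass  # placeholder: publish the local value to neighboring nodes

def receive_from_neighbors():
    return []  # placeholder: read values published by neighboring nodes

def compute(value, neighbor_values):
    return value  # placeholder: local computation phase

def run_node(initial_value, n_periods=150):
    value = initial_value
    send_timestamps = []                 # logged for experiments 1 and 3
    next_deadline = time.monotonic()
    for _ in range(n_periods):
        send_timestamps.append(time.time())         # start of the send phase
        send_to_neighbors(value)                     # send phase
        neighbor_values = receive_from_neighbors()   # receive phase
        value = compute(value, neighbor_values)      # computation phase
        next_deadline += PERIOD                      # wait out the remainder of the period
        time.sleep(max(0.0, next_deadline - time.monotonic()))
    return send_timestamps, value
```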
4.1 Experiment 1: Verification of Consistency in Send Phase Intervals.
This experiment checks the bound on the time interval between two consecutive send phase instances at node i, denoted by $\tau_i[k]$, where i denotes the node and k is the time index. Each node is programed to send data (e.g., numeric values) to and receive data from neighboring nodes, and to log the timestamp at the start of each send phase. The time interval $\tau_i[k]$ is calculated as the difference between consecutive send phase timestamps. Ideally, $\tau_i[k]$ should have a consistent value.
4.2 Experiment 2: Consistency Validation of Send–Receive Intervals.
This experiment measures the time interval between the instants at which a node sends data to and receives data from its neighboring nodes within one execution period. This time interval is denoted by $\sigma_{ij}[k]$ and illustrated in Fig. 3. For instance, node i logs the timestamps at which it sends data to and receives data from its neighboring node j. The difference between these two timestamps is $\sigma_{ij}[k]$. The values of $\sigma_{ij}[k]$ are expected to be consistent across all instances.
4.3 Experiment 3: Node Synchronicity.
The last experiment examines the synchronous operation of the clocks of the nodes within the DCPS network. To accomplish this, the time difference between the timestamps recorded at the start of the send phases of any two nodes in the network, e.g., nodes i and j, denoted by $\delta_{ij}[k]$ and shown in Fig. 4, is determined. This difference $\delta_{ij}[k]$ is calculated for all pairs of neighboring nodes. The values obtained represent a measure of the degree to which the clocks of the nodes are synchronized. Ideally, $\delta_{ij}[k]$ is expected to be close to zero for any two nodes that are perfectly synchronized.
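A minimal post-processing sketch for the three experiments is given below, assuming each node's logged send and receive timestamps are available as Python lists of seconds; the function names, list layout, and reference values are illustrative assumptions.

```python
# Sketch of the timestamp post-processing for experiments 1-3 (the list
# layouts, function names, and reference values are illustrative).

def send_interval_errors(send_ts, ref=0.100):
    """Experiment 1: |tau_i[k] - ref|, the deviation of consecutive
    send-phase intervals from the 100 ms execution period."""
    return [abs((t1 - t0) - ref) for t0, t1 in zip(send_ts, send_ts[1:])]

def send_receive_errors(send_ts, recv_ts, ref=0.010):
    """Experiment 2: |sigma_ij[k] - ref|, the deviation of the send-to-receive
    interval within each execution period from its 10 ms reference."""
    return [abs((r - s) - ref) for s, r in zip(send_ts, recv_ts)]

def cross_node_offsets(send_ts_i, send_ts_j):
    """Experiment 3: delta_ij[k], the offset between the send-phase start
    times of two nodes (ideally close to zero)."""
    return [abs(si - sj) for si, sj in zip(send_ts_i, send_ts_j)]

def mean_abs_error(errors):
    """Mean absolute error, as reported in Table 1."""
    return sum(errors) / len(errors)
```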
5 Experimental Setup for Implementing DCPS
In this section, we present a brief description of the computing configuration that was used to implement a three-node DCPS for executing the experiments designed in Sec. 4. We also provide a brief introduction to NI labview software—the system-design software used at the hardware nodes for the execution of our experiments.
The setup of the three computers is shown in Fig. 5. Each represents a node with computational, communication, and storage resources. The three computers were connected wirelessly to a network router and a unique IP address was assigned to each node. Each node was programed to communicate with neighboring nodes and process data using NI labview functions.
labview offers a variety of ways to transfer data between computational entities. We chose the network streams model as the foundation for realizing the DCPS. This model uses a peer-to-peer approach for passing messages between computers, which aligns closely with the fundamental tenets of distributed computing. In contrast, the shared variables model follows a client-server setup for communication. With network streams, labview allows a continuous flow of data between two of its applications using read and write endpoints.
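For readers not using labview, the sketch below mimics the read-endpoint/write-endpoint idea of network streams with plain TCP sockets in Python; the port number is an arbitrary placeholder, and labview network streams additionally provide buffering, flow control, and lossless delivery that this minimal analogy omits.

```python
# Minimal TCP analogy of network-stream read and write endpoints; the port
# number is a placeholder, and the buffering and flow control provided by
# labview network streams are omitted.
import json
import socket

def open_read_endpoint(port=61557):
    """Listen for one neighbor and yield the values it streams to this node."""
    with socket.create_server(("", port)) as server:
        conn, _ = server.accept()
        with conn, conn.makefile("r") as stream:
            for line in stream:
                yield json.loads(line)

def open_write_endpoint(neighbor_host, port=61557):
    """Connect to a neighbor's read endpoint and return a send function."""
    sock = socket.create_connection((neighbor_host, port))
    stream = sock.makefile("w")

    def send(value):
        stream.write(json.dumps(value) + "\n")
        stream.flush()

    return send
```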
Two primary labview functions enable node and phase concurrency: the timed loop and the timed sequence. The timed loop establishes a deterministic processing loop with a precisely defined duration for each iteration; this loop enforces the execution period of each node. The timed sequence structure, in turn, comprises one or more subdiagrams, or frames, that execute sequentially. Each frame executes exactly once, and the structure specifies the initiation and termination times of each frame; it is used to configure the timing parameters of the three phases. A comprehensive exposition of these two functions is available in the documentation of the NI labview software. The software's template of the two functions is shown in Fig. 6.
6 Results
This section presents the results derived from the analysis of timestamp data recorded by each of the three nodes during the labview implementation of the verification experiments detailed in Sec. 4. For every experiment, 150 timestamps were logged at each node during the exchange of data between neighboring nodes.
Table 1 presents the mean absolute error of the three time intervals. In experiment 1, the predefined bound on $\tau_i[k]$ has a reference value of 100 ms. Similarly, experiment 2 uses a reference value of 10 ms for the interval $\sigma_{ij}[k]$ between the send and receive phases.
Table 1: Mean absolute error of the three measured time intervals

| | Experiment 1 | Experiment 2 | Experiment 3 |
|---|---|---|---|
| Node 1 | 4.0907 × 10⁻¹⁶ s | 2.3649 × 10⁻¹⁶ s | 4.1317 × 10⁻⁴ s |
| Node 2 | 1.3812 × 10⁻⁵ s | 1.0000 × 10⁻⁶ s | 3.4267 × 10⁻⁵ s |
| Node 3 | 6.7114 × 10⁻⁶ s | 9.2000 × 10⁻⁴ s | 4.4707 × 10⁻⁴ s |
While a perfectly synchronous distributed system would ideally yield zero mean errors across all three experiments, practical distributed systems are near-synchronous, aiming for small and consistent error bounds. The data in Table 1 show that the mean absolute errors are at the sub-millisecond level, indicating that they are negligible. These small deviations signify that the time intervals between the send and receive phases remain consistent, thereby corroborating the concurrency among nodes. The small error values of experiment 3 further affirm the synchronicity of the nodes: the near-simultaneous commencement of the send phases across all nodes substantiates the synchronization of the clocks within the network. Each node is a distinct computing machine with unique processing, memory, and networking characteristics, which accounts for the small variation among the values in Table 1.
Finally, to evaluate the efficacy of the experimental system, we tested it by executing a fundamental distributed protocol. A consensus algorithm, adopted from Ref. [25], was programed within the experimental DCPS; it prescribes a set of rules governing processing and message propagation among nodes with the aim of achieving mutual agreement on a shared value.
For the implementation of the consensus algorithm, each node was initially assigned a random value, representing a single sensor measurement within an actual DCPS. Using its current value together with the values obtained from its neighboring nodes, each node computes an estimate that progressively converges to the mean of the nodes' values. After 15 iterations, each node is reinitialized with a new random number, akin to taking a fresh sensor measurement at each node, thereby initiating another cycle of the consensus loop. After 10 successive executions of the consensus loop, the collected data were plotted to examine the convergence behavior of the consensus algorithm.
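As an illustration only (the exact update rule and parameters of Ref. [25] as implemented in labview are not reproduced here), the following Python sketch simulates a standard synchronous average-consensus update on three nodes, mirroring the 15-iteration cycle described above; the all-to-all topology and step size are assumptions.

```python
# Illustrative simulation of a synchronous average-consensus update on three
# nodes; the all-to-all topology and step size are assumptions, not the exact
# parameters of the experimental DCPS.
import random

NEIGHBORS = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # assumed three-node topology
EPSILON = 0.3                                   # assumed consensus step size

def consensus_cycle(values, n_iters=15):
    """Run n_iters synchronous consensus iterations; return the value history."""
    history = [list(values)]
    for _ in range(n_iters):
        # Each node moves toward its neighbors' values (synchronous update).
        values = [
            values[i] + EPSILON * sum(values[j] - values[i] for j in NEIGHBORS[i])
            for i in range(len(values))
        ]
        history.append(list(values))
    return history

if __name__ == "__main__":
    for cycle in range(10):  # ten successive consensus loops, as in the experiment
        init = [random.random() for _ in range(3)]  # fresh "sensor measurements"
        final = consensus_cycle(init)[-1]
        print(f"cycle {cycle}: start={init}, converged to {final}")
```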
The plot in Fig. 7 shows that the distributed algorithm reaches consensus and that the system is capable of executing distributed computation tasks as expected. The algorithm converges to the average of the nodes' starting values. These results indicate that the three-node distributed system constructed for the experiments operates synchronously.
7 Conclusions
In this paper, we established node synchronization as a prerequisite for the effective implementation of a distributed algorithm in a real-time DCPS. Based on proposed indicators of synchronization in the literature, we formulated and implemented three experiments to validate the synchronous operation of the nodes of a real-time three-node DCPS, which executes a distributed average consensus algorithm.
Acknowledgment
This material is based upon work supported by the National Science Foundation under Grant No. CMMI-2002627.
Conflict of Interest
There are no conflicts of interest.
Data Availability Statement
The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.