Abstract
Cyber-enabled manufacturing systems are becoming increasingly data-rich, generating vast amounts of real-time sensor data for quality control and process optimization. However, this proliferation of data also exposes these systems to significant cyber-physical security threats. For instance, malicious attackers may delete, change, or replace original data, leading to defective products, damaged equipment, or operational safety hazards. False data injection attacks can compromise machine learning models, resulting in erroneous predictions and decisions. To mitigate these risks, it is crucial to employ robust data processing techniques that can adapt to varying process conditions and detect anomalies in real-time. In this context, the incremental machine learning (IML) approaches can be valuable, allowing models to be updated incrementally with newly collected data without retraining from scratch. Moreover, although recent studies have demonstrated the potential of blockchain in enhancing data security within manufacturing systems, most existing security frameworks are primarily based on cryptography, which does not sufficiently address data quality issues. Thus, this study proposes a gatekeeper mechanism to integrate IML with blockchain and discusses how this integration could potentially increase the data integrity of cyber-enabled manufacturing systems. The proposed IML-integrated blockchain can address the data security concerns from both intentional alterations (e.g., malicious tampering) and unintentional alterations (e.g., process anomalies and outliers). The real-world case study results show that the proposed gatekeeper integration algorithm can successfully filter out over 80% of malicious data entries while maintaining comparable classification performance to standard IML models. Furthermore, the integration of blockchain enables effective detection of tampering attempts, ensuring the trustworthiness of the stored information.
1 Introduction
In advanced manufacturing systems, especially those that are cyber-enabled, the rapidly increasing volume of data has greatly advanced productivity and efficiency. However, it also poses critical challenges regarding how to effectively share and store such massive amounts of data. More importantly, recent studies have identified that data security protection is another extremely important technical challenge in data-rich systems [1]. In recent decades, automated data processing in manufacturing systems has been extensively studied by leveraging machine learning (ML) techniques [2–4]. Although these studies have made significant contributions to advancing the efficiency and quality assurance of manufacturing systems, the trustworthiness of ML outcomes remains a critical concern [5]. To achieve trustworthy data analysis and decision-making, it is crucial to ensure that the data themselves are secure (i.e., trustworthy) from the user's perspective.
Furthermore, the integration of cyber-physical systems into manufacturing, which connects machines through communication protocols and sensor networks, may also lead to even higher security risks for the manufacturing system. Recent reports have shown [6–8] that the vulnerabilities of industrial control systems and digital manufacturing technologies could increase the risk of cyberattacks on the manufacturing sector. For example, attackers can intercept and tamper with sensor data streams, performing false data injection (FDI) attacks [9,10] that may mislead ML models used for process monitoring and control. Such data manipulation can lead to the production of defective or unsafe products without timely anomaly detection. Thus, both ensuring the trustworthiness of manufacturing data and conducting frequent audits are crucial. In this context, blockchain technology presents itself as a viable solution, as discussed in recent studies [11,12]. The employment of blockchain could make the data immutable, meaning once data are recorded, they cannot be altered or deleted, which significantly enhances data integrity and trust. Additionally, blockchain provides traceability, allowing every transaction or data entry to be tracked and verified, thereby improving accountability and transparency within the system. Furthermore, the decentralized nature of blockchain eliminates the need for a central authority, reducing the risk of a single point of failure and enhancing the overall security and resilience of the system against cyberattacks.
However, the application of blockchain in engineering practice still faces the so-called “scalability trilemma” among scalability, decentralization, and security [13]. Specifically, it remains challenging to achieve the optimal level of all three aspects simultaneously. For example, prioritizing decentralization and security can limit a blockchain's ability to process transactions quickly. On the other hand, in the centralized scenario that is commonly adopted in manufacturing systems, both scalability and security can be achieved [14], leading to successful integration of blockchain into manufacturing systems [15]. Nevertheless, in real-world applications, blockchain does not inherently guarantee the quality or integrity of the data being fed into the system. This limitation arises because blockchain secures data only after they have been recorded [16]. If erroneous, poor-quality, or malicious data are entered into the blockchain, they become an immutable part of the chain. Without mechanisms to assess data quality, the blockchain may propagate and secure flawed or compromised data. For instance, if false sensor readings are securely recorded on the blockchain without validation, the manufacturing process might continue under erroneous assumptions, leading to significant equipment damage or product quality issues. Furthermore, FDI attacks that introduce malicious data can compromise the ML models, and if these data are stored on the blockchain, they become challenging to remove or correct. It is also worth mentioning that blockchain technology is still vulnerable to several types of attacks, such as the majority attack, also known as the 51% attack [17]. Such attacks may increase the security risk in centralized and relatively small decentralized blockchains.
Most existing blockchain applications in manufacturing use it as a separate data security tool, without collaboration with the data analytics tools leveraged in manufacturing systems, such as ML models. Therefore, the objective of this study is to team ML and blockchain, and thereby strengthen the capability of blockchain in its application to cyber-enabled manufacturing systems. In practice, a notable challenge when incorporating ML models is catastrophic forgetting [18]. This challenge is particularly prevalent in manufacturing systems, where data generation is continuous and subject to a well-documented phenomenon known as concept drift [19], in which the statistical properties of an ML model's input data change over time. In other words, the relationship between the features and the target variable that the model is trying to learn is not constant but evolves or shifts. This may degrade the model's performance over time if the model is not adapted or updated to accommodate these changes. To address this challenge, this study proposes to utilize incremental machine learning (IML), which can not only prevent catastrophic forgetting [20] and detect potential concept drifts [21], but also serve as an advanced data curation technique that assures data quality and security in manufacturing processes. If the process change is relatively small, it can be accepted by the model, allowing the system to adapt to normal operational conditions during retraining. However, significant anomalies due to cyber-attacks must be identified and filtered out to maintain data integrity and model performance.
A brief illustration of the proposed teaming framework is presented in Fig. 1. The first step in the data processing pipeline is to collect data from its sources, i.e., the manufacturing system. This can involve streaming data from machines, cameras, or standalone sensors. Then an IML model continually receives these data, updating its knowledge and refining its predictions in real time. The rationale of the proposed method is that the IML model serves as an intelligent gatekeeper that supports the blockchain by detecting and excluding unqualified data, e.g., maliciously injected fake data or unexpected outliers caused by data collection errors. More specifically, this model dynamically evaluates the incoming data streams in real time, employing anomaly detection, pattern recognition, and adaptive learning mechanisms to discern and filter out potentially malicious or anomalous data patterns. Meanwhile, the collected data are also secured by the blockchain, forming a joint security protection framework to safeguard the streaming data in cyber-enabled manufacturing systems. Concurrently, the incremental nature of the blockchain, characterized by the sequential addition of validated data blocks, complements the adaptive learning process of the machine learning model. This strategic alignment ensures that only authenticated and validated data, free from potential threats, are permanently recorded within the blockchain.
By fortifying the initial entry point into the blockchain network, this approach minimizes the risk of storing compromised or tampered data, thereby preserving the integrity and trustworthiness of the blockchain-stored information. It can also detect alterations in the storage and prevent unintended modifications. Further, the secured data are then stored in a more permanent location as well, such as a data warehouse, data lake, or even decentralized storage protocols such as Filecoin [22] and Arweave [23]. Notably, the stored data could also be analyzed to extract insights and knowledge. The results of the analysis can then be used to make business decisions, improve processes, or drive other actions.
Nevertheless, there are two major challenges that need to be addressed to fully realize the proposed integration of IML and blockchain. One challenge is the development of efficient and scalable IML algorithms capable of training models on large datasets, especially in a streaming data scenario. The challenge lies in ensuring accurate learning without completely retraining the model on the entire dataset each time. If left unaddressed, scalability issues can lead to delays in model updates, resulting in outdated predictions or even system overload. This necessitates the development of IML algorithms designed specifically for efficiency. Second, ensuring that the IML models themselves are secure and not susceptible to poisoning or adversarial attacks is non-trivial, especially when the models are continuously learning from data that could be manipulated. To address the challenges, this study presents a proof-of-concept that innovatively integrates IML and blockchain.
The main contribution lies in the new integrative framework, and it is also tailored for the applications in manufacturing systems. The deterministic algorithm of the “gatekeeper” complements the incremental nature of both components in the framework, namely, IML and blockchain. The proposed framework addresses the limitations of blockchain by preventing bad data from becoming part of the immutable ledger, thereby reducing the risk of attacks that exploit data integrity such as FDI attacks. This work also discusses and demonstrates how blockchain-based data storage and security protection could be enhanced by applying IML approaches in cyber-enabled manufacturing systems. To validate the effectiveness of the proposed framework, a case study was conducted, where real-time sensor data were collected to detect potential cyber-physical attacks during the additive manufacturing (AM) process.
The rest of this paper is organized as follows. Section 2 provides a review of related works, and then the proposed research methodology is elaborated in Sec. 3. Section 4 presents a real-world case study with experiment results and discussions. Finally, the conclusions are summarized, and future work directions are discussed in Sec. 5.
2 Literature Review
2.1 Application of Blockchain in Cyber-Enabled Manufacturing.
Using the blockchain technology [24], transactions are bundled into blocks, cryptographically linked to form a continuous chain. These blocks are validated and added to the chain through a consensus mechanism, ensuring agreement among participants before acceptance. Utilizing cryptographic hashing and consensus protocols like proof of work, proof of stake, roll-delegated proof of stake [25], or other consensus algorithms, blockchain ensures data integrity, immutability, and security.
By providing a secure, transparent, and immutable record of transactions, blockchain can improve traceability, provenance, and security throughout the manufacturing process. Blockchain can be used to track the movement of goods and materials throughout the supply chain, from raw materials to finished products. This capability helps manufacturers identify counterfeit products, prevent product recalls, and enhance compliance with regulations. Industries leverage blockchain for various applications, such as ensuring traceability and transparency in manufacturing supply chain management [26], maintaining manufacturing–supplier relationships and reducing verification costs [27], and improving cloud manufacturing [28]. Also, blockchain can automate many of the manual processes involved in manufacturing, such as tracking inventory and managing payments. This can free up time and resources for manufacturers to focus on other tasks, such as innovation and product development. However, the primary advantage of blockchain for manufacturing is enhanced security: decentralization, hashing, and the consensus mechanism make it difficult for hackers to tamper with data or destroy the data chain. Blockchain technology is widely used by companies and researchers in many applications. For example, Liu et al. [29] developed a blockchain-based product credit mechanism to manage cross-enterprise collaborations securely, fairly, and effectively within their social manufacturing network. In addition, Shi et al. [30] investigated how blockchain could be applied in AM and considered it from a micro-perspective by securing G-code during real-time monitoring of the AM process. Another area where blockchain is used is improving anti-counterfeiting and copyright protection procedures for AM [31].
Blockchain can also be utilized to support cloud manufacturing [32]. Cloud manufacturing is a new manufacturing platform utilizing concepts from cloud computing, the Internet of Things (IoT), service-oriented computing, and virtualization to convert manufacturing resources and operations into a set of manufacturing services that can be smartly integrated and managed. Al-Jaroodi and Mohamed [17] presented their framework named Man4Ware to enable blockchain applications such as digital identities, distributed security, smart contracts, and micro-controls in manufacturing systems. Also, Liu et al. [33] demonstrated the efficiency of blockchain in defending against distributed denial-of-service (DDoS) attacks on cyber-physical systems (CPS). Abeyratne and Monfared [26] developed a system that uses a distributed ledger in the manufacturing supply chain and demonstrated its potential benefits, including durability, transparency, and immutability. Private organizations are also using this technology. For example, Oracle has recently included a centralized blockchain feature [34] inside their proprietary database system. Thus, blockchain has proven to be a reliable tool for increasing data security and trust, mostly by leveraging linked hashes. Specifically, blockchain could be advantageous in cyber-enabled systems like additive manufacturing, since it can create an additional layer of protection in databases. However, most of the existing blockchain-based security protection frameworks are mainly cryptography based, and data quality is not sufficiently considered in them.
2.2 Incremental Machine Learning in Manufacturing.
Predictive ML approaches can be used for effective machine maintenance scheduling by monitoring machine status and predicting possible failures. This approach is called model predictive control, and it could help to prevent unplanned downtime and unnecessary checks [35]. Another common application of ML is quality control, where trained algorithms search for product defects during a manufacturing process or label completed parts as acceptable or unacceptable. ML is also applied to condition monitoring, using computer vision to recognize accidents and provide essential information about the load level. In addition, ML algorithms are used in product development to reduce the time spent at every step, from market analysis to prototype testing.
To address some well-known issues in traditional ML, such as the catastrophic forgetting [20] and concept drift, leveraging the IML framework is a common solution [36,37]. Also known as continual [21], sequential, or lifelong learning, it is a machine learning paradigm where the learning process is conducted progressively, allowing models to learn from new data while retaining knowledge from previously seen data. This approach is particularly useful in scenarios where it is computationally expensive or impossible to retrain a model from scratch each time new data becomes available. Incremental learning can be broadly categorized into three types: instance-incremental, class-incremental, and concept-incremental learning [38]. Instance-incremental learning involves the model learning from new instances of data, while class-incremental learning extends the model's ability to recognize new classes that were not present in the initial training set. Concept-incremental learning, on the other hand, allows the model to adapt to changes in the underlying data distribution over time.
Recent studies have also explored IML applications in different manufacturing processes, such as semiconductor manufacturing [39] and wire arc additive manufacturing [40]. In addition, another recent study [41] introduced a mode-cloud based transfer incremental learning method, which combined transfer and incremental learning for batch manufacturing processes. However, most of the existing studies did not consider the potential security issues that may exist in cyber-enabled manufacturing. To address the limitations discussed in Secs. 2.1 and 2.2, this study proposes to integrate blockchain and IML so that the two complement each other's strengths, thereby advancing cyber-enabled manufacturing systems.
3 Research Methodology
The proposed research methodology is summarized in Fig. 2. In real-world manufacturing, data quality is influenced by various factors, including normal process fluctuations and potential cyber threats. The proposed framework integrates IML with the blockchain technology to establish a robust defense mechanism that prevents the inclusion of malicious data into the blockchain network. The framework is designed to discern between minor process changes—which are acceptable and beneficial for the model to learn—and significant anomalies caused by cyber-attacks. This method dynamically evaluates incoming data streams in real-time and filters out potentially malicious or anomalous data while allowing normal process variations to be incorporated into the model retraining. Essentially, this framework can be viewed as an advanced data curation technique, as it not only ensures data security but also enhances its quality before it is stored in the blockchain. More details about this process are presented in Sec. 3.2.
In addition, blockchain's sequential addition of validated data directly supports the iterative refinement of IML models. The main features of the blockchain-enabled manufacturing data storage that are included in the proposed framework are described in detail in Sec. 3.1. In summary, the methodological framework not only enhances the security posture by proactively filtering out malicious data but also fortifies the blockchain's resilience against potential threats (more details are presented in Sec. 3.3).
3.1 Blockchain-Enabled Manufacturing Data Storage.
In this study, the proposed framework relies on blockchain's immutability, which enables verification of the entire chain by recalculating each block's hash. This is possible because each block contains a hash derived from its own content and from the content of the previous block, enabling trustworthy data storage and sharing in manufacturing systems. As shown in Fig. 3, the blockchain is represented as a single programmed object containing blocks, where each block stores a hash derived from the previous one. A block is composed of raw data, a timestamp, and the calculated hash. In this study, a single block represents one sensor measurement to align with the framework's focus on uniformity. The hashing function has two main features: it is not reversible, and it accepts data of any length and outputs a fixed-size result. For example, a hash for the text “manufacturing” is 039d6add7ee0b6cf3cd97ffc914f663fa2e36f853824d8c4371c63f33b237d88.
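The block structure described above can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the choice of SHA-256 (the 64-character hexadecimal example above is consistent with a 256-bit digest, but the hash function is not named in the text), the JSON serialization, the all-zero genesis hash, and the names `Block` and `compute_hash` are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One sensor measurement stored as one block (hypothetical layout)."""
    data: tuple       # raw measurement, e.g., (ax, ay, az) accelerations
    timestamp: float  # acquisition time in seconds
    prev_hash: str    # hash of the preceding block

    def compute_hash(self) -> str:
        # Serialize the content deterministically, then hash it.
        # SHA-256 is an assumed choice of hash function.
        payload = json.dumps(
            {"data": list(self.data),
             "timestamp": self.timestamp,
             "prev_hash": self.prev_hash},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


GENESIS_HASH = "0" * 64  # assumed placeholder for the genesis block's hash

block = Block(data=(0.01, -0.02, 0.98), timestamp=0.0, prev_hash=GENESIS_HASH)
digest = block.compute_hash()

# A block that differs in a single digit of one reading.
altered = Block(data=(0.01, -0.02, 0.99), timestamp=0.0,
                prev_hash=GENESIS_HASH).compute_hash()

print(len(digest))        # fixed-size output regardless of input length
print(digest == altered)  # a small input change yields a different hash
```

Because `prev_hash` is part of the hashed payload, each block's hash depends on the whole history before it, which is the property the verification step relies on.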
The blockchain verification function compares the recalculated hash with the hash stored within each block. This verification is highly time efficient, since calculating a hash is computationally inexpensive. For example, if the hash stored in block #10 is $h_{10}$, the function takes the raw data from block #10 and the hash $h_{9}$ from block #9, and then uses them to calculate a hash $h'_{10}$. If $h'_{10} = h_{10}$, the block is verified. The same procedure is applied to each block in the chain. Hashing functions are designed to be extremely sensitive to changes in their input: even a slight change in the input results in a completely different hash value, so tampering can be easily recognized.
The necessity of linking hashes in our framework lies in its ability to ensure data integrity and facilitate tamper detection. Without such linkage, an attacker could alter a data point and its hash without affecting other data points, making it difficult to detect tampering. In contrast, when hashes are linked, any change to a single block would require recalculating the hashes of all subsequent blocks, which is practically infeasible to do without being detected. This ensures that any attempt to alter data is immediately apparent, as the chain's integrity is compromised. Thus, integrating blockchain technology is a promising way to enhance the security of automated data processing frameworks, which typically use machine learning.
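To make the role of hash linking concrete, the following sketch builds a three-block chain, verifies it, and then simulates two tampering attempts. It is an illustration under stated assumptions: SHA-256, the string encoding of the measurements, the all-zero genesis hash, and the helper names `block_hash`, `build_chain`, and `verify_chain` are hypothetical and not taken from the study.

```python
import hashlib

GENESIS_HASH = "0" * 64  # assumed placeholder for the genesis block's hash


def block_hash(data: str, prev_hash: str) -> str:
    # The hash of a block covers its own data and the previous block's hash.
    return hashlib.sha256((prev_hash + "|" + data).encode("utf-8")).hexdigest()


def build_chain(records):
    chain, prev = [], GENESIS_HASH
    for data in records:
        h = block_hash(data, prev)
        chain.append({"data": data, "hash": h})
        prev = h
    return chain


def verify_chain(chain):
    """Return the indices of blocks whose stored hash fails recomputation."""
    failed, prev = [], GENESIS_HASH
    for j, block in enumerate(chain):
        if block_hash(block["data"], prev) != block["hash"]:
            failed.append(j)
        prev = block["hash"]  # follow the stored links
    return failed


chain = build_chain(["0.01,0.02,0.98", "0.02,0.01,0.97", "0.00,0.03,0.99"])
intact = verify_chain(chain)

# Attempt 1: the attacker alters the data of block 1 only.
chain[1]["data"] = "9.99,9.99,9.99"
after_data_edit = verify_chain(chain)

# Attempt 2: the attacker also recomputes the hash of block 1.
# The stored hash of block 2 was derived from the old hash of block 1,
# so the mismatch moves to block 2 instead of disappearing.
chain[1]["hash"] = block_hash(chain[1]["data"], chain[0]["hash"])
after_rehash = verify_chain(chain)

print(intact, after_data_edit, after_rehash)
```

In this sketch, hiding the change would require rewriting every subsequent block, which is the cost that linking imposes on an attacker.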
3.2 Leveraging Incremental Machine Learning.
As discussed in the previous sections, this study aims to integrate IML with blockchain technology. Compared with conventional supervised machine learning, IML can leverage new data for model updating without completely retraining the model [21]. In manufacturing applications, particularly in sensor-based online process monitoring, real-time sensor data are continuously generated. Consequently, the fast-growing data volume may lead to a significantly high computational burden if the machine learning model must be completely retrained frequently. With the adoption of IML, such technical issues can potentially be addressed.
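The idea of updating a model with each new sample instead of retraining on the full history can be illustrated with a deliberately simple stand-in. The sketch below uses a nearest-centroid classifier with running-mean updates; it is not the IML model used in this study, and the class name `IncrementalCentroidClassifier` and the toy two-feature stream are assumptions introduced only for illustration.

```python
class IncrementalCentroidClassifier:
    """Nearest-centroid classifier updated one sample at a time (toy stand-in)."""

    def __init__(self, n_features):
        self.n_features = n_features
        self.centroids = {}  # label -> running mean of the features
        self.counts = {}     # label -> number of samples seen so far

    def partial_fit(self, x, label):
        # Running-mean update: the cost is independent of how much data
        # has already been seen, and no past sample needs to be stored.
        if label not in self.centroids:
            self.centroids[label] = [0.0] * self.n_features
            self.counts[label] = 0
        self.counts[label] += 1
        n = self.counts[label]
        c = self.centroids[label]
        for k in range(self.n_features):
            c[k] += (x[k] - c[k]) / n

    def predict(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


model = IncrementalCentroidClassifier(n_features=2)
stream = [
    ((0.0, 0.0), "normal"),
    ((0.2, 0.0), "normal"),
    ((1.0, 1.0), "abnormal"),
    ((1.2, 1.0), "abnormal"),
]
for x, label in stream:
    model.partial_fit(x, label)  # one cheap update per new sample

print(model.centroids)
print(model.predict((0.1, 0.1)), model.predict((0.9, 1.1)))
```

After the stream is consumed, each centroid equals the batch mean of its class, so the incremental updates reach the same result as recomputing from scratch at a fraction of the cost per sample.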
Notably, before adopting newly arriving data points for model updating, the knowledge distillation (KD)-enabled IML model can also be packaged as a code block and act as a “discriminator” that assigns a predicted label to each newly arriving data point. If the predicted label is consistent with the label associated with the data point, the new data point is adopted for IML. Since this cumulative nature of IML corresponds with that of the blockchain, the framework integrating them is described in the next section.
3.3 The Blockchain-Incremental Machine Learning Integrated Framework.
The proposed framework focuses on securing the collected data for online process monitoring in manufacturing. It analyzes the data one piece at a time and includes two main parts: the IML model and the blockchain structure. The IML model is responsible for assigning data to the proper category, and the main role of the blockchain is to raise the level of trust and security. It is worth mentioning that the security delivered by blockchain is limited, since the technology is still vulnerable to several types of attacks, such as the majority attack, also known as the 51% attack. Therefore, it would be beneficial to enable an additional security protection mechanism in the blockchain framework. Thus, in the proposed method, besides its role in process monitoring, the IML model also checks the validity of newly arriving sensor data before they are used for learning and storage, enabling a higher level of trust in the stored results.
The framework addresses two primary threat models: sensor-level data manipulation and post-generation data tampering. Sensor-level threats include physical or cyber intrusions that affect data integrity during data collection and labeling, while post-generation threats involve unauthorized modifications to stored data. As shown in Fig. 4, the data are streaming continuously from an online sensing platform in a manufacturing system. The sensing platform collects real-time data during the manufacturing process, and the data format could be presented as a continuous stream of numeric values, e.g., vector of features associated with specific timestamp or time window. Each single piece of data is transferred electronically using communication channels within one cyber-physical system. To ensure the consistency of data labels in our proposed framework, we propose a rigorous verification process before data are hashed and stored in the blockchain. This process involves verifying that the labels are accurate and consistent with the data characteristics. Once the data and their associated labels are processed and stored in the blockchain, any changes made to the labels afterward will not affect either the training process or the blockchain storage due to the immutability of the blockchain. This immutability ensures that once data are recorded, they cannot be altered without detection, thereby maintaining label consistency over time and preserving the integrity of the framework.
Once the data are labeled, they are processed by the IML model, which acts as an intelligent “gatekeeper,” preventing malicious or suspicious data from entering the system. Specifically, a pretrained ML model or the most recently updated IML model, i.e., the ML-based process monitoring model that is continuously updated via IML, makes a prediction for the incoming data entry and compares the predicted label with the label associated with that entry. If the two do not match, the new data entry is prevented from being added to the blockchain, since the system cannot be sure whether this new data point is trustworthy. Furthermore, the data stored in the blockchain are grouped into minibatches that contribute to the learning process, letting the model acquire new knowledge, i.e., updating the ML-based process monitoring model by IML. Thus, the gatekeeper algorithm prevents the model from being affected by potential malicious attacks, enabling effective integration between IML and blockchain. It is also worth noting that the proposed framework is tailored for manufacturing systems, where the data distribution will not change dramatically in a short period of time. Therefore, even though the gatekeeper algorithm may introduce a potential bias, the model's learning ability should not be significantly affected, since IML is designed to address concept drift, including cases of smooth distribution change. Moreover, when the data are rich with noise and errors, the IML model may benefit from learning from natural, slight shifts due to the filtering mechanism of the gatekeeper algorithm.
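The gatekeeper check described above reduces to comparing the model's predicted label with the label attached to each incoming entry. The sketch below illustrates this under stated assumptions: `gatekeeper` and `toy_predict` are hypothetical names, and the threshold rule is only a stand-in for the ML-based process monitoring model.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (features, attached label)


def gatekeeper(
    predict: Callable[[Sequence[float]], str],
    stream: Iterable[Sample],
) -> Tuple[List[Sample], List[Sample]]:
    """Split a stream into accepted and rejected entries."""
    accepted, rejected = [], []
    for x, label in stream:
        if predict(x) == label:
            accepted.append((x, label))  # eligible for storage and learning
        else:
            rejected.append((x, label))  # never reaches the blockchain
    return accepted, rejected


def toy_predict(x):
    # Hypothetical stand-in for the process monitoring model.
    return "abnormal" if sum(x) / len(x) > 0.5 else "normal"


stream = [
    ((0.1, 0.2, 0.1), "normal"),
    ((0.9, 0.8, 0.7), "abnormal"),
    ((0.9, 0.9, 0.9), "normal"),  # injected entry: label contradicts the signal
    ((0.2, 0.1, 0.0), "normal"),
]
accepted, rejected = gatekeeper(toy_predict, stream)
print(len(accepted), len(rejected))
```

A consequence of this design, noted in the text, is that the filter is only as good as the current model: an entry that is correctly labeled but misclassified is also rejected, which is the potential bias mentioned above.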
The detailed procedure of the proposed framework is outlined in Algorithm 1. It begins by sequentially processing data instances, evaluating predictions against known labels, and storing verified data in a blockchain structure. The key component of the algorithm is the gatekeeper which prevents the model from training on suspicious data. This involves creating blocks with timestamps and associating them with specific blockchains based on label categories. The algorithm ensures data integrity and security by computing and linking hashes for each block of data, facilitating verification to detect any potential tampering. Subsequently, the model is continuously trained using the stored data, updating parameters based on calculated losses and gradients. This iterative process continues until all data instances are processed, resulting in an organized database structured within blockchain. Overall, the algorithm orchestrates a systematic fusion of incremental learning and blockchain mechanisms to manage data streams securely, verify their authenticity, and train machine learning model, thereby ensuring the integrity and reliability of the stored information.
Algorithm 1 The blockchain-IML integrated framework
Input: Stream of data $D$ with length $N$, learning rate $\eta$, $i = 0$, $t = 0$
Initialize: IML model's parameters $W$, target blockchain $B$ with genesis block $b_0$ and hash $h_0$
Repeat:
Step 1: Gatekeeper check
Choose data $x_i$ from $D$
Compute prediction label $\hat{y}_i$
If $\hat{y}_i$ equals $y_i$
Then continue
Else: go to Step 1
Step 2: Storing data
Create timestamp $t_i$
Use $t_i$, label $y_i$ from $D$, and $x_i$ to create block $b_i$
Attach $b_i$ to blockchain $B$
Compute hash $h_i$ for $b_i$ using $x_i$, $y_i$, $t_i$, $h_{i-1}$
Link $h_i$ with $b_i$
Step 3: Verification
$j = 0$
Repeat:
Compute hash of $j$th block
If the result is not equal to linked hash: display an alert message
$j = j + 1$
Until: $j = i + 1$
Step 4: Training model
Use $\hat{y}_i$, $y_i$ to compute loss by Eq. (5)
Compute the gradient and update $W$ by Eq. (6)
Until: $i = N$
Output: Organized database
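The four steps of Algorithm 1 can be sketched end to end as follows. This is a simplified illustration rather than the implementation used in the study: a one-feature logistic model trained by stochastic gradient descent on the log loss stands in for the KD-enabled IML model and for Eqs. (5) and (6), SHA-256 is an assumed hash function, and all function names and the toy stream are hypothetical.

```python
import hashlib
import math
import time

GENESIS_HASH = "0" * 64  # assumed placeholder for the genesis block's hash


def predict_proba(w, b, x):
    z = sum(wk * xk for wk, xk in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def block_hash(x, y, t, prev_hash):
    return hashlib.sha256(f"{x}|{y}|{t}|{prev_hash}".encode("utf-8")).hexdigest()


def verify(chain):
    prev = GENESIS_HASH
    for blk in chain:
        if block_hash(blk["x"], blk["y"], blk["t"], prev) != blk["hash"]:
            return False
        prev = blk["hash"]
    return True


def run(stream, w, b, lr=0.5):
    chain, rejected = [], 0
    for x, y in stream:
        # Step 1: gatekeeper check (skip entries the model disagrees with).
        y_hat = 1 if predict_proba(w, b, x) >= 0.5 else 0
        if y_hat != y:
            rejected += 1
            continue
        # Step 2: store the entry as a block linked to the previous hash.
        t = time.time()
        prev = chain[-1]["hash"] if chain else GENESIS_HASH
        chain.append({"x": x, "y": y, "t": t, "hash": block_hash(x, y, t, prev)})
        # Step 3: verify every stored block.
        if not verify(chain):
            print("ALERT: stored data failed verification")
        # Step 4: one gradient step on the log loss (assumed stand-in loss).
        err = predict_proba(w, b, x) - y
        w = [wk - lr * err * xk for wk, xk in zip(w, x)]
        b = b - lr * err
    return chain, rejected, (w, b)


# Toy stream: label 0 = normal, 1 = abnormal; the third entry is injected.
stream = [([0.1], 0), ([0.9], 1), ([0.95], 0), ([0.2], 0)]
chain, rejected, params = run(stream, w=[4.0], b=-2.0)  # assumed pretrained values
print(len(chain), rejected, verify(chain))
```

Note that verifying the whole chain after every insertion, as written in Step 3, costs time proportional to the current chain length per entry; the sketch keeps that behavior for fidelity to the algorithm listing rather than for efficiency.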
4 Case Study
4.1 Experimental Setup.
A desktop fused filament fabrication-based 3D printer is deployed for sample fabrication and data collection [43], as shown in Fig. 5. To monitor the AM process, a MEMS accelerometer (i.e., STMicroelectronics LSM6DS3) is mounted on the side of the extruder [42] to record the real-time vibration signals of the extruder along three axes at a sampling frequency of approximately 1 Hz. A Raspberry Pi 4B single-board computer was used for data acquisition from the vibration sensor.
In this study, a solid cube with 2 cm sides was fabricated as the nominal design with the machine setup parameters summarized in Table 1. The feedstock material is polylactic acid filament. To validate the effectiveness of the proposed approach, we adopted a cyber-physical attack detection case that has also been used in related prior studies [43,44]. Specifically, the inner shape of the cube was altered (see Fig. 6(b) and the “Layer design” row in Table 1). This can be treated as a simulated attack (i.e., anomaly) scenario in which an internal feature (i.e., a void) is inserted. Such an alteration could significantly compromise the mechanical properties of the final products. However, the printing times of the nominal and altered products are comparable, so the defects cannot be easily detected by simply tracking the printing time. Thus, such anomalies should be detected in a timely manner during the printing process rather than through post-process inspection. Moreover, the deliberate insertion of internal voids represents a significant anomaly resulting from potential cyber-attacks, which differs greatly from normal process variations. According to prior studies, leveraging supervised machine learning approaches to recognize anomaly patterns from online sensor data is a promising way to accomplish this task [45,46]. However, the successful implementation of those approaches also relies on the trustworthiness and security of the sensor data used in model training.

Fig. 6 Designs of sample parts: (a) a nominal part and (b) an altered part, in which a small square-shaped void was inserted to simulate malicious manipulation due to cyber-attacks
Table 1 The design parameters of nominal and altered parts
Design parameters | Nominal | Anomaly |
---|---|---|
Printing speed | 50 mm/s | 50 mm/s |
Layer thickness | 0.25 mm | 0.25 mm |
Nozzle temperature | 200 °C | 200 °C |
Infill rate | 20% | 20% |
Bed temperature | 60 °C | 60 °C |
Layer design | Layer # 1–80: solid | Layer # 1–34: solid; Layer # 35–80: a square hole inside (1 × 1 × 1 cm) |
Table 2 The model parameter setup
Step | Parameter | Value |
---|---|---|
Knowledge distillation setup | Weight parameter | 0.7 |
Knowledge distillation setup | Distilled temperature | 5 |
CNN setup | Kernel size | 3 |
CNN setup | Stride | 2 |
CNN setup | Padding | 1 |
CNN setup | Hidden convolutional layer 1 | (1, 16) |
CNN setup | Hidden convolutional layer 2 | (16, 32) |
CNN setup | Hidden dense layer 1 | (256, 120) |
CNN setup | Hidden dense layer 2 | (120, 84) |
CNN setup | Output layer | (84, 2) |
CNN setup | Number of pre-training epochs | 5 |
CNN setup | Number of incremental training epochs | 25 |
4.2 Data Preprocessing.
The experimental data were collected sequentially from the sensor mounted on the AM machine, as introduced in Sec. 4.1. There are five replications of printing the altered part, and each printing replication contains approximately 850 observations (i.e., sensor measurements), each consisting of the acceleration along the three axes. The collected dataset is preprocessed using a sliding window of length n, where each data sample contains n sequential measurements, i.e., the data entry includes observation i plus additional past observations (lags), as shown in Fig. 7. In this study, the window size was chosen empirically as 30, and the overlap v of two consecutive windows is 28, so consecutive windows are shifted by two observations. Ground-truth labels were assigned as follows: the first 15–35% of each experiment is “normal,” and the latter 65–75% is “abnormal.” The rest was removed in this case study to avoid potential incorrect labeling and to guarantee that the “normal” windows fall within layers 1–34 and the “abnormal” windows fall within layers 35–80 (according to Table 1).
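The windowing step described above can be sketched as follows. This is a minimal illustration rather than the authors' preprocessing code: the function name `sliding_windows` and the synthetic signal are assumptions, and only the window length (30) and overlap (28) are taken from the text.

```python
import numpy as np


def sliding_windows(signal: np.ndarray, window: int = 30, overlap: int = 28) -> np.ndarray:
    """Split a (T, 3) signal into overlapping windows of shape (N, window, 3)."""
    step = window - overlap  # 30 - 28 = 2 new observations per window
    if step <= 0:
        raise ValueError("overlap must be smaller than window")
    n_windows = (len(signal) - window) // step + 1
    if n_windows <= 0:
        # Signal shorter than one window: return an empty stack.
        return np.empty((0, window, signal.shape[1]))
    return np.stack([signal[i * step : i * step + window] for i in range(n_windows)])


# Synthetic stand-in for one printing replication (about 850 three-axis readings).
rng = np.random.default_rng(0)
replication = rng.normal(size=(850, 3))

windows = sliding_windows(replication)
print(windows.shape)  # (411, 30, 3)
```

With roughly 850 observations per replication, this setting yields about 411 windows before the labeling-based trimming is applied.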
4.3 Benchmark Method Selection and Evaluation Metrics.
In this subsection, evaluation studies from different aspects were designed to validate the effectiveness of the proposed method in advancing the security of manufacturing systems. First, it is necessary to evaluate the effectiveness of the gatekeeper algorithm employed in the proposed method. To perform this evaluation, we simulate the attack scenario as follows: malicious data are injected into the online sensor data that are collected to feed the machine learning model. Specifically, the malicious data carry flipped labels, with the intention of “breaking” the model and rendering monitoring and further analysis useless. The performance can be evaluated by tracking two groups of metrics: (1) the percentage of the malicious data that is correctly filtered out; and (2) the classification performance, measured by accuracy, precision, recall, and F1-score. The detailed results and discussions are presented in Sec. 4.5.
In addition, it is necessary to validate the efficiency improvement of the proposed IML and blockchain integration framework over the conventional approach, e.g., retraining the ML model on the combination of new and existing data. Meanwhile, the effectiveness of the blockchain for data authentication also needs to be validated. To this end, as shown in Fig. 8, three scenarios of malicious tampering with the sensor data were created and tested:
Deletion of blocks on the blockchain. This scenario involves unauthorized block elimination, in which all data and the hash stored in the block are erased so that the attacker leaves no sign of the block's presence.
Slight/severe malicious tampering with parts of the data stored in a block. This type of attack tries to change the data within the block only slightly so that the change is not captured by the monitoring system.
Replacement of one block by another on the blockchain. In this scenario, an attacker generates a corrupted block that imitates the target block. After gaining unauthorized access to the blockchain, the attacker tries to replace the target block with the corrupted one.
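The label-flipping attack and the first metric (the share of malicious entries filtered out) can be illustrated with the sketch below. All names (`inject_flipped_labels`, `filtered_fraction`), the 25% injection rate, and the hand-made rejection mask are assumptions for illustration; the gatekeeper itself is not reproduced here, and the printed rate is not a result from the case study.

```python
import numpy as np


def inject_flipped_labels(labels, fraction, rng):
    """Flip a random fraction of binary labels; return the poisoned labels and a malicious mask."""
    labels = np.asarray(labels)
    n_malicious = int(round(fraction * labels.size))
    idx = rng.choice(labels.size, size=n_malicious, replace=False)
    is_malicious = np.zeros(labels.size, dtype=bool)
    is_malicious[idx] = True
    poisoned = np.where(is_malicious, 1 - labels, labels)
    return poisoned, is_malicious


def filtered_fraction(is_malicious, is_rejected):
    """Share of malicious entries that were rejected before reaching the model."""
    is_malicious = np.asarray(is_malicious, dtype=bool)
    is_rejected = np.asarray(is_rejected, dtype=bool)
    n_malicious = is_malicious.sum()
    if n_malicious == 0:
        return 0.0
    return float((is_malicious & is_rejected).sum() / n_malicious)


rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=200)          # synthetic binary ground truth
poisoned, is_malicious = inject_flipped_labels(labels, fraction=0.25, rng=rng)

# Hypothetical gatekeeper decision: pretend it rejects 40 of the 50 malicious entries.
is_rejected = np.zeros(labels.size, dtype=bool)
is_rejected[np.flatnonzero(is_malicious)[:40]] = True

rate = filtered_fraction(is_malicious, is_rejected)
print(f"malicious entries filtered out: {rate:.0%}")
```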

Fig. 8 Security validation scenarios for the framework: (a) block deletion, (b) block alteration, and (c) block replacement
4.4 Implementation Details of the Proposed Method
4.4.1 Blockchain.
In this study, the blockchain was designed and implemented as a Python script that can add a block, calculate a hash, and verify the whole chain, similar to the one presented in Ref. [15]. Each block wraps the index, the timestamp, the data themselves, and the hash of the previous block, and its own hash is calculated by applying the hash function to this wrapped content. The SHA-256 function is used to calculate the hash [49]. The verification process, as described in Sec. 3.1, recomputes the hash of each block and compares it with the one stored in the next block.
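A minimal sketch of such a linked-hash chain is given below, using only the Python standard library. It is not the script used in the study: the block layout as a dictionary, the JSON serialization, the all-zero genesis link, and the function names `block_hash`, `add_block`, and `verify_chain` are assumptions made for illustration.

```python
import hashlib
import json
import time

GENESIS_LINK = "0" * 64  # assumed placeholder for the first block's previous hash


def block_hash(index, timestamp, data, previous_hash):
    """SHA-256 over the wrapped block content (index, timestamp, data, previous hash)."""
    payload = json.dumps(
        {"index": index, "timestamp": timestamp, "data": data, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, data):
    """Append a block that links to the hash of the current last block."""
    index = len(chain)
    timestamp = time.time()
    previous_hash = chain[-1]["hash"] if chain else GENESIS_LINK
    block = {
        "index": index,
        "timestamp": timestamp,
        "data": data,
        "previous_hash": previous_hash,
        "hash": block_hash(index, timestamp, data, previous_hash),
    }
    chain.append(block)
    return block


def verify_chain(chain):
    """Return the positions of blocks whose index, stored hash, or link is inconsistent."""
    suspicious = []
    for position, block in enumerate(chain):
        recomputed = block_hash(
            block["index"], block["timestamp"], block["data"], block["previous_hash"]
        )
        expected_link = chain[position - 1]["hash"] if position > 0 else GENESIS_LINK
        if (
            block["index"] != position
            or recomputed != block["hash"]
            or block["previous_hash"] != expected_link
        ):
            suspicious.append(position)
    return suspicious


chain = []
for k in range(4):
    add_block(chain, [0.1 * k, 0.2, 0.3])  # synthetic sensor readings

print(verify_chain(chain))   # [] for an untouched chain
chain[1]["data"][0] += 0.001  # slight tampering inside block 1
print(verify_chain(chain))   # [1]
```

In this sketch, deleting an interior block shifts the positions of later blocks and breaks the index and link checks, while a replaced block with a freshly computed hash is exposed by the link stored in its successor. Removal of the final block leaves no successor to check against, so that case would need an additional safeguard.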
4.4.2 Incremental Machine Learning Setup.
As described in Sec. 3, the IML model leveraged in this study is a knowledge distillation-enabled incremental convolutional neural network (CNN) [42]. The model structure and parameters used in the case study, selected through preliminary experiments and comparison, are presented in Table 2.
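As a consistency check on the layer sizes in Table 2, the standard convolution output-size formula can be applied to one 30 × 3 window. The input layout is an assumption that the text does not state explicitly: each window is treated as a single-channel 30 × 3 array passed through two 2D convolutions with no pooling. The helper name `conv_out` is hypothetical.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of one convolution along a single dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed input layout: window length (30) x number of axes (3), one channel.
height, width = 30, 3
out_channels = 1
for _in_channels, out_channels in [(1, 16), (16, 32)]:  # channel pairs from Table 2
    height, width = conv_out(height), conv_out(width)

flattened = out_channels * height * width
print(height, width, flattened)  # 8 1 256
```

Under these assumptions, the flattened feature size is 32 × 8 × 1 = 256, which agrees with the input size of the first dense layer, (256, 120).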
4.5 Results and Discussions.
Table 3 presents the comparison between the IML with and without the gatekeeper, in terms of the classification performance and the portion of malicious data that is filtered out. Overall, compared to the IML without the gatekeeper, the IML with the gatekeeper provides a comparable F1-score (0.7321 versus 0.7313), with higher recall but lower accuracy and precision. Meanwhile, it offers increased data security against suspicious observations. For sensor-level threats, the gatekeeper algorithm successfully filtered out 86.47% of the malicious data.
Table 3 Comparison of classification results
Approaches | Accuracy | Precision | Recall | F1-score | Portion of malicious data filtered out |
---|---|---|---|---|---|
IML without gatekeeper (benchmark) | 0.7608 | 0.8137 | 0.6640 | 0.7313 | 0% |
IML with gatekeeper (proposed) | 0.7216 | 0.6929 | 0.7760 | 0.7321 | 86.47% |
Note: For each evaluation metric, the approach with better performance is highlighted in bold.
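The F1-scores in Table 3 can be cross-checked against the reported precision and recall, since the F1-score is their harmonic mean. The short sketch below only re-uses the values printed in the table; the function name `f1_score` is defined locally for this check.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1-score) taken from Table 3.
reported = {
    "IML without gatekeeper": (0.8137, 0.6640, 0.7313),
    "IML with gatekeeper": (0.6929, 0.7760, 0.7321),
}

for name, (precision, recall, f1_reported) in reported.items():
    f1_recomputed = f1_score(precision, recall)
    print(f"{name}: recomputed {f1_recomputed:.4f}, reported {f1_reported:.4f}")
```

Both recomputed values agree with the table to four decimal places, which shows that the gatekeeper trades precision for recall while leaving the F1-score essentially unchanged.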
Figure 9 shows how accuracy and loss change over epochs for both systems. Although the IML model equipped with the gatekeeper converges more slowly than the benchmark (without the gatekeeper), it shows better performance upon convergence (at the eighth epoch). However, the proposed framework struggles to maintain this performance after convergence, losing capability when trained on limited data. This is probably explained by overfitting of the neural network model: the gatekeeper may over-emphasize specific patterns in the training data, limiting the model's ability to learn new information. In practical manufacturing scenarios involving online process monitoring, it is essential that the gatekeeper strike a balance between filtering out significant anomalies due to cyber-attacks and accepting minor process changes for retraining. If the gatekeeper is too restrictive, it may hinder the model's ability to adapt to normal process variations, affecting its generalization capability. Mitigating this risk is therefore a potential future direction for this work.
Furthermore, the integration of data preprocessing steps is crucial for enhancing the quality of the data before it enters the IML model. Preprocessing involves several key activities, including but not limited to sampling and scaling. By implementing these preprocessing techniques, the framework ensures that the data fed into the IML model is clean, consistent, and representative of the actual manufacturing conditions.
In addition, to validate the effectiveness of the blockchain, the three malicious data tampering scenarios introduced in Sec. 4.3 were manually simulated. In all scenarios (block deletion, data modification within a block, and block replacement), the system successfully detected inconsistencies during the verification process. By recalculating the hash value of each block and comparing it with the originally stored hash, malicious deletion or addition of a block was identified. Slight modification was also detected by the proposed blockchain-enabled framework, since changing just one number inside a block leads to a completely different hash for that block. Similarly, the replaced block was detected by the blockchain. All alerts were issued by the program almost instantly after an attack, owing to the incremental fashion of the framework. Example alert messages are shown in Fig. 10.
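The sensitivity of the hash to a single changed number can be illustrated with the standard `hashlib` module. The readings and the text serialization below are made up for illustration and are not taken from the experimental data.

```python
import hashlib


def sha256_hex(values):
    """SHA-256 digest of a comma-separated, fixed-precision rendering of the values."""
    text = ",".join(f"{v:.4f}" for v in values)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = [0.0123, -0.9876, 0.4567]   # hypothetical three-axis reading
tampered = [0.0123, -0.9876, 0.4568]   # last digit of one value changed

hash_original = sha256_hex(original)
hash_tampered = sha256_hex(tampered)

differing = sum(a != b for a, b in zip(hash_original, hash_tampered))
print(hash_original)
print(hash_tampered)
print(f"{differing} of {len(hash_original)} hex digits differ")
```

Because the two digests share no predictable structure, a stored hash that no longer matches the recomputed one is a reliable sign that the block content has changed.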
While the linked-hash mechanism significantly enhances data integrity and security, it is also worth noting that it is not entirely immune to all forms of attacks. The framework operates under certain assumptions:
The blockchain is deployed within a controlled manufacturing environment with restricted access to authorized personnel and systems.
Cryptographic hash functions used (e.g., SHA-256) are considered secure and up to date.
Regular audits and monitoring are in place to detect and respond to potential insider threats or unauthorized access attempts.
By adhering to these conditions, our mechanism effectively identifies and mitigates tampering attempts, thereby enhancing the overall security of the manufacturing data storage system.
Furthermore, Table 4 demonstrates the computational cost benefit gained by the proposed integration of incremental learning and blockchain. The elapsed time per training epoch drops from 0.4400 s with traditional batch learning to 0.2795 s with the proposed approach, which leverages incremental learning, a reduction of about 36%. The standard deviation is also lower (0.0112 s versus 0.0195 s). This efficiency improvement translates into savings in processing power and storage requirements and a better fit with manufacturing constraints, without loss of security protection. It is also worth mentioning that executing the blockchain in the proposed method does not significantly increase the required computational cost.
Table 4 Computational cost comparison
Learning type | Elapsed time per training epoch (s) |
---|---|
IML + Blockchain (proposed) | 0.2795 ± 0.0112 |
Traditional | 0.4400 ± 0.0195 |
Note: For each evaluation metric, the approach with better performance is highlighted in bold.
The implementation feasibility of the proposed framework varies across different manufacturing applications. In additive manufacturing, as demonstrated in this case study, the layer-by-layer nature of the process allows for straightforward integration of IML and blockchain. Sequential data generation aligns well with the incremental learning approach, while blockchain effectively secures process data. This makes AM a suitable candidate for initial implementation, requiring minimal modifications to existing infrastructure. The proposed framework can also be extended to other manufacturing systems. For example, CNC machining would require additional considerations for implementation. The high-speed nature of CNC operations and the complexity of tool path monitoring would necessitate more sophisticated sensor networks and ML models as well as faster data processing capabilities. While the proposed framework can be adapted for CNC applications, the implementation would require careful optimization of the gatekeeper algorithm to handle the higher data generation rates and more complex feature sets associated with tool wear and cutting parameters. Furthermore, in robotics and automation systems, the implementation challenges primarily stem from the need to process multiple streams of dynamic motion data simultaneously. The framework would need to be modified to handle real-time trajectory data and force control parameters, potentially requiring higher computational resources for the IML model to maintain effective monitoring. However, the blockchain component could be particularly valuable in securing robot motion commands and preventing unauthorized trajectory modifications.
Thus, according to the results above, the proposed method shows promise for the online detection of abnormal behaviors as well as for protecting the database used for analysis and decision-making. The gatekeeper algorithm could also help protect the manufacturing system from malicious data injection. Moreover, owing to the integration of blockchain, the collected sensor data can be better secured against malicious tampering. With such data security protection, sensor-based process monitoring for manufacturing systems could be more trustworthy. The capability of incremental learning to streamline model updates also improves computational efficiency (Table 4). It is also worth mentioning that IML and blockchain offer complementary capabilities that could help improve system scalability.
5 Conclusions and Future Work
This study presents a novel approach that integrates IML with blockchain technology to enhance security in cyber-physical manufacturing systems. By storing the collected data in the blockchain, unintended data modification can be detected in a timely manner and accurately located based on the mismatch of the hash values in the corresponding block. The integration of IML with blockchain also improves computational efficiency relative to retraining from scratch. Furthermore, the gatekeeper algorithm in the proposed method filters out more than 80% of the maliciously tampered sensor data before they are sent to the IML model and blockchain storage, alleviating the security concern while reducing computational cost.
Several research directions remain open. First, the computational efficiency of the proposed method needs to be improved, since the verification cost of the blockchain in the current framework tends to increase with the data volume. Second, fine-tuning the gatekeeper's sensitivity is crucial to ensure that it effectively filters out malicious data while permitting the minor process variations that are inherent in real-world manufacturing. Developing adaptive thresholding techniques or incorporating domain knowledge could enhance the gatekeeper's performance in distinguishing between significant anomalies and acceptable changes. Third, the proposed framework can be further extended to dynamically incorporate newly presented classes, such as new types of anomalies. Finally, considering the rapid growth of blockchain technology, incorporating advanced blockchain-based security mechanisms, such as zero-knowledge proofs and smart contracts [50–52], and exploring differential privacy techniques [53,54] could also strengthen the security of manufacturing systems.
Footnote
Raspberry Pi is a trademark of Raspberry Pi Ltd.
Acknowledgment
This work is partially supported by the National Science Foundation under Award No. TIP-2141184.
Conflict of Interest
There are no conflicts of interest.
Data Availability Statement
The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.