Although complex geometries are attainable with additive manufacturing (AM), a major barrier preventing its use in mission-critical applications is the lack of geometric accuracy of AM parts. Existing geometric dimensioning and tolerancing (GD&T) characteristics are defined based on simple landmark features and thus need to be customized to capture the subtle differences in parts with complex geometries. Hence, the objective of this work is to quantify the geometric deviations of additively manufactured parts from a large data set of laser-scanned coordinates using an unsupervised machine learning (ML) approach called the self-organizing map (SOM). The central hypothesis is that clusters recognized by the SOM correspond to specific types of geometric deviations, which in turn are linked to certain AM process conditions. This hypothesis is tested on parts made while varying process conditions in the fused filament fabrication (FFF) AM process. The outcomes of this research are as follows: (1) visualizing and quantifying the link between process conditions and geometric accuracy in FFF and (2) significantly reducing the amount of point cloud data required for characterizing geometric accuracy. The significance of this research is that this unsupervised ML approach resulted in less than 3% of over 1 million data points being required to fully quantify the part geometric accuracy.

## Introduction

### Motivation.

Additive manufacturing (AM) enables the fabrication of parts with complex geometries from a broad range of materials, including metals, ceramics, and polymers [1–4]. However, AM parts are afflicted by low geometric accuracy and poor surface integrity. The poor geometric accuracy of AM parts prevents their use in industries such as aerospace, healthcare, and automotive, where precision is important [5]. The prevalent approach for quantifying the geometric accuracy of AM parts is to assess critical geometric dimensioning and tolerancing (GD&T) characteristics, such as flatness, cylindricity, and circularity [6], based on several sample points (tens to hundreds) taken on certain part features. Although GD&T characteristics provide insights into feature-based geometric errors, the sampling strategy required for the extraction of GD&T characteristics needs to be carefully defined for parts with complex geometries. In AM, each combination of material, design, and machine may create parts with specific types of geometric deviations in terms of direction and magnitude. We substantiate these assertions with experimental data from a fused filament fabrication (FFF) AM process in Sec. 3. Therefore, there is a need for developing methods that can efficiently characterize the geometric accuracy of AM parts with complex shapes.

### Objective and Significance.

The goal of this work is to quantify the geometric deviations of additively manufactured parts from a large data set of laser-scanned coordinates. This is an important research area: recent developments in nondestructive scanning techniques, such as laser scanning, structured light, and computed tomography, enable point-by-point coordinate measurements of AM parts. These techniques can generate millions of coordinate points (a three-dimensional (3D) point cloud). However, new analytical methods are required for quantifying the geometric accuracy of AM parts from these large scan datasets. Accordingly, the objective of this work is to use an unsupervised machine learning (ML) algorithm called the self-organizing map (SOM) to address this open research challenge. The SOM is used to categorize the point cloud measurements into a limited number (tens) of clusters such that measurement points within the same cluster have similar shape deviations in terms of their severity (magnitude) and direction. Next, the clusters are ranked by the severity (magnitude) of their geometric deviations; the few top-ranked clusters represent the most severe types of geometric defects associated with the part. The central hypothesis is that each SOM-derived cluster represents a unique type of geometric deviation specific to the process-material combination associated with the part, and thus, the clusters are surrogate signatures of the part geometric accuracy. Experimental results with the FFF AM process corroborate this hypothesis; it is shown that process conditions can be chosen to reduce the geometric errors by analyzing the number of clusters and the magnitude of deviations within a few important clusters (critical clusters).

The remainder of this paper is organized as follows: Section 2 introduces the existing methods and their limitations. Section 3 describes the experimental measurement procedure, the FFF process conditions used to print the parts, and the sample data, and then introduces the SOM approach. Section 4 applies the SOM algorithm to experimental 3D point cloud data and discusses the applications of SOM for geometric accuracy characterization and data reduction. Section 5 highlights the conclusions of this study and avenues for future research. The significance of this research is that the SOM-based unsupervised ML approach resulted in less than 3% of over 1 million data points being required to quantify the part geometric accuracy.

## Existing State-of-the-Art and Research Gaps

Typically, characterizing the geometric accuracy of a part relies on one of the following approaches: evaluation based on visual assessment [5], measurement based on specific landmarks on the part [7], and GD&T measurements using coordinate measuring machines [8]. These approaches provide only limited information about the overall geometric accuracy of the part and may not be capable of capturing the subtle differences between parts of complex shapes fabricated using AM [1]. The existing literature pertaining to the characterization and optimization of geometric accuracy in AM can be divided into three groups: (1) data science (empirical modeling), (2) physics-based modeling (mostly finite element (FE) analysis), and (3) shape compensation.

### Data Science Approaches (Empirical Modeling of Geometric Accuracy in AM).

Part geometric accuracy is typically measured with various approaches, including ultrasonic, X-ray tomography, and laser scanning [8–12]. The resulting data are typically analyzed using statistical models (e.g., response surface regression and analysis of variance [13–16]) to develop an empirical mapping of the relationship between input process parameters and geometric accuracy. For example, Wang et al. [17] used least squares regression to correlate the shrinkage of fabricated parts with process parameters (e.g., laser power, layer pitch, and scanning speed) in the stereolithography (SLA) process. Zhou et al. [13] used a 3D coordinate measuring machine and a surface profilometer to capture the geometric accuracy and surface roughness of parts in the SLA process. Through Taguchi experimental designs, Zhou et al. [13] found that by adjusting process parameters, such as layer thickness, hatch spacing, overcure, blade gap, and position on the build plane, the part errors can be controlled to be less than 0.0013 mm/mm. Furthermore, Noriega et al. [18] used an artificial neural network model coupled with an optimization algorithm to improve GD&T characteristics, such as the distance between parallel faces, in FFF-fabricated parts. However, these previous works use only a limited number of samples with simple geometries to model GD&T characteristics such as cylindricity and flatness; thus, they cannot capture the critical features of parts with complex shapes [19,20].

The work presented herein goes one step further than these simple-geometry studies [21] by examining multiple features, but it is nonetheless also based on a test artifact composed of elementary geometries. The authors will extend this research to more complex AM geometries, such as thin walls and steep overhangs, in their forthcoming works.

### Physics-Based Modeling Approaches (Finite Element Modeling of Geometric Accuracy in AM).

Thermomechanical modeling has been proposed to account for thermal-related deformation and the geometric accuracy of AM parts. These methods consider the effect of process parameters such as slice thickness, part orientation, scanning speed, and material properties [21]. The advantage is that finite element models can be used to simulate a build without actual fabrication, and to investigate the relationship between process parameters and geometric accuracy [22,23] (given that the models are validated over the applicable process parameter range). Pal et al. [22] proposed a 3D dislocation density-based model to capture the relationship between thermal contours and residual stresses. They controlled the layer-by-layer geometrical errors in the prebuild stage by adjusting AM process parameters, including scan pattern and build orientation. This method accurately predicts the mechanical properties (e.g., residual stress) as a function of process parameters, leading to less postprinting rework.

Paul et al. predicted part shrinkage in powder-based AM processes by modeling the powder-to-liquid- and liquid-to-solid-state changes [23]. FE modeling is used to track the temperature gradient history in various layers and compute the overall shrinkage in the part. Subsequently, the authors use virtual geometric models to characterize thermal deformation in terms of GD&T characteristics. Jiang et al. proposed a simulation model to examine the inaccuracy of the mask exposure scanning stereolithography process [24]. Based on the simulation results, the authors suggested an optimal scan path and exposure time. Analysis of the scanning pattern indicated that the sharp corners on the part shrink faster than rounded edges. While useful at the design phase, FE-designed process plans tend to deviate from the actual build geometry due to the inability to accommodate the uncertainty stemming from materials or process parameters. The errors in the printing process accumulate over several layers and can eventually result in significant deviations.

### Shape Compensation.

Researchers have suggested approaches to enhance geometric accuracy of AM parts via shape compensation [15,16,23]. In particular, deformation may occur during the solidification of each layer and accumulate over layers, eventually resulting in distortion [17,25]. A procedure developed by Huang compensates for the two-dimensional (2D) shape deformation in SLA [25]. The proposed method provides a minimum area deviation norm to rectify the 2D shape deformation. However, the model considers shrinkage in layers to occur independently and does not capture the interaction between layers; therefore, this method is limited to simple shapes.

### Research Gap.

Very few of the existing studies are dedicated to the analysis of point cloud data for the characterization of AM geometric integrity. A recent study by Rao et al. [12] applies spectral graph theory (SGT) to point cloud data of parts produced under different process parameters to rank/classify the geometric accuracy of the parts. The proposed approach ranks the overall quality of AM parts based on the spectral graph Fiedler number, a criterion commonly used in graph theory to assess connectivity. However, the type of geometric deviation (i.e., direction and magnitude) associated with a specific part design cannot be obtained from this approach. The method proposed in the current study addresses this issue at a fundamental level. Instead of using only one absolute value to evaluate the geometric accuracy of a part, the approach outlined herein identifies candidate features based on the shapes of designs, enabling automated certification of geometric accuracy in the future.

## Methodology

The test parts used for this work are made on a consumer-grade FFF printer (Makerbot2X) and scanned with a relatively inexpensive laser scanner (NextEngine). The authors acknowledge that the choice of a desktop machine constrains the experimental scope of this paper; however, the mathematical concepts are independent of the hardware and manufacturing process. The native software on the FFF machine was used for slicing and toolpath planning under default conditions.

### Characteristics of Point Cloud Data

#### Test Part.

The test artifact used for experimental studies is a simplification of the NAS 979 standard artifact. The NAS 979 artifact is traditionally used to test the accuracy of machining centers [26–28]. Recently, Cooke and Soons at NIST used the NAS 979 part to assess AM process performance [27,28]. GD&T characteristics, such as squareness, size (length, breadth, and thickness), and circularity, were used to compare the quality of parts produced using the electron beam melting and direct metal laser sintering metal AM processes. For the experimental tests reported in the current work, the NAS 979 standard test artifact was simplified so that it was easier to make in terms of build time and feature complexity; this simplified design is called the circle-square-diamond. A schematic of the test part used in this study is shown in Fig. 1(a).

#### Data Acquisition.

A desktop 3D laser scanner (NextEngine) is used to scan the surface of FFF test parts and obtain point-by-point coordinate measurements of the geometry, referred to as a 3D point cloud. The laser scanner records reflected light from the part surface as a point in the 3D space, with a maximum volumetric deviation. The scanner uses eight 10 mW class 1 red lasers at 650 nm in two arrays of four diodes. These arrays rotate together and project lines onto the object, and two 3.0 megapixel cameras capture the reflection of the lasers over the part. This desktop scanner was used in macromode, which is capable of scanning objects up to 130 mm $\times$ 10 mm in a single scan at ±0.127 mm accuracy. The scans and measurements generate a large volume of data (e.g., 110 Mb for a small part with volume 7911 mm$^3$).

Geometric deviations of the fabricated parts can be calculated by comparing the point cloud data to the original computer-aided design (CAD) model. A sample of the data is shown in Table 1. $(X,Y,Z)$ represents the coordinates of a measurement point, and $(\Delta x, \Delta y, \Delta z)$ represents the corresponding geometric deviations in the $X$, $Y$, and $Z$ directions, respectively. For the test part shown in Fig. 1, the scanning results in over 1 GB of point cloud data in the ASCII text format, consisting of 18,098,301 rows, with each row representing a measurement of geometric deviation.

Measurement point | $X$ | $Y$ | $Z$ | $\Delta x$ | $\Delta y$ | $\Delta z$ |
---|---|---|---|---|---|---|
1 | 23.73 | 28.31 | 1.68 | 0.07 | −0.18 | 0.00 |
2 | 23.73 | 28.35 | 1.59 | 0.06 | −0.14 | 0.00 |
… | … | … | … | … | … | … |
2,999,999 | 33.01 | 16.91 | 8.82 | 0.00 | 0.00 | 0.16 |
3,000,000 | 33.11 | 16.82 | 8.82 | 0.00 | 0.00 | 0.16 |
… | … | … | … | … | … | … |
9,999,999 | 43.65 | 37.11 | 1.73 | 0.00 | 0.00 | 0.57 |
10,000,000 | 43.99 | 37.24 | 1.58 | 0.24 | 0.00 | 0.41 |
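To make the row format of Table 1 concrete, the following minimal sketch treats the last three columns of a row as a deviation vector and computes its magnitude and unit direction. The function name, the tuple layout, and the assumption that the values are in millimeters are illustrative choices and are not taken from the authors' processing code; the numeric rows are copied from Table 1.

```python
import math

# Rows copied from Table 1: (X, Y, Z, dx, dy, dz). Units are assumed to be mm.
rows = [
    (23.73, 28.31, 1.68, 0.07, -0.18, 0.00),
    (23.73, 28.35, 1.59, 0.06, -0.14, 0.00),
    (33.01, 16.91, 8.82, 0.00, 0.00, 0.16),
    (43.99, 37.24, 1.58, 0.24, 0.00, 0.41),
]


def deviation_magnitude_and_direction(row):
    """Return (magnitude, unit direction) of the deviation vector stored in one row."""
    dx, dy, dz = row[3:6]
    magnitude = math.sqrt(dx * dx + dy * dy + dz * dz)
    if magnitude == 0.0:
        # A point lying exactly on the CAD surface has no deviation direction.
        return 0.0, (0.0, 0.0, 0.0)
    return magnitude, (dx / magnitude, dy / magnitude, dz / magnitude)


for row in rows:
    magnitude, direction = deviation_magnitude_and_direction(row)
    print(f"magnitude={magnitude:.3f}", tuple(round(c, 3) for c in direction))
```

Magnitude and direction are the two properties by which the clustering described later groups the measurement points.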


We note that laser scanning consists of several steps, such as point cloud extraction from the CAD design, alignment of the measured scan to CAD, and the subsequent analysis, each of which has its own literature [29–35]. In practice, laser scanning requires a careful part alignment procedure to obtain consistent results. The alignment step requires matching of (at least four) points from the raw point cloud data with CAD model. The following method, which has been described in our previous works, showed the least variability. Four points each on the square and diamond portions are used to align the part as depicted in Fig. 2. Scanning was conducted on a sturdy table in a dark room and by coating the part with a thin layer of anti-reflective gray modeling paint.

#### Experimentation.

The experimental conditions are shown in Table 2 along with the typical number of point cloud data points obtained at each condition. Two process parameters are varied in these experiments, namely, the infill percentage ($I_f$) and the extruder temperature ($t_e$). The experimental procedure is described in depth by Dsouza [39] and in a recent publication by Tootooni et al. [38]. The dataset has 12 distinct experimental treatment conditions with different characteristics. In this design of experiments plan, no experimental data were obtained under certain treatment conditions because of repeated failure to print. Two parts were printed (left and right sides of the print bed) at each experimental run.

Temperature ($t_e$) | $I_f$ = 70% | $I_f$ = 80% | $I_f$ = 90% | $I_f$ = 100% |
---|---|---|---|---|
220 °C | No experiment | 1,233,867 | No experiment | |
225 °C | No experiment | 1,712,653 | 1,107,267 | 685,961 |
230 °C | 1,233,867 | 1,250,357 | 1,619,690 | 1,796,948 |
235 °C | No experiment | 1,795,849 | 1,758,031 | 1,692,290 |
240 °C | No experiment | No experiment | 2,211,521 | No experiment |


##### Rationale for selecting the infill percentage ($I_f$) levels.

Four levels of infill percentage ($I_f$), namely, 70%, 80%, 90%, and 100%, are investigated. The infill percentage ($I_f$) determines the amount of material inside the part; a 70% infill percentage implies that the part has 30% void volume. A lower infill percentage reduces the weight of the part but also affects its strength. At higher infill values, it is observed in the authors' preliminary experiments (see Fig. 3) that the geometric integrity deteriorates while strength improves. The infill percentage range of 70–100% was selected based on longitudinal fracture test results (Fig. 3) with FFF-printed ASTM D638 Type V specimens. The samples were built face down on the platen, and the load was applied parallel to the long edge of the sample (i.e., perpendicular to the build direction). The samples built at $I_f$ = 100% showed significantly higher fracture strength compared to the other levels. Pertinently, the specimen fracture strength was statistically indistinguishable among infill percentages at and below 90% ($I_f \leq 90\%$). Hence, a balance between strength and geometric integrity must be sought. Accordingly, the lowest infill percentage level was set at 70% considering the inevitable tradeoff between strength and geometric integrity.

For instance, Fig. 4 shows the flooded contour plot for four different test parts fabricated under different infill percentages at 230 °C extruder temperature. It is visually evident that a different infill percentage ($I_f$) results in a different part geometric accuracy; the first three parts ($I_f$ = 70%, 80%, and 90%) show better geometric accuracy than the part at $I_f$ = 100%. The effect of infill percentage ($I_f$) on the internal morphology of the part can be hypothesized by examining the micrographs shown in Fig. 5. Previous studies provide ample evidence of the presence of thermal residual stresses in FFF parts. For instance, Zhang and Chou [40] modeled the residual stresses resulting from cyclical heating and cooling in FFF and subsequently used their validated model to assess the effects of feed rate (or velocity of extruder movement), road width, and layer thickness. At lower infill percentages (70%, 80%), the thermal residual stress has an avenue to be relieved given the gap between adjacent roads; the gap allows for shrinkage without warping. In contrast, at 100% infill, the lack of a gap between roads prevents stress relief, leading to warpage and deleteriously affecting the geometric integrity.

##### Rationale for selecting the extruder temperature ($t_e$) levels.

In the case of extruder temperature ($t_e$), the recommended printing temperature for ABS material is 230 °C. Exploratory tests showed that at lower temperatures (<215 °C), the nozzle fails to extrude the material consistently, and at higher temperatures (>245 °C), the filament vaporizes. Accordingly, five levels of extruder temperature (220 °C, 225 °C, 230 °C, 235 °C, and 240 °C) are chosen to study the effect of temperature on geometric integrity.

#### GD&T Data Analysis.

Geometric dimensioning and tolerancing characteristics, including flatness, thickness, circularity, cylindricity, and concentricity, can be ascertained by sampling the point cloud data, as shown in Figs. 1(b) and 1(c). In Fig. 6, two GD&T characteristics (thickness and flatness) are mapped against the two FFF process conditions (infill percentage [$I_f$] and extruder temperature [$t_e$]) with three replications. In a recent work by the authors [28,29], it was shown that some GD&T characteristics defined based on simple shapes may not capture major geometric defects of specific part shapes and may even be negatively correlated ($\rho < 0$). A negative correlation between flatness and thickness is evident in Fig. 6; the process conditions resulting in lower deviation in thickness may lead to high deviations in flatness ($\rho \sim -0.2$). The size of the circle represents the ratio between the deviation in thickness and the deviation in flatness. For example, at 90% infill, there is no clear trend in the effect of temperature on the ratio between thickness and flatness. Hence, it is difficult to use the GD&T data to ascertain the optimal set of processing conditions, given that some of the GD&T measures contradict one another.
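The text does not state how $\rho$ is computed. Assuming it is the Pearson sample correlation between the thickness deviations $a_i$ and the flatness deviations $b_i$ of the $n$ measured parts (the symbols $a_i$, $b_i$, and $n$ are introduced here only for illustration), it would take the form

$$
\rho = \frac{\sum_{i=1}^{n} (a_i - \bar{a})(b_i - \bar{b})}{\sqrt{\sum_{i=1}^{n} (a_i - \bar{a})^2}\,\sqrt{\sum_{i=1}^{n} (b_i - \bar{b})^2}},
$$

where $\bar{a}$ and $\bar{b}$ are the sample means. Under this reading, $\rho < 0$ means that conditions with below-average thickness deviation tend to show above-average flatness deviation.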

### Profiling Geometric Deviations Based on SOM Analysis.

Traditional GD&T characteristics may not be sufficient to capture the geometric accuracy of AM parts. Herein, we use unsupervised ML based on neural networks to recognize and identify new geometric deviations. The chosen unsupervised ML approach is the SOM, which clusters the geometric deviations into multiple types according to their directions and magnitudes. Measurement points within each SOM-identified cluster are similar in the magnitude and direction of their geometric deviations. Thus, each cluster represents a unique type of geometric deviation. This is useful for identifying the different types of geometric deviations associated with specific process conditions. By focusing on the critical types of geometric deviations, the proposed method is robust to noise in the data, and the volume of the overall dataset can be significantly compressed. This compression can speed the adoption of 3D laser scanning in industrial applications by increasing the scanning speed of parts and reducing the data size.

The SOM is a type of unsupervised ML neural network that uses a procedure called competitive learning to discern patterns [41,42]. The SOM maps high-dimensional input data into a 2D space while preserving the topological interrelationship between data; the mapping does not change the relative distance or similarity among data points [43]. The clustering thus reduces the dimensionality of the data and intuitively characterizes the similarity among data points. The 2D map resulting from the SOM is called a space membership map.

Figure 7 is a schematic SOM network for a dataset with three attributes. This structure of the network could address a clustering challenge on a dataset with three attributes and 16 possible clusters. As shown in Fig. 7, each structure of the SOM network contains three types of entities: input neurons, connecting vectors, and output neurons. The vectors connect the input neurons to the output neurons and neighboring output neurons to each other. Each vector has a weight that is produced randomly in the initialization procedure and updated in the training process. Based on the nature of connections in the output layer, two types of SOM maps exist: quadrilateral or hexagonal. The network in Fig. 7 yields a quadrilateral map with at most four connections for each neuron with its neighbors.
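The weight update mentioned above can be written compactly. The following is the standard Kohonen formulation of competitive learning, given here as a sketch; the source does not specify the learning-rate schedule or neighborhood function it uses, so $\alpha(t)$, $\sigma(t)$, and the Gaussian form of $h_{ck}$ are assumptions. For an input vector $x$ and output neuron weights $w_k$, the winning neuron $c$ is the one closest to the input,

$$
c = \arg\min_{k} \lVert x - w_k \rVert ,
$$

and every neuron is then moved toward the input in proportion to its distance from the winner on the map,

$$
w_k(t+1) = w_k(t) + \alpha(t)\, h_{ck}(t)\, \bigl[x(t) - w_k(t)\bigr], \qquad
h_{ck}(t) = \exp\!\left(-\frac{\lVert r_c - r_k \rVert^2}{2\,\sigma(t)^2}\right),
$$

where $r_k$ is the position of neuron $k$ on the 2D map, $\alpha(t)$ is a decreasing learning rate, and $\sigma(t)$ is a shrinking neighborhood radius.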

Deviations in the part (Fig. 1) are denoted by the vector $\Delta(i) = (\Delta x(i), \Delta y(i), \Delta z(i))$, where $i$ indexes the data points in the point cloud of the fabricated part, and $\Delta x(i)$, $\Delta y(i)$, and $\Delta z(i)$ are the deviations in the $x$, $y$, and $z$ directions for measurement point $i$. We apply a SOM with a $5 \times 5$ hexagonal structure to the point cloud data, which yields 25 clusters (cells). These 25 hexagonal cells represent different types of geometric deviations and capture the differences among them in terms of magnitude and direction. The number $n_k$ in each cell is the number of deviation data points associated with cell $k$ (e.g., $n_1$ is the number of point measurements in cluster 1). Some cells can be empty, and others may be combined into one if they are highly correlated; this is done in an unsupervised fashion by the SOM algorithm. In the end, low correlation between points in adjacent cells results from a difference in the magnitude and/or direction of deviation. An example of the SOM output can be found in Fig. 8.
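The procedure described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: the learning-rate and neighborhood schedules, the initial weight range, the iteration count, and all function names are assumptions, the input is synthetic, and the merging of highly correlated cells mentioned in the text is not reproduced.

```python
import math
import random


def hex_positions(rows, cols):
    """2D coordinates of cells on a hexagonal lattice (odd rows shifted by half a cell)."""
    return [(c + 0.5 * (r % 2), r * math.sqrt(3) / 2)
            for r in range(rows) for c in range(cols)]


def best_matching_unit(weights, x):
    """Index of the cell whose weight vector is closest to x (squared Euclidean distance)."""
    return min(range(len(weights)),
               key=lambda k: sum((wi - xi) ** 2 for wi, xi in zip(weights[k], x)))


def train_som(data, rows=5, cols=5, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Online competitive learning with a Gaussian neighborhood that shrinks over time."""
    rng = random.Random(seed)
    pos = hex_positions(rows, cols)
    # Random initial weights, one 3-vector per cell (the range is an assumption).
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in pos]
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)               # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.3   # shrinking neighborhood radius
        x = rng.choice(data)
        c = best_matching_unit(weights, x)
        for k, w in enumerate(weights):
            d2 = (pos[k][0] - pos[c][0]) ** 2 + (pos[k][1] - pos[c][1]) ** 2
            h = math.exp(-d2 / (2.0 * sigma ** 2))
            for i in range(3):
                w[i] += lr * h * (x[i] - w[i])
    return weights


def cell_counts(weights, data):
    """n_k: number of deviation vectors assigned to each cell k."""
    counts = [0] * len(weights)
    for x in data:
        counts[best_matching_unit(weights, x)] += 1
    return counts


# Synthetic stand-in for (dx, dy, dz) rows; NOT data from the paper.
rng = random.Random(1)
data = [(rng.gauss(-0.1, 0.05), rng.gauss(0.2, 0.05), rng.gauss(0.0, 0.02)) for _ in range(500)]
data += [(rng.gauss(0.15, 0.05), rng.gauss(-0.1, 0.05), rng.gauss(0.3, 0.05)) for _ in range(500)]

weights = train_som(data)
counts = cell_counts(weights, data)
print(counts)  # 25 values; cells with a count of 0 are empty
```

The counts always sum to the number of input points, so each $n_k$ can be read directly as the share of the scan that exhibits the deviation type of cell $k$.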

The size of the SOM model, characterized by the number of cells, governs the accuracy and efficiency of the clustering algorithm. A higher-order model with a larger number of cells is capable of capturing the subtle differences among various types of geometric deviations. However, it also results in longer computation time and tends to overfit the data, making the results sensitive to noise. By contrast, a lower-order model is faster and captures the major types of geometric deviations, but it may miss subtle changes in geometric deviations. The selection of the SOM model depends on the availability of computational resources and on the application. In this work, we choose a $5 \times 5$ hexagonal SOM, resulting in a maximum of 25 ($= 5 \times 5$) clusters. This resolution is reasonable for identifying the major types of geometric deviations and does not cause a significant computational burden. For instance, increasing the number of clusters to 36 ($= 6 \times 6$) increases the computation time to 1200 s, compared to 800 s for 25 clusters; this further increases to 1800 s when 49 ($= 7 \times 7$) clusters are implemented. Using a $7 \times 7$ SOM also leads to spurious segmentation of the data.

The training of the SOM model is influenced by the structure of the point cloud data. Selecting the data samples to initiate the SOM network is difficult because the current dataset is too large to process all data points for each treatment condition at once, and the dataset has 12 distinct experimental point clouds (i.e., 12 processing conditions/parts) with different characteristics. One existing approach to SOM training is to present every row of data (data-row) to the network in order, from the first to the last. This approach suffers from the following drawback: the first incoming data-rows have an advantage because the SOM is sensitive to its initial exposures. Another approach is to pick each data-row presented to the network at random. This approach suffers from the uncertainty that results from not exposing the data-rows in an equitable manner. To overcome these challenges, the network is exposed to 200 batches of 5000 data-rows, with each batch drawn randomly from one of the process conditions (also selected randomly) in Table 2.
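The batch exposure scheme can be sketched as follows. The function name, the dictionary keyed by (temperature, infill), and the choice to draw rows without replacement within a batch are assumptions made for illustration; the source specifies only the number of batches, the batch size, and the two levels of random selection.

```python
import random


def sample_training_batches(point_clouds, n_batches=200, batch_size=5000, seed=0):
    """Yield (condition, batch) pairs; each batch is drawn from one randomly chosen condition."""
    rng = random.Random(seed)
    conditions = list(point_clouds)
    for _ in range(n_batches):
        condition = rng.choice(conditions)   # process condition selected at random
        cloud = point_clouds[condition]
        # Rows drawn without replacement; use the whole cloud if it is smaller than a batch.
        batch = rng.sample(cloud, min(batch_size, len(cloud)))
        yield condition, batch


# Tiny synthetic clouds keyed by (t_e in deg C, I_f in %); NOT data from the paper.
rng = random.Random(42)
point_clouds = {
    (225, 100): [(rng.gauss(0, 0.1), rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(300)],
    (230, 80): [(rng.gauss(0, 0.05), rng.gauss(0, 0.05), rng.gauss(0, 0.05)) for _ in range(300)],
}

batches = list(sample_training_batches(point_clouds, n_batches=10, batch_size=50))
print(len(batches), "batches; first batch drawn from condition", batches[0][0])
```

With the values stated in the text (200 batches of 5000 data-rows), the network sees 1,000,000 rows in total, and no single condition or file position is systematically presented first.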

## Results: Clustering Additive Manufacturing Point Cloud Data Using Self-Organizing Maps

### Characterizing Deviations Types.

Each cell in Fig. 8 represents a type of geometric deviation, and the number in each cell is the number of data points belonging to that cell. The results of the SOM clustering for two combinations of process parameters are shown in Fig. 8. We apply a $5 \times 5$ SOM to the point cloud data to profile the types of geometric deviations for each fabricated part. Figure 8 illustrates that the part fabricated with process condition ($t_e$ = 225 °C, $I_f$ = 100%) has more clusters than the part fabricated with process condition ($t_e$ = 230 °C, $I_f$ = 80%); this means that the former condition results in more types of deviations in terms of direction and magnitude. The significance of this result is twofold: (1) the SOM can provide an informatics indicator of the overall geometric accuracy of the part fabricated using each combination of process parameters and (2) the deviations of a part associated with a specific combination of process parameters can be characterized by the critical types of deviations, which can significantly reduce the amount of data needed.

Self-organizing map results provide an intuitive representation of the types of deviations for each combination of process parameters. Figure 9(a) depicts different types of geometric deviations, based on SOM clustering, from parts fabricated using two sets of process conditions: circles denote the process condition ($t_e$ = 225 °C, $I_f$ = 100%) and squares denote the process condition ($t_e$ = 230 °C, $I_f$ = 100%). The horizontal and vertical axes represent the deviations in the *x* and *y* build directions, respectively. The size of the markers represents the magnitude of deviation in the *z* direction; the larger the marker, the worse the part consistency. Figure 9(a) shows that the process condition ($t_e$ = 225 °C, $I_f$ = 100%) leads to shrinkage in the *x* direction and expansion in the *y* direction because multiple clusters are observed in the second quadrant (i.e., negative deviation in *x* and positive deviation in *y*), whereas the process condition ($t_e$ = 230 °C, $I_f$ = 100%) results in the part being larger in the *x* direction (closer to nominal).
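The quadrant reading used above can be made explicit with a small helper. This is an illustrative sketch: the function, the interpretation table, and the cluster centers are hypothetical and are not values read from Fig. 9; the only convention taken from the text is that a negative deviation indicates shrinkage and a positive deviation indicates expansion.

```python
def quadrant(dx, dy):
    """Return the quadrant (1-4) of a point in the (dx, dy) plane, or 0 if it lies on an axis."""
    if dx == 0 or dy == 0:
        return 0
    if dx > 0:
        return 1 if dy > 0 else 4
    return 2 if dy > 0 else 3


# Reading of each quadrant in terms of the sign convention discussed in the text.
INTERPRETATION = {
    0: "no deviation along at least one axis",
    1: "expansion in x, expansion in y",
    2: "shrinkage in x, expansion in y",
    3: "shrinkage in x, shrinkage in y",
    4: "expansion in x, shrinkage in y",
}

# Hypothetical cluster centers (dx, dy); NOT values read from Fig. 9.
centers = [(-0.12, 0.20), (0.03, -0.25), (0.05, 0.04)]
for dx, dy in centers:
    q = quadrant(dx, dy)
    print((dx, dy), "-> quadrant", q, ":", INTERPRETATION[q])
```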

The outcome of such an analysis is that once the types of geometric deviations for specific processing conditions are profiled, actions such as CAD compensation and machining can be taken to improve the geometric accuracy. For instance, to improve the geometric accuracy of parts fabricated using the process condition $t_e$ = 225 °C and $I_f$ = 100%, the users can provide an allowance in the CAD model in the *x* direction and scale the model down in the *y* direction to compensate for the shrinkage and expansion in the respective directions; for $t_e$ = 230 °C and $I_f$ = 100%, the CAD model can be scaled down in the *y* direction. We also note that both process conditions result in an outlier cluster in the fourth quadrant caused by minor expansion in the *x* direction and significant shrinkage in the *y* direction. Similarly, Fig. 9(b) shows the profiles of geometric accuracy for parts printed using the process conditions ($t_e$ = 225 °C, $I_f$ = 80%) (circle) and ($t_e$ = 230 °C, $I_f$ = 80%) (diamond). The quality of the parts printed using the process conditions depicted in Fig. 9(b) is better than that of the parts printed using the process conditions in Fig. 9(a) because, in general, the geometric deviations are more consistent in all directions.

While the SOM-based analysis in Fig. 9 was able to clearly identify the types of geometric deviations, the visual characteristics could not distinguish subtle differences in the geometric deviations of those parts [13]. This corroborates our hypothesis that SOM-based analysis of data links geometric accuracy to specific process conditions which is not possible with traditional approaches. Furthermore, the SOM clustering approach identifies the types of geometric deviations specific to the part designs.

### Characterizing Overall Geometric Accuracy.

We next characterize the overall geometric integrity of a part by taking into account the number of data points in each cluster. For any cluster $C_j$, we define the weighted cluster deviation as $\bar{\Delta}_j = (n_j/N)\Vert \Delta_j \Vert$, for $j = 1, \ldots, m$, where $\Delta_j$ represents the center of cluster $C_j$. Hence, $\bar{\Delta}_j$ captures the deviation magnitude of cluster $C_j$ weighted by the number of data points in the cluster. If a relatively high $\bar{\Delta}_j$ value is observed for a certain cluster, the implications are twofold: (1) the magnitude of the corresponding type of geometric deviation is significantly high and (2) the number of points with this type of geometric deviation is relatively large, compared to geometric deviations in other directions. The average magnitude of deviation for the whole part is calculated as $\xi = (1/m)\sum_{j=1}^{m}(n_j/N)\Vert \Delta_j \Vert$, which can be used as an indicator of the part geometric accuracy.
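The two quantities defined above, the weighted cluster deviation and the average magnitude of deviation $\xi$, can be computed as in the following minimal sketch. It assumes that each cluster center is a 3D deviation vector (in mm), that the count for a cluster is the number of data points assigned to it, and that $N$ is the total number of data points; the function names and the example values are hypothetical and are not taken from the experiments.

```python
import math


def weighted_cluster_deviations(centers, counts):
    """Return (n_j / N) * ||Delta_j|| for every cluster j."""
    total = sum(counts)  # N, assumed to be the total number of data points
    return [
        (n / total) * math.sqrt(sum(c * c for c in center))
        for center, n in zip(centers, counts)
    ]


def overall_deviation(centers, counts):
    """Return xi, the mean of the weighted cluster deviations over m clusters."""
    weighted = weighted_cluster_deviations(centers, counts)
    return sum(weighted) / len(weighted)


# Hypothetical example with m = 2 clusters (values are not from the paper).
centers = [(3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]  # cluster centers, mm
counts = [25, 75]                             # points per cluster

print(weighted_cluster_deviations(centers, counts))  # [1.25, 0.75]
print(overall_deviation(centers, counts))            # 1.0
```

In this toy case the first cluster has a large deviation magnitude (5 mm) but few points, while the second has a small magnitude (1 mm) but many points, so the two weighted deviations end up comparable.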

We plot the $\bar{\Delta}_j$ values for various combinations of process parameters ($t_e$ = 225 °C, 230 °C and $I_f$ = 80%, 90%, 100%), as shown in Fig. 10. The values of $\bar{\Delta}_j$ are represented by light to dark blocks. Lighter blocks denote lower values of $\bar{\Delta}_j$, with a minimum value of 0, whereas darker blocks represent higher values. For example, the process condition ($t_e$ = 225 °C, $I_f$ = 100%) yields a larger number of clusters with significant values of $\bar{\Delta}_j$ (dark blocks). This means that parts fabricated using this combination of process parameters exhibit multiple types of major deviations, each of which includes a large number of data points. However, the process condition with a lower infill percentage ($I_f$ = 90%) results in far fewer clusters with significant $\bar{\Delta}_j$ values. Therefore, the overall geometric accuracy of parts printed using the process condition ($I_f$ = 90%) is higher than that of parts printed using the process condition ($I_f$ = 100%). The average magnitude of geometric deviation, $\xi$, is also presented. The value of $\xi$ is much higher for the process condition ($I_f$ = 100%) than for the process condition ($I_f$ = 90%).

Comparing the $\bar{\Delta}_j$ maps (as shown in Fig. 10) across various combinations of process parameters allows us to establish a relationship between processing conditions and the geometric accuracy of the part. The process condition ($t_e$ = 230 °C, $I_f$ = 80%) yields the minimum number of clusters signifying different types of deviations. In addition, the process conditions ($t_e$ = 225 °C, $I_f$ = 80%) and ($t_e$ = 225 °C, $I_f$ = 90%) are close to the best process condition. The geometric accuracy of as-built parts for each processing condition can also be ordered according to the $\xi$ values. The processing condition ($t_e$ = 230 °C, $I_f$ = 80%) results in the lowest $\xi$ value among the six combinations of process parameters and thus in the highest geometric accuracy. This finding is remarkably consistent with Tootooni et al., who assessed the geometric integrity of parts based on SGT [38]. The SGT method provides an overall description of the geometric accuracy using graph theoretic quantifiers. The presented approach additionally characterizes the profiles of different parts in terms of the types and magnitudes of deviation, as shown in Fig. 10. This information is not captured in the recent work by Tootooni et al. [38].

### Data Reduction Based on Critical Clusters.

We reduce the amount of point cloud data needed to characterize the part geometric accuracy based on the top-*k* critical clusters, i.e., the clusters with the highest magnitude of deviation. We denote by $\tau$ the threshold of deviation magnitude for the top-*k* clusters, which selects the clusters whose deviation magnitude is at least $\tau$, i.e., $\Vert \Delta_j \Vert \geq \tau$ for $j = 1, \ldots, m$. Figure 11 shows the deviation magnitude of all SOM clusters ordered from the highest to the lowest. The horizontal axis shows the label of the SOM clusters. The vertical axis shows the magnitude of deviation (mm). When $\tau$ = 0 mm, all the clusters are selected. On the other hand, when $\tau$ = 4 mm, no clusters are selected. The choice of $\tau$ depends on the need for data compression and the requirements of geometric tolerance. We choose $\tau$ = 1.5 mm (green line) as the threshold for data reduction, which selects the top-four clusters, namely clusters 21, 16, 22, and 11.

We find that (1) the selected clusters contain only 2.4% of the total data points and (2) the ranking of geometric accuracy based on the top-four clusters is completely consistent with the ranking based on the full data set, as shown in Fig. 10. Due to the limited length of the paper, we do not present the ranking of parts based on the top-four clusters here. This result (Fig. 11) means that we only need 2.4% of the data points to accurately characterize the geometric accuracy of a part. This will tremendously accelerate the process of part scanning by focusing on the areas associated with the top-four clusters.
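The threshold-based selection of critical clusters, together with the fraction of data points they retain, can be sketched as follows. This is a minimal illustration under stated assumptions: cluster centers are stored per cluster label as 3D deviation vectors (mm), a cluster is selected when its deviation magnitude is at least the threshold, and the labels, centers, and counts below are hypothetical rather than the values behind Fig. 11.

```python
import math


def select_critical_clusters(centers, counts, tau):
    """Select clusters whose deviation magnitude is at least tau (mm).

    Returns the selected labels, ordered from the highest to the lowest
    magnitude, and the fraction of all data points they contain.
    """
    magnitudes = {
        label: math.sqrt(sum(c * c for c in center))
        for label, center in centers.items()
    }
    selected = sorted(
        (label for label, mag in magnitudes.items() if mag >= tau),
        key=lambda label: magnitudes[label],
        reverse=True,
    )
    fraction = sum(counts[label] for label in selected) / sum(counts.values())
    return selected, fraction


# Hypothetical clusters (labels, centers, and counts are illustrative only).
centers = {1: (2.0, 0.0, 0.0), 2: (0.0, 1.6, 0.0), 3: (0.5, 0.0, 0.0), 4: (0.0, 0.0, 0.1)}
counts = {1: 10, 2: 20, 3: 370, 4: 600}

selected, fraction = select_critical_clusters(centers, counts, tau=1.5)
print(selected, fraction)  # [1, 2] 0.03
```

Raising the threshold keeps fewer clusters and fewer points, so the threshold trades data compression against how much of the deviation profile is retained.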

We find that the selected top-four clusters correspond to critical locations of the part, such as corners, edges, and the interface between two structures. The areas identified by the top-four clusters are shown in Fig. 12. Figure 12(a) illustrates the deviation points in an exponentially shaped curve for cluster 22, whose heights range from −0.45 mm to 1.5999 mm, as illustrated in the front view by the red line. Figure 12(b) shows the deviation points in a circular-shaped curve for clusters 16 and 21, with heights varying from 7.3959 mm to 7.5834 mm. For cluster 11, we recommend scanning the deviation points in the pattern shown in Fig. 12(c). In this case, the height ranges from 3.6544 mm to 4.9834 mm. This analysis significantly reduces scanning time and cost in the AM process. Only 2.4% of the areas of the part need to be scanned to capture the difference between parts produced under different conditions. Surface scanning that focuses on the critical locations identified by the top-four clusters will significantly improve the speed of quality inspection, while capturing the main geometric deviations associated with the fabricated parts.

### Verification With K-Means Unsupervised Clustering.

We applied the K-means unsupervised clustering method to validate the efficiency of the SOM clustering technique. The training procedure for the K-means model is similar to that of the SOM; the model is exposed to 200 batches of 5000 data rows, each batch drawn randomly from one of the (randomly) selected process conditions. Subsequently, the trained K-means model is applied to each process condition to generate the clustering map, which contains different types of geometric deviations. The clustering map for two combinations of process parameters is illustrated in Fig. 13. Similar to the results obtained from SOM, parts printed using the process condition ($t_e$ = 230 °C, $I_f$ = 90%) yield the minimum overall geometric deviation. Moreover, the preference ranking of parts contingent on the geometric integrity remains unchanged. However, the geometric deviations assessed by K-means are larger than those estimated from SOM clustering. This is because K-means clustering is more sensitive to noise in the data than SOM: K-means assigns the noisy observations (outliers) to the main clusters, which increases the magnitude of deviation ($\bar{\Delta}_j$), which in turn is directly related to the overall geometric deviation ($\xi$) [44]. Furthermore, the amount of point cloud data needed to characterize the part geometric accuracy based on the top-*k* critical clusters is slightly larger with K-means clustering. The top nine clusters require ∼3% of the total data points, compared to ∼2.4% for SOM.
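The batch-wise training procedure described above can be sketched as follows. This is a hedged illustration and not necessarily the implementation used in this work: the text does not state which K-means variant, initialization, or number of clusters was used, so the sketch assumes a common mini-batch update with a per-center learning rate of 1/count, centers initialized from randomly chosen data points, and sampling with replacement within each batch. The function names, the number of clusters `k`, and the synthetic data are hypothetical; only the defaults of 200 batches and 5000 rows come from the text.

```python
import random


def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])),
    )


def train_minibatch_kmeans(data_by_condition, k, n_batches=200, batch_size=5000, seed=0):
    """Train K-means on batches, each drawn from one randomly chosen condition."""
    rng = random.Random(seed)
    conditions = sorted(data_by_condition)
    pooled = [row for name in conditions for row in data_by_condition[name]]
    centers = [list(p) for p in rng.sample(pooled, k)]  # assumed initialization
    counts = [0] * k
    for _ in range(n_batches):
        rows = data_by_condition[rng.choice(conditions)]       # pick a condition
        batch = [rng.choice(rows) for _ in range(batch_size)]  # with replacement
        assignments = [nearest(p, centers) for p in batch]     # assign, then update
        for p, j in zip(batch, assignments):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-center learning rate
            centers[j] = [(1 - eta) * c + eta * x for c, x in zip(centers[j], p)]
    return centers


# Small synthetic deviation data (mm) for two hypothetical process conditions.
data_by_condition = {
    "condition_A": [(-1.0 + 0.01 * i, 1.0, 0.0) for i in range(20)],
    "condition_B": [(1.0, -1.0 - 0.01 * i, 0.1) for i in range(20)],
}
centers = train_minibatch_kmeans(data_by_condition, k=2, n_batches=20, batch_size=50)
print(centers)
```

Because every update is a weighted average of a center and a data point, the trained centers always remain inside the range of the observed deviations. The demonstration uses small batches only to keep the pure-Python run time short.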

## Conclusions

This paper proposes using an unsupervised machine learning approach called the self-organizing map (SOM) to assess the geometric accuracy of AM parts from a large dataset of laser-scanned 3D coordinates. The laser-scanned data are compared to the original CAD model, and the resultant difference is used to characterize the geometric deviations of the as-built part. The central idea is that the clusters identified by the SOM are indicative of the magnitude (severity) and direction of geometric deviations. The approach was tested with experimental data obtained from laser scanning of parts made with the FFF AM process by varying the infill percentage ($I_f$) and extruder temperature ($t_e$). The key findings of this work are summarized as follows:

- (1) Laser-scanned 3D point cloud data were used to assess the part geometric accuracy. Thus far, GD&T characteristics have been primarily used to quantify the geometric accuracy of AM parts; this traditional approach was shown to be ineffective. GD&T characteristics need to be carefully customized to distinguish between parts made using different FFF AM process conditions. Our approach provides a data-driven framework to profile the types of geometric deviations, which are uniquely defined by the design and process conditions. The overall geometric accuracy of the part is represented by a SOM map, extracted based on millions of data points.

- (2) The SOM-derived clusters were able to discriminate between parts made with different FFF process conditions. The SOM-based analysis also recognized the magnitude and direction of deviations. It was observed that the geometric accuracy of FFF parts becomes worse with the increase in infill percentage ($I_f$). This approach helps to advance the understanding of the effects of process parameters on not only the geometric accuracy of the whole part but also specific types of deviations. This is crucial for establishing the causal relationship between process/design parameters and geometric deviations of as-built parts. Once a specific type of deviation is identified during quality inspection, its root cause in the design space can be immediately pinpointed and adjusted to improve the part quality.

- (3) The major clusters of the SOM analysis account for only 2.4% of the total data points. In other words, instead of scanning the entire part and processing the whole dataset of point cloud coordinates (millions of data points), which may be expensive and time-consuming, a small portion of the surface area recommended by the critical clusters can be scanned and analyzed. This will drastically increase the scanning and processing speed (up to 50 times faster) of the AM specimens. This will also result in a higher rate of quality inspection, without reducing the accuracy of geometric integrity characterization, and eventually improve the quality and repeatability of AM parts.

Although the proposed clustering-based method was tested on parts fabricated using an FFF system, the developed method for profiling geometric deviations can be applied to characterize the geometric accuracy of parts fabricated using other AM systems, such as laser powder bed fusion and directed energy deposition, or even other manufacturing processes. The insights from this work open the following avenues for future research:

- (1) Developing not only an understanding of the effect of process parameters on geometric accuracy after the build, but also prescriptive compensation and design rules to avoid shape deviation. Furthermore, the analysis presented herein must be extended to more complex shapes and finer features.

- (2) Incorporating in situ monitoring and assessment of geometric integrity using sensors, such as structured light scanners built into the machine.

- (3) Integrating machine learning techniques, such as those proposed in this work, with process modeling to understand why geometric inaccuracy occurs in additive manufacturing.

## Acknowledgment

The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein. The experimental data for this work are from the doctoral dissertation [45] and MS thesis research [39] of Dr. M. Samie Tootooni and Mr. Ashley Dsouza, respectively, under the guidance of one of the authors (PKR).

## Funding Data

- Army Research Laboratory (Grant No. W911NF-15-2-0025).