On topological data analysis for structural dynamics: an introduction to persistent homology

Topological methods can provide a way of proposing new metrics and methods of scrutinising data that may otherwise be overlooked. In this work, a method of quantifying the shape of data, via a topic called topological data analysis, will be introduced. The main tool within topological data analysis (TDA) is persistent homology: a method of quantifying the shape of data over a range of length scales. The required background and a method of computing persistent homology are briefly discussed in this work. Ideas from topological data analysis are then used in nonlinear dynamics to analyse some common attractors, by calculating their embedding dimension and then assessing their general topologies. A method is also proposed that uses topological data analysis to determine the optimal delay for a time-delay embedding. TDA is also applied to a Z24 Bridge case study in structural health monitoring, where it is used to scrutinise different data partitions, classified by the conditions at which the data were collected. A metric from topological data analysis is used to compare data between the partitions. The results presented demonstrate that the presence of damage alters the manifold shape more significantly than the effects present from temperature.


Introduction
Topological methods are rarely used in structural dynamics, although considering the structure and topology of observed data may be useful. Topological methods can provide new metrics and methods of scrutinising data; the most rudimentary and powerful of these, persistent homology, will be discussed and used extensively in this paper. By applying topological methods, an understanding of the topological structure of data can be used to formulate arguments and to develop an understanding of the effect of system parameters on the shape of a manifold. This statement is especially true when working with higher-dimensional data sets, where an intuitive visual understanding of a manifold is not available. By using topological methods, such an understanding can be quantified. Engineering data sets are often embedded in higher-dimensional spaces, and their topological information is often not well explored, leaving out potentially important and insightful information.
Topological data analysis (TDA) is a recently-developed and fast-growing field that has found its way into many areas of science and engineering. The general idea of TDA is to apply concepts from algebraic topology to data sets. The primary focus of TDA is to determine the shape of the manifold in which sampled data are embedded. This process is achieved by identifying holes, enclosed cavities and their higher-dimensional analogues within the data structures. From these sampled data, an approximation to the topological structure can be calculated by use of simplicial complexes, which are higher-dimensional analogues of graphs. From the simplicial complexes, the persistent homology can be calculated; this is then used to understand the topological structure of the data. The persistent homology is invariant for each data set, and can be used to identify the data set, much like a fingerprint.
After the simplicial complexes have been constructed for some point data, they can be manipulated using ideas from algebraic topology, in order to determine algebraic groups that capture information about the shape and structure of the simplicial complex; these groups are topological invariants. For any simplicial complex, algebraic topology can deduce a property called the homology, which encodes information about its number of k-dimensional holes. A generalisation of homology will be discussed here, with respect to the pioneering work of Edelsbrunner [1,2], where the homology is considered over a range of simplicial complexes; this is called the persistent homology, aptly named, as it uncovers how the homology persists over a range of scales.
The field of algebraic topology is well studied, but the application of its ideas to discrete point clouds has contributed to a boom in computational topology. Many standard packages exist for analysing data with topological methods, GUDHI [3] and Ripser [4] being the most influential throughout this work.
The layout of the paper is as follows: Section 2 will be devoted to persistent homology and its significance, and will provide an intuitive understanding. Some use cases of persistent homology will also be discussed. Section 3 will introduce and analyse the topology of some common attractors from the field of nonlinear dynamics. Section 4 will then go on to look at the classic Z24 Bridge structural health monitoring (SHM) case study, and explore the role of topology as a metric between partitions of the data set.

Manifolds
Manifolds are continuous surfaces from which the data are assumed to be sampled. It is the aim of TDA to extract topological information about the underlying manifold from the sampled data points. The manifold shape is unknown prior to analysis, and persistent homology will identify features within the manifold over a range of length scales. Thus, by understanding the shape of the sampled data, it is the conjecture of TDA that the shape of the manifold is also understood.
Formally, a manifold is a space that is locally homeomorphic to some n-dimensional Euclidean space, R^n. In this work, manifolds are primarily thought of as the underlying space from which the generated or collected data are sampled. Throughout this work, there will be a recurring idea that a change in the parameters of a dynamic system may change the shape of the associated manifold; therefore, the topology can potentially be used to identify a change in the system.

Simplicial Complex
Simplicial complexes will be used as a way of attributing quantifiable shape to the data; they can be thought of as higher-dimensional analogues of graphs, giving a way of encoding connections between vertices. In TDA, the vertices of the simplicial complexes are the observed data points. Simplicial complexes can be analysed to output the homology of the data and, following this, the persistent homology. The homology and persistent homology are key topological invariants that can be used to describe the structure of the data.
A simplicial complex is a structure made up of fundamental building blocks called simplices, the first four types of which are shown in Fig. 1. Each vertex in a simplex is fully connected to all the other vertices, and the space enclosed by the vertices is part of that simplex. For instance, ∆_2 encloses a two-dimensional area and ∆_3 encloses a three-dimensional volume. This sequence can be generalised to ∆_k, which encloses a k-dimensional space between (k + 1) fully-connected vertices. A filtration of a simplicial complex, K, is a nested sequence of sub-complexes, such that,

    ∅ = K_0 ⊆ K_1 ⊆ . . . ⊆ K_m = K [11],

where ∅ is the empty complex, i.e., the complex containing no points.
There are many ways to construct a simplicial complex from point data. For simplification, only one method will be discussed within this paper, the Vietoris-Rips (VR) complex [18]. The VR complex can be constructed for point data present in a metric space.
A metric ∂_X is defined on a set X as a map that takes two elements of X to a non-negative real number, giving the associated distance between the two elements. For a set to be deemed a metric space, it must satisfy the following axioms: 1. for any x, y ∈ X, ∂_X(x, y) = ∂_X(y, x); 2. for any x, y ∈ X, ∂_X(x, y) ≥ 0, with ∂_X(x, y) = 0 iff x = y; 3. for any x, y, z ∈ X, ∂_X(x, z) ≤ ∂_X(x, y) + ∂_X(y, z).
If these axioms are satisfied, (X, ∂_X) forms a metric space, where ∂_X is the metric on X.
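As an illustrative sketch (not taken from the paper), the three axioms can be checked numerically for the familiar Euclidean metric on a small point set; the function name `euclidean` is hypothetical:

```python
# Sketch: verifying the metric-space axioms for the Euclidean metric
# on a small point set. Purely illustrative.
import itertools
import math

def euclidean(x, y):
    """Euclidean metric on R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]

for x, y in itertools.product(points, repeat=2):
    assert euclidean(x, y) == euclidean(y, x)         # axiom 1: symmetry
    assert euclidean(x, y) >= 0                       # axiom 2: non-negativity
    assert (euclidean(x, y) == 0) == (x == y)         # axiom 2: zero iff identical
for x, y, z in itertools.product(points, repeat=3):
    # axiom 3: triangle inequality (small tolerance for floating point)
    assert euclidean(x, z) <= euclidean(x, y) + euclidean(y, z) + 1e-12

print(euclidean((0.0, 0.0), (3.0, 4.0)))  # 5.0
```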
An open ball is defined on a metric space, (X, ∂ X ). For a point a ∈ X and ε > 0, the subset of X consisting of all the points x ∈ X such that ∂ X (a, x) < ε is referred to as the open ball of radius ε at a.
For the VR complex VR_ε(X, ∂_X), let (X, ∂_X) be a finite metric space and ε ∈ R_{>0} be a fixed value that determines the scale of the VR complex [19], where X is the set of vertices and ∂_X is the metric on X. The VR complex is defined by the condition that a simplex σ ⊆ X is included in VR_ε(X, ∂_X) if and only if ∂_X(x, y) < ε for every pair of vertices x, y ∈ σ. As ε goes from 0 to +∞, a nested sequence of complexes defines the Vietoris-Rips filtration [11].
Figure 2: The process of constructing a VR complex.
This process is depicted in Fig. 2 for some randomly-sampled data and an arbitrary value of ε, which can be seen as the radius of the balls. The existence of a simplex is determined by how the balls intersect between the vertices. For a VR complex, a simplex between some set of vertices is formed if the Euclidean distance between all those vertices is less than ε.
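The construction above can be sketched in a few lines: a k-simplex is included whenever all pairwise distances among its (k + 1) vertices fall below ε. This is a minimal illustrative implementation (function names are hypothetical), not the optimised routines used by packages such as GUDHI or Ripser:

```python
# Minimal sketch of a Vietoris-Rips complex at a fixed scale eps.
import itertools
import math

def vietoris_rips(points, eps, max_dim=2):
    """Return the simplices of the VR complex up to dimension max_dim."""
    n = len(points)
    simplices = [(i,) for i in range(n)]            # 0-simplices: the vertices
    for k in range(1, max_dim + 1):
        for combo in itertools.combinations(range(n), k + 1):
            # include the simplex iff every pairwise distance is < eps
            if all(math.dist(points[i], points[j]) < eps
                   for i, j in itertools.combinations(combo, 2)):
                simplices.append(combo)
    return simplices

pts = [(0, 0), (1, 0), (0.5, 0.9)]
print(vietoris_rips(pts, eps=1.2))
# [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
```

At eps = 1.0 the edge between the first two points (distance exactly 1) is excluded, so the 2-simplex never forms; the filtration of Section 2 is obtained by sweeping eps upwards.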

Persistent Homology
Before delving into persistent homology, the homology of a topological space needs to be defined. Many good introductions and summaries of homology already exist; the interested reader may wish to consult [6,7,9-11,20]. In very quick terms, the homology is calculated by computing successive boundary operations on a chain complex of a topological space. Homology is essentially the quantification of the number of voids present in a topological space or, in the case of TDA, a simplicial complex. The homology groups, H_k(X), are invariants of the data set X, where k refers to the dimension of the homology group. Generally, the k-th homology group encodes information about the number of k-dimensional holes in the data. Under the rules of topology, holes (voids) cannot be created or destroyed by homeomorphisms. Therefore, a simplicial complex can be characterised by the properties underpinned by the homology. The homology can be used to categorise and compare simplicial complexes and, by extension, data sets.
From the homology, the Betti numbers are defined as the rank of the homology groups. If the Betti numbers for two topological spaces are different, these spaces are not topologically identical, meaning a continuous bijective map between the spaces does not exist. The zeroth Betti number, β 0 , is the rank of the zeroth homology group, H 0 (X), and refers to the number of connected sets in the topological space. The first Betti number, β 1 , is the rank of the first homology group, H 1 (X), and refers to the number of non-contractible holes present in the space. The second Betti number, β 2 , refers to the number of enclosed volumes in the topological space. This analogy carries on further for higher dimensions.
This discussion now raises the question: which length scale ε is representative of the topology of the data? When constructing the VR complexes for the same data set, different values of ε will result in different values for the Betti numbers. The hyper-parameter ε determines the Betti numbers for that specific instance of some point-cloud data. Additionally, when a feature present within the data is at a length scale less than ε, this feature will not be expressed, as ε will span the feature. This reasoning means that only topological properties that are described at a length scale greater than ε can be captured. A problem arises here, as usually the feature scale is not known prior to analysis, and a manifold may have many multi-scale features.
The answer to this problem is to vary ε and see how the Betti numbers evolve and persist. When varying ε, a filtered simplicial complex is formed. This can be thought of as a chain or sequence of simplicial complexes, with n disconnected points at the beginning and a fully-connected (n − 1)-simplex at the end. The next simplicial complex in the filtration is the previous simplicial complex plus the next simplex to be formed as ε increases. Fig. 3 shows some simplicial complexes in this process. A filtered simplicial complex is an ordering of simplices that shows how they evolve as a distance scale is increased. From this, one can form an idea of how the topology evolves over the filtration.
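As a small sketch of this evolution, the zeroth Betti number (number of connected components) of the VR complex can be tracked as ε grows, using a union-find structure over the edges; the helper names are illustrative:

```python
# Sketch: beta_0 of the VR complex at scale eps, via union-find.
import itertools
import math

def beta0(points, eps):
    """Number of connected components of the VR complex at scale eps."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i
    for i, j in itertools.combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) < eps:
            parent[find(i)] = find(j)       # an edge merges two components
    return len({find(i) for i in range(len(points))})

pts = [(0, 0), (1, 0), (5, 0), (6, 0)]
for eps in (0.5, 1.5, 6.0):
    print(eps, beta0(pts, eps))
# beta_0 drops from 4 to 2 to 1 as eps increases
```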
Obtaining the homology for a single value of ε provides very limited information; this notion is almost redundant, because of potentially-varying feature length scales in the manifold. For this reason, it is vital to consider which homological features persist as ε is varied. The goal of persistent homology is to track the homology classes as ε is varied. The process of varying ε does not bias any disk size, as all sizes are considered. This process gives an initial value, ε_min, at which a specific homological feature is born, and a value, ε_max, beyond which the feature is no longer present in the simplicial complex. This range of values, [ε_min, ε_max], is called the persistence interval for that homological feature. Each persistence interval is attributed a Betti number. The set of all persistence intervals is then descriptive of the manifold, giving information about the dimension in which a hole exists in the data and the range of values over which it persists.
The persistence intervals obtained can be represented in two ways: barcodes or persistence diagrams. In a barcode representation, the x-axis refers to the value of ε. As ε increases, the barcode shows which features persist. The set of intervals are plotted with each interval beginning at its ε_min and ending at its ε_max. The colour of an interval in the barcode refers to the Betti number, β_k [15]. The value of the y-axis can simply be thought of as an indexing of the intervals in the barcode.
The other method of visualising the set of persistence intervals is the persistence diagram, or birth-death diagram. In this representation, ε_min is plotted on the x-axis and ε_max on the y-axis, with each interval represented by the point (ε_min, ε_max). Intuitively, there is a line defined by y = x, below which points will not be plotted; this means that the grey region in Fig. 4 will never contain any points. The line y = x has the interpretation that a feature must first exist before it can die. Reading these diagrams is as intuitive as reading the barcodes; the vertical height of a point above the line y = x is analogous to the length of the interval, that is, the further a point is from the line y = x, the more persistent the feature. An example of a birth-death diagram can be seen in Fig. 4.
On both the barcode and birth-death diagrams, it can be seen that one feature persists to ∞; this is because there will always be a fully-connected simplex that persists to infinity. There will be a value, ε_fc, that results in a fully-connected simplex, where every vertex is connected to every other vertex. For values ε > ε_fc, the simplex will remain fully connected, and therefore this feature will continue to infinity. The homology with this infinite interval removed is called the reduced homology, which is required for use in many calculations.
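For the zeroth homology group, the persistence intervals can be computed with a Kruskal-style sweep: every component is born at ε = 0 and dies when an edge first merges it into another component; dropping the single infinite bar gives the reduced homology. The following is an illustrative sketch, not a production algorithm:

```python
# Sketch of H_0 persistence for a VR filtration: process edges in order
# of length; each merging edge kills one component, giving one finite
# interval. The one infinite bar (reduced homology) is omitted.
import itertools
import math

def h0_persistence(points):
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                 # this edge merges two components
            parent[ri] = rj
            deaths.append(length)
    return [(0.0, d) for d in deaths]   # n - 1 finite intervals

pts = [(0, 0), (1, 0), (5, 0)]
print(h0_persistence(pts))  # [(0.0, 1.0), (0.0, 4.0)]
```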
The space of barcodes actually forms a metric space [9]; the distance between two barcodes is a measure of their similarity. As the persistence intervals are invariant for a manifold, data manifolds can be represented by their persistent homology. This notion of a metric space allows one to compare the similarity of manifolds. Metrics between barcodes are well established; the one used in this paper is the p-Wasserstein distance [21].
Definition 2.1. Given two barcodes, B_1 and B_2, and p > 0, the p-Wasserstein distance, ∂_{W_p}, is given by,

    ∂_{W_p}(B_1, B_2) = min_φ [ Σ_{Z ∈ B_1} d_∞(Z, φ(Z))^p ]^{1/p},

where φ is a matching between B_1 and B_2, d_∞ is the supremum metric, and Z is a persistence interval in B_1 [9].
The Wasserstein distance is used to measure the similarity between two persistence barcodes, by computing a sum over their edge lengths. The p-Wasserstein distance is the minimal value achieved by a perfect matching between the points of the two diagrams, where the value of a matching is defined as the p-th root of the sum of all edge lengths to the power p [21].
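For very small diagrams, the matching can be found by brute force: each point may be matched to a point in the other diagram or to its projection onto the diagonal y = x, with edge costs measured in the sup-norm. This is an illustrative sketch only (the factorial search is impractical beyond a handful of points; real implementations solve an assignment problem):

```python
# Brute-force sketch of the p-Wasserstein distance between two tiny
# persistence diagrams, padding each diagram with diagonal "slots".
import itertools

def sup_dist(a, b):
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def diag_cost(a):
    # sup-norm distance from a point to its nearest diagonal point
    return abs(a[1] - a[0]) / 2

def wasserstein(B1, B2, p=2):
    n1, n2 = len(B1), len(B2)
    best = float("inf")
    for perm in itertools.permutations(range(n1 + n2)):
        cost = 0.0
        for i, j in enumerate(perm):
            a = B1[i] if i < n1 else None      # None = diagonal slot
            b = B2[j] if j < n2 else None
            if a is None and b is None:
                continue
            elif a is None:
                cost += diag_cost(b) ** p
            elif b is None:
                cost += diag_cost(a) ** p
            else:
                cost += sup_dist(a, b) ** p
        best = min(best, cost)
    return best ** (1 / p)

B1 = [(0.0, 2.0)]
B2 = [(0.0, 2.1)]
print(wasserstein(B1, B2))  # ~0.1: matching the two points directly is cheapest
```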

Calculating Fractal Dimension
The first application of persistent homology in this paper is in calculating fractal dimension, which is a measure of how much space a set occupies. To gauge the idea of the fractal dimension intuitively, consider that a single point is zero-dimensional, a line spanning two points is one-dimensional, the area enclosed by three points is two-dimensional, and so on. Now, suppose a curve consisting of purely one-dimensional elements has infinite length but is bounded within a finite region. This curve may be believed to occupy more space than something that is only one-dimensional, yet it does not completely fill the arbitrarily-defined space that bounds it. A logical conclusion is that the dimension of this curve is somewhere between one and two, depending on how efficiently it occupies this area.
The first notable work on quantifying fractal dimension was by Hausdorff [22], where a measure of roughness was conceived; this was later expanded upon by Falconer, and generalised for point clustering in [23]. Mandelbrot popularised the idea of fractal dimension in the notable work [24].
The two most common methods of quantifying fractal dimension are the Hausdorff and box-counting dimensions. The box-counting method exploits the relationship between dimension and scalability, via the difference in occupied space over different scales. The box-counting method is very popular because of its ease of computation. The method of using the persistence intervals, which is described later, more closely resembles the box-counting method.
To calculate the box-counting dimension, assume a space X. The smallest number of sets of diameter ε that can cover X is referred to as N(ε). The scale ε determines N(ε), and the dimension of X determines the rate of change of N(ε). As the scale ε is altered, it is expected that,

    N(ε) ≈ c ε^{−d_B},

for positive constants d_B and c; X is then said to have a box dimension of d_B. To solve for the dimension of the set, this equation can be rearranged to give,

    d_B = lim_{ε→0} [ log N(ε) / log(1/ε) ],

with the constant term, c, disappearing in the limit [23].
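In practice, for a finite 2D point sample, N(ε) is counted on a grid at several scales and d_B is estimated from the slope of log N(ε) against log(1/ε). A minimal sketch (illustrative names; the scales chosen here are arbitrary):

```python
# Sketch: box-counting dimension estimate for a finite 2D point sample.
import math
import random

def box_count(points, eps):
    """Number of eps-sized grid boxes containing at least one point."""
    return len({(math.floor(x / eps), math.floor(y / eps)) for x, y in points})

def box_dimension(points, scales):
    # least-squares slope of log N(eps) against log(1/eps)
    xs = [math.log(1 / e) for e in scales]
    ys = [math.log(box_count(points, e)) for e in scales]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

random.seed(0)
square = [(random.random(), random.random()) for _ in range(5000)]
print(box_dimension(square, [0.5, 0.25, 0.125, 0.0625]))  # ~2 for a filled square
```

As the text notes, the fit is only valid over a limited range of scales for a finite sample: once ε shrinks below the typical point spacing, every box holds at most one point and the estimate degrades.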
Inherently, fractals are infinitely-complex objects that are self-similar, often over an infinite range of scales. This presents a problem when working with finite point-data sets, as only finite information can be captured. When zooming in excessively, points become more sparse and the approximations inevitably become less exact. In reality, for finite sets, self-similarity may only be visible over a small number of scales. This can be seen later, in Fig. 9, where the self-similarity is clear over two length scales, after which the points become sparse.

Fractal Dimension - Persistent Homology
Persistent homology can be used to calculate the fractal dimension of point data sampled from a fractal. This idea originated from that of the minimal spanning tree (MST) [25,26]. The MST method was shown to be a viable method for calculating the fractal dimension of a set [27]. Numerous works have shown that the MST method is equivalent to using the zeroth homology group to calculate the fractal dimension [27-30]. Unlike the MST method, persistent homology can be generalised to higher-order homology groups, therefore returning more information regarding the topology of the fractal shapes.
Often, when analysing persistence data, the smaller intervals are discarded as noise, as they represent features that only persist for a short while. In the case of calculating fractal dimension, however, there is information contained in all of the persistence intervals: the idea here is to deduce the fractal dimension of n sampled points as n → ∞, by seeing how the lengths of the persistence intervals in each homology group vary. The shorter intervals provide a good measure of how the local geometry manifests in the finite random sample.
To calculate the fractal dimension from the persistent homology, the α-weighted sum, for α > 0, of the persistence intervals in a given dimension should be computed, as follows,

    E^i_α(x_1, . . ., x_n) = Σ_{I ∈ PH_i(x_1, . . ., x_n)} |I|^α,

where (x_1, . . ., x_n) are n points sampled from a manifold, most interestingly one that exhibits fractal behaviour; PH_i(x_1, . . ., x_n) is the i-th dimensional persistent homology group; and |I| is the length of an interval from PH_i. The α-weighted sum of the persistence intervals tracks the rate at which the topological noise decays and, similarly to the box-counting method, this can be linked to the fractal dimension [31]. Larger values of α give a greater weighting to longer intervals.
Definition 2.2. Let X be a bounded subset of a metric space and µ a measure on X. For all i ∈ N and a value α > 0, which is used to give a weighting to larger persistence intervals in the calculation [31], the persistent homology dimension can be defined as,

    dim^i_{PH}(µ) = α / (1 − β),

where,

    β = lim sup_{n→∞} [ log E( E^i_α(x_1, . . ., x_n) ) / log n ].

Here, lim sup refers to the limit superior of the sequence, the operator E denotes the expected value of a random variable, and E^i_α(x_1, . . ., x_n) is the α-weighted cumulative sum over the n points.
For the case when i = 0, the persistence intervals are equivalent to the lengths of the edges in the MST. The MST consists of a set of n vertices and a set of n − 1 edges, where each edge spans two vertices; this work is built on ideas from Kruskal [25] and Prim [26] in the 1950s. The edges in the MST are equivalent to the persistence intervals in the reduced zeroth homology group.
The MST approach formulates the problem in terms of graph theory, with V being the set of vertices and E being the edge set connecting the vertices. Two vertices are referred to as connected if a path joins them. A path is the successive joining of adjacent edges from one vertex to another, i.e., one can walk from one vertex to the other without leaving the path. If these edges form a closed loop, this is called a circuit. A graph that does not contain a closed circuit is called a tree. A spanning tree has an attribute called its length: the sum of the lengths of all the edges in the tree. The MST is the spanning tree that spans the points most efficiently, by minimising the total length of the edges. The algorithms used to calculate MSTs are very well optimised, meaning that competitive results for calculating fractal dimensions can be obtained from their usage [31]. One can use the lengths of the edges in an MST to approximate the dimension of the manifold from which the vertices are sampled; the asymptotic behaviour of the α-weighted sum is studied to calculate the fractal dimension.
This property is useful, as algorithms for calculating MSTs are much faster than those for calculating persistence intervals. However, the MST is only equivalent to the zeroth homology group; if information on higher-dimensional homological features is required, the slower, but more informative, persistence algorithms must be used. Fig. 5 shows the VR complex and the MST for an attractor.
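As a sketch of the MST side of this equivalence, Prim's algorithm can be used to extract the Euclidean MST edge lengths of a point cloud; these lengths coincide with the finite intervals of the reduced zeroth persistent homology (helper names are illustrative):

```python
# Sketch: Prim's algorithm for the Euclidean MST of a point cloud.
# The multiset of MST edge lengths equals the finite intervals of the
# reduced H_0 of the VR filtration on the same points.
import math

def mst_edge_lengths(points):
    n = len(points)
    in_tree = [False] * n
    best = [float("inf")] * n        # cheapest known edge into the tree
    in_tree[0] = True
    for j in range(1, n):
        best[j] = math.dist(points[0], points[j])
    lengths = []
    for _ in range(n - 1):
        # pick the cheapest edge leaving the current tree
        j = min((j for j in range(n) if not in_tree[j]), key=lambda j: best[j])
        lengths.append(best[j])
        in_tree[j] = True
        for k in range(n):
            if not in_tree[k]:
                best[k] = min(best[k], math.dist(points[j], points[k]))
    return sorted(lengths)

pts = [(0, 0), (1, 0), (5, 0)]
print(mst_edge_lengths(pts))  # [1.0, 4.0], matching the reduced H_0 intervals
```

The α-weighted sum of the definition above then reduces, for i = 0, to `sum(l ** alpha for l in mst_edge_lengths(pts))`.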
When dealing with higher-order persistent homology, i ≥ 1, things become trickier. For PH_0, it is known that there will be n − 1 edges for n points; for higher-order persistent homology, this is not the case. There exist metric spaces in which the number of persistence intervals grows faster than linearly in the number of points. To get around this problem, a limit for the upper bound can be proven: in the case of the VR complex, Schweinhart proved that |PH_1| = O(n) for the points (x_1, . . ., x_n) [29].

Attractors
Strange attractors arise from nonlinear dynamical systems. A dynamical system is one that is modelled (and evolves) in a phase space embedded in R^n, where the geometric object is embedded in Euclidean space and parameterised by time. Attractors represent the state to which a dissipative dynamical system will eventually converge, seemingly regardless of initial conditions. Attractors are commonplace objects of study, as they describe the asymptotic behaviour of many dynamical systems [32-34].

Lorenz Attractor
The Lorenz attractor is defined by the differential equations,

    ẋ = σ(y − x),    ẏ = x(ρ − z) − y,    ż = xy − βz.

The attractor was first discovered by Lorenz when studying non-periodic turbulent flows [32]. It displays an interesting topology, with two holes being present in the manifold. It also shows a Cantor-set-like behaviour over its cross sections [35] for certain parameters.
For the system parameters ρ = 28, σ = 10 and β = 8/3, the fractal dimension has previously been calculated by Viswanath [35] to be 2.063. To obtain a structure with the fine details, showing the Cantor-like fractal cross sections, a large number of points are required for the embedding; 10,000 points were calculated here for a dense embedding of the Lorenz attractor. Using the MST method of calculating the fractal dimension [31,36,37] on the embedded attractor shown in Fig. 6, an approximation to the Hausdorff dimension of 2.0826 ± 0.03603 was calculated. This value is obtained by determining the linear gradient of the log-log plot of the cumulative sum of the MST edges, which are analogous to the persistence intervals in PH_0.
Now, in order to capture the global topology of the system, fewer points are required to calculate the persistent homology. In this example, a sparser Lorenz attractor is sampled with only 1000 points, shown in Fig. 7. These samples were taken over the same range as the previous case, but now with a greater time step between each point. This procedure ensures a relatively dispersed distribution of points over the Lorenz attractor; by taking a dispersed distribution of points on the manifold, the global topology of the manifold should become clearer.
The birth-death diagram can be seen on the right-hand side of Fig. 7. There are no large differences in β_0, represented by the red points; this strongly indicates that the Lorenz attractor consists of only one connected component. For the case of the first homology group, there is a large amount of noise present close to the line y = x. Most interestingly, two features persist long enough to be deemed properties of the manifold. These points are (2.59, 7.29) and (1.97, 9.17). The holes represented by these points can be seen on the left-hand side of Fig. 7 as the holes present within the manifold.
The smaller of the two intervals represents the void in the left-hand side of the plot.
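A point-cloud sample of the Lorenz attractor can be generated with a standard fourth-order Runge-Kutta integrator; the step size, initial condition and transient length below are illustrative choices, not the values used in the paper:

```python
# Sketch: sampling the Lorenz attractor with RK4.
def lorenz(state, rho=28.0, sigma=10.0, beta=8.0 / 3.0):
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(f, state, h):
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def sample_lorenz(n, h=0.01, transient=1000):
    state = (1.0, 1.0, 1.0)
    for _ in range(transient):       # discard the transient before sampling
        state = rk4_step(lorenz, state, h)
    points = []
    for _ in range(n):
        state = rk4_step(lorenz, state, h)
        points.append(state)
    return points

pts = sample_lorenz(1000)
print(len(pts))
```

A sparser sample, as used in the text, would simply keep every m-th integration step so that successive points are well separated on the manifold.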

Hénon Attractor
The Hénon attractor is a two-dimensional quadratic map with a constant Jacobian; it was first conceived as a simplified discrete map of the Lorenz system. As the map is discrete, it has become a common object of study in dynamical systems, as computations generating a large number of points are fast [34]. The Hénon map used in this work is defined by,

    x_{n+1} = 1 − a x_n² + y_n,    y_{n+1} = b x_n.

In this case, the well-studied parameters a = 1.4 and b = 0.3, and an initial point of (0.1, 0.3), were used, as these give a convergent solution. 2000 iterations were taken to give a finite approximation of the orbit of this specific Hénon attractor, with the initial conditions listed. Using the MST method of calculating the fractal dimension [31,36,37], a value of 1.2558 ± 0.04476 was obtained; previous work by Grassberger [38] calculated the dimension to be 1.26.
The Hénon attractor is well known to exhibit a self-similar, Cantor-like behaviour over its cross sections [34]. The longitudinal structure of the Hénon attractor is simple, with each curve appearing to be a 1D manifold. The transversal structure is the interesting part; this can be seen to be self-similar in Fig. 9. The dimensions of (a), (b) and (c) in Fig. 9 are 1.233, 1.245 and 1.526; this shows that, as the data become more sparse, this fractal dimension calculation method becomes less accurate, as more data points are missing at the smaller length scales.
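Because the map is discrete, generating its orbit is a direct iteration; a minimal sketch with the parameters and initial point quoted above:

```python
# Sketch: iterating the Hénon map x -> 1 - a x^2 + y, y -> b x
# with the classic parameters a = 1.4, b = 0.3 from (0.1, 0.3).
def henon_orbit(n, a=1.4, b=0.3, x0=0.1, y0=0.3):
    x, y = x0, y0
    orbit = []
    for _ in range(n):
        x, y = 1 - a * x * x + y, b * x
        orbit.append((x, y))
    return orbit

orbit = henon_orbit(2000)
# the orbit remains bounded on the attractor (roughly |x| < 1.3, |y| < 0.4)
print(len(orbit), orbit[0])
```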

Rössler Attractor
This attractor was first formulated in [33,39]; it is obtained by solving the differential equations,

    ẋ = −y − z,    ẏ = x + ay,    ż = b + z(x − c).

Figure 10: (a) Rössler attractor; (b) log-log plot to determine fractal dimension, with gradient (red).
From looking at the persistence diagram in Fig. 11, it can be seen that all the red dots, corresponding to β_0, are close together and show no clear split between their values. This observation strongly implies that the Rössler attractor is a fully-connected manifold. The range over which these occur is most likely because the data points along the 'flick' of the Rössler attractor are more sparse; if this were a continuous embedding, these features would not be present.
For the case of the first Betti numbers, represented by the blue dots in Fig. 11, there is a lot of topological noise close to the line y = x. Interestingly, there is a point far from the line y = x, which is representative of the topology of the Rössler attractor. This point has the coordinates (4.00, 10.3), and represents the hole formed over the 'flick' of the attractor. These persistence data will now act as the benchmark for the Rössler attractor, against which multiple embeddings will be compared. The aim here is to minimise the Wasserstein distance (WD) between the reconstructed object and the original attractor: the smaller the WD, the closer the topologies.

Reconstruction from Time Delay Embedding
In this case, the topologies of a reconstructed phase space, formed from a 1D time series from the attractor, and the original point cloud are compared. Persistent homology will be used to determine the optimal embedding parameter from the 1D time series.
Time-delay embeddings were first introduced in [41] as a method of inducing geometry from a one-dimensional time series. This work was further expanded by Takens [42], proving that the topology can be perfectly reconstructed for chaotic attractors. Skraba et al. [43] then showed that persistent homology could be used with the time-delay embedding to yield useful topological results.
Given a time-varying series, f : t → R, the series can be stacked d times, each copy with an additional delay α, to give a new embedding, φ : t → R^d, represented by,

    φ(t) = [f(t), f(t + α), . . ., f(t + (d − 1)α)].

From this embedding, the time series now has an induced topology. This embedding is a reconstruction of the topology from a one-dimensional time series into any desired dimension. Takens' theorem shows that, under certain circumstances, this reconstructed attractor is homeomorphic to the original attractor [42].
The delay embedding highlights periodic recurrent features in the time series. Recurrent behaviour will be highlighted in the time-delay embedding as a loop. Persistent homology can then be used to quantify the size of these loops.
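The stacking above is a few lines of code; here is an illustrative sketch for a sampled series, with the delay measured in samples (the signal and parameter values are hypothetical):

```python
# Sketch of a time-delay embedding: lift a 1D sampled series into R^d
# by stacking d delayed copies, each offset by alpha samples.
import math

def delay_embed(series, d, alpha):
    """Return points (f[t], f[t + alpha], ..., f[t + (d - 1) alpha])."""
    last = len(series) - (d - 1) * alpha
    return [tuple(series[t + k * alpha] for k in range(d))
            for t in range(last)]

signal = [math.sin(0.1 * t) for t in range(200)]
emb = delay_embed(signal, d=2, alpha=16)   # ~quarter period: an open loop
print(len(emb), emb[0])
```

For this periodic signal, a delay near a quarter period opens the trajectory into a loop in R², which H_1 persistence would detect; a delay of 1 sample collapses it towards the diagonal, echoing the discussion below.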
If the delay is too small, there will not be sufficient information to form a meaningful topology, and the reconstruction will be too similar to a straight diagonal line. Therefore, by maximising the size of the holes present in the attractor, measured using the persistent homology, an optimal time-delay embedding can be determined. Conversely, using too large a delay will result in nonsense, as the gap between the readings will be too large to show a local change over the manifold, resulting in a deformed attractor.
By computing a range of delay embeddings, the persistent homology of each can be calculated, and the WD can then be used as a metric of topological similarity, to give the optimal delay embedding. Fig. 12 shows the WD between the original Rössler attractor and attractors formed from a delay embedding over a range of delays. The delay with the minimal WD is the one for which the topologies of the original attractor and the reconstructed delay embedding are the most similar; therefore, the delay that minimises the WD gives the optimal delay embedding. There is a clear periodic trend in Fig. 12; this occurs as the time-delay embedding comes in and out of phase with the original attractor. The general trend in Fig. 12 shows an increase in the WD as the delay is increased. This trend is because of artefacts induced in the reconstruction, as successive points are sampled over different periods of the attractor.
From Fig. 12, it can be seen that a delay equal to one gives the smallest WD; in this case, the reconstructed topology is too similar to the diagonal set, and the topology is relatively uninteresting. If one were to assume a delay embedding of α = 0, this would result in the straight line x = y = z, which clearly has an uninteresting topology. Therefore, in this analysis, only values after the first peak (α = 13) will be considered, as these ensure that an interesting topology is being formed by the embedding. On the other hand, using too large a value for the delay gives a deformed manifold. Two bad examples of delay embeddings, α = 1 and α = 57, can be seen in Fig. 13.
For α = 19, not only is the topology of the attractor most accurately captured, but, as seen in Fig. 14 (a), there is also a strong likeness to the original geometry. This is not likely to be the case for all other reconstructions, as Takens only proved this for the topology [42].

Introduction to the problem
The Z24 Bridge was a structure connecting Koppigen and Utzenstorf in Switzerland. Data were collected over a whole year, with a sensor network placed on the bridge to extract modal parameters; sensors were also used to measure the air temperature, soil temperature and humidity. Because of the extremes of the Swiss weather, air temperatures as low as −9°C and as high as 36°C were recorded, and the temperature effects are clearly visible in the calculated natural frequencies [44]. Shortly before the demolition of the Z24 Bridge, controlled damage was introduced to the structure, which is also visible in the natural frequencies after a certain point in time.
The change in temperature is the biggest contributing factor to the change in the natural frequencies (≈ 30%), a greater effect than that of introducing damage (≈ 7%). For this reason, the magnitude of the natural frequencies alone offers little insight into the presence of damage; the main problem is to separate the damage case from the temperature effects.
The data set can be broken down into four categories, according to the air temperature at the time of the measurements, and whether damage was present. Fig. 15 shows the temperature readings and the first four calculated natural frequencies; the corresponding colours refer to:
• Light blue, making up the freezing data set. This is any reading with a temperature T < 0°C.
• Dark blue, making up the cold data set. This is any reading in the temperature range 0°C ≤ T < 4°C.
• Red, making up the warm data set. This is any reading with a temperature 4°C ≤ T.
• Black, making up the damage-state data set. This is any reading taken after index 3475, irrespective of the temperature.
At every measurement instance, the natural frequencies {ω_i} can be calculated, where ω_i is the resonance frequency of the i-th mode. The first n natural frequencies can then be represented as a point in R^n, where the i-th axis carries the value of ω_i.
As the Z24 data set has been sampled over different environmental and operational variations (EOVs), it is expected that the resonance frequencies will vary with the environmental conditions on a given day: the bigger the change in the EOVs, the greater the change in ω_i. As the temperature changes, it is expected that the material properties will be affected, resulting in a change in the natural frequencies.
In the case of the Z24 Bridge, sampling over the time frame of a year captures slight changes in humidity, air temperature and soil temperature, meaning that the natural frequencies differ slightly from day to day. As the first four natural frequencies have previously been extracted for each reading [44], each point can be plotted in R^4. Plotting these points traces out a manifold shape that is parameterised by all of these EOVs. The data points are then assumed to lie on a manifold representative of this specific bridge, and TDA can be used to form an understanding of what is expected for the shape of the natural frequency manifold for the Z24 data set.
Since the data have been partitioned into freezing, cold, warm, and damage, TDA can be used to compare the relative shapes of these manifolds. A significant change in the shape of the manifold could potentially be an indicator of the presence of damage.

Wasserstein Distance of Z24 partitions
Three case studies are presented here, all on the Z24 data set; each introduces a new aspect that makes the analysis more concrete.
1. The first case study will compare the manifolds embedded in four-dimensional space, where each axis corresponds to one of the natural frequencies ω1, ω2, ω3 and ω4. This case will also show how normalising by the number of points makes the WD a more robust metric for comparing data sets of different sizes.
2. The second case study will remove ω2, and plot the manifold embedded in R^3. Previous analysis of the Z24 Bridge has shown that many of the nonlinear features are present in the second natural frequency. This case study is useful because, with the natural frequency data embedded in three dimensions, the data can be plotted and visualised, whilst showing that topological arguments remain valid in lower-dimensional shadows.
3. The third case study shows the robustness of the manifolds to a linear dimension-reduction algorithm. In this case, the four-dimensional embedding of the data will be compressed down to two and three dimensions [45]. The reduced-dimension manifolds are then analysed with TDA.

Raw Natural Frequency Case
For the first case, a full walk-through of the calculation procedure will be given; for the succeeding cases, this will be omitted to limit repetition. This case takes the first four natural frequencies obtained from the Z24 data set [44], with each reading represented as a point in four-dimensional space. It is believed that the introduction of damage changes the shape of the manifold more substantially than the change in temperature; TDA can be used here to compare the shapes of the partitioned manifolds.
The persistent homology of each manifold is calculated accordingly. The partitions are not all of the same size: the warm data set is the largest, and it will be randomly split in half to form two additional data sets, with the original warm data set remaining in the analysis. The two random subsets can be used to verify the results, as they should have a very similar topological structure, being sampled from the same manifold. There will be slight differences because of topological noise in the persistence intervals, caused by missing points in the smaller samples; these effects should be negligible for the true global structure of the manifold, provided there are enough points to adequately describe the topology. For much smaller partitions, the absence of many points would affect the topology in a significant way.

Table 1: WD values over each partition.

Table 1 shows the WDs over the different partitions of the data set. In themselves, these values are relatively uninteresting, as the WD values scale with the number of points present in the data. This effect can be seen clearly between the warm set and the warm subsets: the WD values for the full warm data set are roughly twice the size of those for the two random subsets. This shows that the number of points present in the point cloud is linked to the size of the WD value.
As a more informative measure, the WD values can be summed along the rows; the row sum gives an indication of how different a data set is from all the others, with larger values indicating greater dissimilarity. Furthermore, one can normalise by the number of points present in the data set, giving a value independent of the size of the data set. This measure then acts as a metric to discriminate between manifold shapes for manifolds with varying numbers of points. The effect of the partition size on the WD value is shown in Fig. 16. The left-hand plot in Fig. 16 shows how the sum of the WD values for each data set varies as the partition size of the warm data set changes. The freezing, cold, warm, and damage sets all change proportionally as the partition size is varied, as expected. In contrast, the two warm subsets vary considerably as the partition size changes: for small partition sizes, the topology of the manifold is subsampled to an extreme, and a likeness between the manifolds cannot be established.
The right-hand plot of Fig. 16 shows the sum of all the WDs for each data set, scaled by the number of points present in that set. This figure shows how dividing by the number of points maps the warm subsets almost perfectly onto the full warm set, in the regions where there are enough points to adequately describe the topology.

Table 2: Summed and scaled WD data.
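The summing and scaling step can be sketched as follows (a minimal illustration; the matrix of pairwise WD values and the partition sizes are assumed to be given):

```python
def scaled_wd_sums(wd_matrix, sizes):
    """Given a symmetric matrix of pairwise WD values (one row per data
    partition) and the number of points in each partition, return the row
    sums divided by the partition sizes: a size-independent dissimilarity
    score for each partition."""
    return [sum(row) / n for row, n in zip(wd_matrix, sizes)]
```

The partition with the largest scaled score is then the most topologically dissimilar from the rest.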
As can be seen in Table 2, the damage case is clearly the most different in terms of manifold structure, compared to the other data sets; the freezing data are the next most distinct.
For reference, if the warm subsets are not included in the analysis, the results in Tables 3 and 4 are obtained. The damage manifold now comes out as less different from the other sets, since it carries a greater weight in the analysis; when the warm subsets were included, a much larger weight was given to the warm condition. Despite this, the damage manifold still comes out as the most topologically-dissimilar manifold.

Previous analysis of the Z24 Bridge data has determined that ω2 contains the most nonlinear features. The analysis presented in the previous section can be repeated here, after eliminating ω2 and plotting the manifold in three dimensions. This restriction acts as a visual aid, as the data can now be plotted; it also presents a case where a lower-dimensional shadow of the data is taken, and the topology is preserved under this type of subsampling.

Figure 17: ω1, ω3, ω4 3D plot, with the colours specifying the data partitions.

Table 5: WD values with the highly nonlinear ω2 removed.

This example shows the resilience of the TDA procedure: a lower-dimensional shadow still contains enough information to distinguish the damage data partition as the most different.

Principal Component Analysis
In this section, a linear dimension-reduction algorithm, called Principal Component Analysis (PCA) [45], will be applied to the data. Naturally, when reducing the dimension of the data, some information will be lost. The 4D space will be reduced to the principal components in 3D and 2D, and the results of both will be presented. This demonstration will show how the topological structure of the data is preserved under linear transformations, and how the accuracy degrades as the reduced dimension moves further from the true embedding dimension.
The first principal component can be thought of as the direction that maximises the variance of the data when projected onto it. Each successive principal component maximises the variance of the projected data, subject to being orthonormal to all the previous components.
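As a concrete sketch of this idea in the smallest non-trivial case, the first principal component of 2-D data can be computed from the closed-form eigendecomposition of the 2×2 covariance matrix (a hand-rolled illustration; in practice a library routine, such as an SVD, would be used):

```python
import math

def first_pc_2d(points):
    """First principal component (unit vector) of 2-D data, via the
    closed-form largest eigenvalue/eigenvector of the 2x2 covariance
    matrix [[a, b], [b, c]]."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    a = sum((p[0] - mx) ** 2 for p in points) / n
    c = sum((p[1] - my) ** 2 for p in points) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Largest eigenvalue of the symmetric 2x2 covariance matrix
    lam = (a + c) / 2.0 + math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    # Corresponding (unnormalised) eigenvector; degenerate case b = 0
    if abs(b) > 1e-12:
        v = (b, lam - a)
    else:
        v = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)
```

For points spread along the line y = x, the first component points along that diagonal, as expected.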
To clarify how the topological structure is altered when taking principal components, consider a set of points randomly sampled from a torus, as shown in Fig. 18. This figure shows that as the PCA embedding dimension is reduced, topological information is lost. Going from the 3D embedding to a 2D embedding, the volume enclosed by the torus is lost; going from 2D to 1D, the embedding is simply a straight line with the point distribution weighted at the ends. In other words, topological information in the dimensions above the current embedding is lost: in Fig. 18 (b) the 3D-volume is lost, and in Fig. 18 (c) the 2D-hole is lost. Despite this loss of information, the topology captured in the remaining principal components is still adequately represented: in Fig. 18 (b) the 2D-hole is still visible, and in (c) the manifold still consists of one connected component.

The first PCA example presented here on the Z24 Bridge data set is the dimension reduction from 4D into 3D space. As can be seen in Fig. 19, the principal components of the data still keep the clusters in somewhat separate regions of space. Under close inspection, the warm, cold, and freezing sets each appear to be single clusters, whereas the damage set appears to be composed of two clusters positioned very close to one another. This property will result in the topology of the damage partition differing from that of the other clusters.

Figure 19: A visualisation of the first three principal components of the Z24 natural frequency data.

Table 6: WD values for data reduced from 4D to 3D.

Performing the PCA of the data is interesting, as it shows that the topology and persistent homology are preserved in the first three principal components. As can be seen in Table 6, using the principal components of the data still results in the same orderings between the different partitions.
A larger dimension reduction, into 2D space, still preserves the topology, although the WD values are much less distinct between the different cases. This is because more topological information is lost as the data are compressed down to only the first two principal components. Despite this loss of information, the ordering from the scaled WD calculations remains the same.

Table 7: WD values for data reduced from 4D to 2D.

Time-Delay Embedding
In this section, the topology of the natural frequency manifold will be reconstructed from each of the 1D time series. For the purposes of visualisation, the Z24 time series data will be embedded in 3D, so that an intuitive understanding of the topology can be formed. The figures below show the delay embeddings of the first four natural frequencies, projected into three dimensions. The embeddings for ω1, ω3 and ω4 show a similar structure at the delay value α = 75, whereas ω2 has a different topology, defined around the colder temperatures. This can be attributed to the second mode being more strongly affected by the freezing temperature variations.

Figure 21: ω1 time-delay embedding, α = 75.

Figure 22: ω2 time-delay embedding, α = 75.

When delaying the time series, the delayed dimensions correspond to different temperature readings from those of the one-dimensional time series used to form the embedding. The temperature used to label each point is the one from the initial time series; the other dimensions have their own respective temperatures. The temperature classes used in these plots are therefore a little less certain than in the previous cases. This is not too much of a problem: the change in temperature is assumed to be smooth over each day, and as the value α = 75 is relatively small compared to the size of the data set, the time delay can be treated as a local change.

Conclusion
The focus of this paper was to present an overview of TDA to the structural dynamics community, with a variety of use cases that highlight its depth and breadth of capability.
For the case of attractors, the embedding dimensions of some common attractors were calculated. A novel method was also presented for determining the optimal delay for reconstructing an attractor's topology from a 1D time series.
With respect to the Z24 data set, topological methods were able to single out the damage data partition as the most topologically dissimilar. However, further analysis of topological methods for damage detection is needed to understand the true limits and possibilities of TDA in SHM. Nevertheless, insight into the structure of the data provides powerful information about the operating conditions of a machine or structure.
Future work on TDA will look at different topological methods beyond the Wasserstein distance as a metric, and further case studies will also be considered. A subsequent journal paper, extending the ideas presented here, will be submitted.