## Abstract

Breaking down the total loss in a turbomachine into a number of low-order, physical models is a powerful way of developing loss models for informing design decisions. Better loss models lead to better design decisions. A problem, however, is that in complex flows, it is often not clear how to break a flow down physically without making assumptions. An additional problem is that the designer often does not know what assumptions should be made to derive the most accurate and general physical models. In practice, this problem often leads to loss models of low accuracy, which only work in a limited part of the overall design space. This paper shows that machine learning can be used to augment a designer in the process of developing loss models for complex flows. It is shown that it is able to help a designer discover new, more accurate and general, physical models, highlighting to a designer what assumptions should be made to retain the physics important to the problem. The paper illustrates the new method using the problem of compressor and turbine profile loss. This problem was chosen because it is well understood and therefore is a good way of validating the new method. However, surprisingly the new method is shown to be able to develop a new profile loss model which is more accurate and general than previous models. This is shown to have been achieved by the machine learning finding a new, more general, underlying model for trailing edge mixing loss.

## 1 Introduction

The ability of the human mind to deduce physical models from observations in the world around us is one of the marvels of creation. Loss models are a prime example of this in the context of turbomachinery, as they reduce the complex details of the flow into simple low-order physical models which can be understood by a designer and allow informed design decisions to be made. The problem, however, is that the human mind often struggles to comprehend how to put together simple physical models to represent a complex flow. Assumptions are often made to reduce the complexity of the problem, but it is often not clear whether these assumptions are representative of the real flow. As a result, the overall loss models which are developed are often of low accuracy and only work in a limited part of the overall design space. The objective of this work is to develop a data-centric approach which augments a human’s ability to derive loss models which are more general and accurate than existing models, and which therefore highlights what assumptions should be made to derive such models.

An example where a more general and accurate underlying physical model has been discovered, using the data-centric approach presented in this paper, is the case of mixing loss. Currently, three models for mixing loss exist, as shown in Fig. 1.

The left-hand side of Fig. 1 shows three mixing loss models. Model A is the mixing of a small injected mass flow into a much larger freestream flow [1]. Model B is the mixing out of a finite blockage in a constant area duct [2]. Model C is the mixing out of a finite blockage in an infinitely large freestream flow [3]. The details of these models are discussed later in the paper; however, what is important is how in the limit of each model, as they approach areas of the design space where the other models are defined, they predict very different losses. In this paper, the data-centric approach presented is used to highlight a single underlying physical model, more general than the other models, which accurately predicts the correct mixing loss over the entire design space. This new model is shown in a schematic form as the gray surface on the right-hand side of Fig. 1.

The importance of finding these general underlying models is not only to improve the accuracy of loss modeling but also to provide designers with more physical ways of abstracting problems. There is a natural tendency of the human mind to want to use physical models over a wider range of design problems than those for which they were originally intended. The more general an underlying physical model is, the more powerful it is, in that it allows a much wider range of new problems to be considered.

This paper has two aims. The first is to demonstrate the capability of a data-centric approach, which uses machine learning in conjunction with large computational datasets, to augment a designer’s ability to develop, and put together, underlying physical models for loss. This is demonstrated on the problem of developing profile loss models for the design of compressor and turbine blades. Profile loss was chosen because it is well understood and therefore is a good way of validating the new method. The second aim is to show how using this method highlighted flaws in the physical understanding of current loss models, allowing a more generic physical model to be found.

## 2 How Mixing Loss Models Guide Design Choices

It is important to understand how each of the three mixing models, shown in Fig. 1, drives the designer to make different design choices. Each model is considered in turn by applying conservation of mass and the steady flow momentum and energy equations to the control volume shown in Fig. 2. The three models differ in the assumptions made about the boundary conditions to the control volume.

*A*_{wake}/*A*_{tot} → 0, the rate of loss creation is given by Eq. (1), where $\dot{m}_{inj} = \rho A_{wake} V_{inj}$ is the injected mass flow.

*A*_{wake}/*A*_{tot}, in a uniform freestream flow from Ref. [2]. In this case, *V*_{inj}/*V*_{free} = 0. For this case, Eq. (1) gives zero loss since $\dot{m}_{inj} = 0$, which is clearly incorrect. Denton’s model shows that the rate of loss creation is given by Eq. (2), where $\dot{m}_{tot} = \rho A_{tot} V_{exit}$ is the total mass flow of the freestream.

*V*_{inj}/*V*_{free} = 0 and *A*_{wake}/*A*_{tot} → 0. For this case, Eqs. (1) and (2) both give zero loss. However, in external aerodynamics, the rate of loss creation is usually written in terms of the form drag coefficient of the object, *C*_{D}, Eq. (3) [3].

The models discussed above show how the choice of model acts to guide the design process. In many cases, the problem is even more complicated because the details of the mixing problem lie somewhere in between these models. In such cases, it is not clear what design choices the designer should make. The new method, presented in this paper, is shown to help develop a more general mixing model, which underlies the three models above, and therefore can act to help guide the design process over a much wider range of problems.
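As a rough illustration of the control volume balance used in this section, the sketch below mixes out an injected stream in a constant area duct for incompressible flow. It is not a reproduction of Eqs. (1) to (3); the function name, the unit-area normalization, and the choice to reference the loss to the freestream dynamic head are assumptions made for this example.

```python
def mixed_out_loss(area_ratio, velocity_ratio, rho=1.0, v_free=1.0):
    """Loss coefficient for incompressible mixing in a constant area duct.

    area_ratio     = A_wake / A_tot  (fraction of the duct filled by injected flow)
    velocity_ratio = V_inj / V_free
    Returns the mass-averaged stagnation pressure drop divided by
    0.5 * rho * V_free**2. Assumes uniform inlet static pressure and no
    wall friction (illustrative assumptions, not taken from the paper).
    """
    a_inj = area_ratio
    a_free = 1.0 - area_ratio            # areas per unit total area
    v_inj = velocity_ratio * v_free

    # Conservation of mass: a single uniform exit velocity over the full area.
    v_exit = a_free * v_free + a_inj * v_inj

    # Steady flow momentum equation: static pressure rise p2 - p1.
    dp_static = rho * (a_free * v_free**2 + a_inj * v_inj**2) - rho * v_exit**2

    # Mass-averaged inlet dynamic pressure and uniform exit dynamic pressure.
    q_in = 0.5 * rho * (a_free * v_free**3 + a_inj * v_inj**3) / v_exit
    q_exit = 0.5 * rho * v_exit**2

    # Stagnation pressure loss p01 - p02, then normalize.
    dp0 = q_in - q_exit - dp_static
    return dp0 / (0.5 * rho * v_free**2)
```

Under these assumptions, setting `velocity_ratio = 0` reduces the result to the blockage fraction squared, so a 10% blockage gives a loss coefficient of about 0.01, and the loss vanishes as *A*_{wake}/*A*_{tot} → 0.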

## 3 Extracting Physical Models From Data

In this section, the data-centric approach used in this study is discussed. The approach has four steps, shown in Fig. 3, and listed below:

1. *Generation of data*: Generate a dataset which spans the design space.
2. *Determine physical parameters*: Use current physical understanding to list all the physical parameters that may be relevant to the problem.
3. *Automated learning of low-order models*: Use automated machine learning techniques to highlight and learn combinations of the physical parameters which best fit the data.
4. *Human exploration of the design space*: Rapid manual exploration of the design space in order to understand the physics that is highlighted by the model.

The approach augments the development of physical models by using the strengths of both machine learning and the human mind. Step 3 overcomes the human limitation in deducing patterns in complex datasets by using machine learning, which can easily learn models in such datasets. These learned models are not, however, guaranteed to be physical but are useful for highlighting the kind of physics important in the model. Step 4 provides a rapid exploration user interface that gives a quick way of determining the key physical insights highlighted by the model, and the physics that may be missing. From this analysis, the user then returns to step 2, changes or selects a different set of physical parameters, and repeats steps 3 and 4. It was found that by repeating this cycle a few times, a new loss model could be determined. The combination of machine learning in step 3 and human exploration in step 4 was found to be particularly powerful. The rest of the section describes each of the four steps in more detail.

### 3.1 Generation of Data.

The first stage is to generate a dataset which fully spans the design space. The data can be sampled in any manner but must include points at the extremes of the design space to ensure a machine learning model is fully trained. If physical models are to be extracted, the design space needs to span a wide enough space to cover all the relevant physical mechanisms.

The method used for populating the dataset in this work was a form of Bayesian optimization [4]. This method involved generating a Gaussian process model for predicting aerodynamic loss from some initial data, using the regions with highest prediction uncertainty to guide the selection of new data points to be evaluated. The purpose of this was to select data points which would be most likely to increase the accuracy of the Gaussian process model and therefore minimize the amount of data required to be collected.
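The uncertainty-guided sampling described above can be sketched in a few lines. The example below is a minimal stand-in, not the implementation used in this work: it assumes a single design variable, a squared-exponential kernel with a fixed length scale, and a hand-written linear solver so that it runs without external libraries. All function names are hypothetical. With fixed kernel parameters, the Gaussian process variance depends only on where data already exist, so the loss values themselves are not needed to choose the next point.

```python
import math

def rbf(x1, x2, length=0.2):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((x1 - x2) / length) ** 2)

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def predictive_variance(x_train, x_new, noise=1e-6):
    """Gaussian process posterior variance at x_new given sampled locations."""
    k = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    k_star = [rbf(a, x_new) for a in x_train]
    alpha = solve(k, k_star)
    return rbf(x_new, x_new) - sum(ks * al for ks, al in zip(k_star, alpha))

def next_sample(x_train, candidates):
    """Pick the candidate design where the model is least certain."""
    return max(candidates, key=lambda x: predictive_variance(x_train, x))

def fill_design_space(x_init, candidates, n_new):
    """Add n_new designs one at a time at the point of highest uncertainty."""
    x_train = list(x_init)
    for _ in range(n_new):
        x_train.append(next_sample(x_train, candidates))  # run the simulation here
    return x_train
```

Starting from samples at the two extremes of a normalized design variable, for example `fill_design_space([0.0, 1.0], [i / 10 for i in range(11)], 3)`, the first point added is the mid-point, where the model has the least information.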

To ensure a dataset is physically consistent, computational simulations were used. These have the advantage of enabling many different physical parameters to be extracted from the data which are difficult to obtain from experiments, and they enable physical parameters to be formed retrospectively without the need to take new measurements. The machine learning approach used in this work relies on the data containing physically accurate trends but does not require the data to predict the precise values for any specific case. This is analogous to how computational fluid dynamics (CFD) is already used in design, where designs are varied and analyzed in CFD to assess how different physical mechanisms are affected by design changes.

Best practice also suggests using design rules to constrain the dataset to physically realizable designs. Without this constraint, many of the data points will fail to converge or give meaningful results. Compressor blades for this study were therefore designed using a set of aerodynamic design rules, as shown in Fig. 4. These constraints were that at design incidence, the stagnation streamline must touch the geometric leading edge point and that the shape factor of the boundary layer on the suction surface from peak suction must vary linearly to a specified value at the trailing edge. In addition to this, the thickness distribution was generated using geometric design rules in class shape space transformed coordinates, based on work by Kulfan [5] with only the maximum and trailing edge thicknesses allowed to vary, the wedge angle set as a function of these values. Furthermore, the physics of the problem was constrained to that of a fully turbulent, incompressible flow (*M*_{inlet} = 0.2) at a Reynolds number based on chord of 6.33 × 10^{5}, conditions typical toward the rear stages in a high-pressure compressor.

Together these design rules mean that the compressor blade geometric parameters and pitch are fixed by five design parameters as shown in Table 1. The stage loading and flow coefficient were used to set the blade velocity triangle, assuming the reaction was 50%. The other three parameters resulted in the pitch to chord ratio and the thickness of the blade being fixed. For this study, over 3000 compressor blade designs were predicted using MISES with a subset of 1000 predicted also using Reynolds-averaged Navier–Stokes (RANS). Predictions in RANS were undertaken at a range of incidences.
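As an illustration of how two of the design parameters fix the velocity triangle, the sketch below uses the standard textbook relations for a repeating stage with constant axial velocity. The definitions assumed here (flow coefficient as axial velocity over blade speed, stage loading as stagnation enthalpy rise over blade speed squared) are the usual ones but are not stated in this paper, so the function should be read as an assumption-laden example rather than the design system actually used.

```python
import math

def relative_flow_angles(flow_coeff, stage_loading, reaction=0.5):
    """Rotor inlet and exit relative flow angles in degrees.

    Assumes a repeating stage with constant axial velocity and blade speed,
    flow_coeff = Vx / U and stage_loading = dh0 / U**2.
    """
    tan_in = (reaction + 0.5 * stage_loading) / flow_coeff
    tan_out = (reaction - 0.5 * stage_loading) / flow_coeff
    return math.degrees(math.atan(tan_in)), math.degrees(math.atan(tan_out))
```

At 50% reaction the rotor and stator triangles are mirror images, so the same pair of angles describes the turning required of either blade row.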

### 3.2 Determine Physical Parameters.

The second stage is to use current physical understanding to specify parameters which may have physical relevance for predicting the variable of interest. The purpose of this step is to generate the parameters which will be used by the next step to develop the model. This is an iterative process using previous literature and experience to determine which parameters should be tried. If there is no prior knowledge of the kind of physical parameters to expect, then other techniques, such as analysis of variance [6], SHapley Additive exPlanation [7], and interaction information [8], can be used to try to assess interactions that may exist between the parameters and so inform the kind of parameters to use.

*V*_{TEpass}, is a velocity based on the average static pressure across the passage at the trailing edge plane. The second, *V*_{TEss}, is the velocity based on the pressure local to the suction surface at the trailing edge.

### 3.3 Automated Learning of Low-Order Models.

The third stage is to use machine learning to deduce the low-order structure present in the data. The low-order fits are informative since they reveal the combinations of physical parameters which best capture the variation in the data. Producing informative low-order models, however, is not trivial since many of the suggested physical parameters will be highly correlated to each other. When input parameters are highly correlated, deterministic methods for dimension reduction such as principal components analysis and active subspaces [9] cannot guarantee a unique tractable solution.

The primary technique used in this work for generating low-order fits, therefore, is that of symbolic regression. This method uses a genetic algorithm to trial potential parameter combinations, called genes, that will improve the predictive capability of a model for a quantity of interest. Due to the stochastic nature of the algorithm, there is no guarantee that the same result will be obtained each time it is run or that the results reflect the true low-order structure. However, when running the algorithm multiple times, certain parameter structures appear consistently suggesting these may indeed be reflecting the true low-order structure. In this study, symbolic regression was conducted using GPTIPS2 [10]. Gene numbers were varied between two and four to help discern what parameter combinations were significant for predicting profile loss. These parameter combinations were constrained to multiplication or division by other parameters to keep the genes interpretable.
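The idea of restricting genes to products and quotients of parameters can be illustrated with a much simpler search than a genetic algorithm. The sketch below is a toy stand-in, not GPTIPS2: it fits a single gene, tries every exponent combination exhaustively instead of evolving a population, and uses synthetic data. All names and the exponent set are assumptions made for the example.

```python
import itertools

def fit_gene(samples, target, exponents):
    """Least-squares scale factor and RMS error for one multiplicative gene.

    A gene is a product of the input parameters raised to integer exponents,
    so parameters are only ever multiplied or divided.
    """
    gene = []
    for row in samples:
        value = 1.0
        for x, e in zip(row, exponents):
            value *= x ** e
        gene.append(value)
    coeff = sum(g * y for g, y in zip(gene, target)) / sum(g * g for g in gene)
    sq_err = sum((coeff * g - y) ** 2 for g, y in zip(gene, target))
    return coeff, (sq_err / len(target)) ** 0.5

def best_gene(samples, target, powers=(-1, 0, 1)):
    """Try every exponent combination and keep the one with the lowest error."""
    best = None
    for exponents in itertools.product(powers, repeat=len(samples[0])):
        coeff, rmse = fit_gene(samples, target, exponents)
        if best is None or rmse < best[2]:
            best = (exponents, coeff, rmse)
    return best

# Synthetic data: the target depends on a/b only; c is a distractor.
samples = [(1.0, 2.0, 3.0), (2.0, 1.0, 5.0), (3.0, 4.0, 2.0),
           (4.0, 3.0, 7.0), (5.0, 6.0, 1.0)]
target = [2.0 * a / b for a, b, c in samples]
exponents, coeff, rmse = best_gene(samples, target)
```

For this synthetic case the search recovers the gene *a*/*b* with a coefficient of 2. With real, highly correlated parameters several genes can fit almost equally well, which is why repeated runs and the manual exploration of step 4 are needed to judge which structure is physical.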

### 3.4 Human Exploration of the Design Space.

The fourth stage of the process involves the user rapidly exploring the design space. This is achieved by building a high-fidelity machine learning model of the dataset which can then be explored without running more computational solutions. This means that the design space can be explored incredibly quickly. A typical exploration process would be to undertake a large number of virtual experiments, each holding different parameters constant to test the sensitivities of other parameters. This provides an understanding of how different parameters interact. Using this method, it can be quickly understood whether the new low-order model, predicted by step 3, is physical, and if not what parameters, or combination of parameters, are missing.
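A virtual experiment of the kind described above amounts to sweeping one parameter through a surrogate model while holding the others fixed. The sketch below shows the pattern only; `toy_surrogate` and its parameter names are hypothetical placeholders and do not represent the trained model used in this study.

```python
def virtual_experiment(surrogate, baseline, name, values):
    """Sweep one parameter through `values`, holding the rest at baseline."""
    results = []
    for value in values:
        point = dict(baseline)      # copy so the baseline is never modified
        point[name] = value
        results.append((value, surrogate(**point)))
    return results

def toy_surrogate(pitch_to_chord, thickness):
    """Hypothetical stand-in for a trained machine learning model."""
    return 0.02 + 0.01 * thickness / pitch_to_chord

baseline = {"pitch_to_chord": 1.0, "thickness": 0.05}
sweep = virtual_experiment(toy_surrogate, baseline, "pitch_to_chord", [0.5, 1.0])
```

Because each evaluation is a function call instead of a new simulation, many such sweeps can be run interactively to test which parameters, or combinations of parameters, control the predicted loss.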

## 4 Numerical and Experimental Methods

The training data for the data-centric approach in this study were obtained using computational simulations for 2D compressor and turbine blades. The compressor data were generated using two different codes: MISES [11,12], an Euler code with a coupled boundary layer solver, and Turbostream [13], a RANS code. The two codes were chosen because of their very different underlying modeling assumptions.

In the MISES solutions, the boundary layers on both surfaces of the blade were tripped at the leading edge to ensure that they were turbulent. The trailing edge model of MISES solves the momentum integral equation in the wake with an adjusted shape factor to account for the dead air region. The model has been extensively calibrated against the most detailed near-wake experiments available [14]. These include four test cases on a model of an RAE 2822 airfoil [15] and experiments on the near-wake region of a transonic compressor cascade [16]. It should be noted that the experimental data on which the MISES trailing edge model was calibrated showed no significant vortex shedding and that the trailing edge model, therefore, assumes a constant pressure across the base region. The close agreement between MISES and the experimental measurements leads the authors to believe that it is more accurate than the RANS solution in the trailing edge region.

The RANS solutions were undertaken with Turbostream using the Spalart–Allmaras turbulence model and a boundary layer *y*^{+} < 3. The pressure around the trailing edge base region was extracted from the solutions. Because the RANS solver is steady, vortex shedding cannot be present.

Because neither code has trailing edge vortex shedding for any of the cases, the results can be easily compared. A comparison of the velocity distribution around a typical blade using the two codes is shown in Fig. 6. A comparison of the loss coefficient for the whole training set, predicted by both codes, is shown in Fig. 7. The figures show that the two codes are in close agreement.

In addition to the compressor data, MISES simulations from a set of turbine blade profiles from Ref. [17] were used to validate the models learned using the compressor data and obtain new profile loss models. Once again these had fully turbulent boundary layers. Finally, an experimental transonic dataset from Ref. [18] was used to validate the models. Some of the cases had vortex shedding and some did not. These cases are used in the paper to investigate the effect of vortex shedding.

## 5 The Learned Profile Loss Model

### 5.1 Comparison With Existing Models.

Equation (5) was compared to two different models. The first is a model given by Greitzer et al. [19], and the second is a model given by Denton [2]. Both models are derived in a similar way using conservation of mass and the steady flow momentum equation applied to the control volume from the blade trailing edge plane to the exit plane. The flow is assumed to be incompressible and to be uniform across the passage at the trailing edge plane. The main difference between the two models is that Greitzer et al. assume that the base pressure of the blade is equal to the freestream pressure while Denton defines a base pressure coefficient which fixes the pressure on the blade trailing edge relative to that of the freestream.

*ω*_{TE} ≈ *ω*_{2}. A comparison of the machine learning model and the Denton model using this assumption, against the RANS predictions of loss, is plotted in Fig. 9. The Denton model has an RMSE of 25.3%, still 5 times that of the learned model. The Denton model is observed to systematically underpredict the loss as a result of neglecting the differences between the trailing edge velocity and the exit velocity.

*V*_{TE}/*V*_{2} = *w*/(*w* − *t* − *δ**), which allows Eq. (7) to be rewritten as

### 5.2 Physical Interpretation of the Learned Model.

*θ*/*w*, is identical and represents the loss generated within the blade’s attached boundary layers and the loss generated by the mixing out of these boundary layers at the trailing edge flow condition.

The second term in the two models differs. In Denton’s model, it is ((*t* + *δ**)/*w*)^{2}, the trailing edge fractional blockage squared. In the machine learning model, it is ((*t* + *δ**)/*w*)(1 − *V*_{2}/*V*_{TEss}), the trailing edge fractional blockage multiplied by the fractional change in velocity between the freestream close to the trailing edge and the exit plane downstream. It is important to note that the learned model uses a local freestream velocity close to the trailing edge, *V*_{TEss}.
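The relationship between the two blockage terms can be checked numerically. The sketch below uses only the two expressions quoted in this paragraph and the continuity relation *V*_{TE}/*V*_{2} = *w*/(*w* − *t* − *δ**) from Sec. 5.1; the function names and the numerical values are hypothetical.

```python
def denton_blockage_term(t, delta_star, w):
    """Squared fractional blockage, ((t + delta*) / w)**2."""
    return ((t + delta_star) / w) ** 2

def learned_blockage_term(t, delta_star, w, v2, v_te_ss):
    """Fractional blockage multiplied by (1 - V2 / V_TEss)."""
    return ((t + delta_star) / w) * (1.0 - v2 / v_te_ss)

def uniform_te_velocity(t, delta_star, w, v2):
    """Trailing edge velocity from continuity for a uniform freestream."""
    return v2 * w / (w - t - delta_star)

# Hypothetical blade: 3% total blockage, exit velocity normalized to 1.
t, delta_star, w, v2 = 0.02, 0.01, 1.0, 1.0
v_te = uniform_te_velocity(t, delta_star, w, v2)
uniform_case = learned_blockage_term(t, delta_star, w, v2, v_te)
# Assumed 5% higher local velocity next to the wake, for illustration only.
nonuniform_case = learned_blockage_term(t, delta_star, w, v2, 1.05 * v_te)
```

When the local velocity equals the uniform continuity value, 1 − *V*_{2}/*V*_{TE} = (*t* + *δ**)/*w* and the learned term collapses to the squared blockage of Denton’s model. Any difference between the two terms therefore comes from the local freestream velocity departing from the passage average.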

The third term only occurs in the Denton model and represents the effect of the base pressure variation on loss.

A comparison of both Denton’s model and the learned model, against the RANS predictions, with the first term in the correlation removed is plotted in Fig. 10. This represents the loss due to the mixing out of the trailing edge blockage downstream of the blade row. It is observed that the addition of the base pressure term reduces the error in the Denton model, but that the machine learning model still has a significantly lower error.

### 5.3 Importance of the Local Trailing Edge Velocity.

To determine the reason that the learned model is more accurate at predicting the loss due to the mixing out of the trailing edge blockage, step 4 of the data-centric approach, described in Sec. 3.4, was used. This highlighted that the error was caused by non-uniformity in the freestream flow across the trailing edge plane. This can be seen in Fig. 11, which shows the velocity variation across the trailing edge plane for two blades optimized for the same design point and thickness distribution but with a significantly different pitch to chord ratio.

It should be noted that in Fig. 11, the reason that the velocity in the freestream close to the pressure surface is not equal to the velocity in the freestream close to the suction surface is due to the stagger of the blade. This means that the trailing edge plane meets the pressure surface upstream of the point where the surface joins the trailing edge circle. In practice, the velocity in the freestream at the point where the pressure surface meets the trailing edge circle is found to be equal to the velocity in the freestream at the point where the suction surface meets the trailing edge circle.

Figure 11 shows that as the pitch to chord is raised, the velocity in the freestream close to the trailing edge significantly differs from the average velocity in the freestream. The difference between the two can be up to 10%. Highly loaded blades have more asymmetry between the suction surface and pressure surface velocity distributions and therefore lead to a more non-uniform freestream velocity across the trailing edge plane.

Referring to the machine learning model, Eq. (5), it can be seen that it has chosen the local freestream velocity close to the trailing edge, *V*_{TEss}, as part of the model for blockage mixing loss. This is because the mixing process downstream of the trailing edge involves the wake fluid mixing only with the freestream fluid close to the edge of the wake.

### 5.4 Redefining the Base Pressure Coefficient.

It is interesting to note that a base pressure term does not appear in the machine learning model, Eq. (5). This could either imply that the base pressure term does not exist or that it is too small for the machine learning to identify. It was also found, using step 4 in the data-centric approach, that the static pressure varied in the freestream at the trailing edge plane, in a similar way to the velocity shown in Fig. 11.

### 5.5 Preliminary Design Implications.

In practice, the learned loss model will be used as part of a preliminary design system. These systems correlate the terms in the loss model with design parameters such as flow coefficient, stage loading coefficient, pitch chord, etc. When a new design is considered, these correlations are used to determine the input parameters to the loss model so that the profile loss of the blade can be predicted.

To determine how the Denton model and the learned model perform in such a situation, low fidelity correlations were derived for each of the terms in the loss models using the original set of blade designs. These low fidelity correlations were then used to predict the parameters in the loss models. Finally, these parameters were put in the two models to obtain loss predictions. It should be noted that the loss models are multiplied by (*V*_{2}/*V*_{1})^{2} to produce a loss coefficient referenced to the inlet conditions, as is more commonly used in compressor design. The results can be seen in Fig. 13. The figure shows that the error in the low fidelity correlations causes the RMSE to rise to 8.4% for the learned model; this is still, however, less than half the RMSE of 20.5% from the Denton model. This shows that the improved accuracy of the learned model is still of importance when the model is used as part of a preliminary design system.

### 5.6 Summary.

## 6 Validation of the Profile Loss Model

To assess whether the learned model can also be applied to turbines, MISES predictions from Ref. [17] and experimental test cases from Ref. [18] have been used. Turbine datasets are a good test of the universality of the model since mixing loss accounts for a much larger proportion of the total loss in a turbine. Turbine designs also have a much larger variety of wedge angles, have a much larger compressibility effect, and are more affected by vortex shedding, which are all known to affect the overall loss [2,20]. The first set of data is from MISES calculations for a set of turbine designs produced by Clark [17] and was used in order to test the effects of wedge angle and compressibility on the mixing loss. The wedge angle of the flow approaching the trailing edge was calculated using the angle of the inviscid flow at the trailing edge rather than the actual geometry of the blade. The range of wedge angles explored was 10 deg < *α* < 20 deg. The value of *c*_{pb}* was set to zero for all the predictions.

The second dataset used for validation is an experimental dataset produced by Rossiter et al. [18] for a transonic trailing edge. This dataset was used since it contained a range of wedge angles and Reynolds numbers. The variation in the base pressure due to the influence of vortex shedding was also measured. The effects of wedge angle, compressibility, and base pressure are therefore considered in the following sections.

### 6.1 Influence of Wedge Angle.

*M*_{2} = 0.4 were used. This is because at this Mach number, the flow can still be approximated as incompressible. Figure 14 shows loss predictions using Eq. (13) for these designs significantly under-predicting the overall loss. Some would traditionally attribute this loss to the influence of base pressure [20]; MISES, however, assumes in the calculation that the base pressure is the same as the local pressure in the boundary layer at the trailing edge so this cannot be the source of this discrepancy. The data-centric approach was therefore applied to this new dataset to deduce the underlying cause of the discrepancy. The resulting equation, which now includes the effect of wedge angle, is given by

The newly derived equation highlights an additional wedge angle loss term, the third term in the equation, which increases as the wedge angle increases. Figure 14 demonstrates how these wedge angle corrections almost completely eliminate the prediction error suggesting they capture the correct underlying physics.

The model in Eq. (14) is also plotted for the experimental results from Ref. [18] in Fig. 15. The red points are the learned model without the wedge angle correction terms and the blue points include the wedge angle correction terms. The wedge angle correction again is observed to significantly reduce the error in the loss predictions.

### 6.2 Influence of Compressibility.

*M*_{2} = 0.55 and *M*_{2} = 0.7 from the first dataset were considered. Equation (14) is compared against the numerically calculated loss in Fig. 16. For the compressible cases, the kinetic energy loss coefficient, Eq. (15), is used.

### 6.3 Influence of Vortex Shedding.

To assess the influence of vortex shedding, the experimental dataset from Ref. [18] was used and predictions for the kinetic energy loss coefficient were made using Eq. (14). The resulting loss predictions without the base pressure term are shown in Fig. 17. The predictions are observed to match the measured loss well for cases where there is detached vortex shedding but to underpredict the loss by around 50% for cases where there is transonic vortex shedding. This result highlights that for cases with detached vortex shedding, i.e., where the boundary layer conditions at the trailing edge are steady, the base pressure term is small. For cases where there is transonic vortex shedding, i.e., where the boundary layer conditions at the trailing edge are unsteady, the measured base pressure coefficient, *c*_{pb}*, becomes non-zero, dropping in value as the vortex shedding becomes more unsteady.

Figure 18 shows loss predictions from Eq. (14) using the measured time averaged base pressure coefficient. The prediction error for the cases with transonic vortex shedding at the trailing edge is shown to be significantly reduced. As *ζ*_{2} rises above 0.03, the magnitude of the vortex shedding becomes extremely large and the results from the model start to deviate from that of the measured values. However, in this regime, the unsteadiness in the flow is very large and it is not clear how the measured values should be averaged.

### 6.4 Summary.

Testing the new profile loss model on turbine data has shown the need for the wedge angle to be accounted for. The wedge angle has been shown to introduce an additional term in the loss model. However, once this has been properly included, the model is found to accurately predict both compressor and turbine profile loss.

## 7 New Analytical Mixing Loss Model

The most important conclusion from the previous two sections is that the model for the mixing out of a trailing edge blockage should be referenced to the velocity and pressure in the freestream at the edge of the wake. The control volume analysis used in the mixing model at the start of the paper, shown in Fig. 2, does not consider the effect of the wake mixing locally with the freestream at the edge of the wake. In this section, a new control volume analysis is developed which considers this effect. This new way of thinking about the problem will be shown to be more general and to be able to accurately predict the behavior shown in models A, B, and C.

The new definition of the control volume is shown in Fig. 19. In the same way as at the start of the paper, the two flows are referred to as the freestream and the injected flow. If the velocity of the injected flow is set to zero, then the injected flow represents a wake. However, allowing the injected flow velocity to vary makes the model more general.

In this model, the control volume has a fixed area set by the area of the injected flow at inlet. In reality, the area of the mixing region grows as the injected flow mixes with the freestream. However, in the new model, it is assumed that mass enters the control volume to fill the mass deficit in the injected flow, $\dot{m}_{def}$, at the average local freestream velocity and that all mixing occurs within the control volume. The new way of defining the control volume ensures that it is the velocity of the freestream fluid adjacent to the injected flow which is used to set the mixing loss.

The rate of loss creation in the control volume shown in Fig. 19 can be calculated by applying conservation of mass and the steady flow momentum and energy equations to the control volume, see Appendix A. The process is assumed to be incompressible for simplicity. Across the mixing process, the velocity in the freestream changes by a finite amount Δ*V*_{loc}. The cause of this change may be a consequence of mass entrainment into the wake, or by external factors such as changes in the freestream stream tube width. Across the mixing process, the velocity of the injected flow also changes by a finite amount Δ*V*_{inj}.

*m*_{1} and *m*_{2} each traveling at different velocities *V*_{1} and *V*_{2}, which undergo an inelastic collision and end up at the same velocity *V*_{3} as illustrated in Fig. 20. The loss in kinetic energy due to the collision is shown in Appendix B and is given by

*J* is the impulse between the two masses that brings them to the same velocity. The equation shows that the lost kinetic energy can be written as a velocity difference multiplied by the impulse required to bring the masses to equilibrium. This equation can be split into two terms which correspond to the work done on each mass, by the impulse, in bringing it to its final velocity

*V*

_{1}>

*V*

_{2}, the first term corresponds to the work done to decelerate

*m*

_{1}and the second term corresponds to the work done to accelerate

*m*

_{2}. The difference between the two is the net kinetic energy lost due to the collision.
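The collision argument can be checked numerically. The following is a minimal sketch, assuming a perfectly inelastic one-dimensional collision with no external forces; the function name, argument names, and example values are hypothetical and are not taken from the paper. It confirms that the lost kinetic energy equals half the impulse multiplied by the initial velocity difference, and that it is also the difference between the work done in decelerating the faster mass and the work done in accelerating the slower mass.

```python
import math


def inelastic_collision(m1, v1, m2, v2):
    """Perfectly inelastic collision of two masses (illustrative sketch).

    Returns the common final velocity, the impulse exchanged, the kinetic
    energy lost, and the work associated with each mass.
    """
    # Conservation of momentum gives the common final velocity V3.
    v3 = (m1 * v1 + m2 * v2) / (m1 + m2)
    # Impulse exchanged between the masses (positive when v1 > v2).
    impulse = m1 * (v1 - v3)
    # Kinetic energy lost, evaluated directly from the initial and final states.
    ke_lost = 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2 - 0.5 * (m1 + m2) * v3**2
    # Work = impulse times the mean velocity of each mass during the collision.
    work_decel_m1 = impulse * 0.5 * (v1 + v3)  # kinetic energy removed from m1
    work_accel_m2 = impulse * 0.5 * (v2 + v3)  # kinetic energy added to m2
    return v3, impulse, ke_lost, work_decel_m1, work_accel_m2


# Example values are arbitrary and chosen only for illustration.
v3, impulse, ke_lost, w1, w2 = inelastic_collision(2.0, 10.0, 1.0, 4.0)

# Lost kinetic energy as a velocity difference multiplied by the impulse.
assert math.isclose(ke_lost, 0.5 * impulse * (10.0 - 4.0))
# Lost kinetic energy as the difference between the two work terms.
assert math.isclose(ke_lost, w1 - w2)
print(v3, impulse, ke_lost, w1, w2)
```

The work terms use the mean of the initial and final velocity of each mass, which is exact here because the kinetic energy change of a mass is the impulse on it multiplied by its mean velocity.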

The form of Eq. (16) can now be interpreted. Work is done by the freestream accelerating the injected flow to its final velocity. Work is also done by the injected flow decelerating the freestream to its final velocity. The loss created is the difference between the two. The term in brackets in Eq. (16) can be interpreted as the average shear force between the two mass flows. This is not intuitive but is proved in Appendix C for two small flows. The rate of loss creation is equal to the net rate of work done by the shear force acting between the two mass flows.

Equation (16) also highlights that two mass flowrates are important in the problem: the mass flowrate of the injected flow and the mass flowrate required to fill the deficit in the injected flow. The shear force depends on the rate of change of momentum of both of these mass flows. The shear force, and therefore the loss, is large when the rate of change of momentum of either or both of these mass flows is large.

Figure 21 shows the loss model in Eq. (16), plotted as a red dashed line, compared with the analytical solution obtained when considering a control volume around the entire flow, plotted as the black line. The agreement between the two lines demonstrates that the modeling assumptions are accurately capturing the physics of the real flow.

The green and blue lines show the loss broken down into the two terms in Eq. (16). When the freestream velocity change Δ*V*_{loc} is small, the shear force is dominated by the momentum change of the injected flow, the green curve, and this is the dominant source of loss. When the freestream velocity change Δ*V*_{loc} is large, the shear force is dominated by the momentum change of the mass filling the wake region, the blue curve, and this is the major source of loss. This shows that for the mixing loss model to be general, it must capture both mechanisms.

## 8 Universality of the New Mixing Model

### 8.1 Mixing of a Small Injected Mass Flow.

For a small injected mass flow, Δ*V*_{loc} → 0, and so the second term of Eq. (16) is zero. The rate of loss creation is therefore given by

(*V*_{loc} − *V*_{inj})/*V*_{loc} < 0.3.

### 8.2 Mixing of a Trailing Edge Blockage.

*V*_{loc} can be written as

(*V*_{loc} − *V*_{inj})/*V*_{loc} = 1.

*V*_{loc} = *V*_{TEss} and *V*_{exit} = *V*_{2}. Substituting this into Eq. (21) and dividing by the exit kinetic energy produces a loss coefficient

(*t* + *δ**), and the total flow area per unit depth is the passage width, *w*, then Eq. (24) becomes
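As an illustrative cross-check, the classical control volume drawn around the entire flow can be evaluated for a blockage with zero injected velocity. The sketch below assumes incompressible, uniform flow on either side of the mixing process, a base pressure equal to the local freestream static pressure, and a blocked fraction *B* = (*t* + *δ**)/*w*. The function name and its arguments are hypothetical, and the expression is the standard sudden-expansion result rather than a reproduction of the numbered equations of this section.

```python
import math


def blockage_mixing_loss(blockage, v_exit=1.0, rho=1.0):
    """Kinetic energy loss coefficient for mixing out a blockage (sketch).

    blockage: blocked fraction of the passage, B = (t + delta*)/w, 0 <= B < 1.
    The loss is normalized by the exit kinetic energy 0.5 * rho * v_exit**2.
    """
    if not 0.0 <= blockage < 1.0:
        raise ValueError("blockage must satisfy 0 <= B < 1")
    # Continuity: the freestream occupies the unblocked fraction (1 - B).
    v_loc = v_exit / (1.0 - blockage)
    # Momentum per unit passage area, with the base at the local static pressure:
    # p_loc + rho * v_loc**2 * (1 - B) = p_exit + rho * v_exit**2
    dp_static = rho * v_exit**2 - rho * v_loc**2 * (1.0 - blockage)  # p_loc - p_exit
    # Stagnation pressure drop from the local freestream to the mixed-out exit.
    dp0 = dp_static + 0.5 * rho * (v_loc**2 - v_exit**2)
    return dp0 / (0.5 * rho * v_exit**2)


# The control-volume result reduces to (B / (1 - B))**2.
for b in (0.02, 0.05, 0.1, 0.2):
    assert math.isclose(blockage_mixing_loss(b), (b / (1.0 - b)) ** 2)
```

Under these assumptions the loss coefficient is (*B*/(1 − *B*))^{2}, which tends to *B*^{2} = ((*t* + *δ**)/*w*)^{2} when the blockage is small.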

### 8.3 Airfoil Form Drag.

*V*_{loc}/*V*_{exit} → 1

*b*, *r*, and *θ* are shown in Fig. 23.

*θ* → 0 and *r* → *c* giving

*b* is given by

*C*_{D} gives

*c*/*t*. In these observations, it is apparent that the form drag coefficient scales inversely with the fineness ratio. Hoerner also highlights that for infinite half-bodies, *t*/*c* → 0, and the body would have no drag [22] as is predicted by Eq. (34).

### 8.4 The Effect of Wedge Angle on Mixing Loss.

The effect of wedge angle on the mixing loss of a trailing edge flow can be modeled by considering the turning of an inviscid flow in the freestream followed by the mixing of parallel flows with an injected flow with no initial velocity (Fig. 24). In this model, it is assumed that the stream-wise momentum after flow turning is what sets the momentum entering from the freestream and thus the shear force and loss.

*α* < 20 deg; the last two terms are both the product of two very small terms, so can be neglected. It is also possible to assume cos(*α*/2) ≈ 1 outside the bracket of the first term, so the equation can be approximated by

(*t* + *δ**), and the total flow area per unit depth is the passage width, *w*, then Eq. (39) can be rewritten as

This model therefore demonstrates the theoretical basis for how the wedge angle contributes toward loss in the mixing out of a trailing edge. Its effect is to increase the rate of change of momentum of the flow entering the wake region, which increases the shear force resulting from the flow mixing. This reduces the pressure recovery of the expanding flow, increasing the overall loss. This result explains the observations by Sieverding et al. [20] that as wedge angle is increased, the pressure of the base increases relative to the mixed out pressure. The difference presented here is that the wedge angle has been demonstrated to increase the loss, not the base pressure. If the thickness, velocity, and boundary layer thickness at the trailing edge are fixed, the change in base pressure relative to the mixed out pressure is a consequence of the exit pressure dropping from increased loss rather than of the base pressure increasing.

### 8.5 Summary.

In this section, it has been demonstrated that the new analytical model is universal in its application to a wide variety of mixing problems. The loss models discovered using the data-centric approach have taught the authors to think differently about mixing loss. They show that the physics can be captured accurately for a wide variety of cases if the control volume considered is that of the region behind the wake of the injected flow instead of one around the entire flow, as illustrated in Fig. 25.

## 9 Conclusion

In this work, it has been shown that a data-centric approach can be applied to deduce a physics-based low-order model for the profile loss of compressor and turbine blades. The resulting profile loss model was demonstrated to be simpler and more accurate than the models presented by Denton [2] and Greitzer et al. [19]. The success of the approach has demonstrated the ability to make use of large sets of computational simulations to learn physical connections within the data.

Second, the new physics-based model highlighted errors in previous understanding about the loss generated from the mixing out of a blockage. It shows that the loss is set by the velocity and pressure in the freestream local to the trailing edge. It is shown that when the local flow conditions are used, the error in the mixing loss model more than halves. It also shows that when the base pressure coefficient is redefined using the local static pressure, its value drops by more than an order of magnitude, and the base pressure coefficient becomes so small that it does not have a significant effect on loss for cases where the flow is steady at the trailing edge. The magnitude of this base pressure coefficient only becomes significant when there is unsteadiness at the trailing edge, i.e., when transonic vortex shedding occurs.

Third, a new model has been developed for mixing loss based on new modeling assumptions derived from the model learned using the data-centric approach. This model has been shown to be the general case of which a number of existing mixing models are specific cases. The model provides the designer with better physical insight about how to think about a wide range of mixing problems.

Finally, the data-centric approach has shown that machine learning methods can be used to augment a designer in the process of developing loss models for complex flows. It has also been shown that the method can help in understanding what assumptions should be made to produce more general and accurate loss models. It has been shown that these models are most general when the model retains the key physics present at the infinitely small scale.

## Acknowledgment

The authors would like to thank Alex Rossiter and Chris Clark for providing data to be analyzed in this work. The authors would also like to thank Mark Girolami, James Taylor, Andrew Wheeler, Graham Pullan, Harry Simpson, and Ho-On To for their advice and support throughout this project. The authors would also like to thank the Engineering and Physical Sciences Research Council (EPSRC) for funding this work.

## Conflict of Interest

There are no conflicts of interest.

## Data Availability Statement

The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.

## Nomenclature

- *c* = true chord
- $\dot{m}$ = mass flowrate
- *p* = static pressure
- *t* = trailing edge thickness
- *w* = total exit passage width
- *x* = distance along the chord
- *A* = area
- *F* = shear force
- *H* = suction surface boundary layer shape factor
- *J* = impulse of a force
- *M* = Mach number
- $\dot{S}$ = entropy generation rate
- *T* = temperature
- *V* = velocity
- *c*_{pb} = base pressure coefficient
- *c*_{p} = specific heat capacity
- $\dot{m}_{\mathrm{def}}$ = mass flowrate required to fill the wake
- *p*_{0} = stagnation pressure
- $t_{\max}$ = maximum thickness
- *C*_{D} = form drag coefficient
- *α* = wedge angle
- *β* = flow angle
- *δ** = combined boundary layer displacement thickness
- Δ*p*_{0} = change in stagnation pressure
- Δ*V* = initial velocity difference between the two streams
- Δ*V*_{loc} = change of velocity in the freestream
- Δ*V*_{inj} = change of velocity in the injected flow
- *ζ* = kinetic energy loss coefficient = $(h_2 - h_{2s})/(\tfrac{1}{2}V_2^2)$
- *θ* = combined boundary layer momentum thickness
- *ρ* = density
- *ϕ* = flow coefficient
- Φ = lost kinetic energy
- *ψ* = stage loading coefficient
- *ω* = stagnation pressure loss coefficient = Δ*p*_{0}/((1/2)*ρV*^{2})

## Subscripts

## Abbreviations

## Appendix A: Derivation of the Model for the Mixing of Parallel Flows

Δ*V*_{inj}, the change in velocity of the injected flow, is given by

*V*_{inj} = 0, the equation is also valid for compressible flow.

## Appendix B: The Inelastic Collision of Two Masses
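A minimal sketch of the standard derivation, in the notation of Sec. 7 (*m*_{1}, *m*_{2}, *V*_{1}, *V*_{2}, *V*_{3}, *J*, and Φ), assuming a perfectly inelastic collision with no external forces. The layout is illustrative and does not reproduce the original numbered equations. Conservation of momentum fixes the final velocity and the impulse, and the lost kinetic energy follows by factoring the kinetic energy change of each mass:

$$
\begin{aligned}
m_1 V_1 + m_2 V_2 &= (m_1 + m_2) V_3, \qquad J = m_1 (V_1 - V_3) = m_2 (V_3 - V_2), \\
\Phi &= \tfrac{1}{2} m_1 V_1^2 + \tfrac{1}{2} m_2 V_2^2 - \tfrac{1}{2} (m_1 + m_2) V_3^2 \\
&= \tfrac{1}{2} m_1 (V_1 - V_3)(V_1 + V_3) - \tfrac{1}{2} m_2 (V_3 - V_2)(V_3 + V_2) \\
&= J \, \frac{V_1 + V_3}{2} - J \, \frac{V_2 + V_3}{2} = \tfrac{1}{2} J (V_1 - V_2).
\end{aligned}
$$

The first term in the last line is the work done in decelerating *m*_{1} and the second is the work done in accelerating *m*_{2}, so the loss is the impulse multiplied by half the initial velocity difference.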

## Appendix C: Derivation of the Average Shear Force Between Two Small Flows

It is possible to find the average shear force between two flows only when there is a finite velocity difference over an infinitely small length scale, *dA*. This is because only when both length scales are small can area changes in the streams be neglected and the pressure forces balance between the two streams.

## Appendix D: Derivation of the Stream-Wise Velocity Change Due to Inviscid Turning

The stream-wise velocity change due to inviscid flow turning can be considered as the sum of velocity changes over *N* square control volumes of length *dy*, with a linear variation in both tangential and stream-wise velocity (Fig. 28). Linear variation is valid since these are infinitesimal length scales.

*N* based on the length scales, Eq. (D1), and the total change in tangential velocity using the partial derivative in the stream-wise direction, Eq. (D2). The total change in stream-wise velocity can be written as the sum of the changes generated in each square control volume, Eq. (D3). Substituting Eq. (D1) into this equation and rewriting using the tangential stream-wise velocity gradient gives Eq. (D4).