This paper presents development of accurate turbulence closures for wake mixing prediction by integrating a machine-learning approach with Reynolds Averaged Navier-Stokes (RANS)-based computational fluid dynamics (CFD). The data-driven modeling framework is based on the gene expression programming (GEP) approach previously shown to generate non-linear RANS models with good accuracy. To further improve the performance and robustness of the data-driven closures, here we exploit that GEP produces tangible models to integrate RANS in the closure development process. Specifically, rather than using as cost function a comparison of the GEP-based closure terms with a frozen high-fidelity dataset, each GEP model is instead automatically implemented into a RANS solver and the subsequent calculation results compared with reference data. By first using a canonical turbine wake with inlet conditions prescribed based on high-fidelity data, we demonstrate that the CFD-driven machine-learning approach produces non-linear turbulence closures that are physically correct, i.e. predict the right downstream wake development and maintain an accurate peak wake loss throughout the domain. We then extend our analysis to full turbine-blade cases and show that the model development is sensitive to the training region due to the presence of deterministic unsteadiness in the near wake region. Models developed including this region have artificially large diffusion coefficients to over-compensate for the vortex shedding steady RANS cannot capture. In contrast, excluding the near wake region in the model development produces the correct physical model behavior, but predictive accuracy in the near-wake remains unsatisfactory. We show that this can be remedied by using the physically consistent models in unsteady RANS, implying that the non-linear closure producing the best predictive accuracy depends on whether it will be deployed in RANS or unsteady RANS calculations. Overall, the models developed with the CFD-assisted machine learning approach were found to be robust and capture the correct physical behavior across different operating conditions.