Rolling bearings play an important role in aerospace, manufacturing, and nuclear engineering. Because the reliable and stable operation of mechanical equipment depends on them, research on bearing fault diagnosis is of great practical importance. With the rapid development of smart manufacturing and industrial big data, deep learning has become an effective approach to fault identification. Because the training and test samples for bearing faults often follow different distributions, researchers have introduced many transfer learning methods to address this mismatch. Traditional methods assume either that the features of the target domain are known or that the target domain provides sufficient samples (comparable in number to the training samples). However, these conditions are often difficult to meet in practice. In this paper, we propose a Stable Feature Reweighting Transformer (SFRT) that deeply mines correlated features in fault signals and removes the dependencies between correlated and irrelevant features. This helps the model learn the correspondence between discriminative features and fault labels. As a result, our method can perform fault diagnosis under variable working conditions with high quality even when the target domain is unknown. Extensive experiments on the CWRU and PU datasets demonstrate the effectiveness of our method on multiple distributional generalization tasks. Compared with state-of-the-art methods, our model achieves the highest fault identification accuracy.