The National Transportation Safety Board in the United States and the Transportation Safety Board of Canada publish reports about major railroad accidents. The text from these accident reports was analyzed using two text mining techniques, probabilistic topic modeling and k-means clustering, to identify the recurring themes in major railroad accidents. The output from these analyses indicates that the railroad accidents can be successfully grouped into different topics. The output also suggests that the recurring accident types are track defects, wheel defects, grade crossing accidents, and switching accidents. A major difference between the Canadian and U.S. reports is that accidents related to bridges are more prominent in the Canadian reports.

