The Federal Railroad Administration (FRA) railroad grade crossing accident database contains free-text comment fields that may provide additional information about grade crossing accidents. Recent text mining algorithms make it possible to extract information from this text automatically and use it to enhance traditional numeric analyses. Topic modeling algorithms are statistical methods that analyze the words of the original texts to discover the themes that run through them. A frequently used topic modeling algorithm is Latent Dirichlet Allocation (LDA). This paper presents several examples of how labeled LDA can be applied to the FRA grade crossing data to better understand the categories of words and phrases associated with various types of grade crossing accidents.
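To make the idea of topic modeling concrete, the following is a minimal sketch of plain (unlabeled) LDA fitted by collapsed Gibbs sampling, using only the Python standard library. It is not the labeled LDA procedure used in the paper, which additionally restricts each document's topics to its observed labels. The function name `fit_lda`, its parameters, and the four toy comments are all hypothetical; the comments are invented for illustration and are not taken from the FRA database.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Collapsed Gibbs sampling for plain LDA on tokenized documents.

    Hypothetical helper for illustration only. Returns the top words of
    each topic and the estimated topic mixture of each document.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]         # topic counts per document
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # token count per topic

    # Assign a random initial topic to every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Resample its topic in proportion to
                # (topic use in this document) * (word use in this topic).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]

                # Record the new assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Summarize: up to three most frequent words per topic,
    # and a smoothed topic mixture per document.
    top_words = [
        [w for w, c in topic_word[k].most_common(3) if c > 0]
        for k in range(num_topics)
    ]
    doc_mix = [
        [
            (doc_topic[d][k] + alpha) / (len(doc) + alpha * num_topics)
            for k in range(num_topics)
        ]
        for d, doc in enumerate(docs)
    ]
    return top_words, doc_mix


# Invented toy comments (NOT real FRA narratives), already tokenized.
toy_comments = [
    "vehicle drove around lowered gate".split(),
    "driver drove around gate struck train".split(),
    "vehicle stalled on tracks".split(),
    "truck stalled on tracks struck train".split(),
]

top_words, doc_mix = fit_lda(toy_comments)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

On a corpus this small the recovered topics are unstable and depend on the seed, so the sketch only illustrates the mechanics: each word is repeatedly reassigned to a topic based on how often that topic appears in its document and how often the word appears in that topic.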
