The FRA railroad grade crossing accident database contains text comment fields that may provide additional information about grade crossing accidents. New text mining algorithms offer the potential to automatically extract information from text that can enhance traditional numeric analyses. Topic modeling algorithms are statistical methods that analyze the words of the original texts to automatically discover the themes that run through them. A frequently used topic modeling algorithm is Latent Dirichlet Allocation (LDA). In this paper, we show several examples of how labeled LDA can be applied to the FRA grade crossing data to better understand the categories of words and phrases that are associated with various types of grade crossing accidents.
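To make the idea of labeled LDA concrete, the following is a minimal sketch of a collapsed Gibbs sampler in which each token may only be assigned to a topic drawn from its document's labels. It is not the authors' implementation. The comments, the label names (`gates`, `vehicle`, `pedestrian`), the function names, and the hyperparameters `alpha` and `beta` are all hypothetical and chosen for illustration; the comments are invented and are not records from the FRA database.

```python
import random
from collections import defaultdict


def labeled_lda(docs, labels, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for a simplified labeled LDA.

    docs:   list of token lists
    labels: list of label lists, one per document; a token can only be
            assigned to a topic that appears in its document's labels.
    Returns (topic_word_counts, assignments).
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [defaultdict(int) for _ in docs]         # tokens per (doc, topic)
    topic_word = defaultdict(lambda: defaultdict(int))   # tokens per (topic, word)
    topic_total = defaultdict(int)                       # tokens per topic
    assign = []

    # Random initialisation, restricted to each document's label set.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.choice(labels[d])
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(row)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample, considering only this document's labels.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in labels[d]
                ]
                k = rng.choices(labels[d], weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word, assign


def top_words(topic_word, label, n=3):
    """Return the n most frequent words currently assigned to a label."""
    counts = topic_word[label]
    words = [w for w in counts if counts[w] > 0]
    return sorted(words, key=lambda w: (-counts[w], w))[:n]


# Invented comments and hypothetical labels, for illustration only.
comments = [
    "vehicle drove around lowered gates and was struck by train",
    "pedestrian walked around gates and was struck",
    "truck stalled on crossing and was struck by train",
    "pedestrian trespassing on tracks near crossing",
]
labels = [["gates", "vehicle"], ["gates", "pedestrian"], ["vehicle"], ["pedestrian"]]
docs = [c.split() for c in comments]

topic_word, assign = labeled_lda(docs, labels)
for label in ("gates", "vehicle", "pedestrian"):
    print(label, top_words(topic_word, label))
```

Because topics are tied to labels, the words ranked highest for a label can be read directly as the vocabulary associated with that accident category. A real analysis would also remove stop words and use a far larger corpus than this toy example.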
2015 Joint Rail Conference
March 23–26, 2015
San Jose, California, USA
Conference Sponsors:
- Rail Transportation Division
ISBN:
978-0-7918-5645-1
PROCEEDINGS PAPER
Applying Topic Modeling to Railroad Grade Crossing Accident Report Text
Trefor Williams
Rutgers University, Piscataway, NJ
Christie Nelson
Rutgers University, Piscataway, NJ
John Betak
Collaborative Solutions, LLC, Albuquerque, NM
Paper No:
JRC2015-5633, V001T06A002; 5 pages
Published Online:
June 10, 2015
Citation
Williams, T, Nelson, C, & Betak, J. "Applying Topic Modeling to Railroad Grade Crossing Accident Report Text." Proceedings of the 2015 Joint Rail Conference. 2015 Joint Rail Conference. San Jose, California, USA. March 23–26, 2015. V001T06A002. ASME. https://doi.org/10.1115/JRC2015-5633