The growing volume of documents produced during the design process is a valuable source of process-oriented design information, and the need to extract that information automatically with advanced text mining techniques is growing accordingly. However, most existing text mining approaches cannot mine design information in depth, which limits how effectively the discovered information can be applied to improve a design project. To extract process-oriented design information from design documents in depth, this paper proposes a layered text mining approach that produces a hierarchical process model capturing process behavior at different levels of detail. The approach consists of several interrelated algorithms: a content-based document clustering algorithm, a hybrid named entity recognition (NER) algorithm, and a frequency-based entity relationship detection method. These are integrated into a system architecture that extracts design information ranging from coarse-grained views to fine-grained specifications. To evaluate the proposed algorithms, experiments were conducted on an email archive collected from a real-life design project. The results showed improved accuracy in detecting process-oriented information.

