Abstract

As supply chain complexity and dynamism challenge traditional management approaches, integrating large language models (LLMs) and knowledge graphs (KGs) emerges as a promising method for advancing supply chain analytics. This article presents a methodology crafted to harness the synergies between LLMs and KGs, with a particular focus on enhancing supplier discovery practices. The primary goal is to transform and integrate a vast body of unstructured supplier capability data into a harmonized KG, thus improving the supplier discovery process and enhancing the accessibility and findability of manufacturing suppliers. Through an ontology-driven graph construction process, the presented methodology integrates KGs and retrieval-augmented generation with advanced LLM-based natural language processing techniques. With the aid of a detailed case study, we showcase how this integrated approach not only enhances the quality of answers and increases visibility for small- and medium-sized manufacturers but also amplifies agility and provides strategic insights into supply chain management.

1 Introduction

Manufacturing supply chains face significant challenges stemming from a lack of resilience, flexibility, and visibility, which hampers their ability to respond effectively to disruptions and adapt to changing market conditions. These deficiencies can severely impact an industry’s, or even a nation’s, security and economic prosperity. For example, recent disruptions—triggered by health crises, natural disasters, trade conflicts, cybersecurity breaches, and geopolitical instabilities—have revealed the vulnerability of suppliers and supply chains. The heavy reliance on single-source suppliers, especially in critical industries such as semiconductor manufacturing, biomanufacturing, and medical equipment and technology, can expose supply chains to disruptions, shortages, and production bottlenecks.

To compete in today’s rapidly changing business landscape, it is crucial for small- and medium-sized manufacturers (SMMs) to become more resilient and responsive by enhancing their ability to quickly adapt, restructure, and reorganize in the face of unforeseen changes. One approach for improving the resiliency of SMMs is to improve their visibility and findability and to provide them with the ability to widely advertise their capabilities using standardized models. Supply chain managers should be able to quickly and accurately identify qualified suppliers when deploying new supply chains or restructuring existing ones. Intelligent methods for accurate and efficient discovery of manufacturing capability and capacity could enhance the responsiveness of manufacturing supply chains by enabling efficient and timely supply chain reconfiguration and readjustment.

Currently, the discovery of SMMs is facilitated by a mix of traditional and digital methods, including industry directories, trade associations, business networks, online platforms, and generic search engines. Notable examples of online platforms include the ManufacturedNC database [1], MFG.com [2], and Thomasnet [3], which serve as extensive directories for suppliers regionally and globally. Despite their widespread use, several challenges persist in online directories and databases. They often suffer from rigid search functionality, which limits the adaptability of queries. Additionally, their backend data schema is proprietary and nonextensible, hindering the dynamic evolution and extension of the manufacturing capability knowledge base. The data trapped in online platforms and directories must be unlocked and exposed to pave the way for more creative methods for supplier search and discovery.

Recent advances in large language models (LLMs) such as ChatGPT [4] provide a unique opportunity for developing a more flexible and intelligent approach to the search for manufacturing partners. These powerful methods can allow supply chain managers to sift through vast amounts of data to identify potential SMMs based on specific criteria such as industry focus, technological capabilities, and geographical location. However, off-the-shelf AI-enabled search engines frequently generate inaccurate details about SMM capabilities, technological resources, and capacity. This issue is illustrated by the question in Table 1, where GPT-4, despite generating a broader range of supplier options, fails to verify crucial certifications like “Certified ISO 13485”—a standard for the medical device industry. Consequently, GPT-4 might recommend companies that do not meet the specific industry standards required, leading to potential mismatches. Such inaccuracies prolong the search process and complicate the formation of appropriate business partnerships. Therefore, both traditional and advanced methods of supplier discovery fall short of providing information that is comprehensively integrated and flexible enough to meet diverse client needs.

Table 1

A sample question for supplier discovery

Question: List 10 companies having the material capability of plastic, as well as certificates of ISO 13485, in NC. (Each returned company is assessed on three criteria: plastic material capability, ISO 13485 certification, and location in NC.)

GPT-4 with browsing and analysis: Nelipak Healthcare Packaging; Elite Technology; FEAmax LLC; Jaeco Precision Inc.; Reich LLC; Stanford Manufacturing; C2C Plastics; Protolabs

Graph-augmented GPT-3.5 Turbo: Anuva; Bleep Sleep, LLC; Gilero; Bright Plastics; Blur Development Group; EG Industries (dba EG-GILERO)

The integration of knowledge graphs (KGs) with LLMs and retrieval-augmented generation (RAG) presents a promising approach to enhancing data organization in the manufacturing sector. RAG is a framework that combines retrieval techniques with generative models. It uses external knowledge sources to improve the responses of LLMs. In RAG, relevant information is first retrieved from a database or documents, and the LLM then generates an answer conditioned on it, which helps in providing more accurate and contextually rich responses. KGs allow for the integration of diverse datasets from various sources, facilitating a holistic view of information and enabling cross-domain analysis. LLMs and RAG can work collaboratively to automatically construct high-quality, scalable KGs [5]. These graphs detail the entities and their interrelations that collectively represent the manufacturing capabilities of SMMs. The data within these KGs are then fed back into the LLMs, utilizing KG-augmented generation to refine and enhance the model’s output, as indicated in Table 1. This compound solution streamlines the search and discovery process in the manufacturing industry, offering an efficient and effective means of identifying manufacturing partnerships.
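To make the retrieve-then-generate loop concrete, the following is a minimal Python sketch of the RAG pattern described above; `embed` and `generate` are placeholder stubs standing in for any embedding model and LLM endpoint, not a specific vendor API.

```python
# Minimal sketch of retrieve-then-generate. `embed` and `generate` are
# illustrative stubs, not a specific vendor API.
import numpy as np

def embed(text: str) -> np.ndarray: ...      # e.g., a sentence-embedding model
def generate(prompt: str) -> str: ...        # e.g., a chat-completion call

def rag_answer(question: str, documents: list[str], top_k: int = 3) -> str:
    # Index: embed every document once (a vector store in production).
    doc_vectors = [embed(d) for d in documents]
    q = embed(question)
    # Retrieve: rank documents by cosine similarity to the question.
    scores = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
              for v in doc_vectors]
    context = "\n".join(documents[i] for i in np.argsort(scores)[-top_k:])
    # Generate: condition the LLM on the retrieved context.
    return generate(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```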

Despite the capabilities of LLMs, incorporating manufacturing ontology and thesaurus remains critical for achieving domain-specific precision and contextual relevance in sourcing manufacturing capabilities. The ontology acts as the backbone of a KG, structuring information in a way that reflects the complexities and relationships inherent in the manufacturing domain [6]. Meanwhile, the thesaurus standardizes entities within the KG, ensuring that terminology is consistent and comprehensive [7]. The inclusion of semantic information models such as ontologies and formal thesauri addresses the challenges LLMs face with the specialized technical language of manufacturing. They enhance searchability and accuracy, enabling LLM-based AI to generate information that aligns with industry norms and domain knowledge more effectively.

The main contributions highlighted in this article are as follows:

  1. We introduce a supplier and manufacturing capability discovery system to facilitate the identification and search of SMMs.

  2. We present a method that can transform vast manufacturing data into an interactive supplier search engine, using graph-based RAG (Graph RAG).

  3. We propose an ontology-driven triplet extraction and generation method by fine-tuning LLMs for supplier capability identification.

  4. We propose an entity normalization method to standardize triplets and seamlessly integrate new manufacturing data into an existing supplier discovery system by using RAG and a manufacturing thesaurus.

For the rest of the article, Sec. 2 reviews the related work, and Sec. 3 defines the methodology and presents the details of the proposed method. In Sec. 4, a case study is conducted to demonstrate the effectiveness of our method. The limitations and future directions of our work are discussed in Sec. 5. Finally, the article is concluded in Sec. 6.

2 Related Work

2.1 Supplier Discovery With Ontologies and Natural Language Processing.

In supplier discovery, utilizing the synergy of semantic technologies, ontologies, and natural language processing (NLP) is critical. Ameri and McArthur [8] illustrate how classification and inference rules refine supplier categorization, such as in sand casting, to bolster discovery precision. Lee et al. [9] demonstrate the value of semantic web systems in enhancing long-term supply chain establishment, showcasing semantic technologies’ ability to support a diversified approach in supplier identification. Mesmer and Olewnik [10] further detail the role of manufacturing process ontologies in bridging knowledge gaps for those unfamiliar with intricate manufacturing domains. Lastly, Papa et al. [11] reveal the effectiveness of NLP in financial services’ supplier discovery, pointing out NLP’s versatility and efficiency.

However, leveraging semantic technologies, ontologies, and NLP in supplier discovery faces several challenges, including the need for quality, standardized data. The varied, heterogeneous, and complex data formats and terminologies in the manufacturing domain can undermine the accuracy of these processes. Additionally, as the industry evolves, ensuring that these systems adapt to new processes and technologies is a major undertaking. Despite advancements in NLP, the presence of ambiguities in language interpretation poses risks of inaccuracies, especially when dealing with the specialized terminology of manufacturing. These issues emphasize the necessity for ongoing updates and refinements to align these technologies with the dynamic nature of the manufacturing domain.

2.2 Triplet Extraction in Knowledge Graph Construction.

Triplet extraction, which involves identifying and extracting triples in the form of (subject, predicate, object), is an essential step in constructing KGs. This process can be performed at both the document level [12] and the sentence level [13] and has been enhanced by deep neural architectures [14]. Moreover, recent advancements have integrated LLMs into relational triplet extraction, as demonstrated in studies such as Refs. [5,15]. However, triplet extraction can be challenging when dealing with manufacturing text data. This is because generalized LLM models may fail to identify manufacturing triplets that are not in a complete sentence. For instance, a data source, such as a manufacturer’s website, may list manufacturing capabilities across multiple lines of text or documents. This requires the ability to understand manufacturing contexts, such as connecting material and process types with product requirements. Additionally, it involves linking each manufacturing capability entity with the manufacturer information, which may be located far away in the text or in different documents or links. Therefore, developing effective triplet extraction methods that can handle these complexities is essential for constructing accurate and comprehensive KGs in the manufacturing domain.

2.3 Integration of Large Language Models and Knowledge Graphs.

Recent advancements in combining KGs with LLMs mark a considerable progression in AI research. Yang et al. [16] propose a KG-enhanced LLM that infuses explicit factual knowledge into LLMs, improving their text generation and factual reasoning. Soman et al. [17] and Jiang et al. [18] both emphasize the application of KGs to improve medical LLMs.

Integrating LLMs and KGs in manufacturing information systems is significantly improving information integration and process analysis. A study explores the potential of leveraging KGs in conjunction with ChatGPT to streamline the process of identifying manufacturing services [19]. Zhou et al. [20] employed an LLM enhanced with industrial structure causal knowledge from KGs for diagnosing quality issues in aerospace manufacturing. Similarly, Xiao et al. [21] demonstrated how KGs can streamline manufacturing process planning, emphasizing the role of process KGs. Despite their successes, these works have limitations. While they utilize LLMs or ontologies individually, they do not employ an ontology to guide the LLM in graph construction. Additionally, although these approaches show good performance on specific tasks, they often struggle to generalize across different tasks. This suggests a need for developing more versatile and broadly applicable approaches within this technological framework.

3 Methodology

In this section, we present a novel methodology that merges LLMs with KGs to improve the sourcing and translation of supply chain information, addressing the challenges introduced in Sec. 2. The methodology is based on the framework illustrated in Fig. 1. The framework utilizes LLMs for initial data interpretation. The processed data are then subjected to a cognitive-semantic transformation that interfaces with a KG centered on supply chain data ($KG_{SC}$). Enhanced by a dedicated ontology and thesaurus for the supply chain, this framework is designed to effectively harness unstructured supply chain data, improving the retrieval and generation of relevant information. The methodology comprises five key subsections: Sec. 3.1 explains the necessity of the ontology and thesaurus in the context of our approach; Sec. 3.2 details the methodology for entity and relationship extraction from supply chain data; Sec. 3.3 discusses the importance of entity normalization and its implementation; Sec. 3.4 describes the construction of the KG; and Sec. 3.5 illustrates how to integrate LLMs with the KG, aiming to enhance the capabilities of LLMs.

Fig. 1 The framework of integrating KG, RAG, and LLMs for supply chain discovery

3.1 Supply Chain Structuring With Ontology and Thesaurus.

In integrating KGs and LLMs for supplier discoveries, ontologies and thesauri are crucial for creating effective KGs. Ontologies serve as the backbone of a KG, providing a structured framework of a priori knowledge on supply chains and manufacturing capabilities for data representation and organization. Without a well-defined ontology, the KG’s structure can be compromised, leading to inefficient data indexing and disorganized extracted data. Ontologies ensure coherence and reliability by standardizing relationships and entities [6].

Thesauri complement ontologies by acting as a mapping mechanism and extending the ontology’s scope. They function like vocabulary dictionaries, offering standardized terms and linguistic variations that enrich the KG’s understanding of language specific to supply chains and manufacturing capabilities. This expansive lexical network links standardized entities with synonyms and related terms, enhancing the precision and depth of knowledge captured and facilitating robust information retrieval.

Ontologies and thesauri formalize the semantics of data for manufacturing capabilities. While thesauri describe lexical relationships among manufacturing capability terms, ontologies represent hierarchical domain knowledge through manufacturing capability classes and their relationships, supported by logic-based formalisms. Standards like the Simple Knowledge Organization System (SKOS) and the Web Ontology Language (OWL) are used for representing thesauri and ontologies. SKOS models are lightweight and extendable by various user communities, making them cost-effective. Ontologies and thesauri can describe semantic KGs in supply chain discovery, thereby harmonizing data before ingestion and enhancing AI model explainability, while LLMs automate vocabulary extension, further enriching the KG.

The process begins by defining the supply chain ontology $O_{SC}$, which details the entities ($E$) and relationships ($R$) relevant to supply chain discovery. This ontology, $O_{SC}=\{E,R\}$, acts as the foundational schema for constructing a KG ($KG_{SC}$). Another foundation for $KG_{SC}$ is the thesaurus $T_{SC}=\{E_t,R_t\}$, which contributes its entities ($E_t$) and relationships ($R_t$) as a hierarchy of supplier capabilities in the graph. A hierarchy of supplier capabilities is a structured classification system that organizes the various manufacturing processes, technologies, and skill sets possessed by suppliers into broad capability categories with specific subclasses and alternative techniques within those categories.
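As a concrete illustration, $O_{SC}$ and $T_{SC}$ can be viewed as sets of entity types and typed relationships. The sketch below encodes a handful of SUDOKN-style relation names and capability terms that appear later in the article; it is an illustrative fragment, not the full ontology or thesaurus.

```python
# Illustrative fragment of the ontology O_SC = {E, R}: entity types plus
# (domain, relation, range) patterns, following the SUDOKN-style names
# used in the case study.
ontology_OSC = {
    "entities": ["Supplier", "Capability", "Certification", "Industry", "Material"],
    "relations": [
        ("Supplier", "has_process_capability", "Capability"),
        ("Supplier", "has_certificate", "Certification"),
        ("Supplier", "supplies_to_industry", "Industry"),
        ("Supplier", "has_material_capability", "Material"),
    ],
}

# Illustrative fragment of the thesaurus T_SC = {E_t, R_t}: a capability
# hierarchy with "subclass of" (broader/narrower) and "same as" (synonymy) links.
thesaurus_TSC = {
    "entities": ["Casting", "Sand Casting", "Machining", "Turning", "Lathe Work"],
    "relations": [
        ("Sand Casting", "subclass of", "Casting"),
        ("Turning", "subclass of", "Machining"),
        ("Lathe Work", "same as", "Turning"),
    ],
}
```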

3.2 Data-Driven Entity and Relationship Identification With Large Language Models.

The proposed framework utilizes LLMs to process unstructured supply chain data ($D_{un}$), which includes not only manufacturing capability data but also contractual documents, logistic reports, and transaction records. The framework extracts and generates supply chain entities ($E_g$) and their relationships ($R_g$) in the form of triplets.

For effective entity and relationship extraction, a portion of the unstructured data is labeled to create a dataset ($D_{labeled}$). Each instance in this dataset consists of a sequence of input tokens $(d_1,\ldots,d_m)$, along with all triplets identified by subject matter experts (SMEs) according to $O_{SC}$, which contain entities $E_g$ and relationships $R_g$ as a label $y$.

These labeled data are used for supervised fine-tuning. The inputs are passed through a pretrained LLM [22] to obtain the final transformer block’s activation $h_l^m$, which is then fed into an added linear output layer with parameters $W_y$ to predict $y$, as shown in Eq. (1):

$$P(y \mid d_1, \ldots, d_m) = \mathrm{softmax}(h_l^m W_y) \quad (1)$$

The objective to maximize during supervised fine-tuning is shown in Eq. (2):

$$L(D_{labeled}) = \sum_{(d, y) \in D_{labeled}} \log P(y \mid d_1, \ldots, d_m) \quad (2)$$

The training process is represented by Eq. (3):

$$LLM_{trained} = \arg\max_{\Theta} L(D_{labeled}) \quad (3)$$

where $\Theta$ denotes the trainable parameters of the LLM and the added output layer. The trained LLM is then applied to the rest of the data to extract entities and relationships for supply chain discovery, as shown in Eq. (4):

$$\{E_g, R_g\} = LLM_{trained}(O_{SC}, D_{un}) \quad (4)$$

We utilize the metrics of F1 score, precision, recall, and accuracy to assess the performance of data extraction. Precision underscores the model’s efficiency in minimizing incorrectly identified terms, a crucial aspect in ensuring the reliability of the extracted data. Recall evaluates the model’s effectiveness in identifying all pertinent triplets. Merging these two, the F1 score emerges as a comprehensive metric that encapsulates both precision and recall. Additionally, accuracy, a metric traditionally associated with classification tasks, serves as a broader measure of the model’s overall efficacy when adapted to quantify correctness in triplet extraction.

3.3 Entity Normalization.

Entity normalization plays an important role in refining the output of LLMs for higher precision and integration into $KG_{SC}$. Given the entities and relationships identified by LLMs, we propose a hybrid method for entity normalization that combines RAG and the Jaccard similarity measure. Let:

  • $E_t=\{E_{t1},\ldots,E_{tj},\ldots\}$ represent the set of entities from a predefined thesaurus, intended as the standard forms of entities.

  • $E_g=\{E_{g1},\ldots,E_{gi},\ldots\}$ denote the set of entities extracted by LLMs.

  • $E_r=\{E_{r1},\ldots,E_{ri},\ldots\}$ be the set of entities after entity normalization.

In the RAG process, all entities in the predefined thesaurus $E_t$ are first indexed using a vector store. Given an input entity $E_{gi}$, it is transformed into a query vector using an LLM with the function $T_1$. A retrieval function $R_1$ is then used to find the entity $E_{ti}$ in $E_t$ that has the highest semantic similarity to the input entity $E_{gi}$ based on the transformed query, as represented in Eq. (5):

$$E_{ti} = R_1(T_1(E_{gi}), E_t) \quad (5)$$

To ensure the precision of entity normalization, we integrate both semantic and structural similarity measures. The Jaccard similarity index, which is critical for assessing the structural similarity between entities, is defined as Eq. (6):

$$J(E_{ti}, E_{gi}) = \frac{|E_{ti} \cap E_{gi}|}{|E_{ti} \cup E_{gi}|} \quad (6)$$

This index quantifies the overlap between entities, ensuring that they are not only semantically close but also share significant structural elements.

Then, the normalization process is completed by Eq. (7):

$$E_{ri} = \begin{cases} E_{ti}, & \text{if } J(E_{ti}, E_{gi}) > \theta \\ E_{gi}, & \text{otherwise} \end{cases} \quad (7)$$

where $E_{ri} \in E_r$ and $\theta$ is a predefined similarity threshold.
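The following is a short sketch of this hybrid normalization in Python. The `retrieve_closest` helper stands in for the RAG step of Eq. (5) (a vector-store lookup over thesaurus terms) and is hypothetical; the Jaccard measure is computed here on token sets, one plausible reading of Eq. (6).

```python
# Sketch of Eqs. (5)-(7): semantic retrieval followed by a structural check.
def retrieve_closest(query: str, thesaurus: list[str]) -> str | None:
    ...  # hypothetical RAG step: embed `query`, return the nearest thesaurus term

def jaccard(a: str, b: str) -> float:
    # Eq. (6): token-set overlap between candidate and raw entities.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def normalize_entity(e_gi: str, thesaurus: list[str], theta: float = 0.8) -> str:
    # theta = 0.8 matches the threshold selected in Sec. 4.4.
    e_ti = retrieve_closest(e_gi, thesaurus)
    # Eq. (7): replace only when structural overlap also clears the threshold.
    if e_ti is not None and jaccard(e_ti, e_gi) > theta:
        return e_ti
    return e_gi

# e.g., normalize_entity("sheet metal fabrication", ["Sheet Metal Fabrication"])
# returns "Sheet Metal Fabrication", assuming the retrieval step proposes it.
```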

3.4 Graph Construction.

The extracted relationships and normalized entities, structured according to $O_{SC}$, as well as the entities and relationships from the thesaurus $T_{SC}$, are integrated into the KG [23]. The overall procedure for constructing $KG_{SC}$ from unstructured data $D_{un}$ is outlined in Algorithm 1.

Algorithm 1: TextToKG

 1: Input: $O_{SC}=\{E,R\}$, $D_{un}$, $E_t$, $T_{SC}$
 2: Expected result: $KG_{SC}$
 3: Process unstructured data with a fine-tuned LLM: $LLM_{trained}(O_{SC}, D_{un}) \rightarrow \{E_g, R_g\}$.
 4: Normalize entities with RAG and Jaccard similarity:
 5: Initialize $E_r$ with $E_g$: $E_r = E_g$.
 6: for each entity $E_{gi}$ in $E_g$ do
 7:   Identify the closest match $E_{ti}$ in $E_t$ using the RAG model.
 8:   if $E_{ti}$ exists and $J(E_{ti}, E_{gi}) > \theta$ then
 9:     $E_{ri} = E_{ti}$.
10:   end if
11: end for
12: Construct $KG_{SC}$ using $O_{SC}$, $\{E_r, R_g\}$, and $T_{SC}$.

This process enriches $KG_{SC}$ with fresh data or an updated ontology and thesaurus. The latest state of the graph is achieved by incorporating new entities and relationships when updated supply chain data are received. Similarly, when new entities and relationships are introduced or existing ones are altered within the ontology or thesaurus, $KG_{SC}$ can be updated via the framework. The iterative cycle provided by the framework ensures that $KG_{SC}$ remains an up-to-date and critical asset for supply chain discovery.
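As a sketch of step 12 of Algorithm 1, normalized triplets can be merged into a property-graph store. The snippet below uses the official Neo4j Python driver; the connection details are placeholders, and the simplified labeling scheme is illustrative (the actual SCKG in Sec. 4.5 uses typed labels such as Capability and Certification).

```python
# Sketch of merging normalized triplets into Neo4j (Algorithm 1, step 12).
# Connection details, labels, and relationship encoding are illustrative.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def load_triplets(triplets):
    # Each triplet is (subject, predicate, object), e.g.,
    # ("A&G Machining, LLC", "has_material_capability", "Aluminum").
    with driver.session() as session:
        for s, p, o in triplets:
            session.run(
                # MERGE keeps the load idempotent, so re-running on updated
                # supplier data only adds what is new.
                "MERGE (a:Supplier {name: $s}) "
                "MERGE (b:Entity {name: $o}) "
                "MERGE (a)-[:REL {type: $p}]->(b)",
                s=s, o=o, p=p,
            )
```

Because Cypher does not parameterize relationship types directly, the sketch stores the predicate as a property on a generic relationship; generating one MERGE statement per predicate type is an equally valid alternative.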

3.5 Graph-Based Retrieval-Augmented Generation.

Graph RAG integrates LLMs with graph databases, enhancing the capabilities of LLMs by combining graph-based retrieval mechanisms with the generation of contextually relevant responses.

First, this process represents the graph as $KG_{SC}=(V,E)$, where $V$ is the set of nodes and $E$ is the set of edges. Given an input query $q$, it is transformed into a graph query language using an LLM with the function $T_2$. Then, a function $R_2$ retrieves relevant nodes and edges from the graph based on the transformed query. Equation (8) represents this first step as follows:

$$(V_q, E_q) = R_2(T_2(q), KG_{SC}) \quad (8)$$

Next, the retrieved subgraph elements are converted into contextual embeddings, with $E_v$ and $E_e$ defined as the embeddings of the retrieved nodes and edges, respectively, as shown in Eq. (9):

$$E_v = \mathrm{Embed}(V_q), \quad E_e = \mathrm{Embed}(E_q) \quad (9)$$

The LLM integrates the embeddings $E_v$ and $E_e$ together with $x$, the processed text tokens, and the generation function $G$ maps the LLM output to the final response $Y$, as illustrated in Eq. (10):

$$Y = G(\mathrm{LLM}(E_v, E_e, x)) \quad (10)$$
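A compact sketch of this two-stage flow follows, with generic `llm` and `run_cypher` callables as assumptions; here the retrieved subgraph is serialized as text context rather than explicit embeddings, a common simplification of Eq. (9). Section 4.6 realizes the same pattern with LangChain and Neo4j.

```python
# Sketch of the Graph RAG flow in Eqs. (8)-(10). `llm` is any chat-completion
# callable and `run_cypher` executes a query against the graph database.
def graph_rag(question: str, run_cypher, llm) -> str:
    # T2 / R2 (Eq. (8)): translate the question into a graph query, then retrieve.
    cypher = llm(
        "Translate this question into a Cypher query over a "
        f"supplier-capability graph:\n{question}"
    )
    subgraph = run_cypher(cypher)  # relevant nodes V_q and edges E_q
    # Eqs. (9)-(10): hand the retrieved elements back to the LLM as context
    # (serialized as text here, not explicit embeddings) and generate Y.
    return llm(f"Graph results:\n{subgraph}\n\nUsing only these results, answer: {question}")
```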

4 A Case Study: Supplier Discovery

In this section, we demonstrate the proposed methodology that integrates LLMs and KGs for supplier discovery in the manufacturing industry. This case study presents the framework’s construction process while highlighting six key components: manufacturing ontology and thesaurus formulation (Sec. 4.1), supplier data collection (Sec. 4.2), triplet extraction (Sec. 4.3), entity normalization (Sec. 4.4), KG design (Sec. 4.5), and graph-based question-answering system construction (Sec. 4.6).

A Supplier Capability Knowledge Graph (SCKG) is a labeled property graph that organizes structured supply chain data with a focus on the capabilities of suppliers. This graph is designed to centralize and streamline information related to suppliers’ strengths and operational competencies. It integrates crucial data points such as manufacturing process capabilities, certifications, industries served, and material handling capabilities. By mapping these elements in a structured, interconnected format, the graph provides a comprehensive view of supplier capabilities, facilitating better decision-making for sourcing and supply chain management.

To build SCKG, initially, supplier data are collected from state-specific supplier discovery platforms. An ontology for supplier discovery, as well as a manufacturing thesaurus, is formulated to facilitate the establishment of the SCKG. A subset of the data is annotated to fine-tune LLMs for a triplet extraction task. The fine-tuned (FT) LLM is then applied to the remaining data for generating other triplets. Entities derived from the extracted triplets undergo entity normalization to merge different representations of the same entity into a consistent form. Finally, the graph is constructed and integrated with an LLM using RAG, which enables users to query the SCKG to identify potential partners in the supply chain network.

4.1 Ontology and Thesaurus Formulation.

In this case study, the Manufacturing Capability Thesaurus (MCT) [24] is employed to establish a unified vocabulary of capability-related concepts extracted from manufacturing suppliers’ websites and supplier discovery platforms. Given the technical nature of manufacturing terminology, the terms generated by large language models frequently fail to capture the precise semantics of manufacturing concepts. The MCT provides technically sound and relevant labels for a wide range of manufacturing concepts. The MCT can be used for normalizing the terms generated by LLMs, thus producing a knowledge graph that is more understandable for supply chain discovery. The MCT is a general-purpose controlled vocabulary that has been used in various use cases [7,24].

Each concept in MCT has exactly one preferred label (skos:prefLabel) and can have multiple alternative labels (skos:altLabel). Preferred Label is a SKOS element that makes it possible to assign an authorized name to a concept. For example, as shown in Fig. 2, in the context of metal casting terminology, Foundry Sand is the alternative label for Molding Sand as it is used frequently for referring to the same concept. The broader concept of the Molding Sand is Sand, while Silica Sand and Chromite Sand are the narrower concepts, meaning that they are more specialized forms of Molding Sand. The concept that is semantically related to Molding Sand is Mold. The associativity relationship between concepts, when defined explicitly, can support LLM by providing context for each query.

Fig. 2 The concept diagram of the molding sand based on SKOS terminology
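For illustration, the Fig. 2 example can be encoded with rdflib’s built-in SKOS vocabulary, as sketched below; the `MCT` namespace URI is an assumption introduced for the example, not the thesaurus’s actual namespace.

```python
# The Molding Sand example from Fig. 2, encoded with rdflib's SKOS vocabulary.
# The namespace URI is illustrative.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

MCT = Namespace("http://example.org/mct#")
g = Graph()

g.add((MCT.MoldingSand, RDF.type, SKOS.Concept))
g.add((MCT.MoldingSand, SKOS.prefLabel, Literal("Molding Sand", lang="en")))
g.add((MCT.MoldingSand, SKOS.altLabel, Literal("Foundry Sand", lang="en")))
g.add((MCT.MoldingSand, SKOS.broader, MCT.Sand))          # more general concept
g.add((MCT.MoldingSand, SKOS.narrower, MCT.SilicaSand))   # more specialized forms
g.add((MCT.MoldingSand, SKOS.narrower, MCT.ChromiteSand))
g.add((MCT.MoldingSand, SKOS.related, MCT.Mold))          # associative relation

print(g.serialize(format="turtle"))
```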

MCT currently contains more than 2100 concepts designated by about 3900 preferred and alternative labels. Although MCT covers a wide range of manufacturing concepts, it has limited means for expressing the relationships between those concepts. To fill this gap, we use the Supply and Demand Open Knowledge Network (SUDOKN) ontology, an application ontology developed to represent the capabilities of manufacturing companies. The SUDOKN ontology provides the proposed KG construction framework with the expected patterns for the triplets (subject–predicate–object). The ontology was developed according to the Industrial Ontologies Foundry (IOF) [25] procedure and methodology. SUDOKN uses Basic Formal Ontology as the top-level ontology and IOF Core as the mid-level ontology. Some of the core classes and relationships of the SUDOKN ontology are shown in Fig. 3. In this case study, only a subset of SUDOKN classes and properties is used.

Fig. 3 The core classes and relationships in SUDOKN ontology

4.2 Supplier Triplet Dataset.

The foundation of constructing an SCKG begins with gathering data on supplier capabilities, primarily sourced from a supplier discovery platform of North Carolina: the ManufacturedNC database [26], which provides a rich, up-to-date source of information directly from the suppliers or through aggregated industry-specific insights. However, the challenge in integrating these data lies in the heterogeneity of website structures and the diverse expressions of manufacturing terms. This variability requires data extraction and standardization techniques to ensure accuracy and usability in an SCKG.

A supplier triplet dataset [27] is built for training and validating the ability of LLMs to extract supplier triplets from unstructured textual data. Page texts are extracted from 1000 supplier web links listed in the ManufacturedNC database. The data are prepared semiautomatically as pairs of prompts and completions for the LLM. The first part of each prompt is consistent and designed for extracting manufacturing triplets, while the second consists of raw text extracted from manufacturer web pages using a web scraping pipeline. For the completions, subjects and predicates are predefined by the SUDOKN ontology, and objects are mostly taken directly from the web pages to maintain accuracy, with the remaining objects double-checked and standardized by SMEs according to the SUDOKN ontology. SMEs also harmonize terms to industry standards, such as standardizing “Automotive-ICE” to “Automotive” and “Metal-Aluminum” to “Aluminum.” An example from the dataset is shown in Appendix A, and a sketch of how such pairs can be packaged for fine-tuning follows.
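The sketch below packages prompt–completion pairs into the JSONL chat format used for fine-tuning GPT-3.5-Turbo; the instruction string abbreviates the full ontology-bearing prompt reproduced in Appendix A, and the file name is a placeholder.

```python
# Sketch of preparing fine-tuning data in the OpenAI JSONL chat format.
import json

INSTRUCTION = (
    "The content is from a manufacturer's webpage. Please only return all "
    "extracted triplets in (s, p, o) considering one of the ontologies: ..."
)  # abbreviated; the full prompt lists the four SUDOKN triplet patterns

def to_jsonl(examples, path="supplier_triplets.jsonl"):
    # `examples` pairs raw page text with the SME-validated triplet completions.
    with open(path, "w") as f:
        for page_text, triplets in examples:
            record = {"messages": [
                {"role": "user", "content": f"{INSTRUCTION}\n\n{page_text}"},
                {"role": "assistant", "content": "\n".join(triplets)},
            ]}
            f.write(json.dumps(record) + "\n")
```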

4.3 Triplets Extraction.

Fine-tuning is employed in our case study for two reasons: to achieve high accuracy in extracting triplets from raw manufacturing text, as direct application of LLMs does not provide consistent performance, and to enable automatic data extraction for updating the SCKG as new manufacturer pages are added or existing pages are updated. To train an LLM specializing in supplier triplet extraction based on the proposed methodology, we employ GPT-3.5-Turbo-0125, which undergoes fine-tuning utilizing the dataset in Sec. 4.2. The input comprises a structured prompt that includes a company name and references the SUDOKN ontology in the triplet format described in Sec. 4.1. The model’s output consists of a series of triplets conforming to the outlined ontology.

The training process includes fine-tuning the model using a variety of training–validation–test splits with a maximum of 2301 training steps and a maximum of 1,104,183 trained tokens. We maintain the hyperparameter settings from OpenAI’s supervised fine-tuning approach [22], incorporating a dropout rate of 0.1 in the classifier and utilizing a learning rate of $6.25\times10^{-5}$ with a batch size of 32. This fine-tuning typically converges swiftly, requiring 3 epochs for optimal performance. To manage the learning rate effectively, we employ a linear decay schedule with warm-up over the initial 0.2% of training steps. To evaluate the fine-tuning task, we utilize several metrics including training loss, training token accuracy, validation loss, and validation token accuracy [28].
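For illustration, a fine-tuning job of this kind can be launched with the OpenAI Python client as sketched below. The file path is a placeholder, and the public API exposes a learning-rate multiplier rather than the absolute rate quoted above, so only epochs and batch size are shown.

```python
# Sketch of launching a GPT-3.5-Turbo fine-tune with the OpenAI Python client.
from openai import OpenAI

client = OpenAI()

# Upload the JSONL dataset prepared in Sec. 4.2 (file name is a placeholder).
training_file = client.files.create(
    file=open("supplier_triplets.jsonl", "rb"), purpose="fine-tune"
)

job = client.fine_tuning.jobs.create(
    model="gpt-3.5-turbo-0125",
    training_file=training_file.id,
    hyperparameters={"n_epochs": 3, "batch_size": 32},
)
print(job.id)  # poll this job until it reports the fine-tuned model name
```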

The FT model and Base models are evaluated based on their performance in precision, recall, F1 score, and accuracy, with these metrics derived by analyzing the triplets extracted from test data. To calculate these metrics, we count the number of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) generated by both models. For triplet extraction, a TP occurs when the model correctly identifies a relevant triplet present in the test data, an FP is when the model incorrectly identifies a triplet as relevant, a TN is when the model correctly identifies that no triplet should be extracted, and an FN happens when the model fails to identify a relevant triplet. These counts are pivotal for quantitatively assessing each model’s ability to accurately and reliably extract relevant information in the form of triplets from unstructured data.
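These counts can be obtained by set comparison between the predicted and SME-labeled gold triplets, as in the sketch below; accuracy is omitted here because it additionally relies on the TN convention described above.

```python
# Sketch of scoring extracted triplets against gold labels by set comparison.
def score_triplets(predicted: set, gold: set) -> dict:
    tp = len(predicted & gold)   # relevant triplets correctly extracted
    fp = len(predicted - gold)   # extracted triplets not in the gold labels
    fn = len(gold - predicted)   # gold triplets the model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}

# e.g., triplets as hashable tuples:
# score_triplets({("A&G Machining, LLC", "has_material_capability", "Steel")},
#                {("A&G Machining, LLC", "has_material_capability", "Steel")})
```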

The comparative analysis of the FT model against the base model, GPT-3.5-Turbo-0125, for triplet extraction tasks across different data splits in Table 2 demonstrates a noticeable enhancement in performance metrics on test data due to fine-tuning. Specifically, the FT model shows its highest efficacy in the 8:1:1 data split, achieving significant improvements in precision, recall, F1 score, and accuracy compared to the base model. Additionally, the 5:2.5:2.5 data split indicates that even with a smaller training set, the FT model maintains high performance and robustness, proving its effectiveness on a larger test set. Overall, these results suggest the benefits of fine-tuning LLMs in improving the model’s effectiveness for supplier triplet extraction.

Table 2

Comparison of LLM performances across different data splits

Data split   Model   Precision (%)   Recall (%)   F1 score (%)   Accuracy (%)
5:2.5:2.5    Base    58.73           57.10        55.25          41.03
5:2.5:2.5    FT      91.69           94.24        92.37          87.92
8:1:1        Base    56.13           56.07        52.62          37.85
8:1:1        FT      93.80           95.61        94.42          91.09

Note: Bold emphasizes the best performances.

4.4 Entity Normalization.

Entity normalization ensures uniformity in entities extracted from diverse supplier data extraction pipelines and merges semantically identical but structurally varied entities (e.g., “ISO 9001,” “ISO-9001,” “ISO9001”), improving graph coherence. Without normalization, entities from different data pipelines may use various expressions for the same concepts, making aggregation and comparison difficult and reducing analysis effectiveness. Additionally, standardized terms from LLMs may not align with the thesaurus, leading to a loss of hierarchical and relational context when connecting entities from raw data and entities from the thesaurus in SCKG.

To build an entity normalization model, we integrate RAG with the base GPT-3.5-Turbo model and an indexing system derived from the MCT. The process involves identifying an entity and attempting to match it with a term from the thesaurus. Upon finding a potential match, we compute the Jaccard similarity between the prenormalization entity and the proposed thesaurus concept. If this similarity score exceeds a predetermined threshold θ, we proceed with the replacement, thereby normalizing the entity. Should the score fall short of the threshold, we retain the original entity without alteration. This method ensures that only terms that are both semantically and structurally similar are normalized, preserving the accuracy and relevance of the entity data within the KG.

We utilize precision, recall, and F1 score to assess the performance of the entity normalization model. To evaluate these metrics, we count the number of TP, FP, and FN within the set of normalized entities. Specifically, a TP is recorded when the model accurately normalized an entity, an FP occurs when an entity is incorrectly normalized, and an FN is noted when the model misses an entity that requires normalization. The criteria for determining accurate normalization depend on whether the Jaccard similarity between the entities before and after normalization exceeds θ, serving as the benchmark for successful entity normalization.

In a sample of 200 entities randomly chosen from the triplets generated by the FT model on test data, 113 entities are prenormalized by RAG in line with terms from the MCT. The results in Fig. 4 indicate how the entity normalization model’s performance metrics—precision, recall, and F1 score—vary with different threshold settings. As θ increases from 0.2 to 0.9, precision improves significantly, starting at 37.61% and reaching 100% at the highest threshold. Recall, on the other hand, starts off perfect at lower thresholds but begins to decrease from 0.7 onward, dropping to 63.41% at the 0.9 threshold. The F1 score, which balances precision and recall, initially increases with the threshold, peaking at 85.71% when θ=0.8, which reflects the optimal balance between precision and recall for the entity normalization model. Thus, we set θ=0.8 for the entity normalization model in the subsequent phase of graph construction.

Fig. 4 Evaluation of entity normalization under different θ values (horizontal axis)

4.5 Supplier Capability Knowledge Graph.

To construct the SCKG from the 1000 suppliers’ original page text, the FT LLM from Sec. 4.3 with the best performance, guided by the ontology from Sec. 4.1, is applied to identify entities and generate triplets that define supplier capabilities. Then, we employ the entity normalization model with the predefined θ from Sec. 4.4 to normalize entities within the triplets according to the MCT. Additional triplets, including “subclass of” and “same as” relationships along with their entities, are extracted from the thesaurus to enrich the SCKG with a hierarchy of manufacturing capabilities. Furthermore, the “same as” relationship addresses the limitations of entity normalization by allowing the merging of semantically identical but structurally distinct entities, such as linking “additive manufacturing” with “3D printing” and “turning” with “lathe work.” The SCKG, designed for supplier capability sourcing, comprises five entity types and six relationship types, totaling 1663 entities and 6911 relationships. As depicted in Fig. 5, it includes entity labels such as “Supplier,” “Capability,” “Certification,” “Industry,” and “Material.” Both the entity labels and relation labels in the SCKG are derived from the underlying ontology.

Fig. 5 An example of SCKG in Neo4j

4.6 Graph-Based Question-Answering System.

Developing question-answering systems (QASs) for supplier discovery presents a series of unique challenges. Foremost among these is the task of structuring knowledge in an accessible manner that allows users to efficiently query information. This requires a carefully designed architecture that can categorize and present data in an intuitive way, ensuring that users can retrieve well-integrated, accurate information. Additionally, maintaining the currency of information poses a significant hurdle. The manufacturing domain is characterized by frequent updates, from changes in manufacturing capabilities to shifts in supply chain dynamics. Implementing mechanisms for continuous monitoring and updating of data is crucial to provide users with reliable and current information. Addressing these challenges is essential for creating a QAS that can effectively facilitate supplier discovery, enhancing connectivity within the manufacturing industry.

The evaluation of the QAS employs a set of metrics, including precision, recall, F1 score, and accuracy, to provide a comprehensive assessment of question answering. These metrics are based on the counts of TP, TN, FP, and FN, which offer direct insight into the accuracy and errors in the answers generated by the system. To prepare these metrics, duplicated terms are removed from the answers, where TP denotes the number of correctly identified terms such as company names and capabilities, FP accounts for incorrectly identified terms, FN represents the correct terms that were missed, and TN is calculated when the sum of TP and FP matches the total number of correct terms required by the query with no excess. This suite of metrics provides a thorough and quantitative assessment of our system’s quality in supplier sourcing.

Our graph-based QAS integrates an SCKG with LLMs to enable querying of structured supplier data through questions posed in natural language. Initially, the SCKG is constructed within Neo4j. The Langchain library is then employed to connect the LLM with the Neo4j graph database, and the schema of the Neo4j database is updated regularly to maintain data consistency [29]. When a user poses a question, the LLM automatically generates Cypher queries to search for relevant entities and relationships within the SCKG. The results are subsequently fed back to the LLM to assist in crafting reliable and interpretable answers.
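A minimal sketch of this wiring is shown below. Package layout varies across LangChain releases (recent ones also require an explicit allow_dangerous_requests flag on the chain), so the imports reflect one plausible arrangement rather than a fixed API, and the connection details are placeholders.

```python
# Sketch of the LangChain + Neo4j QAS wiring described above.
from langchain_community.graphs import Neo4jGraph
from langchain_openai import ChatOpenAI
from langchain.chains import GraphCypherQAChain

graph = Neo4jGraph(url="bolt://localhost:7687",
                   username="neo4j", password="password")
graph.refresh_schema()  # keep the schema shown to the LLM in sync with the database

chain = GraphCypherQAChain.from_llm(
    llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0),
    graph=graph,
    verbose=True,  # print the generated Cypher, which aids interpretability
)

result = chain.invoke(
    {"query": "List companies in NC with plastic capability and ISO 13485."}
)
print(result["result"])
```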

To assess the efficacy of our SCKG-based QAS, we developed 30 different questions and compared the responses with those from QAS systems utilizing other information indexing methods, including KnowledgeGraphIndex [30] and SummaryIndex [31], which also leverage the same source texts as our SCKG. The SummaryIndex organizes data by chunking document texts into nodes and storing them sequentially in a list, synthesizing answers by iterating through these nodes with optional filters during query time. The KnowledgeGraphIndex constructs a KG by extracting triplets from text and utilizes this graph in conjunction with an LLM’s predictive abilities to enhance query responses. All the methods we compared are integrated with the same LLM, GPT-3.5 Turbo, to ensure a consistent basis for comparison across different indexing approaches. One typical question is provided in Fig. 6, and more examples are detailed in Appendix B.

Fig. 6

According to Table 3, the SCKG Index, which integrates our tailored SCKG, demonstrated superior performance across all evaluated metrics—precision, recall, F1 score, and accuracy—highlighting its effectiveness in accurately matching queries with the correct answers. In contrast, the SummaryIndex and KnowledgeGraphIndex, despite exhibiting higher recall rates, lagged in precision, which negatively impacted their overall F1 scores and accuracy percentages. The high recall yet low precision of these methods suggests that while they are capable of retrieving a large number of relevant documents, they also fetch a substantial amount of irrelevant information, leading to less accurate results. The evaluation results underline the benefits of the proposed approach in enhancing the precision and relevancy of query responses within the QAS framework.

Table 3

Performance metrics for different QAS

QAS with different indexing methods   Precision (%)   Recall (%)   F1 score (%)   Accuracy (%)
SummaryIndex                          24.02           84.72        37.42          24.16
KnowledgeGraphIndex                   30.00           92.59        45.32          33.21
SCKG Index                            94.76           98.51        96.60          94.81

Note: Bold emphasizes the best performances.

5 Discussion

The advantages of the proposed approach are as follows: first, it integrates high-quality supplier data, ensuring the reliability of the responses it generates. The SCKG utilized is developed with high accuracy, ensuring robust data integration. Second, the system increases the interpretability of the answers provided by the LLM. This is achieved through a transparent process that includes initiating a new GraphCypherQAChain, generating a Cypher query, returning graph data, and finally, producing the answer. This clear sequence of steps allows users to understand how the answers are derived, adding a layer of trust and clarity to the interactions with the system. Third, the SCKG is easy to scale up through our pipeline; the only input required is an original text file, and the output will be the supplier triplets with normalized entities, which can be seamlessly added to the graph. Finally, the construction of a QAS is easy to generalize to other domains. With the support of different ontologies and thesauri, it can benefit not only supplier sourcing but also other high-value applications, such as healthcare management, legal research, educational content discovery, and customer service automation. This adaptability opens up vast opportunities for applying the proposed methodology across various sectors, enhancing data accessibility and decision-making processes in industries that require robust information retrieval and analysis systems.

Beyond these advantages, this study has identified several areas that present valuable opportunities for further research and improvement. Primarily, the effectiveness of the SCKG and the associated framework is closely tied to the quality and integrity of the supplier data. Any inconsistencies or errors in the initial datasets could affect system performance, even with the support of ontologies and thesauri. Consequently, there is a need for more diversified data to fine-tune LLMs on triplet extraction, enabling them to effectively handle various text resources. Moreover, adaptive fine-tuning [32] can be considered; it can adjust the model to better handle the specific characteristics and anomalies found in the supplier data, thus improving the overall resilience and performance of the system. Exploring semisupervised or unsupervised methods for triplet extraction could reduce the reliance on labeled data and SMEs, making the framework more scalable [33,34]. Additionally, we will consider using LangChain agents to enhance system robustness, as LangChain allows automatic switching to alternative tools if one fails, ensuring the system adapts to failure modes and continues functioning effectively.

Looking ahead, there are several avenues for future work to enhance the capabilities of our research. First, scaling up the ontology by incorporating additional categories such as brands, equipment, and product information, along with expanding the thesaurus, will enrich the knowledge base and support the inclusion of more suppliers. The framework presented in this article uses property graphs. To enhance the semantic features and utilities of the KG, resource description framework (RDF) graphs will be used. Since RDF graphs are directly aligned with an axiomatic ontology, reasoning and inference can be readily adapted to automatically extend the graph and apply consistency checking. Second, the development of multimodal KGs [35] that integrate images, videos, and 3D models will provide a more comprehensive and interactive repository, enabling richer data interaction and visualization. Lastly, generalizing our methodology across different ontologies [36] and thesauri will extend its applicability to various aspects of supply chain management. This broader application could revolutionize how supply chains are managed by integrating diverse data types and sources, thus enhancing decision-making processes and operational efficiency across industries.

6 Conclusion

In this article, we introduce a novel methodology that harnesses the synergies between LLMs and KGs to effectively address the complexities of supply chain management, with a particular emphasis on supplier discovery. Our methodology, which utilizes ontology-driven graph construction and thesaurus-assisted RAG, significantly enhances the integration and utilization of supply chain data. Through a detailed case study, we develop an SCKG, which encompasses five entity types and six relationship types across 1663 entities and 6911 relationships. Leveraging this methodology, we construct an SCKG-based QAS. The results demonstrate that our QAS not only integrates high-quality supplier data to produce reliable responses but also improves the interpretability of answers provided by LLMs through a clear and traceable process. The design of the system facilitates scalability and seamless integration of new data, illustrating its potential for broad application across various fields beyond supplier sourcing. The adaptability of our framework to different ontologies and thesauri underscores its flexibility and broad applicability, promising significant advancements in data accessibility and decision-making processes across multiple industries.

Acknowledgment

Funding for this research is provided by the National Science Foundation (NSF) Award #2333801.

Conflict of Interest

There are no conflicts of interest.

Data Availability Statement

The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.

Appendix A: Supplier Triplet Dataset Example

This appendix provides an example of a prompt-completion pair based on A&G Machining, LLC, a supplier in the ManufacturedNC database.

The first half of the prompt is consistent with other prompts, including the SUDOKN ontology for supplier data extraction:

The content is from a manufacturer’s webpage. Please only return all extracted triplets in (s, p, o) considering one of the ontologies:

(manufacturer name, has_process_capability, manufacturing capability)

(manufacturer name, supplies_to_industry, industry)

(manufacturer name, has_certificate, manufacturing certification)

(manufacturer name, has_material_capability, materials).

The second half is the raw text scraped from the web page of the supplier [37].

The completion is structured as follows:

(A&G Machining, LLC, supplies_to_industry, Fabricated Metal Product Manufacturing)

(A&G Machining, LLC, supplies_to_industry, Aerospace)

(A&G Machining, LLC, supplies_to_industry, Automotive)

(A&G Machining, LLC, supplies_to_industry, Military)

(A&G Machining, LLC, has_process_capability, Fabrication)

(A&G Machining, LLC, has_process_capability, Machining)

(A&G Machining, LLC, has_certificate, ISO Compliant)

(A&G Machining, LLC, has_material_capability, Aluminum)

(A&G Machining, LLC, has_material_capability, Steel)

Appendix B: Question-Answering Comparison

As detailed in Sec. 4.6, our study provides a comparative and in-depth analysis of thirty question-and-answer evaluations for supplier sourcing. This section selects four representative questions and shows the performance of responses generated by an LLM utilizing the SCKG index compared to other indexing strategies, as depicted in Figs. 7–9. TPs and FPs are identified by comparing the generated responses against the correct answers, which are sourced from a comprehensive dataset of 1000 suppliers’ web pages [26]. The responses visually emphasize TPs in bold and FPs underlined. The SCKG-based QAS demonstrates superior indexing and precise retrieval capabilities for supplier data. While the SummaryIndex is good at retrieving answers from consolidated documents, it is prone to collating adjacent terms irrespective of their correctness, leading to potential inaccuracies or duplicates. The KnowledgeGraphIndex proficiently retrieves entities formatted as triplets, although not all extracted information aligns accurately with the structured triplets, due to the lack of a suitable manufacturing ontology.

Fig. 7
Fig. 8
Fig. 9

References

1. Manufactured in North Carolina, 2024, “Manufactured in North Carolina.” https://www.manufacturednc.com/, Accessed March 22, 2024.

2. MFG.com. https://www.mfg.com/, Accessed April 3, 2024.

3. Thomas Publishing Company, 2024, “Thomasnet.” https://www.thomasnet.com, Accessed April 3, 2024.

4. OpenAI, “OpenAI Chat.” https://chat.openai.com/, Accessed April 3, 2024.

5. Carta, S., Giuliani, A., Piano, L., Podda, A. S., Pompianu, L., and Tiddia, S. G., 2023, “Iterative Zero-Shot LLM Prompting for Knowledge Graph Construction,” arXiv. https://arxiv.org/abs/2307.01128

6. Ko, H., Witherell, P., Lu, Y., Kim, S., and Rosen, D. W., 2021, “Machine Learning and Knowledge Graph Based Design Rule Construction for Additive Manufacturing,” Addit. Manuf., 37, p. 101620.

7. Sabbagh, R., Ameri, F., and Yoder, R., 2018, “Thesaurus-Guided Text Analytics Technique for Capability-Based Classification of Manufacturing Suppliers,” ASME J. Comput. Inf. Sci. Eng., 18(3), p. 031009.

8. Ameri, F., and McArthur, C., 2014, “Semantic Rule Modelling for Intelligent Supplier Discovery,” Int. J. Comput. Integr. Manuf., 27(6), pp. 570–590.

9. Lee, J., Jung, K., Kim, B. H., Peng, Y., and Cho, H., 2015, “Semantic Web-Based Supplier Discovery System for Building a Long-Term Supply Chain,” Int. J. Comput. Integr. Manuf., 28(2), pp. 155–169.

10. Mesmer, L., and Olewnik, A., 2018, “Enabling Supplier Discovery Through a Part-Focused Manufacturing Process Ontology,” Int. J. Comput. Integr. Manuf., 31(1), pp. 87–100.

11. Papa, M., Ioannis, C., and Aris, A., 2024, “Automated Natural Language Processing-Based Supplier Discovery for Financial Services,” Big Data, 12, pp. 30–48.

12. Sun, Q., Huang, K., Yang, X., Tong, R., Zhang, K., and Poria, S., 2024, “Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-Shot Document-Level Relation Triplet Extraction,” preprint arXiv:2401.13598.

13. Rusu, D., Dali, L., Fortuna, B., Grobelnik, M., and Mladenic, D., 2007, “Triplet Extraction From Sentences,” Proceedings of the 10th International Multiconference, Information Society-IS, Ljubljana, Slovenia, Oct. 8–12, pp. 8–12.

14. Nayak, T., Majumder, N., Goyal, P., and Poria, S., 2021, “Deep Neural Approaches to Relation Triplets Extraction: A Comprehensive Survey,” Cogn. Comput., 13(5), pp. 1215–1232.

15. Modarressi, A., Imani, A., Fayyaz, M., and Schütze, H., 2023, “Ret-LLM: Towards a General Read-Write Memory for Large Language Models,” preprint arXiv:2305.14322.

16. Yang, L., Chen, H., Li, Z., Ding, X., and Wu, X., 2023, “ChatGPT Is Not Enough: Enhancing Large Language Models With Knowledge Graphs for Fact-Aware Language Modeling,” preprint arXiv:2306.11489.

17. Soman, K., Rose, P. W., Morris, J. H., Akbas, R. E., Smith, B., Peetoom, B., Villouta-Reyes, C., et al., 2024, “Biomedical Knowledge Graph-Optimized Prompt Generation for Large Language Models,” Bioinformatics, 40(9), p. btae560.

18. Jiang, X., Zhang, R., Xu, Y., Qiu, R., Fang, Y., Wang, Z., Tang, J., et al., 2024, “HyKGE: A Hypothesis Knowledge Graph Enhanced Framework for Accurate and Reliable Medical LLMs Responses,” arXiv. https://arxiv.org/abs/2312.15883

19. Li, Y., and Starly, B., 2024, “Building a Knowledge Graph to Enrich ChatGPT Responses in Manufacturing Service Discovery,” J. Ind. Inf. Integr., 40, p. 100612.

20. Zhou, B., Li, X., Liu, T., Xu, K., Liu, W., and Bao, J., 2024, “CausalKGPT: Industrial Structure Causal Knowledge-Enhanced Large Language Model for Cause Analysis of Quality Problems in Aerospace Product Manufacturing,” Adv. Eng. Inform., 59, p. 102333.

21. Xiao, Y., Zheng, S., Shi, J., Du, X., and Hong, J., 2023, “Knowledge Graph-Based Manufacturing Process Planning: A State-of-the-Art Review,” J. Manuf. Syst., 70, pp. 417–435.

22. Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I., 2018, “Improving Language Understanding by Generative Pre-Training,” OpenAI Technical Report.

23. Hao, X., Ji, Z., Li, X., Yin, L., Liu, L., Sun, M., Liu, Q., and Yang, R., 2021, “Construction and Application of a Knowledge Graph,” Remote Sens., 13(13), p. 2511.

24. Ameri, F., and Bernstein, W., 2017, “A Thesaurus-Guided Framework for Visualization of Unstructured Manufacturing Capability Data,” Advances in Production Management Systems. The Path to Intelligent, Collaborative and Sustainable Manufacturing: IFIP WG 5.7 International Conference, APMS 2017, Proceedings, Part I, Hamburg, Germany, Sept. 3–7, Springer, pp. 202–212.

25. Drobnjakovic, M., Kulvatunyou, B., Ameri, F., Will, C., Smith, B., and Jones, A., 2022, “The Industrial Ontologies Foundry (IOF) Core Ontology.”

26. “Manufactured in North Carolina.” https://www.manufacturednc.com/, Accessed April 3, 2024.

27. Li, Y., Ko, H., and Ameri, F., 2024, “Supplier Triplet Dataset.” https://figshare.com/articles/journal_contribution/Supplier_Triplet_Dataset/26090326, doi:10.6084/m9.figshare.26090326.v1.

28. OpenAI, 2024, “Fine-Tuning Guide.” https://platform.openai.com/docs/guides/fine-tuning, Accessed June 20, 2024.

29. LangChain, 2023, “Neo4j Cypher Integration - LangChain Documentation.” Accessed April 15, 2024.

30. LlamaIndex, 2023, “Knowledge Graph - LlamaIndex API Reference.” Accessed April 15, 2024.

31. LlamaIndex, 2023, “SummaryIndexRetriever - LlamaIndex API Reference.” Accessed April 15, 2024.

32. Kong, J., Wang, J., and Zhang, X., 2022, “Hierarchical BERT With an Adaptive Fine-Tuning Strategy for Document Classification,” Knowl. Based Syst., 238, p. 107872.

33. Carlson, A., Betteridge, J., Wang, R. C., Hruschka Jr., E. R., and Mitchell, T. M., 2010, “Coupled Semi-Supervised Learning for Information Extraction,” Proceedings of the Third ACM International Conference on Web Search and Data Mining, New York City, Feb. 3–6, pp. 101–110.

34. Dalvi, B. B., Cohen, W. W., and Callan, J., 2012, “WebSets: Extracting Sets of Entities From the Web Using Unsupervised Information Extraction,” Proceedings of the Fifth ACM International Conference on Web Search and Data Mining, Seattle, WA, Feb. 8–12, pp. 243–252.

35. Zhu, X., Li, Z., Wang, X., Jiang, X., Sun, P., Wang, X., Xiao, Y., and Yuan, N. J., 2024, “Multi-Modal Knowledge Graph Construction and Application: A Survey,” IEEE Trans. Knowl. Data Eng., 36(2), pp. 715–735.

36. Sulaeman, M. M., and Harsono, M., 2021, “Supply Chain Ontology: Model Overview and Synthesis,” J. Mantik, 5(2), pp. 790–799.

37. A&G Machining LLC. https://www.manufacturednc.com/A-and-G-Machining-LLC, Accessed April 2024.