OpenSPG/KAG: KAG is a logical form-guided reasoning and retrieval framework based on the OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge bases. It effectively overcomes the shortcomings of the traditional RAG vector similarity calculation model.

KAG is a logical reasoning and Q&A framework based on the OpenSPG engine and large language models (LLMs), used to build logical reasoning and Q&A solutions for vertical domain knowledge bases. KAG effectively overcomes the ambiguity of traditional RAG vector similarity calculation and the noise that OpenIE introduces into GraphRAG. KAG supports logical reasoning, multi-hop factual Q&A, and similar tasks, and significantly outperforms current SOTA methods.
The goal of KAG is to build a knowledge-enhanced LLM service framework for professional domains, supporting logical reasoning, factual Q&A, and more. KAG fully integrates the logical and factual characteristics of knowledge graphs (KGs). Its main features include:
- Knowledge and Chunk Mutual Indexing structure to integrate more complete text information in context
- Knowledge alignment using conceptual semantic reasoning to mitigate the noise problem caused by OpenIE
- Schema-constrained knowledge construction to support domain expert knowledge representation and construction
- Logical form-guided hybrid reasoning and retrieval to support logical reasoning and multi-hop reasoning Q&A
In the context of private knowledge bases, unstructured data, structured information, and business expert experience often coexist. KAG draws on the DIKW hierarchy to upgrade SPG into an LLM-friendly version.
For unstructured data such as news, events, logs, and books, as well as structured data such as transactions, statistics, and approvals, along with business experience and domain knowledge rules, KAG uses techniques such as layout analysis, knowledge extraction, property normalization, and semantic alignment to integrate raw business data and expert rules into a unified business knowledge graph.
This makes KAG compatible with both schema-free information extraction and schema-constrained expert knowledge construction for the same knowledge type (e.g., entity type, event type), and it supports a mutual-index representation between the graph structure and the original text chunks.
This mutual-index representation helps build an inverted index based on the graph structure and facilitates unified representation and reasoning over logical forms.
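The mutual index between graph nodes and text chunks can be pictured with a small sketch. This is only an illustration under stated assumptions: the `MutualIndex` class, its method names, and the toy sentences are hypothetical and are not part of KAG's API. It shows the core idea of a two-way lookup, from an entity to the chunks that mention it and from a chunk back to its entities.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class MutualIndex:
    """Toy two-way index between graph entities and text chunks (hypothetical)."""

    chunks: Dict[str, str] = field(default_factory=dict)
    entity_to_chunks: Dict[str, Set[str]] = field(
        default_factory=lambda: defaultdict(set)
    )
    chunk_to_entities: Dict[str, Set[str]] = field(
        default_factory=lambda: defaultdict(set)
    )

    def add_chunk(self, chunk_id: str, text: str, entities: List[str]) -> None:
        """Store a chunk and link it to every entity extracted from it."""
        self.chunks[chunk_id] = text
        for entity in entities:
            self.entity_to_chunks[entity].add(chunk_id)
            self.chunk_to_entities[chunk_id].add(entity)

    def chunks_for(self, entity: str) -> List[str]:
        """Graph -> text: ids of chunks that mention the entity."""
        return sorted(self.entity_to_chunks.get(entity, set()))

    def entities_for(self, chunk_id: str) -> List[str]:
        """Text -> graph: entities linked to the chunk."""
        return sorted(self.chunk_to_entities.get(chunk_id, set()))


index = MutualIndex()
index.add_chunk("c1", "Toy sentence mentioning Alpha and Beta.", ["Alpha", "Beta"])
index.add_chunk("c2", "Toy sentence mentioning Beta only.", ["Beta"])
print(index.chunks_for("Beta"))    # ['c1', 'c2']
print(index.entities_for("c1"))    # ['Alpha', 'Beta']
```

Keeping both directions means a graph hit can be expanded with its source text for fuller context, and a text hit can be followed into the graph for further hops.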
KAG proposes a logical form-guided hybrid solving and reasoning engine.
The engine includes three types of operators: planning, reasoning, and retrieval, which transform natural language problems into problem-solving processes that combine language and symbols.
In this process, each step can use a different operator, such as exact match retrieval, text retrieval, numerical calculation, or semantic reasoning, which integrates four different problem-solving processes: retrieval, knowledge graph reasoning, language reasoning, and numerical calculation.
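The step-by-step solving process described above can be pictured as a plan whose steps each name an operator. The sketch below is a minimal illustration, not KAG's engine: the plan is written by hand (no LLM planning is involved), and `TRIPLES`, `CHUNKS`, `OPERATORS`, `solve`, and the toy data are all hypothetical names introduced for this example. It combines exact match retrieval over triples, keyword-based text retrieval, and a numerical calculation.

```python
from typing import Any, Callable, Dict, List, Tuple

# Toy knowledge graph as (subject, predicate, object) triples (made-up data).
TRIPLES: List[Tuple[str, str, str]] = [
    ("Plant A", "annual_output", "120"),
    ("Plant B", "annual_output", "80"),
]

# Toy text chunks for the text-retrieval operator (made-up data).
CHUNKS: Dict[str, str] = {
    "c1": "Plant A reports its output once a year.",
    "c2": "Plant B shares a warehouse with Plant A.",
}

Results = Dict[str, Any]


def exact_match(args: Tuple[str, str], results: Results) -> List[str]:
    """Exact match retrieval: objects for a given subject and predicate."""
    subject, predicate = args
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def text_search(args: Tuple[str], results: Results) -> List[str]:
    """Text retrieval: ids of chunks containing the keyword (case-insensitive)."""
    (keyword,) = args
    return [cid for cid, text in CHUNKS.items() if keyword.lower() in text.lower()]


def numeric_sum(args: Tuple[str, ...], results: Results) -> float:
    """Numerical calculation: sum the outputs of the named earlier steps."""
    return sum(float(value) for step in args for value in results[step])


OPERATORS: Dict[str, Callable[[Any, Results], Any]] = {
    "retrieve": exact_match,
    "search": text_search,
    "sum": numeric_sum,
}


def solve(plan: List[Tuple[str, str, tuple]]) -> Results:
    """Run each (step_name, operator, args) in order, storing results by name."""
    results: Results = {}
    for name, operator, args in plan:
        results[name] = OPERATORS[operator](args, results)
    return results


# Hand-written plan for "What is the combined annual output of Plant A and B?"
plan = [
    ("s1", "retrieve", ("Plant A", "annual_output")),
    ("s2", "retrieve", ("Plant B", "annual_output")),
    ("s3", "sum", ("s1", "s2")),
    ("s4", "search", ("warehouse",)),
]
results = solve(plan)
print(results["s3"])  # 200.0
print(results["s4"])  # ['c2']
```

Because later steps refer to earlier ones by name, a symbolic step such as `sum` can consume what a retrieval step returned, which is the kind of mixing of retrieval and calculation the paragraph describes.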
News
- 2024.11.21: Support for uploading Word documents, configurable model invocation concurrency, user experience optimizations, etc.
- 2024.10.25: KAG initial release
Future plans
- Domain knowledge injection, domain schema customization, QFS task support, visual query analysis, etc.
- Logical reasoning optimization, support for conversational tasks
- kag-model release, and solutions for reasoning over event knowledge graphs and medical knowledge graphs
- Front-end open source, distributed build support, mathematical reasoning optimization
Use the following commands to download the docker-compose.yml file and launch the Docker Compose services.
# set the HOME environment variable (only Windows users need to execute this command)
# set HOME=%USERPROFILE%
curl -sSL https://raw.githubusercontent.com/OpenSPG/openspg/refs/heads/master/dev/release/docker-compose-west.yml -o docker-compose-west.yml
docker compose -f docker-compose-west.yml up -d
Navigate to the default KAG product URL in your browser: http://127.0.0.1:8887
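If the page does not load, it can help to check from a script whether anything is answering on that port yet. The following is a small sketch using only the Python standard library; the helper name `is_reachable` is hypothetical, and the check only confirms that some HTTP server responds at the URL, not that KAG itself is healthy.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url`, even with an error status."""
    # An empty ProxyHandler disables any configured proxy, since the target is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with a 4xx/5xx status.
        return True
    except OSError:
        # Covers URLError, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Default product URL from the step above.
    print("KAG product page reachable:", is_reachable("http://127.0.0.1:8887"))
```

A `False` result right after `docker compose up -d` may simply mean the containers are still starting, so retrying after a short wait is reasonable.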
See the Quick Start for Product Mode for a detailed introduction.
First install the engine and its dependent images by following the Docker Compose steps above (section 3.1).
macOS / Linux developers
# Create and activate the conda environment
conda create -n kag-demo python=3.10 && conda activate kag-demo
# Clone the code
git clone https://github.com/OpenSPG/KAG.git
# Install KAG
cd KAG && pip install -e .
Windows developers
# Install the official Python 3.8.10 or later, and install Git.
# Create and activate a Python venv
py -m venv kag-demo && kag-demo\Scripts\activate
# Clone the code
git clone https://github.com/OpenSPG/KAG.git
# Install KAG
cd KAG && pip install -e .
Please see the Quick Start for Developer Mode guide for a detailed introduction to the toolkit. You can then use the built-in components to reproduce the performance results on the built-in datasets, and apply those components to new business scenarios.
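After either install path, a quick way to confirm that the editable install is visible to the active environment is to check whether the package can be found. This sketch assumes the top-level module installed by `pip install -e .` is named `kag`; the helper `module_available` is a hypothetical name introduced here.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "kag" is an assumed module name; adjust if the package exposes another.
    print("kag importable:", module_available("kag"))
```

If this prints `False`, the most likely cause is that the conda environment or venv created above is not the one currently activated.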
The KAG framework includes three parts: kg-builder, kg-solver, and kag-model. This release includes only the first two parts; kag-model will be open-sourced gradually in future releases.
kg-builder implements a knowledge representation that is friendly to large language models (LLMs). Based on the DIKW hierarchy (data, information, knowledge, and wisdom), it upgrades the knowledge representation ability of SPG, and is compatible with both schema-free information extraction and schema-constrained professional knowledge construction for the same knowledge type (such as entity type and event type). It also supports a mutual-index representation between the graph structure and the original text chunks, which enables efficient retrieval during the reasoning and question-answering stage.
kg-solver uses a logical form-guided hybrid solving and reasoning engine that includes three types of operators: planning, reasoning, and retrieval, to transform natural language problems into a problem-solving process that combines language and symbols. In this process, each step can use a different operator, such as exact match retrieval, text retrieval, numerical calculation, or semantic reasoning, which integrates four different problem-solving processes: retrieval, knowledge graph reasoning, language reasoning, and numerical calculation.
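One way to picture schema-constrained and schema-free knowledge coexisting for the same entity type is to partition extracted records by whether they fit a declared schema, keeping both groups. The sketch below is purely illustrative: `SCHEMA`, `split_by_schema`, the record layout, and the sample values are hypothetical and do not reflect how kg-builder represents schemas.

```python
from typing import Any, Dict, List, Set, Tuple

# Hypothetical schema: the properties an expert has declared for each type.
SCHEMA: Dict[str, Set[str]] = {
    "Company": {"name", "founded"},
    "Event": {"name", "date"},
}

Record = Dict[str, Any]


def split_by_schema(records: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Partition records into schema-conforming and schema-free groups.

    A record conforms when its type is declared and every property it
    carries is allowed for that type. Nothing is discarded: records that
    do not conform are kept as schema-free knowledge.
    """
    constrained: List[Record] = []
    schema_free: List[Record] = []
    for record in records:
        allowed = SCHEMA.get(record["type"])
        if allowed is not None and set(record["props"]) <= allowed:
            constrained.append(record)
        else:
            schema_free.append(record)
    return constrained, schema_free


records = [
    {"type": "Company", "props": {"name": "Acme", "founded": "1999"}},
    {"type": "Company", "props": {"name": "Acme", "slogan": "Toy value"}},
    {"type": "Rumor", "props": {"text": "Toy value"}},
]
constrained, schema_free = split_by_schema(records)
print(len(constrained), len(schema_free))  # 1 2
```

Both `Company` records survive here, one under the schema and one outside it, which mirrors the idea that the same knowledge type can hold expert-defined and freely extracted information side by side.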
GitHub: https://github.com/OpenSPG/KAG
OpenSPG: https://spg.openkg.cn/
KAG introduction and applications: https://github.com/orgs/OpenSPG/discussions/52
If you use this software, please cite it as below:
@article{liang2024kag,
title={KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation},
author={Liang, Lei and Sun, Mengshu and Gui, Zhengke and Zhu, Zhongshu and Jiang, Zhouyu and Zhong, Ling and Qu, Yuan and Zhao, Peilong and Bo, Zhongpu and Yang, Jin and others},
journal={arXiv preprint arXiv:2409.13731},
year={2024}
}
@article{yikgfabric,
title={KGFabric: A Scalable Knowledge Graph Warehouse for Enterprise Data Interconnection},
author={Yi, Peng and Liang, Lei and Da Zhang, Yong Chen and Zhu, Jinye and Liu, Xiangyu and Tang, Kun and Chen, Jialin and Lin, Hao and Qiu, Leijie and Zhou, Jun}
}