
LlamaIndex: A Data Framework for LLMs

Introduction

LlamaIndex focuses on connecting private data to LLMs. Compared with a general-purpose orchestration framework such as LangChain, it offers richer out-of-the-box data indexing and query abstractions.

```bash
pip install llama-index llama-index-llms-openai llama-index-embeddings-huggingface
```

Quick-Start RAG

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Configuration: an OpenAI-compatible endpoint plus a local embedding model
Settings.llm = OpenAI(
    model="qwen-turbo",
    api_key="sk-xxx",
    api_base="https://dashscope.aliyuncs.com/compatible-mode/v1"
)
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-zh-v1.5")

# Load documents
documents = SimpleDirectoryReader("./finance_docs").load_data()

# Build the index
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("What was China Merchants Bank's non-performing loan ratio in 2024?")
print(response)
print(f"\nSources: {[node.metadata for node in response.source_nodes]}")
```
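With `similarity_top_k=3`, the query engine embeds the question and keeps the three chunks whose embeddings score highest against it. A minimal stdlib sketch of that top-k cosine ranking (the `cosine` and `top_k` helpers and the toy vectors are illustrative, not LlamaIndex API):

```python
import math

def cosine(a, b):
    # cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=3):
    # score every document embedding against the query, keep the k best
    scored = sorted(doc_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]
```

The real retriever does the same ranking over stored chunk embeddings, then passes the winning chunks to the LLM as context.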

Persisting the Index

```python
from llama_index.core import StorageContext, load_index_from_storage

# Save
index.storage_context.persist(persist_dir="./index_storage")

# Load
storage_context = StorageContext.from_defaults(persist_dir="./index_storage")
index = load_index_from_storage(storage_context)
```
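The point of persisting is that the expensive step (embedding every chunk) runs only once; later runs just reload from disk. A toy stdlib sketch of that build-once, reload-later flow, with a JSON file standing in for the real index storage (all names here are illustrative, not LlamaIndex API):

```python
import json
import os

def build_index(documents):
    # toy stand-in for VectorStoreIndex.from_documents; in the real case
    # this step calls the embedding model once per chunk, which is slow
    return {str(i): doc for i, doc in enumerate(documents)}

def load_or_build(documents, persist_dir):
    path = os.path.join(persist_dir, "index.json")
    if os.path.exists(path):                       # reload: no re-embedding cost
        with open(path, encoding="utf-8") as f:
            return json.load(f), "loaded"
    index = build_index(documents)                 # first run: build and persist
    os.makedirs(persist_dir, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f)
    return index, "built"
```

The same guard pattern (check the persist directory, load if present, otherwise build and persist) is the usual way to wire `load_index_from_storage` into a script that may run repeatedly.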

Sub-Question Queries (Decomposing Complex Questions)

```python
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.tools import QueryEngineTool

# Expose the index as a tool for multi-document querying
tools = [
    QueryEngineTool.from_defaults(
        query_engine=index.as_query_engine(),
        name="finance_kb",
        description="Financial knowledge base: bank annual reports, risk-control manuals, etc."
    )
]

sub_question_engine = SubQuestionQueryEngine.from_defaults(query_engine_tools=tools)
response = sub_question_engine.query(
    "Compare the 2024 non-performing loan ratios of China Merchants Bank and Ping An Bank. Which is better?"
)
```
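Under the hood, the engine asks the LLM to break the comparison into per-bank sub-questions, answers each one against the registered tool, then synthesizes a final answer. A rough stdlib sketch of that decompose/answer/synthesize loop (the NPL figures are invented placeholders, and the rule-based `decompose` stands in for the LLM's decomposition step):

```python
# Hypothetical figures used purely as placeholders, NOT real data.
FAKE_KB = {"China Merchants Bank": 0.95, "Ping An Bank": 1.06}

def decompose(entities):
    # the real engine prompts the LLM to generate these sub-questions
    return [f"What was {e}'s NPL ratio in 2024?" for e in entities]

def answer(sub_q):
    # stand-in for one query-engine call: look the entity up in the toy KB
    for bank, ratio in FAKE_KB.items():
        if bank in sub_q:
            return bank, ratio
    raise KeyError(sub_q)

def compare(entities):
    # answer every sub-question, then synthesize a comparison
    answers = dict(answer(q) for q in decompose(entities))
    winner = min(answers, key=answers.get)   # lower NPL ratio is better
    return answers, winner
```

Each `answer` call corresponds to one retrieval-augmented query against `finance_kb`; the final `min` stands in for the synthesis step the LLM performs over the collected sub-answers.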

Integrating with Chroma

```python
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb

chroma_client = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = chroma_client.get_or_create_collection("finance")

vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
```

Content on this site was compiled by 褚成志 and is intended for learning and reference only.