Large Language Models (LLMs) have proven to be remarkably powerful on a wide range of tasks. They do, however, have limitations when the input context becomes very large. Solutions such as Retrieval Augmented Generation (RAG) do a great job of providing context from custom data without retraining any models, but they too have limitations, especially when the relevant context is spread across many documents. Consider the question “Which projects has person X worked on?” The information required to answer it may be scattered over hundreds of documents, making it difficult for an LLM alone to answer. One way to overcome this issue is to use an LLM as an entity extraction tool: it can extract entities and relationships from documents and load that data into a structured format such as a knowledge graph. In this talk, I will demonstrate this process on a dataset of parliamentary debates, showing how downstream analytics becomes more intuitive and feasible.
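A minimal sketch of the pipeline described above, assuming the LLM is prompted to return (subject, relation, object) triples. The `extract_triples` function, the sample documents, and the `WORKED_ON` relation are all illustrative placeholders (the real extraction step would be an LLM call); the graph itself is a simple adjacency map so the sketch is self-contained.

```python
from collections import defaultdict

def extract_triples(document: str) -> list[tuple[str, str, str]]:
    # Placeholder for an LLM call that would be prompted to return
    # (subject, relation, object) triples found in the document.
    # Hard-coded here purely for illustration.
    canned = {
        "Person X led Project Alpha.": [
            ("Person X", "WORKED_ON", "Project Alpha"),
        ],
        "Project Beta was staffed by Person X.": [
            ("Person X", "WORKED_ON", "Project Beta"),
        ],
    }
    return canned.get(document, [])

documents = [
    "Person X led Project Alpha.",
    "Project Beta was staffed by Person X.",
]

# Load the extracted triples into a knowledge graph:
# each subject maps to a list of (relation, object) edges.
graph: dict[str, list[tuple[str, str]]] = defaultdict(list)
for doc in documents:
    for subject, relation, obj in extract_triples(doc):
        graph[subject].append((relation, obj))

# "Which projects has person X worked on?" is now a structured
# graph query instead of a search across hundreds of documents.
projects = [obj for rel, obj in graph["Person X"] if rel == "WORKED_ON"]
print(projects)
```

Once the triples are in a graph, questions whose evidence spans many documents reduce to simple traversals, which is the advantage over feeding raw documents to an LLM or a RAG pipeline.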


Affiliation: Xebia Data B.V.

Usman is a Machine Learning Engineer working for Xebia Data, with an interest in graph theory, low-level machine learning frameworks, and the bridge between research and real-world implementation.