Building a Philosophical Debate Agent with LangGraph

Mechatronics engineering student passionate about building intelligent systems that bridge the gap between hardware, software, and networking, with a stronger pull toward the software side. I specialize in robotics, embedded systems, IoT, and machine learning, with hands-on experience ranging from drone automation to deep learning for computer vision. I’m on the path to becoming a robotics software engineer. I also have a strong interest in philosophy and a love for chess, math, and logic.
Introduction
One of the best use cases of large language models is research, especially when the model has access to the internet and can query information from different sources. This particular process of querying information from other sources to better inform text generation is known as Retrieval Augmented Generation (RAG). Being able to aggregate information from various sources quickly is one of the best things LLMs can do. Although you might still have to cross check as LLMs can also hallucinate answers (hallucinate in this contexts just means giving wrong or made up information).
At its core, LLMs just complete text, but how is it able to act as a helpful assistant (chatGPT, gemini etc.)?
Chat systems wrap this prediction of the next token (text completion) inside structured conversational roles.
It all comes down to how the LLM is prompted. Your prompts don't reach ChatGPT directly. In fact, your prompt is just one part of the text input to the LLM, known as the HumanMessage. There's also the SystemMessage.
The System message basically describes how the LLM should respond, what it is allowed to say, what it isn't allowed to say, system messages guide the behavior of the LLM.
Human Message and System Message:
The system message is a description of how the model should respond while the human message is what the model is responding to. Let us see how this affects an LLMs response:
System Message: "You are a French translator, you help people to convert their language into French only. do not do anything else more than that". Human Message: "What is philosophy?". AI response: Qu’est‑ce que la philosophie ?
System Message: "You are a helpful teacher that answers all questions in 10 words or less.". Human Message:"What is philosophy?". AI response: Love of wisdom; inquiry into existence, knowledge, values, reason.
System Message: "You are a poet and you answer question poetically in one line of less than 10 words.". Human Message: "What is philosophy?". AI response: Dreaming of truth, we map the mind's uncharted seas.
If you want to try it yourself, check it out here: GroqCloud
You can specify both system and human message.
But you have to be as precise as possible when it comes to system messages; the more explicit the message is, the better.
Project Introduction
Now that we have that taken care of, it would be easy to understand how this project is possible. I am guessing you have already read the title. Knowing that we can configure LLMs to give the kind of output we desire, all that is left is just structuring the inputs and outputs in a way that simulates a philosophical debate.
The project starts by inputting a philosophical topic to the agent. Then an LLM generates an opposing topic and also generates two philosophers: one for the topic and one against it. The first philosopher gives their opening statement and asks a question at the end. The next philosopher answers this question and then presents their own question. This continues for 5 turns. Finally, the entire simulated debate is structured for the user to read.
Let us break that down into actionable steps.
Core Idea
The agent works as a multi-step reasoning system:
A user provides a philosophical concept (e.g., Utilitarianism).
The system generates an opposing philosophical position.
Two philosopher personas are created.
Each philosopher retrieves supporting knowledge.
They debate across several turns.
The dialogue is formatted into a readable philosophical exchange.
This can be achieved by creating functions that can perform each step. The framework used for this project is LangGraph.
Why LangGraph?
Traditional chains execute once and terminate.
Philosophical reasoning, however, requires:
memory of previous arguments
iterative exchanges
controlled looping
structured state updates
We can use LangGraph for our specific application by treating the debate and the whole process as a state machine that changes at each step. This is very convenient for us as each output depends on the last step's output. Each step can be translated into nodes of a Graph. Where each node modifies the shared state of the graph and passes it forward.
State Design (The Heart of the Agent)
The entire system revolves around a shared state object, this state object is what is propagated throughout the whole graph, it is updated at each node and carries the output from the last node to the next node.
class PhilosophyAgentState(TypedDict, total=False):
topic: str
philosopher_set: PhilosopherSet
enriched_philosophers: list
history: list
final_dialogue: str
turn_count: int
This state acts like a blackboard where every node writes new information.
Key ideas:
topic: user input
philosopher_set : generated debate participants, these are the profiles of the philosophers engaging in debate. If you look closely you would notice that this particular value is a custom object. It would be explained more in the next step.
enriched_philosophers: philosophers + retrieved knowledge
history: accumulated debate question and responses.
turn_count: controls how many times each philosopher gets to speak.
final_dialogue: human-readable output
The graph is essentially a controlled transformation of this object.
Let us get into the actual nodes of the graph.
Step 1 Philosopher Generation Node
The first node creates debate participants.
The LLM is prompted to:
identify the strongest opposing philosophy
create historically grounded philosopher profiles
define their goals, claims, and argumentative styles
Each philosopher is structured using Pydantic:
class PhilosopherProfile(BaseModel):
name: str
school: str
stance: str
core_claims: List[str]
argumentative_style: str
primary_goal: str
This is important because the agent is not generating text, it is generating roles.
Now we need a prompt that would tell the LLM to perform this generation:
prompt = ChatPromptTemplate.from_template("""
You are a philosophy professor.
Given a philosophical concept, do the following:
1. Identify its strongest opposing philosophical position.
2. Create two philosopher profiles:
- One defending the original concept
- One defending the opposing concept
Use historical realism when possible.
Concept: {topic}
IMPORTANT: Respond ONLY with a valid JSON object. Do NOT use any tools. The JSON must have this exact structure:
{{
"topic": "...",
"opposing_topic": "...",
"philosophers": [
{{"name": "...", "school": "...", "stance": "...", "core_claims": [...], "argumentative_style": "...", "primary_goal": "..."}},
{{"name": "...", "school": "...", "stance": "...", "core_claims": [...], "argumentative_style": "...", "primary_goal": "..."}}
]
}}
""")
Yes it is important the prompt is this long and specific. The more explicit the better.
The above prompt is to ensure that given the input topic, the LLM provides a structured output according to the previously defined philosopher profile class for two opposing views.
Below is the actual code to generate the philosopher profiles:
def create_philosophers(state: PhilosophyAgentState) -> PhilosophyAgentState:
topic = state.get("topic", "Free Will")
# Use plain invoke to avoid tool-calling
resp = llm.invoke(prompt.format(topic=topic))
text = getattr(resp, 'content', None) or str(resp)
# Parse JSON from response
try:
parsed = json.loads(text)
except Exception:
# Try to extract JSON object from text
match = re.search(r'\{.*\}', text, re.DOTALL)
if not match:
raise ValueError(f"Could not extract JSON from response: {text[:200]}")
parsed = json.loads(match.group(0))
# Validate and create PhilosopherSet
philosopher_set = PhilosopherSet.parse_obj(parsed)
return {**state, "philosopher_set": philosopher_set, "turn_count": 0}
This is the node that creates the philosopher profiles. Yes it is basically a function and the function takes in the graph state as input and also returns an updated graph state that contains the philosopher profiles.
The step is important as it gives a sort of personality to the debate, by giving the philosophers names and views, we are able to create a system message for the LLM that adheres to the views and also the personality of the philosopher.
Step 2 Knowledge Retrieval
After philosophers are created, each one gathers evidence. This step involves searching the internet for information on the particular topic. This step ensures that the arguments made by each philosopher is backed by actual information gotten from the internet and reduces the risk of hallucinations from the LLM.
The retrieval node:
tavily_client = TavilyClient(api_key=TAVILY_API_KEY)
def retrieve_philosophy_knowledge(philosopher):
query = f"{philosopher.school} philosophy arguments criticisms"
try:
docs = tavily_client.search(query, max_results=5)
except Exception as e:
docs = []
return {
"philosopher": philosopher,
"sources": docs
}
def retrieval_map(state):
philosophers = state["philosopher_set"].philosophers
enriched = [
retrieve_philosophy_knowledge(p) for p in philosophers
]
return {**state, "enriched_philosophers": enriched, "history": []}
The first function queries Tavily using the philosopher’s school of thought. Tavily is just an API that is really good for web search, (in the actual project, the Wikipedia API is also used, but I want to keep this article simple). The documents retrieved from tavily are then used to 'enrich' the philosophers with knowledge on the subject matter. This serves as context that the LLM would use to generate responses for the debate.
The actual node is the second function which basically runs the first function for all the philosophers and then returns an "enriched" (or enlightened *wink wink*) philosopher that would be able to debate circles around his opponent.
The result is an enriched structure:
{
"philosopher": philosopher,
"sources": docs
}
We have established the philosophers and their pot of knowledge, now how does the debate actually happen?
Step 3 Debate Turn Node (The Engine)
This is the most important component.
Each philosopher:
Reads the previous question.
Defends their position.
Ends with a philosophical question.
The model must return strict JSON that adheres to the fornat:
class DebateTurn(BaseModel):
speaker: str
argument: str
question: str
Every response becomes a DebateTurn.
This class is how each response and question from each speaker is stored. Now, how do we populate this class?
Let us first look at the prompt that guides the output of the LLM at each turn:
prompt_template = ChatPromptTemplate.from_template("""
You are {name}, a philosopher from the {school} tradition.
Your task:
- Respond directly to the previous philosopher's question or argument
- Defend your philosophical position
- Challenge the opponent's reasoning
- End with a probing philosophical question for them
Opponent's last argument:
{last_argument}
Your sources:
{sources}
IMPORTANT: Respond ONLY with a valid JSON object. Do NOT use any tools. The JSON must have this exact structure:
{{
"speaker": "{name}",
"argument": "...",
"question": "..."
}}
""")
Note: We have previously talked about human and system messages. We have been using a combination of the two for all the LLM prompts in this project. But they can also be used individually as demonstrated earlier.
Now for the actual node that populates the DebateTurn Object:
def debate_turn(state):
history = state.get("history", [])
philosophers = state["enriched_philosophers"]
turn_count = state.get("turn_count", 0)
turns = []
# Each philosopher responds in sequence to the previous one's question
for i, p in enumerate(philosophers):
# Determine what the previous philosopher said
if history:
last_turn = history[-1]
last_argument = last_turn.question if hasattr(last_turn, 'question') else last_turn.get('question', '')
else:
last_argument = f"Begin the debate on {state.get('topic', 'the topic')}"
# Extract sources - Tavily returns a dict with 'results' key
sources_list = p["sources"]
if isinstance(sources_list, dict) and "results" in sources_list:
sources_list = sources_list["results"]
# Format sources safely
if isinstance(sources_list, list):
sources_text = "\n".join([f"- {s.get('title', s.get('query', 'Source'))}" for s in sources_list[:3]])
else:
sources_text = str(sources_list)[:500]
## invoking the LLM with the previously written prompt
resp = llm.invoke(
prompt_template.format(
name=p["philosopher"].name,
school=p["philosopher"].school,
last_argument=last_argument,
sources=sources_text
)
)
text = getattr(resp, 'content', None) or str(resp)
# Parse JSON response for safety
try:
parsed = json.loads(text)
except Exception:
match = re.search(r'\{.*\}', text, re.DOTALL)
if not match:
parsed = {"speaker": p["philosopher"].name, "argument": text[:500], "question": "What do you think?"}
else:
parsed = json.loads(match.group(0))
turn = DebateTurn.parse_obj(parsed)
turns.append(turn)
return {**state, "history": history + turns, "turn_count": turn_count + 1}
This Node is in charge of calling the LLM with different philosopher profiles to respond to the previous response or just start the debate. Then formats each response as a DebateTurn object. This approach simulates a dialogue, where each response is dependent on the last one.
These are the steps that are followed:
Reads current debate state
Gets past conversation (
history) if there is any.Gets philosophers + their retrieved knowledge.
Gets current turn number.
Runs one debate round
- Each philosopher speaks once per round
Finds what to respond to
If debate already started then reply to last question
Otherwise, start discussion using the topic
Prepares research sources
Extracts Tavily results
Formats a few sources as context for the LLM
Generates the philosopher’s response
Sends philosopher identity + opponent argument + sources to the LLM
This ensures that the LLM gives an output that is inline with the argument and the personality of the philosopher.
Forces structured output
Tries to parse JSON response
Uses fallback if the model output is messy
Validates the response
- Converts output into a
DebateTurnobject (speaker, argument, question)
- Converts output into a
Updates memory
Adds new turns to debate history
Increments turn counter
The next challenge is stopping the debate. How can we tell the node when to stop producing debate responses?
All we have to do is add a condition.
Step 4 Controlled Reasoning Loop
LangGraph allows conditional edges:
def should_continue(state):
if state["turn_count"] < 5:
return "debate"
return "format"
This edge decides whether or not the debate should be continued based on the turn count, ensuring that only the specified number of turns is allowed.
Note: In LangGraph, edges are the connections between nodes, we can have a normal edge that links one node to another, and there could also be a conditional edge that only runs the connected node if a particular condition is met.
After getting all the responses and question from the philosophers, we need to format the debate so that it is readable by the user. This brings us to the last node of the graph.
Step 5 Dialogue Formatting
Finally, debate turns are converted into a readable conversation:
## node to format the final dialogue
def format_dialogue(state):
dialogue = []
for turn in state["history"]:
# Handle both Pydantic objects and dicts
speaker = turn.speaker if hasattr(turn, 'speaker') else turn.get('speaker', 'Unknown')
argument = turn.argument if hasattr(turn, 'argument') else turn.get('argument', '')
question = turn.question if hasattr(turn, 'question') else turn.get('question', '')
dialogue.append(
f"{speaker}:\n{argument}\n\nQuestion:\n{question}\n"
)
return {
**state,
"final_dialogue": "\n---\n".join(dialogue)
}
This formats the dialogue like this:
Philosopher A:
Argument...
Question:
...
---
Philosopher B:
...
Now that we have all our Graph state, nodes and some edges we can bring all these building blocks together to get our graph.
The Final Graph
Conceptually, the pipeline looks like this:
User Topic
↓
Create Philosophers
↓
Retrieve Knowledge
↓
Debate Loop (x5)
↓
Format Dialogue
↓
Final Philosophical Conversation
This is how the whole thing is put together:
from langgraph.graph import StateGraph, END
## Create the final graph
final_graph = StateGraph(PhilosophyAgentState)
final_graph.add_node("philosophers", create_philosophers)
final_graph.add_node("retrieve", retrieval_map)
final_graph.add_node("debate", debate_turn)
final_graph.add_node("format", format_dialogue)
final_graph.set_entry_point("philosophers")
final_graph.add_edge("philosophers", "retrieve")
final_graph.add_edge("retrieve", "debate")
final_graph.add_conditional_edges("debate", should_continue)
final_graph.add_edge("format", END)
philosophy_agent = final_graph.compile()
Then you can invoke the graph with:
from IPython.display import display, Markdown
result = philosophy_agent.invoke({
"topic": "Utilitarianism"
})
# print(result["final_dialogue"])
output = result["final_dialogue"]
display(Markdown(output))
Output for "Utilitarianism"
John Stuart Mill: The debate on utilitarianism must begin with a clear recognition that the moral law is the principle of the greatest happiness for the greatest number. Act utilitarianism, which evaluates each act solely on its immediate consequences, is vulnerable to the tyranny of the majority and to the neglect of moral duties that are essential for a just society. Rule utilitarianism, on the other hand, adopts rules that, if generally followed, tend to produce the greatest happiness. This approach preserves individual rights and promotes social stability while still aiming for the greatest overall well‑being. Critics often argue that utilitarianism reduces moral life to a mere calculation of pleasure and pain, but this criticism overlooks the qualitative distinctions between higher and lower pleasures that I have emphasized. Moreover, utilitarianism is not a blind pursuit of pleasure; it requires a careful assessment of the consequences of our actions, including the long‑term effects on society. Therefore, utilitarianism remains the most rational and humane ethical system, as it aligns moral deliberation with the ultimate aim of human flourishing.
Question: If utilitarianism prioritizes the greatest happiness, how can it justify protecting the rights of a minority when doing so might reduce overall happiness?
Immanuel Kant: I appreciate the invitation to discuss Utilitarianism, yet I must first point out that its foundational premise—maximizing overall happiness—fails to respect the intrinsic worth of each individual. Utilitarianism treats persons as instruments whose value is measured by the pleasure or utility they can produce, thereby violating the principle that humanity must always be treated as an end in itself. The categorical imperative demands that we act only according to maxims that can be universalized without contradiction; a maxim that allows one to sacrifice an innocent person for the greater good cannot be universalized, for it would erode the very moral law that protects all. Moreover, utilitarian calculations are inherently contingent on subjective valuations of pleasure and pain, which cannot be grounded in a rational, objective moral framework. Thus, while Utilitarianism may offer a seemingly pragmatic solution, it ultimately collapses under the weight of its own moral inadequacy. I challenge you to explain how a system that permits individuals to be treated merely as means can be justified as a legitimate moral theory.
Question: How can you reconcile the imperative to treat each person as an end in themselves with a moral calculus that permits their instrumental use for the sake of aggregate happiness?
Key Design Insight
The most important realization from this project is:
Agents become powerful when identity, memory, and iteration are separated.
Identity: philosopher profiles
Memory: shared state history
Iteration: graph loops
LangGraph provides the orchestration layer that lets these interact coherently.
The actual project is way more complex than this, this article just explains the contents of the notebook i used to experiment the functionality before actually building the full thing here.
What This Demonstrates
This notebook shows that LLM agents can move beyond assistants into structured thinkers:
opposing viewpoints emerge automatically
arguments evolve over time
reasoning becomes dialogical rather than declarative
In many ways, this mirrors how philosophy itself progresses through questioning, just how Socrates liked it.
References
Langgraph Introductory course: https://academy.langchain.com/courses/intro-to-langgraph
My Philosophical debate simulator project: https://github.com/Badaszz/Philosophical-debate-simulator
The notebook used as a playground for testing the project logic: Philosophical-debate-simulator/notebook.ipynb at main · Badaszz/Philosophical-debate-simulator
Note: For the notebook you would need to get a groq api key (or any other langChain LLM provider) from groq.com





