How each agent works

This page explains — in plain language — exactly what every AI agent in the pipeline does, what data it reads, and how to interpret the numbers it produces. Reading this once turns the approval screen from a wall of text into something you can scan in thirty seconds.

What changed (M3 "Agent Depth"). The agents no longer reason over raw ticks and then invent indicator values. Stage 1 now computes a deterministic feature bundle — real RSI, MACD, moving averages, ATR, support/resistance, news features, a price forecast — and hands it to the agents as evidence. The agents reason over given numbers and are told not to restate a value they weren't handed. Where an agent has no real data, it now abstains instead of guessing. The sections below describe this current behaviour.


The feature bundle (computed before any agent runs)

Before the analysts wake up, stage 1 attaches a FeatureBundle to the market snapshot. This is pure deterministic code — no AI — and it is the single source of the numbers the agents are allowed to cite:

  • Technical features: RSI(14), MACD(12/26/9), SMA 20/50/200, EMA20, ATR(14), Bollinger(20,2), swing high/low, support/resistance and distance to each, gap %, and volume-vs-average.
  • News features: each headline tagged by event class (earnings / guidance / rating / regulatory / macro), recency-weighted, de-duplicated, and given a per-item sentiment score.
  • Sentiment features: news-driven sentiment, plus a flag for whether a real positioning feed (options / FII-DII) is wired (it is not yet, so the flag is off).
  • Fundamentals: P/E, P/B, EV/EBITDA, growth, margins when a source is wired (none is yet, so this is marked unavailable).
  • Price forecast: a baseline expected move with a band over the horizon (see the bottom of this page). Evidence only; it never touches risk or sizing.

A composition-root wrapper guarantees the bundle is always present: if any feed forgets to attach one, a fallback bundle is computed from the recent closes and news, so the evidence layer can never silently switch off.
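
The guarantee above can be pictured as a thin wrapper around whatever feed produces the snapshot. The following is a minimal sketch, not the project's code: the names Snapshot, FeatureBundle, fallback_bundle and with_guaranteed_bundle are hypothetical, and the fallback computes only a 20-session average to keep the example short.

```python
from dataclasses import dataclass, replace
from statistics import fmean
from typing import Callable, Optional


@dataclass(frozen=True)
class FeatureBundle:
    # Hypothetical, heavily trimmed stand-in for the real bundle.
    sma20: Optional[float]
    headline_count: int
    is_fallback: bool = False


@dataclass(frozen=True)
class Snapshot:
    symbol: str
    closes: tuple[float, ...]
    headlines: tuple[str, ...] = ()
    features: Optional[FeatureBundle] = None


def fallback_bundle(snap: Snapshot) -> FeatureBundle:
    """Recompute a minimal bundle from the recent closes and news."""
    window = snap.closes[-20:]
    sma20 = fmean(window) if len(window) == 20 else None
    return FeatureBundle(sma20=sma20, headline_count=len(snap.headlines), is_fallback=True)


def with_guaranteed_bundle(feed: Callable[[str], Snapshot]) -> Callable[[str], Snapshot]:
    """Wrap any feed so the snapshot it returns always carries a bundle."""
    def wrapped(symbol: str) -> Snapshot:
        snap = feed(symbol)
        if snap.features is None:  # the feed forgot to attach one
            snap = replace(snap, features=fallback_bundle(snap))
        return snap
    return wrapped
```

Because the wrapper sits at the composition root, individual feeds do not need to know about the fallback at all.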


The agent landscape

Nine AI agents run across four pipeline stages: four analysts, three debate agents, the trader, and the reviewer. All of them share the same underlying design: they receive a structured prompt built from the feature bundle and live data, call the configured LLM (or the offline mock), and return a Pydantic-validated object whose fields feed the next stage.

Stage | Agents | Runs
2 | News · Sentiment · Technical · Fundamental | In parallel
5 | Bull Researcher · Bear Researcher → Research Manager | Bull/bear, then a rebuttal round, then the manager
6 | Trader | After debate
12 | Reviewer | After position closes

Stages 7, 9, and 11 contain no agents — they are deterministic code only.

Anti-hallucination ticker guard. Every agent that emits a symbol is checked, centrally, against the symbol it was asked about. If a model ever returns a note or thesis for a different instrument, the stage fails closed — it does not slip through.
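
A central check of this kind can be very small. The sketch below is illustrative only: TickerMismatchError and assert_same_ticker are hypothetical names, and the case-insensitive, whitespace-trimmed comparison is an assumption about how symbols are normalised.

```python
class TickerMismatchError(ValueError):
    """Raised when an agent answers about a different instrument."""


def _normalise(symbol: str) -> str:
    # Assumption: symbols compare case-insensitively after trimming.
    return symbol.strip().upper()


def assert_same_ticker(requested: str, returned: str) -> None:
    """Fail closed: any mismatch aborts the stage instead of passing through."""
    if _normalise(requested) != _normalise(returned):
        raise TickerMismatchError(
            f"agent was asked about {requested!r} but answered about {returned!r}"
        )
```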


Stage 2 — The four analyst agents

All four run in parallel over the same snapshot and its feature bundle. They cannot see each other's notes. Each returns an AnalystNote.

What every analyst note contains

Field | Range | Meaning
stance | −1.00 to +1.00 | Directional lean. Positive = bullish, negative = bearish, near zero = neutral
confidence | 0.00 to 1.00 | How sure the agent is of its own stance
summary | text | One-sentence conclusion
key_points | list | The supporting bullets
subscores | map | Per-factor scores (e.g. momentum, trend) that fed the stance
evidence | list | The exact feature values or headlines the agent cited
expectation_gap | number / blank | How far reality sits from what the agent expected
time_horizon | text | The horizon the note is reasoning over

Quorum rule. At least 3 of the 4 analysts must succeed (return a note) for the pipeline to continue. If only 2 or fewer succeed, the run is marked DEGRADED and halts before the debate stage. An abstaining note still counts as a successful note for quorum — see below.
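
The quorum rule reduces to a count. A minimal sketch follows, assuming a failed analyst is represented as None and a note as a plain dict (both are choices made for this example, not the pipeline's actual types):

```python
from typing import Optional, Sequence

REQUIRED_NOTES = 3  # from the quorum rule: at least 3 of the 4 analysts


def quorum_status(notes: Sequence[Optional[dict]]) -> str:
    """Return "OK" when enough analysts returned a note, else "DEGRADED".

    An abstaining note is still a note, so it counts toward the quorum.
    """
    succeeded = sum(1 for note in notes if note is not None)
    return "OK" if succeeded >= REQUIRED_NOTES else "DEGRADED"
```

With two real notes, one abstention, and one failure the run continues; with only two notes of any kind it halts before the debate stage.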

Self-critique pass. If an analyst returns a note below a confidence floor (0.40), it runs exactly one self-review pass ("what would change your stance, and is the low confidence justified?") before emitting. This is cheap and only fires when the model is genuinely unsure. Offline, the deterministic mock returns its calibrated note directly and skips the extra pass.


News Analyst

What it does. Reads the event-tagged news features for the symbol and rates how bullish or bearish the news is, by event class.

Inputs the agent sees:

  • Last traded price and previous close
  • The computed news features: each headline's event type (earnings/guidance/rating/regulatory/macro), recency weight, and per-item sentiment score, plus the net recency-weighted sentiment and how many items were unique after de-duplication
  • The recent raw headlines (still shown, inside the fence below)
  • Up to 5 macro indicators

Security note. The news text is third-party data and could contain adversarial content. Before being placed in the prompt, every headline is wrapped in unforgeable fence markers (<UNTRUSTED_FEED_DATA>…</UNTRUSTED_FEED_DATA>) and the system prompt tells the model to treat that block as data only, not as instructions. The same neutralisation is applied to headlines echoed inside the news-features block, so the fence cannot be bypassed through the features path.
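
One way to make such a fence hard to forge is to strip any copy of the marker from the untrusted text before wrapping it. The sketch below illustrates that idea only; the function names and the replacement token are hypothetical, and the real neutralisation may work differently.

```python
import re

OPEN = "<UNTRUSTED_FEED_DATA>"
CLOSE = "</UNTRUSTED_FEED_DATA>"

# Matches opening or closing markers, tolerating case and inner spaces.
_MARKER = re.compile(r"</?\s*UNTRUSTED_FEED_DATA\s*>", re.IGNORECASE)


def neutralise(text: str) -> str:
    """Remove any fence marker an attacker embedded in a headline."""
    return _MARKER.sub("[removed-marker]", text)


def fence(headlines: list[str]) -> str:
    """Wrap neutralised headlines in exactly one pair of fence markers."""
    body = "\n".join(f"- {neutralise(h)}" for h in headlines)
    return f"{OPEN}\n{body}\n{CLOSE}"
```

A headline that tries to close the fence early ends up as inert text inside it, so the block the model sees still has a single opening and a single closing marker.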

How it scores:

  • Rates directional impact per event class and cites the specific headline or feature behind each point; it is told not to invent figures it wasn't given.
  • A surprise dividend or contract win typically produces a stance near +0.6 to +0.9; a fraud allegation or profit warning near −0.6 to −0.9.
  • confidence is typically lower when headlines are sparse or ambiguous.

Sentiment Analyst

What it does. Estimates the "temperature" of investor positioning around the stock — leaning on the news-driven sentiment it is actually given.

Inputs the agent sees:

  • Last traded price and previous close
  • The computed sentiment features: news-driven sentiment and a flag for whether a real positioning feed (options open interest, FII/DII flows) is wired
  • Up to 5 macro indicators

It abstains when there is nothing real to use. In the current build there is no options / FII-DII positioning feed wired. When there is also no news to derive sentiment from, the agent returns a deterministic abstention note — no LLM call at all — with stance 0, confidence 0.15, and model_used = "deterministic-abstain". When news is present, it leans on that news sentiment and is told not to fabricate flows it wasn't given — it lowers its confidence instead.
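
The abstention path can be sketched as a guard that runs before any model call. The stance, confidence, and model_used values come from the text above; the AnalystNote shape is trimmed and the function names and the call_llm callback are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AnalystNote:
    # Trimmed to the fields this sketch needs.
    stance: float
    confidence: float
    summary: str
    model_used: str


def abstention_note(reason: str) -> AnalystNote:
    """Deterministic note: no stance, low confidence, no model involved."""
    return AnalystNote(0.0, 0.15, f"abstaining: {reason}", "deterministic-abstain")


def sentiment_note(
    positioning_wired: bool,
    news_items: int,
    call_llm: Callable[[], AnalystNote],
) -> AnalystNote:
    """Abstain without calling the LLM when there is nothing real to use."""
    if not positioning_wired and news_items == 0:
        return abstention_note("no positioning feed and no news")
    return call_llm()
```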

Stance interpretation (when it does take a side):

  • Positive: bullish news-driven tone.
  • Negative: bearish news-driven tone.
  • Near 0 / low confidence: thin or mixed signals, or abstaining on absent data.

Technical Analyst

What it does. Reads the computed indicators and the price forecast to decide whether the chart is set up for a move up or down.

Inputs the agent sees:

  • Last traded price, previous close, and last-tick volume
  • The full computed technical block: RSI(14), MACD(12/26/9) with signal and histogram, SMA 20/50/200, EMA20, ATR(14), Bollinger bands, support/resistance and distance to each, gap %, and volume-vs-average
  • The baseline price forecast (expected move + band) as evidence only

This is the big M3 change. The technical analyst used to be handed only "the 5 most recent tick prices" and then asked to describe RSI and MACD — which it had to invent. It now reads the real computed indicators and is explicitly told to use those values and not to restate a different number. The hallucinated-RSI problem is gone.

How it reasons:

  • Reads the trend (SMA/EMA stack), momentum (RSI/MACD), volatility (ATR/Bollinger), and proximity to support/resistance into a directional stance, citing the indicator behind each point.
  • A bullish MACD crossover + price above SMA200 + RSI in the 50–65 range would produce a stance around +0.5 to +0.7; a breakdown below support with bearish MACD produces stances near −0.5 to −0.8.

Confidence tends to be high when multiple indicators agree, and low when they contradict each other (e.g. bullish trend but RSI diverging).


Fundamental Analyst

What it does. Evaluates the business quality and valuation of the company — when it actually has the ratios to do so.

Inputs the agent sees:

  • Last traded price and previous close
  • The computed fundamentals block: P/E, P/B, EV/EBITDA, revenue growth, and net margin, when a source is wired
  • Up to 5 macro indicators

It abstains when no source is wired. No NSE/BSE fundamentals source is wired in the current build, so this agent returns the deterministic abstention note (stance 0, confidence 0.15, model_used = "deterministic-abstain") rather than inventing multiples. When a source is present, it evaluates valuation, growth, and margin quality from the reported ratios and cites the figure behind each point — it is told not to fabricate multiples.

Why this still matters for your review. Even abstaining is information: a visible "abstaining — no fundamentals source wired" note tells you the trade is resting on technical and news evidence only, not on a view of the underlying business. The quorum is still reachable on the other three analysts.


Stage 5 — The debate layer (3 agents)

After the four analyst notes are collected, the debate stage takes over. In M3 it runs as two passes plus a synthesis: bull and bear each build their case, then each gets one bounded rebuttal round answering the other, and only then does the manager judge.

Why a debate (and why a rebuttal)?

One analyst panel can reach a consensus that is wrong. The debate forces the system to articulate the strongest opposing argument before committing. The rebuttal round then makes each side answer the other's best points rather than talk past them, so the manager judges the cases after they have been tested.


Bull Researcher

What it does. Builds the strongest case for buying — then rebuts the bear.

Pass 1 (build) inputs:

  • Last price and previous close
  • Only the bullish analyst notes (stance > 0), with stance and confidence
  • All key points from the full panel (up to 2 per analyst)

Pass 2 (rebuttal): sees its own initial case and the bear's case, and returns a sharpened BullCase whose supporting points directly address the bear.

What it produces — BullCase: argument (the case for LONG), supporting_points, and risks it acknowledges even as a bull.

If no analyst is explicitly bullish, the agent argues from the available data anyway, so there is always a debate.


Bear Researcher

Mirrors the bull exactly, for the short side: a build pass over the bearish notes, then a rebuttal pass answering the bull. Produces a BearCase with argument, supporting_points, and acknowledged upside risks (earnings surprise, short-squeeze, sector re-rating).

The rebuttal round runs once and is bounded; if a rebuttal call fails, the system safely falls back to that side's initial case.


Research Manager

What it does. Acts as a neutral judge. Reads the full panel and both rebutted cases, declares a winner, and assigns a conviction score.

Inputs the agent sees:

  • Last price and previous close
  • All analyst notes with stance numbers
  • The post-rebuttal bull and bear arguments and supporting points

What it produces — DebateResult:

Field | Range / type | Meaning
winner | LONG or SHORT | Which direction the debate favoured
conviction | 0.00 to 1.00 | How decisive the verdict was (after calibration)
manager_rationale | text | Explicit reasoning for why one side won
key_disagreements | list | Where the panel / the two sides genuinely conflict
falsifiers | list | What evidence would flip the winner
rebuttals | list | The rebutted cases the verdict was judged on

How conviction is scored — and then calibrated. The model proposes a conviction, but code then deterministically calibrates it down when the analyst panel diverges from the chosen winner. The detail that matters:

  • If no analyst opposes the winner, conviction is left unchanged.
  • The denominator counts only analysts that actually took a side; abstaining or neutral notes do not dilute the disagreement. So one analyst opposing the winner with three abstaining reads as full opposition, not a 25% minority.
  • Full opposition applies up to a 60% haircut on the model's proposed conviction.

This means a split panel can no longer produce a confident-looking number — the conviction you see has already been knocked down to reflect real disagreement.
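
The calibration rules above can be sketched in a few lines. Two details here are assumptions rather than documented behaviour: the haircut is scaled linearly with the opposing fraction, and a stance within ±0.05 of zero is treated as neutral. The function name is hypothetical.

```python
def calibrate_conviction(
    proposed: float,
    winner: str,
    stances: list[float],
    max_haircut: float = 0.60,   # full opposition removes up to 60%
    neutral_band: float = 0.05,  # assumed cut-off for "took a side"
) -> float:
    """Scale the model's conviction down by how much the panel opposes it."""
    sign = 1.0 if winner == "LONG" else -1.0
    # Only analysts that took a side enter the denominator.
    sided = [s for s in stances if abs(s) > neutral_band]
    opposing = [s for s in sided if s * sign < 0]
    if not opposing:
        return proposed  # nobody opposes the winner: leave unchanged
    opposition = len(opposing) / len(sided)
    return proposed * (1.0 - max_haircut * opposition)
```

For example, a proposed conviction of 0.80 for LONG with one bearish analyst and three abstentions comes out at 0.32, while the same single bear against two bulls comes out at 0.64.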

Reading the approval screen. The winner pill shows direction and conviction. The manager rationale tells you why this side won; the falsifiers tell you what to watch for that would prove the trade wrong.


Stage 6 — Trader agent

What it does. Synthesises everything into a concrete trade proposal — but the prices are now derived deterministically, not invented by the model.

Inputs the agent sees:

  • Last price and previous close
  • Debate result: winner, conviction, and the manager rationale
  • The deterministic price anchors (current price, ATR, support/resistance)
  • All four analyst notes with stance and confidence

What it produces — TradeThesis:

Field | Meaning
direction | LONG or SHORT (follows the debate winner)
conviction | 0–1
entry | Entry price in ₹
target | Take-profit price in ₹
stop | Stop-loss price in ₹
horizon_sessions | Expected holding period in trading sessions
rationale | The trader's reasoning
invalidation_conditions | What would invalidate the thesis
key_risks | The main risks to the trade
expected_horizon | A human-readable horizon note

How entry / target / stop are set (the M3 change). The LLM owns direction, rationale, and risks; the code owns the prices. After the model proposes a thesis, the system overwrites the levels from deterministic anchors:

  • Entry = the current price.
  • Stop ≈ 2 × ATR(14) from entry (on the correct side for the direction).
  • Target ≈ a 2R multiple — twice the entry-to-stop distance.

The old guide said the trader "infers reasonable levels" and was warned not to "invent arbitrary numbers." That framing is now obsolete — the trader is no longer trusted to pick prices at all. If the ATR is so small that 2×ATR rounds the stop back onto entry (a sub-tick ATR), the system keeps the model's own prudent prices rather than reject a sound thesis.
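
The anchoring rules and the sub-tick fallback can be sketched as follows. This is an illustration under stated assumptions: the ₹0.05 tick size, the rounding helper, and the function names are hypothetical, and returning None stands in for "keep the model's own prices".

```python
from typing import Optional

TICK = 0.05  # assumed tick size in ₹ (hypothetical for this sketch)


def round_to_tick(price: float, tick: float = TICK) -> float:
    """Snap a price to the nearest tick, then tidy float noise."""
    return round(round(price / tick) * tick, 2)


def anchor_levels(
    direction: str,
    price: float,
    atr: float,
    atr_mult: float = 2.0,    # stop at about 2 x ATR(14)
    r_multiple: float = 2.0,  # target at about 2R
) -> Optional[dict]:
    """Derive entry, stop, and target from deterministic anchors."""
    sign = 1.0 if direction == "LONG" else -1.0
    entry = round_to_tick(price)
    stop = round_to_tick(entry - sign * atr_mult * atr)
    if stop == entry:
        return None  # sub-tick ATR: the caller keeps the model's prices
    target = round_to_tick(entry + sign * r_multiple * abs(entry - stop))
    return {"entry": entry, "stop": stop, "target": target}
```

With a price of ₹2,500 and an ATR of ₹20, a LONG gets a stop at ₹2,460 and a target at ₹2,580; a SHORT mirrors these at ₹2,540 and ₹2,420.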

A thesis validator runs after anchoring and fails the stage closed if the geometry is wrong: a stop equal to entry, a target on the wrong side of entry, or a stop further than 4×ATR away are all rejected before the trade can reach you. (This is a thesis-side check; the deterministic risk engine in stage 7 is separate and unchanged.)
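
The three rejections named above can be sketched as a single fail-closed check. The exception and function names are hypothetical, and only the three documented conditions are tested; the real validator may check more.

```python
class ThesisGeometryError(ValueError):
    """Raised when a thesis has impossible or imprudent price geometry."""


def validate_geometry(
    direction: str,
    entry: float,
    target: float,
    stop: float,
    atr: float,
    max_stop_atr: float = 4.0,  # a stop beyond 4 x ATR is rejected
) -> None:
    """Raise if any geometry rule is broken; return None when all pass."""
    sign = 1.0 if direction == "LONG" else -1.0
    problems = []
    if stop == entry:
        problems.append("stop equals entry")
    if (target - entry) * sign <= 0:
        problems.append("target is on the wrong side of entry")
    if abs(entry - stop) > max_stop_atr * atr:
        problems.append("stop is more than 4x ATR from entry")
    if problems:
        raise ThesisGeometryError("; ".join(problems))
```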

Conviction-based model escalation. When the manager's conviction is high (≥ 0.75), the trader escalates to the Opus model tier (the manager's tier) for that one call — the bigger the decision, the more the deeper model is worth. Normal-conviction runs stay on the default (Sonnet) tier.
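
In code, the escalation rule is a single threshold comparison. The tier labels returned here are placeholder strings chosen for the example, not real model identifiers.

```python
ESCALATION_THRESHOLD = 0.75  # from the rule above: conviction >= 0.75 escalates


def trader_model_tier(manager_conviction: float) -> str:
    """Pick the model tier for the trader's single call."""
    return "opus" if manager_conviction >= ESCALATION_THRESHOLD else "sonnet"
```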

Important: the trader does not size the position. Quantity is computed by the deterministic risk engine in stage 7 from the stop distance and capital. On the numeric side, the thesis carries only the three code-anchored prices and the horizon, never a quantity.


Stage 7 — Risk engine (not an agent)

This stage is included here for completeness because the approval screen shows its output alongside the agent outputs.

It is pure deterministic code — no LLM. It takes the trader's thesis and runs its checks:

Check | What it tests
degenerate_thesis | Target and stop must be on the correct sides of entry
size_nonzero | Computed share count must be at least 1
daily_loss_cap | Risk amount ≤ configured daily loss cap
margin_sufficient | Position notional ≤ available capital
max_notional_pct | Single-trade notional ≤ max % of capital
max_positions | Open positions count < configured maximum
exposure_cap | Portfolio gross exposure after this trade ≤ cap

Sizing formula:

quantity = floor( (capital × risk_per_trade_pct / 100) / stop_distance )

Where stop_distance = |entry − stop|. The formula ensures that if the position exits at the stop price, you lose at most risk_per_trade_pct percent of capital: the floor can only round the share count, and therefore the loss, down.
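
A worked example makes the formula concrete. With ₹1,000,000 of capital, 1% risk per trade, an entry at ₹2,500 and a stop at ₹2,460, the risk budget is ₹10,000 and the stop distance is ₹40, so the quantity is 250 shares. The sketch below reproduces that arithmetic; the function name and the zero-distance guard are additions for the example.

```python
import math


def position_size(capital: float, risk_per_trade_pct: float, entry: float, stop: float) -> int:
    """quantity = floor((capital * risk_per_trade_pct / 100) / stop_distance)"""
    stop_distance = abs(entry - stop)
    if stop_distance == 0:
        raise ValueError("stop equals entry; size is undefined")
    risk_budget = capital * risk_per_trade_pct / 100
    return math.floor(risk_budget / stop_distance)


qty = position_size(1_000_000, 1.0, entry=2500.0, stop=2460.0)
loss_if_stopped = qty * abs(2500.0 - 2460.0)
```

When the division is not exact, the floor rounds the share count down, so the loss at the stop price stays within the budget.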

The risk decision on the approval screen shows each check with a ✓ or ✗. A single ✗ rejects the trade. You should not approve a trade the risk engine rejected.


Stage 12 — Reviewer agent

What it does. After a position closes (stop hit, target hit, or manual close), the reviewer critiques the outcome against the thesis and writes a structured lesson to the memory store for future runs.

Inputs the agent sees:

  • The original thesis: direction, entry, target, stop, conviction, rationale
  • The actual outcome: win, loss, or scratch
  • The realized P&L (in ₹)

What it produces — TradeReview:

Field | Meaning
critique | An honest assessment of what the thesis got right and wrong
lessons | Actionable takeaways for future runs
signal_evolution | How the thesis fared (see below)
thesis_vs_outcome | Predicted vs realised deltas (e.g. predicted target/stop % vs the outcome)
memory_record | A structured, tagged record written back to memory

signal_evolution is classified deterministically from the factual outcome, never from the model's wording: a win → Realized, a loss → Falsified, a scratch → Weakened. (The full set of states is Strengthened, Weakened, Falsified, Realized, and Unknown.)
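
Because the classification depends only on the outcome label, it fits in a lookup table. In this sketch the enum and function names are hypothetical, and mapping an unrecognised outcome to Unknown is an assumption.

```python
from enum import Enum


class SignalEvolution(str, Enum):
    STRENGTHENED = "strengthened"
    WEAKENED = "weakened"
    FALSIFIED = "falsified"
    REALIZED = "realized"
    UNKNOWN = "unknown"


_OUTCOME_TO_SIGNAL = {
    "win": SignalEvolution.REALIZED,
    "loss": SignalEvolution.FALSIFIED,
    "scratch": SignalEvolution.WEAKENED,
}


def classify_signal(outcome: str) -> SignalEvolution:
    """Derive the state from the factual outcome, never from model text."""
    return _OUTCOME_TO_SIGNAL.get(outcome.strip().lower(), SignalEvolution.UNKNOWN)
```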

Grounding rule. The realized P&L, the outcome label, and the signal-evolution state are all factual — derived from known numbers, not the model. Only the critique and lessons are model-authored. This prevents the learning loop from recording a hallucinated outcome. The memory record is also tagged with the symbol, direction, outcome, and signal so hybrid retrieval can match a similar setup next time.


How memory enriches every agent

Every agent has optional access to a memory store. The default store is now a hybrid retriever — it combines a BM25 keyword score with a hashed dense vector and fuses the two rankings with Reciprocal Rank Fusion (RRF). It is pure standard-library code: deterministic, offline, and needs no embedding model or network. ChromaDB is still available as an opt-in vector backend (TRADING_MEMORY=chroma).

The old guide described "a vector memory store (ChromaDB when enabled; in-memory mock otherwise)." The default is now this hybrid (BM25 + dense + RRF) store, not ChromaDB.
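
Reciprocal Rank Fusion itself is short enough to show in full. Each ranking contributes 1 / (k + rank) to a record's score, and the records are then sorted by total score. The sketch below uses the conventional k = 60; whether the store uses that constant, and the function name, are assumptions.

```python
from collections import defaultdict
from typing import Iterable, Sequence


def reciprocal_rank_fusion(rankings: Iterable[Sequence[str]], k: int = 60) -> list[str]:
    """Fuse several best-first rankings of record ids into one ordering."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the result is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_ranking = ["a", "b", "c"]   # e.g. keyword ranking
dense_ranking = ["b", "c", "a"]  # e.g. hashed dense-vector ranking
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
```

Here "b" wins because it sits near the top of both lists, even though neither ranking alone agrees on the full order.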

Before each agent's prompt is assembled, the base class queries the store for the most relevant past records for the current symbol. A relevance floor means only records that share at least one query term are surfaced — so an irrelevant past note can no longer leak into the prompt. If hits are found, they are prepended:

Relevant past notes:
- [review] RELIANCE: Chased a breakout; stock reversed. Lesson: wait for volume confirmation. [LONG RELIANCE | outcome=loss | signal=falsified | ...]
- [review] RELIANCE: Bull thesis on refinery margins held; hit target in 4 sessions. [LONG RELIANCE | outcome=win | signal=realized | ...]

[rest of prompt...]

The memory grows automatically as the reviewer writes its structured record after each closed position.


The price forecast (evidence only)

The feature bundle includes a baseline price forecast: an expected % move with a low/high band over the thesis horizon. The current implementation is a classical drift-plus-volatility-band baseline (deterministic, offline). It is shown to the technical analyst and the trader as evidence only and is physically barred from the risk and execution path — a forecast can inform reasoning but can never size or route a trade. A more sophisticated news-aware model can be dropped in later behind the same interface without changing any agent.
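
One classical way to build a drift-plus-volatility band is to take the mean and standard deviation of daily log returns and scale them to the horizon. The sketch below shows that textbook construction; it is not necessarily the formula the pipeline uses, and the function name and the z parameter are hypothetical.

```python
import math
from statistics import fmean, stdev


def baseline_forecast(closes: list[float], horizon_sessions: int, z: float = 1.0) -> dict:
    """Expected % move with a low/high band over the horizon."""
    if len(closes) < 3 or horizon_sessions < 1:
        raise ValueError("need at least 3 closes and a horizon of 1 or more")
    log_returns = [math.log(b / a) for a, b in zip(closes, closes[1:])]
    drift = fmean(log_returns) * horizon_sessions                     # grows with h
    spread = z * stdev(log_returns) * math.sqrt(horizon_sessions)     # grows with sqrt(h)

    def to_pct(log_move: float) -> float:
        return (math.exp(log_move) - 1.0) * 100.0

    return {
        "expected_pct": to_pct(drift),
        "low_pct": to_pct(drift - spread),
        "high_pct": to_pct(drift + spread),
    }
```

The result is a plain dictionary of percentages, which suits its evidence-only role: nothing in it is a price level or a quantity.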


Quick interpretation guide

Use this when scanning the approval screen quickly:

Pattern | What it likely means
All non-abstaining stances positive, high confidence | Strong consensus; rare and worth taking seriously
Mixed stances (some +, some −) | Genuine uncertainty; conviction will already be calibrated down. Check the manager rationale and falsifiers
Sentiment / Fundamental abstaining | No real data source for that factor; the trade rests on the other evidence. Not a red flag by itself
High debate conviction (> 0.75) + risk APPROVED | Cleanest signal, and the trader will have used the deeper Opus model
Low debate conviction (< 0.45) | Debate was close or the panel diverged; think twice
Risk check REJECTED | Hard stop: the trade violates a portfolio rule, do not override
Stop / target look tight or wide | They are derived from ATR (≈2×ATR stop, 2R target); that is the volatility talking, not a guess
News high-confidence bearish | Event-driven risk; the thesis is fighting active headwinds