
How the Pipeline Works

QuantSpace is a sequential ML trading pipeline with optional post-processing tools. Each stage takes the output URL of an earlier stage as input (the model stage takes two: the feature URL and the data-extraction URL), runs a job, and returns a new output URL.
run_data_extraction → run_feature_worker → run_ml_job (or run_dl_job) → run_po_job → run_trading_job
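Optional post-processing (run_st_job and run_risk_job take run_po_job output; run_plot_job takes the output of any stage from 1 to 4):
run_plot_job | run_st_job | run_risk_job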
Stage by stage:
| #  | Tool                | Input                                              | Output                        |
|----|---------------------|----------------------------------------------------|-------------------------------|
| 1  | run_data_extraction | config (tickers, dates, source)                    | data_extractor_*.json         |
| 2  | run_feature_worker  | data_extractor_*.json URL                          | feature_engine_*.json         |
| 3a | run_ml_job          | feature_engine_*.json + data_extractor_*.json URLs | ml_engine_*.json              |
| 3b | run_dl_job          | feature_engine_*.json + data_extractor_*.json URLs | nn_engine_*.json              |
| 4  | run_po_job          | ml_engine_*.json or nn_engine_*.json URL           | portfolio_optimization_*.json |
| 5  | run_trading_job     | portfolio_optimization_*.json URL                  | trading_report_*.json         |
| 6  | run_plot_job        | URL from a prior stage + plot config               | plot_*.html                   |
| 7  | run_st_job          | portfolio_optimization_*.json URL + stress config  | stress_test_*.html            |
| 8  | run_risk_job        | portfolio_optimization_*.json URL + risk config    | risk_*.(html/png/csv)         |
Each tool blocks until the job finishes and returns { status, output_url, output_name, execution_name }. All blobs are stored with a timestamp in the filename, so you can reuse any previous stage’s output_url to skip re-running it.
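The chaining described above can be sketched in Python. This is a minimal illustration, not the QuantSpace client: `call_tool` is a hypothetical callable standing in for whatever MCP client you use, passing arguments as a dict is an assumption, and `fake_call_tool` fabricates placeholder URLs so the flow can be dry-run without a server. Only the tool names, argument names, and result fields come from this page.

```python
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_core_pipeline(
    call_tool: ToolCaller,
    configs: Dict[str, Dict[str, Any]],
    use_dl: bool = False,
) -> Dict[str, str]:
    """Chain the five core stages and return each stage's output_url by tool name."""
    urls: Dict[str, str] = {}

    def step(tool: str, **url_args: str) -> str:
        # Every tool receives its upstream URL(s) plus its own config.
        result = call_tool(tool, {**url_args, "config": configs.get(tool, {})})
        urls[tool] = result["output_url"]
        return urls[tool]

    data_url = step("run_data_extraction")
    feature_url = step("run_feature_worker", input_url=data_url)
    # The model stage is the only one that needs two upstream URLs.
    model_tool = "run_dl_job" if use_dl else "run_ml_job"
    model_url = step(model_tool, feature_url=feature_url, data_extractor_url=data_url)
    po_url = step("run_po_job", input_url=model_url)
    step("run_trading_job", input_url=po_url)
    return urls


# Output filename prefixes per tool, taken from the stage table.
PREFIXES = {
    "run_data_extraction": "data_extractor",
    "run_feature_worker": "feature_engine",
    "run_ml_job": "ml_engine",
    "run_dl_job": "nn_engine",
    "run_po_job": "portfolio_optimization",
    "run_trading_job": "trading_report",
}


def fake_call_tool(tool: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in caller for a dry run: returns placeholder URLs, not real blobs."""
    name = f"{PREFIXES[tool]}_demo.json"
    return {
        "status": "Succeeded",
        "output_url": f"https://example.invalid/{name}",
        "output_name": name,
        "execution_name": f"{tool}-demo",
    }


if __name__ == "__main__":
    for tool, url in run_core_pipeline(fake_call_tool, configs={}).items():
        print(tool, "->", url)
```

Swapping `fake_call_tool` for a real client call is the only change needed to run against a server, provided that client accepts the same argument names.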

Agent Rule — Pipeline Workflow

Copy this prompt into your agent’s system prompt or Cursor rule to teach it how to run a full QuantSpace pipeline correctly.
You have access to the QuantSpace MCP server which runs an ML trading pipeline via 9 tools.

PIPELINE ORDER (must be followed strictly):
1. run_data_extraction(config) → returns output_url (data_extractor_*.json)
2. run_feature_worker(input_url=<step1 output_url>, config) → returns output_url (feature_engine_*.json)
3. run_ml_job(feature_url=<step2 output_url>, data_extractor_url=<step1 output_url>, config)
   OR run_dl_job(feature_url=<step2 output_url>, data_extractor_url=<step1 output_url>, config)
   → returns output_url (ml_engine_*.json or nn_engine_*.json)
4. run_po_job(input_url=<step3 output_url>, config) → returns output_url (portfolio_optimization_*.json)
5. run_trading_job(input_url=<step4 output_url>, config) → returns output_url (trading_report_*.json)
6. Optional post-processing:
   - run_plot_job(input_url=<step1/2/3/4 output_url>, config) → returns output_url (plot_*.html)
   - run_st_job(input_url=<step4 output_url>, config) → returns output_url (stress_test_*.html)
   - run_risk_job(input_url=<step4 output_url>, config) → returns output_url (risk_*.(html/png/csv))

RULES:
- Always pass the exact output_url from an earlier step to the next step (as input_url, or as feature_url and data_extractor_url in step 3).
- Never assume or invent an output_url. Skip a stage only when the user supplies that stage's blob URL from a previous run.
- Step 3 requires BOTH feature_url (from step 2) AND data_extractor_url (from step 1) simultaneously.
- If the user already has a blob URL from a previous run, use it directly — do not re-run that stage.
- Each tool call blocks until the job finishes (up to 10 minutes). Wait for the result before proceeding.
- On failure (status != "Succeeded"), report the execution_name to the user and stop. Do not retry automatically.
- Use run_ml_job for fast/baseline experiments. Use run_dl_job for complex pattern recognition.
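
Outside the prompt itself, the reuse and failure rules can also be enforced in client code. The sketch below is one possible shape, not part of QuantSpace: `call_tool`, `StageFailed`, and `stage_url` are hypothetical names, and the only facts taken from this page are the result fields and the "Succeeded" status value.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical signature: call_tool(tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


class StageFailed(RuntimeError):
    """Raised when a tool returns any status other than "Succeeded"."""

    def __init__(self, tool: str, status: Any, execution_name: Any) -> None:
        super().__init__(
            f"{tool} ended with status {status!r}; execution_name={execution_name!r}"
        )
        self.tool = tool
        self.status = status
        self.execution_name = execution_name


def stage_url(
    call_tool: ToolCaller,
    tool: str,
    args: Dict[str, Any],
    existing_url: Optional[str] = None,
) -> str:
    """Return the stage's output_url, reusing existing_url instead of re-running."""
    if existing_url is not None:
        return existing_url  # user-supplied blob URL: do not call the tool
    result = call_tool(tool, args)  # blocks until the job finishes
    if result.get("status") != "Succeeded":
        # Single attempt only: surface execution_name and let the caller stop.
        raise StageFailed(tool, result.get("status"), result.get("execution_name"))
    return result["output_url"]
```

A caller would wrap the chain in one `try` block, catch `StageFailed`, report `execution_name` to the user, and stop without retrying.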

Agent Rule — When to Use QuantSpace MCP

Copy this prompt into your agent’s system prompt to define when it should call QuantSpace tools and when it should handle the request itself.
You have access to the QuantSpace MCP server. Follow the guidance below to decide when to call its tools.

CALL QuantSpace MCP tools when:
- User asks to run, backtest, or evaluate a trading strategy on historical price data
- User wants to compare ML models (RandomForest, XGBoost, LightGBM, LSTM, Transformer, etc.) for price prediction
- User asks for portfolio optimization (Sharpe ratio, HRP, risk budgeting, minimum volatility, etc.)
- User wants to fetch OHLCV data for a list of tickers from Yahoo Finance, Polygon, or Limex
- User wants to compute technical indicators (RSI, MACD, Bollinger Bands, ATR, etc.) on market data
- User asks "run the full pipeline" or "run a backtest from scratch"

DO NOT call QuantSpace MCP tools when:
- User asks general questions about trading, finance, or ML theory
- User wants to analyze or plot data you already have in context
- User asks to write code that does not involve running a job

TOOL SELECTION GUIDE:
- Need price data? → run_data_extraction
- Need technical features? → run_feature_worker (after data extraction)
- Need ML predictions, fast? → run_ml_job (after feature worker)
- Need deep learning predictions? → run_dl_job (after feature worker)
- Need portfolio weights? → run_po_job (after ml_job or dl_job)
- Need backtest results? → run_trading_job (after po_job)
- Need chart rendering from pipeline JSON? → run_plot_job
- Need stress test report? → run_st_job (after po_job)
- Need risk analytics artifact? → run_risk_job (after po_job)

Always run stages in order. Never call run_po_job before run_ml_job or run_dl_job.
Always confirm output_url from each stage before calling the next.
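
The ordering in the tool selection guide can be checked mechanically before any job is started. The following is a minimal sketch under stated assumptions: `NEEDS_ANY_OF` and `first_out_of_order` are hypothetical names, and the dependency table is transcribed from the guide above rather than read from the server.

```python
from typing import Dict, FrozenSet, Iterable, Optional

# Each tool maps to the tools whose output it can consume; any one is enough.
# run_ml_job and run_dl_job also need the data-extraction URL, which is implied
# here because run_feature_worker itself depends on run_data_extraction.
NEEDS_ANY_OF: Dict[str, FrozenSet[str]] = {
    "run_data_extraction": frozenset(),
    "run_feature_worker": frozenset({"run_data_extraction"}),
    "run_ml_job": frozenset({"run_feature_worker"}),
    "run_dl_job": frozenset({"run_feature_worker"}),
    "run_po_job": frozenset({"run_ml_job", "run_dl_job"}),
    "run_trading_job": frozenset({"run_po_job"}),
    "run_plot_job": frozenset({
        "run_data_extraction", "run_feature_worker",
        "run_ml_job", "run_dl_job", "run_po_job",
    }),
    "run_st_job": frozenset({"run_po_job"}),
    "run_risk_job": frozenset({"run_po_job"}),
}


def first_out_of_order(
    plan: Iterable[str], available: Iterable[str] = ()
) -> Optional[str]:
    """Return the first tool in plan whose prerequisite output is missing, else None.

    available lists tools whose output URL already exists from a previous run.
    """
    done = set(available)
    for tool in plan:
        needed = NEEDS_ANY_OF[tool]
        if needed and not (needed & done):
            return tool
        done.add(tool)
    return None


if __name__ == "__main__":
    plan = ["run_data_extraction", "run_feature_worker", "run_po_job"]
    print(first_out_of_order(plan))  # run_po_job: no model stage ran before it
```

Passing `available=["run_po_job"]` models the case where the user already holds a portfolio_optimization_*.json URL, so run_st_job or run_risk_job can be planned on their own.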