From c2fa046a9bc169b218c827127f2e44338ebd0890 Mon Sep 17 00:00:00 2001
From: Yijia-Xiao
Date: Sat, 28 Dec 2024 11:32:06 +0800
Subject: [PATCH 01/56] TradingAgents-AI

---
 README.md  |   5 +
 index.html | 455 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 460 insertions(+)
 create mode 100644 README.md
 create mode 100644 index.html

diff --git a/README.md b/README.md
new file mode 100644
index 00000000..2d944541
--- /dev/null
+++ b/README.md
@@ -0,0 +1,5 @@
+# TradingAgents: Multi-Agents LLM Financial Trading Framework
+
+> MARW Workshop, AAAI 2025
+>
+> Homepage: https://TradingAgents.github.io/
diff --git a/index.html b/index.html
new file mode 100644
index 00000000..e52ee518
--- /dev/null
+++ b/index.html
@@ -0,0 +1,455 @@
+ + + + + + + + TradingAgents: Multi-Agents LLM Financial Trading Framework + + + + + + + + + + + + + + + + + + + + + + +
+
+
+
+
+

TradingAgents: Multi-Agents LLM Financial Trading Framework

+
+ Yijia Xiao1, + Edward Sun1, + Di Luo2, + Wei Wang1 +
+
+ 1University of California, Los Angeles, + 2Massachusetts Institute of Technology +
+
+ +
+
+
+
+
+
+ +
+
+
+
+

Abstract

+
+

Significant progress has been made in automated problem-solving using societies of agents powered by large language models (LLMs). In finance, efforts have largely focused on single-agent systems handling specific tasks or multi-agent frameworks independently gathering data. However, multi-agent systems' potential to replicate real-world trading firms' collaborative dynamics remains underexplored. TradingAgents proposes a novel stock trading framework inspired by trading firms, featuring LLM-powered agents in specialized roles such as fundamental analysts, sentiment analysts, technical analysts, and traders with varied risk profiles. The framework includes Bull and Bear researcher agents assessing market conditions, a risk management team monitoring exposure, and traders synthesizing insights from debates and historical data to make informed decisions. By simulating a dynamic, collaborative trading environment, this framework aims to improve trading performance. Detailed architecture and extensive experiments reveal its superiority over baseline models, with notable improvements in cumulative returns, Sharpe ratio, and maximum drawdown, highlighting the potential of multi-agent LLM frameworks in financial trading.

+
+
+
+
+
+ +
+
+
+
+

Introduction

+
+

Autonomous agents leveraging Large Language Models (LLMs) present a transformative approach to decision-making by replicating human processes and workflows across various applications. These systems enhance the problem-solving capabilities of language agents by equipping them with tools and enabling collaboration with other agents, effectively breaking down complex problems into manageable components Park et al., 2023, Havrilla et al., 2024, Talebirad et al., 2023, Tang et al., 2024. One prominent application of these autonomous frameworks is in the financial market—a highly complex system influenced by numerous factors, including company fundamentals, market sentiment, technical indicators, and macroeconomic events.

+ +

Traditional algorithmic trading systems often rely on quantitative models that struggle to fully capture the complex interplay of diverse factors. In contrast, LLMs excel at processing and understanding natural language data, making them particularly effective for tasks that require textual comprehension, such as analyzing news articles, financial reports, and social media sentiment. Additionally, deep learning-based trading systems often suffer from low explainability, as they rely on hidden features that drive decision-making but are difficult to interpret. Recent advancements in multi-agent LLM frameworks for finance have shown significant promise in addressing these challenges. These frameworks create explainable AI systems, where decisions are supported by evidence and transparent reasoning Li et al., 2023, Wang et al., 2024, Yu et al., 2024, demonstrating their potential in financial applications.

+ +

Despite their potential, most current applications of language agents in the financial and trading sectors face two significant limitations:

+ + Lack of Realistic Organizational Modeling: Many frameworks fail to capture the complex interactions between agents that mimic the structure of real-world trading firms Li et al., 2023, Wang et al., 2024, Yu et al., 2024. Instead, they focus narrowly on specific task performance, often disconnected from the organizational workflows and established human operating procedures proven effective in trading. This limits their ability to fully replicate and benefit from real-world trading practices. + + Inefficient Communication Interfaces: Most existing systems use natural language as the primary communication medium, typically relying on message histories or an unstructured pool of information for decision-making Park et al., 2023, Qian et al., 2024. This approach often results in a "telephone effect", where details are lost, and states become corrupted as conversations lengthen. Agents struggle to maintain context and track extended histories while filtering out irrelevant information from previous decision steps, diminishing their effectiveness in handling complex, dynamic tasks. Additionally, the unstructured pool-of-information approach lacks clear instructions, forcing logical communication and information exchange between agents to depend solely on retrieval, which disrupts the relational integrity of the data. + +

In this work, we introduce a system that addresses both limitations. First, our framework simulates the multi-agent decision-making processes typical of professional trading teams. It incorporates specialized agents tailored to distinct aspects of trading, inspired by the organizational structure of real-world trading firms. These agents include fundamental analysts, sentiment/news analysts, technical analysts, and traders with diverse risk profiles. Bullish and bearish debaters evaluate market conditions to provide balanced recommendations, while a risk management team ensures that exposures remain within acceptable limits. Second, to enhance communication, our framework combines structured outputs for control, clarity, and reasoning with natural language dialogue to facilitate effective debate and collaboration among agents. This hybrid approach ensures both precision and flexibility in decision-making.

+ +

We validate our framework through experiments on historical financial data, comparing its performance against multiple baselines. Comprehensive evaluation metrics, including cumulative return, Sharpe ratio, and maximum drawdown, are employed to assess its overall effectiveness.

+
+
+
+
+
+ +
+
+
+
+

Related Work

+ +

LLMs as Financial Assistants

+
+

Large Language Models (LLMs) are applied in finance by fine-tuning on financial data or training on financial corpora. This improves the model’s understanding of financial terminology and data, enabling a specialized assistant for analytical support, insights, and information retrieval, rather than trade execution.

+ + Fine-Tuned LLMs for Finance +

Fine-tuning enhances domain-specific performance. Examples include PIXIU (FinMA) Xie et al., 2023, which fine-tuned LLaMA on 136K finance-related instructions; FinGPT Yang et al., 2023, which used LoRA to fine-tune models like LLaMA and ChatGLM with about 50K finance-specific samples; and Instruct-FinGPT Zhang et al., 2023, fine-tuned on 10K instruction samples from financial sentiment analysis datasets. These models outperform their base versions and other open-source LLMs like BLOOM and OPT Zhang et al., 2022 in finance classification tasks, even surpassing BloombergGPT Wu et al., 2023 in several evaluations. However, in generative tasks, they perform similarly or slightly worse than powerful general-purpose models like GPT-4, indicating a need for more high-quality, domain-specific datasets.

+ + Finance LLMs Trained from Scratch +

Training LLMs from scratch on finance-specific corpora aims for better domain adaptation. Models like BloombergGPT Wu et al., 2023, XuanYuan 2.0 Zhang et al., 2023, and Fin-T5 Lu et al., 2023 combine public datasets with finance-specific data during pretraining. BloombergGPT, for instance, was trained on both general and financial text, with proprietary Bloomberg data enhancing its performance on finance benchmarks. These models outperform general-purpose counterparts like BLOOM-176B and T5 in tasks such as market sentiment classification and summarization. While they may not match larger closed-source models like GPT-3 or PaLM Chowdhery et al., 2022, they offer competitive performance among similar-sized open-source models without compromising general language understanding.

+ +

In summary, finance-specific LLMs developed through fine-tuning or training from scratch show significant improvements in domain-specific tasks, underscoring the importance of domain adaptation and the potential for further enhancements with high-quality finance-specific datasets.

+ +
+ TradingAgents Overall Framework Organization +
Figure 1: TradingAgents Overall Framework Organization. I. Analysts Team: Four analysts concurrently gather relevant market information. II. Research Team: The team discusses and evaluates the collected data. III. Trader: Based on the researchers' analysis, the trader makes the trading decision. IV. Risk Management Team: Risk guardians assess the decision against current market conditions to mitigate risks. V. Fund Manager: The fund manager approves and executes the trade.
+
+
+ +

LLMs as Traders

+
+

LLMs act as trader agents making direct trading decisions by analyzing external data like news, financial reports, and stock prices. Proposed architectures include news-driven, reasoning-driven, and reinforcement learning (RL)-driven agents.

+ + News-Driven Agents +

News-driven architectures integrate stock news and macroeconomic updates into LLM prompts to predict stock price movements. Studies evaluating both closed-source models (e.g., GPT-3.5, GPT-4) and open-source LLMs (e.g., Qwen Bai et al., 2023, Baichuan Yang et al., 2023) in financial sentiment analysis have shown the effectiveness of simple long-short strategies based on sentiment scores Lopezlira et al., 2023. Further research on fine-tuned LLMs like FinGPT and OPT demonstrates improved performance through domain-specific alignment Unveiling et al., Sentitrade et al.. Advanced methods involve summarizing news data and reasoning about their relationship with stock prices Beatunveiling et al., Wang et al., 2024.

+ + Reasoning-Driven Agents +

Reasoning-driven agents enhance trading decisions through mechanisms like reflection and debate. Reflection-driven agents, such as FinMem FinMem et al. and FinAgent MultimodalFinMem et al., use layered memorization and multimodal data to summarize inputs into memories, inform decisions, and incorporate technical indicators, achieving superior backtest performance while mitigating hallucinations Ji et al., 2023. Debate-driven agents, like those in heterogeneous frameworks Xing et al., 2024 and TradingGPT Li et al., 2023, enhance reasoning and factual validity by employing LLM debates among agents with different roles, improving sentiment classification and increasing robustness in trading decisions.

+ + Reinforcement Learning-Driven Agents +

Reinforcement learning methods align LLM outputs with expected behaviors, using backtesting as rewards. SEP Koa, 2024 employs RL with memorization and reflection to refine LLM predictions based on market history. Classical RL methods are also used in trading frameworks that integrate LLM-generated embeddings with stock features, trained via algorithms like Proximal Policy Optimization (PPO) Ding et al., 2023, PPO, Year.

+
+ +

LLMs as Alpha Miners

+
+

LLMs are also used to generate alpha factors instead of making direct trading decisions. QuantAgent Wang et al., 2023 demonstrates this by leveraging LLMs to produce alpha factors through an inner-loop and outer-loop architecture. In the inner loop, a writer agent generates a script from a trader's idea, while a judge agent provides feedback. In the outer loop, the code is tested in the real market, and trading results enhance the judge agent. This approach enables progressive approximation of optimal behavior.

+ +

Subsequent research, such as AlphaGPT Wang et al., 2023, proposes a human-in-the-loop framework for alpha mining with a similar architecture. Both studies showcase the effectiveness of LLM-powered alpha mining systems, highlighting their potential in automating and accelerating the development of trading strategies by generating and refining alpha factors.

+
+
+
+
+
+ +
+
+
+
+

TradingAgents: Role Specialization

+
+

Assigning LLM agents clear, well-defined roles with specific goals enables the breakdown of complex objectives into smaller, manageable subtasks. Financial trading is a prime example of such complexity, demanding the integration of diverse signals, inputs, and specialized expertise. In the real world, this approach to managing complexity is demonstrated by trading firms that rely on expert teams to collaborate and make high-stakes decisions, underscoring the multifaceted nature of the task.

+ +

In a typical trading firm, vast amounts of data are collected, including financial metrics, price movements, trading volumes, historical performance, economic indicators, and news sentiment. This data is then analyzed by quantitative experts (quants), including mathematicians, data scientists, and engineers, using advanced tools and algorithms to identify trends and predict market movements.

+ +

Inspired by this organizational structure, TradingAgents defines seven distinct agent roles within a simulated trading firm: Fundamentals Analyst, Sentiment Analyst, News Analyst, Technical Analyst, Researcher, Trader, and Risk Manager. Each agent is assigned a specific name, role, goal, and set of constraints, alongside predefined context, skills, and tools tailored to their function. For example, a Sentiment Analyst is equipped with tools like web search engines, Reddit search APIs, X/Twitter search tools, and sentiment score calculation algorithms, while a Technical Analyst can execute code, calculate technical indicators, and analyze trading patterns. More specifically, TradingAgents assumes the following teams.

+ +

Analyst Team

+
+

The Analyst Team (Figure 2) is composed of specialized agents responsible for gathering and analyzing various types of market data to inform trading decisions. Each agent focuses on a specific aspect of market analysis, bringing together a comprehensive view of the market's conditions.

+ +
+ TradingAgents Analyst Team +
Figure 2: TradingAgents Analyst Team
+
+ +
    +
  • Fundamental Analyst Agents: These agents evaluate company fundamentals by analyzing financial statements, earnings reports, insider transactions, and other pertinent data. They assess a company's intrinsic value to identify undervalued or overvalued stocks, providing insights into long-term investment potential.
  • +
  • Sentiment Analyst Agents: These agents process large volumes of social media posts, sentiment scores, and insider sentiments derived from public information and social media activity. They gauge market sentiment to predict how collective investor behavior might impact stock prices in the short term.
  • +
  • News Analyst Agents: These agents analyze news articles, government announcements, and other macroeconomic indicators to assess the market's macroeconomic state, major world events, and significant company changes. They identify news events that could influence market movements, helping to anticipate sudden shifts in market dynamics.
  • +
  • Technical Analyst Agents: These agents calculate and select relevant technical indicators, such as Moving Average Convergence Divergence (MACD) and Relative Strength Index (RSI), customized for specific assets. They analyze price patterns and trading volumes to forecast future price movements, assisting in timing entry and exit points.
  • +
+ +
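As a rough illustration of the kind of indicator a Technical Analyst agent might compute, the sketch below implements a simple-average RSI. The function name `rsi`, the 14-period default, and the use of plain averages rather than Wilder smoothing are assumptions made for this example; the source does not specify how its indicators are calculated.

```python
from typing import Sequence


def rsi(prices: Sequence[float], period: int = 14) -> float:
    """Simple-average RSI over the last `period` price changes (illustrative only)."""
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    window = prices[-(period + 1):]
    # Day-over-day changes within the window.
    changes = [b - a for a, b in zip(window, window[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        # No losses in the window: RSI is conventionally reported as 100.
        return 100.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)
```

Readings near 100 are usually interpreted as overbought and readings near 0 as oversold; the thresholds an agent would act on are not stated in the source.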

Collectively, the Analyst Team synthesizes data from multiple sources to provide a holistic market analysis. Their combined insights form the foundational input for the Researcher Team, ensuring that all facets of the market are considered in subsequent decision-making processes.

+
+ +

Researcher Team

+
+

The Researcher Team (Figure 3) is responsible for critically evaluating the information provided by the Analyst Team. Composed of agents adopting both bullish and bearish perspectives, the team engages in multiple rounds of debate to assess the potential risks and benefits of investment decisions.

+ +
+ TradingAgents Researcher Team +
Figure 3: TradingAgents Researcher Team: Bullish Perspectives and Bearish Perspectives
+
+ +
    +
  • Bullish Researchers: These agents advocate for investment opportunities by highlighting positive indicators, growth potential, and favorable market conditions. They construct arguments supporting the initiation or continuation of positions in certain assets.
  • +
  • Bearish Researchers: Conversely, these agents focus on potential downsides, risks, and unfavorable market signals. They provide cautionary insights, questioning the viability of investment strategies and highlighting possible negative outcomes.
  • +
+ +

Through this dialectical process, the Researcher Team aims to reach a balanced understanding of the market situation. Their thorough analysis helps in identifying the most promising investment strategies while anticipating possible challenges, thus aiding the Trader Agents in making informed decisions.

+
+ +

Trader Agents

+
+

Trader Agents (Figure 4) are responsible for executing trading decisions based on the comprehensive analysis provided by the Analyst Team and the nuanced perspectives from the Researcher Team. They assess the synthesized information, considering both quantitative data and qualitative insights, to determine optimal trading actions.

+ +
+ TradingAgents Trader Decision-Making Process +
Figure 4: TradingAgents Trader Decision-Making Process
+
+ +

The tasks of Trader Agents include:

+ +
    +
  • Evaluating recommendations and insights from analysts and researchers.
  • +
  • Deciding on the timing and size of trades to maximize trading returns.
  • +
  • Placing buy or sell orders in the market.
  • +
  • Adjusting portfolio allocations in response to market changes and new information.
  • +
+ +

Trader Agents must balance potential returns against associated risks, making timely decisions in a dynamic market environment. Their actions directly impact the firm's performance, necessitating a high level of precision and strategic thinking.

+
+ +

Risk Management Team

+
+

The Risk Management Team (Figure 5) monitors and controls the firm's exposure to various market risks. These agents continuously evaluate the portfolio's risk profile, ensuring that trading activities remain within predefined risk parameters and comply with regulatory requirements.

+ +
+ TradingAgents Risk Management Team and Fund Manager Approval Workflow +
Figure 5: TradingAgents Risk Management Team and Fund Manager Approval Workflow
+
+ +

The responsibilities of the Risk Management Team include:

+ +
    +
  • Assessing factors such as market volatility, liquidity, and counterparty risks.
  • +
  • Implementing risk mitigation strategies, such as setting stop-loss orders or diversifying holdings.
  • +
  • Providing feedback to Trader Agents on risk exposures and suggesting adjustments to trading strategies.
  • +
  • Ensuring that the overall portfolio aligns with the firm's risk tolerance and investment objectives.
  • +
+ +

By offering oversight and guidance, the Risk Management Team helps maintain the firm's financial stability and protect against adverse market events. They play a crucial role in safeguarding assets and ensuring sustainable long-term performance.

+ +

All agents in TradingAgents follow the ReAct prompting framework Yao et al., 2023, which synergizes reasoning and acting. The environment state is shared and monitored by the agents, enabling them to take context-appropriate actions such as conducting research, executing trades, engaging in debates, or managing risks. This design ensures a collaborative, dynamic decision-making process reflective of real-world trading systems.

+
+
+
+
+
+
+ +
+
+
+
+

TradingAgents: Agent Workflow

+
+

Communication Protocol

+

Most existing LLM-based agent frameworks use natural language as the primary communication interface, typically through structured message histories or collections of agent-generated messages Fatouros et al., 2024, Li et al., 2023, Yang et al., 2024, Yang et al., 2023. However, relying solely on natural language often proves insufficient for solving complex, long-term tasks that require extensive planning horizons. In such cases, pure natural language communication can resemble a game of telephone—over multiple iterations, initial information may be forgotten or distorted due to context length limitations and an overload of text that obscures critical earlier details Hong et al., 2024. To address this limitation, we draw inspiration from frameworks like MetaGPT, which adopt a structured approach to communication. Our model introduces a structured communication protocol to govern agent interactions. By clearly defining each agent's state, we ensure that each role only extracts or queries the necessary information, processes it, and returns a completed report. This streamlined approach reduces unnecessary steps, lowers the risk of message corruption, and keeps interactions focused and efficient, even in complex, long-horizon tasks.

+ +

Types of Agent Interactions

+

In contrast to previous multi-agent trading frameworks, which rely heavily on natural language dialogue, TradingAgents agents communicate primarily through structured documents and diagrams. These documents encapsulate the agents' insights in concise, well-organized reports that preserve essential content while avoiding irrelevant information. By utilizing structured reports, agents can query necessary details directly from the global state, eliminating the need for lengthy conversations that risk diluting information, extending the message state indefinitely, and causing data loss. The types of documents and the information they contain are detailed below:

+ +
    +
  • Analyst Team: Fundamental, sentiment, news, and technical analysts compile their research and findings into concise analysis reports specific to their areas of expertise. These reports include key metrics, insights, and recommendations based on their specialized analyses.
  • +
  • Traders: Traders review and analyze the reports from the analysts, carefully deliberating to produce clear decision signals. They accompany these decisions with detailed reports explaining their rationale and supporting evidence, which are later utilized by the risk management team.
  • +
+ +
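One way to picture the structured documents described above is as typed records stored in a shared state that agents write to and query. The sketch below is a minimal illustration under that assumption; the class names (`AnalystReport`, `TraderDecision`, `GlobalState`), their fields, and the sample values are hypothetical and are not taken from the TradingAgents implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AnalystReport:
    analyst: str                   # e.g. "fundamentals", "sentiment", "news", "technical"
    key_metrics: Dict[str, float]  # named metrics backing the report
    summary: str
    recommendation: str


@dataclass
class TraderDecision:
    signal: str     # "buy", "sell", or "hold"
    rationale: str  # explanation later read by the risk management team


@dataclass
class GlobalState:
    ticker: str
    date: str
    analyst_reports: Dict[str, AnalystReport] = field(default_factory=dict)
    debate_history: List[str] = field(default_factory=list)
    trader_decision: Optional[TraderDecision] = None

    def submit_report(self, report: AnalystReport) -> None:
        # Each analyst overwrites only its own slot in the shared state.
        self.analyst_reports[report.analyst] = report

    def query_report(self, analyst: str) -> Optional[AnalystReport]:
        # Downstream agents fetch exactly the report they need.
        return self.analyst_reports.get(analyst)


# Placeholder values for illustration only.
state = GlobalState(ticker="AAPL", date="2024-01-02")
state.submit_report(
    AnalystReport("technical", {"rsi": 61.5}, "Momentum is positive.", "buy")
)
```

Because each role reads a named entry rather than replaying a message history, the state stays bounded in size no matter how many steps precede it.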

Agents engage in natural language dialogue exclusively during agent-to-agent conversations and debates. These concise, focused discussions have been shown to promote deeper reasoning and integrate diverse perspectives, enabling more balanced decisions in complex, long-horizon scenarios—a method particularly relevant to the intricate environment of trading Du et al., 2023. This approach seamlessly integrates with our structured framework, as the conversation state is recorded as a structured entry within the overall agent state. The types of communication in these scenarios are detailed below:

+ +
    +
  • Researcher Team: Each researcher agent queries the global agent state for analyst reports and carefully forms their opinion. Two researchers represent opposing perspectives: one bullish and one bearish. They engage in natural language dialogue for $n$ rounds, as determined by the debate facilitator agent. At the conclusion, the facilitator reviews the debate history, selects the prevailing perspective, and records it as a structured entry in the communication protocol.
  • +
  • Risk Management Team: The risk management team, similar to the researcher team, queries the trader's decision and accompanying report. They then deliberate from three perspectives—risk-seeking, neutral, and risk-conservative—to adjust the trading plan within risk constraints. They engage in $n$ rounds of natural language discussion, guided by a facilitator agent.
  • +
  • Fund Manager: The fund manager reviews the discussion from the risk management team, determines the appropriate risk adjustments, and updates the trader's decision and report states within the communication protocol.
  • +
+ +
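The debate pattern above can be sketched as a loop in which two speakers alternate for a fixed number of rounds and a facilitator then reads the transcript and records a verdict. This is a simplified illustration: the function `run_debate`, the `Speaker` type, and the stub agents are hypothetical, the agents are plain callables rather than LLM calls, and the round count is passed in as a parameter instead of being chosen by the facilitator.

```python
from typing import Callable, List, Tuple

# A speaker receives the transcript so far and returns its next message.
Speaker = Callable[[List[str]], str]


def run_debate(
    bull: Speaker,
    bear: Speaker,
    facilitator: Callable[[List[str]], str],
    n_rounds: int,
) -> Tuple[str, List[str]]:
    """Alternate bull and bear for n_rounds, then let the facilitator decide."""
    transcript: List[str] = []
    for _ in range(n_rounds):
        transcript.append("bull: " + bull(transcript))
        transcript.append("bear: " + bear(transcript))
    # The verdict is what would be written back as a structured state entry.
    verdict = facilitator(transcript)
    return verdict, transcript


# Stub agents standing in for LLM-backed researchers.
def stub_bull(transcript: List[str]) -> str:
    return "Growth outlook supports a long position."


def stub_bear(transcript: List[str]) -> str:
    return "Valuation risk argues for caution."


def stub_facilitator(transcript: List[str]) -> str:
    return "bearish"


verdict, transcript = run_debate(stub_bull, stub_bear, stub_facilitator, n_rounds=2)
```

The same loop shape would apply to the risk management discussion, with three speakers instead of two.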

Backbone LLMs

+

To meet the diverse complexity and speed demands of tasks in our framework, we strategically select Large Language Models (LLMs) based on their strengths. Quick-thinking models, such as gpt-4o-mini and gpt-4o, efficiently handle fast, low-depth tasks like summarization, data retrieval, and converting tabular data to text OpenAI, 2024. In contrast, deep-thinking models like o1-preview excel in reasoning-intensive tasks such as decision-making, evidence-based report writing, and data analysis. These models leverage their architectures for multi-round reasoning, producing logically sound, in-depth insights Zhong et al., 2024, Wang et al., 2024, OpenAI, 2024. Additionally, we prioritize models with proven reliability and scalability to ensure optimal performance across various market conditions. We also employ auxiliary expert models for specialized tasks like sentiment analysis.

+ +

Specifically, all analyst nodes rely on deep-thinking models to ensure robust analysis, while quick-thinking models handle data retrieval from APIs and tools for efficiency. Researchers and traders use deep-thinking models to generate valuable insights and support well-informed decisions. By aligning the choice of LLMs with the specific requirements of each task, our framework achieves a balance between efficiency and depth of reasoning, which is crucial for effective trading strategies.
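The split between quick-thinking and deep-thinking models can be expressed as a small routing table. In the sketch below the model names come from the text, while the task labels, the `TASK_TO_MODEL` mapping, and the `select_model` helper are hypothetical names chosen for illustration.

```python
QUICK_MODEL = "gpt-4o-mini"  # fast, low-depth tasks
DEEP_MODEL = "o1-preview"    # reasoning-intensive tasks

# Hypothetical task labels mapped to the model class described in the text.
TASK_TO_MODEL = {
    "data_retrieval": QUICK_MODEL,
    "summarization": QUICK_MODEL,
    "analyst_report": DEEP_MODEL,
    "research_debate": DEEP_MODEL,
    "trade_decision": DEEP_MODEL,
}


def select_model(task: str, default: str = QUICK_MODEL) -> str:
    """Return the backbone model for a task, falling back to the quick model."""
    return TASK_TO_MODEL.get(task, default)
```

Keeping the mapping in one place is one way to make a backbone swap a single-line change.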

+ +

This implementation strategy ensures that TradingAgents can be deployed without requiring a GPU, relying only on API credits. It also introduces seamless exchangeability of backbone models, enabling researchers to effortlessly replace the model with any locally hosted or API-accessible alternatives in the future. This adaptability supports the integration of improved reasoning models or finance-tuned models customized for specific tasks. As a result, TradingAgents is highly scalable and future-proof, offering flexibility to accommodate any backbone model for any of its agents.

+
+
+
+
+
+ +
+
+
+
+

Experiments

+
+

In this section, we describe the experimental setup used to evaluate our proposed framework. We also provide detailed descriptions of the evaluation metrics employed to assess performance comprehensively.

+ +
+ Performance Comparison +
Table 1: Performance comparison across all methods using four evaluation metrics. Results highlighted in green represent the best-performing statistic for each model. The improvement row illustrates TradingAgents' performance gains over the top-performing baselines.
+
+ +

Back Trading

+

To simulate a realistic trading environment, we utilize a multi-asset, multi-modal financial dataset comprising various stocks such as Apple, Nvidia, Microsoft, Meta, Google, and more. The dataset includes:

+ +
    +
  • Historical Stock Prices: Open, high, low, close, volume, and adjusted close prices from January 1st, 2024, to March 29th, 2024.
  • +
  • News Articles: Daily news updates are gathered from diverse sources such as Bloomberg, Yahoo, EODHD, FinnHub, and Reddit, covering specific company developments, global events, macroeconomic trends, and government updates.
  • +
  • Social Media Posts and Sentiment: Posts from Reddit, X/Twitter, and other platforms along with sentiment scores of posts calculated by auxiliary language models.
  • +
  • Insider Sentiments and Transactions: Sentiment derived from public information, including transactions from SEDI and relevant company filings.
  • +
  • Financial Statements and Earnings Reports: Quarterly and annual reports filed by companies.
  • +
  • Company Profiles and Financial History: Descriptions of company profiles, target industries, and financial history reported by third parties.
  • +
  • Technical Indicators: Sixty standard technical analysis indicators calculated for each asset, including MACD, RSI, Bollinger Bands, etc.
  • +
+ +

Simulation Setup

+

We simulate the trading environment for the period from June 19, 2024, to November 19, 2024. TradingAgents facilitates seamless plug-and-play strategies during the simulation, enabling straightforward comparisons with any baseline. Agents make decisions based solely on data available up to each trading day, ensuring no future data is used (eliminating look-ahead bias). Based on their analysis, TradingAgents generates trading signals to buy, sell, or hold assets, which are then executed. Afterward, analysis metrics are calculated before proceeding to the next day's data.
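A day-by-day loop of this kind can be sketched as follows. The key point is that the decision function only ever receives the price history up to the current day. The `backtest` function, its all-in/all-out position sizing, and execution at the same day's price are simplifying assumptions for illustration, not a description of the actual simulator.

```python
from typing import Callable, List, Sequence


def backtest(
    prices: Sequence[float],
    decide: Callable[[Sequence[float]], str],
    initial_cash: float = 10_000.0,
) -> List[float]:
    """Run a daily simulation and return the portfolio value after each day."""
    cash, shares = initial_cash, 0.0
    values: List[float] = []
    for day in range(len(prices)):
        history = prices[: day + 1]  # data up to today only, so no look-ahead
        signal = decide(history)
        price = prices[day]
        if signal == "buy" and cash > 0:
            shares, cash = cash / price, 0.0   # assumption: invest all cash
        elif signal == "sell" and shares > 0:
            cash, shares = shares * price, 0.0  # assumption: liquidate fully
        values.append(cash + shares * price)
    return values
```

Any strategy that maps a price history to "buy", "sell", or "hold" can be passed as `decide`, which is what allows a baseline and an agent-driven strategy to be compared on the same loop.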

+ +

Baseline Models

+

We compare our TradingAgents framework against several baselines:

+ +
    +
  • Buy and Hold: Investing equal amounts in all selected stocks and holding them throughout the simulation period.
  • +
  • MACD (Moving Average Convergence Divergence): A trend-following momentum strategy that generates buy and sell signals based on the crossover points between the MACD line and signal line.
  • +
  • KDJ and RSI (Relative Strength Index): A momentum strategy combining KDJ (stochastic oscillator) and RSI (relative strength index) indicators to identify overbought and oversold conditions for trading signals.
  • +
  • ZMR (Zero Mean Reversion): A mean reversion trading strategy that generates signals based on price deviations from and subsequent reversions to a zero reference line.
  • +
  • SMA (Simple Moving Average): A trend-following strategy that generates trading signals based on crossovers between short-term and long-term moving averages.
  • +
+ +
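To make the MACD baseline concrete, the sketch below generates signals from crossovers between the MACD line and its signal line. The 12/26/9 spans are the conventional defaults and are an assumption here, as are the function names `ema` and `macd_signals`; the source does not give the parameters used in its experiments.

```python
from typing import List, Sequence


def ema(values: Sequence[float], span: int) -> List[float]:
    """Exponential moving average seeded with the first value."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd_signals(
    prices: Sequence[float], fast: int = 12, slow: int = 26, signal_span: int = 9
) -> List[str]:
    """Return one of "buy", "sell", or "hold" for each day."""
    macd = [f - s for f, s in zip(ema(prices, fast), ema(prices, slow))]
    signal_line = ema(macd, signal_span)
    signals = ["hold"]  # no crossover can be detected on the first day
    for t in range(1, len(prices)):
        prev = macd[t - 1] - signal_line[t - 1]
        cur = macd[t] - signal_line[t]
        if prev <= 0 < cur:
            signals.append("buy")   # MACD crosses above the signal line
        elif prev >= 0 > cur:
            signals.append("sell")  # MACD crosses below the signal line
        else:
            signals.append("hold")
    return signals
```

The SMA baseline follows the same crossover logic with a short-term and a long-term simple moving average in place of the MACD and signal lines.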

Evaluation Metrics

+ +
+ Cumulative Returns on AAPL +
Figure 6: TradingAgents: Cumulative Returns (CR) and Detailed Transaction History for AAPL.
+
+ +

To thoroughly evaluate the performance of our TradingAgents framework, we use widely recognized metrics to assess the risk management, profitability, and safety of the TradingAgents strategy in comparison to baseline approaches. Here we describe these metrics:

+ +

Cumulative Return (CR)

+

The cumulative return measures the total return generated over the simulation period. It is calculated as:

+

+ $$\mathrm{CR} = \left(\frac{V_{\text{end}} - V_{\text{start}}}{V_{\text{start}}}\right) \times 100\%$$ +

+

where $V_{\text{end}}$ is the portfolio value at the end of the simulation, and $V_{\text{start}}$ is the initial portfolio value.

+ +

Annualized Return (AR)

+

The annualized return normalizes the cumulative return over the number of years:

+

+ $$\mathrm{AR} = \left(\left(\frac{V_{\text{end}}}{V_{\text{start}}}\right)^{1/N} - 1\right) \times 100\%$$ +

+

where $N$ is the number of years in the simulation.
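The two return metrics translate directly into code. The sketch below follows the formulas above; the function names are chosen for illustration.

```python
def cumulative_return(v_start: float, v_end: float) -> float:
    """Cumulative return in percent over the whole simulation."""
    return (v_end - v_start) / v_start * 100.0


def annualized_return(v_start: float, v_end: float, years: float) -> float:
    """Annualized return in percent; `years` may be fractional."""
    return ((v_end / v_start) ** (1.0 / years) - 1.0) * 100.0
```

For a simulation shorter than one year, `years` is less than 1, so the annualized figure compounds the observed return upward and exceeds the cumulative return whenever that return is positive.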

+ +

Sharpe Ratio (SR)

+

The Sharpe ratio measures risk-adjusted return by comparing a portfolio's excess return over the risk-free rate to its volatility:

+

+ $$\mathrm{SR} = \frac{\bar{R} - R_f}{\sigma}$$ +

+

where $\bar{R}$ is the average portfolio return, $R_f$ is the risk-free rate (e.g., yield of 3-month Treasury bills), and $\sigma$ is the standard deviation of the portfolio returns.
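A minimal sketch of the Sharpe ratio on a series of per-period returns is shown below. It assumes the risk-free rate is expressed in the same per-period units as the returns, uses the sample standard deviation, and applies no annualization factor; the source does not state which of these conventions it follows.

```python
import statistics
from typing import Sequence


def sharpe_ratio(returns: Sequence[float], risk_free_rate: float = 0.0) -> float:
    """Per-period Sharpe ratio: (mean return - risk-free rate) / std deviation."""
    mean_return = statistics.fmean(returns)
    sigma = statistics.stdev(returns)  # sample standard deviation (assumption)
    if sigma == 0:
        raise ValueError("Sharpe ratio is undefined for zero volatility")
    return (mean_return - risk_free_rate) / sigma
```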

+ +

Maximum Drawdown (MDD)

+

Maximum drawdown measures the largest peak-to-trough decline in the portfolio value:

+

+ $$\mathrm{MDD} = \max_{t \in [0, T]} \left(\frac{\text{Peak}_t - \text{Trough}_t}{\text{Peak}_t}\right) \times 100\%$$ +
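Maximum drawdown can be computed in a single pass by tracking the running peak of the portfolio value. The sketch below is one straightforward way to do this; the function name is chosen for illustration.

```python
from typing import Sequence


def max_drawdown(values: Sequence[float]) -> float:
    """Largest peak-to-trough decline in percent over a series of portfolio values."""
    peak = values[0]
    mdd = 0.0
    for v in values:
        peak = max(peak, v)  # highest value seen so far
        mdd = max(mdd, (peak - v) / peak * 100.0)
    return mdd
```

A series that never falls below its running peak yields a drawdown of 0.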

+ +
+ ROUGE Score Comparison +
Figure 7: ROUGE Score Comparison
+
+ +
+
+
+
+
+ +
+
+
+
+

Results and Analysis

+
+

Performance Comparison

+ +

Cumulative and Annual Returns

+

Table 1 and Figures 6, 7, and 8 highlight that our method significantly outperforms existing rule-based trading baselines, particularly in profitability, as measured by returns. TradingAgents achieves at least a 23.21% cumulative return and 24.90% annual return on the three sampled stocks, outperforming the best-performing baselines by a margin of at least 6.1%. Notably, on the AAPL stock—a particularly challenging case due to market volatility during the testing period—traditional methods struggled, as their patterns failed to generalize to this situation. In contrast, TradingAgents excelled even under these adverse conditions, achieving returns exceeding 26% within less than three months.

+ +

Sharpe Ratio

+

The Sharpe Ratio performance highlights TradingAgents' exceptional ability to deliver superior risk-adjusted returns, consistently outperforming all baseline models across AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60—surpassing the next best models by a significant margin of at least 2.07 points. This result underscores TradingAgents' effectiveness in balancing returns against risk, a critical metric for sustainable and predictable investment growth. By excelling over market benchmarks like Buy-and-Hold and advanced strategies such as KDJRSI, SMA, MACD, and ZMR, TradingAgents demonstrates its adaptability and robustness in diverse market conditions. Its ability to maximize returns while maintaining controlled risk exposure establishes a solid foundation for multi-agent and debate-based automated trading algorithms.

+ +

Maximum Drawdown

+

While rule-based baselines demonstrated superior performance in controlling risk, as reflected by their maximum drawdown scores, they fell short in capturing high returns. This trade-off between risk and reward underscores TradingAgents' strength as a balanced approach. Despite higher returns being typically associated with higher risks, TradingAgents maintained a relatively low maximum drawdown compared to many baselines. Its effective risk-control mechanisms, facilitated by the debates among risk-control agents, ensured that the maximum drawdown remained within a manageable limit, not exceeding 2%. This demonstrates TradingAgents' capability to strike a robust balance between maximizing returns and managing risk effectively.

+ +

Explainability

+

A significant drawback of current deep learning methods for trading is their dense and complex architectures, which often render the decisions made by trading agents indecipherable to humans. This challenge, rooted in the broader issue of AI explainability, is particularly critical for trading agents, as they operate in real-world financial markets, often involving substantial sums of money where incorrect decisions can lead to severe consequences and losses.

+ +

In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework Yao et al., 2023. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.

+
+
+
+
+
+ +
+
+
+
+

Conclusion

+
+

In this paper, we introduced TradingAgents, an LLM-agent-powered stock trading framework that simulates a realistic trading firm environment with multiple specialized agents engaging in agentic debates and conversations. Leveraging the capabilities of LLMs to process and analyze diverse data sources, the framework enables informed trading decisions while utilizing multi-agent interactions to enhance performance through comprehensive reasoning and debate before acting. By integrating agents with distinct roles and risk profiles, along with a reflective agent and a dedicated risk management team, TradingAgents significantly improves trading outcomes and risk management compared to baseline models. Additionally, the collaborative nature of these agents ensures adaptability to varying market conditions. Extensive experiments demonstrate that TradingAgents outperforms traditional trading strategies and baselines in cumulative return, Sharpe ratio, and other critical metrics. Future work will focus on deploying the framework in a live trading environment, expanding agent roles, and incorporating real-time data processing to enhance performance further.

+
+
+
+
+
+ + + + + From a70ca6e1a1dfad960f5bc379a6b8d12cda0f4830 Mon Sep 17 00:00:00 2001 From: Yijia-Xiao Date: Sat, 28 Dec 2024 11:47:36 +0800 Subject: [PATCH 02/56] Figures --- README.md | 2 +- index.html | 123 +++++++++++++++++++++++++++++++++++------------------ 2 files changed, 82 insertions(+), 43 deletions(-) diff --git a/README.md b/README.md index 2d944541..d449d623 100644 --- a/README.md +++ b/README.md @@ -2,4 +2,4 @@ > MARW Workshop, AAAI 2025 > -> Homepage: https://TradingAgents.github.io/ +> Homepage: https://TradingAgents-AI.github.io/ diff --git a/index.html b/index.html index e52ee518..08f5daf4 100644 --- a/index.html +++ b/index.html @@ -114,7 +114,7 @@ Lack of Realistic Organizational Modeling: Many frameworks fail to capture the complex interactions between agents that mimic the structure of real-world trading firms Li et al., 2023, Wang et al., 2024, Yu et al., 2024. Instead, they focus narrowly on specific task performance, often disconnected from the organizational workflows and established human operating procedures proven effective in trading. This limits their ability to fully replicate and benefit from real-world trading practices. - Inefficient Communication Interfaces: Most existing systems use natural language as the primary communication medium, typically relying on message histories or an unstructured pool of information for decision-making Park et al., 2023, Qian et al., 2024. This approach often results in a "telephone effect", where details are lost, and states become corrupted as conversations lengthen. Agents struggle to maintain context and track extended histories while filtering out irrelevant information from previous decision steps, diminishing their effectiveness in handling complex, dynamic tasks. 
Additionally, the unstructured pool-of-information approach lacks clear instructions, forcing logical communication and information exchange between agents to depend solely on retrieval, which disrupts the relational integrity of the data. + Inefficient Communication Interfaces: Most existing systems use natural language as the primary communication medium, typically relying on message histories or an unstructured pool of information for decision-making Park et al., 2023, Qian et al., 2024. This approach often results in a "telephone effect", where details are lost, and states become corrupted as conversations lengthen. Agents struggle to maintain context and track extended histories while filtering out irrelevant information from previous decision steps, diminishing their effectiveness in handling complex, dynamic tasks. Additionally, the unstructured pool-of-information approach lacks clear instructions, forcing logical communication and information exchange between agents to depend solely on retrieval, which disrupts the relational integrity of the data.

In this work, we address these key limitations of existing models by introducing a system that overcomes these challenges. First, our framework bridges the gap by simulating the multi-agent decision-making processes typical of professional trading teams. It incorporates specialized agents tailored to distinct aspects of trading, inspired by the organizational structure of real-world trading firms. These agents include fundamental analysts, sentiment/news analysts, technical analysts, and traders with diverse risk profiles. Bullish and bearish debaters evaluate market conditions to provide balanced recommendations, while a risk management team ensures that exposures remain within acceptable limits. Second, to enhance communication, our framework combines structured outputs for control, clarity, and reasoning with natural language dialogue to facilitate effective debate and collaboration among agents. This hybrid approach ensures both precision and flexibility in decision-making.

@@ -316,11 +316,6 @@

In this section, we describe the experimental setup used to evaluate our proposed framework. We also provide detailed descriptions of the evaluation metrics employed to assess performance comprehensively.

-
- Performance Comparison -
Table 1: Performance comparison across all methods using four evaluation metrics. Results highlighted in green represent the best-performing statistic for each model. The improvement row illustrates TradingAgents' performance gains over the top-performing baselines.
-
-

Back Trading

To simulate a realistic trading environment, we utilize a multi-asset and multi-modal financial dataset comprising various stocks such as Apple, Nvidia, Microsoft, Meta, Google, and more. The dataset includes:

@@ -351,44 +346,74 @@

Evaluation Metrics

- Cumulative Returns on AAPL -
Figure 6: TradingAgents: Cumulative Returns (CR) and Detailed Transaction History for AAPL.
+ Cumulative Returns on AAPL +
(a) Cumulative Returns on AAPL
-

To thoroughly evaluate the performance of our TradingAgents framework, we use widely recognized metrics to assess the risk management, profitability, and safety of the TradingAgents strategy in comparison to baseline approaches. Here we describe these metrics:

- -

Cumulative Return (CR)

-

The cumulative return measures the total return generated over the simulation period. It is calculated as:

-

- CR = ((V_end - V_start) / V_start) × 100% -

-

where V_end is the portfolio value at the end of the simulation, and V_start is the initial portfolio value.

- -

Annualized Return (AR)

-

The annualized return normalizes the cumulative return over the number of years:

-

- AR = ((V_end / V_start)^(1/N) - 1) × 100% -

-

where N is the number of years in the simulation.

- -

Sharpe Ratio (SR)

-

The Sharpe ratio measures risk-adjusted return by comparing a portfolio's excess return over the risk-free rate to its volatility:

-

- SR = (R̄ - R_f) / σ -

-

where R̄ is the average portfolio return, R_f is the risk-free rate (e.g., yield of 3-month Treasury bills), and σ is the standard deviation of the portfolio returns.

- -

Maximum Drawdown (MDD)

-

Maximum drawdown measures the largest peak-to-trough decline in the portfolio value:

-

- MDD = max_{t ∈ [0, T]} ((Peak_t - Trough_t) / Peak_t) × 100% -

-
- ROUGE Score Comparison -
Figure 7: ROUGE Score Comparison
+ TradingAgents Transactions for AAPL +
(b) TradingAgents Transactions for AAPL.
Green / Red Arrows for Long / Short Positions.
+
+ Performance Comparison +
Table 1: Performance comparison across all methods using four evaluation metrics. Results highlighted in green represent the best-performing statistic for each model. The improvement row illustrates TradingAgents' performance gains over the top-performing baselines.
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
MetricRNA SequenceModality FusionRNA-GPT
SBERTSPubSGPTSBERTSPubSGPTSBERTSPubSGPT
Precision0.73720.55280.52190.69290.65070.66550.86020.73840.7848
Recall0.74960.52700.54740.80280.60820.66030.84040.72080.7561
F1 Score0.74240.53870.53390.74030.62830.66270.84940.72930.7700
+

Table 1: TradingAgents (AIS): Comparison of RNA Sequence (left), Modality Fusion (middle), and TradingAgents (right). Embedding base models are BERT, PubMedBERT, and OpenAI's GPT text-embedding-3-large.

+ +

Sharpe Ratio

+

The Sharpe Ratio performance highlights TradingAgents' exceptional ability to deliver superior risk-adjusted returns, consistently outperforming all baseline models across AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60—surpassing the next best models by a significant margin of at least 2.07 points. This result underscores TradingAgents' effectiveness in balancing returns against risk, a critical metric for sustainable and predictable investment growth. By excelling over market benchmarks like Buy-and-Hold and advanced strategies such as KDJRSI, SMA, MACD, and ZMR, TradingAgents demonstrates its adaptability and robustness in diverse market conditions. Its ability to maximize returns while maintaining controlled risk exposure establishes a solid foundation for multi-agent and debate-based automated trading algorithms.

+ +

Maximum Drawdown

+

While rule-based baselines demonstrated superior performance in controlling risk, as reflected by their maximum drawdown scores, they fell short in capturing high returns. This trade-off between risk and reward underscores TradingAgents' strength as a balanced approach. Despite higher returns being typically associated with higher risks, TradingAgents maintained a relatively low maximum drawdown compared to many baselines. Its effective risk-control mechanisms, facilitated by the debates among risk-control agents, ensured that the maximum drawdown remained within a manageable limit, not exceeding 2%. This demonstrates TradingAgents' capability to strike a robust balance between maximizing returns and managing risk effectively.

+ +

Explainability

+

A significant drawback of current deep learning methods for trading is their dense and complex architectures, which often render the decisions made by trading agents indecipherable to humans. This challenge, rooted in the broader issue of AI explainability, is particularly critical for trading agents, as they operate in real-world financial markets, often involving substantial sums of money where incorrect decisions can lead to severe consequences and losses.

+ +

In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework Yao et al., 2023. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.

@@ -404,7 +429,7 @@

Performance Comparison

Cumulative and Annual Returns

-

Table 1 and Figures 6, 7, and 8 highlight that our method significantly outperforms existing rule-based trading baselines, particularly in profitability, as measured by returns. TradingAgents achieves at least a 23.21% cumulative return and 24.90% annual return on the three sampled stocks, outperforming the best-performing baselines by a margin of at least 6.1%. Notably, on the AAPL stock—a particularly challenging case due to market volatility during the testing period—traditional methods struggled, as their patterns failed to generalize to this situation. In contrast, TradingAgents excelled even under these adverse conditions, achieving returns exceeding 26% within less than three months.

+

Table 1 and Figures (a) and (b) highlight that our method significantly outperforms existing rule-based trading baselines, particularly in profitability, as measured by returns. TradingAgents achieves at least a 23.21% cumulative return and 24.90% annual return on the three sampled stocks, outperforming the best-performing baselines by a margin of at least 6.1%. Notably, on the AAPL stock—a particularly challenging case due to market volatility during the testing period—traditional methods struggled, as their patterns failed to generalize to this situation. In contrast, TradingAgents excelled even under these adverse conditions, achieving returns exceeding 26% within less than three months.

Sharpe Ratio

The Sharpe Ratio performance highlights TradingAgents' exceptional ability to deliver superior risk-adjusted returns, consistently outperforming all baseline models across AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60—surpassing the next best models by a significant margin of at least 2.07 points. This result underscores TradingAgents' effectiveness in balancing returns against risk, a critical metric for sustainable and predictable investment growth. By excelling over market benchmarks like Buy-and-Hold and advanced strategies such as KDJRSI, SMA, MACD, and ZMR, TradingAgents demonstrates its adaptability and robustness in diverse market conditions. Its ability to maximize returns while maintaining controlled risk exposure establishes a solid foundation for multi-agent and debate-based automated trading algorithms.

@@ -426,9 +451,23 @@
-

Conclusion

+

Results and Analysis

-

In this paper, we introduced TradingAgents, an LLM-agent-powered stock trading framework that simulates a realistic trading firm environment with multiple specialized agents engaging in agentic debates and conversations. Leveraging the capabilities of LLMs to process and analyze diverse data sources, the framework enables informed trading decisions while utilizing multi-agent interactions to enhance performance through comprehensive reasoning and debate before acting. By integrating agents with distinct roles and risk profiles, along with a reflective agent and a dedicated risk management team, TradingAgents significantly improves trading outcomes and risk management compared to baseline models. Additionally, the collaborative nature of these agents ensures adaptability to varying market conditions. Extensive experiments demonstrate that TradingAgents outperforms traditional trading strategies and baselines in cumulative return, Sharpe ratio, and other critical metrics. Future work will focus on deploying the framework in a live trading environment, expanding agent roles, and incorporating real-time data processing to enhance performance further.

+

Performance Comparison

+ +

Cumulative and Annual Returns

+

Table 1 and Figures (a) and (b) highlight that our method significantly outperforms existing rule-based trading baselines, particularly in profitability, as measured by returns. TradingAgents achieves at least a 23.21% cumulative return and 24.90% annual return on the three sampled stocks, outperforming the best-performing baselines by a margin of at least 6.1%. Notably, on the AAPL stock—a particularly challenging case due to market volatility during the testing period—traditional methods struggled, as their patterns failed to generalize to this situation. In contrast, TradingAgents excelled even under these adverse conditions, achieving returns exceeding 26% within less than three months.

+ +

Sharpe Ratio

+

The Sharpe Ratio performance highlights TradingAgents' exceptional ability to deliver superior risk-adjusted returns, consistently outperforming all baseline models across AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60—surpassing the next best models by a significant margin of at least 2.07 points. This result underscores TradingAgents' effectiveness in balancing returns against risk, a critical metric for sustainable and predictable investment growth. By excelling over market benchmarks like Buy-and-Hold and advanced strategies such as KDJRSI, SMA, MACD, and ZMR, TradingAgents demonstrates its adaptability and robustness in diverse market conditions. Its ability to maximize returns while maintaining controlled risk exposure establishes a solid foundation for multi-agent and debate-based automated trading algorithms.

+ +

Maximum Drawdown

+

While rule-based baselines demonstrated superior performance in controlling risk, as reflected by their maximum drawdown scores, they fell short in capturing high returns. This trade-off between risk and reward underscores TradingAgents' strength as a balanced approach. Despite higher returns being typically associated with higher risks, TradingAgents maintained a relatively low maximum drawdown compared to many baselines. Its effective risk-control mechanisms, facilitated by the debates among risk-control agents, ensured that the maximum drawdown remained within a manageable limit, not exceeding 2%. This demonstrates TradingAgents' capability to strike a robust balance between maximizing returns and managing risk effectively.

+ +

Explainability

+

A significant drawback of current deep learning methods for trading is their dense and complex architectures, which often render the decisions made by trading agents indecipherable to humans. This challenge, rooted in the broader issue of AI explainability, is particularly critical for trading agents, as they operate in real-world financial markets, often involving substantial sums of money where incorrect decisions can lead to severe consequences and losses.

+ +

In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework Yao et al., 2023. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.

From db9f63fa54059ec8ae262ef10557c853b6a011a7 Mon Sep 17 00:00:00 2001 From: Yijia-Xiao Date: Sat, 28 Dec 2024 11:56:38 +0800 Subject: [PATCH 03/56] Citations --- index.html | 152 +++++++++++++++++++++++++++++++++-------------------- 1 file changed, 94 insertions(+), 58 deletions(-) diff --git a/index.html b/index.html index 08f5daf4..a13337c4 100644 --- a/index.html +++ b/index.html @@ -106,15 +106,15 @@

Introduction

-

Autonomous agents leveraging Large Language Models (LLMs) present a transformative approach to decision-making by replicating human processes and workflows across various applications. These systems enhance the problem-solving capabilities of language agents by equipping them with tools and enabling collaboration with other agents, effectively breaking down complex problems into manageable components Park et al., 2023, Havrilla et al., 2024, Talebirad et al., 2023, Tang et al., 2024. One prominent application of these autonomous frameworks is in the financial market—a highly complex system influenced by numerous factors, including company fundamentals, market sentiment, technical indicators, and macroeconomic events.

+

Autonomous agents leveraging Large Language Models (LLMs) present a transformative approach to decision-making by replicating human processes and workflows across various applications. These systems enhance the problem-solving capabilities of language agents by equipping them with tools and enabling collaboration with other agents, effectively breaking down complex problems into manageable components. One prominent application of these autonomous frameworks is in the financial market—a highly complex system influenced by numerous factors, including company fundamentals, market sentiment, technical indicators, and macroeconomic events.

-

Traditional algorithmic trading systems often rely on quantitative models that struggle to fully capture the complex interplay of diverse factors. In contrast, LLMs excel at processing and understanding natural language data, making them particularly effective for tasks that require textual comprehension, such as analyzing news articles, financial reports, and social media sentiment. Additionally, deep learning-based trading systems often suffer from low explainability, as they rely on hidden features that drive decision-making but are difficult to interpret. Recent advancements in multi-agent LLM frameworks for finance have shown significant promise in addressing these challenges. These frameworks create explainable AI systems, where decisions are supported by evidence and transparent reasoning Li et al., 2023, Wang et al., 2024, Yu et al., 2024, demonstrating their potential in financial applications.

+

Traditional algorithmic trading systems often rely on quantitative models that struggle to fully capture the complex interplay of diverse factors. In contrast, LLMs excel at processing and understanding natural language data, making them particularly effective for tasks that require textual comprehension, such as analyzing news articles, financial reports, and social media sentiment. Additionally, deep learning-based trading systems often suffer from low explainability, as they rely on hidden features that drive decision-making but are difficult to interpret. Recent advancements in multi-agent LLM frameworks for finance have shown significant promise in addressing these challenges. These frameworks create explainable AI systems, where decisions are supported by evidence and transparent reasoning, demonstrating their potential in financial applications.

Despite their potential, most current applications of language agents in the financial and trading sectors face two significant limitations:

- Lack of Realistic Organizational Modeling: Many frameworks fail to capture the complex interactions between agents that mimic the structure of real-world trading firms Li et al., 2023, Wang et al., 2024, Yu et al., 2024. Instead, they focus narrowly on specific task performance, often disconnected from the organizational workflows and established human operating procedures proven effective in trading. This limits their ability to fully replicate and benefit from real-world trading practices. + Lack of Realistic Organizational Modeling: Many frameworks fail to capture the complex interactions between agents that mimic the structure of real-world trading firms. Instead, they focus narrowly on specific task performance, often disconnected from the organizational workflows and established human operating procedures proven effective in trading. This limits their ability to fully replicate and benefit from real-world trading practices. - Inefficient Communication Interfaces: Most existing systems use natural language as the primary communication medium, typically relying on message histories or an unstructured pool of information for decision-making Park et al., 2023, Qian et al., 2024. This approach often results in a "telephone effect", where details are lost, and states become corrupted as conversations lengthen. Agents struggle to maintain context and track extended histories while filtering out irrelevant information from previous decision steps, diminishing their effectiveness in handling complex, dynamic tasks. Additionally, the unstructured pool-of-information approach lacks clear instructions, forcing logical communication and information exchange between agents to depend solely on retrieval, which disrupts the relational integrity of the data.

+ Inefficient Communication Interfaces: Most existing systems use natural language as the primary communication medium, typically relying on message histories or an unstructured pool of information for decision-making. This approach often results in a "telephone effect", where details are lost, and states become corrupted as conversations lengthen. Agents struggle to maintain context and track extended histories while filtering out irrelevant information from previous decision steps, diminishing their effectiveness in handling complex, dynamic tasks. Additionally, the unstructured pool-of-information approach lacks clear instructions, forcing logical communication and information exchange between agents to depend solely on retrieval, which disrupts the relational integrity of the data.

In this work, we address these key limitations of existing models by introducing a system that overcomes these challenges. First, our framework bridges the gap by simulating the multi-agent decision-making processes typical of professional trading teams. It incorporates specialized agents tailored to distinct aspects of trading, inspired by the organizational structure of real-world trading firms. These agents include fundamental analysts, sentiment/news analysts, technical analysts, and traders with diverse risk profiles. Bullish and bearish debaters evaluate market conditions to provide balanced recommendations, while a risk management team ensures that exposures remain within acceptable limits. Second, to enhance communication, our framework combines structured outputs for control, clarity, and reasoning with natural language dialogue to facilitate effective debate and collaboration among agents. This hybrid approach ensures both precision and flexibility in decision-making.

@@ -136,10 +136,10 @@

Large Language Models (LLMs) are applied in finance by fine-tuning on financial data or training on financial corpora. This improves the model’s understanding of financial terminology and data, enabling a specialized assistant for analytical support, insights, and information retrieval, rather than trade execution.

Fine-Tuned LLMs for Finance -

Fine-tuning enhances domain-specific performance. Examples include PIXIU (FinMA) Xie et al., 2023, which fine-tuned LLaMA on 136K finance-related instructions; FinGPT Yang et al., 2023, which used LoRA to fine-tune models like LLaMA and ChatGLM with about 50K finance-specific samples; and Instruct-FinGPT Zhang et al., 2023, fine-tuned on 10K instruction samples from financial sentiment analysis datasets. These models outperform their base versions and other open-source LLMs like BLOOM and OPT Zhang et al., 2022 in finance classification tasks, even surpassing BloombergGPT Wu et al., 2023 in several evaluations. However, in generative tasks, they perform similarly or slightly worse than powerful general-purpose models like GPT-4, indicating a need for more high-quality, domain-specific datasets.

+

Fine-tuning enhances domain-specific performance. Examples include PIXIU (FinMA), which fine-tuned LLaMA on 136K finance-related instructions; FinGPT, which used LoRA to fine-tune models like LLaMA and ChatGLM with about 50K finance-specific samples; and Instruct-FinGPT, fine-tuned on 10K instruction samples from financial sentiment analysis datasets. These models outperform their base versions and other open-source LLMs like BLOOM and OPT in finance classification tasks, even surpassing BloombergGPT in several evaluations. However, in generative tasks, they perform similarly or slightly worse than powerful general-purpose models like GPT-4, indicating a need for more high-quality, domain-specific datasets.

Finance LLMs Trained from Scratch -

Training LLMs from scratch on finance-specific corpora aims for better domain adaptation. Models like BloombergGPT Wu et al., 2023, XuanYuan 2.0 Zhang et al., 2023, and Fin-T5 Lu et al., 2023 combine public datasets with finance-specific data during pretraining. BloombergGPT, for instance, was trained on both general and financial text, with proprietary Bloomberg data enhancing its performance on finance benchmarks. These models outperform general-purpose counterparts like BLOOM-176B and T5 in tasks such as market sentiment classification and summarization. While they may not match larger closed-source models like GPT-3 or PaLM Chowdhery et al., 2022, they offer competitive performance among similar-sized open-source models without compromising general language understanding.

+

Training LLMs from scratch on finance-specific corpora aims for better domain adaptation. Models like BloombergGPT, XuanYuan 2.0, and Fin-T5 combine public datasets with finance-specific data during pretraining. BloombergGPT, for instance, was trained on both general and financial text, with proprietary Bloomberg data enhancing its performance on finance benchmarks. These models outperform general-purpose counterparts like BLOOM-176B and T5 in tasks such as market sentiment classification and summarization. While they may not match larger closed-source models like GPT-3 or PaLM, they offer competitive performance among similar-sized open-source models without compromising general language understanding.

In summary, finance-specific LLMs developed through fine-tuning or training from scratch show significant improvements in domain-specific tasks, underscoring the importance of domain adaptation and the potential for further enhancements with high-quality finance-specific datasets.

@@ -154,20 +154,20 @@

LLMs act as trader agents making direct trading decisions by analyzing external data like news, financial reports, and stock prices. Proposed architectures include news-driven, reasoning-driven, and reinforcement learning (RL)-driven agents.

News-Driven Agents -

News-driven architectures integrate stock news and macroeconomic updates into LLM prompts to predict stock price movements. Studies evaluating both closed-source models (e.g., GPT-3.5, GPT-4) and open-source LLMs (e.g., Qwen Bai et al., 2023, Baichuan Yang et al., 2023) in financial sentiment analysis have shown the effectiveness of simple long-short strategies based on sentiment scores Lopezlira et al., 2023. Further research on fine-tuned LLMs like FinGPT and OPT demonstrates improved performance through domain-specific alignment Unveiling et al., Sentitrade et al.. Advanced methods involve summarizing news data and reasoning about their relationship with stock prices Beatunveiling et al., Wang et al., 2024.

+

News-driven architectures integrate stock news and macroeconomic updates into LLM prompts to predict stock price movements. Studies evaluating both closed-source models (e.g., GPT-3.5, GPT-4) and open-source LLMs (e.g., Qwen, Baichuan) in financial sentiment analysis have shown the effectiveness of simple long-short strategies based on sentiment scores. Further research on fine-tuned LLMs like FinGPT and OPT demonstrates improved performance through domain-specific alignment. Advanced methods involve summarizing news data and reasoning about their relationship with stock prices.
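As a rough illustration of the "simple long-short strategy based on sentiment scores" mentioned above, the sketch below ranks tickers by a sentiment score and goes long the highest-scored names and short the lowest. The function name, the `k` parameter, and the example scores are assumptions made for illustration; the cited studies may construct their portfolios differently.

```python
from typing import Dict, List, Tuple


def long_short_from_sentiment(
    scores: Dict[str, float], k: int = 1
) -> Tuple[List[str], List[str]]:
    """Return (longs, shorts): the k highest- and k lowest-scored tickers.

    `scores` maps a ticker to a sentiment score (hypothetical scale,
    higher means more positive news sentiment).
    """
    # Rank tickers from most positive to most negative sentiment.
    ranked = sorted(scores, key=lambda ticker: scores[ticker], reverse=True)
    # Cap k so the long and short legs never overlap.
    k = max(0, min(k, len(ranked) // 2))
    longs = ranked[:k]
    shorts = ranked[len(ranked) - k:]
    return longs, shorts


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    example = {"AAPL": 0.8, "GOOGL": -0.2, "AMZN": 0.1}
    print(long_short_from_sentiment(example, k=1))
```

Capping `k` at half the universe is a design choice of this sketch so that no ticker appears on both sides of the book.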

Reasoning-Driven Agents -

Reasoning-driven agents enhance trading decisions through mechanisms like reflection and debate. Reflection-driven agents, such as FinMem FinMem et al. and FinAgent MultimodalFinMem et al., use layered memorization and multimodal data to summarize inputs into memories, inform decisions, and incorporate technical indicators, achieving superior backtest performance while mitigating hallucinations Ji et al., 2023. Debate-driven agents, like those in heterogeneous frameworks Xing et al., 2024 and TradingGPT Li et al., 2023, enhance reasoning and factual validity by employing LLM debates among agents with different roles, improving sentiment classification and increasing robustness in trading decisions.

+

Reasoning-driven agents enhance trading decisions through mechanisms like reflection and debate. Reflection-driven agents, such as FinMem and FinAgent, use layered memorization and multimodal data to summarize inputs into memories, inform decisions, and incorporate technical indicators, achieving superior backtest performance while mitigating hallucinations. Debate-driven agents, like those in heterogeneous frameworks and TradingGPT, enhance reasoning and factual validity by employing LLM debates among agents with different roles, improving sentiment classification and increasing robustness in trading decisions.

Reinforcement Learning-Driven Agents -

Reinforcement learning methods align LLM outputs with expected behaviors, using backtesting as rewards. SEP Koa, 2024 employs RL with memorization and reflection to refine LLM predictions based on market history. Classical RL methods are also used in trading frameworks that integrate LLM-generated embeddings with stock features, trained via algorithms like Proximal Policy Optimization (PPO) Ding et al., 2023, PPO, Year.

+

Reinforcement learning methods align LLM outputs with expected behaviors, using backtesting as rewards. SEP employs RL with memorization and reflection to refine LLM predictions based on market history. Classical RL methods are also used in trading frameworks that integrate LLM-generated embeddings with stock features, trained via algorithms like Proximal Policy Optimization (PPO).

LLMs as Alpha Miners

-

LLMs are also used to generate alpha factors instead of making direct trading decisions. QuantAgent Wang et al., 2023 demonstrates this by leveraging LLMs to produce alpha factors through an inner-loop and outer-loop architecture. In the inner loop, a writer agent generates a script from a trader's idea, while a judge agent provides feedback. In the outer loop, the code is tested in the real market, and trading results enhance the judge agent. This approach enables progressive approximation of optimal behavior.

+

LLMs are also used to generate alpha factors instead of making direct trading decisions. QuantAgent demonstrates this by leveraging LLMs to produce alpha factors through an inner-loop and outer-loop architecture. In the inner loop, a writer agent generates a script from a trader's idea, while a judge agent provides feedback. In the outer loop, the code is tested in the real market, and trading results enhance the judge agent. This approach enables progressive approximation of optimal behavior.

-

Subsequent research, such as AlphaGPT Wang et al., 2023, proposes a human-in-the-loop framework for alpha mining with a similar architecture. Both studies showcase the effectiveness of LLM-powered alpha mining systems, highlighting their potential in automating and accelerating the development of trading strategies by generating and refining alpha factors.

+

Subsequent research, such as AlphaGPT, proposes a human-in-the-loop framework for alpha mining with a similar architecture. Both studies showcase the effectiveness of LLM-powered alpha mining systems, highlighting their potential in automating and accelerating the development of trading strategies by generating and refining alpha factors.

@@ -263,7 +263,7 @@

By offering oversight and guidance, the Risk Management Team helps maintain the firm's financial stability and protect against adverse market events. They play a crucial role in safeguarding assets and ensuring sustainable long-term performance.

-

All agents in TradingAgents follow the ReAct prompting framework Yao et al., 2023, which synergizes reasoning and acting. The environment state is shared and monitored by the agents, enabling them to take context-appropriate actions such as conducting research, executing trades, engaging in debates, or managing risks. This design ensures a collaborative, dynamic decision-making process reflective of real-world trading systems.

+

All agents in TradingAgents follow the ReAct prompting framework, which synergizes reasoning and acting. The environment state is shared and monitored by the agents, enabling them to take context-appropriate actions such as conducting research, executing trades, engaging in debates, or managing risks. This design ensures a collaborative, dynamic decision-making process reflective of real-world trading systems.

@@ -278,7 +278,7 @@

TradingAgents: Agent Workflow

Communication Protocol

-

Most existing LLM-based agent frameworks use natural language as the primary communication interface, typically through structured message histories or collections of agent-generated messages Fatouros et al., 2024, Li et al., 2023, Yang et al., 2024, Yang et al., 2023. However, relying solely on natural language often proves insufficient for solving complex, long-term tasks that require extensive planning horizons. In such cases, pure natural language communication can resemble a game of telephone—over multiple iterations, initial information may be forgotten or distorted due to context length limitations and an overload of text that obscures critical earlier details Hong et al., 2024. To address this limitation, we draw inspiration from frameworks like MetaGPT, which adopt a structured approach to communication. Our model introduces a structured communication protocol to govern agent interactions. By clearly defining each agent's state, we ensure that each role only extracts or queries the necessary information, processes it, and returns a completed report. This streamlined approach reduces unnecessary steps, lowers the risk of message corruption, and keeps interactions focused and efficient, even in complex, long-horizon tasks.

+

Most existing LLM-based agent frameworks use natural language as the primary communication interface, typically through structured message histories or collections of agent-generated messages. However, relying solely on natural language often proves insufficient for solving complex, long-term tasks that require extensive planning horizons. In such cases, pure natural language communication can resemble a game of telephone—over multiple iterations, initial information may be forgotten or distorted due to context length limitations and an overload of text that obscures critical earlier details. To address this limitation, we draw inspiration from frameworks like MetaGPT, which adopt a structured approach to communication. Our model introduces a structured communication protocol to govern agent interactions. By clearly defining each agent's state, we ensure that each role only extracts or queries the necessary information, processes it, and returns a completed report. This streamlined approach reduces unnecessary steps, lowers the risk of message corruption, and keeps interactions focused and efficient, even in complex, long-horizon tasks.
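One way to picture the structured protocol described above is a shared state object with a dedicated slot per report, so that each role reads only what it needs instead of scanning a long message history. The following is a minimal sketch under that assumption; `AgentState`, its field names, and the helper functions are hypothetical and are not taken from the TradingAgents codebase.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AgentState:
    """Hypothetical shared state: one structured slot per artifact."""

    ticker: str
    analyst_reports: Dict[str, str] = field(default_factory=dict)
    debate_history: List[str] = field(default_factory=list)
    trader_decision: Optional[str] = None


def write_report(state: AgentState, role: str, report: str) -> None:
    """Store a completed report under the role that produced it."""
    state.analyst_reports[role] = report


def read_reports(state: AgentState, roles: List[str]) -> Dict[str, str]:
    """Return only the requested reports that exist in the state."""
    return {r: state.analyst_reports[r] for r in roles if r in state.analyst_reports}


if __name__ == "__main__":
    state = AgentState(ticker="AAPL")
    write_report(state, "fundamentals", "Placeholder fundamentals report.")
    write_report(state, "sentiment", "Placeholder sentiment report.")
    # A downstream role queries just the slots it needs.
    print(read_reports(state, ["sentiment", "news"]))
```

Because every report lives in a named slot, a later agent's prompt can be built from a bounded set of fields rather than from an ever-growing transcript.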

Types of Agent Interactions

In contrast to previous multi-agent trading frameworks, which rely heavily on natural language dialogue, TradingAgents agents communicate primarily through structured documents and diagrams. These documents encapsulate the agents' insights in concise, well-organized reports that preserve essential content while avoiding irrelevant information. By utilizing structured reports, agents can query necessary details directly from the global state, eliminating the need for lengthy conversations that risk diluting information, extending the message state indefinitely, and causing data loss. The types of documents and the information they contain are detailed below:

@@ -288,7 +288,7 @@
  • Traders: Traders review and analyze the reports from the analysts, carefully deliberating to produce clear decision signals. They accompany these decisions with detailed reports explaining their rationale and supporting evidence, which are later utilized by the risk management team.
  • -

    Agents engage in natural language dialogue exclusively during agent-to-agent conversations and debates. These concise, focused discussions have been shown to promote deeper reasoning and integrate diverse perspectives, enabling more balanced decisions in complex, long-horizon scenarios—a method particularly relevant to the intricate environment of trading Du et al., 2023. This approach seamlessly integrates with our structured framework, as the conversation state is recorded as a structured entry within the overall agent state. The types of communication in these scenarios are detailed below:

    +

    Agents engage in natural language dialogue exclusively during agent-to-agent conversations and debates. These concise, focused discussions have been shown to promote deeper reasoning and integrate diverse perspectives, enabling more balanced decisions in complex, long-horizon scenarios—a method particularly relevant to the intricate environment of trading. This approach seamlessly integrates with our structured framework, as the conversation state is recorded as a structured entry within the overall agent state. The types of communication in these scenarios are detailed below:

    • Researcher Team: Each researcher agent queries the global agent state for analyst reports and carefully forms their opinion. Two researchers represent opposing perspectives: one bullish and one bearish. They engage in natural language dialogue for $n$ rounds, as determined by the debate facilitator agent. At the conclusion, the facilitator reviews the debate history, selects the prevailing perspective, and records it as a structured entry in the communication protocol.
    • @@ -297,7 +297,7 @@
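The researcher debate described above (alternating bullish and bearish turns for $n$ rounds, followed by a facilitator verdict stored as a structured entry) could be sketched as follows. The callables stand in for LLM-backed agents, and all names here are assumptions for illustration; in the framework the facilitator determines $n$, whereas this sketch simply takes it as a parameter.

```python
from typing import Callable, Dict, List, Tuple

Turn = Tuple[str, str]  # (speaker, message)
Speaker = Callable[[List[Turn]], str]


def run_debate(
    bull: Speaker, bear: Speaker, facilitator: Speaker, n_rounds: int
) -> Dict[str, object]:
    """Alternate bull and bear turns for n_rounds, then record a verdict."""
    history: List[Turn] = []
    for _ in range(n_rounds):
        # Each speaker sees the full debate history so far.
        history.append(("bull", bull(history)))
        history.append(("bear", bear(history)))
    # The facilitator reviews the history and picks the prevailing view.
    return {
        "rounds": n_rounds,
        "history": history,
        "prevailing": facilitator(history),
    }


if __name__ == "__main__":
    # Stub agents; a real system would call an LLM here.
    entry = run_debate(
        bull=lambda h: f"bull argument {len(h)}",
        bear=lambda h: f"bear argument {len(h)}",
        facilitator=lambda h: "bull",
        n_rounds=2,
    )
    print(entry["prevailing"], len(entry["history"]))
```

Returning a dictionary rather than free text mirrors the idea that the debate outcome is written back as a structured entry.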

    Backbone LLMs

    -

    To meet the diverse complexity and speed demands of tasks in our framework, we strategically select Large Language Models (LLMs) based on their strengths. Quick-thinking models, such as gpt-4o-mini and gpt-4o, efficiently handle fast, low-depth tasks like summarization, data retrieval, and converting tabular data to text OpenAI, 2024. In contrast, deep-thinking models like o1-preview excel in reasoning-intensive tasks such as decision-making, evidence-based report writing, and data analysis. These models leverage their architectures for multi-round reasoning, producing logically sound, in-depth insights Zhong et al., 2024, Wang et al., 2024, OpenAI, 2024. Additionally, we prioritize models with proven reliability and scalability to ensure optimal performance across various market conditions. We also employ auxiliary expert models for specialized tasks like sentiment analysis.

    +

    To meet the diverse complexity and speed demands of tasks in our framework, we strategically select Large Language Models (LLMs) based on their strengths. Quick-thinking models, such as gpt-4o-mini and gpt-4o, efficiently handle fast, low-depth tasks like summarization, data retrieval, and converting tabular data to text. In contrast, deep-thinking models like o1-preview excel in reasoning-intensive tasks such as decision-making, evidence-based report writing, and data analysis. These models leverage their architectures for multi-round reasoning, producing logically sound, in-depth insights. Additionally, we prioritize models with proven reliability and scalability to ensure optimal performance across various market conditions. We also employ auxiliary expert models for specialized tasks like sentiment analysis.

    Specifically, all analyst nodes rely on deep-thinking models to ensure robust analysis, while quick-thinking models handle data retrieval from APIs and tools for efficiency. Researchers and traders use deep-thinking models to generate valuable insights and support well-informed decisions. By aligning the choice of LLMs with the specific requirements of each task, our framework achieves a balance between efficiency and depth of reasoning, which is crucial for effective trading strategies.
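The task-to-model assignment described above can be expressed as a small lookup table. This is a sketch only: the model names come from the text, but the task labels and the `select_model` helper are hypothetical and the actual configuration may differ.

```python
# Model names as mentioned in the text; the grouping below is illustrative.
QUICK_MODEL = "gpt-4o-mini"
DEEP_MODEL = "o1-preview"

# Hypothetical task labels mapped to the model tier that handles them.
TASK_TO_MODEL = {
    "data_retrieval": QUICK_MODEL,
    "summarization": QUICK_MODEL,
    "analyst_report": DEEP_MODEL,
    "research_debate": DEEP_MODEL,
    "trading_decision": DEEP_MODEL,
}


def select_model(task: str) -> str:
    """Return the model assigned to a task, failing loudly on unknown tasks."""
    try:
        return TASK_TO_MODEL[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None


if __name__ == "__main__":
    print(select_model("data_retrieval"))
    print(select_model("trading_decision"))
```

Keeping the mapping in one place makes it easy to swap a tier's model without touching the agents that use it.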

    @@ -363,42 +363,92 @@
    - Metric | RNA Sequence (SBERT / SPub / SGPT) | Modality Fusion (SBERT / SPub / SGPT) | RNA-GPT (SBERT / SPub / SGPT)
    - Precision | 0.7372 / 0.5528 / 0.5219 | 0.6929 / 0.6507 / 0.6655 | 0.8602 / 0.7384 / 0.7848
    - Recall | 0.7496 / 0.5270 / 0.5474 | 0.8028 / 0.6082 / 0.6603 | 0.8404 / 0.7208 / 0.7561
    - F1 Score | 0.7424 / 0.5387 / 0.5339 | 0.7403 / 0.6283 / 0.6627 | 0.8494 / 0.7293 / 0.7700
    + Categories | Models | AAPL (CR%↑ / ARR%↑ / SR↑ / MDD%↓) | GOOGL (CR%↑ / ARR%↑ / SR↑ / MDD%↓) | AMZN (CR%↑ / ARR%↑ / SR↑ / MDD%↓)
    + Market | B&H | -5.23 / -5.09 / -1.29 / 11.90 | 7.78 / 8.09 / 1.35 / 13.04 | 17.1 / 17.6 / 3.53 / 3.80
    + Rule-based | MACD | -1.49 / -1.48 / -0.81 / 4.53 | 6.20 / 6.26 / 2.31 / 1.22 | - / - / - / -
    + Rule-based | KDJ&RSI | 2.05 / 2.07 / 1.64 / 1.09 | 0.4 / 0.4 / 0.02 / 1.58 | -0.77 / -0.76 / -2.25 / 1.08
    + Rule-based | ZMR | 0.57 / 0.57 / 0.17 / 0.86 | -0.58 / 0.58 / 2.12 / 2.34 | -0.77 / -0.77 / -2.45 / 0.82
    + Rule-based | SMA | -3.2 / -2.97 / -1.72 / 3.67 | 6.23 / 6.43 / 2.12 / 2.34 | 11.01 / 11.6 / 2.22 / 3.97
    + Ours | TradingAgents | 26.62 / 30.5 / 8.21 / 0.91 | 24.36 / 27.58 / 6.39 / 1.69 | 23.21 / 24.90 / 5.60 / 2.11
    + Improvement(%) | | 24.57 / 28.43 / 6.57 / - | 16.58 / 19.49 / 4.26 / - | 6.10 / 7.30 / 2.07 / -
    @@ -413,7 +463,7 @@

    Explainability

    A significant drawback of current deep learning methods for trading is their dense and complex architectures, which often render the decisions made by trading agents indecipherable to humans. This challenge, rooted in the broader issue of AI explainability, is particularly critical for trading agents, as they operate in real-world financial markets, often involving substantial sums of money where incorrect decisions can lead to severe consequences and losses.

    -

    In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework Yao et al., 2023. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.

    +

    In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.

    @@ -440,7 +490,7 @@

    Explainability

    A significant drawback of current deep learning methods for trading is their dense and complex architectures, which often render the decisions made by trading agents indecipherable to humans. This challenge, rooted in the broader issue of AI explainability, is particularly critical for trading agents, as they operate in real-world financial markets, often involving substantial sums of money where incorrect decisions can lead to severe consequences and losses.

    -

    In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework Yao et al., 2023. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.

    +

    In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.

    @@ -451,23 +501,9 @@
    -

    Results and Analysis

    +

    Conclusion

    -

    Performance Comparison

    - -

    Cumulative and Annual Returns

    -

    Table 1 and Figures (a) and (b) highlight that our method significantly outperforms existing rule-based trading baselines, particularly in profitability, as measured by returns. TradingAgents achieves at least a 23.21% cumulative return and 24.90% annual return on the three sampled stocks, outperforming the best-performing baselines by a margin of at least 6.1%. Notably, on the AAPL stock—a particularly challenging case due to market volatility during the testing period—traditional methods struggled, as their patterns failed to generalize to this situation. In contrast, TradingAgents excelled even under these adverse conditions, achieving returns exceeding 26% within less than three months.

    - -

    Sharpe Ratio

    -

    The Sharpe Ratio performance highlights TradingAgents' exceptional ability to deliver superior risk-adjusted returns, consistently outperforming all baseline models across AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60—surpassing the next best models by a significant margin of at least 2.07 points. This result underscores TradingAgents' effectiveness in balancing returns against risk, a critical metric for sustainable and predictable investment growth. By excelling over market benchmarks like Buy-and-Hold and advanced strategies such as KDJRSI, SMA, MACD, and ZMR, TradingAgents demonstrates its adaptability and robustness in diverse market conditions. Its ability to maximize returns while maintaining controlled risk exposure establishes a solid foundation for multi-agent and debate-based automated trading algorithms.

    - -

    Maximum Drawdown

    -

    While rule-based baselines demonstrated superior performance in controlling risk, as reflected by their maximum drawdown scores, they fell short in capturing high returns. This trade-off between risk and reward underscores TradingAgents' strength as a balanced approach. Despite higher returns being typically associated with higher risks, TradingAgents maintained a relatively low maximum drawdown compared to many baselines. Its effective risk-control mechanisms, facilitated by the debates among risk-control agents, ensured that the maximum drawdown remained within a manageable limit, not exceeding 2%. This demonstrates TradingAgents' capability to strike a robust balance between maximizing returns and managing risk effectively.

    - -

    Explainability

    -

    A significant drawback of current deep learning methods for trading is their dense and complex architectures, which often render the decisions made by trading agents indecipherable to humans. This challenge, rooted in the broader issue of AI explainability, is particularly critical for trading agents, as they operate in real-world financial markets, often involving substantial sums of money where incorrect decisions can lead to severe consequences and losses.

    - -

    In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework Yao et al., 2023. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.

    +

    In this paper, we introduced TradingAgents, an LLM-agent-powered stock trading framework that simulates a realistic trading firm environment with multiple specialized agents engaging in agentic debates and conversations. Leveraging the capabilities of LLMs to process and analyze diverse data sources, the framework enables informed trading decisions while utilizing multi-agent interactions to enhance performance through comprehensive reasoning and debate before acting. By integrating agents with distinct roles and risk profiles, along with a reflective agent and a dedicated risk management team, TradingAgents significantly improves trading outcomes and risk management compared to baseline models. Additionally, the collaborative nature of these agents ensures adaptability to varying market conditions. Extensive experiments demonstrate that TradingAgents outperforms traditional trading strategies and baselines in cumulative return, Sharpe ratio, and other critical metrics. Future work will focus on deploying the framework in a live trading environment, expanding agent roles, and incorporating real-time data processing to enhance performance further.

    From 8b7b5b9c78f60f846e79740fb8a0491ffe445450 Mon Sep 17 00:00:00 2001 From: Yijia-Xiao Date: Sat, 28 Dec 2024 12:28:02 +0800 Subject: [PATCH 04/56] Layout --- index.html | 200 ++++++++++++++++++++++++----------------------------- 1 file changed, 89 insertions(+), 111 deletions(-) diff --git a/index.html b/index.html index a13337c4..b9fa7ebd 100644 --- a/index.html +++ b/index.html @@ -188,12 +188,34 @@

    Analyst Team

    -

    The Analyst Team (Figure 2) is composed of specialized agents responsible for gathering and analyzing various types of market data to inform trading decisions. Each agent focuses on a specific aspect of market analysis, bringing together a comprehensive view of the market's conditions.

    +

    The Analyst Team is composed of specialized agents responsible for gathering and analyzing various types of market data to inform trading decisions. Each agent focuses on a specific aspect of market analysis, bringing together a comprehensive view of the market's conditions.

    -
    - TradingAgents Analyst Team -
    Figure 2: TradingAgents Analyst Team
    -
    +
    +
    +
    + TradingAgents Analyst Team +
    Figure 2: TradingAgents Analyst Team
    +
    +
    +
    +
    + TradingAgents Researcher Team +
    Figure 3: TradingAgents Researcher Team: Bullish Perspectives and Bearish Perspectives
    +
    +
    +
    +
    + TradingAgents Trader Decision-Making Process +
    Figure 4: TradingAgents Trader Decision-Making Process
    +
    +
    +
    +
    + TradingAgents Risk Management Team and Fund Manager Approval Workflow +
    Figure 5: TradingAgents Risk Management Team and Fund Manager Approval Workflow
    +
    +
    +
    • Fundamental Analyst Agents: These agents evaluate company fundamentals by analyzing financial statements, earnings reports, insider transactions, and other pertinent data. They assess a company's intrinsic value to identify undervalued or overvalued stocks, providing insights into long-term investment potential.
    • @@ -207,12 +229,7 @@

      Researcher Team

      -

      The Researcher Team (Figure 3) is responsible for critically evaluating the information provided by the Analyst Team. Comprised of agents adopting both bullish and bearish perspectives, they engage in multiple rounds of debate to assess the potential risks and benefits of investment decisions.

      - -
      - TradingAgents Researcher Team -
      Figure 3: TradingAgents Researcher Team: Bullish Perspectives and Bearish Perspectives
      -
      +

      The Researcher Team is responsible for critically evaluating the information provided by the Analyst Team. Composed of agents adopting both bullish and bearish perspectives, the team engages in multiple rounds of debate to assess the potential risks and benefits of investment decisions.

      • Bullish Researchers: These agents advocate for investment opportunities by highlighting positive indicators, growth potential, and favorable market conditions. They construct arguments supporting the initiation or continuation of positions in certain assets.
      • @@ -224,14 +241,7 @@

        Trader Agents

        -

        Trader Agents (Figure 4) are responsible for executing trading decisions based on the comprehensive analysis provided by the Analyst Team and the nuanced perspectives from the Researcher Team. They assess the synthesized information, considering both quantitative data and qualitative insights, to determine optimal trading actions.

        - -
        - TradingAgents Trader Decision-Making Process -
        Figure 4: TradingAgents Trader Decision-Making Process
        -
        - -

        The tasks of TradingAgents Trader include:

        +

        Trader Agents are responsible for executing trading decisions based on the comprehensive analysis provided by the Analyst Team and the nuanced perspectives from the Researcher Team. They assess the synthesized information, considering both quantitative data and qualitative insights, to determine optimal trading actions.

        • Evaluating recommendations and insights from analysts and researchers.
        • @@ -245,14 +255,7 @@

          Risk Management Team

          -

          The Risk Management Team (Figure 5) monitors and controls the firm's exposure to various market risks. These agents continuously evaluate the portfolio's risk profile, ensuring that trading activities remain within predefined risk parameters and comply with regulatory requirements.

          - -
          - TradingAgents Risk Management Team and Fund Manager Approval Workflow -
          Figure 5: TradingAgents Risk Management Team and Fund Manager Approval Workflow
          -
          - -

          The responsibilities of Risk Management Team include:

          +

          The Risk Management Team monitors and controls the firm's exposure to various market risks. These agents continuously evaluate the portfolio's risk profile, ensuring that trading activities remain within predefined risk parameters and comply with regulatory requirements.

          • Assessing factors such as market volatility, liquidity, and counterparty risks.
          • @@ -363,107 +366,82 @@
            - Categories | Models | AAPL (CR%↑ / ARR%↑ / SR↑ / MDD%↓) | GOOGL (CR%↑ / ARR%↑ / SR↑ / MDD%↓) | AMZN (CR%↑ / ARR%↑ / SR↑ / MDD%↓)
            - Market | B&H | -5.23 / -5.09 / -1.29 / 11.90 | 7.78 / 8.09 / 1.35 / 13.04 | 17.1 / 17.6 / 3.53 / 3.80
            - Rule-based | MACD | -1.49 / -1.48 / -0.81 / 4.53 | 6.20 / 6.26 / 2.31 / 1.22 | - / - / - / -
            - Rule-based | KDJ&RSI | 2.05 / 2.07 / 1.64 / 1.09 | 0.4 / 0.4 / 0.02 / 1.58 | -0.77 / -0.76 / -2.25 / 1.08
            - Rule-based | ZMR | 0.57 / 0.57 / 0.17 / 0.86 | -0.58 / 0.58 / 2.12 / 2.34 | -0.77 / -0.77 / -2.45 / 0.82
            - Rule-based | SMA | -3.2 / -2.97 / -1.72 / 3.67 | 6.23 / 6.43 / 2.12 / 2.34 | 11.01 / 11.6 / 2.22 / 3.97
            - Ours | TradingAgents | 26.62 / 30.5 / 8.21 / 0.91 | 24.36 / 27.58 / 6.39 / 1.69 | 23.21 / 24.90 / 5.60 / 2.11
            - Improvement(%) | | 24.57 / 28.43 / 6.57 / - | 16.58 / 19.49 / 4.26 / - | 6.10 / 7.30 / 2.07 / -
            + Metric | Buy and Hold | MACD | KDJ & RSI | ZMR | SMA | TradingAgents
            + Cumulative Return (CR%) | -5.23 | -1.49 | 2.05 | 0.57 | -3.2 | 26.62
            + Annualized Return (AR%) | -5.09 | -1.48 | 2.07 | 0.57 | -2.97 | 30.50
            + Sharpe Ratio (SR) | -1.29 | -0.81 | 1.64 | 0.17 | -1.72 | 8.21
            + Maximum Drawdown (MDD%) | 11.90 | 4.53 | 1.09 | 0.86 | 3.67 | 0.91
            -

            Table 1: TradingAgents (AIS): Comparison of RNA Sequence (left), Modality Fusion (middle), and TradingAgents (right). Embedding base models are BERT, PubMedBERT, and OpenAI's GPT text-embedding-3-large.

            +

            Table 1: Performance comparison of TradingAgents against baseline models using four evaluation metrics.

            -

            Sharpe Ratio

            -

            The Sharpe Ratio performance highlights TradingAgents' exceptional ability to deliver superior risk-adjusted returns, consistently outperforming all baseline models across AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60—surpassing the next best models by a significant margin of at least 2.07 points. This result underscores TradingAgents' effectiveness in balancing returns against risk, a critical metric for sustainable and predictable investment growth. By excelling over market benchmarks like Buy-and-Hold and advanced strategies such as KDJRSI, SMA, MACD, and ZMR, TradingAgents demonstrates its adaptability and robustness in diverse market conditions. Its ability to maximize returns while maintaining controlled risk exposure establishes a solid foundation for multi-agent and debate-based automated trading algorithms.

            +

            Cumulative Return (CR)

            +

            The cumulative return measures the total return generated over the simulation period. It is calculated as:

            +

            + \( \mathrm{CR} = \frac{V_{\mathrm{end}} - V_{\mathrm{start}}}{V_{\mathrm{start}}} \times 100\% \) +

            +

            where \(V_{\mathrm{end}}\) is the portfolio value at the end of the simulation, and \(V_{\mathrm{start}}\) is the initial portfolio value.

            -

            Maximum Drawdown

            -

            While rule-based baselines controlled risk well, as reflected by their maximum drawdown scores, they fell short in capturing high returns. This trade-off between risk and reward underscores TradingAgents' strength as a balanced approach. Although higher returns are typically associated with higher risk, TradingAgents maintained a relatively low maximum drawdown compared to many baselines. Its risk-control mechanisms, facilitated by the debates among risk-control agents, kept the maximum drawdown at roughly 2% or below (at most 2.11%, on AMZN, in Table 1). This demonstrates TradingAgents' capability to balance maximizing returns against managing risk.

            +

            Annualized Return (AR)

            +

            The annualized return normalizes the cumulative return over the number of years:

            +

            + \( \mathrm{AR} = \left( \left( \frac{V_{\mathrm{end}}}{V_{\mathrm{start}}} \right)^{1/N} - 1 \right) \times 100\% \) +

            +

            where \(N\) is the number of years in the simulation.
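The two return formulas above can be sketched in Python as follows. This is a minimal illustration, not code from the TradingAgents project; the function and parameter names are assumptions chosen to mirror the symbols in the definitions.

```python
def cumulative_return(v_start: float, v_end: float) -> float:
    """Cumulative return in percent: ((V_end - V_start) / V_start) * 100."""
    if v_start <= 0:
        raise ValueError("v_start must be positive")
    return (v_end - v_start) / v_start * 100.0


def annualized_return(v_start: float, v_end: float, years: float) -> float:
    """Annualized return in percent: ((V_end / V_start) ** (1 / N) - 1) * 100."""
    if v_start <= 0 or years <= 0:
        raise ValueError("v_start and years must be positive")
    return ((v_end / v_start) ** (1.0 / years) - 1.0) * 100.0


if __name__ == "__main__":
    # A hypothetical portfolio growing from 100 to 121 over two years.
    print(cumulative_return(100.0, 121.0))
    print(annualized_return(100.0, 121.0, 2.0))
```

Note that for a simulation shorter than one year, N is a fraction (for example 0.25 for three months), so the annualized figure exceeds the cumulative one whenever the return is positive.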

            -

            Explainability

            -

            A significant drawback of current deep learning methods for trading is their dense and complex architectures, which often render the decisions made by trading agents indecipherable to humans. This challenge, rooted in the broader issue of AI explainability, is particularly critical for trading agents, as they operate in real-world financial markets, often involving substantial sums of money where incorrect decisions can lead to severe consequences and losses.

            +

            Sharpe Ratio (SR)

            +

            The Sharpe ratio measures risk-adjusted return by comparing a portfolio's excess return over the risk-free rate to its volatility:

            +

            + \( \mathrm{SR} = \frac{\bar{R} - R_f}{\sigma} \) +

            +

            where \(\bar{R}\) is the average portfolio return, \(R_f\) is the risk-free rate (e.g., yield of 3-month Treasury bills), and \(\sigma\) is the standard deviation of the portfolio returns.
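As a rough illustration of the Sharpe ratio definition above, the sketch below computes it from a list of per-period returns. It is an assumption-laden example rather than the project's evaluation code: the function name is hypothetical, the risk-free rate is assumed to be expressed per period (matching the returns), the sample standard deviation is used, and no annualization factor is applied.

```python
from statistics import mean, stdev
from typing import Sequence


def sharpe_ratio(returns: Sequence[float], risk_free_rate: float = 0.0) -> float:
    """Sharpe ratio: (mean return - risk-free rate) / standard deviation of returns.

    Assumes `returns` and `risk_free_rate` use the same period (e.g. daily).
    """
    if len(returns) < 2:
        raise ValueError("need at least two returns to estimate volatility")
    sigma = stdev(returns)  # sample standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("returns have zero volatility; Sharpe ratio is undefined")
    return (mean(returns) - risk_free_rate) / sigma


if __name__ == "__main__":
    # Hypothetical daily returns, with a zero per-period risk-free rate.
    print(sharpe_ratio([0.01, 0.03, -0.02, 0.015]))
```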

            -

            In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.

            +

            Maximum Drawdown (MDD)

            +

            Maximum drawdown measures the largest peak-to-trough decline in the portfolio value:

            +

            + \( \mathrm{MDD} = \max_{t \in [0, T]} \left( \frac{\mathrm{Peak}_t - \mathrm{Trough}_t}{\mathrm{Peak}_t} \right) \times 100\% \) +
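The maximum drawdown definition above can be sketched with a single pass that tracks the running peak of the portfolio value. This is an illustrative example only; the function name and the list-of-values input are assumptions, not part of the TradingAgents code.

```python
from typing import Sequence


def max_drawdown(values: Sequence[float]) -> float:
    """Largest peak-to-trough decline, in percent, over a series of portfolio values."""
    if not values:
        raise ValueError("values must be non-empty")
    if min(values) <= 0:
        raise ValueError("portfolio values must be positive")
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)                      # highest value seen so far
        worst = max(worst, (peak - value) / peak)    # drawdown from that peak
    return worst * 100.0


if __name__ == "__main__":
    # Hypothetical value series: the peak of 120 is followed by a trough of 90.
    print(max_drawdown([100.0, 120.0, 90.0, 110.0]))
```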

      From 2e8a02e874c8b1421c9af47dd415efaa1d3faf2d Mon Sep 17 00:00:00 2001 From: Yijia-Xiao Date: Sat, 28 Dec 2024 12:35:29 +0800 Subject: [PATCH 05/56] Layout --- index.html | 210 +++++++++++++++++++++++++---------------------------- 1 file changed, 99 insertions(+), 111 deletions(-) diff --git a/index.html b/index.html index b9fa7ebd..86bb145d 100644 --- a/index.html +++ b/index.html @@ -190,29 +190,17 @@

      The Analyst Team is composed of specialized agents responsible for gathering and analyzing various types of market data to inform trading decisions. Each agent focuses on a specific aspect of market analysis, bringing together a comprehensive view of the market's conditions.

      -
      -
      +
      +
      TradingAgents Analyst Team
      Figure 2: TradingAgents Analyst Team
      -
      +
      TradingAgents Researcher Team -
      Figure 3: TradingAgents Researcher Team: Bullish Perspectives and Bearish Perspectives
      -
      -
      -
      -
      - TradingAgents Trader Decision-Making Process -
      Figure 4: TradingAgents Trader Decision-Making Process
      -
      -
      -
      -
      - TradingAgents Risk Management Team and Fund Manager Approval Workflow -
      Figure 5: TradingAgents Risk Management Team and Fund Manager Approval Workflow
      +
      Figure 3: TradingAgents Researcher Team
      @@ -243,6 +231,21 @@

      Trader Agents are responsible for executing trading decisions based on the comprehensive analysis provided by the Analyst Team and the nuanced perspectives from the Researcher Team. They assess the synthesized information, considering both quantitative data and qualitative insights, to determine optimal trading actions.

      +
      +
      +
      + TradingAgents Trader Decision-Making Process +
      Figure 4: TradingAgents Trader Decision-Making Process
      +
      +
      +
      +
      + TradingAgents Risk Management Team and Fund Manager Approval Workflow +
      Figure 5: TradingAgents Risk Management Team and Fund Manager Approval Workflow
      +
      +
      +
      +
      • Evaluating recommendations and insights from analysts and researchers.
      • Deciding on the timing and size of trades to maximize trading returns.
      • @@ -366,106 +369,104 @@ - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + - - - - - - - + + + + + + + - - - - - - - + + + + + + + - - - - - - - + + + + + + - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
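        - Metric | Buy and Hold | MACD | KDJ & RSI | ZMR | SMA | TradingAgents
        - Cumulative Return (CR%) | -5.23 | -1.49 | 2.05 | 0.57 | -3.2 | 26.62
        - Annualized Return (AR%) | -5.09 | -1.48 | 2.07 | 0.57 | -2.97 | 30.50
        - Sharpe Ratio (SR) | -1.29 | -0.81 | 1.64 | 0.17 | -1.72 | 8.21
        - Maximum Drawdown (MDD%) | 11.90 | 4.53 | 1.09 | 0.86 | 3.67 | 0.91
        + Categories | Models | AAPL CR%↑ | AAPL ARR%↑ | AAPL SR↑ | AAPL MDD%↓ | GOOGL CR%↑ | GOOGL ARR%↑ | GOOGL SR↑ | GOOGL MDD%↓ | AMZN CR%↑ | AMZN ARR%↑ | AMZN SR↑ | AMZN MDD%↓
        + Market | B&H | -5.23 | -5.09 | -1.29 | 11.90 | 7.78 | 8.09 | 1.35 | 13.04 | 17.1 | 17.6 | 3.53 | 3.80
        + Rule-based | MACD | -1.49 | -1.48 | -0.81 | 4.53 | 6.20 | 6.26 | 2.31 | 1.22 | - | - | - | -
        + Rule-based | KDJ&RSI | 2.05 | 2.07 | 1.64 | 1.09 | 0.4 | 0.4 | 0.02 | 1.58 | -0.77 | -0.76 | -2.25 | 1.08
        + Rule-based | ZMR | 0.57 | 0.57 | 0.17 | 0.86 | -0.58 | 0.58 | 2.12 | 2.34 | -0.77 | -0.77 | -2.45 | 0.82
        + Rule-based | SMA | -3.2 | -2.97 | -1.72 | 3.67 | 6.23 | 6.43 | 2.12 | 2.34 | 11.01 | 11.6 | 2.22 | 3.97
        + Ours | TradingAgents | 26.62 | 30.5 | 8.21 | 0.91 | 24.36 | 27.58 | 6.39 | 1.69 | 23.21 | 24.90 | 5.60 | 2.11
        + Improvement(%) |  | 24.57 | 28.43 | 6.57 | - | 16.58 | 19.49 | 4.26 | - | 6.10 | 7.30 | 2.07 | -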
        -

        Table 1: Performance comparison of TradingAgents against baseline models using four evaluation metrics.

        +

        Table 1: TradingAgents (AIS): Comparison of RNA Sequence (left), Modality Fusion (middle), and TradingAgents (right). Embedding base models are BERT, PubMedBERT, and OpenAI's GPT text-embedding-3-large.

        -

        Cumulative Return (CR)

        -

        The cumulative return measures the total return generated over the simulation period. It is calculated as:

        -

        - \( \mathrm{CR} = \frac{V_{\mathrm{end}} - V_{\mathrm{start}}}{V_{\mathrm{start}}} \times 100\% \) -

        -

        where \(V_{\mathrm{end}}\) is the portfolio value at the end of the simulation, and \(V_{\mathrm{start}}\) is the initial portfolio value.

        - -

        Annualized Return (AR)

        -

        The annualized return normalizes the cumulative return over the number of years:

        -

        - \( \mathrm{AR} = \left( \left( \frac{V_{\mathrm{end}}}{V_{\mathrm{start}}} \right)^{1/N} - 1 \right) \times 100\% \) -

        -

        where \(N\) is the number of years in the simulation.

        - -

        Sharpe Ratio (SR)

        -

        The Sharpe ratio measures risk-adjusted return by comparing a portfolio's excess return over the risk-free rate to its volatility:

        -

        - \( \mathrm{SR} = \frac{\bar{R} - R_f}{\sigma} \) -

        -

        where \(\bar{R}\) is the average portfolio return, \(R_f\) is the risk-free rate (e.g., yield of 3-month Treasury bills), and \(\sigma\) is the standard deviation of the portfolio returns.

        - -

        Maximum Drawdown (MDD)

        -

        Maximum drawdown measures the largest peak-to-trough decline in the portfolio value:

        -

        - \( \mathrm{MDD} = \max_{t \in [0, T]} \left( \frac{\mathrm{Peak}_t - \mathrm{Trough}_t}{\mathrm{Peak}_t} \right) \times 100\% \) -

        -
      -
      -
      -
      - - -
      -
      -
      -
      -

      Results and Analysis

      -
      -

      Performance Comparison

      - -

      Cumulative and Annual Returns

      -

      Table 1 and Figures (a) and (b) show that our method significantly outperforms existing rule-based trading baselines, particularly in profitability as measured by returns. TradingAgents achieves at least a 23.21% cumulative return and a 24.90% annualized return on the three sampled stocks, outperforming the best-performing baselines by at least 6.1 percentage points. Notably, on AAPL, a particularly challenging case due to market volatility during the testing period, traditional methods struggled because their patterns failed to generalize. In contrast, TradingAgents excelled even under these adverse conditions, achieving returns exceeding 26% in less than three months.

      - -

      Sharpe Ratio

      +

      Sharpe Ratio

      The Sharpe Ratio results highlight TradingAgents' ability to deliver superior risk-adjusted returns: it outperforms all baseline models on AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60, exceeding the next-best model by at least 2.07 points. This underscores TradingAgents' effectiveness in balancing returns against risk, a critical property for sustainable and predictable investment growth. By outperforming the Buy-and-Hold market benchmark and rule-based strategies such as KDJ&RSI, SMA, MACD, and ZMR, TradingAgents demonstrates adaptability and robustness in diverse market conditions. Its ability to maximize returns while maintaining controlled risk exposure establishes a solid foundation for multi-agent and debate-based automated trading algorithms.

      -

      Maximum Drawdown

      +

      Maximum Drawdown

      While rule-based baselines controlled risk well, as reflected by their maximum drawdown scores, they fell short in capturing high returns. This trade-off between risk and reward underscores TradingAgents' strength as a balanced approach. Although higher returns are typically associated with higher risk, TradingAgents maintained a relatively low maximum drawdown compared to many baselines. Its risk-control mechanisms, facilitated by the debates among risk-control agents, kept the maximum drawdown at roughly 2% or below (at most 2.11%, on AMZN, in Table 1). This demonstrates TradingAgents' capability to balance maximizing returns against managing risk.

      -

      Explainability

      +

      Explainability

      A significant drawback of current deep learning methods for trading is their dense and complex architectures, which often render the decisions made by trading agents indecipherable to humans. This challenge, rooted in the broader issue of AI explainability, is particularly critical for trading agents, as they operate in real-world financial markets, often involving substantial sums of money where incorrect decisions can lead to severe consequences and losses.

      In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.

      @@ -475,19 +476,6 @@
      -
      -
      -
      -
      -

      Conclusion

      -
      -

      In this paper, we introduced TradingAgents, an LLM-agent-powered stock trading framework that simulates a realistic trading firm environment with multiple specialized agents engaging in agentic debates and conversations. Leveraging the capabilities of LLMs to process and analyze diverse data sources, the framework enables informed trading decisions while utilizing multi-agent interactions to enhance performance through comprehensive reasoning and debate before acting. By integrating agents with distinct roles and risk profiles, along with a reflective agent and a dedicated risk management team, TradingAgents significantly improves trading outcomes and risk management compared to baseline models. Additionally, the collaborative nature of these agents ensures adaptability to varying market conditions. Extensive experiments demonstrate that TradingAgents outperforms traditional trading strategies and baselines in cumulative return, Sharpe ratio, and other critical metrics. Future work will focus on deploying the framework in a live trading environment, expanding agent roles, and incorporating real-time data processing to enhance performance further.

      -
      -
      -
      -
      -
      -
      From fac7eca73f201d8d42afe04f77e8465edaa6fe3c Mon Sep 17 00:00:00 2001 From: Yijia-Xiao Date: Sat, 28 Dec 2024 12:51:51 +0800 Subject: [PATCH 06/56] Layout --- index.html | 69 ++++++++++++++++++++++++++++++------------------------ 1 file changed, 39 insertions(+), 30 deletions(-) diff --git a/index.html b/index.html index 86bb145d..a38bb5ca 100644 --- a/index.html +++ b/index.html @@ -188,22 +188,12 @@

      Analyst Team

      -

      The Analyst Team is composed of specialized agents responsible for gathering and analyzing various types of market data to inform trading decisions. Each agent focuses on a specific aspect of market analysis, bringing together a comprehensive view of the market's conditions.

      +

      The Analyst Team (Figure 2) is composed of specialized agents responsible for gathering and analyzing various types of market data to inform trading decisions. Each agent focuses on a specific aspect of market analysis, bringing together a comprehensive view of the market's conditions.

      -
      -
      -
      - TradingAgents Analyst Team -
      Figure 2: TradingAgents Analyst Team
      -
      -
      -
      -
      - TradingAgents Researcher Team -
      Figure 3: TradingAgents Researcher Team
      -
      -
      -
      +
      + TradingAgents Analyst Team +
      Figure 2: TradingAgents Analyst Team
      +
      • Fundamental Analyst Agents: These agents evaluate company fundamentals by analyzing financial statements, earnings reports, insider transactions, and other pertinent data. They assess a company's intrinsic value to identify undervalued or overvalued stocks, providing insights into long-term investment potential.
      • @@ -217,21 +207,15 @@

        Researcher Team

        -

        The Researcher Team is responsible for critically evaluating the information provided by the Analyst Team. Comprised of agents adopting both bullish and bearish perspectives, they engage in multiple rounds of debate to assess the potential risks and benefits of investment decisions.

        - -
          -
        • Bullish Researchers: These agents advocate for investment opportunities by highlighting positive indicators, growth potential, and favorable market conditions. They construct arguments supporting the initiation or continuation of positions in certain assets.
        • -
        • Bearish Researchers: Conversely, these agents focus on potential downsides, risks, and unfavorable market signals. They provide cautionary insights, questioning the viability of investment strategies and highlighting possible negative outcomes.
        • -
        - -

        Through this dialectical process, the Researcher Team aims to reach a balanced understanding of the market situation. Their thorough analysis helps in identifying the most promising investment strategies while anticipating possible challenges, thus aiding the Trader Agents in making informed decisions.

        -
        - -

        Trader Agents

        -
        -

        Trader Agents are responsible for executing trading decisions based on the comprehensive analysis provided by the Analyst Team and the nuanced perspectives from the Researcher Team. They assess the synthesized information, considering both quantitative data and qualitative insights, to determine optimal trading actions.

        +

        The Researcher Team (Figure 3) is responsible for critically evaluating the information provided by the Analyst Team. Comprised of agents adopting both bullish and bearish perspectives, they engage in multiple rounds of debate to assess the potential risks and benefits of investment decisions.

        +
        +
        + TradingAgents Researcher Team +
        Figure 3: TradingAgents Researcher Team: Bullish Perspectives and Bearish Perspectives
        +
        +
        TradingAgents Trader Decision-Making Process @@ -246,6 +230,18 @@
        +
          +
        • Bullish Researchers: These agents advocate for investment opportunities by highlighting positive indicators, growth potential, and favorable market conditions. They construct arguments supporting the initiation or continuation of positions in certain assets.
        • +
        • Bearish Researchers: Conversely, these agents focus on potential downsides, risks, and unfavorable market signals. They provide cautionary insights, questioning the viability of investment strategies and highlighting possible negative outcomes.
        • +
        + +

        Through this dialectical process, the Researcher Team aims to reach a balanced understanding of the market situation. Their thorough analysis helps in identifying the most promising investment strategies while anticipating possible challenges, thus aiding the Trader Agents in making informed decisions.
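The multi-round bull/bear debate described above could be organized along the following lines. This is a purely hypothetical sketch: the `Agent` type, `run_debate`, and the stub agents are illustrative names, not part of the TradingAgents codebase, and the stubs return canned strings where a real system would call an LLM.

```python
from typing import Callable, List, Tuple

# Hypothetical agent signature: (analyst report, transcript so far) -> argument text.
Transcript = List[Tuple[str, str]]
Agent = Callable[[str, Transcript], str]


def run_debate(report: str, bull: Agent, bear: Agent, rounds: int = 2) -> Transcript:
    """Alternate bullish and bearish arguments for a fixed number of rounds."""
    transcript: Transcript = []
    for _ in range(rounds):
        # Each side sees the analyst report plus everything said so far.
        transcript.append(("bull", bull(report, transcript)))
        transcript.append(("bear", bear(report, transcript)))
    return transcript


def stub_bull(report: str, transcript: Transcript) -> str:
    # Stand-in for an LLM call; a real agent would argue from the report's contents.
    return f"Upside case (turn {len(transcript) + 1}) based on: {report}"


def stub_bear(report: str, transcript: Transcript) -> str:
    # Stand-in for an LLM call; responds to the most recent bullish argument.
    return f"Risk case (turn {len(transcript) + 1}) rebutting: {transcript[-1][1]}"


if __name__ == "__main__":
    for speaker, argument in run_debate("Q3 earnings beat expectations", stub_bull, stub_bear):
        print(f"{speaker}: {argument}")
```

Passing the full transcript to each agent is one simple design choice; the resulting transcript is what a downstream trader agent would read before deciding.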

        +
        + +

        Trader Agents

        +
        +

        Trader Agents (Figure 4) are responsible for executing trading decisions based on the comprehensive analysis provided by the Analyst Team and the nuanced perspectives from the Researcher Team. They assess the synthesized information, considering both quantitative data and qualitative insights, to determine optimal trading actions.

        +
        • Evaluating recommendations and insights from analysts and researchers.
        • Deciding on the timing and size of trades to maximize trading returns.
        • @@ -258,7 +254,7 @@

          Risk Management Team

          -

          The Risk Management Team monitors and controls the firm's exposure to various market risks. These agents continuously evaluate the portfolio's risk profile, ensuring that trading activities remain within predefined risk parameters and comply with regulatory requirements.

          +

          The Risk Management Team (Figure 5) monitors and controls the firm's exposure to various market risks. These agents continuously evaluate the portfolio's risk profile, ensuring that trading activities remain within predefined risk parameters and comply with regulatory requirements.

          • Assessing factors such as market volatility, liquidity, and counterparty risks.
          • @@ -458,7 +454,7 @@ -

            Table 1: TradingAgents (AIS): Comparison of RNA Sequence (left), Modality Fusion (middle), and TradingAgents (right). Embedding base models are BERT, PubMedBERT, and OpenAI's GPT text-embedding-3-large.

            +

            Table 1: TradingAgents: Comparison of Performance Metrics across AAPL, GOOGL, and AMZN.

            Sharpe Ratio

            The Sharpe Ratio results highlight TradingAgents' ability to deliver superior risk-adjusted returns: it outperforms all baseline models on AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60, exceeding the next-best model by at least 2.07 points. This underscores TradingAgents' effectiveness in balancing returns against risk, a critical property for sustainable and predictable investment growth. By outperforming the Buy-and-Hold market benchmark and rule-based strategies such as KDJ&RSI, SMA, MACD, and ZMR, TradingAgents demonstrates adaptability and robustness in diverse market conditions. Its ability to maximize returns while maintaining controlled risk exposure establishes a solid foundation for multi-agent and debate-based automated trading algorithms.

            @@ -476,6 +472,19 @@
          +
          +
          +
          +
          +

          Conclusion

          +
          +

          In this paper, we introduced TradingAgents, an LLM-agent-powered stock trading framework that simulates a realistic trading firm environment with multiple specialized agents engaging in agentic debates and conversations. Leveraging the capabilities of LLMs to process and analyze diverse data sources, the framework enables informed trading decisions while utilizing multi-agent interactions to enhance performance through comprehensive reasoning and debate before acting. By integrating agents with distinct roles and risk profiles, along with a reflective agent and a dedicated risk management team, TradingAgents significantly improves trading outcomes and risk management compared to baseline models. Additionally, the collaborative nature of these agents ensures adaptability to varying market conditions. Extensive experiments demonstrate that TradingAgents outperforms traditional trading strategies and baselines in cumulative return, Sharpe ratio, and other critical metrics. Future work will focus on deploying the framework in a live trading environment, expanding agent roles, and incorporating real-time data processing to enhance performance further.

          +
          +
          +
          +
          +
          +
          From 1920757ed9bde412f1f969e0830a16575f4b9573 Mon Sep 17 00:00:00 2001 From: Yijia-Xiao Date: Sat, 28 Dec 2024 13:06:03 +0800 Subject: [PATCH 07/56] Layout --- index.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.html b/index.html index a38bb5ca..46c0c3f5 100644 --- a/index.html +++ b/index.html @@ -191,7 +191,7 @@

          The Analyst Team (Figure 2) is composed of specialized agents responsible for gathering and analyzing various types of market data to inform trading decisions. Each agent focuses on a specific aspect of market analysis, bringing together a comprehensive view of the market's conditions.

          - TradingAgents Analyst Team + TradingAgents Analyst Team
          Figure 2: TradingAgents Analyst Team
          From 21890ac942f2c563e2922dc0b71190c777dd8c58 Mon Sep 17 00:00:00 2001 From: Yijia-Xiao Date: Sat, 28 Dec 2024 13:35:10 +0800 Subject: [PATCH 08/56] Results --- index.html | 24 ++++++++++++++++++++++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/index.html b/index.html index 46c0c3f5..3c6a494a 100644 --- a/index.html +++ b/index.html @@ -346,7 +346,7 @@

        Evaluation Metrics

        - + +
        + +
        + Cumulative Returns on AAPL +
        + (a) Cumulative Returns on AAPL +
        +
        + +
        + TradingAgents Transactions for AAPL +
        + (b) TradingAgents Transactions for AAPL.
        + Green / Red Arrows for Long / Short Positions. +
        +
        + +
        + From 0ef7b4657943d6a393fb3b135236080f4a116ab3 Mon Sep 17 00:00:00 2001 From: Yijia-Xiao Date: Sat, 28 Dec 2024 13:41:45 +0800 Subject: [PATCH 09/56] Results --- index.html | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/index.html b/index.html index 3c6a494a..64240731 100644 --- a/index.html +++ b/index.html @@ -357,24 +357,24 @@
        (b) TradingAgents Transactions for AAPL.
        Green / Red Arrows for Long / Short Positions.
        --> -
        - -
        - Cumulative Returns on AAPL -
        - (a) Cumulative Returns on AAPL -
        -
        +
        +
        +
        + Cumulative Returns on AAPL +
        (a) Cumulative Returns on AAPL
        +
        +
        -
        - TradingAgents Transactions for AAPL -
        - (b) TradingAgents Transactions for AAPL.
        - Green / Red Arrows for Long / Short Positions. -
        -
        - -
        +
        +
        + TradingAgents Transactions for AAPL +
        + (b) TradingAgents Transactions for AAPL.
        + Green / Red Arrows for Long / Short Positions. +
        +
        +
        +
        @@ -93,7 +92,7 @@

        Abstract

        -

        Significant progress has been made in automated problem-solving using societies of agents powered by large language models (LLMs). In finance, efforts have largely focused on single-agent systems handling specific tasks or multi-agent frameworks independently gathering data. However, multi-agent systems' potential to replicate real-world trading firms' collaborative dynamics remains underexplored. TradingAgents proposes a novel stock trading framework inspired by trading firms, featuring LLM-powered agents in specialized roles such as fundamental analysts, sentiment analysts, technical analysts, and traders with varied risk profiles. The framework includes Bull and Bear researcher agents assessing market conditions, a risk management team monitoring exposure, and traders synthesizing insights from debates and historical data to make informed decisions. By simulating a dynamic, collaborative trading environment, this framework aims to improve trading performance. Detailed architecture and extensive experiments reveal its superiority over baseline models, with notable improvements in cumulative returns, Sharpe ratio, and maximum drawdown, highlighting the potential of multi-agent LLM frameworks in financial trading.

        +

        Societies of LLM-powered agents have advanced automated problem-solving, particularly in finance. Yet, most frameworks don’t replicate the collaborative workflows of real trading firms. TradingAgents addresses this gap by assigning specialized LLM-powered agents—analysts, researchers, traders, and risk managers—to simulate a dynamic, team-based environment. These agents collaborate through debates, structured outputs, and risk checks. Experiments show that TradingAgents significantly improves key performance metrics over baseline models, highlighting the promise of multi-agent LLM frameworks in financial trading.

        @@ -106,19 +105,8 @@

        Introduction

        -

        Autonomous agents leveraging Large Language Models (LLMs) present a transformative approach to decision-making by replicating human processes and workflows across various applications. These systems enhance the problem-solving capabilities of language agents by equipping them with tools and enabling collaboration with other agents, effectively breaking down complex problems into manageable components. One prominent application of these autonomous frameworks is in the financial market—a highly complex system influenced by numerous factors, including company fundamentals, market sentiment, technical indicators, and macroeconomic events.

        - -

        Traditional algorithmic trading systems often rely on quantitative models that struggle to fully capture the complex interplay of diverse factors. In contrast, LLMs excel at processing and understanding natural language data, making them particularly effective for tasks that require textual comprehension, such as analyzing news articles, financial reports, and social media sentiment. Additionally, deep learning-based trading systems often suffer from low explainability, as they rely on hidden features that drive decision-making but are difficult to interpret. Recent advancements in multi-agent LLM frameworks for finance have shown significant promise in addressing these challenges. These frameworks create explainable AI systems, where decisions are supported by evidence and transparent reasoning, demonstrating their potential in financial applications.

        - -

        Despite their potential, most current applications of language agents in the financial and trading sectors face two significant limitations:

        - - Lack of Realistic Organizational Modeling: Many frameworks fail to capture the complex interactions between agents that mimic the structure of real-world trading firms. Instead, they focus narrowly on specific task performance, often disconnected from the organizational workflows and established human operating procedures proven effective in trading. This limits their ability to fully replicate and benefit from real-world trading practices. - - Inefficient Communication Interfaces: Most existing systems use natural language as the primary communication medium, typically relying on message histories or an unstructured pool of information for decision-making. This approach often results in a "telephone effect", where details are lost, and states become corrupted as conversations lengthen. Agents struggle to maintain context and track extended histories while filtering out irrelevant information from previous decision steps, diminishing their effectiveness in handling complex, dynamic tasks. Additionally, the unstructured pool-of-information approach lacks clear instructions, forcing logical communication and information exchange between agents to depend solely on retrieval, which disrupts the relational integrity of the data.

        - -

        In this work, we address these key limitations of existing models by introducing a system that overcomes these challenges. First, our framework bridges the gap by simulating the multi-agent decision-making processes typical of professional trading teams. It incorporates specialized agents tailored to distinct aspects of trading, inspired by the organizational structure of real-world trading firms. These agents include fundamental analysts, sentiment/news analysts, technical analysts, and traders with diverse risk profiles. Bullish and bearish debaters evaluate market conditions to provide balanced recommendations, while a risk management team ensures that exposures remain within acceptable limits. Second, to enhance communication, our framework combines structured outputs for control, clarity, and reasoning with natural language dialogue to facilitate effective debate and collaboration among agents. This hybrid approach ensures both precision and flexibility in decision-making.

        - -

        We validate our framework through experiments on historical financial data, comparing its performance against multiple baselines. Comprehensive evaluation metrics, including cumulative return, Sharpe ratio, and maximum drawdown, are employed to assess its overall effectiveness.

        +

        Autonomous agents equipped with Large Language Models (LLMs) can mimic human problem-solving in finance—an intricate domain shaped by fundamentals, market sentiment, and macro factors. While deep learning models have long struggled with explainability, LLM-based systems show promise by pairing structured reasoning with interpretability. However, current solutions often lack organizational realism and rely on purely conversational interfaces susceptible to context loss.

        +

        TradingAgents fills these gaps by emulating the multi-agent decision-making processes of trading firms. The framework includes fundamental, sentiment, news, and technical analysts, along with bullish and bearish researchers, traders, and a risk management team. They coordinate using structured documents and concise dialogues. Our architecture leverages specialized LLM roles, combining clarity with deeper debates. Through extensive evaluations, TradingAgents delivers robust performance across multiple assets, validating the importance of multi-agent collaboration for real-world trading systems.

        @@ -133,41 +121,37 @@

        LLMs as Financial Assistants

        -

        Large Language Models (LLMs) are applied in finance by fine-tuning on financial data or training on financial corpora. This improves the model’s understanding of financial terminology and data, enabling a specialized assistant for analytical support, insights, and information retrieval, rather than trade execution.

        +

        Specialized LLMs in finance have improved domain understanding via fine-tuning or from-scratch training on financial corpora (e.g., FinGPT, BloombergGPT). These models often excel at classification tasks but face challenges in generative quality compared to powerful general-purpose models like GPT-4.

        Fine-Tuned LLMs for Finance -

        Fine-tuning enhances domain-specific performance. Examples include PIXIU (FinMA), which fine-tuned LLaMA on 136K finance-related instructions; FinGPT, which used LoRA to fine-tune models like LLaMA and ChatGLM with about 50K finance-specific samples; and Instruct-FinGPT, fine-tuned on 10K instruction samples from financial sentiment analysis datasets. These models outperform their base versions and other open-source LLMs like BLOOM and OPT in finance classification tasks, even surpassing BloombergGPT in several evaluations. However, in generative tasks, they perform similarly or slightly worse than powerful general-purpose models like GPT-4, indicating a need for more high-quality, domain-specific datasets.

        +

        Fine-tuning boosts performance on tasks such as financial sentiment analysis. Examples include PIXIU (FinMA) and Instruct-FinGPT. They outperform generic open-source LLMs but still lag behind top-tier proprietary models in some generative tasks.

        Finance LLMs Trained from Scratch -

        Training LLMs from scratch on finance-specific corpora aims for better domain adaptation. Models like BloombergGPT, XuanYuan 2.0, and Fin-T5 combine public datasets with finance-specific data during pretraining. BloombergGPT, for instance, was trained on both general and financial text, with proprietary Bloomberg data enhancing its performance on finance benchmarks. These models outperform general-purpose counterparts like BLOOM-176B and T5 in tasks such as market sentiment classification and summarization. While they may not match larger closed-source models like GPT-3 or PaLM, they offer competitive performance among similar-sized open-source models without compromising general language understanding.

        - -

        In summary, finance-specific LLMs developed through fine-tuning or training from scratch show significant improvements in domain-specific tasks, underscoring the importance of domain adaptation and the potential for further enhancements with high-quality finance-specific datasets.

        +

        Models like BloombergGPT and XuanYuan 2.0 blend general corpora with specialized financial data, delivering strong domain-specific results. While they may not match larger closed-source models, they remain competitive among open-source counterparts.

        TradingAgents Overall Framework Organization -
        Figure 1: TradingAgents Overall Framework Organization. I. Analysts Team: Four analysts concurrently gather relevant market information. II. Research Team: The team discusses and evaluates the collected data. III. Trader: Based on the researchers' analysis, the trader makes the trading decision. IV. Risk Management Team: Risk guardians assess the decision against current market conditions to mitigate risks. V. Fund Manager: The fund manager approves and executes the trade.
        +
        Figure 1: TradingAgents Overall Framework Organization. I. Analysts Team: Four analysts concurrently gather relevant market information. II. Research Team: The team discusses data. III. Trader: Makes final decisions using debates and history. IV. Risk Management Team: Monitors risk. V. Fund Manager: Approves and executes trades.

        LLMs as Traders

        -

        LLMs act as trader agents making direct trading decisions by analyzing external data like news, financial reports, and stock prices. Proposed architectures include news-driven, reasoning-driven, and reinforcement learning (RL)-driven agents.

        +

        LLMs directly executing trades often rely on news-driven or reasoning-driven prompts, sometimes enhanced by reinforcement learning. Debate and reflection modules help overcome hallucinations and bolster factual accuracy.

        News-Driven Agents -

        News-driven architectures integrate stock news and macroeconomic updates into LLM prompts to predict stock price movements. Studies evaluating both closed-source models (e.g., GPT-3.5, GPT-4) and open-source LLMs (e.g., Qwen, Baichuan) in financial sentiment analysis have shown the effectiveness of simple long-short strategies based on sentiment scores. Further research on fine-tuned LLMs like FinGPT and OPT demonstrates improved performance through domain-specific alignment. Advanced methods involve summarizing news data and reasoning about their relationship with stock prices.

        +

        These agents use market news to gauge sentiment. Both closed-source (GPT-4) and open-source (Qwen) models show promising gains via simple sentiment-driven strategies.

        Reasoning-Driven Agents -

        Reasoning-driven agents enhance trading decisions through mechanisms like reflection and debate. Reflection-driven agents, such as FinMem and FinAgent, use layered memorization and multimodal data to summarize inputs into memories, inform decisions, and incorporate technical indicators, achieving superior backtest performance while mitigating hallucinations. Debate-driven agents, like those in heterogeneous frameworks and TradingGPT, enhance reasoning and factual validity by employing LLM debates among agents with different roles, improving sentiment classification and increasing robustness in trading decisions.

        +

        Frameworks like FinMem and TradingGPT integrate multi-round reasoning, reflection, and debates between agents with different stances, enabling more robust trading signals.

        Reinforcement Learning-Driven Agents -

        Reinforcement learning methods align LLM outputs with expected behaviors, using backtesting as rewards. SEP employs RL with memorization and reflection to refine LLM predictions based on market history. Classical RL methods are also used in trading frameworks that integrate LLM-generated embeddings with stock features, trained via algorithms like Proximal Policy Optimization (PPO).

        +

        RL aligns LLM outputs with backtest rewards, often leveraging memorized states and technical signals to refine decision-making.

        LLMs as Alpha Miners

        -

        LLMs are also used to generate alpha factors instead of making direct trading decisions. QuantAgent demonstrates this by leveraging LLMs to produce alpha factors through an inner-loop and outer-loop architecture. In the inner loop, a writer agent generates a script from a trader's idea, while a judge agent provides feedback. In the outer loop, the code is tested in the real market, and trading results enhance the judge agent. This approach enables progressive approximation of optimal behavior.

        - -

        Subsequent research, such as AlphaGPT, proposes a human-in-the-loop framework for alpha mining with a similar architecture. Both studies showcase the effectiveness of LLM-powered alpha mining systems, highlighting their potential in automating and accelerating the development of trading strategies by generating and refining alpha factors.

        +

        Some frameworks focus on generating alpha factors rather than final trades. Systems like QuantAgent and AlphaGPT iteratively refine alpha scripts through feedback from an LLM-based judge and real-market performance, accelerating systematic strategy development.

        @@ -180,15 +164,11 @@

        TradingAgents: Role Specialization

        -

        Assigning LLM agents clear, well-defined roles with specific goals enables the breakdown of complex objectives into smaller, manageable subtasks. Financial trading is a prime example of such complexity, demanding the integration of diverse signals, inputs, and specialized expertise. In the real world, this approach to managing complexity is demonstrated by trading firms that rely on expert teams to collaborate and make high-stakes decisions, underscoring the multifaceted nature of the task.

        - -

        In a typical trading firm, vast amounts of data are collected, including financial metrics, price movements, trading volumes, historical performance, economic indicators, and news sentiment. This data is then analyzed by quantitative experts (quants), including mathematicians, data scientists, and engineers, using advanced tools and algorithms to identify trends and predict market movements.

        - -

        Inspired by this organizational structure, TradingAgents defines seven distinct agent roles within a simulated trading firm: Fundamentals Analyst, Sentiment Analyst, News Analyst, Technical Analyst, Researcher, Trader, and Risk Manager. Each agent is assigned a specific name, role, goal, and set of constraints, alongside predefined context, skills, and tools tailored to their function. For example, a Sentiment Analyst is equipped with tools like web search engines, Reddit search APIs, X/Twitter search tools, and sentiment score calculation algorithms, while a Technical Analyst can execute code, calculate technical indicators, and analyze trading patterns. More specifically, TradingAgents assumes the following teams.

        +

        TradingAgents assigns each LLM agent a clear role. This mirrors how real trading firms split responsibilities—e.g., fundamental, sentiment, news, and technical analysts gather data, while researchers balance bullish and bearish arguments. A trader synthesizes these inputs, and risk managers ensure exposures stay within safe limits. This structured approach fosters comprehensive coverage of market signals.

        Analyst Team

        -

        The Analyst Team (Figure 2) is composed of specialized agents responsible for gathering and analyzing various types of market data to inform trading decisions. Each agent focuses on a specific aspect of market analysis, bringing together a comprehensive view of the market's conditions.

        +

        The analyst team (Figure 2) covers fundamental, sentiment, news, and technical aspects. Each member focuses on different market signals, providing the basis for research and trading decisions.

        TradingAgents Analyst Team @@ -196,24 +176,22 @@
          -
        • Fundamental Analyst Agents: These agents evaluate company fundamentals by analyzing financial statements, earnings reports, insider transactions, and other pertinent data. They assess a company's intrinsic value to identify undervalued or overvalued stocks, providing insights into long-term investment potential.
        • -
        • Sentiment Analyst Agents: These agents process large volumes of social media posts, sentiment scores, and insider sentiments derived from public information and social media activity. They gauge market sentiment to predict how collective investor behavior might impact stock prices in the short term.
        • -
        • News Analyst Agents: These agents analyze news articles, government announcements, and other macroeconomic indicators to assess the market's macroeconomic state, major world events, and significant company changes. They identify news events that could influence market movements, helping to anticipate sudden shifts in market dynamics.
        • -
        • Technical Analyst Agents: These agents calculate and select relevant technical indicators, such as Moving Average Convergence Divergence (MACD) and Relative Strength Index (RSI), customized for specific assets. They analyze price patterns and trading volumes to forecast future price movements, assisting in timing entry and exit points.
        • +
        • Fundamental Analysts: Evaluate intrinsic value via earnings, balance sheets, etc.
        • +
        • Sentiment Analysts: Analyze social media and public sentiment data.
        • +
        • News Analysts: Track macro events, economic indicators, and other critical news.
        • +
        • Technical Analysts: Calculate metrics like MACD/RSI to identify trends and patterns.
        - -

        Collectively, the Analyst Team synthesizes data from multiple sources to provide a holistic market analysis. Their combined insights form the foundational input for the Researcher Team, ensuring that all facets of the market are considered in subsequent decision-making processes.
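The two indicators named for the Technical Analysts can be illustrated with a minimal sketch. It assumes the conventional 12/26/9 MACD spans and a 14-period RSI, and it uses simple averages for RSI rather than Wilder's smoothing; the function names are hypothetical and are not taken from the TradingAgents code.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram) as equal-length lists."""
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


def rsi(closes, period=14):
    """RSI over the last `period` price changes, using simple averages."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    window = closes[-(period + 1):]
    changes = [b - a for a, b in zip(window[:-1], window[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no down moves in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)
```

A positive MACD histogram is usually read as upward momentum, and RSI readings near 70 or 30 are commonly treated as overbought or oversold; how the agents weigh these readings is not specified here.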

        Researcher Team

        -

        The Researcher Team (Figure 3) is responsible for critically evaluating the information provided by the Analyst Team. Comprised of agents adopting both bullish and bearish perspectives, they engage in multiple rounds of debate to assess the potential risks and benefits of investment decisions.

        +

        Bullish and bearish researchers (Figure 3) debate the analysts’ findings, challenging each other’s viewpoints to produce a balanced assessment.

        TradingAgents Researcher Team -
        Figure 3: TradingAgents Researcher Team: Bullish Perspectives and Bearish Perspectives
        +
        Figure 3: TradingAgents Researcher Team
        @@ -225,47 +203,36 @@
        TradingAgents Risk Management Team and Fund Manager Approval Workflow -
        Figure 5: TradingAgents Risk Management Team and Fund Manager Approval Workflow
        +
        Figure 5: TradingAgents Risk Management and Fund Manager Workflow
          -
        • Bullish Researchers: These agents advocate for investment opportunities by highlighting positive indicators, growth potential, and favorable market conditions. They construct arguments supporting the initiation or continuation of positions in certain assets.
        • -
        • Bearish Researchers: Conversely, these agents focus on potential downsides, risks, and unfavorable market signals. They provide cautionary insights, questioning the viability of investment strategies and highlighting possible negative outcomes.
        • +
        • Bullish Researchers: Highlight favorable signals and positive growth opportunities.
        • +
        • Bearish Researchers: Emphasize caution, identifying risks or negative signals.
        - -

        Through this dialectical process, the Researcher Team aims to reach a balanced understanding of the market situation. Their thorough analysis helps in identifying the most promising investment strategies while anticipating possible challenges, thus aiding the Trader Agents in making informed decisions.

        Trader Agents

        -

        Trader Agents (Figure 4) are responsible for executing trading decisions based on the comprehensive analysis provided by the Analyst Team and the nuanced perspectives from the Researcher Team. They assess the synthesized information, considering both quantitative data and qualitative insights, to determine optimal trading actions.

        - +

        Trader agents (Figure 4) synthesize the insights from analysts and researchers into buy or sell decisions, weighing expected returns against potential downside.

          -
        • Evaluating recommendations and insights from analysts and researchers.
        • -
        • Deciding on the timing and size of trades to maximize trading returns.
        • -
        • Placing buy or sell orders in the market.
        • -
        • Adjusting portfolio allocations in response to market changes and new information.
        • +
        • Review data from analysts and researchers.
        • +
        • Determine optimal trade timing and size.
        • +
        • Execute orders and manage portfolios.
        - -

        Trader Agents must balance potential returns against associated risks, making timely decisions in a dynamic market environment. Their actions directly impact the firm's performance, necessitating a high level of precision and strategic thinking.

        Risk Management Team

        -

        The Risk Management Team (Figure 5) monitors and controls the firm's exposure to various market risks. These agents continuously evaluate the portfolio's risk profile, ensuring that trading activities remain within predefined risk parameters and comply with regulatory requirements.

        - +

        Risk managers (Figure 5) evaluate volatility, liquidity, and other exposures. They enforce stop-loss measures and signal portfolio rebalancing when necessary.

          -
        • Assessing factors such as market volatility, liquidity, and counterparty risks.
        • -
        • Implementing risk mitigation strategies, such as setting stop-loss orders or diversifying holdings.
        • -
        • Providing feedback to Trader Agents on risk exposures and suggesting adjustments to trading strategies.
        • -
        • Ensuring that the overall portfolio aligns with the firm's risk tolerance and investment objectives.
        • +
        • Monitor market risk factors.
        • +
        • Adjust trading strategies to stay within risk limits.
        • +
        • Collaborate with traders to manage drawdowns.
        - -

        By offering oversight and guidance, the Risk Management Team helps maintain the firm's financial stability and protect against adverse market events. They play a crucial role in safeguarding assets and ensuring sustainable long-term performance.

        - -

        All agents in TradingAgents follow the ReAct prompting framework, which synergizes reasoning and acting. The environment state is shared and monitored by the agents, enabling them to take context-appropriate actions such as conducting research, executing trades, engaging in debates, or managing risks. This design ensures a collaborative, dynamic decision-making process reflective of real-world trading systems.

        +

        All agents follow a ReAct-style prompting framework. Their actions—like research, debate, or trade execution—are tracked in a shared environment, creating a cohesive multi-agent ecosystem reminiscent of real trading firms.

        @@ -280,30 +247,18 @@

        TradingAgents: Agent Workflow

        Communication Protocol

        -

        Most existing LLM-based agent frameworks use natural language as the primary communication interface, typically through structured message histories or collections of agent-generated messages. However, relying solely on natural language often proves insufficient for solving complex, long-term tasks that require extensive planning horizons. In such cases, pure natural language communication can resemble a game of telephone—over multiple iterations, initial information may be forgotten or distorted due to context length limitations and an overload of text that obscures critical earlier details. To address this limitation, we draw inspiration from frameworks like MetaGPT, which adopt a structured approach to communication. Our model introduces a structured communication protocol to govern agent interactions. By clearly defining each agent's state, we ensure that each role only extracts or queries the necessary information, processes it, and returns a completed report. This streamlined approach reduces unnecessary steps, lowers the risk of message corruption, and keeps interactions focused and efficient, even in complex, long-horizon tasks.

        +

        Relying solely on natural language can lead to “telephone effect” issues for complex, long-horizon tasks. TradingAgents introduces structured reports to preserve key details and reduce message distortion, drawing inspiration from frameworks like MetaGPT. Each agent produces or queries structured entries—concise and focused—to streamline interactions.

        Types of Agent Interactions

        -

        In contrast to previous multi-agent trading frameworks, which rely heavily on natural language dialogue, TradingAgents agents communicate primarily through structured documents and diagrams. These documents encapsulate the agents' insights in concise, well-organized reports that preserve essential content while avoiding irrelevant information. By utilizing structured reports, agents can query necessary details directly from the global state, eliminating the need for lengthy conversations that risk diluting information, extending the message state indefinitely, and causing data loss. The types of documents and the information they contain are detailed below:

        - +

        Instead of lengthy dialogues, TradingAgents agents exchange structured documents containing critical data. Short natural language debates occur when merging contrasting opinions (e.g., bullish vs. bearish). Key communication types include:

          -
        • Analyst Team: Fundamental, sentiment, news, and technical analysts compile their research and findings into concise analysis reports specific to their areas of expertise. These reports include key metrics, insights, and recommendations based on their specialized analyses.
        • -
        • Traders: Traders review and analyze the reports from the analysts, carefully deliberating to produce clear decision signals. They accompany these decisions with detailed reports explaining their rationale and supporting evidence, which are later utilized by the risk management team.
        • -
        - -

        Agents engage in natural language dialogue exclusively during agent-to-agent conversations and debates. These concise, focused discussions have been shown to promote deeper reasoning and integrate diverse perspectives, enabling more balanced decisions in complex, long-horizon scenarios—a method particularly relevant to the intricate environment of trading. This approach seamlessly integrates with our structured framework, as the conversation state is recorded as a structured entry within the overall agent state. The types of communication in these scenarios are detailed below:

        - -
          -
        • Researcher Team: Each researcher agent queries the global agent state for analyst reports and carefully forms their opinion. Two researchers represent opposing perspectives: one bullish and one bearish. They engage in natural language dialogue for $n$ rounds, as determined by the debate facilitator agent. At the conclusion, the facilitator reviews the debate history, selects the prevailing perspective, and records it as a structured entry in the communication protocol.
        • -
        • Risk Management Team: The risk management team, similar to the researcher team, queries the trader's decision and accompanying report. They then deliberate from three perspectives—risk-seeking, neutral, and risk-conservative—to adjust the trading plan within risk constraints. They engage in $n$ rounds of natural language discussion, guided by a facilitator agent.
        • -
        • Fund Manager: The fund manager reviews the discussion from the risk management team, determines the appropriate risk adjustments, and updates the trader's decision and report states within the communication protocol.
        • +
        • Analyst Team: Each analyst produces specialized reports (fundamentals, sentiment, etc.).
        • +
        • Traders: Combine analyst reports into a decision signal with accompanying rationale.
        +

        Debates among researchers or risk managers occur in natural language but are recorded as structured entries. This approach maintains clarity while enabling multi-round reasoning.
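The idea of recording a natural-language debate as a structured entry can be sketched as follows. This is an illustrative assumption, not the project's actual data model: `AgentState`, `DebateEntry`, and `run_debate` are hypothetical names, and the three callables stand in for LLM-backed bullish, bearish, and facilitator agents.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DebateEntry:
    """Structured record of one finished debate."""
    topic: str
    transcript: List[str]
    prevailing_view: str


@dataclass
class AgentState:
    """Shared global state: analyst reports plus recorded debates."""
    reports: Dict[str, str] = field(default_factory=dict)
    debates: List[DebateEntry] = field(default_factory=list)


# A speaker reads the shared state and the transcript so far, then replies.
Speaker = Callable[[AgentState, List[str]], str]


def run_debate(state: AgentState, topic: str, bull: Speaker, bear: Speaker,
               facilitator: Callable[[List[str]], str],
               n_rounds: int = 2) -> DebateEntry:
    """Run n_rounds of bull/bear exchange and store the outcome in the state."""
    transcript: List[str] = []
    for _ in range(n_rounds):
        transcript.append("bull: " + bull(state, transcript))
        transcript.append("bear: " + bear(state, transcript))
    # The facilitator reviews the whole history and names the prevailing view.
    entry = DebateEntry(topic, transcript, facilitator(transcript))
    state.debates.append(entry)
    return entry
```

Downstream agents can then read `state.debates[-1].prevailing_view` instead of re-reading the full conversation, which is the property the structured protocol is meant to provide.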

        Backbone LLMs

        -

        To meet the diverse complexity and speed demands of tasks in our framework, we strategically select Large Language Models (LLMs) based on their strengths. Quick-thinking models, such as gpt-4o-mini and gpt-4o, efficiently handle fast, low-depth tasks like summarization, data retrieval, and converting tabular data to text. In contrast, deep-thinking models like o1-preview excel in reasoning-intensive tasks such as decision-making, evidence-based report writing, and data analysis. These models leverage their architectures for multi-round reasoning, producing logically sound, in-depth insights. Additionally, we prioritize models with proven reliability and scalability to ensure optimal performance across various market conditions. We also employ auxiliary expert models for specialized tasks like sentiment analysis.

        - -

        Specifically, all analyst nodes rely on deep-thinking models to ensure robust analysis, while quick-thinking models handle data retrieval from APIs and tools for efficiency. Researchers and traders use deep-thinking models to generate valuable insights and support well-informed decisions. By aligning the choice of LLMs with the specific requirements of each task, our framework achieves a balance between efficiency and depth of reasoning, which is crucial for effective trading strategies.

        - -

        This implementation strategy ensures that TradingAgents can be deployed without requiring a GPU, relying only on API credits. It also introduces seamless exchangeability of backbone models, enabling researchers to effortlessly replace the model with any locally hosted or API-accessible alternatives in the future. This adaptability supports the integration of improved reasoning models or finance-tuned models customized for specific tasks. As a result, TradingAgents is highly scalable and future-proof, offering flexibility to accommodate any backbone model for any of its agents.

        +

        We employ both “quick-thinking” and “deep-thinking” LLMs, choosing models based on complexity and speed requirements. Analysts and traders use robust reasoning models for decision-making, while simpler tasks (e.g., data retrieval) rely on faster LLMs. This modular design, requiring no GPUs, allows easy swapping of different local or API-based models and ensures future scalability.
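One way to picture the quick-thinking versus deep-thinking split is a small routing table. The sketch below is an assumption for illustration only: the task labels and the `pick_model` helper are hypothetical, while the model names `gpt-4o-mini` and `o1-preview` are the ones the removed version of this paragraph gives as examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackboneConfig:
    """Names of the two backbone models; swap in any local or API model."""
    quick_model: str = "gpt-4o-mini"
    deep_model: str = "o1-preview"


# Hypothetical task labels, grouped by the depth of reasoning they need.
QUICK_TASKS = {"summarization", "data_retrieval", "table_to_text"}
DEEP_TASKS = {"decision_making", "report_writing", "data_analysis"}


def pick_model(task: str, config: BackboneConfig = BackboneConfig()) -> str:
    """Return the backbone model name to use for a given task label."""
    if task in QUICK_TASKS:
        return config.quick_model
    if task in DEEP_TASKS:
        return config.deep_model
    raise ValueError(f"unknown task label: {task}")
```

Because agents only see the returned name, replacing a backbone amounts to constructing a different `BackboneConfig`.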

        @@ -316,48 +271,26 @@

        Experiments

        -

        In this section, we describe the experimental setup used to evaluate our proposed framework. We also provide detailed descriptions of the evaluation metrics employed to assess performance comprehensively.

        +

        We evaluate our framework on multi-asset data that combines historical prices, news, social sentiment, insider transactions, and more. Baselines include traditional rule-based strategies such as Buy-and-Hold, MACD, and SMA.

        Back Trading

        -

        To simulate a realistic trading environment, we utilize a multi-asset and multi-modal financial dataset comprising various stocks such as Apple, Nvidia, Microsoft, Meta, Google, and more. The dataset includes:

        - -
          -
        • Historical Stock Prices: Open, high, low, close, volume, and adjusted close prices from January 1st, 2024, to March 29th, 2024.
        • -
        • News Articles: Daily news updates are gathered from diverse sources such as Bloomberg, Yahoo, EODHD, FinnHub, and Reddit, covering specific company developments, global events, macroeconomic trends, and government updates.
        • -
        • Social Media Posts and Sentiment: Posts from Reddit, X/Twitter, and other platforms along with sentiment scores of posts calculated by auxiliary language models.
        • -
        • Insider Sentiments and Transactions: Sentiment derived from public information, including transactions from SEDI and relevant company filings.
        • -
        • Financial Statements and Earnings Reports: Quarterly and annual reports filed by companies.
        • -
        • Company Profiles and Financial History: Descriptions of company profiles, target industries, and financial history reported by third parties.
        • -
        • Technical Indicators: Sixty standard technical analysis indicators calculated for each asset, including MACD, RSI, Bollinger Bands, etc.
        • -
        +

        Our dataset includes stocks like Apple and Google, daily news, social media sentiment, and technical indicators. Agents process only the data available up to each trading day, avoiding look-ahead bias.

        Simulation Setup

        -

        We simulate the trading environment for the period from June 19, 2024, to November 19, 2024. TradingAgents facilitates seamless plug-and-play strategies during the simulation, enabling straightforward comparisons with any baseline. Agents make decisions based solely on data available up to each trading day, ensuring no future data is used (eliminating look-ahead bias). Based on their analysis, TradingAgents generates trading signals to buy, sell, or hold assets, which are then executed. Afterward, analysis metrics are calculated before proceeding to the next day's data.

        +

        The simulation runs from June 19, 2024, to November 19, 2024. TradingAgents autonomously generates buy, sell, or hold signals, then records performance metrics. This daily cycle repeats for each asset under study.
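The daily cycle can be sketched as a loop in which the decision function only ever sees prices up to the current day. This is a simplified illustration under stated assumptions: a single asset, a long-or-flat position, execution at the close of the decision day, and no transaction costs. `run_backtest` and `decide` are hypothetical names.

```python
from typing import Callable, List, Sequence


def run_backtest(closes: Sequence[float],
                 decide: Callable[[Sequence[float]], str]) -> List[float]:
    """Return daily strategy returns for a long-or-flat strategy on one asset.

    `decide` receives only closes[:t], the prices known before day t,
    and returns "buy", "sell", or "hold".
    """
    position = 0  # 0 = flat, 1 = long
    daily_returns: List[float] = []
    for t in range(1, len(closes)):
        signal = decide(closes[:t])  # no access to closes[t] or later
        if signal == "buy":
            position = 1
        elif signal == "sell":
            position = 0
        # The position chosen on day t-1 earns the return from t-1 to t.
        daily_returns.append(position * (closes[t] / closes[t - 1] - 1.0))
    return daily_returns
```

Slicing the history before calling `decide` is what rules out look-ahead bias in this sketch; any strategy, rule-based or agent-based, can be plugged in through the same callable.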

        Baseline Models

        -

        We compare our TradingAgents framework against several baselines:

        - +

        We benchmark against several baselines:

          -
        • Buy and Hold: Investing equal amounts in all selected stocks and holding them throughout the simulation period.
        • -
        • MACD (Moving Average Convergence Divergence): A trend-following momentum strategy that generates buy and sell signals based on the crossover points between the MACD line and signal line.
        • -
        • KDJ and RSI (Relative Strength Index): A momentum strategy combining KDJ (stochastic oscillator) and RSI (relative strength index) indicators to identify overbought and oversold conditions for trading signals.
        • -
        • ZMR (Zero Mean Reversion): A mean reversion trading strategy that generates signals based on price deviations from and subsequent reversions to a zero reference line.
        • -
        • SMA (Simple Moving Average): A trend-following strategy that generates trading signals based on crossovers between short-term and long-term moving averages.
        • +
        • Buy and Hold
        • +
        • MACD
        • +
        • KDJ and RSI
        • +
        • ZMR
        • +
        • SMA
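As an illustration of how a rule-based baseline produces signals, the SMA strategy can be sketched as a crossover between a short and a long moving average. The window lengths below are assumptions, since the text does not state the ones used in the experiments, and the function names are hypothetical.

```python
def sma(values, window):
    """Simple moving average of the last `window` values."""
    return sum(values[-window:]) / window


def sma_crossover_signal(closes, short=5, long=20):
    """Return "buy", "sell", or "hold" from a short/long SMA crossover."""
    if len(closes) < long + 1:
        return "hold"  # not enough history to compare two consecutive days
    prev_diff = sma(closes[:-1], short) - sma(closes[:-1], long)
    curr_diff = sma(closes, short) - sma(closes, long)
    if prev_diff <= 0 < curr_diff:
        return "buy"   # short average crossed above the long average
    if prev_diff >= 0 > curr_diff:
        return "sell"  # short average crossed below the long average
    return "hold"
```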

        Evaluation Metrics

        - -
        +
        Cumulative Returns on AAPL @@ -374,14 +307,8 @@
        -
        - - +
        +
        @@ -477,15 +404,13 @@

        Table 1: TradingAgents: Comparison of Performance Metrics across AAPL, GOOGL, and AMZN.

        Sharpe Ratio

        -

        The Sharpe Ratio performance highlights TradingAgents' ability to deliver superior risk-adjusted returns, consistently outperforming all baseline models across AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60, surpassing the next best models by a margin of at least 2.07 points. This result underscores TradingAgents' effectiveness in balancing returns against risk, a critical metric for sustainable and predictable investment growth. By excelling over market benchmarks like Buy-and-Hold and rule-based strategies such as MACD, KDJ and RSI, ZMR, and SMA, TradingAgents demonstrates its adaptability and robustness in diverse market conditions. Its ability to maximize returns while maintaining controlled risk exposure establishes a solid foundation for multi-agent and debate-based automated trading algorithms.

        +

        TradingAgents consistently beats all baselines in risk-adjusted returns, with Sharpe Ratios of at least 5.60 that surpass the nearest competitors by at least 2.07 points. Its adaptability and robust debate mechanism enable high returns with controlled risk.
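For readers unfamiliar with the metric, a common way to compute an annualized Sharpe Ratio from daily returns is sketched below. The exact convention used for the reported numbers is not stated here, so the 252-day annualization, the sample standard deviation, and the zero default risk-free rate are all assumptions.

```python
import math
import statistics


def sharpe_ratio(daily_returns, risk_free_annual=0.0, periods_per_year=252):
    """Annualized Sharpe Ratio: mean excess return over its standard deviation."""
    rf_per_period = risk_free_annual / periods_per_year
    excess = [r - rf_per_period for r in daily_returns]
    spread = statistics.stdev(excess)  # sample standard deviation
    if spread == 0:
        raise ValueError("returns have zero variance; Sharpe Ratio is undefined")
    return statistics.mean(excess) / spread * math.sqrt(periods_per_year)
```

Short evaluation windows can produce large annualized values, so ratios computed over a few months are best compared only against baselines measured over the same window.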

        Maximum Drawdown

        -

        While rule-based baselines demonstrated superior performance in controlling risk, as reflected by their maximum drawdown scores, they fell short in capturing high returns. This trade-off between risk and reward underscores TradingAgents' strength as a balanced approach. Despite higher returns being typically associated with higher risks, TradingAgents maintained a relatively low maximum drawdown compared to many baselines. Its effective risk-control mechanisms, facilitated by the debates among risk-control agents, ensured that the maximum drawdown remained within a manageable limit, not exceeding 2%. This demonstrates TradingAgents' capability to strike a robust balance between maximizing returns and managing risk effectively.

        +

        Rule-based baselines limit downside but sacrifice overall returns. TradingAgents balances both, keeping maximum drawdown below 2% while generating superior returns, aided by dedicated risk-control agents.

        Explainability

        -

        A significant drawback of current deep learning methods for trading is their dense and complex architectures, which often render the decisions made by trading agents indecipherable to humans. This challenge, rooted in the broader issue of AI explainability, is particularly critical for trading agents, as they operate in real-world financial markets, often involving substantial sums of money where incorrect decisions can lead to severe consequences and losses.

        - -

        In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.

        +

        Unlike dense deep-learning models, TradingAgents provides transparent logs of its ReAct-style reasoning for every trade decision. This approach greatly enhances human interpretability, facilitating debugging and fine-tuning in real markets.

        @@ -498,7 +423,7 @@

        Conclusion

        -

        In this paper, we introduced TradingAgents, an LLM-agent-powered stock trading framework that simulates a realistic trading firm environment with multiple specialized agents engaging in agentic debates and conversations. Leveraging the capabilities of LLMs to process and analyze diverse data sources, the framework enables informed trading decisions while utilizing multi-agent interactions to enhance performance through comprehensive reasoning and debate before acting. By integrating agents with distinct roles and risk profiles, along with a reflective agent and a dedicated risk management team, TradingAgents significantly improves trading outcomes and risk management compared to baseline models. Additionally, the collaborative nature of these agents ensures adaptability to varying market conditions. Extensive experiments demonstrate that TradingAgents outperforms traditional trading strategies and baselines in cumulative return, Sharpe ratio, and other critical metrics. Future work will focus on deploying the framework in a live trading environment, expanding agent roles, and incorporating real-time data processing to enhance performance further.

        +

        We introduced TradingAgents, a multi-agent LLM trading framework inspired by professional trading firms. Its specialized analysts, researcher debates, and risk management teams create a rich decision-making ecosystem. By effectively combining structured reports and targeted dialogues, TradingAgents exceeds baseline performance across returns, Sharpe ratio, and drawdown metrics. Future work will explore live trading, expanded agent roles, and real-time data integration for even more refined trading outcomes.

        diff --git a/index_complete.html b/index_complete.html new file mode 100644 index 00000000..64240731 --- /dev/null +++ b/index_complete.html @@ -0,0 +1,525 @@ + + + + + + + + TradingAgents: Multi-Agents LLM Financial Trading Framework + + + + + + + + + + + + + + + + + + + + + + +
        +
        +
        +
        +
        +

        TradingAgents: Multi-Agents LLM Financial Trading Framework

        +
        + Yijia Xiao1, + Edward Sun1, + Di Luo2, + Wei Wang1 +
        +
        + 1University of California, Los Angeles, + 2Massachusetts Institute of Technology +
        +
        + +
        +
        +
        +
        +
        +
        + +
        +
        +
        +
        +

        Abstract

        +
        +

        Significant progress has been made in automated problem-solving using societies of agents powered by large language models (LLMs). In finance, efforts have largely focused on single-agent systems handling specific tasks or multi-agent frameworks independently gathering data. However, multi-agent systems' potential to replicate real-world trading firms' collaborative dynamics remains underexplored. TradingAgents proposes a novel stock trading framework inspired by trading firms, featuring LLM-powered agents in specialized roles such as fundamental analysts, sentiment analysts, technical analysts, and traders with varied risk profiles. The framework includes Bull and Bear researcher agents assessing market conditions, a risk management team monitoring exposure, and traders synthesizing insights from debates and historical data to make informed decisions. By simulating a dynamic, collaborative trading environment, this framework aims to improve trading performance. Detailed architecture and extensive experiments reveal its superiority over baseline models, with notable improvements in cumulative returns, Sharpe ratio, and maximum drawdown, highlighting the potential of multi-agent LLM frameworks in financial trading.

        +
        +
        +
        +
        +
        + +
        +
        +
        +
        +

        Introduction

        +
        +

        Autonomous agents leveraging Large Language Models (LLMs) present a transformative approach to decision-making by replicating human processes and workflows across various applications. These systems enhance the problem-solving capabilities of language agents by equipping them with tools and enabling collaboration with other agents, effectively breaking down complex problems into manageable components. One prominent application of these autonomous frameworks is in the financial market—a highly complex system influenced by numerous factors, including company fundamentals, market sentiment, technical indicators, and macroeconomic events.

        + +

        Traditional algorithmic trading systems often rely on quantitative models that struggle to fully capture the complex interplay of diverse factors. In contrast, LLMs excel at processing and understanding natural language data, making them particularly effective for tasks that require textual comprehension, such as analyzing news articles, financial reports, and social media sentiment. Additionally, deep learning-based trading systems often suffer from low explainability, as they rely on hidden features that drive decision-making but are difficult to interpret. Recent advancements in multi-agent LLM frameworks for finance have shown significant promise in addressing these challenges. These frameworks create explainable AI systems, where decisions are supported by evidence and transparent reasoning, demonstrating their potential in financial applications.

        + +

        Despite their potential, most current applications of language agents in the financial and trading sectors face two significant limitations:

        + + Lack of Realistic Organizational Modeling: Many frameworks fail to capture the complex interactions between agents that mimic the structure of real-world trading firms. Instead, they focus narrowly on specific task performance, often disconnected from the organizational workflows and established human operating procedures proven effective in trading. This limits their ability to fully replicate and benefit from real-world trading practices.
        + + Inefficient Communication Interfaces: Most existing systems use natural language as the primary communication medium, typically relying on message histories or an unstructured pool of information for decision-making. This approach often results in a "telephone effect", where details are lost, and states become corrupted as conversations lengthen. Agents struggle to maintain context and track extended histories while filtering out irrelevant information from previous decision steps, diminishing their effectiveness in handling complex, dynamic tasks. Additionally, the unstructured pool-of-information approach lacks clear instructions, forcing logical communication and information exchange between agents to depend solely on retrieval, which disrupts the relational integrity of the data.

        + +

        In this work, we introduce a system that addresses both limitations. First, our framework bridges the gap by simulating the multi-agent decision-making processes typical of professional trading teams. It incorporates specialized agents tailored to distinct aspects of trading, inspired by the organizational structure of real-world trading firms. These agents include fundamental analysts, sentiment/news analysts, technical analysts, and traders with diverse risk profiles. Bullish and bearish debaters evaluate market conditions to provide balanced recommendations, while a risk management team ensures that exposures remain within acceptable limits. Second, to enhance communication, our framework combines structured outputs for control, clarity, and reasoning with natural language dialogue to facilitate effective debate and collaboration among agents. This hybrid approach ensures both precision and flexibility in decision-making.

        + +

        We validate our framework through experiments on historical financial data, comparing its performance against multiple baselines. Comprehensive evaluation metrics, including cumulative return, Sharpe ratio, and maximum drawdown, are employed to assess its overall effectiveness.

        +
        +
        +
        +
        +
        + +
        +
        +
        +
        +

        Related Work

        + +

        LLMs as Financial Assistants

        +
        +

        Large Language Models (LLMs) are applied in finance by fine-tuning on financial data or training on financial corpora. This improves the model’s understanding of financial terminology and data, enabling a specialized assistant for analytical support, insights, and information retrieval, rather than trade execution.

        + + Fine-Tuned LLMs for Finance +

        Fine-tuning enhances domain-specific performance. Examples include PIXIU (FinMA), which fine-tuned LLaMA on 136K finance-related instructions; FinGPT, which used LoRA to fine-tune models like LLaMA and ChatGLM with about 50K finance-specific samples; and Instruct-FinGPT, fine-tuned on 10K instruction samples from financial sentiment analysis datasets. These models outperform their base versions and other open-source LLMs like BLOOM and OPT in finance classification tasks, even surpassing BloombergGPT in several evaluations. However, in generative tasks, they perform similarly to or slightly worse than powerful general-purpose models like GPT-4, indicating a need for more high-quality, domain-specific datasets.

        + + Finance LLMs Trained from Scratch +

        Training LLMs from scratch on finance-specific corpora aims for better domain adaptation. Models like BloombergGPT, XuanYuan 2.0, and Fin-T5 combine public datasets with finance-specific data during pretraining. BloombergGPT, for instance, was trained on both general and financial text, with proprietary Bloomberg data enhancing its performance on finance benchmarks. These models outperform general-purpose counterparts like BLOOM-176B and T5 in tasks such as market sentiment classification and summarization. While they may not match larger closed-source models like GPT-3 or PaLM, they offer competitive performance among similar-sized open-source models without compromising general language understanding.

        + +

        In summary, finance-specific LLMs developed through fine-tuning or training from scratch show significant improvements in domain-specific tasks, underscoring the importance of domain adaptation and the potential for further enhancements with high-quality finance-specific datasets.

        + +
        + TradingAgents Overall Framework Organization +
        Figure 1: TradingAgents Overall Framework Organization. I. Analysts Team: Four analysts concurrently gather relevant market information. II. Research Team: The team discusses and evaluates the collected data. III. Trader: Based on the researchers' analysis, the trader makes the trading decision. IV. Risk Management Team: Risk guardians assess the decision against current market conditions to mitigate risks. V. Fund Manager: The fund manager approves and executes the trade.
        +
        +
        + +

        LLMs as Traders

        +
        +

        LLMs act as trader agents making direct trading decisions by analyzing external data like news, financial reports, and stock prices. Proposed architectures include news-driven, reasoning-driven, and reinforcement learning (RL)-driven agents.

        + + News-Driven Agents +

        News-driven architectures integrate stock news and macroeconomic updates into LLM prompts to predict stock price movements. Studies evaluating both closed-source models (e.g., GPT-3.5, GPT-4) and open-source LLMs (e.g., Qwen, Baichuan) in financial sentiment analysis have shown the effectiveness of simple long-short strategies based on sentiment scores. Further research on fine-tuned LLMs like FinGPT and OPT demonstrates improved performance through domain-specific alignment. Advanced methods involve summarizing news data and reasoning about their relationship with stock prices.

        + + Reasoning-Driven Agents +

        Reasoning-driven agents enhance trading decisions through mechanisms like reflection and debate. Reflection-driven agents, such as FinMem and FinAgent, use layered memorization and multimodal data to summarize inputs into memories, inform decisions, and incorporate technical indicators, achieving superior backtest performance while mitigating hallucinations. Debate-driven agents, like those in heterogeneous frameworks and TradingGPT, enhance reasoning and factual validity by employing LLM debates among agents with different roles, improving sentiment classification and increasing robustness in trading decisions.

        + + Reinforcement Learning-Driven Agents +

        Reinforcement learning methods align LLM outputs with expected behaviors, using backtesting as rewards. SEP employs RL with memorization and reflection to refine LLM predictions based on market history. Classical RL methods are also used in trading frameworks that integrate LLM-generated embeddings with stock features, trained via algorithms like Proximal Policy Optimization (PPO).

        +
        + +

        LLMs as Alpha Miners

        +
        +

        LLMs are also used to generate alpha factors instead of making direct trading decisions. QuantAgent demonstrates this by leveraging LLMs to produce alpha factors through an inner-loop and outer-loop architecture. In the inner loop, a writer agent generates a script from a trader's idea, while a judge agent provides feedback. In the outer loop, the code is tested in the real market, and trading results enhance the judge agent. This approach enables progressive approximation of optimal behavior.

        + +

        Subsequent research, such as AlphaGPT, proposes a human-in-the-loop framework for alpha mining with a similar architecture. Both studies showcase the effectiveness of LLM-powered alpha mining systems, highlighting their potential in automating and accelerating the development of trading strategies by generating and refining alpha factors.

        +
        +
        +
        +
        +
        + +
        +
        +
        +
        +

        TradingAgents: Role Specialization

        +
        +

        Assigning LLM agents clear, well-defined roles with specific goals enables the breakdown of complex objectives into smaller, manageable subtasks. Financial trading is a prime example of such complexity, demanding the integration of diverse signals, inputs, and specialized expertise. In the real world, this approach to managing complexity is demonstrated by trading firms that rely on expert teams to collaborate and make high-stakes decisions, underscoring the multifaceted nature of the task.

        + +

        In a typical trading firm, vast amounts of data are collected, including financial metrics, price movements, trading volumes, historical performance, economic indicators, and news sentiment. This data is then analyzed by quantitative experts (quants), including mathematicians, data scientists, and engineers, using advanced tools and algorithms to identify trends and predict market movements.

        + +

        Inspired by this organizational structure, TradingAgents defines seven distinct agent roles within a simulated trading firm: Fundamentals Analyst, Sentiment Analyst, News Analyst, Technical Analyst, Researcher, Trader, and Risk Manager. Each agent is assigned a specific name, role, goal, and set of constraints, alongside predefined context, skills, and tools tailored to their function. For example, a Sentiment Analyst is equipped with tools like web search engines, Reddit search APIs, X/Twitter search tools, and sentiment score calculation algorithms, while a Technical Analyst can execute code, calculate technical indicators, and analyze trading patterns. More specifically, TradingAgents assumes the following teams.

        + +

        Analyst Team

        +
        +

        The Analyst Team (Figure 2) is composed of specialized agents responsible for gathering and analyzing various types of market data to inform trading decisions. Each agent focuses on a specific aspect of market analysis, bringing together a comprehensive view of the market's conditions.

        + +
        + TradingAgents Analyst Team +
        Figure 2: TradingAgents Analyst Team
        +
        + +
          +
        • Fundamental Analyst Agents: These agents evaluate company fundamentals by analyzing financial statements, earnings reports, insider transactions, and other pertinent data. They assess a company's intrinsic value to identify undervalued or overvalued stocks, providing insights into long-term investment potential.
        • +
        • Sentiment Analyst Agents: These agents process large volumes of social media posts, sentiment scores, and insider sentiments derived from public information and social media activity. They gauge market sentiment to predict how collective investor behavior might impact stock prices in the short term.
        • +
        • News Analyst Agents: These agents analyze news articles, government announcements, and other macroeconomic indicators to assess the market's macroeconomic state, major world events, and significant company changes. They identify news events that could influence market movements, helping to anticipate sudden shifts in market dynamics.
        • +
        • Technical Analyst Agents: These agents calculate and select relevant technical indicators, such as Moving Average Convergence Divergence (MACD) and Relative Strength Index (RSI), customized for specific assets. They analyze price patterns and trading volumes to forecast future price movements, assisting in timing entry and exit points.
        • +
        + +

        Collectively, the Analyst Team synthesizes data from multiple sources to provide a holistic market analysis. Their combined insights form the foundational input for the Researcher Team, ensuring that all facets of the market are considered in subsequent decision-making processes.
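The role definitions described above (a name, role, goal, constraints, and tools per agent) can be pictured as a small data structure. The following is a minimal sketch, not the authors' implementation: the class `AgentProfile`, its field names, and every tool identifier are hypothetical and chosen only to mirror the tool categories the text mentions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AgentProfile:
    """Static description of one agent (illustrative field names)."""
    name: str
    role: str
    goal: str
    constraints: Tuple[str, ...] = ()
    tools: Tuple[str, ...] = ()


# Hypothetical tool identifiers; the text names tool categories, not an API.
ANALYST_TEAM: List[AgentProfile] = [
    AgentProfile(
        name="fundamentals_analyst",
        role="Fundamentals Analyst",
        goal="Assess intrinsic value from company fundamentals.",
        tools=("financial_statements", "earnings_reports", "insider_transactions"),
    ),
    AgentProfile(
        name="sentiment_analyst",
        role="Sentiment Analyst",
        goal="Gauge short-term market sentiment from social media.",
        tools=("web_search", "reddit_search", "x_search", "sentiment_scoring"),
    ),
    AgentProfile(
        name="news_analyst",
        role="News Analyst",
        goal="Identify news and macroeconomic events that may move the market.",
        tools=("news_search", "macro_indicators"),
    ),
    AgentProfile(
        name="technical_analyst",
        role="Technical Analyst",
        goal="Select and compute technical indicators for a given asset.",
        tools=("code_execution", "indicator_calculation"),
    ),
]


def tools_by_role(team: List[AgentProfile]) -> Dict[str, List[str]]:
    """Map each role to the tools it may call."""
    return {agent.role: list(agent.tools) for agent in team}
```

Keeping the profile immutable (`frozen=True`) is one way to make sure an agent's permitted tools cannot drift during a run; other designs are equally plausible.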

        +
        + +

        Researcher Team

        +
        +

        The Researcher Team (Figure 3) is responsible for critically evaluating the information provided by the Analyst Team. Composed of agents adopting both bullish and bearish perspectives, they engage in multiple rounds of debate to assess the potential risks and benefits of investment decisions.

        + +
        +
        +
        + TradingAgents Researcher Team +
        Figure 3: TradingAgents Researcher Team: Bullish Perspectives and Bearish Perspectives
        +
        +
        +
        +
        + TradingAgents Trader Decision-Making Process +
        Figure 4: TradingAgents Trader Decision-Making Process
        +
        +
        +
        +
        + TradingAgents Risk Management Team and Fund Manager Approval Workflow +
        Figure 5: TradingAgents Risk Management Team and Fund Manager Approval Workflow
        +
        +
        +
        + +
          +
        • Bullish Researchers: These agents advocate for investment opportunities by highlighting positive indicators, growth potential, and favorable market conditions. They construct arguments supporting the initiation or continuation of positions in certain assets.
        • +
        • Bearish Researchers: Conversely, these agents focus on potential downsides, risks, and unfavorable market signals. They provide cautionary insights, questioning the viability of investment strategies and highlighting possible negative outcomes.
        • +
        + +

        Through this dialectical process, the Researcher Team aims to reach a balanced understanding of the market situation. Their thorough analysis helps in identifying the most promising investment strategies while anticipating possible challenges, thus aiding the Trader Agents in making informed decisions.

        +
        + +

        Trader Agents

        +
        +

        Trader Agents (Figure 4) are responsible for executing trading decisions based on the comprehensive analysis provided by the Analyst Team and the nuanced perspectives from the Researcher Team. They assess the synthesized information, considering both quantitative data and qualitative insights, to determine optimal trading actions. Their responsibilities include:

        + +
          +
        • Evaluating recommendations and insights from analysts and researchers.
        • +
        • Deciding on the timing and size of trades to maximize trading returns.
        • +
        • Placing buy or sell orders in the market.
        • +
        • Adjusting portfolio allocations in response to market changes and new information.
        • +
        + +

        Trader Agents must balance potential returns against associated risks, making timely decisions in a dynamic market environment. Their actions directly impact the firm's performance, necessitating a high level of precision and strategic thinking.

        +
        + +

        Risk Management Team

        +
        +

        The Risk Management Team (Figure 5) monitors and controls the firm's exposure to various market risks. These agents continuously evaluate the portfolio's risk profile, ensuring that trading activities remain within predefined risk parameters and comply with regulatory requirements. Their responsibilities include:

        + +
          +
        • Assessing factors such as market volatility, liquidity, and counterparty risks.
        • +
        • Implementing risk mitigation strategies, such as setting stop-loss orders or diversifying holdings.
        • +
        • Providing feedback to Trader Agents on risk exposures and suggesting adjustments to trading strategies.
        • +
        • Ensuring that the overall portfolio aligns with the firm's risk tolerance and investment objectives.
        • +
        + +

        By offering oversight and guidance, the Risk Management Team helps maintain the firm's financial stability and protect against adverse market events. They play a crucial role in safeguarding assets and ensuring sustainable long-term performance.

        + +

        All agents in TradingAgents follow the ReAct prompting framework, which synergizes reasoning and acting. The environment state is shared and monitored by the agents, enabling them to take context-appropriate actions such as conducting research, executing trades, engaging in debates, or managing risks. This design ensures a collaborative, dynamic decision-making process reflective of real-world trading systems.

        +
        +
        +
        +
        +
        +
        + +
        +
        +
        +
        +

        TradingAgents: Agent Workflow

        +
        +

        Communication Protocol

        +

        Most existing LLM-based agent frameworks use natural language as the primary communication interface, typically through structured message histories or collections of agent-generated messages. However, relying solely on natural language often proves insufficient for solving complex, long-term tasks that require extensive planning horizons. In such cases, pure natural language communication can resemble a game of telephone—over multiple iterations, initial information may be forgotten or distorted due to context length limitations and an overload of text that obscures critical earlier details. To address this limitation, we draw inspiration from frameworks like MetaGPT, which adopt a structured approach to communication. Our model introduces a structured communication protocol to govern agent interactions. By clearly defining each agent's state, we ensure that each role only extracts or queries the necessary information, processes it, and returns a completed report. This streamlined approach reduces unnecessary steps, lowers the risk of message corruption, and keeps interactions focused and efficient, even in complex, long-horizon tasks.
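To make the idea of a structured protocol concrete, the sketch below shows one way a shared state could hold reports so that each agent queries only the sections it needs instead of rereading a full message history. It is an illustration under assumptions: `Report`, `GlobalState`, and all field names are hypothetical, and the report contents are placeholders rather than real outputs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Report:
    author: str          # agent that produced the report
    summary: str         # concise findings
    recommendation: str  # e.g. a short stance or suggested action


@dataclass
class GlobalState:
    """Shared, structured state that agents read from and write to."""
    ticker: str
    trade_date: str
    analyst_reports: Dict[str, Report] = field(default_factory=dict)
    debate_history: List[str] = field(default_factory=list)
    trader_decision: Optional[str] = None

    def submit_report(self, section: str, report: Report) -> None:
        """Store a completed report under a named section."""
        self.analyst_reports[section] = report

    def query(self, sections: List[str]) -> Dict[str, Report]:
        """Return only the requested sections that already exist."""
        return {s: self.analyst_reports[s] for s in sections if s in self.analyst_reports}


# Placeholder content, for illustration only.
state = GlobalState(ticker="AAPL", trade_date="day-1")
state.submit_report("technical", Report("technical_analyst", "placeholder summary", "hold"))
state.submit_report("news", Report("news_analyst", "placeholder summary", "hold"))
needed = state.query(["technical"])
```

Because each reader receives a fixed-size slice of the state, the amount of text an agent must process does not grow with the length of earlier conversations.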

        + +

        Types of Agent Interactions

        +

        In contrast to previous multi-agent trading frameworks, which rely heavily on natural language dialogue, TradingAgents agents communicate primarily through structured documents and diagrams. These documents encapsulate the agents' insights in concise, well-organized reports that preserve essential content while avoiding irrelevant information. By utilizing structured reports, agents can query necessary details directly from the global state, eliminating the need for lengthy conversations that risk diluting information, extending the message state indefinitely, and causing data loss. The types of documents and the information they contain are detailed below:

        + +
          +
        • Analyst Team: Fundamental, sentiment, news, and technical analysts compile their research and findings into concise analysis reports specific to their areas of expertise. These reports include key metrics, insights, and recommendations based on their specialized analyses.
        • +
        • Traders: Traders review and analyze the reports from the analysts, carefully deliberating to produce clear decision signals. They accompany these decisions with detailed reports explaining their rationale and supporting evidence, which are later utilized by the risk management team.
        • +
        + +

        Agents engage in natural language dialogue exclusively during agent-to-agent conversations and debates. These concise, focused discussions have been shown to promote deeper reasoning and integrate diverse perspectives, enabling more balanced decisions in complex, long-horizon scenarios—a method particularly relevant to the intricate environment of trading. This approach seamlessly integrates with our structured framework, as the conversation state is recorded as a structured entry within the overall agent state. The types of communication in these scenarios are detailed below:

        + +
          +
        • Researcher Team: Each researcher agent queries the global agent state for analyst reports and carefully forms their opinion. Two researchers represent opposing perspectives: one bullish and one bearish. They engage in natural language dialogue for $n$ rounds, as determined by the debate facilitator agent. At the conclusion, the facilitator reviews the debate history, selects the prevailing perspective, and records it as a structured entry in the communication protocol.
        • +
        • Risk Management Team: The risk management team, similar to the researcher team, queries the trader's decision and accompanying report. They then deliberate from three perspectives—risk-seeking, neutral, and risk-conservative—to adjust the trading plan within risk constraints. They engage in $n$ rounds of natural language discussion, guided by a facilitator agent.
        • +
        • Fund Manager: The fund manager reviews the discussion from the risk management team, determines the appropriate risk adjustments, and updates the trader's decision and report states within the communication protocol.
        • +
        + +
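The researcher debate described above can be sketched as a simple loop. This is a minimal illustration, not the authors' code: the speakers are stub functions standing in for LLM calls, `n_rounds` is passed in directly (in the text the facilitator determines it), and the stub judge's selection rule is a placeholder.

```python
from typing import Callable, Dict, List, Tuple

Turn = Tuple[str, str]                      # (speaker, message)
Speaker = Callable[[str, List[Turn]], str]  # (reports, history) -> message


def run_debate(
    reports: str,
    bull: Speaker,
    bear: Speaker,
    judge: Callable[[List[Turn]], str],
    n_rounds: int,
) -> Dict[str, object]:
    """Alternate bull and bear turns, then record a structured outcome."""
    history: List[Turn] = []
    for _ in range(n_rounds):
        history.append(("bull", bull(reports, history)))
        history.append(("bear", bear(reports, history)))
    # The facilitator reviews the full history and picks the prevailing view.
    return {"rounds": n_rounds, "history": history, "prevailing_view": judge(history)}


# Stub agents standing in for LLM calls.
def stub_bull(reports: str, history: List[Turn]) -> str:
    return f"bull argument {len(history) // 2 + 1}"


def stub_bear(reports: str, history: List[Turn]) -> str:
    return f"bear rebuttal {len(history) // 2 + 1}"


def stub_judge(history: List[Turn]) -> str:
    # Placeholder rule: side of the last speaker. A real facilitator would
    # weigh the arguments themselves.
    return history[-1][0]


result = run_debate("analyst reports", stub_bull, stub_bear, stub_judge, n_rounds=2)
```

The returned dictionary is the structured entry: downstream agents can read `prevailing_view` without replaying the dialogue, while `history` remains available for auditing.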

        Backbone LLMs

        +

        To meet the diverse complexity and speed demands of tasks in our framework, we strategically select Large Language Models (LLMs) based on their strengths. Quick-thinking models, such as gpt-4o-mini and gpt-4o, efficiently handle fast, low-depth tasks like summarization, data retrieval, and converting tabular data to text. In contrast, deep-thinking models like o1-preview excel in reasoning-intensive tasks such as decision-making, evidence-based report writing, and data analysis. These models leverage their architectures for multi-round reasoning, producing logically sound, in-depth insights. Additionally, we prioritize models with proven reliability and scalability to ensure optimal performance across various market conditions. We also employ auxiliary expert models for specialized tasks like sentiment analysis.

        + +

        Specifically, all analyst nodes rely on deep-thinking models to ensure robust analysis, while quick-thinking models handle data retrieval from APIs and tools for efficiency. Researchers and traders use deep-thinking models to generate valuable insights and support well-informed decisions. By aligning the choice of LLMs with the specific requirements of each task, our framework achieves a balance between efficiency and depth of reasoning, which is crucial for effective trading strategies.

        + +

        This implementation strategy ensures that TradingAgents can be deployed without requiring a GPU, relying only on API credits. It also introduces seamless exchangeability of backbone models, enabling researchers to effortlessly replace the model with any locally hosted or API-accessible alternatives in the future. This adaptability supports the integration of improved reasoning models or finance-tuned models customized for specific tasks. As a result, TradingAgents is highly scalable and future-proof, offering flexibility to accommodate any backbone model for any of its agents.
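One way to picture the exchangeable backbone is a small routing table from task type to model tier. The sketch below is an assumption-laden illustration: the model names come from the text, but `BackboneConfig`, `TASK_TIER`, `model_for`, and the task labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BackboneConfig:
    """Names of the quick-thinking and deep-thinking backbone models."""
    quick_model: str = "gpt-4o-mini"
    deep_model: str = "o1-preview"


# Hypothetical task labels, grouped by the tiers described in the text.
TASK_TIER: Dict[str, str] = {
    "data_retrieval": "quick",
    "summarization": "quick",
    "table_to_text": "quick",
    "analyst_report": "deep",
    "research_debate": "deep",
    "trade_decision": "deep",
}


def model_for(task: str, config: BackboneConfig) -> str:
    """Pick the backbone model name for a task according to its tier."""
    return config.quick_model if TASK_TIER[task] == "quick" else config.deep_model


default_cfg = BackboneConfig()
# Swapping in another backbone only requires a different config object;
# "my-local-model" is a made-up name.
local_cfg = BackboneConfig(deep_model="my-local-model")
```

With this layout, replacing a backbone touches a single config value and leaves agent logic unchanged.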

        +
        +
        +
        +
        +
        + +
        +
        +
        +
        +

        Experiments

        +
        +

        In this section, we describe the experimental setup used to evaluate our proposed framework. We also provide detailed descriptions of the evaluation metrics employed to assess performance comprehensively.

        + +

        Back Trading

        +

        To simulate a realistic trading environment, we utilize a multi-asset and multi-modal financial dataset comprising various stocks such as Apple, Nvidia, Microsoft, Meta, Google, and more. The dataset includes:

        + +
          +
        • Historical Stock Prices: Open, high, low, close, volume, and adjusted close prices from January 1st, 2024, to March 29th, 2024.
        • +
        • News Articles: Daily news updates are gathered from diverse sources such as Bloomberg, Yahoo, EODHD, FinnHub, and Reddit, covering specific company developments, global events, macroeconomic trends, and government updates.
        • +
        • Social Media Posts and Sentiment: Posts from Reddit, X/Twitter, and other platforms along with sentiment scores of posts calculated by auxiliary language models.
        • +
        • Insider Sentiments and Transactions: Sentiment derived from public information, including transactions from SEDI and relevant company filings.
        • +
        • Financial Statements and Earnings Reports: Quarterly and annual reports filed by companies.
        • +
        • Company Profiles and Financial History: Descriptions of company profiles, target industries, and financial history reported by third parties.
        • +
        • Technical Indicators: Sixty standard technical analysis indicators calculated for each asset, including MACD, RSI, Bollinger Bands, etc.
        • +
        + +

        Simulation Setup

        +

        We simulate the trading environment for the period from June 19, 2024, to November 19, 2024. TradingAgents facilitates seamless plug-and-play strategies during the simulation, enabling straightforward comparisons with any baseline. Agents make decisions based solely on data available up to each trading day, ensuring no future data is used (eliminating look-ahead bias). Based on their analysis, TradingAgents generates trading signals to buy, sell, or hold assets, which are then executed. Afterward, analysis metrics are calculated before proceeding to the next day's data.
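The daily cycle described above can be sketched as a walk-forward loop in which the decision function only ever sees data up to the current day. This is a simplified illustration under stated assumptions, not the authors' simulator: `run_backtest` and `decide` are hypothetical names, trades are all-in or all-out, long-only, and executed at the same day's price with no transaction costs.

```python
from typing import Callable, List, Sequence


def run_backtest(
    prices: Sequence[float],
    decide: Callable[[Sequence[float]], str],
    cash: float = 10_000.0,
) -> List[float]:
    """Walk forward one day at a time and return the portfolio value per day."""
    shares = 0.0
    values = []
    for t in range(len(prices)):
        history = prices[: t + 1]   # data up to and including day t only
        signal = decide(history)    # expected to be "buy", "sell", or "hold"
        price = prices[t]
        if signal == "buy" and cash > 0:
            shares, cash = shares + cash / price, 0.0
        elif signal == "sell" and shares > 0:
            cash, shares = cash + shares * price, 0.0
        values.append(cash + shares * price)   # metrics are computed from this series
    return values


# Toy decision rule: buy on the first day, sell on the second, then hold.
def toy_decide(history: Sequence[float]) -> str:
    return {1: "buy", 2: "sell"}.get(len(history), "hold")


portfolio = run_backtest([10.0, 20.0, 10.0], toy_decide)
```

Because `history` is sliced before `decide` is called, a strategy cannot read a future price even by accident, which is the look-ahead guarantee the setup relies on.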

        + +

        Baseline Models

        +

        We compare our TradingAgents framework against several baselines:

        + +
          +
        • Buy and Hold: Investing equal amounts in all selected stocks and holding them throughout the simulation period.
        • +
        • MACD (Moving Average Convergence Divergence): A trend-following momentum strategy that generates buy and sell signals based on the crossover points between the MACD line and signal line.
        • +
        • KDJ and RSI: A momentum strategy combining the KDJ stochastic oscillator and the Relative Strength Index (RSI) to identify overbought and oversold conditions for trading signals.
        • +
        • ZMR (Zero Mean Reversion): A mean reversion trading strategy that generates signals based on price deviations from and subsequent reversions to a zero reference line.
        • +
        • SMA (Simple Moving Average): A trend-following strategy that generates trading signals based on crossovers between short-term and long-term moving averages.
        • +
        + +
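As an illustration of how a rule-based baseline from the list above turns prices into signals, here is a minimal sketch of the SMA crossover rule. The window lengths and the function names `sma` and `sma_crossover_signals` are assumptions made for this example; the source does not state the parameters the authors used.

```python
def sma(values, window):
    """Simple moving average; None until `window` observations are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def sma_crossover_signals(closes, short=5, long=20):
    """Emit "buy" when the short SMA crosses above the long SMA, "sell" on the reverse."""
    fast, slow = sma(closes, short), sma(closes, long)
    signals = ["hold"] * len(closes)
    for i in range(1, len(closes)):
        if None in (fast[i - 1], slow[i - 1], fast[i], slow[i]):
            continue  # not enough history for both averages yet
        if fast[i - 1] <= slow[i - 1] and fast[i] > slow[i]:
            signals[i] = "buy"
        elif fast[i - 1] >= slow[i - 1] and fast[i] < slow[i]:
            signals[i] = "sell"
    return signals


# Toy price path: falls, rallies, then falls again.
closes = [5, 4, 3, 2, 1, 2, 3, 4, 5, 4, 3, 2, 1]
signals = sma_crossover_signals(closes, short=2, long=3)
```

The MACD baseline follows the same crossover pattern, with the MACD line and its signal line in place of the two moving averages.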

        Evaluation Metrics

        + +
        [Figure: (a) Cumulative Returns on AAPL. (b) TradingAgents Transactions for AAPL; Green / Red Arrows for Long / Short Positions.]
        + + +
        + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
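        | Categories | Models | AAPL CR%↑ | AAPL ARR%↑ | AAPL SR↑ | AAPL MDD%↓ | GOOGL CR%↑ | GOOGL ARR%↑ | GOOGL SR↑ | GOOGL MDD%↓ | AMZN CR%↑ | AMZN ARR%↑ | AMZN SR↑ | AMZN MDD%↓ |
        | Market | B&H | -5.23 | -5.09 | -1.29 | 11.90 | 7.78 | 8.09 | 1.35 | 13.04 | 17.1 | 17.6 | 3.53 | 3.80 |
        | Rule-based | MACD | -1.49 | -1.48 | -0.81 | 4.53 | 6.20 | 6.26 | 2.31 | 1.22 | - | - | - | - |
        | Rule-based | KDJ&RSI | 2.05 | 2.07 | 1.64 | 1.09 | 0.4 | 0.4 | 0.02 | 1.58 | -0.77 | -0.76 | -2.25 | 1.08 |
        | Rule-based | ZMR | 0.57 | 0.57 | 0.17 | 0.86 | -0.58 | 0.58 | 2.12 | 2.34 | -0.77 | -0.77 | -2.45 | 0.82 |
        | Rule-based | SMA | -3.2 | -2.97 | -1.72 | 3.67 | 6.23 | 6.43 | 2.12 | 2.34 | 11.01 | 11.6 | 2.22 | 3.97 |
        | Ours | TradingAgents | 26.62 | 30.5 | 8.21 | 0.91 | 24.36 | 27.58 | 6.39 | 1.69 | 23.21 | 24.90 | 5.60 | 2.11 |
        | Improvement(%) | | 24.57 | 28.43 | 6.57 | - | 16.58 | 19.49 | 4.26 | - | 6.10 | 7.30 | 2.07 | - |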
        +

        Table 1: TradingAgents: Comparison of Performance Metrics across AAPL, GOOGL, and AMZN.

        + +

        Sharpe Ratio

        +

        TradingAgents delivers the strongest risk-adjusted returns of all methods compared, outperforming every baseline on AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60 and exceeding the next-best model on each stock by at least 2.07 points. This result underscores TradingAgents' effectiveness in balancing returns against risk, a critical property for sustainable and predictable investment growth. By outperforming both the Buy-and-Hold market benchmark and the rule-based strategies (KDJ&RSI, SMA, MACD, and ZMR), TradingAgents demonstrates adaptability and robustness across the three stocks tested. Its ability to raise returns while keeping risk exposure controlled establishes a solid foundation for multi-agent and debate-based automated trading algorithms.
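For readers who want to reproduce the metric, a Sharpe Ratio can be computed from a daily portfolio-value series as sketched below. This assumes the common annualized definition (mean daily excess return divided by its sample standard deviation, scaled by the square root of 252 trading days) and a zero risk-free rate by default; the exact settings behind Table 1 are not stated in this section, so the function names and defaults here are assumptions.

```python
import math
import statistics


def daily_returns(values):
    """Simple day-over-day returns of a portfolio-value series."""
    return [values[i] / values[i - 1] - 1.0 for i in range(1, len(values))]


def sharpe_ratio(values, risk_free_annual=0.0, periods_per_year=252):
    """Annualized Sharpe Ratio; needs at least three values (two returns)."""
    rf_per_period = risk_free_annual / periods_per_year
    excess = [r - rf_per_period for r in daily_returns(values)]
    spread = statistics.stdev(excess)   # sample standard deviation
    if spread == 0:
        return float("nan")             # undefined when returns never vary
    return statistics.mean(excess) / spread * math.sqrt(periods_per_year)


# Toy series with returns of +10%, -10%, +10%.
example_sharpe = sharpe_ratio([100.0, 110.0, 99.0, 108.9])
```

Short evaluation windows can make annualized Sharpe Ratios look large, so values computed over a few months are best compared only against baselines measured over the same window.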

        + +

        Maximum Drawdown

        +

        Rule-based baselines controlled risk well, as reflected by their low maximum drawdown scores, but fell short in capturing high returns. This trade-off between risk and reward underscores TradingAgents' strength as a balanced approach. Although higher returns are typically associated with higher risk, TradingAgents maintained a low maximum drawdown relative to many baselines. Its risk-control mechanisms, driven by the debates among risk-control agents, kept the maximum drawdown within a manageable limit, at most 2.11% across the three stocks (Table 1). This demonstrates TradingAgents' capability to balance maximizing returns against managing risk.
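Maximum drawdown and cumulative return can both be read off the same portfolio-value series. The sketch below uses the standard peak-to-trough definition of drawdown; the function names are chosen for this example and are not taken from the authors' code.

```python
def max_drawdown_pct(values):
    """Largest peak-to-trough decline of the series, in percent of the peak."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)                    # running high-water mark
        worst = max(worst, (peak - value) / peak)  # decline from that mark
    return 100.0 * worst


def cumulative_return_pct(values):
    """Total return from the first to the last value, in percent."""
    return 100.0 * (values[-1] / values[0] - 1.0)


# Toy series: rises to 120, falls to 90, recovers to 130.
series = [100.0, 120.0, 90.0, 130.0]
mdd = max_drawdown_pct(series)
cr = cumulative_return_pct(series)
```

In this toy series the strategy ends 30% up while still having suffered a 25% drawdown along the way, which is why the two metrics are reported side by side.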

        + +

        Explainability

        +

        A significant drawback of current deep learning methods for trading is their dense and complex architectures, which often render the decisions made by trading agents indecipherable to humans. This challenge, rooted in the broader issue of AI explainability, is particularly critical for trading agents, as they operate in real-world financial markets, often involving substantial sums of money where incorrect decisions can lead to severe consequences and losses.

        + +

        In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.

        +
        +
      +
      +
      + + +
      +
      +
      +
      +

      Conclusion

      +
      +

      In this paper, we introduced TradingAgents, an LLM-agent-powered stock trading framework that simulates a realistic trading firm environment with multiple specialized agents engaging in agentic debates and conversations. Leveraging the capabilities of LLMs to process and analyze diverse data sources, the framework enables informed trading decisions while utilizing multi-agent interactions to enhance performance through comprehensive reasoning and debate before acting. By integrating agents with distinct roles and risk profiles, along with a reflective agent and a dedicated risk management team, TradingAgents significantly improves trading outcomes and risk management compared to baseline models. Additionally, the collaborative nature of these agents ensures adaptability to varying market conditions. Extensive experiments demonstrate that TradingAgents outperforms traditional trading strategies and baselines in cumulative return, Sharpe ratio, and other critical metrics. Future work will focus on deploying the framework in a live trading environment, expanding agent roles, and incorporating real-time data processing to enhance performance further.

      +
      +
      +
      +
      +
      + + + + + From c252bbb155a09619f5936dba3729f34bea759ae0 Mon Sep 17 00:00:00 2001 From: Yijia-Xiao Date: Sun, 29 Dec 2024 15:04:43 +0800 Subject: [PATCH 11/56] Brief --- index.html | 160 ++++++++++++++++++++++++----------------------------- 1 file changed, 73 insertions(+), 87 deletions(-) diff --git a/index.html b/index.html index 1815043e..fc686cb7 100644 --- a/index.html +++ b/index.html @@ -78,6 +78,7 @@
    @@ -92,7 +93,7 @@

    Abstract

    -

    Societies of LLM-powered agents have advanced automated problem-solving, particularly in finance. Yet, most frameworks don’t replicate the collaborative workflows of real trading firms. TradingAgents addresses this gap by assigning specialized LLM-powered agents—analysts, researchers, traders, and risk managers—to simulate a dynamic, team-based environment. These agents collaborate through debates, structured outputs, and risk checks. Experiments show that TradingAgents significantly improves key performance metrics over baseline models, highlighting the promise of multi-agent LLM frameworks in financial trading.

    +

    We introduce TradingAgents, a novel stock trading framework inspired by trading firms, utilizing multiple LLM-powered agents with specialized roles such as fundamental, sentiment, and technical analysts, as well as traders with diverse risk profiles. The system features Bull and Bear researchers evaluating market conditions, a risk management team overseeing exposure, and traders integrating insights from debates and historical data to make informed decisions. This collaborative, dynamic environment enhances trading performance, as demonstrated by our comprehensive experiments showing significant improvements in cumulative returns, Sharpe ratio, and maximum drawdown compared to baseline models. Our results highlight the effectiveness of multi-agent LLM frameworks in financial trading.

    @@ -105,53 +106,17 @@

    Introduction

    -

    Autonomous agents equipped with Large Language Models (LLMs) can mimic human problem-solving in finance—an intricate domain shaped by fundamentals, market sentiment, and macro factors. While deep learning models have long struggled with explainability, LLM-based systems show promise by pairing structured reasoning with interpretability. However, current solutions often lack organizational realism and rely on purely conversational interfaces susceptible to context loss.

    -

    TradingAgents fills these gaps by emulating the multi-agent decision-making processes of trading firms. The framework includes fundamental, sentiment, news, and technical analysts, along with bullish and bearish researchers, traders, and a risk management team. They coordinate using structured documents and concise dialogues. Our architecture leverages specialized LLM roles, combining clarity with deeper debates. Through extensive evaluations, TradingAgents delivers robust performance across multiple assets, validating the importance of multi-agent collaboration for real-world trading systems.

    -
    -
    - - - - -
    -
    -
    -
    -

    Related Work

    - -

    LLMs as Financial Assistants

    -
    -

    Specialized LLMs in finance have improved domain understanding via fine-tuning or from-scratch training on financial corpora (e.g., FinGPT, BloombergGPT). These models often excel at classification tasks but face challenges in generative quality compared to powerful general-purpose models like GPT-4.

    +

    Autonomous agents powered by Large Language Models (LLMs) revolutionize decision-making by mimicking human workflows across various domains, including finance. Unlike traditional algorithmic trading systems that rely on quantitative models, LLMs excel in processing and understanding natural language data, making them adept at analyzing news, reports, and social media sentiment. Recent multi-agent LLM frameworks in finance have shown promise in creating explainable AI systems, enhancing decision transparency and reasoning.

    - Fine-Tuned LLMs for Finance -

    Fine-tuning boosts performance on tasks such as financial sentiment analysis. Examples include PIXIU (FinMA) and Instruct-FinGPT. They outperform generic open-source LLMs but still lag behind top-tier proprietary models in some generative tasks.

    +

    However, existing applications face two main challenges:

    - Finance LLMs Trained from Scratch -

    Models like BloombergGPT and XuanYuan 2.0 blend general corpora with specialized financial data, delivering strong domain-specific results. While they may not match larger closed-source models, they remain competitive among open-source counterparts.

    + Organizational Modeling: Current frameworks often overlook the complex interactions that mirror real trading firms, focusing instead on isolated tasks. This limits their ability to replicate effective trading practices. -
    - TradingAgents Overall Framework Organization -
    Figure 1: TradingAgents Overall Framework Organization. I. Analysts Team: Four analysts concurrently gather relevant market information. II. Research Team: The team discusses data. III. Trader: Makes final decisions using debates and history. IV. Risk Management Team: Monitors risk. V. Fund Manager: Approves and executes trades.
    -
    -
    - -

    LLMs as Traders

    -
    -

    LLMs directly executing trades often rely on news-driven or reasoning-driven prompts, sometimes enhanced by reinforcement learning. Debate and reflection modules help overcome hallucinations and bolster factual accuracy.

    + Communication Efficiency: Reliance on unstructured natural language communication leads to information loss and context degradation over lengthy interactions, reducing effectiveness in complex tasks.

    - News-Driven Agents -

    These agents use market news to gauge sentiment. Both closed-source (GPT-4) and open-source (Qwen) models show promising gains via simple sentiment-driven strategies.

    +

    Our framework addresses these issues by simulating the multi-agent decision-making processes of professional trading teams. We employ specialized agents with distinct roles inspired by real trading firms, coupled with a structured communication protocol that combines clear, structured outputs with natural language dialogue. This hybrid approach ensures precise and efficient interactions, enabling robust decision-making.

    - Reasoning-Driven Agents -

    Frameworks like FinMem and TradingGPT integrate multi-round reasoning, reflection, and debates between agents with different stances, enabling more robust trading signals.

    - - Reinforcement Learning-Driven Agents -

    RL aligns LLM outputs with backtest rewards, often leveraging memorized states and technical signals to refine decision-making.

    -
    - -

    LLMs as Alpha Miners

    -
    -

    Some frameworks focus on generating alpha factors rather than final trades. Systems like QuantAgent and AlphaGPT iteratively refine alpha scripts through feedback from an LLM-based judge and real-market performance, accelerating systematic strategy development.

    +

    We validate TradingAgents using historical financial data, comparing its performance against several baselines through metrics like cumulative return, Sharpe ratio, and maximum drawdown, demonstrating its superior effectiveness.

    @@ -164,28 +129,28 @@

    TradingAgents: Role Specialization

    -

    TradingAgents assigns each LLM agent a clear role. This mirrors how real trading firms split responsibilities—e.g., fundamental, sentiment, news, and technical analysts gather data, while researchers balance bullish and bearish arguments. A trader synthesizes these inputs, and risk managers ensure exposures stay within safe limits. This structured approach fosters comprehensive coverage of market signals.

    +

    Assigning specific roles to LLM agents allows complex trading objectives to be broken down into manageable tasks. Inspired by trading firms, TradingAgents features seven distinct roles: Fundamentals Analyst, Sentiment Analyst, News Analyst, Technical Analyst, Researcher, Trader, and Risk Manager. Each agent is equipped with specialized tools and constraints tailored to their function, ensuring comprehensive market analysis and informed decision-making.

    Analyst Team

    -

    The analyst team (Figure 2) covers fundamental, sentiment, news, and technical aspects. Each member focuses on different market signals, providing the basis for research and trading decisions.

    +

    The Analyst Team gathers and analyzes market data across various domains:

    +
      +
    • Fundamental Analysts: Assess company fundamentals to identify undervalued or overvalued stocks.
    • +
    • Sentiment Analysts: Analyze social media and public sentiment to gauge market mood.
    • +
    • News Analysts: Evaluate news and macroeconomic indicators to predict market movements.
    • +
    • Technical Analysts: Use technical indicators to forecast price trends and trading opportunities.
    • +
    +

    Combined, their insights provide a holistic market view, feeding into the Researcher Team for further evaluation.

    TradingAgents Analyst Team
    Figure 2: TradingAgents Analyst Team
    - -
      -
    • Fundamental Analysts: Evaluate intrinsic value via earnings, balance sheets, etc.
    • -
    • Sentiment Analysts: Analyze social media and public sentiment data.
    • -
    • News Analysts: Track macro events, economic indicators, and other critical news.
    • -
    • Technical Analysts: Calculate metrics like MACD/RSI to identify trends and patterns.
    • -

    Researcher Team

    -

    (Figure 3) Bullish and bearish researchers debate the analysts’ findings, challenging each other’s viewpoints to produce a balanced outcome.

    +

    The Researcher Team critically evaluates analyst data through a dialectical process involving bullish and bearish perspectives. This debate ensures balanced analysis, identifying both opportunities and risks to inform trading strategies.

    @@ -202,37 +167,48 @@
    - TradingAgents Risk Management Team and Fund Manager Approval Workflow -
    Figure 5: TradingAgents Risk Management and Fund Manager Workflow
    + TradingAgents Risk Management Team Workflow +
    Figure 5: TradingAgents Risk Management Workflow
      -
    • Bullish Researchers: Highlight favorable signals and positive growth opportunities.
    • -
    • Bearish Researchers: Emphasize caution, identifying risks or negative signals.
    • +
    • Bullish Researchers: Highlight positive market indicators and growth potential.
    • +
    • Bearish Researchers: Focus on risks and negative market signals.
    + +

    This process ensures a balanced understanding of market conditions, aiding Trader Agents in making informed decisions.

    Trader Agents

    -

    (Figure 4) Trader agents synthesize all insights to form buy/sell decisions, weighing returns against potential downside.

    +

    Trader Agents execute decisions based on comprehensive analyses. They evaluate insights from analysts and researchers to determine optimal trading actions, balancing returns and risks in a dynamic market environment.

    +
      -
    • Review data from analysts and researchers.
    • -
    • Determine optimal trade timing and size.
    • -
    • Execute orders and manage portfolios.
    • +
    • Assessing analyst and researcher recommendations.
    • +
    • Determining trade timing and size.
    • +
    • Executing buy/sell orders.
    • +
    • Adjusting portfolios in response to market changes.
    + +

    Precision and strategic thinking are essential for their role in maximizing performance.

    Risk Management Team

    -

    (Figure 5) Risk managers ensure safety by evaluating volatility, liquidity, and other exposures. They enforce stop-loss measures and signal portfolio rebalancing when necessary.

    +

    The Risk Management Team oversees the firm's exposure to market risks, ensuring trading activities stay within predefined limits.

    +
      -
    • Monitor market risk factors.
    • -
    • Adjust trading strategies to stay within risk limits.
    • -
    • Collaborate with traders to manage drawdowns.
    • +
    • Assessing market volatility and liquidity.
    • +
    • Implementing risk mitigation strategies.
    • +
    • Advising Trader Agents on risk exposures.
    • +
    • Aligning portfolio with risk tolerance.
    -

    All agents follow a ReAct-style prompting framework. Their actions—like research, debate, or trade execution—are tracked in a shared environment, creating a cohesive multi-agent ecosystem reminiscent of real trading firms.

    + +

    They ensure financial stability and safeguard assets through effective risk control.

    + +

    All agents utilize the ReAct prompting framework, facilitating a collaborative and dynamic decision-making process reflective of real-world trading systems.

    @@ -247,18 +223,26 @@

    TradingAgents: Agent Workflow

    Communication Protocol

    -

    Relying solely on natural language can lead to “telephone effect” issues for complex, long-horizon tasks. TradingAgents introduces structured reports to preserve key details and reduce message distortion, drawing inspiration from frameworks like MetaGPT. Each agent produces or queries structured entries—concise and focused—to streamline interactions.

    +

    To enhance communication efficiency, TradingAgents employs a structured protocol that combines clear, structured outputs with natural language dialogue. This approach minimizes information loss and maintains context over long interactions, ensuring focused and effective communication among agents.
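One way to picture the structured side of this protocol is a shared state that agents publish reports to and query directly, instead of re-reading a long conversation. The sketch below is purely illustrative: the `Report` and `GlobalState` classes, their fields, and the role strings are hypothetical and are not taken from the TradingAgents code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Report:
    """Hypothetical structured entry exchanged between agents."""
    author: str                      # e.g. "news_analyst", "trader"
    kind: str                        # e.g. "analysis", "decision"
    summary: str                     # concise conclusion of the report
    evidence: List[str] = field(default_factory=list)


class GlobalState:
    """Hypothetical shared store that agents query instead of chat history."""

    def __init__(self) -> None:
        self._reports: List[Report] = []

    def publish(self, report: Report) -> None:
        self._reports.append(report)

    def query(self, kind: Optional[str] = None,
              author: Optional[str] = None) -> List[Report]:
        """Return reports matching the given filters (None matches anything)."""
        return [
            r for r in self._reports
            if (kind is None or r.kind == kind)
            and (author is None or r.author == author)
        ]


state = GlobalState()
state.publish(Report("news_analyst", "analysis", "Placeholder summary.", ["placeholder source"]))
state.publish(Report("trader", "decision", "hold", ["news_analyst report"]))
decisions = state.query(kind="decision")
```

Because each report carries its own summary and evidence, a downstream agent can fetch exactly the entries it needs, which is how a structured protocol avoids the context degradation of long unstructured dialogue.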

    Types of Agent Interactions

    -

    Instead of lengthy dialogues, TradingAgents agents exchange structured documents containing critical data. Short natural language debates occur when merging contrasting opinions (e.g., bullish vs. bearish). Key communication types include:

    +

    Unlike previous frameworks that rely heavily on unstructured dialogue, our agents communicate through structured reports and diagrams, preserving essential information and enabling direct queries from the global state.

    +
      -
    • Analyst Team: Each analyst produces specialized reports (fundamentals, sentiment, etc.).
    • -
    • Traders: Combine analyst reports into a decision signal with accompanying rationale.
    • +
    • Analyst Team: Compiles research into concise analysis reports.
    • +
    • Traders: Review analyst reports and produce decision signals with detailed rationales.
    • +
    + +

    Natural language dialogue is reserved for specific interactions, such as debates within the Researcher and Risk Management teams, fostering deeper reasoning and balanced decision-making.

    + +
      +
    • Researcher Team: Engages in debates to form balanced perspectives.
    • +
    • Risk Management Team: Deliberates on trading plans from multiple risk perspectives.
    • +
    • Fund Manager: Reviews and approves risk-adjusted trading decisions.
    -

    Debates among researchers or risk managers occur in natural language but are recorded as structured entries. This approach maintains clarity while enabling multi-round reasoning.

    Backbone LLMs

    -

    We employ both “quick-thinking” and “deep-thinking” LLMs, choosing models based on complexity and speed requirements. Analysts and traders use robust reasoning models for decision-making, while simpler tasks (e.g., data retrieval) rely on faster LLMs. This modular design, requiring no GPUs, allows easy swapping of different local or API-based models and ensures future scalability.

    +

    We select LLMs based on task requirements, using quick-thinking models for data retrieval and deep-thinking models for in-depth analysis and decision-making. This strategic alignment ensures efficiency and robust reasoning, allowing TradingAgents to operate without the need for GPUs and enabling easy integration of alternative models in the future.

    @@ -271,25 +255,27 @@

    Experiments

    -

    We evaluate our framework on multi-asset data spanning a realistic time period, combining historical prices, news, social sentiment, insider transactions, and more. Baselines include traditional strategies like Buy-and-Hold, MACD, and SMA, ensuring a fair comparison.

    +

    We evaluated TradingAgents using a comprehensive experimental setup to assess its performance against various baselines.

    Back Trading

    -

    Our dataset includes stocks like Apple and Google, daily news, social media sentiment, and technical indicators. Agents process only the data available up to each trading day, avoiding look-ahead bias.

    +

    Our simulation utilized a multi-asset, multi-modal financial dataset including historical stock prices, news articles, social media sentiments, insider transactions, financial reports, and technical indicators from January to March 2024.

    Simulation Setup

    -

    The simulation runs from June 19, 2024, to November 19, 2024. TradingAgents autonomously generates buy, sell, or hold signals, then records performance metrics. This daily cycle repeats for each asset under study.

    +

    The trading environment spanned from June to November 2024. Agents operated on a daily basis, making decisions based on available data without future information, ensuring unbiased results.

    Baseline Models

    -

    We benchmark against several baselines:

    +

    We compared TradingAgents against the following strategies:

    +
      -
    • Buy and Hold
    • -
    • MACD
    • -
    • KDJ and RSI
    • -
    • ZMR
    • -
    • SMA
    • +
    • Buy and Hold: Investing equally across selected stocks throughout the period.
    • +
    • MACD: Momentum strategy based on MACD crossovers.
    • +
    • KDJ & RSI: Combined momentum indicators for trading signals.
    • +
    • ZMR: Mean reversion strategy based on price deviations.
    • +
    • SMA: Trend-following strategy using moving average crossovers.

    Evaluation Metrics

    +
    @@ -307,7 +293,7 @@
    -
    +
    @@ -401,16 +387,16 @@
    -

    Table 1: TradingAgents: Comparison of Performance Metrics across AAPL, GOOGL, and AMZN.

    +

    Table 1: TradingAgents: Performance Metrics Comparison across AAPL, GOOGL, and AMZN.

    Sharpe Ratio

    -

    TradingAgents consistently beats all baselines in risk-adjusted returns, showing Sharpe Ratios of at least 5.60 and surpassing the nearest competitors by at least 2.07 points. Its adaptability and robust debate mechanism enable high returns with controlled risk.

    +

    TradingAgents achieves superior risk-adjusted returns, consistently outperforming all baselines across AAPL, GOOGL, and AMZN. The enhanced Sharpe Ratios demonstrate the framework's effectiveness in balancing returns with risk, highlighting its robustness in diverse market conditions.

    Maximum Drawdown

    -

    Rule-based baselines limit downside but sacrifice overall returns. TradingAgents balances both, keeping maximum drawdown at or below 2.11% while generating superior returns, aided by dedicated risk-control agents.

    +

    While rule-based strategies excel in controlling risk, TradingAgents maintains a low maximum drawdown without sacrificing high returns. This balance underscores the framework's ability to maximize profits while effectively managing risk.

    Explainability

    -

    Unlike dense deep-learning models, TradingAgents provides transparent logs of its ReAct-style reasoning for every trade decision. This approach greatly enhances human interpretability, facilitating debugging and fine-tuning in real markets.

    +

    Unlike traditional deep learning models, TradingAgents offers transparent decision-making through natural language explanations. Each agent's actions are accompanied by detailed reasoning and tool usage, making the system's operations easily interpretable and debuggable, which is crucial for real-world financial applications.

    @@ -423,7 +409,7 @@

    Conclusion

    -

    We introduced TradingAgents, a multi-agent LLM trading framework inspired by professional trading firms. Its specialized analysts, researcher debates, and risk management teams create a rich decision-making ecosystem. By effectively combining structured reports and targeted dialogues, TradingAgents exceeds baseline performance across returns, Sharpe ratio, and drawdown metrics. Future work will explore live trading, expanded agent roles, and real-time data integration for even more refined trading outcomes.

    +

    We presented TradingAgents, a multi-agent LLM-driven stock trading framework that emulates a realistic trading firm with specialized agents collaborating through debates and structured communication. Our framework leverages diverse data sources and multi-agent interactions to enhance trading decisions, achieving superior performance in cumulative returns, Sharpe ratio, and risk management compared to traditional strategies. Future work includes live deployment, expanding agent roles, and integrating real-time data processing to further improve performance.

    From 413d9ecbcfa960420ebda55f97372a2b638097f4 Mon Sep 17 00:00:00 2001 From: Yijia-Xiao Date: Tue, 31 Dec 2024 11:58:35 +0800 Subject: [PATCH 12/56] Paper and Code links --- index.html | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/index.html b/index.html index fc686cb7..c62b53a8 100644 --- a/index.html +++ b/index.html @@ -51,7 +51,7 @@ @@ -76,8 +76,8 @@
    @@ -419,8 +419,8 @@
    - - + +
    From 1a2074b60ec89908044d543badadb89983cd1390 Mon Sep 17 00:00:00 2001 From: Yijia Xiao <48253104+Yijia-Xiao@users.noreply.github.com> Date: Mon, 20 Jan 2025 21:21:43 -0800 Subject: [PATCH 13/56] Update README.md --- README.md | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/README.md b/README.md index d449d623..015891a9 100644 --- a/README.md +++ b/README.md @@ -3,3 +3,21 @@ > MARW Workshop, AAAI 2025 > > Homepage: https://TradingAgents-AI.github.io/ + + +**Update** +We will release the framework by providing source code or in the form of API recently. Please keep tuned and thanks for your patience! + + +## Citation + +Please reference our work if you find *TradingAgents* provides you with some help :) + +``` +@article{xiao2024tradingagents, + title={TradingAgents: Multi-Agents LLM Financial Trading Framework}, + author={Xiao, Yijia and Sun, Edward and Luo, Di and Wang, Wei}, + journal={arXiv preprint arXiv:2412.20138}, + year={2024} +} +``` From 5986f61283f05d37a68f15e94f37e2a2cad822dc Mon Sep 17 00:00:00 2001 From: Yijia Xiao <48253104+Yijia-Xiao@users.noreply.github.com> Date: Mon, 20 Jan 2025 21:25:41 -0800 Subject: [PATCH 14/56] Update index.html --- index.html | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/index.html b/index.html index c62b53a8..7bfafcb3 100644 --- a/index.html +++ b/index.html @@ -416,6 +416,26 @@
    +
    +
    +
    +
    +

    Citation

    +
    +

    Please cite our work if you find TradingAgents helpful :)

    + +
    @article{xiao2024tradingagents,
    +  title={TradingAgents: Multi-Agents LLM Financial Trading Framework},
    +  author={Xiao, Yijia and Sun, Edward and Luo, Di and Wang, Wei},
    +  journal={arXiv preprint arXiv:2412.20138},
    +  year={2024}
    +}
    +
    +
    +
    +
    +
    +