AI Trading Tests Show Big Losses
Recent public contests designed to test AI trading skills have produced poor results, suggesting that artificial intelligence is not yet ready to replace human fund managers in active trading. In tests like the Alpha Arena, eight major AI systems, including Google's Gemini and OpenAI's ChatGPT, were given $10,000 each to trade U.S. tech stocks for two weeks. Across these tests, the AI portfolios lost about a third of their initial capital. Out of 32 trading scenarios, only six were profitable. A major issue was over-trading: given the same instructions, one AI made 1,418 trades while another made only 158. This showed a lack of discipline and an inability to manage position size or timing.
AI's Struggles Mirror Human Fund Managers
This performance mirrors that of most human fund managers, who also struggle to beat market indexes. The AI models showed distinct 'personalities,' with some favoring long positions, others short-selling, and some using high leverage, so their outputs required active management, much like those of human analysts. Jay Azhang, founder of Nof1, which ran the Alpha Arena, noted that AI models need 'a very sophisticated setup and data platform to even have a chance.' This highlights the need for extensive infrastructure, not just the AI model itself. While AI is good at spotting patterns and processing huge amounts of data, these public tests show its current limits in understanding market subtleties, timing trades, and managing risk.
Real AI Use: Inside Firms, Not Public Tests
In contrast to these experimental failures, established financial institutions are adopting AI cautiously, often to assist humans rather than to fully automate trading desks. Firms like JPMorgan Chase & Co. and Balyasny Asset Management use AI for tasks like parsing news, drafting memos, and detecting fraud, but humans still manage trading. Hedge funds and proprietary trading firms also use AI for research, finding trading signals, and optimizing execution, but typically under strict oversight. For example, Man Group's AlphaGPT generates and tests trading ideas, but its output requires human review. AI has shown more reliable success in focused, data-heavy tasks, such as predicting the direction of earnings estimates, where OpenAI's ChatGPT was accurate 68% of the time in Q4 2025. New benchmarks are also appearing to evaluate AI's skills at specific financial tasks.
Why Autonomous AI Trading Faces Hurdles
Several critical challenges prevent the widespread use of AI for direct trading. A key problem in testing is 'lookahead bias,' where an AI model has access during a simulation to information about events that had not yet happened at the simulated date. This makes backtested results unreliable, so credible evaluation requires live market testing. Furthermore, AI trading systems must be production-ready, which means attention to speed, broker connections, and monitoring, not just raw model intelligence. Experts suggest that any AI trading bot showing a lasting edge is probably operating in secret inside exclusive trading firms, protected as a proprietary technique. This implies that publicly tested models are far from replicating institutional success. Studies indicate that even AI models with strong comprehension skills can struggle to retrieve information from complex documents, a vital step in financial analysis. Also, many AI models are 'black boxes,' meaning their decision-making process is opaque, which raises serious risk management concerns. The trade-off between cost and performance remains another significant challenge.
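The lookahead-bias problem described above can be illustrated with a small sketch. It assumes the bias arises because a model's training data covers part of the simulated period, and it simply separates trading days the model could have seen from days it could not. The cutoff date, the function name, and the sample dates are all hypothetical and do not come from any of the tests discussed here.

```python
from datetime import date

# Hypothetical training cutoff for the model under test (an assumption for illustration).
MODEL_TRAINING_CUTOFF = date(2024, 6, 30)


def split_by_cutoff(trading_days, cutoff):
    """Separate days the model may have seen in training from days it could not have seen."""
    contaminated = [d for d in trading_days if d <= cutoff]
    clean = [d for d in trading_days if d > cutoff]
    return contaminated, clean


# Made-up sample of simulated trading days straddling the cutoff.
days = [date(2024, 6, 27), date(2024, 6, 28), date(2024, 7, 1), date(2024, 7, 2)]
contaminated, clean = split_by_cutoff(days, MODEL_TRAINING_CUTOFF)

print("Possibly seen in training:", contaminated)
print("Safe for evaluation:", clean)
```

Under this assumption, only results on the second group of days say anything about real trading skill, which is one reason live market testing is treated as the more reliable yardstick.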
AI in Finance: Helping Humans, Not Replacing Them
The current public AI trading experiments, while informative, are too limited in scope and duration to support firm conclusions about AI's ultimate trading potential. These tests often lack access to proprietary research and have weaker execution capabilities than institutional players. While AI is clearly changing the financial industry by improving research, automating tasks, and providing advanced analysis, its role in direct trading appears to be evolving gradually. The general view is that AI will act as a powerful tool to boost human judgment and creativity, helping analysts and portfolio managers process more data, ask better questions, and make smarter decisions, rather than fully replacing human oversight and strategy. True AI trading success, when it arrives, will likely appear as an invisible, proprietary advantage within sophisticated quantitative funds, far from public view.
