
Bitcoin Magazine
Alpha Arena reveals the shortcomings of AI trading: Western models lose 80% of their capital in one week
Can artificial intelligence trade cryptocurrencies? Jay Azhang, a computer engineer and finance bro from New York, is putting this question to the test with Alpha Arena. The project pits leading LLMs against each other, each with $10,000 in capital, to see which can make the most money trading cryptocurrency. Models include Grok 4, Claude Sonnet 4.5, Gemini 2.5 Pro, ChatGPT 5, DeepSeek v3.1, and Qwen3 Max.
Now, you’re probably thinking, “Wow, that’s a great idea!” And you’d be surprised to learn that, at the time of writing, three out of five AI systems are underwater, with Qwen3 and DeepSeek – the two open source Chinese models – leading the charge.

That’s right, the Western world’s most powerful closed-source AI systems run by giants like Google and OpenAI have lost more than $8,000, or 80%, of their cryptocurrency trading capital in just over a week, while their Eastern open-source counterparts are in the green.
Most successful trade so far? Qwen3 – hydrated and on track – with a minor 20x long position in BTC. Grok 4 – to no one’s surprise – has been long Doge with 10x leverage for most of the competition. It was at one point at the top of the charts alongside DeepSeek, and is now close to 20% underwater. Maybe Elon Musk should tweet a funny tweet or something, you know, to get his puppy out of the doghouse.

Meanwhile, Google’s Gemini is relentlessly bearish, shorting every crypto asset available for trading, a position that reflects Google’s general crypto policy over the past 15 years.
And last but not least, ChatGibitty, which has made every bad trade possible for a week in a row. An amazing achievement! It takes skill to be this bad, especially when Qwen3 simply went long Bitcoin and went fishing. If this is the best closed source AI can offer, maybe OpenAI should keep it closed source and spare us.
A new benchmark for artificial intelligence
All kidding aside, the idea of pitting AI models against each other in the cryptocurrency trading arena has some profound implications. For starters, AI cannot be pre-trained on the answers to a cryptocurrency trading test, because the market is unpredictable. Other benchmarks suffer from exactly this problem: many AI models are given the answers to some of these tests in their training, so of course they perform well when tested. But some research has shown that small changes in some of these tests lead to radically different results for AI benchmarks.
This controversy raises the following question: What is the ultimate test of intelligence? Well, according to Elon Musk, Iron Man fan and creator of Grok 4, predicting the future is the ultimate measure of intelligence.
And let’s face it, there is no future more uncertain than the price of cryptocurrencies in the short term. As Azhang puts it, “Our goal with Alpha Arena is to make benchmarks more like the real world, and markets are perfect for this. They are dynamic, adversarial, open-ended and endlessly unpredictable. They challenge AI in ways that static benchmarks cannot. Markets are the ultimate test of intelligence.”
This vision of markets is deeply rooted in the libertarian principles from which Bitcoin was born. In the twentieth century, economists such as Murray Rothbard and Milton Friedman argued that markets are fundamentally unpredictable by central planners, and that only individuals who make real economic decisions and have something to lose are capable of making rational economic calculations.
In other words, the market is the hardest thing to predict because it depends on the individual viewpoints and decisions of smart individuals all over the world, and therefore it is the best test of intelligence.
Azhang states in his project description that the AI systems are instructed to trade not just for gains, but for risk-adjusted returns. This dimension of risk is crucial, as one bad trade can wipe out all previous returns, as we saw, for example, in the Grok 4 portfolio collapse.
Another open question is whether these models learn from their experience trading cryptocurrencies, something that is not technically easy to achieve, since pre-training AI models is very expensive in the first place. They can be fine-tuned on their own or other people’s trading history, and they may keep recent trades in their short-term memory or context window, but this can only take them so far. Ultimately, a suitable AI trading model may have to truly learn from its own experiences, a technology that has recently been heralded in academic circles but has a long way to go before becoming a product. MIT calls them self-adaptive AI models.
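To make “risk-adjusted” concrete, here is a minimal sketch using the Sharpe ratio, a common way to score returns against their volatility. The project description does not say which measure Alpha Arena uses, so the choice of Sharpe ratio, the function name, and the two sample return series are all assumptions made for illustration.

```python
import statistics

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation.

    `returns` is a list of per-period returns (0.01 means +1%).
    `risk_free` is a hypothetical per-period risk-free rate.
    """
    excess = [r - risk_free for r in returns]
    spread = statistics.stdev(excess)
    if spread == 0:
        raise ValueError("zero volatility: the ratio is undefined")
    return statistics.mean(excess) / spread

# Two made-up traders with the same average return per period.
steady = [0.01, 0.012, 0.008, 0.011, 0.009]
volatile = [0.30, -0.25, 0.40, -0.35, -0.05]

print("steady  :", round(sharpe_ratio(steady), 2))
print("volatile:", round(sharpe_ratio(volatile), 2))
```

Both series average the same return, but the volatile one scores far lower, which is the sense in which a trader can make money and still be judged poorly on a risk-adjusted basis.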
How do we know it’s not just luck?
Another critique of the project and its results so far is that they may be indistinguishable from a “random walk.” A random walk is like rolling the dice for each decision. What would that look like on a chart? Well, there actually is a simulator you can use to answer this question; it wouldn’t actually look very different.
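For readers who want to try this themselves, the following is a toy sketch of a random-walk trader, not the simulator mentioned above. It assumes each decision is a fair coin flip that moves the account up or down by a fixed percentage; the function name, the 2% step size, and the number of steps are hypothetical.

```python
import random

def random_walk_equity(start=10_000.0, steps=200, step_pct=0.02, seed=None):
    """Return the equity curve of a trader who flips a coin at every step.

    Each step multiplies the account by (1 + step_pct) or (1 - step_pct)
    with equal probability. No skill is involved anywhere.
    """
    rng = random.Random(seed)
    equity = [start]
    for _ in range(steps):
        direction = 1 if rng.random() < 0.5 else -1
        equity.append(equity[-1] * (1 + direction * step_pct))
    return equity

# Six coin-flipping "traders", one per seed, each starting with $10,000.
for trader in range(6):
    path = random_walk_equity(seed=trader)
    print(f"random trader {trader}: final equity ${path[-1]:,.2f}")
```

Running it a few times with different seeds typically gives some accounts in the green and some underwater, even though every one of them is trading purely at random.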

This issue of luck in the markets has also been described very carefully by intellectuals such as Nassim Taleb in his book Antifragile. In it, he argues that, from a statistical point of view, it is completely normal and possible for a single trader, like Qwen3 in this case, to be lucky for an entire week in a row, which creates the appearance of superior reasoning. Taleb goes much further, arguing that there are enough traders on Wall Street that one could easily get lucky for 20 years in a row, and gain such a reputation that everyone around him assumes that trader is just a genius, until, of course, the luck runs out.
Thus, for Alpha Arena to produce valuable data, it would have to run for a long time, and its patterns and results would need to be replicated independently as well, with real capital at stake, before they could be distinguished from a random walk.
Ultimately, it’s great to see cost-effective, open source models like DeepSeek outperforming their closed source counterparts so far. Alpha Arena has been a great source of entertainment, going viral on X.com over the past week. Where it goes is anyone’s guess. We’ll have to see if the gamble its creator took, giving $50,000 to five chatbots to gamble on cryptocurrencies, eventually pays off.
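The arithmetic behind this argument can be sketched in a few lines. The model below is a simplification, not Taleb’s own calculation: it assumes every trader is a fair coin (a 50% chance of a winning period) and that traders are independent. The function name and the trader counts are hypothetical.

```python
def prob_someone_gets_lucky(traders, periods, p_win=0.5):
    """Chance that at least one of `traders` wins `periods` times in a row.

    One trader's streak has probability p_win ** periods, so the chance
    that nobody manages it is (1 - p_win ** periods) ** traders.
    """
    streak = p_win ** periods
    return 1 - (1 - streak) ** traders

# Six models, seven winning days in a row by pure chance.
print(f"{prob_someone_gets_lucky(traders=6, periods=7):.3f}")

# A hypothetical one million traders, twenty winning years in a row.
print(f"{prob_someone_gets_lucky(traders=1_000_000, periods=20):.3f}")
```

Under these assumptions a 20-year streak is nearly impossible for any one trader, yet with enough traders it becomes more likely than not that somebody has one.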
The post Alpha Arena Reveals Drawbacks of AI Trading: Western Models Lose 80% of Their Capital in One Week first appeared on Bitcoin Magazine and was written by Juan Galt.
