Limit hit after 3 hellos: where did your Claude Code quota go? A 28-day cache bug, and an official response that encourages you to "use it sparingly"

By: blockbeats|2026/04/03 13:00:04

4-17%. This is the prompt cache hit rate for Claude Code in the past month. The normal level is 97-99%.

This means that when you resume a previous session, Claude Code does not reuse the previously processed context, but instead processes everything from scratch each time, consuming credits at a rate 10 to 20 times higher than normal. You may think you are continuing a conversation, but in reality, you are starting a completely new, full-priced conversation each time.
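A rough sketch of why a collapsed hit rate is so expensive. It assumes Anthropic's published prompt-caching multipliers (cache reads billed at 0.1x the base input price, 5-minute cache writes at 1.25x) and the simplification that every prompt token that misses the cache is written back to it. The function names are illustrative, and how subscription quotas map onto these prices is not covered by the article.

```python
# Assumed billing multipliers relative to the base input-token price.
# They follow Anthropic's published prompt-caching pricing, not the article.
CACHE_READ_MULTIPLIER = 0.10
CACHE_WRITE_MULTIPLIER = 1.25


def relative_input_cost(hit_rate: float) -> float:
    """Average cost per prompt token, in units of the base input price.

    Simplification: every token that misses the cache is re-written to it.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * CACHE_READ_MULTIPLIER + (1.0 - hit_rate) * CACHE_WRITE_MULTIPLIER


def cost_ratio(bugged_hit_rate: float, healthy_hit_rate: float) -> float:
    """How many times more a prompt costs at the bugged hit rate."""
    return relative_input_cost(bugged_hit_rate) / relative_input_cost(healthy_hit_rate)


# Compare the hit rates reported in the article.
for bugged in (0.04, 0.17):
    for healthy in (0.97, 0.99):
        print(f"{bugged:.0%} vs {healthy:.0%}: {cost_ratio(bugged, healthy):.1f}x")
```

Under these assumptions the prompt-side cost comes out at roughly an order of magnitude higher, which is in line with the low end of the 10x to 20x figure; output tokens and any quota-specific weighting are not modeled.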

This data comes from independent developer ArkNill's proxy monitoring. By setting up a transparent proxy, he recorded every request between Claude Code and the Anthropic API, uncovering at least two client-side caching bugs that caused the API server to be unable to match cached conversation prefixes, forcing a full token rebuild each round.
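Prompt caching matches on an exact prefix, so a single changed block early in a request invalidates everything after it. The toy model below illustrates that behavior only; the block layout, token counts other than the 14,500-token system prompt, and the "changed session header" are hypothetical and are not a description of the actual bugs ArkNill found.

```python
from typing import Sequence

Block = tuple[str, int]  # (content, token_count)


def cached_prefix_tokens(previous: Sequence[Block], current: Sequence[Block]) -> int:
    """Tokens in the longest run of leading blocks identical in both requests."""
    matched = 0
    for old, new in zip(previous, current):
        if old != new:
            break  # the prefix match stops at the first difference
        matched += new[1]
    return matched


SYSTEM = ("system prompt", 14_500)
history = [("turn 1", 40_000), ("turn 2", 60_000)]  # hypothetical sizes

previous = [SYSTEM, *history]
healthy_resume = [SYSTEM, *history, ("new message", 500)]
# Hypothetical failure mode: something near the start changes on resume.
broken_resume = [SYSTEM, ("session header (changed)", 200), *history, ("new message", 500)]

print(cached_prefix_tokens(previous, healthy_resume))  # whole earlier conversation reused
print(cached_prefix_tokens(previous, broken_resume))   # only the system prompt reused
```

In the healthy case the entire earlier conversation is reusable; in the broken case only the system prompt matches and the rest has to be processed again.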

The graph above shows a comparison of cache hit rates across three stages. During versions v2.1.69 to v2.1.89 (the period of the bug), the standalone version's cache hit rate was only 4-17%. After fixing a critical bug in version v2.1.90, the cold start cache hit rate returned to 47-99.7%. By v2.1.91, the stable cache hit rate recovered to 97-99%.

One notable detail from the chart: the range in v2.1.90 is quite wide (47% to 99.7%) because the cache still needs to "warm up" right after a session is resumed, so hit rates are low in the first few rounds and then quickly return to normal. In the bugged versions, this warm-up never occurs: the cache hit never extends beyond the roughly 14,500 tokens of the system prompt, and the full conversation history is billed in full each time.
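A hit rate like the ones quoted here can be derived from the token counts the API reports with each response. The sketch below uses the usage field names from the Anthropic Messages API (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`); the formula is one plausible definition of "hit rate" and an assumption about how such a figure could be computed, and the sample numbers are made up apart from the 14,500-token system prompt.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Prompt-side token counts, named after the Messages API usage fields."""
    input_tokens: int                 # prompt tokens neither read from nor written to the cache
    cache_creation_input_tokens: int  # prompt tokens written to the cache
    cache_read_input_tokens: int      # prompt tokens served from the cache


def cache_hit_rate(records: list[Usage]) -> float:
    """Share of all prompt tokens that were served from the cache."""
    read = sum(r.cache_read_input_tokens for r in records)
    total = sum(
        r.input_tokens + r.cache_creation_input_tokens + r.cache_read_input_tokens
        for r in records
    )
    return read / total if total else 0.0


# Hypothetical 200,000-token request in each state.
bugged_round = Usage(input_tokens=500, cache_creation_input_tokens=185_000,
                     cache_read_input_tokens=14_500)
healthy_round = Usage(input_tokens=500, cache_creation_input_tokens=2_000,
                      cache_read_input_tokens=197_500)

print(f"bugged:  {cache_hit_rate([bugged_round]):.1%}")
print(f"healthy: {cache_hit_rate([healthy_round]):.1%}")
```

With only the system prompt served from cache, the hypothetical bugged request lands in the single digits, while the healthy one sits near 99%.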

28 Days, 20 Versions

This bug is not the type introduced in one update and fixed in the next. According to the npm registry release records, the version v2.1.69 that introduced the bug was released on March 4, and the version v2.1.90 that fixed the bug was released on April 1. There were 28 days in between, spanning 20 versions.

The timeline reveals a telling detail. After the bug was introduced on March 4, users did not immediately complain on a large scale. Complaints erupted en masse only on March 23, almost three weeks later. The reason, according to the analysis in GitHub issue #41930, is that Anthropic ran a 2x quota promotion from March 13 to 28 (doubling quotas during off-peak hours), which masked the impact of the bug. Once the promotion ended, the extra consumption caused by the cache bug was counted against the normal quota baseline again, and users' quotas "evaporated" almost instantly.

Anthropic's response was not swift. On March 26, three days after the user complaints erupted, engineer Thariq Shihipar announced on his personal X account that the peak hour limit (weekdays 5am-11am PT) had been tightened. On March 30, Anthropic admitted on Reddit that the "rate at which users hit their quota far exceeded expectations," listing it as the team's top priority. It wasn't until April 1 that team member Lydia Hallie released the official investigation findings.

Throughout the process, Anthropic did not release any blog posts, send email notifications, or update the status page. All official communication was done solely through engineers' personal social media posts and a few Reddit comments.

How Much Did You Pay, and How Long Can You Use It?

GitHub issue #41930 collected hundreds of user reports. The most extreme case was a Max 20x subscription user ($200/month), whose 5-hour rolling window was entirely consumed in 19 minutes. Max 5x users ($100/month) reported their 5-hour window was used up in 90 minutes. According to The Letter Two, some users claimed that a simple "hello" consumed 13% of their session quota. A Pro user ($20/month) on Discord mentioned that his quota "ran out every Monday and only reset on Saturday," with only 12 days of normal usage in 30 days.

Based on ArkNill's benchmark testing, in bug version v2.1.89, the 100% quota of the Max 20x plan would be depleted in about 70 minutes. He also calculated the cost of a single --resume operation for a 500K token context session, which is approximately $0.15, as the system fully replays the entire context.

"You're Holding It Wrong"

Lydia Hallie's investigation confirmed two points: first, peak-hour limits have indeed been tightened, and second, token consumption has increased for sessions using the 1 million token context. She said the team had fixed some bugs but emphasized that "none of the bugs led to overcharging."

She then offered four recommendations for conserving quota:

1. Use Sonnet 4.6 instead of Opus (Opus consumes quota at about twice the rate);

2. Lower the reasoning depth or turn off extended thinking when deep reasoning is not needed;

3. Do not resume sessions that have been idle for more than an hour; start a new one instead;

4. Set the environment variable CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000 to limit the context window size.
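The fourth recommendation can be applied per launch rather than globally. The sketch below is illustrative only: the variable name and value come from the recommendation as quoted, the helper names are hypothetical, and it assumes the `claude` executable is on the PATH. How Claude Code interprets the variable is not verified here.

```python
import os
import subprocess
from typing import Mapping, Optional


def env_with_compact_window(window_tokens: int = 200_000,
                            base: Optional[Mapping[str, str]] = None) -> dict:
    """Copy the environment and add the variable named in the recommendation."""
    env = dict(os.environ if base is None else base)
    env["CLAUDE_CODE_AUTO_COMPACT_WINDOW"] = str(window_tokens)
    return env


def launch_claude(window_tokens: int = 200_000) -> int:
    """Start Claude Code with the variable set (assumes `claude` is on PATH)."""
    completed = subprocess.run(["claude"], env=env_with_compact_window(window_tokens))
    return completed.returncode
```

Calling `launch_claude()` starts a session with the limit applied for that process only, leaving the surrounding shell environment untouched.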

No mention was made of any form of quota reset or compensation.

AI podcast host Alex Volkov summarized this response as "You're holding it wrong," pointing out that Anthropic itself set the 1 million token context as the default, promoted Opus as the flagship model, and highlighted extended thinking as a selling point, but is now advising paying users not to use these features.

The assertion of "no overcharging" also creates tension with Claude Code's own update history. Just the day before Lydia's response, v2.1.90 fixed a cache regression bug that had been present since v2.1.69: when using --resume to resume a session, requests that should have hit the cache would trigger a complete prompt cache miss, resulting in full billing. Lydia's response did not mention this confirmed billing anomaly.

For comparison, OpenAI's Codex had previously experienced similar abnormal quota consumption issues. OpenAI's approach was to reset user quotas, issue credit refunds, and announce the removal of the usage cap on Codex in March. Anthropic's approach is to advise users to downgrade models, disable features, limit context, and attribute responsibility to user usage.

Anthropic sells subscriptions built around the "strongest model + maximum context + highest reasoning capabilities," charging $20 to $200 per month. A 28-day caching bug drained paying users' quotas 10 to 20 times faster than normal, and the official response was to use it sparingly.
