General News
-
As they say, don't look at the scoreboard; look at what's going on on the field! Business is booming


-
The Beginning of Scarcity in AI
For the first time since the 2000s, technology companies are confronting the limits of their supply chain.
GPU rental prices for Nvidia's Blackwell chips hit $4.08 per hour this week, up 48% from $2.75 just two months ago. CoreWeave raised prices 20% & extended minimum contracts from one year to three.
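The percentage move quoted above can be checked with simple arithmetic (both prices are from the paragraph; the figures are illustrative of the calculation, nothing more):

```python
# Blackwell GPU rental prices quoted in the text (USD per hour).
old_price = 2.75  # two months ago
new_price = 4.08  # this week

# Percentage increase relative to the earlier price.
pct_increase = (new_price - old_price) / old_price * 100
print(f"{pct_increase:.1f}%")  # ~48.4%, consistent with the quoted 48%
```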
"We're making some very tough trades at the moment on things we're not pursuing because we don't have enough compute." - Sarah Friar, OpenAI CFO
This scarcity is already reshaping access. Anthropic has limited its newest model to roughly forty organizations. Access to the bleeding edge is becoming a gated privilege, for both capacity & security.
If the largest AI companies are having problems, startups face a tougher proposition. Five hallmarks define this era:
-
Relationship-Based Selling: State-of-the-art models may no longer be open to everyone as providers limit access to their most profitable or strategic customers.
-
AI to the Highest Bidder: Even when they do become available, SOTA models may become prohibitively expensive. Companies that can raise large amounts of capital or generate strong profits will have an advantage.
-
Available but Slow: Even if you can pay, there may be no guarantee the models will be fast.
-
Inflationary Commodity: This imbalance will inevitably drive prices higher as demand compounds against a fixed supply. Procurement & margin management will become key disciplines in software companies.
-
Forced Diversification: Developers will be forced to look elsewhere, from smaller models to on-premise deployments, until energy infrastructure & data center buildouts catch up, which could take years.
The age of abundant AI is over, & it will remain so for years.
-
-
What's interesting: this is compute sold to customers for their business needs (inference). What about the LLM owners training their own models? That's just as big. And current usage, in the grand scheme of things, is I'd speculate actually tiny compared with where it will be in the years ahead.
-
I think the squeeze at the top among frontier models is going to produce a wave of 'Cursor-like' orchestration tools: whispering IDEs and workflow layers that quietly manage which model runs underneath, and when. Also Google, Amazon and others are all building their own cheaper TPUs that need less memory, purpose-built for inference rather than training.
I don't think this is a bad thing, but suspect we're heading for a model in which the more you pay, the more concentrated intelligence you get.
-
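A minimal sketch of what such an orchestration layer might do. The model names, prices and the budget heuristic below are entirely hypothetical assumptions for illustration, not any real catalogue or API:

```python
# Hypothetical cost-aware model router. Names, prices and capability
# tiers are made up to illustrate the "workflow layer" idea.
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    usd_per_1k_tokens: float
    quality: int  # rough capability tier, higher is better

CATALOGUE = [
    Model("small-local", 0.0001, 1),
    Model("mid-hosted", 0.002, 2),
    Model("frontier", 0.03, 3),
]

def route(prompt: str, budget_per_1k: float) -> Model:
    """Pick the most capable model whose price fits the budget,
    falling back to the cheapest model if nothing fits."""
    affordable = [m for m in CATALOGUE if m.usd_per_1k_tokens <= budget_per_1k]
    if not affordable:
        return min(CATALOGUE, key=lambda m: m.usd_per_1k_tokens)
    return max(affordable, key=lambda m: m.quality)

print(route("summarise this email", budget_per_1k=0.001).name)  # small-local
print(route("prove this theorem", budget_per_1k=0.05).name)     # frontier
```

The point of the sketch: the caller never names a model; the layer trades cost against capability, which is exactly the kind of decision that gets automated when frontier access is scarce or expensive.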
The nuance in 'TPU adoption' is that this compute is used internally, whereas their cloud customers use GPUs, mainly because of CUDA. Today the TPU/GPU mix in the total addressable market is 15-20%, and that won't change for years. Any news of GOOG/AMZN building huge TPU clusters was already planned, and not something to suggest derailing the Nvidia growth story.
TPUs came about due to scarcity of alternatives, and don't forget TPUs are built for a very narrow process, as opposed to GPUs, which are programmable. So it's only natural that a device built to perform one thing is less complex.
As to 'TPUs use less memory': another pitfall if you then leap to assumptions. Yes, a TPU uses less HBM than a GPU, maybe half, but the sheer scale of TPU cluster deployment (in the multi-millions of units) means the demand for HBM is enormous. And all we have to know is that whatever that demand is, it significantly exceeds the ability to meet it. And that imbalance will be worse next quarter, and the one after that, and so on for years.
Regardless, compute will devour all memory supply indefinitely. Every quarter into the 2030s will see the gap between demand and supply widen, imo.
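A back-of-the-envelope on why "half the HBM per device" doesn't relieve the memory squeeze. The per-device capacities and the unit count below are assumed figures for illustration, not sourced numbers:

```python
# Illustrative only: per-device HBM capacities and shipment volumes
# are assumptions chosen to show the scale effect, not sourced data.
gpu_hbm_gb = 192              # assumed HBM per high-end GPU
tpu_hbm_gb = gpu_hbm_gb / 2   # the text's "maybe half"
tpu_units = 3_000_000         # "multi-millions" of TPUs deployed

# Aggregate HBM demand from the TPU fleet alone, in petabytes.
tpu_hbm_demand_pb = tpu_units * tpu_hbm_gb / 1e6
print(f"{tpu_hbm_demand_pb:.0f} PB of HBM")  # 288 PB under these assumptions
```

Halving the per-device figure only halves a number that is then multiplied by millions; the aggregate still dwarfs supply, which is the argument being made above.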
In a constrained market, who theoretically is gaining or losing some market share is irrelevant. And don't forget TPUs are produced by TSM, and they too are constrained. Nvidia created the market, is TSM's biggest customer, and culturally there are very strong bonds between the two companies and their founders. Business is business, so even if you ignore that, Nvidia has the money to book capacity many years in advance. As I have suggested before, taking AMD as an example, they will be constrained by how much they can commit to and/or be allocated, simply because their balance sheet can't fund it even if the market wanted it. It's not a case of asking for product at scale years out without handing over billions in upfront payments.
And if you ignore all of that for a moment, two facts:
TPUs can be cheaper than GPUs in optimised, large-scale, Google-style workloads such as search optimisation.
GPUs often win in real-world enterprise environments due to flexibility and ecosystem, e.g. multimodal use and inference across models.
-
Samsung's record profit has been spotted by the employee union. In Korea average employee earnings are USD 110k per annum; the union demands that the company pay employees an average $425k bonus. I suppose you can ask

-
Big old bump today. Only about three weeks ago it was down in the doldrums; it feels like we're almost back to the highs of the start of the year.
-
Pleased you're spending time investing in the knowledge. I hear the comment, 'I hear a lot of opinions but I can't always adjudicate.' 'You're talking to the expert'? Yes, he is.

CUDA. CUDA is so pervasive because it's been built over 20 years. It has more developers than any alternative. AMD has no answer for it. It makes GPUs sing. The end