Compute & Infrastructure
Top Line
Iran's Islamic Revolutionary Guard Corps issued direct physical strike threats against data centre and manufacturing facilities of 18 US tech companies including NVIDIA, Microsoft, Apple, and Google, marking an unprecedented escalation of geopolitical risk to critical AI compute infrastructure.
Microsoft announced plans to develop large, cutting-edge AI models in-house by 2027, signalling a strategic shift toward reducing dependency on OpenAI and Anthropic while requiring significant expansion of its own training compute capacity.
NVIDIA showcased Neural Texture Compression technology at GTC 2026 claiming 85% VRAM reduction with zero quality loss, potentially extending the usable lifespan of existing GPU fleets and reducing pressure on memory supply chains.
Security researchers disclosed GeForge and GDDRHammer attacks exploiting Rowhammer vulnerabilities in NVIDIA GPU memory, demonstrating that shared compute infrastructure in cloud environments faces novel attack vectors through VRAM manipulation.
Key Developments
Geopolitical Threats to Physical AI Infrastructure
Iran's Islamic Revolutionary Guard Corps issued explicit threats to destroy facilities belonging to 18 US technology companies including NVIDIA, Microsoft, Apple, Google, Meta, IBM, Cisco, and Tesla, stating that these companies should expect destruction of their facilities in response to each act of terror in Iran, according to Tom's Hardware. This represents the first time a state actor has publicly threatened kinetic attacks specifically targeting the physical infrastructure of AI compute providers and semiconductor manufacturers.
The threat encompasses both data centre operations and semiconductor manufacturing facilities, creating new risk calculus for infrastructure planning. NVIDIA's fabrication partners, hyperscaler data centres, and cloud regions in proximity to potential conflict zones face elevated operational risk. While the credibility and capability of these threats remain unclear, their mere existence forces infrastructure operators to price in geopolitical risk previously considered remote for technology facilities outside direct conflict zones.
Microsoft's Push for Training Compute Independence
Microsoft announced it aims to develop large, cutting-edge AI models in-house by 2027, representing a strategic pivot toward building alternatives to the most powerful AI tools from OpenAI and Anthropic, according to Bloomberg. This move signals that Microsoft intends to compete directly with its current suppliers in frontier model development, requiring massive expansion of dedicated training compute capacity separate from its Azure cloud infrastructure.
The timeline implies Microsoft must secure allocation of next-generation NVIDIA Vera Rubin systems or alternative accelerator architectures within the next 12-18 months to begin training runs capable of matching GPT-5 class models by 2027. This competes directly with OpenAI's own capacity needs on the same hardware generation, creating internal allocation tension within Microsoft's infrastructure planning. The announcement also suggests Microsoft has concluded that relying on external model providers creates unacceptable strategic dependency, even when those providers run primarily on Microsoft's own cloud infrastructure.
Neural Texture Compression and Memory Efficiency Advances
NVIDIA demonstrated Neural Texture Compression technology at GTC 2026, showing VRAM usage for a test scene drop from 6.5GB to 970MB, an 85% reduction claimed to come with no quality loss relative to traditional block compression methods, according to Tom's Hardware. The technique uses neural networks to decompress textures rather than standard block-based compression, cutting memory footprint while matching or improving final image quality compared with block-compressed textures.
If successfully deployed in production AI inference workloads, this technology could dramatically extend the effective capacity of existing GPU fleets by allowing more models or larger context windows to fit in available VRAM. For inference providers operating at scale, an 85% memory reduction translates directly to increased throughput per GPU, reducing the number of accelerators needed to serve equivalent request volume. This matters most for memory-constrained inference scenarios where batch sizes are limited by VRAM rather than compute throughput, potentially delaying the point at which operators must purchase next-generation hardware.
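The capacity arithmetic behind that claim can be sketched as follows. The compression ratio is taken from the figures reported above (6.5GB to 970MB); the accelerator size, weight footprint, and per-request memory are hypothetical assumptions chosen only to illustrate how a memory-bound batch size scales.

```python
# Illustrative arithmetic only: how a compression ratio like the one reported
# (6.5 GB -> 970 MB for a test scene) would translate into effective capacity
# for a memory-bound inference fleet. The workload numbers below are
# hypothetical assumptions, not figures from the article.

def effective_batch_gain(vram_gb: float,
                         fixed_overhead_gb: float,
                         per_request_gb: float,
                         compression_ratio: float) -> tuple[int, int]:
    """Return (batch_before, batch_after) for a GPU whose per-request
    memory footprint shrinks by `compression_ratio`."""
    free = vram_gb - fixed_overhead_gb
    before = int(free // per_request_gb)
    after = int(free // (per_request_gb * compression_ratio))
    return before, after

# The demo figures give a ratio of roughly 0.97 GB / 6.5 GB ~= 0.15.
ratio = 0.97 / 6.5

# Hypothetical 80 GB accelerator: 20 GB of weights, 2 GB per concurrent request.
before, after = effective_batch_gain(80.0, 20.0, 2.0, ratio)
print(before, after)  # memory-bound batch size before vs after compression
```

Under these assumed numbers the memory-bound batch size grows roughly 6-7x, which is the mechanism behind the claim that memory efficiency defers next-generation hardware purchases. The gain only materialises where VRAM, not compute, is the binding constraint.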
GPU Memory Security Vulnerabilities in Shared Infrastructure
Security researchers disclosed GeForge and GDDRHammer attacks that exploit Rowhammer vulnerabilities in NVIDIA GPU memory to gain arbitrary read/write access to protected VRAM regions by forcing bit flips in page files and page directory structures, according to Tom's Hardware. The attacks massage protected data structures into vulnerable memory regions where electrical disturbance can induce bit flips, allowing attackers to access even CPU memory through compromised page tables.
This vulnerability is particularly significant for multi-tenant cloud infrastructure where multiple customers share GPU resources. An attacker could potentially access model weights, training data, or inference results belonging to other tenants on the same physical GPU. Cloud providers offering GPU instances must now consider whether current isolation mechanisms provide adequate protection, potentially requiring changes to GPU virtualization approaches, memory error correction implementations, or customer isolation policies. The attack also affects on-premises deployments where multiple users or workloads share GPU resources.
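A minimal sketch of why a single induced bit flip in a page-table entry is so powerful. This uses the standard x86-64 PTE layout (bits 12-51 hold the physical frame number) purely as an illustration of the mechanism; GPU page-table formats differ in detail, the addresses are made up, and this is not a reproduction of the published attacks.

```python
# Toy illustration (not attack code): a Rowhammer-style flip of one bit in a
# page-table entry silently retargets the virtual-to-physical mapping.
# Assumes the standard x86-64 PTE layout; all addresses are invented.

FRAME_MASK = ((1 << 52) - 1) & ~((1 << 12) - 1)  # PTE bits 12..51

def frame_of(pte: int) -> int:
    """Physical frame base address encoded in a page-table entry."""
    return pte & FRAME_MASK

# A victim PTE mapping a page at (hypothetical) physical address 0x12345000,
# with the present and writable flag bits set.
pte = 0x12345000 | 0x3

# A disturbance-induced flip of a single frame-number bit (bit 20 here)
# redirects the mapping to a different physical page the attacker may control.
flipped = pte ^ (1 << 20)

print(hex(frame_of(pte)))      # original frame
print(hex(frame_of(flipped)))  # frame after the single-bit flip
```

This is why the researchers' "massaging" step matters: once a page-table entry lands in a physically vulnerable row, one bit flip is enough to point a mapping at memory the attacker should never see, without any software bug being exploited.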
Signals & Trends
Geopolitical Risk Premium Now Required for Infrastructure Planning
Iran's explicit threats against technology infrastructure represent a threshold crossing where state actors now view data centres and semiconductor facilities as legitimate military targets rather than civilian economic assets. This follows earlier concerns about submarine cable vulnerabilities, power grid attacks, and supply chain interdiction, but marks the first public declaration of intent to conduct kinetic strikes against AI compute infrastructure. Infrastructure planners who previously focused on natural disaster resilience, power availability, and network connectivity must now factor in proximity to potential conflict zones, air defence coverage, and hardening against military strikes. This will likely accelerate geographic diversification of critical infrastructure, increase capital costs for physical security, and potentially shift new capacity toward regions with greater geopolitical stability even if those locations have higher energy costs or less favourable tax treatment.
Memory Efficiency Becomes Strategic Differentiator as HBM Constraints Persist
NVIDIA's investment in Neural Texture Compression and the significance of 85% memory reduction demonstrations signal that HBM supply constraints are expected to persist long enough that efficiency innovations provide strategic advantage over raw capacity expansion. This aligns with broader industry recognition that HBM packaging capacity grows more slowly than logic fabrication capacity, creating lasting memory bottlenecks. Expect increased focus on compression algorithms, sparse attention mechanisms, quantization techniques, and architectural innovations that reduce memory bandwidth and capacity requirements. Companies that successfully deploy memory efficiency techniques gain effective capacity expansion without waiting for hardware supply, while also reducing operational costs per inference request. This trend also suggests that next-generation accelerators may emphasise memory efficiency features as heavily as raw compute throughput improvements.