Compute & Infrastructure
Top Line
Iran's Islamic Revolutionary Guard Corps issued direct physical strike threats against data centre and manufacturing facilities of 18 US tech companies including NVIDIA, Microsoft, Apple, and Google, marking an unprecedented escalation of geopolitical risk to critical AI compute infrastructure.
Microsoft announced plans to develop large, cutting-edge AI models in-house by 2027, signalling a strategic shift toward reducing dependency on OpenAI and Anthropic while requiring significant expansion of its own training compute capacity.
NVIDIA showcased Neural Texture Compression technology at GTC 2026 claiming 85% VRAM reduction with zero quality loss, potentially extending the usable lifespan of existing GPU fleets and reducing pressure on memory supply chains.
Security researchers disclosed GeForge and GDDRHammer attacks exploiting Rowhammer vulnerabilities in NVIDIA GPU memory, demonstrating that shared compute infrastructure in cloud environments faces novel attack vectors through VRAM manipulation.
Key Developments
Geopolitical Threats to Physical AI Infrastructure
Iran's Islamic Revolutionary Guard Corps issued explicit threats to destroy facilities belonging to 18 US technology companies including NVIDIA, Microsoft, Apple, Google, Meta, IBM, Cisco, and Tesla, stating that these companies should expect destruction of their facilities in response to each act of terror in Iran, according to Tom's Hardware. This represents the first time a state actor has publicly threatened kinetic attacks specifically targeting the physical infrastructure of AI compute providers and semiconductor manufacturers.
The threat encompasses both data centre operations and semiconductor manufacturing facilities, creating new risk calculus for infrastructure planning. NVIDIA's fabrication partners, hyperscaler data centres, and cloud regions in proximity to potential conflict zones face elevated operational risk. While the credibility and capability of these threats remain unclear, their mere existence forces infrastructure operators to price in geopolitical risk previously considered remote for technology facilities outside direct conflict zones.
Microsoft's Push for Training Compute Independence
Microsoft announced it aims to develop large, cutting-edge AI models in-house by 2027, representing a strategic pivot toward building alternatives to the most powerful AI tools from OpenAI and Anthropic, according to Bloomberg. This move signals that Microsoft intends to compete directly with its current suppliers in frontier model development, requiring massive expansion of dedicated training compute capacity separate from its Azure cloud infrastructure.
The timeline implies Microsoft must secure allocation of next-generation NVIDIA Vera Rubin systems or alternative accelerator architectures within the next 12-18 months to begin training runs capable of matching GPT-5 class models by 2027. This competes directly with OpenAI's own capacity needs on the same hardware generation, creating internal allocation tension within Microsoft's infrastructure planning. The announcement also suggests Microsoft has concluded that relying on external model providers creates unacceptable strategic dependency, even when those providers run primarily on Microsoft's own cloud infrastructure.
Neural Texture Compression and Memory Efficiency Advances
NVIDIA demonstrated Neural Texture Compression technology at GTC 2026, showing VRAM usage for a test scene drop from 6.5GB to 970MB, an 85% reduction claimed to come with no quality loss relative to traditional block compression methods, according to Tom's Hardware. The technique uses neural networks to decompress textures rather than standard block-based compression, cutting memory footprint while matching or improving final image quality compared with block-compressed textures.
If successfully deployed in production AI inference workloads, this technology could dramatically extend the effective capacity of existing GPU fleets by allowing more models or larger context windows to fit in available VRAM. For inference providers operating at scale, an 85% memory reduction translates directly to increased throughput per GPU, reducing the number of accelerators needed to serve equivalent request volume. This matters most for memory-constrained inference scenarios where batch sizes are limited by VRAM rather than compute throughput, potentially delaying the point at which operators must purchase next-generation hardware.
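The capacity arithmetic behind that claim can be sketched as follows. The compression ratio is taken from the figures reported above (6.5GB to 970MB); the accelerator size, weight footprint, and per-request memory are hypothetical assumptions chosen only to illustrate how a memory-bound batch size scales.

```python
# Illustrative arithmetic only: how a compression ratio like the one reported
# (6.5 GB -> 970 MB for a test scene) would translate into effective capacity
# for a memory-bound inference fleet. The workload numbers below are
# hypothetical assumptions, not figures from the article.

def effective_batch_gain(vram_gb: float,
                         fixed_overhead_gb: float,
                         per_request_gb: float,
                         compression_ratio: float) -> tuple[int, int]:
    """Return (batch_before, batch_after) for a GPU whose per-request
    memory footprint shrinks by `compression_ratio`."""
    free = vram_gb - fixed_overhead_gb
    before = int(free // per_request_gb)
    after = int(free // (per_request_gb * compression_ratio))
    return before, after

# The demo figures give a ratio of roughly 0.97 GB / 6.5 GB ~= 0.15.
ratio = 0.97 / 6.5

# Hypothetical 80 GB accelerator: 20 GB of weights, 2 GB per concurrent request.
before, after = effective_batch_gain(80.0, 20.0, 2.0, ratio)
print(before, after)  # memory-bound batch size before vs after compression
```

Under these assumed numbers the memory-bound batch size grows roughly 6-7x, which is the mechanism behind the claim that memory efficiency defers next-generation hardware purchases. The gain only materialises where VRAM, not compute, is the binding constraint.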
GPU Memory Security Vulnerabilities in Shared Infrastructure
Security researchers disclosed GeForge and GDDRHammer attacks that exploit Rowhammer vulnerabilities in NVIDIA GPU memory to gain arbitrary read/write access to protected VRAM regions by forcing bit flips in page files and page directory structures, according to Tom's Hardware. The attacks massage protected data structures into vulnerable memory regions where electrical disturbance can induce bit flips, allowing attackers to access even CPU memory through compromised page tables.
This vulnerability is particularly significant for multi-tenant cloud infrastructure where multiple customers share GPU resources. An attacker could potentially access model weights, training data, or inference results belonging to other tenants on the same physical GPU. Cloud providers offering GPU instances must now consider whether current isolation mechanisms provide adequate protection, potentially requiring changes to GPU virtualization approaches, memory error correction implementations, or customer isolation policies. The attack also affects on-premises deployments where multiple users or workloads share GPU resources.
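A minimal sketch of why a single induced bit flip in a page-table entry is so powerful. This uses the standard x86-64 PTE layout (bits 12-51 hold the physical frame number) purely as an illustration of the mechanism; GPU page-table formats differ in detail, the addresses are made up, and this is not a reproduction of the published attacks.

```python
# Toy illustration (not attack code): a Rowhammer-style flip of one bit in a
# page-table entry silently retargets the virtual-to-physical mapping.
# Assumes the standard x86-64 PTE layout; all addresses are invented.

FRAME_MASK = ((1 << 52) - 1) & ~((1 << 12) - 1)  # PTE bits 12..51

def frame_of(pte: int) -> int:
    """Physical frame base address encoded in a page-table entry."""
    return pte & FRAME_MASK

# A victim PTE mapping a page at (hypothetical) physical address 0x12345000,
# with the present and writable flag bits set.
pte = 0x12345000 | 0x3

# A disturbance-induced flip of a single frame-number bit (bit 20 here)
# redirects the mapping to a different physical page the attacker may control.
flipped = pte ^ (1 << 20)

print(hex(frame_of(pte)))      # original frame
print(hex(frame_of(flipped)))  # frame after the single-bit flip
```

This is why the researchers' "massaging" step matters: once a page-table entry lands in a physically vulnerable row, one bit flip is enough to point a mapping at memory the attacker should never see, without any software bug being exploited.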
Signals & Trends
Geopolitical Risk Premium Now Required for Infrastructure Planning
Iran's explicit threats against technology infrastructure represent a threshold crossing where state actors now view data centres and semiconductor facilities as legitimate military targets rather than civilian economic assets. This follows earlier concerns about submarine cable vulnerabilities, power grid attacks, and supply chain interdiction, but marks the first public declaration of intent to conduct kinetic strikes against AI compute infrastructure. Infrastructure planners who previously focused on natural disaster resilience, power availability, and network connectivity must now factor in proximity to potential conflict zones, air defence coverage, and hardening against military strikes. This will likely accelerate geographic diversification of critical infrastructure, increase capital costs for physical security, and potentially shift new capacity toward regions with greater geopolitical stability even if those locations have higher energy costs or less favourable tax treatment.
Memory Efficiency Becomes Strategic Differentiator as HBM Constraints Persist
NVIDIA's investment in Neural Texture Compression and the significance of 85% memory reduction demonstrations signal that HBM supply constraints are expected to persist long enough that efficiency innovations provide strategic advantage over raw capacity expansion. This aligns with broader industry recognition that HBM packaging capacity grows more slowly than logic fabrication capacity, creating lasting memory bottlenecks. Expect increased focus on compression algorithms, sparse attention mechanisms, quantization techniques, and architectural innovations that reduce memory bandwidth and capacity requirements. Companies that successfully deploy memory efficiency techniques gain effective capacity expansion without waiting for hardware supply, while also reducing operational costs per inference request. This trend also suggests that next-generation accelerators may emphasise memory efficiency features as heavily as raw compute throughput improvements.