Posted on January 16, 2026
Google has rolled out a new beta feature for its AI assistant Gemini called Personal Intelligence, which lets users opt in to securely connect their personal Google apps—like Gmail, Google Photos, YouTube, and Search—with the AI to get more personalized, context-aware responses. Rather than treating each app as a separate data silo, Gemini uses cross-source reasoning to combine text, images, and other content from these services to answer questions more intelligently — for example, by using email and photo details together to tailor travel suggestions or product recommendations. Privacy is a core focus: the feature is off by default, users control which apps are connected and can disable it anytime, and Google says that personal data isn’t used to train the underlying models but only referenced to generate responses. The Personal Intelligence beta is rolling out in the U.S. first to Google AI Pro and AI Ultra subscribers on web, Android, and iOS, with plans for broader availability over time.
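For readers who want a concrete picture of what cross-source reasoning might look like under the hood, here is a minimal, purely hypothetical sketch in Python. The connector names, fields, and prompt format are invented for illustration and do not reflect Google’s actual Personal Intelligence implementation or API.

```python
# Hypothetical sketch of cross-source context assembly.
# Nothing here reflects Google's actual Personal Intelligence API;
# the connectors and fields are invented for illustration only.

from dataclasses import dataclass


@dataclass
class ContextItem:
    source: str   # e.g. "gmail", "photos" -- placeholder source names
    snippet: str  # a short piece of user-permitted content


def gather_context(question: str, connected_sources: dict) -> list[ContextItem]:
    """Pull short, relevant snippets from each opted-in source."""
    items = []
    for name, fetch in connected_sources.items():
        for snippet in fetch(question):
            items.append(ContextItem(source=name, snippet=snippet))
    return items


def build_prompt(question: str, items: list[ContextItem]) -> str:
    """Combine per-source snippets into one grounded prompt for the model."""
    context = "\n".join(f"[{item.source}] {item.snippet}" for item in items)
    return (f"User question: {question}\n"
            f"Personal context:\n{context}\n"
            f"Answer using the context where relevant.")


if __name__ == "__main__":
    # Toy stand-ins for opted-in connectors (no real data access).
    sources = {
        "gmail": lambda q: ["Flight confirmation: Lisbon, March 3-10"],
        "photos": lambda q: ["Album 'Lisbon 2024' tagged: seafood, tram 28"],
    }
    items = gather_context("Plan a weekend trip I'd enjoy", sources)
    print(build_prompt("Plan a weekend trip I'd enjoy", items))
```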
That’s my take on it:
Google’s move clearly signals that AI competition is shifting from “model quality” to “ecosystem depth.” By tightly integrating Gemini with Gmail, Google Photos, Search, YouTube, and the broader Google account graph, Google isn’t just offering an assistant—it’s offering a personal intelligence layer that already lives where users’ data, habits, and workflows are. That’s a structural advantage that pure-play AI labs don’t automatically have.
For rivals like OpenAI (ChatGPT), Anthropic (Claude), DeepSeek, Meta (Meta AI), xAI (Grok), and Alibaba (Qwen), this creates pressure to anchor themselves inside existing digital ecosystems rather than competing as standalone tools. Users don’t just want smart answers—they want AI that understands their emails, calendars, photos, documents, shopping history, and work context without friction. Google already owns that integration surface.
However, the path forward isn’t identical for everyone. ChatGPT already has a partial ecosystem strategy via deep ties with Microsoft (Windows, Copilot, Office, Azure), while Meta AI leverages WhatsApp, Instagram, and Facebook—arguably one of the richest social-context datasets in the world. Claude is carving out a different niche by embedding into enterprise tools (Slack, Notion, developer workflows) and emphasizing trust and safety rather than mass consumer lock-in. Chinese players like Qwen and DeepSeek naturally align with domestic super-apps and cloud platforms, which already function as ecosystems.
The deeper implication is that AI is starting to resemble operating systems more than apps. Once an AI is woven into your emails, photos, documents, cloud storage, and daily routines, switching costs rise sharply—even if a rival model is technically better. In that sense, Google isn’t just competing on intelligence; it’s competing on institutional memory. And that’s a game where ecosystems matter as much as algorithms.
P.S. I have signed up for Google's Personal Intelligence beta.
Link: https://opendatascience.com/gemini-introduces-personal-intelligence-to-connect-gmail-photos-and-more/
Posted on January 7, 2026
Yesterday Nvidia unveiled its next-generation Vera Rubin AI platform at CES 2026 in Las Vegas, introducing a new superchip and broader infrastructure designed to power advanced artificial intelligence workloads. The Vera Rubin system — named after astronomer Vera Rubin — integrates a Vera CPU and multiple Rubin GPUs into a unified architecture and is part of Nvidia’s broader “Rubin” platform aimed at reducing costs and accelerating both training and inference for large, agentic AI models compared with its prior Blackwell systems. CEO Jensen Huang emphasized that the platform is now in production and underscores Nvidia’s push to lead the AI hardware space, with the new technology expected to support complex reasoning models and help scale AI deployment across data centers and cloud providers.
At its core, the Vera Rubin platform isn’t just one chip but a tightly integrated suite of six co-designed chips that work together as a single AI “supercomputer” rather than as isolated components. These chips include the Vera CPU, Rubin GPU, NVLink 6 switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-X Ethernet switch — all engineered from the ground up to move data quickly and efficiently.
The Vera CPU is custom ARM silicon tuned for AI workloads such as data movement and reasoning, and it connects to the Rubin GPUs over high-bandwidth NVLink-C2C links that are far faster than traditional PCIe. That means CPUs and GPUs can share memory and tasks without bottlenecks, improving overall system responsiveness.
Put simply, instead of just scaling up power, Nvidia’s Vera Rubin platform rethinks how chips share data, reducing idle time and wasted cycles — which translates to lower operating costs and faster AI responsiveness in large-scale deployments.
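To get a feel for why the CPU-GPU link matters so much, here is a back-of-envelope sketch comparing idealized transfer times over a PCIe-class link versus a much faster coherent link. The bandwidth figures are rough assumptions (on the order of PCIe Gen5 x16 and the ~900 GB/s NVLink-C2C figure Nvidia has published for earlier Grace-based systems), not Vera Rubin specifications.

```python
# Back-of-envelope comparison of CPU<->GPU transfer time.
# Bandwidth numbers are illustrative assumptions (roughly PCIe Gen5 x16
# vs. the ~900 GB/s figure Nvidia has cited for NVLink-C2C on earlier
# Grace-based systems); they are NOT Vera Rubin specifications.

PCIE_GEN5_X16_GBPS = 64.0   # ~GB/s, one direction (assumption)
NVLINK_C2C_GBPS = 900.0     # ~GB/s, Grace-era C2C figure (assumption)

def transfer_seconds(gigabytes: float, bandwidth_gbps: float) -> float:
    """Idealized transfer time, ignoring latency and protocol overhead."""
    return gigabytes / bandwidth_gbps

payload_gb = 80.0  # e.g., shuttling a large model's weights or KV cache once

pcie_t = transfer_seconds(payload_gb, PCIE_GEN5_X16_GBPS)
c2c_t = transfer_seconds(payload_gb, NVLINK_C2C_GBPS)

print(f"PCIe-class link : {pcie_t:.2f} s")
print(f"C2C-class link  : {c2c_t:.2f} s  (~{pcie_t / c2c_t:.0f}x faster)")
```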
That’s my take on it:
While Nvidia’s unveiling of Vera Rubin is undeniably jaw-dropping, it is still premature to proclaim Nvidia’s uncontested dominance or to count out its chief rivals—especially AMD. As of November 2025, the Top500 list of supercomputers shows that the very top tier remains dominated by systems built on AMD and Intel technologies, with Nvidia-based machines occupying the #4 and #5 positions rather than the top three. The current #1 system, El Capitan, relies on AMD CPUs paired with AMD Instinct GPUs, a combination that performs exceptionally well on the LINPACK benchmark, which the Top500 uses to rank raw floating-point computing power.
Although Nvidia promotes the Vera Rubin architecture—announced at CES 2026—as a next-generation “supercomputer,” the Top500 ranking methodology places heavy emphasis on double-precision floating-point (FP64) performance. Historically, Nvidia GPUs, particularly recent Blackwell-era designs, have prioritized lower-precision arithmetic optimized for AI workloads rather than maximizing FP64 throughput. This architectural focus helps explain why AMD-powered systems have continued to lead the Top500, even as Nvidia dominates the AI training and inference landscape.
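A toy calculation makes the precision point concrete: LINPACK scores scale with sustained FP64 throughput, so a part that pours its silicon budget into low-precision AI math gains nothing on the Top500. All of the numbers below are invented for illustration and do not correspond to any real AMD or Nvidia product.

```python
# Toy illustration of why Top500 rank and AI throughput can diverge:
# LINPACK (HPL) measures sustained FP64, so a design that spends its
# silicon on low-precision AI math gains nothing on the benchmark.
# Every number below is invented for illustration only.

def hpl_score(fp64_peak_tflops: float, efficiency: float) -> float:
    """Crude proxy: sustained LINPACK ~= FP64 peak * achieved efficiency."""
    return fp64_peak_tflops * efficiency

nodes = {
    # name: (FP64 peak TFLOPS/node, low-precision TFLOPS/node, HPL efficiency)
    "HPC-oriented node": (80.0, 1500.0, 0.80),
    "AI-oriented node":  (40.0, 5000.0, 0.75),
}

for name, (fp64, lowp, eff) in nodes.items():
    print(f"{name:18s}  HPL proxy: {hpl_score(fp64, eff):6.1f} TFLOPS   "
          f"low-precision peak: {lowp:6.0f} TFLOPS")

# The AI-oriented node dominates on low-precision throughput yet scores
# roughly half as high on the FP64 benchmark that decides Top500 rank.
```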
Looking ahead, it is entirely plausible that Vera Rubin-based systems will climb higher in the Top500 rankings or establish new records on alternative benchmarks better aligned with AI-centric performance. However, in the near term, the LINPACK crown remains highly competitive, with AMD and Intel still well positioned to defend their lead.
Link: https://finance.yahoo.com/news/nvidia-launches-vera-rubin-its-next-major-ai-platform-at-ces-2026-230045205.html
Posted on January 5, 2026
Recently Groq announced that it had entered into a non-exclusive licensing agreement with NVIDIA covering Groq’s inference technology. Under this agreement, NVIDIA will leverage Groq’s technology as it explores the development of LPU-based (Language Processing Unit) architectures, marking a notable convergence between two very different design philosophies in the AI-hardware ecosystem. While non-exclusive in nature, the deal signals strategic recognition of Groq’s architectural ideas by the world’s dominant GPU vendor and highlights growing diversification in inference-focused compute strategies.
Groq has built its reputation on deterministic, compiler-driven inference hardware optimized for ultra-low latency and predictable performance. Unlike traditional GPUs, which rely on massive parallelism and complex scheduling, Groq’s approach emphasizes a tightly coupled hardware–software stack that eliminates many runtime uncertainties. By licensing this inference technology, NVIDIA gains access to alternative architectural concepts that may complement its existing GPU, DPU, and emerging accelerator roadmap—particularly as inference workloads begin to dominate AI deployment at scale.
An LPU (Language Processing Unit) is a specialized processor designed primarily for AI inference, especially large language models and sequence-based workloads. LPUs prioritize deterministic execution, low latency, and high throughput for token generation, rather than the flexible, high-variance compute patterns typical of GPUs. In practical terms, an LPU executes pre-compiled computation graphs in a highly predictable manner, making it well-suited for real-time applications such as conversational AI, search, recommendation systems, and edge-to-cloud inference pipelines. Compared with GPUs, LPUs often trade generality for efficiency, focusing on inference rather than training.
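To illustrate why deterministic execution matters for serving, here is a toy simulation contrasting a statically scheduled pipeline (fixed per-token time) with a dynamically scheduled one that occasionally stalls. It is a conceptual sketch of the determinism argument only, not a model of Groq’s or Nvidia’s actual hardware.

```python
# Toy simulation of the determinism argument: with a statically scheduled
# pipeline every token takes a fixed time, so the 99th-percentile latency
# equals the median; with dynamic scheduling, occasional stalls inflate
# the tail. A conceptual sketch only, not a model of any real LPU or GPU.

import random
import statistics

random.seed(0)
TOKENS = 200        # tokens generated per request (hypothetical)
REQUESTS = 1000     # simulated requests
BASE_MS = 2.0       # nominal per-token time in ms (hypothetical)

def request_latency(stall_prob: float) -> float:
    """Time to emit TOKENS tokens; each token may hit a scheduling stall."""
    total = 0.0
    for _ in range(TOKENS):
        stall = 10 * BASE_MS if random.random() < stall_prob else 0.0
        total += BASE_MS + stall
    return total

def p99(samples: list[float]) -> float:
    return sorted(samples)[int(0.99 * len(samples)) - 1]

static_runs = [request_latency(stall_prob=0.0) for _ in range(REQUESTS)]
dynamic_runs = [request_latency(stall_prob=0.02) for _ in range(REQUESTS)]

for name, runs in [("static schedule", static_runs),
                   ("dynamic schedule", dynamic_runs)]:
    print(f"{name:16s}  median: {statistics.median(runs):6.1f} ms   "
          f"p99: {p99(runs):6.1f} ms")
```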
That’s my take on it:
Nvidia’s agreement with Groq seems to be an attempt to break the “curse” of The Innovator’s Dilemma, a theory introduced by Clayton Christensen to explain why successful, well-managed companies so often fail in the face of disruptive innovation. Christensen argued that incumbents are rarely blindsided by new technologies; rather, they are constrained by their own success. They rationally focus on sustaining innovations that serve existing customers and protect profitable, high-end products, while dismissing simpler or less mature alternatives that initially appear inferior. Over time, those alternatives improve, move up-market, and ultimately displace the incumbent. The collapse of Kodak—despite its early invention of digital photography—remains the canonical example of this dynamic.
Nvidia’s position in the AI ecosystem today strongly resembles the kind of incumbent Christensen described. The GPU is not merely a product; it is the foundation of Nvidia’s identity, revenue model, software ecosystem, and developer loyalty. CUDA, massive parallelism, and GPU-centric optimization define how AI practitioners think about training and inference alike. Any internally developed architecture that fundamentally departs from the GPU paradigm—such as a deterministic, inference-first processor like an LPU—would inevitably compete for resources, mindshare, and legitimacy within the company. In such a context, internal resistance would not stem from short-sightedness or incompetence, but from rational organizational behavior aimed at protecting a highly successful core business.
Seen through this lens, licensing Groq’s inference technology represents a structurally intelligent workaround. Instead of forcing a disruptive architecture to emerge from within its own GPU-centric organization, Nvidia accesses an external source of innovation that is unburdened by legacy assumptions. This mirrors Christensen’s own prescription: disruptive technologies are best explored outside the incumbent’s core operating units, where different performance metrics and expectations can apply. By doing so, Nvidia can experiment with alternative compute models without signaling abandonment of its GPU franchise or triggering internal conflict reminiscent of Kodak’s struggle to reconcile film and digital imaging.
The non-exclusive nature of the agreement further reinforces this interpretation. It suggests exploration rather than immediate commitment, allowing Nvidia to learn from Groq’s deterministic, compiler-driven approach to inference while preserving strategic flexibility. If inference-dominant workloads continue to grow and architectures like LPUs prove essential for low-latency, high-throughput deployment of large language models, Nvidia will be positioned to integrate or adapt these ideas into a broader heterogeneous computing strategy. If not, the company retains its dominant GPU trajectory with minimal disruption.
In this sense, the Groq agreement can be understood not simply as a technology-licensing deal, but as an organizational hedge against the innovator’s curse. Rather than attempting to disrupt itself head-on, Nvidia is selectively absorbing external disruption—observing it, testing it, and keeping it at arm’s length until its strategic value becomes undeniable.
Link: https://www.cnbc.com/2025/12/24/nvidia-buying-ai-chip-startup-groq-for-about-20-billion-biggest-deal.html