Salesforce CEO Marc Benioff’s endorsement of Google’s Gemini 3 over OpenAI’s ChatGPT has reignited the conversation about leadership in generative artificial intelligence, especially as the arms race between AI giants intensifies into 2025. During a recent internal employee event, Benioff swept aside any ambiguity about his stance, calling Gemini 3 “far better” than ChatGPT across tasks and pointing to its stronger showing on large language model (LLM) benchmarks. His remarks, first reported by Business Insider on May 11, 2025, add a firm corporate voice to ongoing debates about utility, reliability, and innovation among AI systems in enterprise and consumer markets alike.
The Context of Benioff’s Preference for Gemini 3
Marc Benioff’s influence in enterprise software is immense, and his promotion of Google’s Gemini 3 over OpenAI’s ChatGPT might not simply be flattery—it hints at shifting tides in business preferences. Salesforce, which has increasingly tethered its offerings to AI functionality through the Einstein Trust Layer, heavily depends on natural language models for CRM and analytics features. While Salesforce still partners with OpenAI, Benioff’s company is also engaged with Google’s AI infrastructure, including Gemini cloud integrations.
According to the internal meeting leak highlighted by Business Insider, Benioff was impressed with Gemini 3’s performance in reasoning, multi-modal accuracy, and general “IQ” scores. These are crucial metrics as enterprise applications demand highly reliable, context-aware AI, particularly as companies scale up AI automation without sacrificing compliance and data security (World Economic Forum, 2025).
These sentiments are echoed by recent benchmarks. For instance, an April 2025 whitepaper from MIT Technology Review showed that Gemini 3 Ultra outscored GPT-4 Turbo in long-context reasoning and multilingual comprehension tasks while offering faster inference times across enterprise API queries. In tightly regulated sectors like finance and healthcare, those saved milliseconds and higher accuracy rates translate into competitive advantage.
Benchmarking Gemini 3 vs ChatGPT-4.5: An Evidence-Based Synopsis
There are nuances to AI model comparisons that go beyond subjective preference. Consider the following enterprise-focused LLM comparisons from third-party and vendor-published data, which highlight crucial performance elements.
| Model | Release Date | Context Length | Benchmark (MMLU) | Multi-modality Support |
|---|---|---|---|---|
| Gemini 3 Ultra | Mar 2025 | 1M tokens | 90.8% | Text, Code, Image, Audio |
| GPT-4.5 Turbo | Dec 2024 | 128k tokens | 87.5% | Text, Code |
Source: MIT Technology Review Benchmarks, April 2025; OpenAI Benchmark Reports, January 2025
This table illustrates Gemini 3’s advantage in several enterprise-relevant areas. The broader context window, which extends up to 1 million tokens, allows Gemini 3 to maintain coherence across sprawling datasets—an essential capability for industries such as legal, where enterprise clients frequently analyze extensive compliance documentation or derivative contracts (McKinsey Global Institute, 2025).
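To make the context-window point concrete, here is a minimal Python sketch of the gating check an integration might run before deciding whether a document set can be analyzed in one pass or must be chunked. The four-characters-per-token heuristic, the model keys, and the default output reserve are illustrative assumptions, not vendor-published tokenizer behavior.

```python
# Illustrative context-budget check. The 4-characters-per-token heuristic and
# the context limits below are assumptions for illustration; real tokenizers
# and published limits vary by vendor and model version.

CONTEXT_LIMITS = {
    "gemini-3-ultra": 1_000_000,   # ~1M tokens, per the table above
    "gpt-4.5-turbo": 128_000,      # ~128k tokens, per the table above
}

def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token of English prose."""
    return max(1, len(text) // 4)

def fits_in_context(documents: list[str], model: str, reserve_for_output: int = 4_000) -> bool:
    """True if all documents plus an output reserve fit in a single request."""
    budget = CONTEXT_LIMITS[model] - reserve_for_output
    return sum(estimate_tokens(d) for d in documents) <= budget

def chunk(text: str, model: str, reserve_for_output: int = 4_000) -> list[str]:
    """Split an oversized document into pieces that individually fit the window."""
    max_chars = (CONTEXT_LIMITS[model] - reserve_for_output) * 4
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

if __name__ == "__main__":
    corpus = ["x" * 2_000_000]  # stand-in for roughly 500k tokens of compliance documents
    for model in CONTEXT_LIMITS:
        if fits_in_context(corpus, model):
            print(f"{model}: single-pass analysis possible")
        else:
            print(f"{model}: needs {len(chunk(corpus[0], model))} chunks")
```

Exact counts come from each vendor’s tokenizer, but a budget check of this shape is typically the first branch in a long-document pipeline: single-pass analysis where the window allows it, chunk-and-stitch where it does not.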
Implications for Enterprise Adoption
Benioff’s assertion isn’t occurring in a vacuum. In 2024, OpenAI dominated headlines for its innovation streak and developer ecosystem, including the GPT Store and custom GPT builders. However, 2025 has seen heavy investment by Google DeepMind and Alphabet in refining Gemini’s large-scale deployment and accuracy in professional domains. Google’s rollout includes native cloud capabilities and real-time API debugging tools that developers have lauded (NVIDIA Blog, 2025).
Salesforce’s interest naturally aligns with performance improvements that directly affect business integrations. For instance, adoption of Gemini 3 inside Slack, also owned by Salesforce, has helped deliver contextual thread summarization and real-time transcription enhancements. Feedback loops from millions of Slack users could be contributing to an even more finely tuned enterprise AI layer, assisted by Gemini’s real-time retrieval system paired with Google Cloud’s grounding APIs (Slack Future of Work, 2025).
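The paragraph above describes a retrieve-then-summarize flow; the sketch below shows the general shape of that pattern. The `llm_complete` callable is a hypothetical stand-in for whatever model endpoint is wired into the workspace, and the keyword-overlap retrieval is purely illustrative, not Slack’s or Google’s actual retrieval system.

```python
# Retrieve-then-summarize sketch for a chat thread. `llm_complete` is a
# hypothetical stand-in for a model call (e.g. a Gemini endpoint); the
# keyword-overlap scoring is illustrative, not any vendor's retrieval system.

from collections import Counter
from typing import Callable

def score(query: str, message: str) -> int:
    """Naive relevance score: count of query words that appear in the message."""
    query_words = Counter(query.lower().split())
    return sum(query_words[w] for w in message.lower().split() if w in query_words)

def retrieve(query: str, thread: list[str], k: int = 5) -> list[str]:
    """Pick the k most query-relevant messages from the thread."""
    return sorted(thread, key=lambda m: score(query, m), reverse=True)[:k]

def summarize_thread(query: str, thread: list[str], llm_complete: Callable[[str], str]) -> str:
    """Ground the prompt on retrieved messages, then ask the model to condense them."""
    context = "\n".join(retrieve(query, thread))
    prompt = f"Summarize the following messages with respect to: {query}\n\n{context}"
    return llm_complete(prompt)

if __name__ == "__main__":
    fake_llm = lambda prompt: "[model summary of retrieved context]"
    thread = [
        "Deploy is blocked on the billing migration.",
        "Lunch orders are in the other channel.",
        "Billing migration lands Thursday, so deploy Friday.",
    ]
    print(summarize_thread("when does the deploy happen", thread, fake_llm))
```

In production the scoring step would be an embedding or search index rather than word overlap, but the contract is the same: narrow the thread to relevant context, then hand that context to the model alongside the user’s question.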
According to Q1 2025 results and executive commentary in public filings, AI now accounts for more than 12% of Salesforce’s R&D budget, up from under 7% in 2023, highlighting how strategic AI has become in Salesforce’s growth thesis. Gartner predicts that by the end of fiscal 2025, over 70% of Fortune 100 companies will run at least one generative AI workflow through either Gemini or ChatGPT integrations (AI Trends, 2025).
Economic and Strategic Frictions in AI Leadership
The battleground for AI model supremacy isn’t limited to architecture or API capability; cost, licensing, localization, and regulatory alignment are also at play. OpenAI’s decision to remove browsing for free users beginning in April 2025 sparked concern, while Google’s Gemini 3 now features integrated search updates for all users, regardless of subscription tier. Freemium-to-premium usability is emerging as a major strategic differentiator for customer retention, particularly in educational and governmental procurement (FTC Press, 2025).
Furthermore, in Q1 2025 earnings calls, Google’s parent Alphabet emphasized AI monetization via Gemini Teams, charging business customers $29/user/month. In sharp contrast, OpenAI’s ChatGPT Enterprise tier is priced at $30-$60/user/month—a differential Salesforce leadership has reportedly taken into budgetary consideration (CNBC Markets, 2025).
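For a rough sense of that differential at scale, the sketch below annualizes the quoted per-seat prices over a hypothetical 10,000-seat deployment; the seat count is an assumption, and the prices are simply the figures cited above.

```python
# Back-of-the-envelope seat-cost comparison using the per-user prices quoted above.
# The 10,000-seat deployment size is an illustrative assumption.

SEATS = 10_000
MONTHS = 12

plans = {
    "Gemini Teams": (29, 29),           # $29/user/month flat
    "ChatGPT Enterprise": (30, 60),     # reported $30-$60/user/month range
}

for name, (low, high) in plans.items():
    low_annual = low * SEATS * MONTHS
    high_annual = high * SEATS * MONTHS
    if low == high:
        print(f"{name}: ${low_annual:,.0f} per year")
    else:
        print(f"{name}: ${low_annual:,.0f}-${high_annual:,.0f} per year")
```

At those quoted prices, the gap works out to roughly $120,000 per year at the low end of the ChatGPT Enterprise range and several million at the high end for a deployment that size, which is the kind of line item that surfaces in budget reviews.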
From a compute perspective, Google’s combination of NVIDIA H100 GPU clusters and its custom-built TPU Pods for Gemini’s training has ensured faster scalability. OpenAI, reportedly still reliant on Microsoft’s Azure infrastructure for most deployments, has faced performance concerns in high-use geographies such as Asia-Pacific, where latency has increased marginally since February 2025 (NVIDIA Blog, 2025).
What Comes Next in the LLM Race?
The Benioff endorsement signals more than momentary enthusiasm; it may signify a strategic reordering of LLM partnerships across enterprise tech. Industry analysts believe this could influence Tier 1 software vendors’ decisions on default AI models. SAP, Oracle, Adobe, and Microsoft Dynamics have all begun re-evaluating GenAI pipelines based on compatibility with multi-modal LLMs such as Gemini 3 Ultra and other frontier models, including Anthropic’s Claude 3.5 and Mistral’s Mixtral series (VentureBeat AI, 2025).
Looking forward, OpenAI’s anticipated GPT-5 has yet to arrive as of May 2025. Insiders suggest a summer release, but it must demonstrate more than marginal gains to reclaim parity with Gemini. Meanwhile, Google has already begun beta testing Gemini 4 internally, with hints from DeepMind CEO Demis Hassabis that it will dramatically expand real-world planning applications and operate more like an AI “chief of staff” (DeepMind Blog, May 2025).
Benioff notes that these shifts align with trust and an “open platform” strategy, qualities increasingly requested by enterprise clients worried about AI hallucinations, training bias, and recurring outages. In this light, Gemini’s strategic transparency, bolstered by clear model cards and open evaluation APIs, presents a serious challenge to OpenAI’s previously unshakable dominance.
Conclusion
With 2025 shaping up as the most transformative year yet for generative AI, Marc Benioff’s vocal praise for Gemini 3 over ChatGPT punctuates a broader reality: winning in AI is now about ecosystem depth, fidelity in enterprise scenarios, and smart pricing models. Salesforce’s diversified engagement with multiple LLMs makes it a bellwether, and as the company and others like it tilt toward more performant, flexible architectures like Gemini 3, the wider industry conversation is sure to follow.