In this newsletter:
- OpenAI releases GPT-5.5, which checks its work and uses fewer tokens
- Microsoft launches Agent Mode in Office apps for autonomous editing
- Google updates Workspace AI to build spreadsheets and write documents
Plus, you’ll find new AI tools and this week’s top AI news headlines!
👑 OpenAI Releases GPT-5.5, Which Checks Its Work and Uses Fewer Tokens
OpenAI released GPT-5.5, its most capable model yet, outperforming GPT-5.4 and competitors on most reported benchmarks across agentic coding, computer use, knowledge work, and scientific research while matching GPT-5.4 speed and using fewer tokens.
What's new:
- Rolling out to Plus, Pro, Business, and Enterprise users in ChatGPT and Codex
- GPT-5.5 Pro rolling out to Pro, Business, and Enterprise users in ChatGPT
- API coming very soon: GPT-5.5 at $5/$30 per million tokens (input/output), GPT-5.5 Pro at $30/$180 per million tokens (input/output)
- Available on NVIDIA GB200 and GB300 NVL72 systems
- 400K context window in Codex, 1M context window in API
- Fast mode generates tokens 1.5x faster for 2.5x the cost
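To see what those API prices mean for a real workload, here is a minimal Python sketch that estimates the bill for a job. The prices come from the list above. The model labels (`gpt-5.5`, `gpt-5.5-pro`) are hypothetical dictionary keys, not confirmed API model IDs, and applying the 2.5x fast-mode multiplier to the whole token bill is an assumption, since the announcement summary doesn't spell out how fast mode is billed.

```python
# Listed prices in USD per million tokens: (input, output).
# The keys are illustrative labels, not confirmed API model IDs.
PRICES_PER_MILLION = {
    "gpt-5.5": (5.00, 30.00),
    "gpt-5.5-pro": (30.00, 180.00),
}

# Assumption: fast mode multiplies the whole bill by 2.5.
FAST_MODE_COST_MULTIPLIER = 2.5


def estimate_cost(model, input_tokens, output_tokens, fast_mode=False):
    """Return the estimated cost in USD for one job."""
    input_price, output_price = PRICES_PER_MILLION[model]
    cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    if fast_mode:
        cost *= FAST_MODE_COST_MULTIPLIER
    return round(cost, 4)


if __name__ == "__main__":
    # Example job: 200K input tokens and 50K output tokens.
    for model in PRICES_PER_MILLION:
        print(model, estimate_cost(model, 200_000, 50_000))
    print("gpt-5.5 (fast)", estimate_cost("gpt-5.5", 200_000, 50_000, fast_mode=True))
```

Output tokens cost six times as much as input tokens at these rates, so a model that reaches the same result with fewer output tokens lowers the bill even at the same list price.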
How it performs:
- Terminal-Bench 2.0 agentic coding: 82.7% vs GPT-5.4 (75.1%), Claude Opus 4.7 (69.4%), Gemini 3.1 Pro (68.5%)
- SWE-Bench Pro: 58.6% vs GPT-5.4 (57.7%), Claude Opus 4.7 (64.3%)
- Expert-SWE internal eval: 73.1% vs GPT-5.4 (68.5%)
- GDPval knowledge work: 84.9% vs GPT-5.4 (83.0%), Claude Opus 4.7 (80.3%), Gemini 3.1 Pro (67.3%)
- OSWorld-Verified computer use: 78.7% vs GPT-5.4 (75.0%), Claude Opus 4.7 (78.0%)
- FrontierMath Tier 4: 35.4% vs GPT-5.4 (27.1%), Claude Opus 4.7 (22.9%), Gemini 3.1 Pro (16.7%)
- CyberGym: 81.8% vs GPT-5.4 (79.0%), Claude Opus 4.7 (73.1%)
- GeneBench scientific research: 25.0% vs GPT-5.4 (19.0%)
- More token-efficient on the same tasks than GPT-5.4
What it can do:
- Better at understanding intent and carrying work across tools until the task is finished
- Holds context across large systems, reasons through ambiguous failures, checks assumptions with tools
- Generates documents, spreadsheets, and slide presentations in Codex
- Computer use skills: sees what's on screen, clicks, types, navigates interfaces, moves across tools
- An internal version with a custom harness discovered a new proof about Ramsey numbers in combinatorics
Real-world usage:
- Over 85% of OpenAI employees use Codex weekly across engineering, finance, communications, marketing, data science, and product management
- The OpenAI Comms team analyzed six months of speaking requests, built a scoring framework, and validated an automated Slack agent
- The OpenAI Finance team reviewed 24,771 K-1 tax forms totaling 71,637 pages and finished the task two weeks faster
- A Go-to-Market employee automated weekly business reports, saving 5-10 hours per week
- Load balancing optimization using Codex increased token generation speeds by over 20%
Enterprise feedback:
- Cursor Co-founder: "Noticeably smarter and more persistent than GPT-5.4, stays on task significantly longer."
- Every Founder: "First coding model I've used that has serious conceptual clarity."
- MagicPath CEO: Merged a branch with hundreds of frontend changes in one shot, in 20 minutes
- NVIDIA engineer: "Losing access to GPT-5.5 feels like I've had a limb amputated."
- Jackson Laboratory immunology professor: Analyzed a 62-sample gene-expression dataset with 28,000 genes and produced a detailed research report that would have taken the team months
Cybersecurity controls:
- Stricter classifiers for potential cyber risk
- Trusted Access for Cyber program in Codex with expanded access for verified users
- Organizations defending critical infrastructure can apply for GPT-5.4-Cyber access
- Treated as High under the Preparedness Framework for biological/chemical and cybersecurity capabilities
Why it matters:
GPT-5.5 marks the shift from AI that generates answers to AI that verifies its own work before responding, addressing the reliability problem that kept companies from deploying autonomous agents at scale.
The token efficiency gains mean GPT-5.5 costs less to run than GPT-5.4 on the same tasks despite being more capable. That flips the traditional tradeoff, where smarter models cost more to operate, and changes the economics of AI deployment for enterprises.
👀 Read more about the GPT-5.5 launch!
💼 Microsoft Launches Agent Mode in Office Apps for Autonomous Editing
Microsoft rolled out Agent Mode across Word, Excel, and PowerPoint this week, shifting Copilot from answering questions to directly editing documents, manipulating spreadsheets, and building presentations.
What's new:
- Rolled out to Word, Excel, and PowerPoint after months of testing
- AI directly edits documents, reformats spreadsheets, and redesigns presentations instead of generating text in a sidebar for copy-paste
- Called "vibe working" internally, a term for a hands-off approach to document creation
- Leverages partnership with Anthropic, integrating Claude models alongside Microsoft's AI infrastructure
- Available to enterprise customers paying $30 per user monthly for Copilot
Why the change:
- Microsoft Corporate VP Sumit Chauhan: "When we first shipped Copilot, foundation models were not powerful enough to use Copilot to command the applications. This meant Copilot was a passive partner in documents: it could answer questions but missed the mark when it was asked to take action on the canvas directly."
- Early large language models could generate text, but struggled with the structured, multi-step reasoning needed to edit complex documents
- Models hallucinated formatting, broke Excel formulas, and created presentation layouts that made no visual sense
- Direct editing required fundamental advances in model architecture and training so that models could treat the canvas as a manipulable workspace
Competition:
- Google recently launched Workspace Intelligence with similar autonomous capabilities across Docs, Sheets, and Slides
- Google is aggressively pricing Workspace Intelligence to steal enterprise accounts
- Startups like Notion and Coda are building AI-native productivity tools from scratch
- The enterprise productivity market is worth billions in annual subscriptions
Why it matters:
Microsoft is admitting that Copilot's first version wasn't good enough and that it held back autonomous features until the technology caught up, taking a different approach than competitors who shipped early and iterated publicly.
Agent Mode intensifies the battle with Google for enterprise productivity dominance. The winner won't be the company whose AI generates better text, but the one whose AI employees actually trust to edit documents unsupervised without breaking critical business files.
👀 Read more about the Microsoft Agent Mode release!
📊 Google Updates Workspace With AI That Builds Spreadsheets and Writes Documents
Google announced updates to Workspace at Google Cloud Next, integrating AI automation tools that draft emails, organize Google Sheets, and write documents using data from Gmail, Calendar, Chat, and Drive.
What's new:
- Workspace Intelligence AI system draws on users' Gmail, Calendar, Chat, and Drive (Docs, Slides, Sheets) data
- Users have administrative control over what AI can see and access
- Can disable Workspace Intelligence's access to particular data sources at any time
- The more data the system has access to, the more it can assist in those areas
Google Sheets features:
- Build sheets by prompting Gemini with formatting and data retrieval instructions
- Gemini does work that a human would've previously needed to do manually
- Automatic data entry with "prompt-based" filling
- Populates spreadsheets 9x faster than manual entry by inferring what you're going to enter
- Converts unstructured data into organized tables
Google Docs features:
- Gemini can "generate, write, and refine" documents
- Powered by the Workspace Intelligence system using data from Drive, Chat, Gmail archives, and the internet
- Users prompt Gemini to "help me write" or ask it to "match" their writing style to mimic their voice
Why it matters:
Google is turning Workspace into an AI-powered office assistant that automates grunt work by reading everything in your Gmail, Calendar, and Drive, competing head-to-head with Microsoft's Agent Mode for control of enterprise productivity.
The 9x speed claim for spreadsheet filling, together with the new document-writing features, positions Google as the efficiency leader. But giving AI access to your entire work history raises the question of whether the convenience of automation is worth letting Google's systems read every email and document you've ever created.
👀 Read more about the Google Workspace updates!
My Latest LinkedIn & X/Twitter Posts:
- How to use Copilot in Excel, Word & PowerPoint (view post)
- 15-step roadmap to learn AI, from beginner to expert (view post)
- Top 11 free AI tools from Google and how to use them (view post)
- How to pick the right AI tool for your task (view post)
- 8 ChatGPT prompts for decision making and strategic thinking (view post)
In partnership with Kit:
As a creator, your time should be spent doing what you love, not juggling a dozen tools just to run your business.
Kit gives you everything you need in one place:
✅ Build and grow your email list (I use Kit for my newsletter)
✅ Easily monetize with paid newsletters and digital products
✅ Automate your emails with triggers and custom workflows
✅ Track what’s working and optimize with powerful insights
It’s not just email. It’s your entire creator business, simplified.
Join a thriving community of successful creators.
Use Kit for Free — and Start Building Smarter
New AI Tools to Boost Your Productivity:
- Vidnix: Converts images into AI-generated videos.
- Helena: Autonomous AI marketer for SEO, ads, email, and social.
- OnAir Studio: AI producer for scripting and refining shows.
- Meetz: AI sales assistant for outreach and meeting booking.
- Glif: Chat workspace with multiple AI models for creation tasks.
- ChatGPT: AI assistant for writing, coding, research, and chat.
- NamingCube: Generates brand names with domain and trademark checks.
- Daybreaker: Tracks prompts users type into AI search tools.
- Instant Klarity: AI advisor for decisions and next steps.
- NoimosAI: AI marketing team that runs campaigns proactively.
- AI Photo Editor: Edits and enhances photos with text prompts.
- MojoMake: Converts images into AI-generated videos.
- WordoBot: Generates persuasive marketing content from customer insights.
- HaloMate: AI workspace built for professionals and teams.
- Canva AI: AI tools for creating designs, content, and visuals.
- GMaps Scraper: Extracts Google Maps business data for leads and outreach.
This Week's Top AI News Headlines:
- Claude Can Now Connect to Lifestyle Apps Like Spotify, Instacart, and AllTrails to Turn the AI Assistant into a More Personal Daily Companion (View Article)
- DeepSeek Previews New AI Model That Narrows the Gap With Frontier Systems From OpenAI, Google, and Anthropic (View Article)
- Google Plans to Invest Up to $40 Billion in Anthropic Through Cash and Compute, Deepening One of AI’s Biggest Strategic Alliances (View Article)
- OpenAI Unveils Workspace Agents, an Enterprise Successor to Custom GPTs That Connect Directly to Slack, Salesforce, and Business Tools (View Article)
- X Launches AI-Powered Custom Feeds That Let Users Create Personalized Timelines Around Topics, Trends, and Interests (View Article)
- Beehiiv Rolls Out New Creator Tools Including Webinars and Customizable Paywalls to Help Newsletter Publishers Grow Revenue (View Article)
- Anthropic Reveals Claude Performance Drop Was Likely Caused by Changes to Internal Harnesses and Operating Instructions (View Article)
- NoScroll, an AI News Bot, Does Your Doomscrolling for You by Summarizing Online Chaos into Quick Daily Updates (View Article)
- Meta Signs Deal for Millions of Amazon AI CPUs in Surprise Chip Partnership as Big Tech Races to Secure Compute Capacity (View Article)
- Nothing, the Consumer Tech Brand Behind Phone Devices, Introduces AI-Powered Dictation Tool That Transcribes Speech and Rewrites Notes Across its Smartphones (View Article)
- Google and AWS Are Emerging as Separate Layers of the AI Agent Stack, With Google Controlling Intelligence and AWS Powering Execution Infrastructure (View Article)
Work With Me:
If you enjoy this newsletter, please forward it to your friends and colleagues.
Follow me on LinkedIn and X/Twitter to see my latest posts.
Have a wonderful week!
Andrew Bolis