Release · June 14, 2026

Google Research's Gemini-SQL2 Dominates Benchmarks, Microsoft's SkillOpt Boosts GPT-5.5

Today's AI news highlights significant advancements in natural language processing for databases and innovative methods for optimizing AI agent performance.

The AI landscape continues its rapid evolution, with today's news showcasing breakthroughs in how AI interacts with data and how developers are refining model performance. From Google's latest benchmark-topping model to Microsoft's novel approach to instruction optimization, the focus remains on enhancing efficiency and capability across various applications.

📊 Google Research's Gemini-SQL2 Sets New Text-to-SQL Benchmark

Google Research has unveiled Gemini-SQL2, a text-to-SQL model built on Gemini 3.1 Pro. It reached 80.04 percent accuracy on the BIRD benchmark, ahead of models from OpenAI and Anthropic, as reported by The Decoder [17]. The model translates natural-language questions into executable SQL queries, a capability that Google states could enhance natural language features across its data services. The result suggests that models are getting noticeably better at interpreting complex data requests and acting on them correctly.
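To make the accuracy figure more concrete: text-to-SQL benchmarks such as BIRD are commonly scored by execution accuracy, meaning a predicted query counts as correct when it returns the same rows as a reference query on the same database. The following is a simplified sketch of that idea using Python's built-in sqlite3 module. It is not the official BIRD evaluation script, and the `orders` table, the query pairs, and the helper names are all hypothetical.

```python
import sqlite3


def run_query(conn, sql):
    """Run a query and return its rows in a stable order, or None if it fails."""
    try:
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs whose results match."""
    correct = 0
    for predicted_sql, gold_sql in pairs:
        predicted_rows = run_query(conn, predicted_sql)
        gold_rows = run_query(conn, gold_sql)
        # A prediction that fails to execute is always counted as wrong.
        if predicted_rows is not None and predicted_rows == gold_rows:
            correct += 1
    return correct / len(pairs)


# Hypothetical toy database standing in for a benchmark database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
    INSERT INTO orders VALUES (1, 'Ada', 120.0), (2, 'Ben', 35.5), (3, 'Ada', 60.0);
    """
)

# Each pair is (query produced by a model, reference query).
pairs = [
    # Different SQL text, same result: counted as correct.
    ("SELECT COUNT(*) FROM orders WHERE customer = 'Ada'",
     "SELECT COUNT(id) FROM orders WHERE customer = 'Ada'"),
    # Runs, but answers a different question: counted as wrong.
    ("SELECT SUM(total) FROM orders WHERE customer = 'Ben'",
     "SELECT SUM(total) FROM orders"),
    # References a column that does not exist: counted as wrong.
    ("SELECT name FROM orders",
     "SELECT customer FROM orders WHERE id = 2"),
    # Different filter, same rows on this data: counted as correct.
    ("SELECT customer FROM orders WHERE total > 100",
     "SELECT customer FROM orders WHERE total >= 120"),
]

accuracy = execution_accuracy(conn, pairs)
print(f"Execution accuracy: {accuracy:.2%}")
```

Comparing results rather than SQL text means two differently written queries can both be correct, which is why the first and last pairs pass. The last pair also shows a known weakness of this kind of scoring: a query can match on one database by coincidence.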

💡 Microsoft's SkillOpt Optimizes GPT-5.5 with Markdown Files

Microsoft, in collaboration with three Chinese universities, has introduced SkillOpt, a method for optimizing the instruction documents that guide AI agents. SkillOpt applies principles from traditional model training to a simple Markdown file, as detailed by The Decoder [18]. The approach has been shown to boost GPT-5.5's performance by approximately 23 points on procedural tasks. A key advantage is transferability: the same Markdown file proved effective across different models and agent environments, including Codex and Claude Code.

➕ "Count Anything" Model Achieves Breakthrough in Object Counting

A new AI model named "Count Anything" is designed to count objects in any type of image using only a text prompt, a task that has long been difficult for vision systems. As reported by The Decoder, the model cuts the error rate in half compared to existing systems in comparative tests [9]. "Count Anything" can identify and count objects ranging from individuals in crowds to cells under a microscope. The model still struggles with extremely dense object clusters and with ambiguous descriptive terms.

📄 Docling Enables Local PDF Parsing for RAG Systems

For those working with Retrieval-Augmented Generation (RAG) systems, a tool called Docling parses PDFs locally, which keeps documents under the user's control. According to Towards Data Science, Docling provides "cloud-grade structure" for extracting rich information, including table cells, OCR data, captions, and headings, all while running on a user's own machine [12]. That means no cloud uploads, no API keys, and no per-page billing, which addresses data security and cost concerns for enterprise document intelligence applications.
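As a rough illustration of what local parsing looks like in practice, the sketch below converts a PDF to Markdown with Docling's `DocumentConverter`. It assumes the `docling` package is installed, and the function name `pdf_to_markdown` and the file name in the usage comment are hypothetical. Treat it as a starting point rather than a recommended pipeline.

```python
from pathlib import Path


def pdf_to_markdown(pdf_path):
    """Convert a local PDF to Markdown text using Docling, without a cloud service."""
    # Imported lazily so this module still loads where docling is not installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(str(Path(pdf_path)))
    # The converted document keeps headings and tables, which the Markdown
    # export preserves for downstream chunking in a RAG pipeline.
    return result.document.export_to_markdown()


# Example usage (hypothetical file name):
# markdown_text = pdf_to_markdown("quarterly_report.pdf")
# print(markdown_text[:500])
```

The Markdown output can then be split into chunks and indexed like any other text source.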

What this means

Taken together, these developments show two parallel lines of work: pushing what models can do on specific, difficult tasks such as data querying and object counting, and making existing models more practical and efficient to use. The emphasis on local processing and on simple optimization methods suggests a trend toward AI tools that are more accessible and easier to control. The gains are as much about precision, efficiency, and fit with everyday workflows as they are about raw capability.

The trajectory of AI development continues to emphasize specialized capabilities and practical, user-centric improvements.
