Update: June 7, 2026

Alibaba's Qwen3.7-Plus Advances Autonomous Agents, Sakana AI Pursues Self-Improving AI

Today's AI news highlights Alibaba's new multimodal agent model, Sakana AI's focus on recursive self-improvement, research into language model skill acquisition, and an open-source voice model that listens continuously.

Today's items span autonomous agents and foundational research: a model that can build an application on its own, a lab betting on self-improvement over raw compute, a study of why larger language models pick up skills that smaller ones miss, and a voice model that decides in real time when to speak.

🚀 Alibaba's Qwen3.7-Plus Unveils Multimodal Autonomous Agent Capabilities

Alibaba's Qwen team has released Qwen3.7-Plus, a proprietary multimodal agent model that combines visual perception, GUI operation, and coding within a single agent loop. According to The Decoder, the aim is to turn multimodal AI into a full autonomous agent [20]. In a demonstration, an agent built on Qwen3.7-Plus developed a vocabulary learning application on its own, generating over 10,000 lines of code across 1,000 agent calls in eleven hours. The model leads on on-screen understanding in Qwen's internal benchmarks, but its overall results are mixed. Qwen3.7-Plus is priced below Western frontier models, which positions Alibaba to compete on cost in the AI agent market.

💡 Sakana AI Establishes Research Lab for Recursive Self-Improvement

Sakana AI, a Japanese startup co-founded by Transformer co-author Llion Jones, has launched a dedicated research lab focused on recursive self-improvement (RSI), in which an AI system iteratively improves itself [11]. As reported by The Decoder, Sakana AI views RSI as a potential alternative to the compute arms race that currently dominates frontier AI labs. The bet is that self-improving systems could advance without relying on ever-larger computational budgets.

🧠 Researchers Uncover Mechanism Behind Large Language Model Skill Acquisition

New research sheds light on why larger language models acquire skills that smaller ones often miss [1]. According to The Decoder, a study of models ranging from 4 million to 4 billion parameters found that small language models struggle with rare tasks because frequent tasks continuously overwrite what they have learned. The research also proposes a practical remedy: instead of only scaling up the model, increasing the frequency of target tasks in the training data could be sufficient to improve performance. The finding suggests that the composition of training data, and not only model size, shapes which skills a model retains.
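The proposed remedy, raising the frequency of a target task in the training data, can be illustrated with a minimal sketch. This is not the study's method or code; the corpus, the task labels, the `upsample_task` helper, and the factor of 10 are all hypothetical and chosen only to show how duplication changes a rare task's share of the data.

```python
from collections import Counter


def upsample_task(examples, target_task, factor):
    """Return a new list in which every example of target_task appears `factor` times.

    Hypothetical helper for illustration; real pipelines usually reweight a
    sampler instead of physically duplicating examples.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    boosted = []
    for example in examples:
        copies = factor if example["task"] == target_task else 1
        boosted.extend([example] * copies)
    return boosted


def task_shares(examples):
    """Return each task's fraction of the dataset."""
    counts = Counter(example["task"] for example in examples)
    total = sum(counts.values())
    return {task: count / total for task, count in counts.items()}


# Toy corpus (assumed numbers): 98 frequent-task examples, 2 rare-task examples.
corpus = [{"task": "frequent", "text": f"f{i}"} for i in range(98)]
corpus += [{"task": "rare", "text": f"r{i}"} for i in range(2)]

boosted = upsample_task(corpus, "rare", factor=10)

print(task_shares(corpus)["rare"])   # rare share before: 2 of 100 examples
print(task_shares(boosted)["rare"])  # rare share after: 20 of 118 examples
```

In this toy setup the rare task goes from 2% of the examples to roughly 17%. Whether a given boost is enough for a small model to retain the skill is an empirical question that the sketch does not answer.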

🗣️ Open-Source Voice Model Offers Nonstop, Real-time Audio Interaction

A new open-source voice model, Audio Interaction, has been released, distinguishing itself by listening nonstop and making real-time decisions every 0.4 seconds on whether to speak or remain silent [17]. The Decoder highlights that, unlike models such as GPT-4o or Qwen3.5-Omni, Audio Interaction does not wait for a recording to conclude. Instead, it continuously translates, transcribes, chats, and even picks up ambient noises like coughing within a single stream. The code, model weights, and download instructions are available on GitHub under the Apache 2.0 open-source license, with the training data slated for future release.
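The control flow described above, a speak-or-stay-silent decision on a continuous stream every 0.4 seconds, might be sketched as follows. This is an assumption-laden illustration and not the released model's code or API: `run_stream`, `Decision`, and `quiet_policy` are hypothetical names, and the loudness heuristic is only a stand-in for whatever learned decision the actual model makes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

DECISION_INTERVAL_S = 0.4  # decision cadence reported in the article


@dataclass
class Decision:
    time_s: float  # stream time at which the decision was made
    speak: bool    # True = start speaking, False = stay silent


def run_stream(
    chunks: Iterable[List[float]],
    should_speak: Callable[[List[float]], bool],
) -> List[Decision]:
    """Consume audio in 0.4 s chunks and decide after each one.

    The loop never waits for the input to end: every chunk is appended to
    the running context and a decision is made immediately.
    """
    context: List[float] = []
    decisions: List[Decision] = []
    for index, chunk in enumerate(chunks):
        context.extend(chunk)
        elapsed = round((index + 1) * DECISION_INTERVAL_S, 1)
        decisions.append(Decision(time_s=elapsed, speak=should_speak(context)))
    return decisions


def quiet_policy(context: List[float], window: int = 4, threshold: float = 0.5) -> bool:
    """Hypothetical stand-in policy: speak only when recent input is quiet."""
    recent = context[-window:]
    if not recent:
        return False
    return sum(abs(sample) for sample in recent) / len(recent) < threshold


# Toy stream: two loud chunks (someone is talking), then a quiet one (a pause).
stream = [[0.9] * 4, [0.8] * 4, [0.1] * 4]
for decision in run_stream(stream, quiet_policy):
    print(decision)
```

The point of the sketch is the loop shape: input is never buffered until the end of a recording, so the decision function sees a growing context and can respond mid-stream.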

What this means

Today's developments show a dual focus in the AI industry: building more sophisticated and autonomous agents, and understanding the mechanisms that drive AI capabilities. Alibaba's Qwen3.7-Plus demonstrates the growing potential for AI to carry out complex, multi-hour tasks independently, while Sakana AI's commitment to self-improving AI points towards development that depends less on raw compute. The skill acquisition research, meanwhile, suggests that adjusting the mix of training data can sometimes substitute for scaling up the model.

Taken together, these items point towards systems that are more capable and more autonomous, with progress coming from better design as well as from scale.