OraPlan–SQL: Advancing Natural Language to SQL Conversion with Intelligent Planning

TLDR: OraPlan–SQL is a leading framework for converting complex, bilingual natural language queries into SQL. It uses a two-agent system (Planner and SQL agent) with key innovations: feedback-guided meta-prompting to refine the Planner, entity-linking for multilingual support, and plan diversification with majority voting for robustness. This approach achieved top performance in the Archer NL2SQL Challenge 2025, significantly improving execution accuracy and maintaining high SQL validity across English and Chinese.

In the rapidly evolving landscape of artificial intelligence, the ability to translate human language into executable database queries, known as Natural Language to SQL (NL2SQL), is a critical challenge. This task bridges natural language understanding with structured data reasoning, enabling users to interact with databases using everyday language. However, current systems often struggle with complex reasoning, such as arithmetic, commonsense inference, and hypothetical scenarios, especially in bilingual contexts.

A new framework, OraPlan–SQL, developed by researchers at Oracle AI, has emerged as a significant advancement in this field. It recently secured the top position in the Archer NL2SQL Evaluation Challenge 2025, a demanding bilingual benchmark that tests complex reasoning capabilities. OraPlan–SQL surpassed its closest competitor by over 6% in execution accuracy, achieving 55.0% in English and 56.7% in Chinese, while maintaining an impressive SQL validity rate of over 99%.

How OraPlan–SQL Works: A Two-Agent Approach

OraPlan–SQL operates on an intelligent, agent-based framework comprising two main components: a Planner agent and a SQL agent. The Planner agent is responsible for breaking down complex natural language queries into a series of logical, stepwise natural language plans. These plans then serve as instructions for the SQL agent, which translates them into precise, executable SQL queries. The reliability of the SQL agent in adhering to these plans means that most of the system’s refinements are focused on optimizing the Planner.
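The hand-off between the two agents can be pictured with a minimal sketch. Everything named here is an assumption for illustration: the `LLM` type, the prompt wording, and the stub functions standing in for real model calls are not taken from the OraPlan–SQL system itself.

```python
from typing import Callable

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def make_plan(question: str, schema: str, planner_llm: LLM) -> str:
    """Planner agent: turn a question into a stepwise natural language plan."""
    prompt = (
        "Break the question into numbered, stepwise instructions.\n"
        f"Schema:\n{schema}\n"
        f"Question: {question}"
    )
    return planner_llm(prompt).strip()


def plan_to_sql(plan: str, schema: str, sql_llm: LLM) -> str:
    """SQL agent: translate the plan (not the raw question) into SQL."""
    prompt = (
        "Translate the plan into one SQL query. Follow the plan exactly.\n"
        f"Schema:\n{schema}\n"
        f"Plan:\n{plan}"
    )
    return sql_llm(prompt).strip()


def nl2sql(question: str, schema: str, planner_llm: LLM, sql_llm: LLM) -> str:
    """Run the Planner first, then hand its plan to the SQL agent."""
    plan = make_plan(question, schema, planner_llm)
    return plan_to_sql(plan, schema, sql_llm)


# Stub models so the sketch runs without any API access (illustrative only).
def fake_planner(prompt: str) -> str:
    return "1. Filter orders to year 2024.\n2. Sum the amount column.\n"


def fake_sql_agent(prompt: str) -> str:
    # The SQL agent only sees the plan and schema, never the original question.
    assert "Sum the amount column" in prompt
    return "SELECT SUM(amount) FROM orders WHERE year = 2024;"


sql = nl2sql(
    "What was total revenue in 2024?",
    "orders(amount, year)",
    fake_planner,
    fake_sql_agent,
)
print(sql)
```

Because the SQL agent receives only the plan, any improvement to the plan carries through to the final query, which is consistent with the article's point that refinement effort concentrates on the Planner.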

Innovations in Planning and Robustness

Unlike previous methods that often rely on multiple specialized sub-agents, leading to orchestration complexities, OraPlan–SQL introduces a streamlined approach. It employs a feedback-guided meta-prompting strategy to refine a single Planner agent. This involves analyzing common failure cases from a test dataset, clustering them with human input, and then using a large language model (LLM) to distill these errors into corrective guidelines. These guidelines are then integrated directly into the Planner’s system prompt, enhancing its ability to generalize and correct errors without adding system complexity.
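The feedback loop described above can be sketched roughly as follows. This is an illustration under assumptions, not the authors' implementation: the cluster labels, failure notes, prompt text, and the `fake_distiller` stub that stands in for an LLM are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def distill_guidelines(clusters: Dict[str, List[str]], distill_llm: LLM) -> List[str]:
    """Ask an LLM for one corrective guideline per cluster of failure cases."""
    guidelines = []
    for label, failures in clusters.items():
        bullet_list = "\n".join(f"- {f}" for f in failures)
        prompt = (
            f"Failure category: {label}\n"
            f"Observed failures:\n{bullet_list}\n"
            "Write one short guideline that would prevent these failures."
        )
        guidelines.append(distill_llm(prompt).strip())
    return guidelines


def build_planner_prompt(base_prompt: str, guidelines: List[str]) -> str:
    """Append the distilled guidelines to the Planner's system prompt."""
    if not guidelines:
        return base_prompt
    numbered = "\n".join(f"{i}. {g}" for i, g in enumerate(guidelines, 1))
    return f"{base_prompt}\n\nGuidelines:\n{numbered}"


# Stub distiller so the sketch runs offline (illustrative only).
def fake_distiller(prompt: str) -> str:
    if "percentage" in prompt:
        return "Compute percentages as 100.0 * part / total."
    return "Check the date range in the question before filtering."


# Hypothetical clusters, as they might look after human-assisted grouping.
clusters = {
    "arithmetic": ["Plan computed a percentage using integer division."],
    "time filters": ["Plan ignored the year mentioned in the question."],
}

guidelines = distill_guidelines(clusters, fake_distiller)
planner_prompt = build_planner_prompt("You are a planning agent.", guidelines)
print(planner_prompt)
```

The design point is that the corrections live in the prompt text of one agent, so no additional sub-agents or orchestration logic are introduced.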

For multilingual scenarios, particularly addressing challenges like transliteration and entity mismatches (where names or terms might be spelled differently across languages or databases), OraPlan–SQL incorporates entity-linking guidelines. These guidelines instruct the Planner to generate alternative forms for entities found in the query, explicitly including them in the plan to improve matching accuracy during SQL generation.
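One way to picture how entity variants listed in a plan could reach the final query is the sketch below. It assumes the variants have already been produced by the Planner. The helper `variant_predicate`, the `city` table, and its values are hypothetical and only serve the example.

```python
import sqlite3
from typing import List, Tuple


def variant_predicate(column: str, variants: List[str]) -> Tuple[str, List[str]]:
    """Build a parameterized IN clause that matches any surface form of an entity."""
    unique = list(dict.fromkeys(variants))  # drop duplicates, keep order
    placeholders = ", ".join("?" for _ in unique)
    return f"{column} IN ({placeholders})", unique


# Toy database in which the entity is stored under its English spelling.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (name TEXT, visitors INTEGER)")
conn.execute("INSERT INTO city VALUES ('Beijing', 100)")
conn.execute("INSERT INTO city VALUES ('Shanghai', 200)")

# Alternative forms a Planner might list for an entity in a Chinese question.
clause, params = variant_predicate("name", ["北京", "Beijing", "Peking", "Beijing"])
rows = conn.execute(f"SELECT visitors FROM city WHERE {clause}", params).fetchall()
print(clause, rows)
```

A query that filtered only on the literal string from the Chinese question would return no rows here, while the variant list still reaches the stored spelling.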

To further enhance reliability and robustness, the system utilizes plan diversification. Instead of generating a single plan, the Planner creates multiple candidate plans for each query. The SQL agent then produces a SQL query for each of these plans. The final output is determined through a majority voting mechanism based on the execution results of these diverse SQL queries, effectively mitigating errors that might arise from any single plan.
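The voting step can be illustrated with a small sketch using SQLite. It assumes the candidate queries have already been generated from the different plans. The function names, the toy table, and the choice to compare results without regard to row order are simplifications made for this example rather than details confirmed by the article.

```python
import sqlite3
from collections import Counter
from typing import List, Optional, Tuple


def run_query(conn: sqlite3.Connection, sql: str) -> Optional[Tuple]:
    """Execute a candidate query; return a hashable result, or None if it fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Simplification: ignore row order so equivalent results compare equal.
    return tuple(sorted(rows, key=repr))


def majority_vote(conn: sqlite3.Connection, candidates: List[str]) -> Optional[str]:
    """Return the first candidate whose execution result is the most common one."""
    executed = [(sql, run_query(conn, sql)) for sql in candidates]
    valid = [(sql, result) for sql, result in executed if result is not None]
    if not valid:
        return None
    counts = Counter(result for _, result in valid)
    winning_result, _ = counts.most_common(1)[0]
    for sql, result in valid:
        if result == winning_result:
            return sql
    return None


# Toy data and hypothetical candidates, one per diversified plan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

candidates = [
    "SELECT COUNT(*) FROM t WHERE x > 1",   # result: 2
    "SELECT COUNT(*) FROM t WHERE x >= 2",  # result: 2 (agrees with the first)
    "SELECT COUNT(*) FROM t",               # result: 3 (outlier)
    "SELEC COUNT(*) FROM t",                # invalid SQL, excluded from the vote
]
chosen = majority_vote(conn, candidates)
print(chosen)
```

Voting on execution results rather than on SQL text lets differently written but equivalent queries reinforce each other, while invalid queries are simply left out of the vote.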


Performance and Impact

The effectiveness of OraPlan–SQL’s components was validated through extensive ablation studies. These studies confirmed that explicit natural language planning, the feedback-guided meta-prompting guidelines, and the inclusion of in-context examples for the SQL agent are all crucial for achieving high accuracy. The framework also performed strongly across different underlying LLMs, indicating that the approach is robust to the choice of model.

OraPlan–SQL represents a significant leap forward in Text-to-SQL generation, particularly for complex and bilingual applications. By focusing on an optimized planning-centric approach, it offers a scalable and reliable solution for bridging the gap between human language and database interactions. For more details, see the full research paper.
