Text2SQL
Deccan AI offers a premium Text-to-SQL dataset for advanced Conversational BI tasks, featuring 1,000 high-quality NLQ-SQL pairs in MySQL and PostgreSQL dialects. The pairs target medium-to-hard query complexity and are designed to improve model accuracy on structured data queries.

How It Works: Annotation Schema
NLQ Generation
Business analysts craft questions reflecting common business insights and real-world needs.
SQL Query Creation
SQL experts craft efficient queries, considering schema nuances and dialect variations.
Quality Assurance
Multi-layer validation ensures each NLQ-SQL pair meets high standards of accuracy and clarity.
Deccan AI Platform Integration
Real-time error-checking and feedback enhance data quality and streamline the workflow.
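As an illustration of the kind of record the steps above produce, here is a minimal sketch in Python. The field names, the `validate_pair` helper, and the example question, tables, and query are all hypothetical; they are not taken from the Deccan AI dataset or platform, and only the two dialects and the medium-to-hard complexity range come from the description above.

```python
from dataclasses import dataclass

# Values taken from the dataset description; the field layout itself is assumed.
ALLOWED_DIALECTS = {"mysql", "postgresql"}
ALLOWED_COMPLEXITY = {"medium", "hard"}


@dataclass(frozen=True)
class NlqSqlPair:
    """One hypothetical dataset record: a question and its SQL answer."""
    nlq: str         # natural-language question written by a business analyst
    sql: str         # query written by a SQL expert
    dialect: str     # "mysql" or "postgresql"
    complexity: str  # "medium" or "hard"


def validate_pair(pair: NlqSqlPair) -> list[str]:
    """Return a list of problems found in a record (empty list means it passes)."""
    problems = []
    if not pair.nlq.strip():
        problems.append("empty question")
    if not pair.sql.strip():
        problems.append("empty SQL")
    if pair.dialect not in ALLOWED_DIALECTS:
        problems.append(f"unknown dialect: {pair.dialect}")
    if pair.complexity not in ALLOWED_COMPLEXITY:
        problems.append(f"unknown complexity: {pair.complexity}")
    return problems


# Illustrative record; the orders table and its columns are made up.
example = NlqSqlPair(
    nlq="Which five customers generated the most revenue last quarter?",
    sql=(
        "SELECT customer_id, SUM(amount) AS revenue "
        "FROM orders "
        "WHERE order_date >= date_trunc('quarter', CURRENT_DATE) - INTERVAL '3 months' "
        "AND order_date < date_trunc('quarter', CURRENT_DATE) "
        "GROUP BY customer_id ORDER BY revenue DESC LIMIT 5"
    ),
    dialect="postgresql",
    complexity="medium",
)
```

Calling `validate_pair(example)` returns an empty list. A check like this covers only the shape of a record; whether the SQL actually answers the question still needs expert review.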
Unique Dataset Features
Reduces Hallucinations
We use hierarchical document-structured chunking to minimize AI errors, delivering precise and well-annotated context.
Expert Curated
Meticulously curated by top 1% industry experts, including PhD holders, our data ensures unmatched accuracy and reliability.
Comprehensive & Domain Specific Dataset
A rich dataset tailored to cover all relevant information within a specific domain.
"Deccan AI’s premium Text2SQL dataset has significantly enhanced our model’s ability to handle complex business queries with precision. The high-quality NLQ and SQL pairs, coupled with varied SQL dialects, have greatly improved our model’s accuracy and efficiency in processing structured data. A game-changer for advanced Conversational Business Intelligence."
– Senior Data Scientist, Leading Business Intelligence Firm
The Deccan AI Advantage
Multi-touch Collaboration
Up to 3 annotators per data sample ensure high accuracy and thoroughness in annotations.
Spider Benchmarking
We enhance LLM performance with Spider benchmarking for top-tier text-to-SQL accuracy.
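Text-to-SQL benchmarks such as Spider are commonly scored by running the reference query and the model's query against the same database and comparing results. The sketch below shows that idea using Python's built-in `sqlite3` module. The `orders` table, the sample queries, and the helper names are assumptions made for illustration; this is not Deccan AI's evaluation pipeline or the official Spider scorer.

```python
import sqlite3
from collections import Counter


def execution_match(conn: sqlite3.Connection, gold_sql: str, predicted_sql: str) -> bool:
    """True if both queries return the same rows, ignoring row order."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A predicted query that fails to run counts as a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    return Counter(gold_rows) == Counter(predicted_rows)


def execution_accuracy(conn: sqlite3.Connection, pairs: list[tuple[str, str]]) -> float:
    """Fraction of (gold, predicted) query pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, gold, pred) for gold, pred in pairs)
    return hits / len(pairs)


# Tiny in-memory database with made-up data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
    INSERT INTO orders (region, amount)
    VALUES ('north', 120.0), ('south', 80.0), ('north', 40.0);
    """
)

pairs = [
    # Same result, different ordering clause: counts as a match.
    ("SELECT region, SUM(amount) FROM orders GROUP BY region",
     "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"),
    # Wrong filter value: results differ.
    ("SELECT COUNT(*) FROM orders WHERE region = 'north'",
     "SELECT COUNT(*) FROM orders WHERE region = 'south'"),
    # References a table that does not exist: execution error.
    ("SELECT MAX(amount) FROM orders",
     "SELECT MAX(amount) FROM order_items"),
]

accuracy = execution_accuracy(conn, pairs)
```

Comparing rows as multisets ignores ordering, which is a simplification: questions that explicitly ask for sorted output would need an order-sensitive comparison.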
Real-time Feedback
Immediate feedback for smooth annotation processes and enhanced quality control.
Anti-Cheating Indicators
The platform ensures data integrity by preventing the use of LLMs during annotation.
Enterprise Certifications
ISO 27001

SOC2

GDPR (coming soon)
HIPAA (coming soon)

300K+
Community of experts

400K+
Total hours worked by our experts

10%
Of talent are PhDs