Dynamic Text-to-SQL for Enterprise Workloads with Amazon Bedrock Agent
June 5, 2025 ยท View on GitHub
Elevate your data analysis with an end-to-end agentic Text-to-SQL solution, built on AWS for enterprise-scale adaptability and resilience. Ideal for complex scenarios like fraud detection in financial services.
๐ Key Differentiators:
- Fully Agentic Process: Leveraging Amazon Bedrock, the AI agent manages the entire workflow from question to insight.
- Dynamic Schema Discovery: Adapts to evolving database structures in Amazon Athena in real-time.
- Intelligent Error Handling: Autonomous troubleshooting and query refinement using AWS Lambda.
This solution goes beyond basic Text-to-SQL, offering a robust, self-adapting system for complex enterprise environments.
Visit AWS Machine Learning Blog - Dynamic text-to-SQL for enterprise workloads with Amazon Bedrock Agents to learn more about this solution.
Sample Database Overview
This example demonstrates a fraud detection database schema commonly used in financial institutions. The schema represents core tables needed for transaction monitoring and fraud investigation.
Use Case: Fraud Investigation
Business analysts can query complex transaction patterns by asking questions in natural language, such as:
- "Show fraudulent transactions in the U.S. with device details"
- "List all transactions by John Doe in 2023 with fraud flags and merchant info"
Without Text2SQL capabilities, these queries would require:
- Knowledge of multiple table relationships
- Complex JOIN operations
- Understanding of specific column names and data types
Schema Structure
The following ERD shows the key tables used in fraud detection:

Note: This is a simplified version of a typical fraud detection database. Production environments often contain dozens or hundreds of interconnected tables.
Using the Agentic Text2SQL System
The system provides a natural language interface to your database through Amazon Bedrock Agent. Here's how it works:
-
Schema Discovery
- Lists available tables in the database
- Provides detailed schema information including column names, data types, and descriptions
- Helps users understand data structure without SQL knowledge
-
Natural Language Query Processing
- Converts natural language questions into SQL queries
- Executes queries against Amazon Athena
- Returns results in user-friendly format
-
Intelligent Error Handling
- Analyzes Athena error messages
- Identifies root causes (syntax errors, missing tables, datatype mismatches)
- Suggests and implements query modifications
- Automatically retries corrected queries
Note: Detailed agent instructions can be found in the sample agent prompt.
Important: This system is designed for Amazon Athena which uses AWS Glue Data Catalog for metadata storage. When adapting to other databases:
- Schema retrieval syntax will differ (e.g., PostgreSQL, MySQL have different commands for accessing column comments)
- Column comments are crucial for the Text2SQL agent's accuracy
- Ensure proper access to metadata in your target database
Agentic Text2SQL Flow

- User asks questions to the Amazon Bedrock Agent.
- To serve the user's questions, Agent determines the appropriate action to invoke:
- to execute the generated query with confidence, Agent will invoke athena-query tool
- to confirm the database schema first, it will invoke athena-schema-reader tool
- to understand what tables it has access to: /list_tables,
- to find out specific schema of a certain table: /describe_table
- The Lambda function sends the query to Athena to execute
- Athena queries the data from the S3 data bucket, and stores the query results in the S3 output bucket.
- The Lambda function retrieves and processes the results. If an error occurs:
- The Lambda function captures and formats the error message for Agent to understand
- The error message is returned to the Amazon Bedrock Agent. The agent analyzes the error message and tries to resolve. To retry with the modified query, Agent may repeat 2-5.
- The agent formats and presents the final responses to the user.
Getting Started
Prerequisites
- Node.js (>=v20)
- AWS CLI configured with appropriate credentials
- AWS CDK CLI installed (
npm install -g aws-cdk) (>= 2.1000.2) - Docker (Alternative: Rancher Desktop)
- Enable the model access to use Claude Sonnet 3.5 from AWS console
AWS Generative AI CDK Constructs
This project utilizes AWS Generative AI CDK Constructs, a collection of AWS CDK constructs for building generative AI applications. These constructs provide high-level components that make it easier to integrate AWS AI/ML services into your infrastructure.
Regional Configuration
The project is configured to use Amazon Bedrock in the US region by default. If you need to deploy in a different region, modify the geoRegion parameter in the cross-region inference profile configuration:
const anthropicClaudeRegionInferenceProfile = bedrock.CrossRegionInferenceProfile.fromConfig({
// Change this to your desired region:
// US (default) - bedrock.CrossRegionInferenceProfileRegion.US
// EU - bedrock.CrossRegionInferenceProfileRegion.EU,
// APAC - bedrock.CrossRegionInferenceProfileRegion.APAC
geoRegion: bedrock.CrossRegionInferenceProfileRegion.US,
model: bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V2_0
})
Note: Ensure that Amazon Bedrock and the selected model are available in your chosen region.
Repository Structure
.
โโโ bin/
โ โโโ Entry point for the CDK application
โโโ lib/
โ โโโ agent-stack.ts
โ โ โโโ Defines the Bedrock Agent and related Lambda functions
โ โโโ db-stack.ts
โ โ โโโ Sets up the Athena database and related S3 buckets
โ โโโ assets/
โ โโโ Sample data CSV files and JSON schema definitions
โโโ src/
โ โโโ athena_query/
โ โ โโโ Lambda function for executing Athena queries
โ โโโ athena_schema_reader/
โ โ โโโ Lambda function for reading Athena schema information
โ โโโ common/
โ โโโ Shared utilities and helper functions
โโโ prompt/
โโโ Agent instruction for Text2SQL
CDK Deployment
The project is deployed using AWS CDK. The bin/agentic-text2sql.ts file is the entry point for the CDK application. It creates two main stacks:
- AthenaStack: Sets up the Athena database, S3 buckets, and Glue tables.
- AgentStack: Creates the Bedrock Agent, Lambda functions, and associated resources.
Deployment
-
Clone the repository:
git clone https://github.com/aws-samples/sample-Dynamic-Text-to-SQL-with-Amazon-Bedrock-Agent.git cd sample-Dynamic-Text-to-SQL-with-Amazon-Bedrock-Agent -
Install dependencies:
npm install -
Bootstrap your AWS environment (if not already done):
cdk bootstrap aws://ACCOUNT-NUMBER/REGION -
Deploy the stacks:
cdk deploy --all
How to Test
-
To examine the sample data, go to Athena and run queries, for example:
SELECT * FROM fraud_data.accounts; -
If you run the DESCRIBE clause, you can see detailed descriptions for each column. Without these descriptions, the Agent will attempt to interpret your database schema independently, and in most cases, it will understand generic terms like transaction_id. However, keep in mind that column comments are used to help the LLM better understand the data. Try to provide descriptions that are as detailed as possible.
DESCRIBE fraud_data.transaction_history -
Go to Amazon Bedrock and select Agents.
-
Select AwsText2Sql-AgentStack-DynamicAgent and test by asking questions in Test window on your right.
Example interactions:
- What specific methods or techniques are commonly used by perpetrators in the reported fraud cases?
- What patterns or trends can we identify in the timing and location of fraud incidents?
- Show the details of customers who have made transactions with merchants located in Denver
- Provide a list of all merchants along with the total number of transactions they have processed and the number of those transactions that were flagged as fraudulent
- List the top 5 customers based on the highest transaction amounts they have made

- Click Show trace and examine each step to understand what tools are used and the agent's rationale for approaching your question.

Code Interpreter
In the context of TEXT2SQL, Code Interpreter can be particularly useful for:
- Creating data visualizations from SQL query results
- Performing additional data analysis beyond simple SQL queries
Example interactions:
-
"Create a bar chart showing the top 3 cities that have the most fraud cases"
-

Troubleshooting
- If you encounter unexpected responses in the test window, clear the session history:
- Click the broom icon (๐งน) in the top right corner of the test window
- Start a new conversation
- Check agent's rationale to understand agent's approach
- If queries consistently fail, check the CloudWatch logs for the relevant Lambda functions (Athena Query and Athena Schema Reader).
- Enable Bedrock model invocation logging to troubleshoot
Customization Considerations
When adapting this solution to your environment:
-
Agent Instructions: Update the agent instructions in
prompt/instruction.txtto reflect your specific database environment and schema discovery methods. -
Database Metadata Sources: This sample uses AWS Glue Data Catalog via Athena's
SHOW tablesandDESCRIBE tablecommands. If your metadata is stored elsewhere (PostgreSQL's information_schema, Oracle's data dictionary, etc.), you'll need to modify the schema discovery approach accordingly. -
Metadata Enrichment: If your database doesn't provide sufficient context through column comments, consider implementing a supplementary metadata repository to provide the agent with richer information about your data model, business context, and table relationships.
-
Tool Interaction Optimization: Our sample implementation processes one table at a time, which is simple but may require multiple turns for complex queries. In production environments, consider implementing bulk metadata retrieval or caching strategies to reduce latency and provide more complete context to the agent.
License
Clean Up
If you've decided to remove the resources, follow these steps carefully:
-
Destroy the stacks:
cdk destroy --all --forceThis command will destroy the CDK application without asking for confirmation.
-
Verify deletion in CloudFormation
-
Manually force delete any remaining resources if necessary.
-
Delete CloudWatch Log Groups or set retention period in CloudWatch console.