MSchemaJS

January 9, 2026 Β· View on GitHub

A powerful database schema extraction and formatting library

License TypeScript Node

πŸ“– Background

MSchemaJS is the JavaScript/TypeScript implementation of the M-Schema project by XGenerationLab.

M-Schema is a semi-structured database schema representation format that transforms complex database information into a concise, LLM-friendly format, significantly enhancing SQL generation accuracy in Text-to-SQL applications.

M-Schema Format Example

[DB_ID] my_database
[Schema]
# Table: users 
 [(id: INT, Primary Key),
 (name: VARCHAR(100), Examples: [John Doe, Jane Smith, Bob Wilson]),
 (email: VARCHAR(255), Examples: [john@example.com, jane@example.com, bob@example.com]),
 (created_at: DATETIME, Examples: [2024-01-15 10:30:00, 2024-01-16 14:20:00])]
# Table: orders 
 [(order_id: INT, Primary Key),
 (user_id: INT),
 (amount: DECIMAL(10,2), Examples: [99.99, 149.50, 299.00]),
 (status: VARCHAR(50), Examples: [pending, completed, cancelled])]

✨ Key Features

  • πŸ”Œ Multi-Database Support: Works with MySQL, PostgreSQL, SQLite, and OceanBase Oracle mode
  • πŸ“Š Complete Schema Information: Automatically extracts table structures, column types, primary keys, comments, and other metadata
  • πŸ’‘ Intelligent Sample Data: Automatically retrieves sample values for columns to help LLMs better understand data content

πŸ“¦ Installation

npm install mschemajs
# or
pnpm install mschemajs
# or
yarn add mschemajs

Module Support

This package supports both ESM (ECMAScript Modules) and CommonJS:

// ESM
import { MSchema } from 'mschemajs';

// CommonJS
const { MSchema } = require('mschemajs');

πŸš€ Quick Start

MySQL Example

import { MSchema } from 'mschemajs';

const mschema = new MSchema({
  host: 'localhost',
  port: 3306,
  type: 'mysql',
  user: 'root',
  password: 'your_password',
});

// Connect to database
await mschema.connect();

// Get all databases
const databases = await mschema.getDatabaseNames();
console.log('Available databases:', databases);

// Get complete information for a specific database
const db = await mschema.getDatabase('my_database');
console.log(db.toString()); // Output formatted M-Schema

// Get information for a specific table
const table = db.getTable('users');
console.log(table.toString());

// Disconnect
await mschema.disconnect();

PostgreSQL Example

const mschema = new MSchema({
  host: 'localhost',
  port: 5432,
  type: 'postgresql',
  user: 'postgres',
  password: 'your_password',
  database: 'postgres',
});

await mschema.connect();
const db = await mschema.getDatabase('my_database');
console.log(db.toString());
await mschema.disconnect();

SQLite Example

const mschema = new MSchema({
  host: '',
  port: 0,
  type: 'sqlite',
  database: './example.db', // SQLite file path
});

await mschema.connect();
const databases = await mschema.getDatabaseNames();
const db = await mschema.getDatabase(databases[0]);
console.log(db.toString());
await mschema.disconnect();

OceanBase Oracle Mode Example

const mschema = new MSchema({
  host: 'localhost',
  port: 2883,
  type: 'oceanbase-oracle',
  user: 'your_username',
  password: 'your_password',
  sampleData: {
    limit: 3,
    excludeBinaryData: true,
  },
});

await mschema.connect();
const schemas = await mschema.getDatabaseNames();
const db = await mschema.getDatabase('YOUR_SCHEMA');
console.log(db.toString());
await mschema.disconnect();

πŸ“š Usage

Basic Configuration

interface DatabaseConfig {
  host: string;           // Database host address
  port: number;           // Database port
  type: DatabaseType;     // Database type
  user?: string;          // Username
  password?: string;      // Password
  database?: string;      // Database name (optional)
  sampleData?: {
    limit?: number;             // Number of sample data rows, default 3
    excludeBinaryData?: boolean; // Exclude binary data, default true
    excludeTextData?: boolean;   // Exclude text data, default false
  };
}

type DatabaseType = 'mysql' | 'postgresql' | 'sqlite' | 'oceanbase-oracle';

Core API

MSchema Class

class MSchema {
  constructor(config: DatabaseConfig);
  
  // Connect to database
  async connect(): Promise<void>;
  
  // Disconnect from database
  async disconnect(): Promise<void>;
  
  // Get all database names
  async getDatabaseNames(): Promise<string[]>;
  
  // Get complete information for a specific database
  async getDatabase(databaseName: string): Promise<Database>;
  
  // Get information for a specific table
  async getTable(databaseName: string, tableName: string): Promise<Table>;
  
  // Clear cache
  clearCache(): void;
  
  // Get configuration
  getConfig(): DatabaseConfig;
}

Database Class

class Database {
  // Get database ID
  getId(): string;
  
  // Get database name
  getName(): string;
  
  // Get schema (OceanBase Oracle mode)
  getSchema(): string | undefined;
  
  // Get all table names
  getTableNames(): string[];
  
  // Get specific table
  getTable(tableName: string): Table | undefined;
  
  // Get all tables
  getAllTables(): Table[];
  
  // Format as M-Schema string
  toString(): string;
  
  // Convert to JSON
  toJSON(): DatabaseInfo;
}

Table Class

class Table {
  // Get table name
  getName(): string;
  
  // Get all columns
  getColumns(): ColumnInfo[];
  
  // Get specific column
  getColumn(columnName: string): ColumnInfo | undefined;
  
  // Format as string
  toString(): string;
  
  // Convert to JSON
  toJSON(): TableInfo;
}

Using with Text-to-SQL

import { MSchema } from 'mschemajs';

async function textToSQL(question: string, evidence: string) {
  const mschema = new MSchema({
    host: 'localhost',
    port: 3306,
    type: 'mysql',
    user: 'root',
    password: 'password',
  });

  await mschema.connect();
  const db = await mschema.getDatabase('my_database');
  const schemaStr = db.toString();

  // Build LLM Prompt
  const prompt = `You are a MySQL data analyst. The database schema is as follows:

【Schema】
${schemaStr}

【Question】
${question}

【Evidence】
${evidence}

Based on the user's question and evidence, generate an executable SQL statement.`;

  // Call LLM API
  const response = await callLLM(prompt);
  
  await mschema.disconnect();
  return response;
}

// Example usage
const sql = await textToSQL(
  "How many users registered in January 2024?",
  "User information is stored in the users table, with the registration date field being created_at"
);

Advanced Usage: Custom Connectors

import { BaseConnector, ConnectorFactory } from 'mschemajs';

// If you need a custom connector
class MyCustomConnector extends BaseConnector {
  // Implement custom logic
}

// Register custom connector
// Then use it in configuration

πŸ”§ Development

Building the Project

# Install dependencies
pnpm install

# Compile TypeScript
pnpm build

# Watch mode
pnpm watch

# Clean build artifacts
pnpm clean

Running Examples

# Set environment variables
export MYSQL_HOST=localhost
export MYSQL_PORT=3306
export MYSQL_USER=root
export MYSQL_PASSWORD=your_password

# Run MySQL example
npx ts-node examples/mysql-example.ts

# Run PostgreSQL example
npx ts-node examples/postgresql-example.ts

# Run SQLite example
npx ts-node examples/sqlite-example.ts

# Run OceanBase Oracle example
export OCEANBASE_HOST=localhost
export OCEANBASE_PORT=2883
export OCEANBASE_USER=your_username
export OCEANBASE_PASSWORD=your_password
export OCEANBASE_SCHEMA=YOUR_SCHEMA
npx ts-node examples/oceanbase-oracle-example.ts

🀝 Contributing

Issues and Pull Requests are welcome!

If you're interested in the project or have any questions, feel free to contact us.

πŸ“„ License

This project is licensed under the MIT License. See the LICENSE file for details.