NLWeb Coding Rules and Conventions
March 4, 2026
Code Structure
Python Backend Structure
Directory Organization
AskAgent/python/
├── core/                      # Core system functionality
│   ├── query_analysis/        # Query understanding modules
│   ├── baseHandler.py         # Base handler class
│   ├── config.py              # Configuration management
│   ├── retriever.py           # Vector DB interface
│   └── ...
├── methods/                   # Specialized query handlers
│   ├── generate_answer.py     # RAG generation
│   ├── compare_items.py       # Comparison logic
│   └── ...
├── data_loading/              # Data ingestion utilities
├── llm_providers/             # LLM provider wrappers
├── webserver/                 # HTTP server
└── llm_batch_handler.py       # Batch LLM processing
Class Organization
- Single Responsibility: Each class handles one primary concern
- Base Classes: Abstract base classes define interfaces (e.g., BaseHandler)
- Inheritance: Specialized handlers inherit from base classes
- Composition: Complex functionality built through composition
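As a rough illustration of the points above, the sketch below shows an abstract base class, a specialized handler that inherits from it, and a collaborator supplied through composition. BaseHandler is named in this document, but the handle() method, CompareItemsHandler, and InMemoryRetriever are hypothetical names chosen for the example and are not the project's actual API.

```python
from abc import ABC, abstractmethod


class InMemoryRetriever:
    """Hypothetical collaborator: looks up items from a fixed dict."""

    def __init__(self, items: dict[str, str]) -> None:
        self._items = items

    def search(self, query: str) -> list[str]:
        # Return every stored value whose key appears in the query text.
        return [value for key, value in self._items.items() if key in query]


class BaseHandler(ABC):
    """Abstract base class: defines the interface every handler implements."""

    def __init__(self, retriever: InMemoryRetriever) -> None:
        # Composition: the handler uses a retriever rather than being one.
        self.retriever = retriever

    @abstractmethod
    def handle(self, query: str) -> str:
        """Process a query and return a response."""


class CompareItemsHandler(BaseHandler):
    """Specialized handler with one primary concern: comparing items."""

    def handle(self, query: str) -> str:
        matches = self.retriever.search(query)
        return " vs ".join(matches)


handler = CompareItemsHandler(InMemoryRetriever({"apple": "Apple", "pear": "Pear"}))
comparison = handler.handle("compare apple and pear")
```

Because BaseHandler declares an abstract method, instantiating it directly raises TypeError, so the interface cannot be used without a concrete subclass.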
JavaScript Frontend Structure
Module Organization
static/
├── fp-chat-interface.js # Modern chat interface (main)
├── managed-event-source.js # SSE handling
├── conversation-manager.js # Conversation state
├── json-renderer.js # Content rendering
├── type-renderers.js # Type-specific renderers
├── oauth-login.js # Authentication
└── utils.js # Shared utilities
Class Structure
- ES6 Classes: All major components use ES6 class syntax
- Module Pattern: Each file exports specific classes/functions
- Event-Driven: Components communicate via events
- Separation of Concerns: UI, state, and API logic separated
Naming Conventions
Python Naming
Files and Modules
- snake_case: base_handler.py, query_analysis.py
- Descriptive names: File name matches primary class/function
Classes
- PascalCase: NLWebHandler, VectorDBClient, AppConfig
- Descriptive: Class name indicates purpose
- Suffix patterns:
  - Handler - Request handlers
  - Client - External service clients
  - Manager - State/resource managers
Functions and Methods
- snake_case: process_query(), get_embeddings()
- Verb prefixes: get_, set_, process_, handle_
- Async prefix: async_ for async functions
- Private prefix: _ for internal methods
Variables
- snake_case: query_text, result_count
- Constants: UPPERCASE_WITH_UNDERSCORES
- Configuration: Loaded into class attributes
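The Python naming rules can be seen together in one small sketch. QueryManager and its methods are hypothetical and exist only to show each convention in context; they do not come from the codebase.

```python
MAX_RESULT_COUNT = 10  # Constant: UPPERCASE_WITH_UNDERSCORES


class QueryManager:  # PascalCase with a "Manager" suffix
    """Hypothetical state manager used only to show the naming rules."""

    default_site = "all"  # Configuration loaded into a class attribute

    def __init__(self) -> None:
        self._history: list[str] = []  # Leading underscore: internal state

    def process_query(self, query_text: str) -> int:  # Verb prefix, snake_case
        self._record(query_text)
        result_count = min(len(query_text.split()), MAX_RESULT_COUNT)
        return result_count

    def get_history(self) -> list[str]:  # get_ prefix for accessors
        return list(self._history)

    def _record(self, query_text: str) -> None:  # Private prefix
        self._history.append(query_text)

    async def async_process_query(self, query_text: str) -> int:  # async_ prefix
        return self.process_query(query_text)
```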
JavaScript Naming
Files
- kebab-case: chat-interface.js, managed-event-source.js
- Descriptive: File name indicates component/module
Classes
- PascalCase: ModernChatInterface, ConversationManager
- Descriptive: Clear indication of purpose
Methods and Functions
- camelCase: sendMessage(), handleStreamingData()
- Event handlers: on prefix (e.g., onMessage)
- Private methods: _ prefix (e.g., _processData)
Variables
- camelCase: currentQuery, isStreaming
- Constants: UPPERCASE_WITH_UNDERSCORES
- DOM elements: Descriptive names (e.g., sendButton, messagesContainer)
Edge-Case Rules
Query Processing
- Empty Queries: Return helpful message, don't process
- Malformed JSON: Log error, return error message to user
- Missing Parameters: Use sensible defaults
- Large Queries: Truncate at reasonable length (e.g., 1000 chars)
- Invalid Sites: Default to 'all' sites
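One way these query rules might fit together is sketched below. The parse_request function, the request field names ("query", "site"), the KNOWN_SITES set, and the user-facing messages are all assumptions made for illustration; only the 1000-character limit and the 'all' default come from the rules above.

```python
import json
import logging

logger = logging.getLogger(__name__)

MAX_QUERY_LENGTH = 1000           # Example limit taken from the rule above
KNOWN_SITES = {"all", "example"}  # Hypothetical set of configured sites


def parse_request(raw_body: str) -> dict:
    """Turn a raw JSON request body into a normalized request dict.

    Returns {"error": message} for input that should not be processed.
    """
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError as e:
        # Malformed JSON: log the error and report it to the user.
        logger.error("Malformed request JSON: %s", e)
        return {"error": "The request could not be read. Please try again."}
    if not isinstance(payload, dict):
        return {"error": "The request could not be read. Please try again."}

    # Empty queries: return a helpful message instead of processing.
    query = str(payload.get("query", "")).strip()
    if not query:
        return {"error": "Please enter a question to search for."}

    # Missing or invalid site: fall back to the 'all' default.
    site = payload.get("site", "all")
    if not isinstance(site, str) or site not in KNOWN_SITES:
        site = "all"

    # Large queries: truncate rather than reject.
    return {"query": query[:MAX_QUERY_LENGTH], "site": site}
```

Returning a plain dict keeps the sketch short; a real handler would more likely raise a typed error or return a dedicated result object.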
Error Handling
Python Backend
# Always catch specific exceptions
try:
    result = await vector_db.search(query)
except ConnectionError as e:
    logger.error(f"Vector DB connection failed: {e}")
    result = []  # Placeholder: return partial results or a fallback
except TimeoutError as e:
    logger.warning(f"Vector DB timeout: {e}")
    result = []  # Placeholder: use cached results if available
JavaScript Frontend
// Always provide user feedback
try {
    const response = await fetch(url);
    if (!response.ok) {
        throw new Error(`HTTP error! status: ${response.status}`);
    }
} catch (error) {
    console.error('API call failed:', error);
    this.showErrorMessage('Unable to process request. Please try again.');
}
Streaming Responses
- Connection Loss: Implement retry with exponential backoff
- Partial Data: Buffer and validate JSON before parsing
- Timeout: Close stream after reasonable time (e.g., 5 minutes)
- Memory Management: Clear old messages/results periodically
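The first two streaming rules can be sketched in Python as follows. This is an illustration only: it assumes the stream carries newline-delimited JSON, and backoff_delays, JsonLineBuffer, and the default delay values are hypothetical names and numbers, not part of the project.

```python
import json


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   max_delay: float = 30.0, retries: int = 5) -> list[float]:
    """Delays (in seconds) to wait before each reconnection attempt."""
    return [min(base * factor ** attempt, max_delay) for attempt in range(retries)]


class JsonLineBuffer:
    """Accumulates stream chunks and returns only complete, valid JSON lines."""

    def __init__(self) -> None:
        self._pending = ""

    def feed(self, chunk: str) -> list:
        self._pending += chunk
        # Everything before the last newline is complete; the rest waits
        # for the next chunk.
        *complete, self._pending = self._pending.split("\n")
        messages = []
        for line in complete:
            if not line.strip():
                continue
            try:
                messages.append(json.loads(line))
            except json.JSONDecodeError:
                # Skip lines that are not valid JSON instead of crashing.
                continue
        return messages
```

With the defaults, backoff_delays() yields waits of 1, 2, 4, 8, and 16 seconds. Production retry loops often add random jitter to these delays so that many clients do not reconnect at the same moment.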
Authentication
- Token Expiry: Refresh tokens automatically
- Invalid Tokens: Clear and redirect to login
- Network Errors: Cache auth state locally
- Multiple Tabs: Sync auth state across tabs
Data Validation
Input Sanitization
// Always escape HTML in user content
function escapeHtml(text) {
    const div = document.createElement('div');
    div.textContent = text;
    return div.innerHTML;
}
Result Validation
# Validate result structure
def validate_result(result):
    required_fields = ['url', 'name', 'site']
    if not all(field in result for field in required_fields):
        logger.warning(f"Invalid result structure: {result}")
        return None
    return result
Code Quality Rules
General Principles
- DRY (Don't Repeat Yourself): Extract common functionality
- SOLID Principles: Especially Single Responsibility
- Fail Fast: Validate inputs early
- Explicit is Better: Clear variable names over brevity
Python-Specific
- Type Hints: Use for function signatures
- Docstrings: Required for public methods
- Async/Await: Prefer over callbacks
- Context Managers: Use for resource management
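A short sketch combining the four Python-specific rules might look like this. FakeConnection, open_connection, and get_results are hypothetical stand-ins for a real vector DB client and are not taken from the codebase.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator


class FakeConnection:
    """Hypothetical stand-in for a real vector DB connection."""

    def __init__(self) -> None:
        self.closed = False

    async def search(self, query_text: str) -> list[str]:
        await asyncio.sleep(0)  # Yield control, as real I/O would
        return [f"result for {query_text}"]

    async def close(self) -> None:
        self.closed = True


@asynccontextmanager
async def open_connection() -> AsyncIterator[FakeConnection]:
    """Yield a connection and guarantee it is closed afterwards."""
    connection = FakeConnection()
    try:
        yield connection
    finally:
        await connection.close()


async def get_results(query_text: str) -> list[str]:
    """Return search results for the given query text."""
    async with open_connection() as connection:
        return await connection.search(query_text)
```

The try/finally inside the context manager means the connection is closed even when the search raises, which is the main reason to prefer it over manual cleanup.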
JavaScript-Specific
- Strict Mode: Always use 'use strict'
- Const by Default: Use const unless reassignment needed
- Arrow Functions: For callbacks and short functions
- Template Literals: For string interpolation
Testing Conventions
- Unit Tests: Test individual functions/methods
- Integration Tests: Test API endpoints
- Mock External Services: Don't rely on external APIs in tests
- Test Edge Cases: Empty inputs, large inputs, invalid data
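As a minimal sketch of these conventions, the unit test below replaces an external search client with a mock and covers an empty-input edge case. The count_results function and its client argument are hypothetical; the document does not specify a test framework, so the standard library's unittest is assumed here.

```python
import unittest
from unittest.mock import Mock


def count_results(client, query_text: str) -> int:
    """Hypothetical function under test: counts results from a search client."""
    if not query_text:
        return 0  # Empty input: do not call the external service
    return len(client.search(query_text))


class CountResultsTest(unittest.TestCase):
    def test_counts_results_from_client(self):
        client = Mock()
        client.search.return_value = ["a", "b"]
        self.assertEqual(count_results(client, "tea"), 2)
        client.search.assert_called_once_with("tea")

    def test_empty_query_skips_client(self):
        client = Mock()
        self.assertEqual(count_results(client, ""), 0)
        client.search.assert_not_called()
```

The tests can be run with `python -m unittest` from the directory containing the file.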
Configuration Management
- Environment Variables: For secrets and deployment-specific values
- YAML Files: For structured configuration
- Defaults: Always provide sensible defaults
- Validation: Validate configuration on startup
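The environment-variable, defaults, and validation rules could be combined roughly as follows. AppConfig is a class name mentioned earlier in this document, but its fields here, the APP_* variable names, and the default values are assumptions made for the example. YAML loading is left out because it needs a third-party parser.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


@dataclass(frozen=True)
class AppConfig:
    port: int
    log_level: str
    api_key: str  # Secret: supplied through the environment only


def load_config(env: Optional[Mapping[str, str]] = None) -> AppConfig:
    """Build the config from environment variables, then validate it."""
    env = os.environ if env is None else env
    config = AppConfig(
        port=int(env.get("APP_PORT", "8000")),               # Hypothetical default
        log_level=env.get("APP_LOG_LEVEL", "INFO").upper(),  # Hypothetical default
        api_key=env.get("APP_API_KEY", ""),
    )
    # Fail fast at startup rather than on the first request.
    if not 1 <= config.port <= 65535:
        raise ValueError(f"APP_PORT out of range: {config.port}")
    if config.log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"Unknown APP_LOG_LEVEL: {config.log_level}")
    if not config.api_key:
        raise ValueError("APP_API_KEY must be set")
    return config
```

Accepting an optional mapping instead of reading os.environ directly makes the loader easy to exercise in tests without touching the real environment.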
Logging
- Structured Logging: Use consistent format
- Log Levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
- Context: Include relevant context (user_id, query_id)
- No Sensitive Data: Never log passwords, tokens, or PII
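One possible shape for consistent, context-rich log lines that keep sensitive values out is sketched below. The format_event and redact helpers, the SENSITIVE_KEYS set, and the logger name are hypothetical; only the standard logging module is real.

```python
import logging

SENSITIVE_KEYS = {"password", "token", "email"}  # Hypothetical field names


def redact(fields: dict) -> dict:
    """Replace values of sensitive keys so they never reach the log."""
    return {k: "[REDACTED]" if k in SENSITIVE_KEYS else v for k, v in fields.items()}


def format_event(message: str, **context) -> str:
    """Render a message plus key=value context in one consistent format."""
    safe = redact(context)
    pairs = " ".join(f"{key}={safe[key]}" for key in sorted(safe))
    return f"{message} | {pairs}" if pairs else message


logging.basicConfig(format="%(asctime)s %(levelname)s %(name)s %(message)s")
logger = logging.getLogger("nlweb.example")  # Hypothetical logger name
logger.warning(format_event("Vector DB timeout", query_id="q-42", user_id="u-7", token="abc"))
```

Sorting the context keys keeps the output stable across calls, which makes log lines easier to search and compare.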
Security
- Input Validation: Always validate user input
- SQL Injection: Use parameterized queries
- XSS Prevention: Escape HTML content
- CORS: Configure appropriately for production
- Authentication: Verify tokens on every request
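The SQL injection and XSS rules can be illustrated with the standard library alone. This sketch assumes an SQLite database purely for demonstration; the items table, find_items, and render_item are hypothetical and do not describe the project's storage layer.

```python
import html
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE items (name TEXT, site TEXT)")
connection.execute("INSERT INTO items VALUES (?, ?)", ("Tea", "example"))


def find_items(conn: sqlite3.Connection, site: str) -> list[str]:
    """Look up item names using a parameterized query."""
    # The ? placeholder passes `site` as data, never as SQL text.
    rows = conn.execute("SELECT name FROM items WHERE site = ?", (site,))
    return [name for (name,) in rows]


def render_item(name: str) -> str:
    """Escape user-controlled text before placing it in HTML."""
    return f"<li>{html.escape(name)}</li>"
```

An input such as `example' OR '1'='1` matches no rows here, because the driver treats the whole string as a single value rather than splicing it into the statement.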
Performance
- Caching: Cache expensive operations
- Pagination: Limit result sets
- Lazy Loading: Load data as needed
- Connection Pooling: Reuse database connections
- Parallel Processing: Use asyncio for I/O operations
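A rough sketch of caching, bounded result sets, and concurrent I/O with asyncio follows. The function names, the per-site limit, and the simulated delay are assumptions for illustration; search_site stands in for a real network call.

```python
import asyncio
from functools import lru_cache


@lru_cache(maxsize=256)
def normalize_query(query_text: str) -> str:
    """Stand-in for an expensive, pure computation worth caching."""
    return " ".join(query_text.lower().split())


async def search_site(site: str, query_text: str, limit: int) -> list[str]:
    """Hypothetical per-site search; sleeps to imitate network I/O."""
    await asyncio.sleep(0.01)
    return [f"{site}:{query_text}:{i}" for i in range(limit)]


async def search_all(sites: list[str], query_text: str, limit: int = 3) -> list[str]:
    """Query every site concurrently, with at most `limit` results per site."""
    normalized = normalize_query(query_text)
    # gather() runs the searches concurrently and preserves input order.
    per_site = await asyncio.gather(
        *(search_site(site, normalized, limit) for site in sites)
    )
    return [item for results in per_site for item in results]
```

Because the searches overlap, total latency stays close to that of the slowest site instead of growing with the number of sites.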
Documentation
- README: Keep updated with setup instructions
- API Docs: Document all endpoints
- Code Comments: Explain "why", not "what"
- Examples: Provide usage examples