RStudio Add-in for DataLinter
May 11, 2026
An RStudio add-in that brings the power of DataLinter directly into your RStudio session.
DataLinter performs automated sanity checks and linting on data frames and modelling code (23 built-in linters covering missing values, outliers, type consistency, modelling assumptions, and more). This add-in lets you lint selected code (and optional data) with a single click from the Addins menu.

Table of Contents
- Features
- Prerequisites
- Installation
- Quick Start
- Troubleshooting
- Contributing
- License
- Acknowledgements
- References
Features
- Lint any selected R code directly from the editor.
- Automatically detects `data = my_variable` arguments and sends the data frame to the linter.
- Falls back to a global `LINTER_DATA` variable when no `data=` argument is present.
- Real-time feedback inside RStudio (via the add-in UI powered by `rstudioapi`).
- Zero-configuration integration with the official DataLinter Docker server.
See the full DataLinter documentation for details on the 23 linters and configuration options.
Prerequisites
- R ≥ 4.0 (tested with imports: `rstudioapi`, `httr`, `rjson`)
- RStudio (Desktop or Server) with the Addins menu enabled
- Docker (recommended and documented deployment for DataLinter)
- Up-to-date `pak` or `devtools` + `rstudioapi`
Installation
1. DataLinter server (Docker)
docker pull ghcr.io/zgornel/datalinter-compiled:latest
2. RStudio-addin package
Note: Make sure you have up-to-date stable versions of devtools or pak and rstudioapi installed before installing the package.
# Recommended (fast & dependency-aware)
pak::pak("zgornel/Rstudio-Addin-DataLinter")
# Alternative
devtools::install_github("zgornel/Rstudio-Addin-DataLinter")
After installation the add-in appears automatically under Addins → DATALINTER.
Quick Start
- Start the DataLinter server (see "Starting the DataLinter server" below).
- In RStudio, define sample data under the `LINTER_DATA` variable:
LINTER_DATA <- mtcars # or any data.frame
- Select any modelling code (or nothing) and choose Addins → DATALINTER / Lint data and code.
- Review the lint report that appears.
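As an illustration of the steps above, the snippet below shows the kind of code one might select before running the add-in. The model formula and the `fit` name are only examples, not something the add-in requires.

```r
# Example data; any data.frame would do.
LINTER_DATA <- mtcars

# A modelling call that could be selected in the editor before running the add-in.
# It carries a `data = ...` argument, so that data frame is the one picked up.
fit <- lm(mpg ~ wt + hp, data = mtcars)
```

With nothing selected, the `LINTER_DATA` variable defined on the first line is what gets used instead.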
Starting the DataLinter server
The add-in requires the DataLinter server to be running on port 10000 (the default). Start it with:
docker run -it --rm -p 10000:10000 \
ghcr.io/zgornel/datalinter-compiled:latest \
/datalinterserver/bin/datalinterserver \
-i 0.0.0.0 \
--config-path /datalinter/config/r_modelling_config.toml \
--log-level info
If the server starts correctly, it should display something like:
[ Info: • Data linting server online @0.0.0.0:10000...
[ Info: Listening on: 0.0.0.0:10000, thread id: 1
How the add-in works
- Selected code is analyzed by the plugin.
- If the code contains `data = my_variable`, that object is automatically serialized and included.
- Code and data are sent to the server.
- If no code is selected, the plugin looks for a global `LINTER_DATA` variable.
- The plugin waits for an answer from the server and prints it when received.
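The data-lookup part of this flow can be sketched roughly as follows. This is not the add-in's actual code: `find_lint_data` is a hypothetical helper, and the regular expression is a simplifying assumption that only recognises a plain variable name after `data =`.

```r
# Hypothetical helper (an assumption, not the add-in's implementation):
# find the data frame named in `data = <name>`, else fall back to LINTER_DATA.
find_lint_data <- function(code, envir = globalenv()) {
  # Collapse multi-line selections into a single string.
  code <- paste(code, collapse = "\n")

  # Look for `data = some_name`; the capture group holds the variable name.
  pattern <- "\\bdata\\s*=\\s*([A-Za-z.][A-Za-z0-9._]*)"
  m <- regmatches(code, regexec(pattern, code, perl = TRUE))[[1]]

  # No `data =` argument found: use the global fallback variable.
  name <- if (length(m) >= 2) m[2] else "LINTER_DATA"

  if (!exists(name, envir = envir, inherits = TRUE)) return(NULL)
  obj <- get(name, envir = envir, inherits = TRUE)

  # Only data frames are considered lintable in this sketch.
  if (is.data.frame(obj)) list(name = name, data = obj) else NULL
}

res <- find_lint_data("lm(mpg ~ wt, data = mtcars)")
res$name  # "mtcars"
```

Inside RStudio, the selected text itself could be read with `rstudioapi::getActiveDocumentContext()$selection[[1]]$text` and passed in as `code`. How the add-in serializes the data and which request it sends to the server are not shown here.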
Troubleshooting
- Server not reachable → Check that Docker is running, that port 10000 is free (`docker ps`), and that the firewall allows `localhost` traffic.
- Port conflict → Change the published port (`-p 10001:10000`) and update the add-in configuration.
- Large datasets → DataLinter is designed for typical modelling data; very large objects may need custom work.
- Windows / macOS → Ensure Docker Desktop is running and "File Sharing" includes your project folders if using volumes.
- No output → Verify that the server log shows a successful connection; try increasing the logging level with `--log-level debug`.
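For the "server not reachable" case, a quick check from the R console can help tell a network problem apart from an add-in problem. The sketch below only tests whether something accepts TCP connections on the port; `server_reachable` is a hypothetical helper and does not verify that the listener is actually DataLinter.

```r
# Hypothetical helper: TRUE if a TCP connection to host:port can be opened
# within `timeout` seconds, FALSE otherwise.
server_reachable <- function(host = "localhost", port = 10000, timeout = 2) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    ),
    error = function(e) NULL  # connection refused or timed out
  )
  if (is.null(con)) return(FALSE)
  close(con)
  TRUE
}

server_reachable()              # default port 10000
server_reachable(port = 10001)  # after remapping with -p 10001:10000
```

If this returns `FALSE` while `docker ps` shows the container running, the published port mapping is a likely place to look.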
Report issues via GitHub Issues.
Contributing
Please file an issue to report a bug or request a feature. See the parent DataLinter contributing guidelines as well.
License
This project is licensed under the GNU General Public License v3.0 (GPL-3) – see LICENSE for details. The underlying DataLinter engine uses an MIT license.
Acknowledgements
The initial version of DataLinter was directly inspired by this work from Google Brain researchers.
References
[1] https://en.wikipedia.org/wiki/Lint_(software)
[2] N. Hynes, D. Sculley, M. Terry, "The data linter: Lightweight, automated sanity checking for ML data sets", NIPS MLSys Workshop, 2017
[3] The data-linter code repository