PocketPal AI
June 17, 2026
A private AI assistant that runs entirely on your phone.
Chat with language models, give them a voice, and let them use tools — all on-device. No account, no cloud, no internet required.
pocketpal.dev · Get the app · Leaderboard · PalsHub · Discussions
Why PocketPal AI?
Most AI apps are a thin window onto someone else's server — every message you type gets shipped off, logged, and analyzed somewhere you can't see. PocketPal flips that around: the AI lives on your phone, and your conversations never leave it.
- 🔒 Private by default — every prompt, response, and document stays on your device. Nothing is uploaded or stored on external servers.
- ✈️ Works offline — download a model once and it just works, with no connection and no account. On a plane, on a trail, anywhere.
- 📱 Runs on hardware you already own — real language models, voices, and tools, tuned to make the most of your phone's CPU, GPU, and NPU.
- 🆓 Free and open source — no subscription, no "pro" tier to unlock the AI. MIT-licensed and built in the open.
Privacy note: The only data that ever leaves your device is what you explicitly choose to share — benchmark results (if you opt into the leaderboard) and feedback you submit through the app.
Contents
- Features
- Get the app
- How it works
- Using the app
- For developers
- Contributing
- Roadmap
- Community & support
- License
Features
- 🧠 On-device chat — run GGUF language models (Gemma, Qwen, Phi, Llama, and more) fully offline.
- 🗣️ Text-to-speech — give your assistant a voice with on-device neural TTS (Kokoro and other engines), no cloud calls.
- 🎭 Pals — create personalized assistants with their own model, system prompt, and personality (Assistant and Roleplay types).
- 🛍️ PalsHub — discover and install community Pals, including premium ones via in-app checkout.
- 🛠️ Talents & tools — let capable Pals call built-in tools (calculator, date/time, rich HTML rendering) inside a tool-use loop.
- 📥 Hugging Face integration — search and download GGUF models, including gated ones, directly from the HF Hub with your access token.
- 📊 Benchmarking — measure tokens/sec and memory, and optionally compare on the AI Phone Leaderboard.
- ⚡ Hardware acceleration — CPU, GPU (Metal on iOS, OpenCL/Adreno on Android), and NPU (Qualcomm Hexagon) inference paths, with graceful fallback.
- 🌍 Localized — available in 11 languages, on phones and tablets, including full iPad support.
Get the app
| Platform | Store |
|---|---|
| iOS / iPadOS | App Store |
| Android | Google Play |
Three steps to your first chat:
- Install PocketPal from the App Store or Google Play.
- Download a model — tap the menu (☰) → Models, pick one that fits your phone, and download (or add one from Hugging Face).
- Load it and start chatting — that's it, you're running AI fully offline.
How it works
You don't need to know any of this to use PocketPal — but if you're curious how a phone runs real AI offline, here's the short version.
PocketPal is a four-layer stack, from the silicon up to the chat UI. Each layer has one job, and the dependency direction is strictly top-down — the JS app talks to native bridges, bridges talk to inference engines, engines target hardware backends.
| Layer | What runs here |
|---|---|
| UI & Tool Use | The React Native app (UI via React Native Paper, state via MobX, chat history in WatermelonDB). The AgentRunner drives each chat turn — streaming tokens, dispatching Talents (tools) when the model calls them, and feeding results back for follow-up reasoning. Pals are configurable personas; PalsHub is the in-app marketplace for sharing and buying them. |
| Bridging | Native modules that connect JavaScript to the engines. llama.rn bridges LLM inference over JSI; react-native-speech and onnxruntime-react-native bridge text-to-speech. |
| Engine | The inference engines. llama.cpp runs language models in the quantized GGUF format. ONNX Runtime runs TTS voice models in the ONNX format. |
| Hardware | Where the math actually happens. PocketPal targets CPU (universal fallback), GPU (Metal on iOS, OpenCL on Qualcomm Adreno for Android), and NPU (Qualcomm Hexagon) — falling back gracefully and offloading partial layers when a full backend isn't available. |
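The graceful fallback described in the Hardware row can be pictured with a small sketch. Everything below is hypothetical: the type names, the `planInference` function, and the NPU-before-GPU preference order are assumptions made for illustration, not PocketPal's or llama.rn's actual API.

```typescript
// Hypothetical names throughout; not the app's real backend-selection code.
type Backend = "npu" | "gpu" | "cpu";

interface DeviceCapabilities {
  hasNpu: boolean; // e.g. a Qualcomm Hexagon NPU is usable
  hasGpu: boolean; // e.g. Metal on iOS or OpenCL on Adreno is usable
}

interface InferencePlan {
  backend: Backend;
  offloadedLayers: number; // layers placed on the accelerator; 0 on CPU
}

// Walk an assumed preference order (NPU, then GPU) and fall back to CPU,
// which is always available, so a plan is always returned.
function planInference(
  caps: DeviceCapabilities,
  totalLayers: number,
  maxAcceleratorLayers: number, // how many layers fit on the accelerator
): InferencePlan {
  const offloaded = Math.max(0, Math.min(totalLayers, maxAcceleratorLayers));
  if (caps.hasNpu && offloaded > 0) {
    return { backend: "npu", offloadedLayers: offloaded };
  }
  if (caps.hasGpu && offloaded > 0) {
    return { backend: "gpu", offloadedLayers: offloaded };
  }
  return { backend: "cpu", offloadedLayers: 0 };
}

// A device with a GPU that can hold 20 of 32 layers: partial offload.
const plan = planInference({ hasNpu: false, hasGpu: true }, 32, 20);
```

Here `plan` selects the GPU with 20 offloaded layers, leaving the remaining layers to the CPU. A device with no accelerator gets a CPU-only plan.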
Using the app
📥 Download & load a model
- Open the app and tap the Menu (☰), then go to Models.
- Pick a model from the list and tap Download, or tap + to add one from Hugging Face or local storage.
- From Hugging Face, search GGUF models and choose a quantization that fits your device's memory and storage — download now or bookmark for later.
- After downloading, tap Load (or use the chevron icon left of the chat input to load right from the chat screen).
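As a rough illustration of "a quantization that fits your device's memory", the sketch below filters candidates by file size plus a headroom allowance. The function name, the 1 GB headroom default, and the file sizes are illustrative assumptions, not values used by the app.

```typescript
interface QuantOption {
  label: string; // quantization name, e.g. "Q4_K_M"
  fileSizeGB: number; // size of the GGUF file
}

// Keep options whose file size plus headroom fits in free memory, then
// prefer the largest one that still fits.
function pickQuantization(
  options: QuantOption[],
  freeMemoryGB: number,
  headroomGB = 1.0, // assumed allowance for context cache and runtime overhead
): QuantOption | undefined {
  return options
    .filter((o) => o.fileSizeGB + headroomGB <= freeMemoryGB)
    .sort((a, b) => b.fileSizeGB - a.fileSizeGB)[0];
}

// Illustrative sizes only; real files vary by model.
const choice = pickQuantization(
  [
    { label: "Q8_0", fileSizeGB: 4.2 },
    { label: "Q4_K_M", fileSizeGB: 2.5 },
    { label: "Q2_K", fileSizeGB: 1.6 },
  ],
  4.0,
);
```

With 4 GB free, `Q8_0` is filtered out and `Q4_K_M` is chosen as the largest option that still leaves headroom. If nothing fits, the function returns `undefined`.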
💬 Chat
- Make sure a model is loaded.
- Open the Chat page and start talking.
- The screen stays awake during inference; the keep-awake behavior turns off again once the model is idle.
- Copy a full response with the copy icon, or long-press a paragraph to copy just that.
- Edit any of your messages with a long-press — the AI regenerates from your change. Hit retry for a fresh answer, optionally with a different model.
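One way to picture edit-and-regenerate is as a truncation of the chat history at the edited message. This is a minimal sketch under that assumption; the `ChatMessage` shape and `applyEdit` helper are hypothetical and may not match how PocketPal stores or replays sessions.

```typescript
interface ChatMessage {
  id: string;
  role: "user" | "assistant";
  text: string;
}

// Replace the edited user message and drop everything after it, so the
// model can produce a new reply from that point.
function applyEdit(
  history: ChatMessage[],
  messageId: string,
  newText: string,
): ChatMessage[] {
  const index = history.findIndex(
    (m) => m.id === messageId && m.role === "user",
  );
  if (index === -1) return history; // unknown id or not a user message
  const edited: ChatMessage = { id: messageId, role: "user", text: newText };
  return history.slice(0, index).concat([edited]);
}

const before: ChatMessage[] = [
  { id: "u1", role: "user", text: "Suggest a cocktail" },
  { id: "a1", role: "assistant", text: "Try a Negroni." },
  { id: "u2", role: "user", text: "Something sweeter?" },
  { id: "a2", role: "assistant", text: "Try a Mai Tai." },
];
const after = applyEdit(before, "u2", "Something without alcohol?");
```

After the edit, `after` holds the first two messages plus the rewritten `u2`; the old reply `a2` is gone and would be regenerated.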
🎭 Pals & PalsHub
Create personalized assistants:
- Assistant Pal — pick a default model, set a system prompt (write it yourself or have the app generate one), and customize the chat input color.
- Roleplay Pal — everything above, plus location, the AI's role, and other contextual parameters.
Switch personas with the Pal picker on the chat page. Browse PalsHub in-app to discover community Pals, including premium ones via in-app checkout (available in the US on iOS and Android).
(Demo: creating a cocktail-recipe assistant.)
📊 Benchmark your device
- Open the Benchmark page.
- Run performance tests to compare speed and efficiency across models.
- Review tokens/sec and memory usage.
- Optionally share your results to the AI Phone Leaderboard.
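For reference, tokens/sec is simply the token count divided by the elapsed time in seconds. The sketch below shows that arithmetic; the type and function names are made up for illustration and are not the app's benchmark code.

```typescript
interface RunTiming {
  tokens: number; // tokens processed or generated in the run
  elapsedMs: number; // wall-clock duration in milliseconds
}

function tokensPerSecond(run: RunTiming): number {
  if (run.elapsedMs <= 0) throw new RangeError("elapsedMs must be positive");
  return run.tokens / (run.elapsedMs / 1000);
}

// Averaging several runs smooths out run-to-run noise.
function meanTokensPerSecond(runs: RunTiming[]): number {
  if (runs.length === 0) throw new RangeError("need at least one run");
  const total = runs.reduce((sum, r) => sum + tokensPerSecond(r), 0);
  return total / runs.length;
}

// 128 tokens in 8 seconds is 16 tokens/sec.
const speed = tokensPerSecond({ tokens: 128, elapsedMs: 8000 });
```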
🔑 Set up a Hugging Face token (for gated models)
- Create an access token in your Hugging Face account (docs).
- In PocketPal, go to Settings → Set Token, paste it, and save.
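To show what the token is for, here is a minimal sketch of how a gated file request to the Hugging Face Hub is typically formed: a `resolve` URL plus a bearer token header. The helper name and the repository and file names are placeholders, and this is not PocketPal's own download manager.

```typescript
// Build the URL and headers for downloading one file from the Hub.
function buildHfDownloadRequest(
  repoId: string, // "owner/repo"; placeholder values are used below
  filename: string,
  token?: string, // access token; needed for gated repositories
  revision = "main",
): { url: string; headers: Record<string, string> } {
  const path = filename
    .split("/")
    .map((part) => encodeURIComponent(part))
    .join("/");
  const url =
    `https://huggingface.co/${repoId}/resolve/` +
    `${encodeURIComponent(revision)}/${path}`;
  const headers: Record<string, string> = {};
  if (token) headers["Authorization"] = `Bearer ${token}`;
  return { url, headers };
}

const request = buildHfDownloadRequest(
  "example-org/example-model-GGUF", // hypothetical repository
  "model-q4_k_m.gguf", // hypothetical file name
  "hf_example", // placeholder token
);
```

The resulting `request.url` and `request.headers` can be passed to any HTTP client. Without a token the headers stay empty, which is enough for public models.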
💌 Send feedback
Go to App Info → "Sharing your thoughts", type your feedback — feature requests, suggestions, anything — and submit.
For developers
PocketPal is a standard React Native app. If you can build a React Native project, you can build PocketPal.
Prerequisites
- Node.js: the version is pinned in `.nvmrc` (currently `22.21.0`); run `nvm use` to match it. Older Node versions will fail the `engines` check.
- Yarn 1 (Classic): `packageManager` is pinned to `yarn@1.22.22`.
- Xcode + CocoaPods, and Ruby + Bundler (for iOS / Fastlane tooling).
- Android Studio + Android SDK/NDK.
See the React Native environment setup for platform details.
Clone, install & run
git clone https://github.com/a-ghorbani/pocketpal-ai
cd pocketpal-ai
nvm use # match the pinned Node version
yarn install # install JS dependencies
(cd ios && pod install) # iOS only
yarn start # Metro bundler
yarn ios # build + run on iOS simulator
yarn android # build + run on Android emulator
Core on-device chat works without any backend keys; only PalsHub/auth features need additional configuration.
Native-change rule: if you change `package.json`, a native module, `ios/`, `android/`, the Podfile, or `build.gradle`, re-run `pod install` and rebuild both platforms; a JS reload won't pick up native changes.
Quality gates
yarn lint # ESLint
yarn typecheck # tsc --noEmit
yarn test # Jest
yarn l10n:validate # validate locale JSON (placeholders, integrity)
Run yarn lint && yarn typecheck && yarn test before opening a PR. Commits are validated by Commitlint (Conventional Commits) via a Husky hook.
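To get a feel for what the commit hook accepts, the following sketch checks a message header against a simplified Conventional Commits pattern. The type list assumes the common `config-conventional` defaults; the repository's actual Commitlint configuration may allow a different set.

```typescript
// Simplified pattern: type, optional (scope), optional "!", then ": subject".
// The list of types is an assumption, not read from the repo's config.
const CONVENTIONAL_COMMIT =
  /^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)(\([\w./-]+\))?!?: \S.*$/;

function isConventionalCommit(message: string): boolean {
  const [header = ""] = message.split("\n"); // only the first line is checked
  return CONVENTIONAL_COMMIT.test(header);
}

const accepted = isConventionalCommit("feat: add new talent");
const rejected = isConventionalCommit("added new talent");
```

`accepted` is `true` and `rejected` is `false`: the second message has no `type:` prefix.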
Repository layout
src/
├── screens/ # Chat, Models, Pals, Benchmark, Settings, About, …
├── components/ # Reusable UI
├── store/ # MobX stores (Model, ChatSession, Pal, TTS, HF, Benchmark, …)
├── services/
│ ├── agent/ # AgentRunner — the chat / tool loop
│ ├── talents/ # Tool engines + registries
│ ├── tts/ # TTS engines (kokoro, kitten, supertonic, system)
│ ├── palshub/ # PalsHub marketplace integration
│ └── downloads/ # Model download manager
├── database/ # WatermelonDB schema, models, migrations
├── repositories/ # Data-access layer over the DB
├── locales/ # i18n JSON + lazy loader (index.ts is the registry)
└── hooks/ api/ theme/ utils/ config/ specs/
Tech stack
Versions are pinned in package.json; the highlights:
| Area | Choice |
|---|---|
| Framework | React Native 0.82.1, React 19.1.1 (New Architecture) |
| Language | TypeScript 5.0.4 |
| UI | React Native Paper 5.14.5, React Navigation |
| State | MobX 6 (mobx, mobx-react, mobx-persist-store) |
| Persistence | WatermelonDB (chat history), AsyncStorage (settings), Keychain (secrets) |
| LLM | llama.rn 0.12.4 → llama.cpp · GGUF |
| TTS | react-native-speech 2.3.1 + onnxruntime-react-native 1.23.2 · ONNX |
| Tooling | Yarn 1 (Classic), ESLint, Prettier, Jest, Husky + Commitlint |
Extending PocketPal
A Talent is a tool the model can call mid-conversation. Engines are registered in a TalentRegistry, exposed to the model as tool schemas; the AgentRunner detects a call, runs the engine, and returns the result for the next turn.
| Talent | Engine | Does |
|---|---|---|
| `calculate` | `CalculateEngine` | Arithmetic / expression evaluation |
| `datetime` | `DatetimeEngine` | Current date / time |
| `render_html` | `RenderHtmlEngine` | Renders model-produced HTML in chat |
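The register, expose, and dispatch flow described above can be sketched roughly as follows. The interface and method names here are hypothetical stand-ins chosen for illustration; the real `TalentEngine` and `TalentRegistry` in `src/services/talents/` may look different.

```typescript
// Hypothetical shapes; check src/services/talents/ for the real ones.
interface TalentEngine {
  name: string;
  description: string;
  execute(args: Record<string, unknown>): Promise<string>;
}

class TalentRegistry {
  private engines = new Map<string, TalentEngine>();

  register(engine: TalentEngine): void {
    this.engines.set(engine.name, engine);
  }

  get(name: string): TalentEngine | undefined {
    return this.engines.get(name);
  }

  // Tool descriptions that would be exposed to the model.
  schemas(): { name: string; description: string }[] {
    return Array.from(this.engines.values()).map((e) => ({
      name: e.name,
      description: e.description,
    }));
  }
}

// A toy engine that returns the current time as an ISO 8601 string.
const datetimeEngine: TalentEngine = {
  name: "datetime",
  description: "Returns the current date and time",
  execute: async () => new Date().toISOString(),
};

// One step of the loop: run the requested tool, or report it as unknown.
async function dispatchToolCall(
  registry: TalentRegistry,
  call: { name: string; args: Record<string, unknown> },
): Promise<string> {
  const engine = registry.get(call.name);
  if (!engine) return `Unknown tool: ${call.name}`;
  return engine.execute(call.args);
}

const registry = new TalentRegistry();
registry.register(datetimeEngine);
```

In this sketch, an agent loop would call `dispatchToolCall` whenever the model emits a tool call, then append the returned string to the conversation for the next turn.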
Good first contributions:
- A new Talent: implement a `TalentEngine` and register it in `src/services/talents/`.
- A new TTS engine: add it under `src/services/tts/engines/`.
- A new locale: add a JSON file in `src/locales/` (or translate on Weblate).
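For locale contributions, the kind of check a placeholder validator performs can be sketched as below. It assumes flat key-to-string JSON and `{{name}}`-style placeholders; both are assumptions for illustration, and `yarn l10n:validate` remains the authoritative check.

```typescript
type Messages = Record<string, string>;

// Extract placeholder names such as "name" from "Hello {{name}}".
function placeholders(text: string): string[] {
  const found = text.match(/\{\{\s*\w+\s*\}\}/g) ?? [];
  return found.map((p) => p.replace(/[{}\s]/g, "")).sort();
}

// Report keys that are missing or whose placeholders differ from the base.
function validateLocale(base: Messages, translated: Messages): string[] {
  const problems: string[] = [];
  for (const [key, baseText] of Object.entries(base)) {
    const value: string | undefined = translated[key];
    if (value === undefined) {
      problems.push(`missing key: ${key}`);
      continue;
    }
    if (placeholders(baseText).join(",") !== placeholders(value).join(",")) {
      problems.push(`placeholder mismatch: ${key}`);
    }
  }
  return problems;
}

const problems = validateLocale(
  { greeting: "Hello {{name}}", bye: "Bye" },
  { greeting: "Hallo {{nombre}}" },
);
```

Here `problems` lists a placeholder mismatch for `greeting` (the placeholder was renamed) and a missing `bye` key.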
Contributing
Contributions are welcome — bug reports, fixes, features, translations, and docs all help.
- Fork and branch: `git checkout -b feature/your-feature-name`
- Make your changes; run on a device/emulator (`yarn ios` / `yarn android`). Re-run `pod install` + rebuild if you touched native code.
- Gate locally: `yarn lint && yarn typecheck && yarn test`
- Commit with Conventional Commits: `git commit -m "feat: add new talent"`
- Push and open a pull request.
Please read the Contributing Guidelines and Code of Conduct first. Want to translate PocketPal into your language? Join us on Weblate.
Roadmap
- Tool use expansion — grow the Talents catalog and deepen the agentic loop so Pals can do more, fully on-device.
Have an idea or found a bug? Open an issue or start a discussion.
Community & support
- 💬 Questions & ideas — GitHub Discussions
- 🐛 Bugs & requests — GitHub Issues
- 🌐 Website — pocketpal.dev
- ❤️ Support development — PocketPal is free and ad-free; sponsoring helps keep it that way.
License
Licensed under the MIT License.
Acknowledgements
PocketPal AI stands on the shoulders of the open-source community, including:
- llama.cpp — efficient on-device LLM inference.
- llama.rn — llama.cpp bindings for React Native.
- react-native-speech — React Native TTS bridge powering on-device voices.
- ONNX Runtime — cross-platform inference engine powering on-device TTS.
- React Native, MobX, React Native Paper, React Navigation, WatermelonDB, and many other open-source libraries that make this project possible.
Made with ❤️ for people who want AI that stays on their phone.
If PocketPal is useful to you, consider giving it a ⭐ — it helps others find the project.