Ask anything. The conversation stays on your device.
Downloading the on-device model (one time)...
Model settings
Changing any of these starts a fresh model session on your next message (your visible chat is kept). Not all browsers honour temperature and top-k; Edge ignores them.
How adventurous the wording is. Low stays focused and repeatable; high is more varied but can wander.
How many of the most likely next words it may pick from at each step. 1 is the safest, most predictable choice; higher allows more surprising words.
Standing instructions that set the assistant's role and tone for the whole chat, before your first message.
Tip: temperature and top-k work together. For factual answers keep both low; for brainstorming raise the temperature first.
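To make the two settings concrete, here is a minimal sketch of how temperature and top-k are commonly combined when choosing the next word. It is a generic illustration, not the browser's actual sampler; the function name `nextWordProbabilities` and the example scores are made up for this sketch.

```typescript
// Illustrative only: a generic temperature + top-k filter,
// not the code that Chrome or Edge actually run.
function nextWordProbabilities(
  scores: number[],    // raw model score (logit) for each candidate word
  temperature: number, // > 0; lower sharpens, higher flattens
  topK: number,        // >= 1; how many candidates stay eligible
): number[] {
  if (temperature <= 0) throw new RangeError("temperature must be positive");
  if (topK < 1) throw new RangeError("topK must be at least 1");
  if (scores.length === 0) return [];

  // Keep only the topK highest-scoring candidates.
  const kept = scores
    .map((score, index) => ({ score, index }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);

  // Softmax over the kept scores after dividing by the temperature.
  // Subtracting the maximum keeps Math.exp from overflowing.
  const maxScore = kept[0].score;
  const weights = kept.map(({ score }) =>
    Math.exp((score - maxScore) / temperature),
  );
  const total = weights.reduce((sum, w) => sum + w, 0);

  // Candidates outside the top k get probability 0.
  const probabilities = scores.map(() => 0);
  kept.forEach(({ index }, i) => {
    probabilities[index] = weights[i] / total;
  });
  return probabilities;
}

// With topK = 1 only the single most likely word can be picked.
const focused = nextWordProbabilities([2, 1, 0], 1, 1);
// With a higher temperature the probability spreads across more words.
const varied = nextWordProbabilities([2, 1, 0], 2, 3);
console.log(focused, varied);
```

In this sketch, lowering both values concentrates probability on the top candidate, which is why the tip suggests keeping both low for factual answers.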
A chatbot that runs on your machine
This chat is powered by your browser's built-in Prompt API, which runs a small language model directly on your device: Gemini Nano in Chrome, Phi-4-mini in Edge. Your messages never leave the browser, there is no account, and once the model is downloaded it works offline. It is a small model, so expect it to be less capable than a cloud assistant, but quick and completely private.
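As a rough illustration of how the model settings might be handed to the Prompt API, the sketch below builds the options for a new session. The option names `temperature`, `topK` and `initialPrompts` follow the Prompt API as documented for Chrome at the time of writing; the API is experimental and may change. The type names and `buildSessionOptions` are hypothetical and not taken from this page's code.

```typescript
// Hypothetical shapes for this sketch.
interface ModelSettings {
  temperature: number;
  topK: number;
  systemPrompt: string; // the "standing instructions"
}

// Upper bounds reported by the browser (assumed to come from
// something like LanguageModel.params()).
interface ModelLimits {
  maxTemperature: number;
  maxTopK: number;
}

interface SessionOptions {
  temperature: number;
  topK: number;
  initialPrompts: { role: "system"; content: string }[];
}

function clamp(value: number, low: number, high: number): number {
  return Math.min(Math.max(value, low), high);
}

// Builds the options for a fresh session. Values outside the
// supported range are clamped rather than rejected.
function buildSessionOptions(
  settings: ModelSettings,
  limits: ModelLimits,
): SessionOptions {
  const systemPrompt = settings.systemPrompt.trim();
  return {
    temperature: clamp(settings.temperature, 0, limits.maxTemperature),
    topK: clamp(Math.round(settings.topK), 1, limits.maxTopK),
    // An empty system prompt is simply left out.
    initialPrompts: systemPrompt
      ? [{ role: "system", content: systemPrompt }]
      : [],
  };
}

// In a browser that exposes the API, the result would be used roughly as:
//   const session = await LanguageModel.create(buildSessionOptions(settings, limits));
//   const reply = await session.prompt("Hello");
```

Because a session's options are fixed when it is created, changing a setting means creating a new session, which matches the note in the model settings panel.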
Your chats stay on this device
Conversations are saved in your browser's local storage and listed in the sidebar. They are never uploaded. Reopen one to carry on where you left off, delete them one at a time with the small cross, or clear them all at once. Because they live in this browser only, they will not appear on your other devices and are removed if you clear site data or use a private window. The most recent 50 are kept.
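A minimal sketch of how saving chats with a cap of 50 could work is shown below. It is an assumption about the approach, not this page's actual code: the storage key `chats`, the `SavedChat` shape and the function names are all hypothetical. The store is passed in as a parameter so that the browser's `localStorage` (which has matching `getItem` and `setItem` methods) or an in-memory stand-in can be used.

```typescript
// Hypothetical record shape for one saved conversation.
interface SavedChat {
  id: string;
  title: string;
  updatedAt: number; // milliseconds since the epoch
  messages: { role: "user" | "assistant"; content: string }[];
}

// The subset of the Web Storage interface this sketch needs.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = "chats"; // hypothetical key name
const MAX_SAVED_CHATS = 50;

// Newest first, trimmed to the limit. Does not mutate its input.
function keepMostRecent(
  chats: SavedChat[],
  limit: number = MAX_SAVED_CHATS,
): SavedChat[] {
  return chats
    .slice()
    .sort((a, b) => b.updatedAt - a.updatedAt)
    .slice(0, limit);
}

// Reads the saved list, treating missing or corrupt data as empty.
function loadChats(store: KeyValueStore): SavedChat[] {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return [];
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as SavedChat[]) : [];
  } catch (err) {
    return [];
  }
}

// Inserts or updates one chat, then drops anything beyond the cap.
function saveChat(store: KeyValueStore, chat: SavedChat): SavedChat[] {
  const others = loadChats(store).filter((c) => c.id !== chat.id);
  const kept = keepMostRecent([chat].concat(others));
  store.setItem(STORAGE_KEY, JSON.stringify(kept));
  return kept;
}

// In the browser this would be called as: saveChat(localStorage, chat);
```

Since everything is written under one key in this browser's storage, clearing site data removes the whole list at once, and nothing is shared with other devices.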
You need to turn on a flag first
The Prompt API is still experimental on the open web, so you have to enable it manually in Chrome 138 or newer on desktop:
- Open a new tab and go to chrome://flags/#prompt-api-for-gemini-nano
- Set it to Enabled.
- Also set chrome://flags/#optimization-guide-on-device-model to Enabled BypassPerfRequirement if it is offered.
- Click Relaunch to restart Chrome, then reload this page.
Microsoft Edge has the same API: go to edge://flags, enable Prompt API for on-device language model, relaunch, and reload. The first reply can be slow while the model loads, especially in Edge, which uses a larger model.
The first message downloads the model once and needs enough free disk space and memory. See the Chrome or Edge docs for the full requirements.