Recommended Models

Which AI models to use for translation, and how to bring your own.

These are recommendations as of April 2026. New models are released frequently, and you can use any llama.cpp-compatible GGUF model by placing it in the models directory.

Recommended Models

| Model | Size | VRAM | Best For |
| --- | --- | --- | --- |
| Lightweight (Gemma 4 E2B) | ~1.5 GB | ~2 GB | English, lightweight. Text Reading on low-spec PCs. |
| Standard (Gemma 4 E4B) | ~5 GB | ~6 GB | English & European languages. Text Reading + Image Recognition. |
| High Quality (Qwen3-VL 8B) | ~5 GB | ~6 GB | CJK languages (Japanese, Chinese, Korean). Best Image Recognition for CJK. |
| Gemini Flash (API) | N/A | 0 GB | All languages, highest quality. No GPU needed. Free tier available. |

Language-Specific Picks

| Source Language | Text Reading | Image Recognition |
| --- | --- | --- |
| English | Lightweight / Standard | Standard |
| Japanese / Chinese / Korean | Standard | High Quality or Cloud AI |
| German / French / Spanish / etc. | Lightweight / Standard | Standard |

Using Custom Models

1. Download a GGUF model

Search Hugging Face for "GGUF" builds of the model you want. Q4_K_M quantization offers the best balance of VRAM usage and quality.
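As a rough sanity check before downloading, you can estimate a quantized model's file size from its parameter count. This is an illustrative sketch, not part of Playto; the ~4.85 bits-per-weight figure for Q4_K_M is an approximation of llama.cpp's reported average, not an exact spec:

```python
def estimate_gguf_size_gb(params_billion: float, bits_per_weight: float = 4.85) -> float:
    """Rough file-size estimate for a quantized GGUF model.

    Q4_K_M averages roughly 4.85 bits per weight in llama.cpp;
    treat the result as a ballpark figure, not an exact size.
    """
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# An 8B model at Q4_K_M lands near 5 GB, matching the table above.
print(f"{estimate_gguf_size_gb(8):.1f} GB")
```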

2. Place the model file

Copy the .gguf file to Playto's models directory:

%APPDATA%\playto\models\
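If you want to script this step, here is a minimal sketch of resolving the models directory and listing what Playto would see. The `playto\models` path under `%APPDATA%` comes from above; the helper names and the non-Windows fallback are illustrative assumptions:

```python
import os
from pathlib import Path

def playto_models_dir() -> Path:
    # %APPDATA% is set on Windows; fall back to the home directory
    # elsewhere so the sketch stays runnable. (Fallback is illustrative.)
    appdata = os.environ.get("APPDATA", str(Path.home()))
    return Path(appdata) / "playto" / "models"

def list_gguf_models(models_dir: Path) -> list[str]:
    """Return the .gguf filenames present in the given models directory."""
    if not models_dir.is_dir():
        return []
    return sorted(p.name for p in models_dir.glob("*.gguf"))
```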

3. For VLM models: add the mmproj file

If the model supports vision, you also need the mmproj (multimodal projector) file. Place it next to the model:

modelname.gguf
modelname-mmproj.gguf

4. Select in Settings

Go to Settings > Engine and select your model from the dropdown. Any llama.cpp-compatible GGUF model with a supported chat_template will work.

VRAM Performance Guide

| Available VRAM | Recommendation |
| --- | --- |
| 4 GB | Lightweight model. Use Text Reading. Image Recognition may be tight alongside a game. |
| 6-8 GB | Standard or High Quality. Text Reading + Image Recognition both work well. |
| 12 GB+ | Any model. Game and AI run comfortably side by side. |
| No GPU | Cloud AI (Gemini Flash free tier). Zero local VRAM, fast cloud inference. |

Remember: your game also uses VRAM. The numbers above are VRAM available in addition to what your game needs. Reduce GPU Layers in Settings if you run out of VRAM.