How do you get an LLM to return structured output (JSON) reliably in production?

Question

Luyện Phỏng Vấn IT · Accepted Answer

Approaches, in order of increasing reliability:

1. Prompt-only: ask for JSON and include an example schema. Fragile: the model may wrap the output in markdown, add comments, or omit fields.
2. JSON mode: OpenAI response_format: {"type": "json_object"}, Anthropic prompt patterns. The output is valid JSON (guaranteed by OpenAI's JSON mode, only encouraged by prompt patterns), but no schema is enforced.
3. Constrained / Structured Outputs: OpenAI response_format: {"type": "json_schema", "json_schema": {"name": ..., "schema": ..., "strict": true}}, Google Gemini response_schema, Anthropic tool use. The model is restricted to generating only tokens that are valid under the schema, so the output is guaranteed to match it. Recommended for production.
4. Function / Tool calling: declare a tool with a JSON schema; when the tool is triggered, the model calls it with arguments that follow the schema.
5. Client-side libraries: Instructor, Outlines, LMQL, jsonformer. These wrap the LLM call around a Pydantic model or schema and either retry automatically when parsing fails or constrain generation directly.

With a local model you can use logit biasing or grammar-constrained decoding (llama.cpp GBNF, vLLM guided_json).

Whichever approach you use, always validate the output with Pydantic/Zod at the boundary before it reaches business logic, and keep a retry that feeds a description of the validation error back to the model as a fallback. Avoid parsing JSON with regex: it breaks easily when the output contains nested strings.
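The "validate at the boundary, retry with the error" advice can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it uses only the standard library as a stand-in for Pydantic/Zod, and the Ticket schema, the call_llm parameter, and the fake_llm stub are all hypothetical names made up for the example. In real code call_llm would wrap whichever provider SDK you use.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Ticket:
    """Hypothetical target schema, used only for this illustration."""
    title: str
    priority: int


def parse_ticket(raw: str) -> Ticket:
    """Parse and validate raw model output; raise ValueError on any problem."""
    try:
        data = json.loads(raw)  # a real JSON parser, never a regex
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    title = data.get("title")
    priority = data.get("priority")
    if not isinstance(title, str):
        raise ValueError("field 'title' must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(priority, int) or isinstance(priority, bool):
        raise ValueError("field 'priority' must be an integer")
    return Ticket(title=title, priority=priority)


def generate_ticket(
    call_llm: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> Ticket:
    """Call the model, validate, and retry with the error appended on failure."""
    current_prompt = prompt
    last_error = None
    for _ in range(max_attempts):
        raw = call_llm(current_prompt)
        try:
            return parse_ticket(raw)
        except ValueError as exc:
            last_error = exc
            # Feed the validation error back so the model can correct itself.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was rejected: {exc}\n"
                "Reply with only a JSON object containing "
                "'title' (string) and 'priority' (integer)."
            )
    raise RuntimeError(
        f"no valid output after {max_attempts} attempts: {last_error}"
    )


# Stub model for demonstration: the first reply is malformed, the second is valid.
canned = iter(['Sure! {"title": "x"}', '{"title": "Login fails", "priority": 2}'])
seen_prompts = []


def fake_llm(prompt: str) -> str:
    seen_prompts.append(prompt)
    return next(canned)


ticket = generate_ticket(fake_llm, "Extract a ticket from: login fails, urgent.")
print(ticket)
```

Here the first stubbed reply fails JSON parsing, so the second call receives the original prompt plus the error text. With Pydantic you would likely replace parse_ticket with a model class and catch its validation error instead; the retry loop stays the same.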