Multi-agent system = multiple AI agents with specialized roles collaborating on complex tasks. Each has its own system prompt, tools, knowledge — like a team.
Why go multi-agent (not always needed):
- Specialization: focused prompts/tools → higher quality than a general agent.
- Separation of concerns: easier to maintain and swap parts.
- Parallel execution: agents run concurrently on independent sub-tasks.
- Role-play debate: multi-perspective discussion → fewer blind spots.
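The parallel-execution point can be illustrated with a minimal sketch. It assumes each agent is an I/O-bound coroutine; `run_agent` and `run_team` are hypothetical names, and the sleep stands in for a real model call.

```python
import asyncio


# Hypothetical stand-in for a specialized agent; a real one would call an LLM.
async def run_agent(role: str, sub_task: str) -> str:
    await asyncio.sleep(0.01)  # simulates I/O-bound model latency
    return f"[{role}] done: {sub_task}"


async def run_team(assignments: dict[str, str]) -> list[str]:
    # gather() runs the coroutines concurrently and returns the results
    # in the same order as the inputs.
    return await asyncio.gather(
        *(run_agent(role, task) for role, task in assignments.items())
    )


if __name__ == "__main__":
    results = asyncio.run(run_team({
        "web_searcher": "find recent sources",
        "doc_reader": "summarize the PDF",
        "data_analyst": "compute growth rate",
    }))
    for line in results:
        print(line)
```

This only pays off when the sub-tasks are truly independent; if one agent needs another's output, the work falls back to a sequential pipeline.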
Orchestration patterns:
1. Hierarchical / Manager-Worker — one supervisor/orchestrator dispatches to workers and aggregates output.
- e.g. Researcher manager → [Web Searcher, Doc Reader, Data Analyst].
- Frameworks: LangGraph supervisor, CrewAI hierarchical process.
2. Sequential / Pipeline — agent A → B → C, each handling a phase. Close to prompt chaining, but each "agent" can use tools.
- e.g. Researcher → Writer → Editor → Fact-checker.
3. Network / Peer collaboration — agents talk as peers, asking whoever has info. Flexible but prone to loops and token bloat.
- AutoGen's GroupChat is a well-known example: agents share one conversation and a manager selects the next speaker.
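4. Competitive / Debate: two agents argue opposing sides, then a judge picks a winner or synthesizes an answer. This can improve reasoning on some tasks. Related ideas, though not debate in the strict sense: Constitutional AI (a model critiques and revises its own output against written principles) and CAMEL (role-playing agents that cooperate on a task).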
5. Blackboard / Shared memory — agents read/write shared state; supervisor coordinates.
6. Swarm / Agent handoff — OpenAI Swarm pattern: agents pass control based on context (support agent → billing agent → escalation agent).
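As a rough illustration of the handoff pattern in item 6, here is a minimal sketch in plain Python. It does not use the Swarm library; `Handoff`, `AGENTS`, `run`, and the keyword routing are all hypothetical stand-ins for what an LLM-driven agent would decide. The `max_hops` guard is one simple form of termination condition.

```python
from typing import Callable, Union


class Handoff:
    """Returned by an agent that wants another agent to take over."""

    def __init__(self, target: str) -> None:
        self.target = target


# An agent maps the user message to a final reply (str) or a Handoff.
Agent = Callable[[str], Union[str, Handoff]]


def support_agent(message: str) -> Union[str, Handoff]:
    # Keyword routing stands in for an LLM's decision to transfer.
    text = message.lower()
    if "refund" in text or "invoice" in text:
        return Handoff("billing")
    return "Support: here is how to reset your password."


def billing_agent(message: str) -> Union[str, Handoff]:
    if "lawyer" in message.lower():
        return Handoff("escalation")
    return "Billing: your refund has been queued."


def escalation_agent(message: str) -> Union[str, Handoff]:
    return "Escalation: a human will contact you."


AGENTS: dict[str, Agent] = {
    "support": support_agent,
    "billing": billing_agent,
    "escalation": escalation_agent,
}


def run(message: str, start: str = "support", max_hops: int = 5) -> tuple[str, list[str]]:
    """Follow handoffs until an agent gives a final reply or the hop limit is hit."""
    current, path = start, [start]
    for _ in range(max_hops):
        result = AGENTS[current](message)
        if isinstance(result, str):
            return result, path
        current = result.target
        path.append(current)
    raise RuntimeError("handoff limit reached without a final reply")
```

Calling `run("I want a refund")` passes control from the support agent to the billing agent, which answers. Returning the path alongside the reply makes the chain of handoffs easy to log and trace.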
Challenges:
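- Token explosion: each agent has its own context, and sharing information usually means copying messages into every other agent's context, so with N agents the copied volume grows roughly as N². Mitigate with summarization and a shared scratchpad.
- Coordination failures: agents make wrong assumptions about each other's roles, duplicate work, or deadlock. Requires clear role definitions and explicit termination conditions.
- Hard debugging: an error can sit inside any single agent or in the interaction between agents. Needs tracing across the whole pipeline (LangSmith, Langfuse).
- Evaluation: judge not just the final output but each agent's behavior and contribution.
- Latency: sequential steps add up (sum of steps); parallel steps are bounded by the slowest one (max of steps), plus orchestration overhead.
- Cost: as a rough rule of thumb, easily 3–10x a single agent; measure it for your own workload.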
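The N² and sum-versus-max claims above can be checked with a little arithmetic. This is a sketch under simplifying assumptions: every agent broadcasts each message to all other agents, steps have fixed durations, and orchestration overhead is ignored. All function names are hypothetical.

```python
def broadcast_copies(n_agents: int, messages_per_agent: int = 1) -> int:
    # Each message is copied into the context of every *other* agent,
    # giving n * (n - 1) copies per round, i.e. roughly n squared.
    return n_agents * (n_agents - 1) * messages_per_agent


def scratchpad_writes(n_agents: int, messages_per_agent: int = 1) -> int:
    # Each message is written once to a shared scratchpad. This counts
    # stored copies only; agents still pay tokens for what they read.
    return n_agents * messages_per_agent


def sequential_latency(step_seconds: list[float]) -> float:
    # Steps run one after another, so durations add up.
    return sum(step_seconds)


def parallel_latency(step_seconds: list[float]) -> float:
    # Independent steps run at once, so the slowest one dominates.
    return max(step_seconds)


if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, broadcast_copies(n), scratchpad_writes(n))
    steps = [2.0, 3.0, 4.0]
    print(sequential_latency(steps), parallel_latency(steps))
```

With 10 agents, broadcasting produces 90 copies per round against 10 scratchpad writes, which is why summarizing before sharing matters more as the team grows.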
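When NOT to go multi-agent: if a single agent with a structured prompt suffices, don't. Anthropic's guidance on building agents recommends the simplest solution that works, adding complexity only when needed; multi-agent setups spend far more tokens and can lose to a single agent on both cost and quality when the task does not need them.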
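2025 frameworks: LangGraph (graph/state-machine orchestration, a common choice for production), CrewAI (role-based, approachable), AutoGen (Microsoft, flexible conversation patterns), OpenAI Swarm (minimal, educational; OpenAI now points users to its Agents SDK as the successor). Anthropic's orchestrator-worker is a design pattern rather than a framework.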