Thiết kế Search Autocomplete (Typeahead Suggestions). Tối ưu latency và ranking? (Design a Search Autocomplete / Typeahead system. How to optimize latency and ranking?)

Requirements: suggest top-K queries khi user gõ, latency < 100ms, millions of users.

Data collection: log tất cả search queries; aggregation job (Hadoop/Spark) chạy weekly/daily để tính frequency của mỗi query (hay trending score).
Trie (Prefix Tree): data structure lý tưởng cho prefix matching – mỗi node đại diện một ký tự, lưu top-K queries tại mỗi node. Query time: traverse trie theo prefix → O(prefix_length).
Trie storage: serialized trie lưu trong distributed cache (Redis).
Caching: top-20% prefixes chiếm ~80% traffic → cache aggressively; prefix cache với TTL 24h; browser cache ở client để giảm requests.
Ranking factors: frequency, freshness (trending queries), personalization (user's search history).
Trie update: không update real-time (expensive); rebuild offline và swap atomically.
Scale: shard trie theo first character (26 shards); consistent hashing để route request đến đúng shard.

Filter: cần filter offensive/spam queries trước khi hiển thị. API: GET /autocomplete?q={prefix}&limit=10 → cần < 50ms để UX tốt.

Requirements: suggest top-K queries as the user types, latency < 100ms, millions of users.

Data collection: log all search queries; run an aggregation job (Hadoop/Spark) weekly or daily to compute the frequency (or trending score) of each query.
Trie (Prefix Tree): the ideal data structure for prefix matching — each node represents a character and stores the top-K queries at that node. Query time: traverse the trie following the prefix → O(prefix_length).
Trie storage: serialize the trie and store it in a distributed cache (Redis).
Caching: the top 20% of prefixes account for ~80% of traffic → cache aggressively; prefix cache with a 24h TTL; browser-cache client-side to reduce requests.
Ranking factors: query frequency, freshness (trending queries), and personalization (user's own search history).
Trie updates: do not update in real-time (too expensive); rebuild offline and swap atomically.
Scaling: shard the trie by the first character (26 shards); use consistent hashing to route requests to the correct shard.

Filtering: screen for offensive or spam queries before displaying. API: GET /autocomplete?q={prefix}&limit=10 — must respond in < 50ms for good UX.

Xem toàn bộ System Design cùng filter theo level & chủ đề con.

Requirements: suggest top-K queries khi user gõ, latency < 100ms, millions of users.

Data collection: log tất cả search queries; aggregation job (Hadoop/Spark) chạy weekly/daily để tính frequency của mỗi query (hay trending score).
Trie (Prefix Tree): data structure lý tưởng cho prefix matching – mỗi node đại diện một ký tự, lưu top-K queries tại mỗi node. Query time: traverse trie theo prefix → O(prefix_length).
Trie storage: serialized trie lưu trong distributed cache (Redis).
Caching: top-20% prefixes chiếm ~80% traffic → cache aggressively; prefix cache với TTL 24h; browser cache ở client để giảm requests.
Ranking factors: frequency, freshness (trending queries), personalization (user's search history).
Trie update: không update real-time (expensive); rebuild offline và swap atomically.
Scale: shard trie theo first character (26 shards); consistent hashing để route request đến đúng shard.

Filter: cần filter offensive/spam queries trước khi hiển thị. API: GET /autocomplete?q={prefix}&limit=10 → cần < 50ms để UX tốt.

Requirements: suggest top-K queries as the user types, latency < 100ms, millions of users.

Data collection: log all search queries; run an aggregation job (Hadoop/Spark) weekly or daily to compute the frequency (or trending score) of each query.
Trie (Prefix Tree): the ideal data structure for prefix matching — each node represents a character and stores the top-K queries at that node. Query time: traverse the trie following the prefix → O(prefix_length).
Trie storage: serialize the trie and store it in a distributed cache (Redis).
Caching: the top 20% of prefixes account for ~80% of traffic → cache aggressively; prefix cache with a 24h TTL; browser-cache client-side to reduce requests.
Ranking factors: query frequency, freshness (trending queries), and personalization (user's own search history).
Trie updates: do not update in real-time (too expensive); rebuild offline and swap atomically.
Scaling: shard the trie by the first character (26 shards); use consistent hashing to route requests to the correct shard.

Filtering: screen for offensive or spam queries before displaying. API: GET /autocomplete?q={prefix}&limit=10 — must respond in < 50ms for good UX.