Amazon Kinesis is a family of services for processing streaming data in real time.
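Kinesis Data Streams (KDS): ingests and temporarily stores streaming data (retention is 24 hours by default and configurable up to 365 days) and lets multiple consumers read the same stream independently, each tracking its own position. Throughput scales by shard: each shard accepts up to 1 MB/s or 1,000 records/s on write and serves up to 2 MB/s on read (shared by all consumers of that shard unless enhanced fan-out is used). Use it for real-time analytics, for several consumers reading the same data, and for event replay.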
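The shard limits above translate directly into a sizing calculation. The sketch below is only a rough estimate under the per-shard numbers quoted in the text; the function name `required_shards` and its parameters are illustrative, and it assumes consumers share the 2 MB/s read limit (no enhanced fan-out), so `read_mb_per_s` should be the combined read rate of all consumers.

```python
import math

# Per-shard limits as quoted in the text above.
SHARD_WRITE_MB_PER_S = 1.0
SHARD_WRITE_RECORDS_PER_S = 1000
SHARD_READ_MB_PER_S = 2.0


def required_shards(write_mb_per_s: float, records_per_s: float, read_mb_per_s: float) -> int:
    """Return the smallest shard count that satisfies all three per-shard limits."""
    by_write_bytes = math.ceil(write_mb_per_s / SHARD_WRITE_MB_PER_S)
    by_write_records = math.ceil(records_per_s / SHARD_WRITE_RECORDS_PER_S)
    by_read_bytes = math.ceil(read_mb_per_s / SHARD_READ_MB_PER_S)
    # A stream always has at least one shard; the tightest limit decides.
    return max(1, by_write_bytes, by_write_records, by_read_bytes)


if __name__ == "__main__":
    # 5 MB/s in, 12,000 small records/s, 10 MB/s total read across consumers.
    print(required_shards(5, 12_000, 10))
```

In this example the record rate, not the byte rate, is the binding limit, which is common for streams of many small records.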
Kinesis Data Firehose (now Amazon Data Firehose): fully managed, with no consumer code to write. It automatically loads streaming data into destinations such as S3, Redshift, OpenSearch, or Splunk; can transform records inline with a Lambda function; and buffers by size and interval to batch writes, so delivery is near real-time rather than immediate. It offers no replay and no long retention. It is the simplest option when you only need to pipe data into a data store.
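A minimal sketch of the inline Lambda transform mentioned above. Firehose passes the function a batch of base64-encoded records and expects each one back with the same `recordId` and a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The JSON payload shape, the `level` field, and the added `processed` flag are assumptions made purely for illustration.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform a Firehose batch: drop DEBUG logs, tag the rest."""
    output = []
    for record in event["records"]:
        record_id = record["recordId"]
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Bad base64 or bad JSON: return the record unchanged and flag it.
            output.append({"recordId": record_id, "result": "ProcessingFailed",
                           "data": record["data"]})
            continue

        if payload.get("level") == "DEBUG":  # hypothetical field
            output.append({"recordId": record_id, "result": "Dropped",
                           "data": record["data"]})
            continue

        payload["processed"] = True  # hypothetical enrichment
        # Newline-delimit so records stay separable once batched in the destination.
        encoded = base64.b64encode((json.dumps(payload) + "\n").encode("utf-8"))
        output.append({"recordId": record_id, "result": "Ok",
                       "data": encoded.decode("utf-8")})
    return {"records": output}
```

Every incoming `recordId` must appear in the response, including dropped and failed records; that is why the handler appends an entry on every path.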
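Kinesis Data Analytics (the Flink offering is now named Amazon Managed Service for Apache Flink): runs SQL or Apache Flink applications on streaming data in real time, computes windowed aggregations (tumbling or sliding windows), and performs anomaly detection. It is billed per KPU (Kinesis Processing Unit).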
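To make the tumbling-window idea concrete, here is a plain-Python sketch of a per-key count over fixed, non-overlapping windows. It is not the Flink or SQL API; the function name and the `(timestamp_seconds, key)` event shape are assumptions, and it ignores late or out-of-order data, which a real stream processor has to handle.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per key in fixed, non-overlapping windows.

    events: iterable of (timestamp_seconds, key) pairs.
    Returns a dict mapping (window_start, key) to a count.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Each event belongs to exactly one window: the one containing its timestamp.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [(0, "a"), (5, "a"), (59, "b"), (60, "a"), (61, "a"), (125, "b")]
    for (start, key), count in sorted(tumbling_window_counts(sample, 60).items()):
        print(start, key, count)
```

A sliding window differs in that windows overlap, so one event can contribute to several windows.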
Use Kinesis instead of SQS when: you need multiple independent consumers reading the same stream, you need to replay data, ordering matters (Kinesis preserves order within a shard, that is, among records that share a partition key), you need real-time analytics, or you have high-volume log ingestion (multiple GB/s). Use SQS when: only one type of consumer is needed, the workload is a task queue, exactly-once processing is required (FIFO queues), or replay is not needed. Kinesis generally costs more than SQS, and SQS is simpler to use.
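The core difference behind this choice can be shown with a toy in-memory model. This is not AWS code: `StreamLog` and `WorkQueue` are hypothetical classes that only mimic the semantics (a retained log with per-consumer positions versus a queue where a message is handed out once), and they leave out shards, visibility timeouts, and retention expiry.

```python
from collections import deque


class StreamLog:
    """Stream-style: records are retained; each consumer tracks its own position."""

    def __init__(self):
        self._records = []
        self._positions = {}

    def put(self, record):
        self._records.append(record)

    def read(self, consumer, limit=10):
        start = self._positions.get(consumer, 0)
        batch = self._records[start:start + limit]
        self._positions[consumer] = start + len(batch)
        return batch

    def rewind(self, consumer, position=0):
        # Replay: move one consumer back without affecting the others.
        self._positions[consumer] = position


class WorkQueue:
    """Queue-style: each message is handed to one receiver and then gone."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        return self._messages.popleft() if self._messages else None


if __name__ == "__main__":
    log = StreamLog()
    for record in ["a", "b", "c"]:
        log.put(record)
    print(log.read("analytics"), log.read("archiver"))  # both see every record
    log.rewind("analytics")
    print(log.read("analytics", limit=2))  # replay from the start
```

Two consumers of the log each see all records and can rewind on their own, whereas two workers on the queue would split the messages between them.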