kafka.apache.org/documentation/#gettingStarted
Apache Kafka
Apache Kafka: A Distributed Streaming Platform.
kafka.apache.org
Kafka's three key capabilities
- To publish (write) and subscribe to (read) streams of events, including continuous import/export of your data from other systems.
- To store streams of events durably and reliably for as long as you want.
- To process streams of events as they occur or retrospectively.
How Kafka works
Kafka is a distributed system consisting of servers and clients that communicate via a high-performance TCP network protocol.
Servers: Kafka is run as a cluster of one or more servers that can span multiple datacenters or cloud regions. Some of these servers form the storage layer, called the brokers. Other servers run Kafka Connect to continuously import and export data as event streams to integrate Kafka with your existing systems such as relational databases as well as other Kafka clusters.
Main Concepts and Terminology
- Event: a record of the fact that something happened. Events are not deleted after consumption. Instead, you define per topic how long Kafka should retain them. (Event key: "Alice", Event value: "Made a payment of $200 to Bob", Event timestamp: "Jun. 25, 2020 at 2:06 p.m.")
- Producers: client applications that publish (write) events to Kafka.
- Consumers: client applications that subscribe to (read and process) these events.
- Topic: similar to a folder in a filesystem, with events as the files in it. A topic is always multi-producer and multi-subscriber: it can have zero, one, or many producers and consumers.
- Partitioned: topics are partitioned, meaning a topic is spread over a number of "buckets" located on different Kafka brokers.
- Replicated: topics can be replicated across brokers to make the data fault-tolerant and highly available.
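To make the terms above concrete, here is a minimal in-memory sketch in Python. It is not the Kafka client API: `Event`, `ToyTopic`, and their methods are hypothetical names made up for illustration, CRC32 stands in for whatever hash a real partitioner uses, and the second payment event is invented sample data. It only tries to show three ideas from the list: events with the same key land in the same partition, reading does not delete events, and retention is what eventually removes them.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    key: str
    value: str
    timestamp: float  # seconds since the epoch


class ToyTopic:
    """In-memory stand-in for a partitioned topic (hypothetical, not a Kafka API)."""

    def __init__(self, name: str, num_partitions: int = 3):
        self.name = name
        self.num_partitions = num_partitions
        # One append-only list per partition ("bucket").
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        # Same key -> same partition. CRC32 is an assumption chosen for
        # determinism; it is not the hash Kafka's own partitioner uses.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions

    def publish(self, event: Event) -> int:
        # Producer side: append the event and report which partition got it.
        partition = self.partition_for(event.key)
        self.partitions[partition].append(event)
        return partition

    def read(self, partition: int, offset: int = 0) -> list:
        # Consumer side: reading returns a copy and deletes nothing,
        # so any number of consumers can re-read the same events.
        return list(self.partitions[partition][offset:])

    def expire(self, now: float, retention_seconds: float) -> int:
        # Retention, not consumption, is what removes events.
        removed = 0
        for i, log in enumerate(self.partitions):
            kept = [e for e in log if now - e.timestamp <= retention_seconds]
            removed += len(log) - len(kept)
            self.partitions[i] = kept
        return removed


topic = ToyTopic("payments")
p1 = topic.publish(Event("Alice", "Made a payment of $200 to Bob", 1000.0))
p2 = topic.publish(Event("Alice", "Made a payment of $50 to Carol", 2000.0))

first_read = topic.read(p1)
second_read = topic.read(p1)  # still sees both events
removed = topic.expire(now=2500.0, retention_seconds=1000.0)
print(p1 == p2, len(first_read), len(second_read), removed)
```

Replication is left out on purpose; in this toy model it would amount to keeping copies of each partition list on more than one "broker".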
https://kafka.apache.org/documentation/#intro_concepts_and_terms
위지원: I work as a data engineer and love everything related to data! I'm most interested in ETL, and I enjoy processing data until it shines ✨