
Doris Kafka Connector Guide

A practical guide to loading events from Kafka into Doris: standalone mode, distributed mode, SSL, DLQ, and schema evolution.

Prepared by: Datanomix.pro
Reading time: ~14 min
CONTENTS:
01 / When you need the Kafka Connector
02 / Version compatibility
03 / Quick start (standalone)
04 / Production (distributed)
05 / SSL and security
06 / DLQ and error handling
07 / Schema evolution and Debezium
08 / Best practices
FAQ

1. When you need the Doris Kafka Connector

If you already run Kafka Connect and need to deliver data to Apache Doris reliably, the official Doris Sink Connector is the fastest route.

This approach pays off when you need operational stability and distributed scale.

2. Version compatibility

Before you start, check that your Kafka, Doris, and Java versions are compatible with the connector release you plan to use.

// MINIMAL CONNECTOR CONFIG
name=test-doris-sink
connector.class=org.apache.doris.kafka.connector.DorisSinkConnector
topics=topic_test
doris.topic2table.map=topic_test:test_kafka_tbl
doris.urls=10.10.10.1
doris.http.port=8030
doris.query.port=9030
doris.user=root
doris.password=
doris.database=test_db
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false

3. Quick start: standalone mode

  1. Place the connector JAR in the Kafka Connect plugins directory.
  2. Configure connect-standalone.properties.
  3. Create doris-connector-sink.properties.
  4. Start the worker with connect-standalone.sh.

Standalone mode is meant for PoCs and debugging. For production, distributed mode is recommended.
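
The steps above could look roughly like the worker file below. This is a minimal sketch, not a reference configuration: the broker address, plugin path, and offsets file path are placeholders assumed for illustration, while the property names are standard Kafka Connect worker settings.

```properties
# connect-standalone.properties (sketch; all values are placeholders)
bootstrap.servers=127.0.0.1:9092

# Directory that contains the Doris connector JAR (hypothetical path)
plugin.path=/opt/kafka/plugins

key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false

# Standalone mode keeps consumed offsets in a local file (hypothetical path)
offset.storage.file.filename=/tmp/connect.offsets

# Start the worker with the worker file first, then the connector file:
#   bin/connect-standalone.sh config/connect-standalone.properties config/doris-connector-sink.properties
```

Because offsets live in a local file, this setup is tied to a single host, which is one reason it suits only PoCs and debugging.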

4. Production: distributed mode

Distributed mode provides scaling, fault tolerance, and connector lifecycle management over REST.

On first startup, Kafka Connect creates its internal topics for connector configs, offsets, and status.
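
A distributed worker file might look like the sketch below. The topic names, group id, paths, and replication factors are assumptions chosen for illustration; the property names themselves are standard Kafka Connect worker settings.

```properties
# connect-distributed.properties (sketch; all values are placeholders)
bootstrap.servers=127.0.0.1:9092

# Workers that share the same group.id form one Connect cluster
group.id=connect-cluster

# Directory that contains the Doris connector JAR (hypothetical path)
plugin.path=/opt/kafka/plugins

key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false

# Internal topics that Kafka Connect creates on first startup
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status

# 1 only works for a single-broker test setup; use 3 or more in production
config.storage.replication.factor=1
offset.storage.replication.factor=1
status.storage.replication.factor=1
```

In this mode connectors are not passed as files on the command line. They are submitted as JSON to the worker's REST API (POST /connectors, port 8083 by default), and the same API reports connector and task status.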

5. SSL and secure connections

When Kafka is secured with SSL, the truststore must be configured both for the worker and for the embedded consumer.
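
As a sketch of what "both levels" means in the worker file: the unprefixed settings apply to the worker's own Kafka clients, and the `consumer.`-prefixed copies apply to the consumer that sink connectors use. The truststore path and password below are placeholders, not values from this article.

```properties
# Worker-level SSL settings
security.protocol=SSL
ssl.truststore.location=/var/ssl/private/client.truststore.jks
ssl.truststore.password=changeit

# The same settings again for the embedded consumer used by sink connectors
consumer.security.protocol=SSL
consumer.ssl.truststore.location=/var/ssl/private/client.truststore.jks
consumer.ssl.truststore.password=changeit
```

If the brokers also require client authentication, the matching keystore settings (ssl.keystore.location and related properties) would need to be added at both levels in the same way.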

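A common cause of ingestion slowdowns is a max.poll.interval.ms value that is too small: if loading a batch takes longer than this interval, the consumer is removed from its group and a rebalance follows.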
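
One way to raise the limit is to set it for the embedded consumer in the worker file. The 30-minute value is an illustrative assumption; pick one that comfortably exceeds your slowest load.

```properties
# Worker file: give the sink's embedded consumer more time between polls.
# 1800000 ms = 30 minutes (illustrative); Kafka's default is 300000 ms = 5 minutes.
consumer.max.poll.interval.ms=1800000
```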

6. DLQ: the dead letter queue for unprocessed messages

Use a DLQ so that individual bad records do not stop the whole connector.
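
A DLQ is enabled with Kafka Connect's standard error-handling properties on the sink connector. The sketch below assumes a hypothetical topic name and a single-broker replication factor.

```properties
# Added to the sink connector configuration

# Skip records that fail instead of failing the task
errors.tolerance=all

# Hypothetical DLQ topic name
errors.deadletterqueue.topic.name=doris_sink_dlq

# Attach error context (source topic, partition, offset, exception) as headers
errors.deadletterqueue.context.headers.enable=true

# 1 only works for a single-broker test setup; use 3 in production
errors.deadletterqueue.topic.replication.factor=1

# Optionally write failures to the worker log as well
errors.log.enable=true
```

As a general Kafka Connect caveat, these settings cover failures during conversion and transformation. Whether failures that occur while writing to Doris also reach the DLQ depends on the connector, so test that path before relying on it.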

7. Schema evolution and Debezium

In Debezium CDC scenarios the schema changes often, and not every change is guaranteed to reach Doris automatically.

The usual recommendation is to add the new column to the Doris table first and then restart the connector task.
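
For orientation, a Debezium-oriented sink configuration might include the options sketched below. The option names follow the Doris connector documentation as best I recall it and should be verified against your connector version; the topic and table names are hypothetical.

```properties
# Added to the Doris sink connector configuration (verify names for your version)

# Hypothetical Debezium topic and target Doris table
topics=mysql_debezium.test.test_user
doris.topic2table.map=mysql_debezium.test.test_user:test_user

# Interpret incoming records as Debezium change events
converter.mode=debezium_ingestion

# Propagate upstream deletes to Doris
enable.delete=true

# Assumed values: none or basic
debezium.schema.evolution=basic
```

Even with schema evolution enabled, the manual order stays the safe fallback: alter the Doris table first, then restart the task, for example through the Kafka Connect REST endpoint POST /connectors/{name}/tasks/{id}/restart.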

8. Best practices for production

  • Use distributed mode and monitor task status.
  • Keep business topics and DLQ topics separate.
  • Tune buffer parameters to match your throughput.
  • Write a clear runbook for schema evolution.
  • Continuously monitor lag, error rate, and delivery latency.
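
For the buffer tuning item, the Doris connector exposes flush thresholds along the lines of the sketch below. The option names are given as I understand them from the connector documentation, and the values are starting points assumed for illustration rather than recommendations.

```properties
# Flush thresholds on the Doris sink connector (illustrative values)

# Flush once this many records are buffered
buffer.count.records=10000

# Flush at least this often, in seconds
buffer.flush.time=120

# Flush once the buffer reaches this many bytes (about 5 MB here)
buffer.size.bytes=5000000
```

Larger thresholds generally mean fewer, bigger loads into Doris at the cost of higher delivery latency, so adjust them together with the lag and latency metrics you monitor.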

FAQ

Which mode is right for production?

Distributed mode.

Does it work with SSL-secured Kafka?

Yes. A truststore is required at both the worker level and the embedded consumer level.

What do we do with failed messages?

Enable the DLQ and analyze them separately.

© 2026 DATANOMIX.PRO, exclusive VeloDB partner in Central Asia