友情提示:本文共有 3627 个字,阅读大概需要 8 分钟。
在Java中,结合两个Kafka流并将它们与Avro值结合在一起是一个常见的任务。本文将介绍如何使用Kafka Streams库来实现这一目标。我们将讨论如何将两个流合并为一个新的流,并将结果发布到具有Avro值的主题中。我们还将探讨如何在Java中使用Avro库来处理流的序列化和反序列化。通过本文的指导,读者将了解如何利用Java和Kafka Streams来处理流数据,并将其结合在一起并产生具有Avro值的结果。
我有两个Kafka Streams,它们具有String键和我使用KSQL创建的Avro格式的值.
这是第一个:
DESCRIBE EXTENDED STREAM_1; Type : STREAMKey field : IDUSERTimestamp field : Not set - using <ROWTIME>Key format : STRINGValue format : AVROKafka output topic : STREAM_1 (partitions: 4, replication: 1) Field | Type-------------------------------------------------------- ROWTIME | BIGINT (system) ROWKEY | VARCHAR(STRING) (system) FIRSTNAME | VARCHAR(STRING) LASTNAME | VARCHAR(STRING) IDUSER | VARCHAR(STRING)
第二个:
DESCRIBE EXTENDED STREAM_2;Type : STREAMKey field : IDUSERTimestamp field : Not set - using <ROWTIME>Key format : STRINGValue format : AVROKafka output topic : STREAM_2 (partitions: 4, replication: 1) Field | Type-------------------------------------------------------- ROWTIME | BIGINT (system) ROWKEY | VARCHAR(STRING) (system) USERNAME | VARCHAR(STRING) IDUSER | VARCHAR(STRING) DEVICE | VARCHAR(STRING)
所需的输出应包括IDUSER,LASTNAME,DEVICE和USERNAME.
我想使用Streams API离开(在IDUSER上)加入这些流,并将输出写入kafka主题.
为此,我尝试了以下方法:
public static void main(String[] args) { final Properties streamsConfiguration = new Properties(); streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "kafka-strteams"); streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); streamsConfiguration.put(StreamsConfig.ZOOKEEPER_CONNECT_CONFIG, "localhost:2181"); streamsConfiguration.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081"); streamsConfiguration.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName()); streamsConfiguration.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, GenericAvroSerde.class); streamsConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); final Serde<String> stringSerde = Serdes.String(); final Serde<GenericRecord> genericAvroSerde = new GenericAvroSerde(); boolean isKeySerde = false; genericAvroSerde.configure(Collections.singletonMap(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081"), isKeySerde); KStreamBuilder builder = new KStreamBuilder(); KStream<String, GenericRecord> left = builder.stream("STREAM_1"); KStream<String, GenericRecord> right = builder.stram("STREAM_2"); // Java 8 example, using lambda expressions KStream<String, GenericRecord> joined = left.leftJoin(right, (leftValue, rightValue) -> "left=" leftValue ", right=" rightValue, /* ValueJoiner */ JoinWindows.of(TimeUnit.MINUTES.toMillis(5)), Joined.with( stringSerde, /* key */ genericAvroSerde, /* left value */ genericAvroSerde) /* right value */ ); joined.to(stringSerde, genericAvroSerde, "streams-output-testing"); KafkaStreams streams = new KafkaStreams(builder, streamsConfiguration); streams.cleanUp(); streams.start(); Runtime.getRuntime().addShutdownHook(new Thread(streams::close));}
然而,
KStream<String, GenericRecord> joined = ...
在我的IDE上抛出错误:
incompatible types: inference variable VR has incompatible bounds
当我尝试将String Serde用作键和值时,它可以工作,但数据不能从kafka-console-consumer读取.我要做的是以AVRO格式生成数据,以便能够使用kafka-avro-console-consumer读取它们.
解决方法:
我的第一个猜测是您要从join操作返回一个String,而您的代码期望将GenericRecord作为结果:
KStream<String, GenericRecord> joined = left.leftJoin(right, (leftValue, rightValue) -> "left=" leftValue ", right=" rightValue, ...)
请注意,联接的类型为KStream< String,GenericRecord> ;,即值的类型为GenericRecord,但是联接输出是通过类型为String的“ left =“ leftValue”,right =“ rightValue计算的.
来源:/content-1-519501.html
收集不易,本文《Java中如何使用Avro值将两个Kafka流结合并生成结果》知识如果对你有帮助,请点赞收藏并留下你的评论。