knodalyte

A summary of a technical product evaluation exercise that I did in 2018

Part 1: Near Real-Time Environment Monitoring

We started looking at the Kafka Streaming Platform as we explored options for monitoring our upcoming enterprise management applications. The consulting company implementing the enterprise solution was not providing any monitoring capabilities (and in fact did not implement logging in any consistent way). It fell to us, and to me in particular, to explore possible ways to monitor the environment. After some initial research into how other enterprises were solving these kinds of problems, we decided to evaluate streaming as provided by Kafka with a proof-of-concept monitoring application. Our server team gave me three VMs and sudo access, and I installed and configured the various components of the Kafka platform.

The specific problem we wanted to address was that QA and business people were testing various components of the enterprise environment but encountering performance issues. Testers could not easily determine whether they were seeing a problem with the implementation of their use case, a problem with the server they were connecting to, or a problem with a connected component elsewhere in the environment. Almost all of the systems ran on components of the Oracle enterprise stack and were monitored by Oracle Enterprise Manager (OEM), but OEM is not a tool for end users. OEM can, however, be configured to send SNMP traps to an address, so we decided to leverage SNMP to provide simple but timely component status to the tester community.

The first challenge was getting data from the SNMP traps into a Kafka stream. I had to learn a bit about SNMP, its versions, and how to interpret the data, and I needed to develop a microservice that would listen for incoming SNMP traps, extract the relevant data, and publish events to a stream. A Java SNMP connector on GitHub looked promising, but it required SNMP v2, and OEM only supported v1 and v3. I found a Python library, PySNMP, that let me listen for SNMP traps from OEM and pull some data from them to publish to my snmp-traps stream as JSON.
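
The listener itself ended up being a small amount of glue between PySNMP's notification receiver and a Kafka producer. Here is a minimal sketch of that idea, assuming pysnmp and kafka-python, an SNMP v1 community of "public", and a broker on localhost; the snmp-traps topic name is from above, everything else is illustrative:

    import json
    from kafka import KafkaProducer
    from pysnmp.entity import engine, config
    from pysnmp.carrier.asyncore.dgram import udp
    from pysnmp.entity.rfc3413 import ntfrcv

    producer = KafkaProducer(
        bootstrap_servers='localhost:9092',
        value_serializer=lambda v: json.dumps(v).encode('utf-8'))

    snmpEngine = engine.SnmpEngine()
    # Listen on the standard SNMP trap port (162 requires privileges).
    config.addTransport(snmpEngine, udp.domainName,
                        udp.UdpTransport().openServerMode(('0.0.0.0', 162)))
    config.addV1System(snmpEngine, 'oem-area', 'public')

    def trap_received(snmpEngine, stateReference, contextEngineId,
                      contextName, varBinds, cbCtx):
        # Flatten the trap's variable bindings into a JSON-friendly dict.
        payload = {name.prettyPrint(): val.prettyPrint()
                   for name, val in varBinds}
        producer.send('snmp-traps', payload)

    ntfrcv.NotificationReceiver(snmpEngine, trap_received)
    snmpEngine.transportDispatcher.jobStarted(1)
    snmpEngine.transportDispatcher.runDispatcher()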

Next, I wrote a simple Python service to consume the essentially raw SNMP trap data, extract the handful of values I cared about, and publish the result to an alert stream, again as JSON.
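
A minimal sketch of that service, again with kafka-python; the specific fields extracted here are hypothetical stand-ins for the OEM values the real service pulled out:

    import json
    from kafka import KafkaConsumer, KafkaProducer

    consumer = KafkaConsumer(
        'snmp-traps',
        bootstrap_servers='localhost:9092',
        group_id='alert-extractor',
        value_deserializer=lambda v: json.loads(v.decode('utf-8')))
    producer = KafkaProducer(
        bootstrap_servers='localhost:9092',
        value_serializer=lambda v: json.dumps(v).encode('utf-8'))

    for msg in consumer:
        trap = msg.value
        # Keep only the handful of fields the dashboard needs
        # (field names are made up for illustration).
        alert = {'host': trap.get('host'),
                 'severity': trap.get('severity'),
                 'message': trap.get('message')}
        producer.send('alerts', alert)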

The final piece was the most challenging: presenting an up-to-the-minute summary of environment status in a web page. I was dealing with 11 tiers, 4 main server types, and over 50 individual servers for which to display the latest status and trend. The display needed to update in real time with no user interaction, primarily for a kiosk monitor. I knew I wanted to use the WebSocket protocol, but I didn't know how; I couldn't use Server-sent Events (SSE) because I also needed to support IE on user desktops.

I found a very interesting Python networking platform designed for distributed messaging applications, Crossbar.io. Crossbar supports the WebSocket protocol, includes asynchronous servers, and ships numerous sample applications, including a demo network event monitoring daemon written in Python on the server side with React on the client side. While the example app was its authors' first Crossbar.io app and their first React.js app (and mine as well), it provided enough scaffolding for me to adapt to my needs. I implemented a Kafka consumer component in the server app to listen for events from the alert stream. My primary need could most quickly be met by maintaining state and generating the core UI on the server side, so I used React in only the most rudimentary way. The application built and persisted a matrix of server status and history; each time an alert event arrived, the matrix was updated, and an updated HTML grid component was pushed through the WebSocket connection to each connected browser client, where React handled the page updates.
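
The shape of the server component was a WAMP session that drains the alert stream and pushes each event out to connected browsers. A rough sketch of that shape, assuming autobahn, Twisted, and kafka-python, with a Crossbar router at ws://localhost:8080/ws; the realm and topic URIs are made up, and the real app also maintained the status matrix and rendered the grid server-side, where this sketch just forwards raw alerts:

    import json
    from kafka import KafkaConsumer
    from twisted.internet.task import LoopingCall
    from autobahn.twisted.wamp import ApplicationSession, ApplicationRunner

    class AlertRelay(ApplicationSession):
        def onJoin(self, details):
            self.consumer = KafkaConsumer(
                'alerts',
                bootstrap_servers='localhost:9092',
                value_deserializer=lambda v: json.loads(v.decode('utf-8')))
            # Poll Kafka from the Twisted reactor loop once a second.
            LoopingCall(self.drain).start(1.0)

        def drain(self):
            records = self.consumer.poll(timeout_ms=0)
            for tp, messages in records.items():
                for msg in messages:
                    # Broadcast each alert to every connected browser client.
                    self.publish('com.monitor.alert', msg.value)

    if __name__ == '__main__':
        ApplicationRunner('ws://localhost:8080/ws', 'monitor').run(AlertRelay)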

The overall application performed well and was robust. It did have at least two issues: history persisted and was reported for servers that had been retired (deleting the persisted data structure was the manual workaround), and connecting a new browser client did not trigger a full refresh of the status grid; only a new SNMP event accomplished that.

Part 2: ETL

Subsequent to our monitoring POC, my company shifted direction and walked away from the never-completed enterprise application stack that a rather well-known consulting company had been developing for us. At that point, we realized that "Plan B" required an application integration platform. Kafka seemed like a good candidate (along with others), and we started a second phase of evaluation. For this work, our server team built and configured a small cluster of Kafka platform servers using Chef. Our first target was testing data conversion and integration activities that required some data transformations.

It seemed to me that what we were trying to do matched some of the use cases for the emerging KSQL component of the Kafka platform (at the time, it was in beta). I was able to configure a Kafka Connect instance to read data from a database and publish it to a stream in Avro format, which gave me a basis for experimenting with the functions KSQL supported. However, when creating a new, transformed stream from an existing stream, I was unable to get more than one transformation function working at a time. I then tried the Kafka Streams Java library and attempted some stream joins. This seemed to work, but I sometimes ended up with duplicate data in the output streams. Later I discovered that the method I was using to truncate my input streams between tests did not actually purge the data; duplicate input, not the joins, was producing the duplicates in the output.

After a couple of disappointing experiments, I reverted to the tried-and-true approach of writing Python programs to transform one stream into another. This worked just fine and was easy to implement and fast to execute, but unlike KSQL (for our business analysts) or Kafka Streams (for our Java developers), it was not an approach the rest of the organization could easily adopt.
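
A minimal sketch of one of those transform programs: read a source stream, reshape each record, write to a target stream. Topic and field names here are illustrative, not from the actual conversion jobs:

    import json
    from kafka import KafkaConsumer, KafkaProducer

    consumer = KafkaConsumer(
        'source-stream',
        bootstrap_servers='localhost:9092',
        group_id='transformer',
        value_deserializer=lambda v: json.loads(v.decode('utf-8')))
    producer = KafkaProducer(
        bootstrap_servers='localhost:9092',
        value_serializer=lambda v: json.dumps(v).encode('utf-8'))

    def transform(record):
        # Chain several transformations in one pass -- the kind of
        # combination I couldn't get working in the KSQL beta.
        return {'customer_id': record['CUSTID'].strip(),
                'name': record['CUSTNAME'].title(),
                'balance': round(float(record['BAL']), 2)}

    for msg in consumer:
        producer.send('target-stream', transform(msg.value))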

Part 3: Speed Test

As the final part of our integration platform evaluation, we wanted to compare performance when running data integration or conversion processing. We're not talking huge volumes here; the worst-case conversion scenario was under 60 million records. For our speed test comparison, we essentially copied 1.7 million rows from a DB2 table on our AS/400 into a table in Oracle. No transformations were performed, though the column names differed slightly in DB2 (due to its limits on column-name length). We compared Kafka, Oracle Data Integrator (ODI), MuleSoft, and Microsoft's SSIS tool. For Kafka, I configured source and sink JDBC connectors for Kafka Connect and deployed one of each to servers outside our Kafka cluster. Each product's test was run separately so there would be no contention.
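
Deploying a connector amounts to POSTing a config to the Kafka Connect REST API. A rough sketch for the source side; the connector class and property names are real Confluent JDBC connector settings, but the hosts, credentials, and table names are placeholders:

    import json
    import requests

    source = {
        'name': 'db2-speedtest-source',
        'config': {
            'connector.class': 'io.confluent.connect.jdbc.JdbcSourceConnector',
            'connection.url': 'jdbc:as400://as400-host',  # JTOpen DB2 driver
            'connection.user': 'etl_user',
            'connection.password': 'secret',
            'mode': 'bulk',               # one-shot copy, no incrementing key
            'table.whitelist': 'TESTTBL',
            'topic.prefix': 'speedtest-',
            'tasks.max': '1',
        },
    }

    resp = requests.post('http://connect-host:8083/connectors',
                         headers={'Content-Type': 'application/json'},
                         data=json.dumps(source))
    resp.raise_for_status()

The sink side would be registered the same way, with connector class io.confluent.connect.jdbc.JdbcSinkConnector pointed at the topic the source produces.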

As I expected, Kafka was the fastest, transferring the data in 6 minutes. ODI and Mule each clocked in at about 7 minutes, and SSIS came in at 10 minutes. In a subsequent run, I deployed 3 Connect sink workers, each on a separate server, and after I started the Connect source worker, the transfer completed in about 4 minutes.

Conclusion

Despite Kafka's advantages in performance, versatility, scalability, and price (we were using the open-source Confluent package), management decided that MuleSoft was the best solution for our organization.

However, I had a lot of fun learning about and experimenting with Kafka, and I'm sorry we will not be using it in the future. I think it is an excellent and exciting way of implementing systems.

@knodalyte@mastodon.social

A short short story I wrote in 2018

It was widely assumed that somewhere, people were going through the training while using Smart Drugs. But I didn't know how to connect with that alternate reality, and likely would have been too frugal to pay the premium.

I was both eager for and worried about the training. People said such disparate things: some said it was subtle, some profound. Some said it was both. The word weird came up a lot, too. Maybe that's what made me a bit concerned.

I wasn't sure whether to take the training standing or sitting. I aspired to standing or even walking – but they suggested sitting to make the training “smoother”. Would that be smoother for me, or for my training partner? I guess we are one, ultimately.

The chair was super comfortable. It took a few minutes for the chair to map itself to my sitting preferences. I suppose I could have uploaded my experience stream, but I wasn't sure I had as many datapoints as the AI could use.

After I was comfortable, I picked up the data mittens. They were light, like empty piñatas, but the powercomm cable added a slight heft. I pulled them on and fiddled with them until they felt comfortable and cozy where they wrapped my forearms.

It seemed like the mittens, or maybe the chair, sensed when I was ready to start. The only way I could describe how they felt was odd. Personally, I wasn't ready to apply the weird yet.

Apparently what I had assumed was powercomm also had some tubing that inflated the mittens until I had soft, quilted balloons on my arms. Although it felt like there was only air inside the mittens, somehow the air had a texture or squeakiness to it. Depending on how I moved my hands, I felt surfaces from sandpaper to pudding to warm, oily ball bearings.

Odd little vibratory patterns played a counterpoint to the tactile smorgasbord.

The sound level picked up, but it felt like soft aural confetti, a ticker tape of sound bits. Maybe they fell out of the piñatas.

There was a lot going on, but much of it was happening – or felt like it was happening – inside the mittens. I gave up trying to figure out how all these textures were flowing and colliding seemingly at the same time. I was still pretty sure it was illusion; the mittens had no more mass than they had before.

A voice that seemed to come from all around me calmly asked me to confirm that I would like to use the default speech, then display, for my task input. I concurred, and then gave myself over to a stream of letters and then words, to which I moved my hands – mostly my fingers – in what seemed to be random patterns.

There was a lot of repetition of letters at first. As time went on, new characters were introduced. Eventually it seemed that we had had had enough. The display in front of me started showing words. At first, and for a long time, words repeated, although some patterns were evident.

I entered a fugue state in which I was responding to the words and phrases before me, but without conscious thought or volition. Eventually, a cheesy "Tah-Dah" snippet sounded, like from an ancient version of Windows. I rose from my chair and went off to sleep, fixing those subconscious muscle memories in my body.

The next day my keyfabric was delivered. It came rolled up in a slender tube, with instructions that led me to unroll it and slap it to make it rigid. It had a slight texture, like seersucker, and was the same generic pale dirt color that I remember seeing on the keycaps of my father's aged ergonomic keyboard.

I sat down, laid my fingers on the surface, and started typing this story.

@knodalyte@mastodon.social