Stepping into Event Streaming with Microsoft Fabric

(2025-July-06) Real-time, high-speed streams of events very often come from IoT devices, social media logs, website user interactions, and financial transactions. Working with streaming datasets can place you in a different category of data engineering professionals: you understand the difference between Lambda and Kappa architectures, Kafka means more to you than a famous novelist, and topics are not only found in human conversations.

In my post last week (https://datanrg.blogspot.com/2025/06/salesforce-cdc-data-integration.html), I talked about Salesforce Change Data Capture (CDC) event data streaming, where the initial event destination was file storage in Azure. But what if we anticipate a higher volume of incoming Salesforce source data or the addition of a new data feed? This could create the need for an alternative method of managing incoming events.

Image by Florian Kurz from Pixabay

Microsoft Fabric provides an easy way to incorporate streaming data flows into your organizational data ecosystem with the help of the Eventstream component. It allows you to visually create a streaming data flow by defining source data connections and destinations, along with data transformations if required by your business logic.

I don’t want to get into the territory of Microsoft’s clear and well-structured documentation on Eventstream in Fabric, so I’ll let you explore and learn more about it directly - https://learn.microsoft.com/en-us/fabric/real-time-intelligence/event-streams/overview?tabs=enhancedcapabilities

However, there are a few things I want to point out to help you decide whether to try Eventstream for your new or existing data streaming scenarios.

1) Existing Change Data Capture (CDC) source data connectors

If you’re familiar with the concept of CDC, you’ll be pleasantly surprised by the number of built-in CDC connectors that Fabric Eventstream offers for easy integration. In fact, if there had been a Salesforce CDC connector in Fabric Eventstream, I wouldn’t have had to build the Python-based middleware within an Azure Function App that I described in my previous blog post.


2) Custom endpoint backed by Event Hub

As soon as a new Eventstream is created, an Event Hub is provisioned to support a custom endpoint connector within your streaming flow. While you can also access and connect to other external Event Hubs in Azure, among many other options (25 in total as I’m writing this post), the custom endpoint acts like your own invisible event hub repository, with all the bells and whistles of a regular Azure Event Hub. It has its own adjustable retention period, as well as the option to use the Apache Kafka endpoint on the Eventstream item. This enables you to connect and consume streaming events through the Kafka protocol without having to set up a Kafka cluster yourself.

For me, this is the biggest addition that makes Eventstream a very appealing offering for my new data engineering workloads!

Eventstream Retention Settings

Each Eventstream can have its own retention policy, allowing you to retain incoming data for a period between 1 and 90 days (1 day is the current default setting). Just click Settings at the Eventstream level and then choose Retention to specify your desired retention period.


Apache Kafka Protocol Support

Support for the Apache Kafka protocol in Eventstream is provided automatically. In addition to the usual Event Hub connectivity via a SAS key connection string, for example:

Endpoint=sb://{cdc_fabric_eh_namespace}.servicebus.windows.net/;SharedAccessKeyName={cdc_fabric_eh_keyname};SharedAccessKey={cdc_fabric_eh_key};EntityPath={cdc_fabric_eh_hubname}

you can simply switch to the Kafka tab to get your Eventstream Kafka endpoint information.

The ability to connect to the same event repository via Event Hub, AMQP, or Kafka protocols is similar to using different charging adapters, providing multiple ways to establish a reliable flow, depending on your needs.
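
To illustrate what the Kafka route can look like, here is a minimal sketch that reads from the Eventstream custom endpoint with the confluent-kafka Python client, following the standard Event Hubs Kafka conventions: the bootstrap server is the namespace on port 9093, authentication is SASL PLAIN with the literal username $ConnectionString and the connection string itself as the password, and the topic name matches the EntityPath. The namespace, hub name, and consumer group below are placeholders for my setup.

from confluent_kafka import Consumer

# Placeholder values taken from the Eventstream Kafka endpoint details
conf = {
    "bootstrap.servers": "{cdc_fabric_eh_namespace}.servicebus.windows.net:9093",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "$ConnectionString",
    "sasl.password": "Endpoint=sb://{cdc_fabric_eh_namespace}.servicebus.windows.net/;SharedAccessKeyName={cdc_fabric_eh_keyname};SharedAccessKey={cdc_fabric_eh_key}",
    "group.id": "cdc-events-reader",
    "auto.offset.reset": "earliest",
}

consumer = Consumer(conf)
consumer.subscribe(["{cdc_fabric_eh_hubname}"])  # topic name = EntityPath of the custom endpoint

try:
    while True:
        msg = consumer.poll(5.0)
        if msg is None:
            continue
        if msg.error():
            print("Consumer error:", msg.error())
            continue
        print(msg.value().decode("utf-8"))  # each message is a JSON-serialized CDC event
finally:
    consumer.close()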

3) Streaming Data Insights

There is a saying about three things people like to watch: how a river flows, how a fire burns, and how other people count their money. Putting aside the humour in this observation, the idea of an ongoing stream, like an event flow, naturally sparks curiosity to watch it visually. Sometimes this is simply to validate the incoming data or to confirm that any subsequent transformations are processing and forwarding the data correctly.

Data Insights In


Data Insights Out
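
Beyond the built-in insights preview, the same incoming stream can also be spot-checked from code. Here is a minimal receiver sketch, assuming the azure-eventhub package and the SAS key connection string shown earlier (the connection string value is a placeholder):

from azure.eventhub import EventHubConsumerClient

# Placeholder: the SAS connection string of the custom endpoint, including EntityPath
connection_str = "Endpoint=sb://{cdc_fabric_eh_namespace}.servicebus.windows.net/;SharedAccessKeyName={cdc_fabric_eh_keyname};SharedAccessKey={cdc_fabric_eh_key};EntityPath={cdc_fabric_eh_hubname}"

consumer = EventHubConsumerClient.from_connection_string(
    connection_str,
    consumer_group="$Default",
)

def on_event(partition_context, event):
    # Print each incoming event for a quick visual check
    print(partition_context.partition_id, event.body_as_str())

with consumer:
    # starting_position="-1" replays the stream from the beginning of the retention window
    consumer.receive(on_event=on_event, starting_position="-1")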



Salesforce CDC Events Streaming to Fabric

While testing my original Python Function App to consume, process, and stage incoming Salesforce CDC events in Azure file storage, it wasn’t difficult to add an alternative destination for those events and feed them into my Eventstream in Fabric. I created a new Eventstream in my Fabric workspace, collected the Event Hub connection details, stored them in Azure Key Vault, and referenced them in my Python code. Microsoft provides, and continually updates, a Python library for Azure Event Hubs, which makes integration straightforward - https://pypi.org/project/azure-eventhub/.
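
For reference, here is roughly how the connection string can be resolved from Key Vault at runtime using the azure-identity and azure-keyvault-secrets packages; the vault URL and secret name below are placeholders for my setup.

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# Placeholder vault URL and secret name
secret_client = SecretClient(
    vault_url="https://my-key-vault.vault.azure.net",
    credential=DefaultAzureCredential(),
)

# Event Hub connection string for the Eventstream custom endpoint
EH_CONNECTION_STR = secret_client.get_secret("cdc-fabric-eh-connection-string").value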

This modular approach allowed me to introduce a new function that I included in my CDC event message processing pipeline, passing each message as a payload. This Python code safely publishes a list of events to Azure Event Hub by creating a producer client, batching serialized JSON events, and handling size limits. After processing all events, it ensures the producer connection is closed.

import json
import logging
from typing import Any, Dict, List

from azure.eventhub import EventHubProducerClient, EventData

logger = logging.getLogger(__name__)

# EH_CONNECTION_STR is the Event Hub connection string for the Eventstream custom
# endpoint, resolved at startup (in my setup, from Azure Key Vault as sketched above).

def publish_events(events: List[Dict[str, Any]]) -> None:
    """Safely publish events to Azure Event Hub without duplication."""
    if not events:
        logger.info("No events to publish")
        return

    producer = EventHubProducerClient.from_connection_string(EH_CONNECTION_STR)

    try:
        event_batch = producer.create_batch()

        for event_body in events:
            try:
                serialized = json.dumps(event_body)
                payload_size = len(serialized.encode("utf-8"))
                logger.info("Event payload size: %d bytes", payload_size)

                event_data = EventData(body=serialized)

                try:
                    event_batch.add(event_data)
                except ValueError:
                    # Batch is full, send it
                    logger.info("Sending full batch with %d events", len(event_batch))
                    producer.send_batch(event_batch)

                    # Try adding to new batch
                    event_batch = producer.create_batch()
                    try:
                        event_batch.add(event_data)
                    except ValueError:
                        logger.critical(
                            "Event too large (%d bytes) for empty batch. Skipping.",
                            payload_size,
                        )
                        continue  # Skip this event

            except Exception as e:
                logger.error("Error processing event: %s", str(e))
                continue

        # Send any remaining events
        if len(event_batch) > 0:
            logger.info("Sending final batch with %d events", len(event_batch))
            producer.send_batch(event_batch)

    except Exception as e:
        logger.error("Publishing failed: %s", str(e))
        raise

    finally:
        producer.close()
        logger.info("Producer connection closed.")

Sometimes, when you complete an interaction with a service provider, whether it’s your local bank or an insurance company, you’re given the option to fill out a survey or provide feedback on the service you received. Very often, this feedback includes a question about your willingness to recommend their service to others.

In the same way, I can imagine giving feedback to myself: Will I be using Eventstreams in the future? Maybe. Would I recommend Eventstream to others? Probably. The only major constraint here is the requirement to have an existing Fabric workspace; otherwise, a similar streaming architecture could be implemented using a combination of connected components. The thing I liked most about Eventstream in Fabric is the ability to create custom endpoints, where you can have built-in Event Hubs in your stream flow that also support the Kafka protocol if needed.

Please also check out Microsoft’s helpful set of tutorials for designing solutions with Eventstream in Fabric. Good technology is nothing without a good set of use cases, and when you can learn from others, it's gold!

https://learn.microsoft.com/en-us/fabric/real-time-intelligence/event-streams/stream-events-from-power-automate
