Today, we’re excited to announce the general availability of Data Portal on Confluent Cloud. Data Portal is built on top of Stream Governance, the industry’s only fully managed data governance suite for Apache Kafka® and data streaming. The developer-friendly, self-service UI provides an easy and curated way to find, understand, and enrich all of your data streams, enabling users across your organization to build and launch streaming applications faster.
Building streaming applications with open source Kafka can be slow and inefficient when there's a lack of visibility into what data exists, where it comes from, and who can grant access. Data Portal leverages the capabilities of Stream Catalog and Stream Lineage to empower data users to interact with their organization’s Kafka data streams safely, efficiently, and collaboratively.
With Data Portal, you can:
Search and discover existing topics across the organization with the help of topic metadata and get a drill-down view of the data they hold.
Seamlessly and securely request access to topics through an approval workflow that connects the data user with the data owner, who can approve the request.
Set up clients and query data with Apache Flink® to enrich your topics and build new streaming applications and pipelines.
Ready to get started? If you already use Confluent Cloud, you can access Data Portal simply by logging in to your account. You must have a Stream Governance package enabled for the cloud environments you want displayed in Data Portal. Check out the quick start guide to see how you can get Stream Governance up and running in just a few clicks.
If you’re not yet using Confluent Cloud, you can try Data Portal for free by creating an account and setting up your first cluster.
To learn more, join us for the Stream Governance webinar with a technical demo that showcases the full capabilities of Data Portal and Stream Governance on Confluent Cloud.
Let’s dive into how to get up and running with Data Portal.
When you log in to Confluent Cloud, the Data Portal tab presents a unified view of all available Kafka topics by environment. You can search for topics by name or tag, or browse topics by tag, creation date, and modified date.
Each topic card summarizes the topic with the name, data location (environment, cluster, cloud provider, and region), description, tags, and when it was created or modified.
However, this comprehensive view may be overwhelming if you’re sifting through hundreds of topics. Data Portal’s search feature allows you to home in on the most relevant data for your use case or project. Search for topics by name or tag, or filter by tags, business metadata, cloud provider, region, and other metadata.
Once you’ve identified a topic that piques your interest, click on the card to learn more about it. Clicking on the card reveals a side panel with additional metadata. In the top section of this panel, you’ll see the location of the topic, its tags, and a description of the data it stores. Below, you can view the schema and its fields to understand the structure of the data stored in the topic, without viewing the actual data. You’ll also find a link to the lineage of the topic, information about the owner of the topic, business metadata appended to the topic, and finally, the technical metadata of the topic (created date, retention period, etc.).
When you click on the Stream Lineage section, you get a complete, end-to-end data flow visualization of the upstream and downstream components from the topic.
So you’ve identified the topic(s) you want to access. Now what? Instead of spending valuable cycles pinging various colleagues to find out who can grant access to the data you need, you can use the Data Portal to request access to the topic directly from the Confluent Cloud UI.
Clicking Request access on the topic side panel triggers an approval workflow that connects the user with the data owner via email (if a topic owner email was set on the topic metadata). Select the permissions you require and (optionally) leave a message for the approver describing your request.
Once the request is submitted, the topic owner will receive an email to review the request.
When the topic owner clicks Review request in the email, they are redirected to the Confluent Cloud Access requests UI.
This simple workflow provides important self-service capabilities for both data owners and data seekers. Streamlining access request management eliminates the need for administrators to manually manage permission assignments, while at the same time ensuring that security and access controls remain in place – a welcome sight as your users and topics scale to the tens and hundreds (and beyond)!
Let’s assume the data owner granted you access to the topic – now when you click on the topic card, you can view the last message produced and set up a client or query the topic with Flink SQL directly from the UI.
Clicking Set up client will redirect you to the Clients page in the UI, which will walk you through setting up an application in your programming language of choice.
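To give a sense of what that setup involves, here is a minimal Python sketch of the connection properties a Kafka client needs to reach a Confluent Cloud cluster. The property names follow librdkafka conventions; the bootstrap server and credential values are placeholders, not real values:

```python
# Minimal sketch of a Confluent Cloud client configuration.
# Property names follow librdkafka conventions; all values
# below are placeholders you would replace with your own.
def build_client_config(bootstrap_servers, api_key, api_secret):
    """Assemble the connection properties a Kafka client uses
    to reach a Confluent Cloud cluster over SASL_SSL."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        "sasl.username": api_key,     # Confluent Cloud API key
        "sasl.password": api_secret,  # Confluent Cloud API secret
    }

config = build_client_config(
    "pkc-xxxxx.us-east-1.aws.confluent.cloud:9092",  # placeholder endpoint
    "MY_API_KEY",
    "MY_API_SECRET",
)
```

The Clients page generates an equivalent configuration for you, along with boilerplate producer and consumer code in the language you select.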
Clicking Query takes you to the new Flink SQL workspace with a topic query ready to run.
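As a sketch of the kind of statement you might run in that workspace, here is a minimal Flink SQL query; the `orders` topic and its `order_id` and `amount` fields are hypothetical, for illustration only:

```sql
-- Hypothetical topic and fields, for illustration only:
-- filter the orders stream down to high-value orders.
SELECT order_id, amount
FROM `orders`
WHERE amount > 100;
```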
Check out the latest blog post on our serverless Flink service to learn more about how you can effortlessly filter, join, and enrich your Kafka data streams in-flight.
We often hear how critical Stream Governance is to our customers’ data in motion journey, and with Data Portal, we’re excited to bring an enhanced user experience to the product. We look forward to expanding our Stream Governance suite further and adding exciting new features in quarters to come.
Check out Data Portal in Confluent Cloud today. If you haven’t already, sign up for a free trial of Confluent Cloud and create your first cluster to explore new topics and create streaming pipelines and applications.
Interested in learning more? Be sure to register for the upcoming Stream Governance webinar, where we’ll share a technical demo that showcases the full capabilities of Data Portal and Stream Governance on Confluent Cloud.