Pulsar IO overview

Connecting the world using Pulsar Functions

Messaging systems are most powerful when you can easily use them with external systems such as databases and other messaging systems. Pulsar IO is a Pulsar feature that lets you create, deploy, and manage connectors that interact with external systems such as Apache Cassandra and Aerospike, among others.

Pulsar IO and Pulsar Functions

Under the hood, Pulsar IO connectors are specialized Pulsar Functions purpose-built to interface with external systems. The administrative interface for Pulsar IO is, in fact, quite similar to that of Pulsar Functions.

Sources and sinks

Pulsar IO connectors come in two types:

  • Sources feed data into Pulsar from other systems. Common sources include other messaging systems and “firehose”-style data pipeline APIs.
  • Sinks receive data from Pulsar and deliver it to other systems. Common sinks include other messaging systems and SQL and NoSQL databases.
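
The direction of data flow for each connector type can be sketched in a few lines of Java. This is only an illustration under stated assumptions: SimpleSource, SimpleSink, and InMemoryTopic are hypothetical stand-ins invented for this example and are not the Pulsar IO API, and an in-memory queue plays the role of a Pulsar topic.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class ConnectorSketch {

    /** Hypothetical stand-in for a source: reads records from an external system. */
    interface SimpleSource<T> {
        T read(); // returns null once the external system has no more data
    }

    /** Hypothetical stand-in for a sink: writes records to an external system. */
    interface SimpleSink<T> {
        void write(T record);
    }

    /** Hypothetical in-memory stand-in for a Pulsar topic. */
    static final class InMemoryTopic<T> {
        private final Queue<T> messages = new ArrayDeque<>();

        void publish(T message) {
            messages.add(message);
        }

        T poll() {
            return messages.poll();
        }
    }

    /** Source direction: external system -> topic. */
    static <T> void runSource(SimpleSource<T> source, InMemoryTopic<T> topic) {
        T record;
        while ((record = source.read()) != null) {
            topic.publish(record);
        }
    }

    /** Sink direction: topic -> external system. */
    static <T> void runSink(InMemoryTopic<T> topic, SimpleSink<T> sink) {
        T message;
        while ((message = topic.poll()) != null) {
            sink.write(message);
        }
    }

    /** Moves every input record through a source, the topic, and a sink. */
    public static List<String> pipe(List<String> externalInput) {
        Queue<String> upstream = new ArrayDeque<>(externalInput); // pretend upstream system
        List<String> downstream = new ArrayList<>();              // pretend downstream system
        InMemoryTopic<String> topic = new InMemoryTopic<>();

        runSource(upstream::poll, topic);
        runSink(topic, downstream::add);
        return downstream;
    }

    public static void main(String[] args) {
        System.out.println(pipe(List.of("a", "b", "c")));
    }
}
```

In this sketch the source and the sink never talk to each other directly; each only touches the topic on one side and its external system on the other, which is the relationship the two bullet points describe.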

This diagram illustrates the relationship between sources, sinks, and Pulsar:

Figure: Pulsar IO connectors (sources and sinks)

Working with connectors

Pulsar IO connectors can be managed via the pulsar-admin CLI tool, in particular the source and sink commands.

For a guide to managing connectors in your Pulsar installation, see Getting started with Pulsar IO.

The following connectors are currently available for Pulsar:

Name                      Java class
Aerospike sink            org.apache.pulsar.io.aerospike.AerospikeStringSink
Cassandra sink            org.apache.pulsar.io.cassandra.CassandraStringSink
Kafka source              org.apache.pulsar.io.kafka.KafkaStringSource
Kafka sink                org.apache.pulsar.io.kafka.KafkaStringSink
RabbitMQ source           org.apache.pulsar.io.rabbitmq.RabbitMQSource
Twitter Firehose source   org.apache.pulsar.io.twitter.TwitterFireHose