The StreamSets Data Collector engine is a key, integrated component of the StreamSets Platform. The StreamSets Data Integration Platform, available as SaaS, is where you develop, manage, and monitor your data pipelines. StreamSets Data Collector is one of the pipeline execution engines that the platform manages, and it is the component that actually runs the pipeline.
To get started, sign up for and log in to the StreamSets Platform (it’s free!). From there, you can set up the Data Collector engine in a variety of ways, depending on your use case and environment. StreamSets makes it easy to deploy the Data Collector engine wherever you need it (on-premises, in a VPC, or in the public cloud) to support the breadth of pipeline patterns across hybrid and multi-cloud environments.
After your engine deployment is set up, you can start building your smart data pipeline in the StreamSets Platform. You can then execute your pipeline using your new StreamSets Data Collector engine.
Set up a deployment
Build a data pipeline
Run a job
Monitor a job
The StreamSets Data Collector engine is a powerful execution engine used to ingest and process data in batch, streaming, or change data capture (CDC) pipelines, and is a core part of the StreamSets Platform.
You can install and deploy Data Collector engines anywhere you need to ingest data, on-premises or in the cloud. You can download and install the Data Collector engine yourself, or let the StreamSets Platform auto-deploy it in AWS, Azure, or GCP. Either way, the Data Collector engine is managed from the StreamSets Platform. Use the pre-built connectors and native integrations to configure your smart data pipeline without coding. With smart data pipelines, you can spend more time building new data pipelines and less time rewriting and fixing old ones.