The StreamSets Transformer download is available only to existing users. If you are new to StreamSets, we encourage you to try our cloud-native platform for free.
Note: Replace <VERSION> with the current version and <SPARK_PATH> with the full path to Apache Spark. If a command above falls at the end of a sentence, the trailing period (.) is not part of the command.
Build Your First Data Pipeline
Build Your First Apache Spark Pipeline
Clickstream Analysis on AWS
Debug Spark ETL Pipelines
StreamSets Transformer Engine is an execution engine that runs data processing pipelines on Apache Spark. Spark ETL data pipelines can perform transformations that require heavy processing on the entire data set, in either batch or streaming mode. You can install Transformer Engine in any environment that runs Apache Spark.
Whether your data flows are on-premises, cloud-to-cloud, or on-premises-to-cloud, you can use the pre-built connectors and native integrations to configure your Spark ETL pipeline without hand coding.