Nexus Quickstart Guide

Nexus is a user-friendly, high-performance, distributed data integration platform designed for real-time synchronization of massive datasets. It can handle tens of billions of records reliably and efficiently.

Why We Need Nexus?

Nexus is centered on data integration and synchronization, aimed at addressing the typical challenges faced in the data integration field:

  • Diverse data sources: There are hundreds of widely used data sources, often with mutually incompatible versions, and new technologies keep adding more. Finding a tool that supports all of them fully and quickly is a challenge for users.

  • Complex synchronization scenarios: Data synchronization must accommodate various scenarios, including offline-full synchronization, offline-incremental synchronization, Change Data Capture (CDC), real-time synchronization, and full database synchronization.

  • High resource requirements: Existing data integration and synchronization tools often need significant computing or JDBC connection resources to synchronize large numbers of small tables in real time, which places a heavy load on enterprises.

  • Lack of quality assurance and monitoring: Issues such as data loss or duplication frequently occur during integration and synchronization processes. Moreover, these processes lack proper monitoring, making it difficult to grasp the true state of data during execution.

  • Complex technology stack: Enterprises rely on diverse technological components, and users need to develop unique synchronization programs for different systems to achieve data integration.

Features of Nexus

  • Extensive and expandable Connectors: Nexus offers a Connector API whose connectors act as a Source, Transform, or Sink.

  • Connector plugins: The plugin architecture allows users to easily create custom Connectors and integrate them into the Nexus project. Nexus currently supports over 100 Connectors, with more being added regularly. Here’s the list of currently supported connectors.

  • Batch-stream integration: Connectors built using the Nexus Connector API seamlessly handle offline synchronization, real-time synchronization, full synchronization, incremental synchronization, and other scenarios, making the management of data integration tasks much easier.

  • Data consistency with distributed snapshots: Nexus supports a distributed snapshot algorithm, ensuring consistent data across processes.

  • JDBC multiplexing and multi-table parsing for logs: Nexus addresses JDBC connection issues by supporting multi-table and whole-database synchronization. It also supports reading and parsing logs across multiple tables, which is crucial for multi-table CDC synchronization because it avoids reading and parsing the same log repeatedly.

  • High throughput and low latency: Nexus features parallel reading and writing, offering reliable, stable synchronization with high throughput and low latency.

  • Comprehensive real-time monitoring: Nexus provides detailed monitoring for every step of the synchronization process, enabling users to easily track metrics such as data count, data size, and queries-per-second (QPS) rates for reading and writing tasks.
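
To make the monitoring metrics above concrete, here is a minimal sketch of the three counters the last bullet names: data count, data size, and a per-second rate. The class `SyncMetrics` and its methods are hypothetical names for illustration, not part of any Nexus API, and the size estimate (length of the row's text representation) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SyncMetrics:
    """Hypothetical per-task counters; not a Nexus API."""

    rows: int = 0        # data count
    size_bytes: int = 0  # data size (rough estimate, see record())

    def record(self, row: dict) -> None:
        # Count the row and add an approximate size based on its text form.
        self.rows += 1
        self.size_bytes += len(repr(row).encode("utf-8"))

    def rate_per_second(self, elapsed_seconds: float) -> float:
        # Rows handled per second; stands in for the QPS figure.
        if elapsed_seconds <= 0:
            return 0.0
        return self.rows / elapsed_seconds


if __name__ == "__main__":
    metrics = SyncMetrics()
    for row in ({"id": 1}, {"id": 2}):
        metrics.record(row)
    print(metrics.rows, metrics.size_bytes, metrics.rate_per_second(4.0))
```

A real deployment would track separate counters for reading and writing tasks; one instance per task would be a simple way to model that.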

The Source Connector in Nexus reads data in parallel and sends it to the downstream Transform or directly to the Sink, which writes the data to its destination. The Source, Transform, and Sink components can all be developed and extended according to your needs.
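
The flow described above can be sketched in a few lines of Python. This is an illustration only, assuming the source is divided into independent splits that can be read concurrently; `run_pipeline` and its parameters are hypothetical and do not reflect how Nexus is implemented or configured.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, object]


def run_pipeline(
    splits: List[Iterable[Row]],
    sink: Callable[[Row], None],
    transform: Optional[Callable[[Row], Row]] = None,
) -> int:
    """Read splits in parallel, optionally transform rows, write to the sink."""

    def read_split(split: Iterable[Row]) -> List[Row]:
        # Each split is read (and transformed, if requested) in its own worker.
        return [transform(row) if transform else row for row in split]

    written = 0
    with ThreadPoolExecutor(max_workers=max(1, len(splits))) as pool:
        # pool.map yields results in the order the splits were given.
        for rows in pool.map(read_split, splits):
            for row in rows:
                sink(row)
                written += 1
    return written


if __name__ == "__main__":
    destination: List[Row] = []
    count = run_pipeline(
        splits=[[{"id": 1}], [{"id": 2}, {"id": 3}]],
        sink=destination.append,
        transform=lambda row: {**row, "seen": True},
    )
    print(count, destination)
```

Passing `transform=None` models the case where the Source sends data directly to the Sink.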

Nexus operates as an EL(T) data integration platform. Consequently, in Nexus, the Transform component is primarily used for simple data transformations, such as converting a column’s data to uppercase or lowercase, renaming columns, or splitting a column into multiple columns.
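
As a rough illustration of the kind of simple transformations mentioned above, the following sketch applies each one to a row represented as a Python dict. The function names are hypothetical and chosen for this example; they are not the names of Nexus transforms.

```python
from typing import Dict, List

Row = Dict[str, str]


def uppercase(row: Row, column: str) -> Row:
    # Convert one column's value to uppercase, leaving the rest untouched.
    return {**row, column: row[column].upper()}


def rename(row: Row, old: str, new: str) -> Row:
    # Rename a column while keeping its value.
    return {(new if key == old else key): value for key, value in row.items()}


def split_column(row: Row, column: str, separator: str, targets: List[str]) -> Row:
    # Split one column into several; the source column is dropped.
    parts = row[column].split(separator, len(targets) - 1)
    result = {key: value for key, value in row.items() if key != column}
    result.update(zip(targets, parts))
    return result


if __name__ == "__main__":
    row = {"full_name": "ada lovelace", "country": "uk"}
    print(uppercase(row, "country"))
    print(rename(row, "country", "region"))
    print(split_column(row, "full_name", " ", ["first", "last"]))
```

Each function returns a new row rather than modifying its input, which keeps the steps easy to chain.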

Connectors

Source Connectors

Nexus supports reading data from a wide range of sources including relational, graph, NoSQL, document, and in-memory databases; distributed file systems such as HDFS; and various cloud storage solutions like S3 and OSS. Additionally, Nexus can read data from many common SaaS services. For a detailed list of supported sources, refer to the list here. You also have the option to develop your own source connector and seamlessly integrate it into Nexus.

Transform Connector

If there is a schema mismatch between the source and the sink, you can use the Transform Connector to modify the schema read from the source to match the sink schema.
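
One way to picture this schema alignment is a field mapping applied to each row before it reaches the sink. The sketch below assumes rows are dicts and that the sink schema is a list of field names; `align_to_sink` and its arguments are hypothetical, not a Nexus interface.

```python
from typing import Dict, List


def align_to_sink(
    row: Dict[str, object],
    field_map: Dict[str, str],
    sink_fields: List[str],
) -> Dict[str, object]:
    """Rename source fields per field_map and keep only the sink's fields."""
    renamed = {field_map.get(key, key): value for key, value in row.items()}
    missing = [field for field in sink_fields if field not in renamed]
    if missing:
        # Fail loudly instead of writing an incomplete row.
        raise KeyError(f"sink fields missing from source row: {missing}")
    return {field: renamed[field] for field in sink_fields}


if __name__ == "__main__":
    source_row = {"user_id": 7, "nm": "Ada", "tmp": 1}
    print(align_to_sink(source_row, {"nm": "name"}, ["user_id", "name"]))
```

Fields the sink does not declare (here `tmp`) are dropped, and a row that cannot satisfy the sink schema raises an error.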

Sink Connectors

Nexus supports writing data to numerous destinations including relational, graph, NoSQL, document, and in-memory databases; distributed file systems such as HDFS; and various cloud storage solutions like S3 and OSS. Writing data to many common SaaS services is also supported. For a detailed list of supported sinks, refer to the list here. Additionally, you can develop your own sink connector and easily integrate it into Nexus.


Last updated 8 months ago
