Nexus Quickstart Guide
Nexus is an easy-to-use, high-performance, distributed data integration platform designed for real-time synchronization of massive datasets. It can handle tens of billions of records reliably and efficiently.
Why We Need Nexus?
Nexus focuses on data integration and synchronization, and is designed to address the common challenges in this field:
Diverse data sources: There are hundreds of widely used data sources, many with incompatible versions, and new technologies keep adding more. Finding a tool that supports all of these sources fully and quickly is a challenge for users.
Complex synchronization scenarios: Data synchronization must accommodate various scenarios, including offline-full synchronization, offline-incremental synchronization, Change Data Capture (CDC), real-time synchronization, and full database synchronization.
High resource requirements: Existing data integration and synchronization tools often need significant computing or JDBC connection resources to synchronize large numbers of small tables in real time, which places a heavy load on enterprises.
Lack of quality assurance and monitoring: Issues such as data loss or duplication frequently occur during integration and synchronization processes. Moreover, these processes lack proper monitoring, making it difficult to grasp the true state of data during execution.
Complex technology stack: Enterprises rely on diverse technological components, and users need to develop unique synchronization programs for different systems to achieve data integration.
Features of Nexus
Extensive and expandable Connectors: Nexus offers a Connector API for building Source, Transform, and Sink connectors.
Connector plugins: The plugin architecture allows users to easily create custom Connectors and integrate them into the Nexus project. Nexus currently supports over 100 Connectors, with more being added regularly. Here’s the list of currently supported connectors.
Batch-stream integration: Connectors built using the Nexus Connector API seamlessly handle offline synchronization, real-time synchronization, full synchronization, incremental synchronization, and other scenarios, making the management of data integration tasks much easier.
Data consistency with distributed snapshots: Nexus supports a distributed snapshot algorithm, ensuring consistent data across processes.
JDBC multiplexing and multi-table parsing for logs: Nexus addresses JDBC connection issues by supporting multi-table and whole-database synchronization. It also supports the reading and parsing of logs across multiple tables, crucial for multi-table CDC synchronization scenarios, minimizing repeated log readings and parsing.
High throughput and low latency: Nexus features parallel reading and writing, offering reliable, stable synchronization with high throughput and low latency.
Comprehensive real-time monitoring: Nexus provides detailed monitoring for every step of the synchronization process, enabling users to easily track metrics such as record count, data size, and queries per second (QPS) for reading and writing tasks.
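To make the monitoring metrics above concrete, here is a minimal Python sketch of how a record count, data size, and per-second rate could be tracked for one task. The class `SyncMetrics` and its methods are hypothetical names chosen for illustration; they are not part of Nexus, whose actual metrics interface is not described in this guide.

```python
import time
from typing import Callable


class SyncMetrics:
    """Hypothetical per-task counters; not a Nexus API."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so the rate can be checked deterministically.
        self._clock = clock
        self._start = clock()
        self.row_count = 0
        self.byte_count = 0

    def record(self, payload: bytes) -> None:
        """Account for one record that was read or written."""
        self.row_count += 1
        self.byte_count += len(payload)

    def rows_per_second(self) -> float:
        """Average records per second since the task started."""
        elapsed = self._clock() - self._start
        return self.row_count / elapsed if elapsed > 0 else 0.0
```

A reader or writer task would call `record()` once per record and expose `row_count`, `byte_count`, and `rows_per_second()` to whatever dashboard is in use.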
The Source Connector in Nexus is responsible for parallel reading and sending data to the downstream Transform or directly to the Sink, which writes the data to its destination. Notably, the Source, Transform, and Sink components can be easily developed and extended according to your needs.
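The flow described above (parallel read, optional Transform, then Sink) can be sketched in a few lines of Python. This is an illustration of the data flow only, under the assumption that rows are plain dictionaries; `read_split`, `parallel_source`, and `run_pipeline` are hypothetical names and do not reflect the Nexus Connector API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Row = Dict[str, object]  # assumed row shape: column name -> value


def read_split(split: range) -> List[Row]:
    """Hypothetical reader: turns one split (a range of ids) into rows."""
    return [{"id": i, "name": f"user{i}"} for i in split]


def parallel_source(splits: Iterable[range], workers: int = 2) -> Iterator[Row]:
    """Read the splits concurrently and yield rows in split order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for rows in pool.map(read_split, splits):
            yield from rows


def run_pipeline(
    source: Iterable[Row],
    sink: Callable[[Row], None],
    transform: Optional[Callable[[Row], Row]] = None,
) -> None:
    """Send each row through the optional transform, then to the sink."""
    for row in source:
        sink(transform(row) if transform else row)
```

For example, `run_pipeline(parallel_source([range(0, 2), range(2, 4)]), sink=print)` sends rows straight from the source to the sink, while passing a `transform` function inserts the intermediate step.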
Nexus operates as an EL(T) data integration platform. Consequently, in Nexus, the Transform component is primarily used for simple data transformations, such as converting a column’s data to uppercase or lowercase, renaming columns, or splitting a column into multiple columns.
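The three simple transformations mentioned above can be sketched as small Python functions over dictionary rows. The function names and the row representation are assumptions made for this example; in Nexus itself, transforms are provided by Transform connectors rather than by code like this.

```python
from typing import Dict, List

Row = Dict[str, object]  # assumed row shape: column name -> value


def uppercase_column(row: Row, column: str) -> Row:
    """Return a copy of the row with one column's value uppercased."""
    out = dict(row)
    out[column] = str(out[column]).upper()
    return out


def rename_column(row: Row, old: str, new: str) -> Row:
    """Return a copy of the row with column `old` renamed to `new`."""
    return {(new if key == old else key): value for key, value in row.items()}


def split_column(row: Row, column: str, separator: str, new_columns: List[str]) -> Row:
    """Replace one column with several, splitting its value on `separator`."""
    parts: List[object] = list(str(row[column]).split(separator, len(new_columns) - 1))
    # Pad with None when the value has fewer parts than target columns.
    parts += [None] * (len(new_columns) - len(parts))
    out = {key: value for key, value in row.items() if key != column}
    out.update(zip(new_columns, parts))
    return out
```

Each function returns a new row and leaves its input unchanged, so the steps can be chained in any order.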
Connectors
Source Connectors
Nexus supports reading data from a wide range of sources including relational, graph, NoSQL, document, and in-memory databases; distributed file systems such as HDFS; and various cloud storage solutions like S3 and OSS. Additionally, Nexus can read data from many common SaaS services. For a detailed list of supported sources, refer to the list here. You also have the option to develop your own source connector and seamlessly integrate it into Nexus.
Transform Connector
If there is a schema mismatch between the source and the sink, you can use the Transform Connector to modify the schema read from the source to match the sink schema.
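As an illustration of resolving such a mismatch, the Python sketch below maps source column names to sink column names and casts each value to the sink's type. The schema, the mapping, and the function `to_sink_schema` are hypothetical and only show the idea; they are not how a Nexus Transform connector is configured.

```python
from typing import Callable, Dict

# Hypothetical sink schema: column name -> type used to cast the value.
SINK_SCHEMA: Dict[str, Callable[[object], object]] = {"user_id": int, "email": str}

# Hypothetical mapping: source column name -> sink column name.
FIELD_MAPPING: Dict[str, str] = {"id": "user_id", "mail": "email"}


def to_sink_schema(
    row: Dict[str, object],
    mapping: Dict[str, str],
    sink_schema: Dict[str, Callable[[object], object]],
) -> Dict[str, object]:
    """Rename and cast the mapped columns; unmapped source columns are dropped."""
    out: Dict[str, object] = {}
    for source_name, sink_name in mapping.items():
        if sink_name not in sink_schema:
            raise KeyError(f"{sink_name!r} is not a sink column")
        out[sink_name] = sink_schema[sink_name](row[source_name])
    missing = set(sink_schema) - set(out)
    if missing:
        raise ValueError(f"no source column mapped to {sorted(missing)}")
    return out
```

Calling `to_sink_schema({"id": "7", "mail": "a@example.com", "extra": 1}, FIELD_MAPPING, SINK_SCHEMA)` yields a row with an integer `user_id` and a string `email`, and the unmapped `extra` column is discarded.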
Sink Connectors
Nexus supports writing data to numerous destinations including relational, graph, NoSQL, document, and in-memory databases; distributed file systems such as HDFS; and various cloud storage solutions like S3 and OSS. Writing data to many common SaaS services is also supported. For a detailed list of supported sinks, refer to the list here. Additionally, you can develop your own sink connector and easily integrate it into Nexus.