go-etl is a toolset for data extraction, transformation, and loading.
APACHE-2.0 License
English | 简体中文
go-etl is a toolset for extracting, transforming, and loading data sources, providing powerful data synchronization capabilities.
go-etl will provide the following ETL capabilities:
Since my time is limited, everyone is welcome to submit issues to discuss go-etl. Let's make progress together!
This data synchronization tool can synchronize the following data sources.
Type | Data Source | Reader | Writer | Document |
---|---|---|---|---|
Relational Database | MySQL/MariaDB/TiDB | √ | √ | Read, Write |
Relational Database | Postgres/Greenplum | √ | √ | Read, Write |
Relational Database | DB2 LUW | √ | √ | Read, Write |
Relational Database | SQL Server | √ | √ | Read, Write |
Relational Database | Oracle | √ | √ | Read, Write |
Relational Database | SQLite3 | √ | √ | Read, Write |
Unstructured Data Stream | CSV | √ | √ | Read, Write |
Unstructured Data Stream | XLSX (Excel) | √ | √ | Read, Write |
Start data synchronization with the go-etl Data Synchronization User Manual
Refer to the go-etl Data Synchronization Developer Documentation to assist with your development.
make dependencies
make release
To build without the db2 package, run export IGNORE_PACKAGES=db2 before compiling:
export IGNORE_PACKAGES=db2
make dependencies
make release
release.bat
To build without the db2 package, run set IGNORE_PACKAGES=db2 before compiling:
set IGNORE_PACKAGES=db2
release.bat
+---datax---|---plugin---+---reader--mysql---|--README.md
|                        | .......
|                        |
|                        |---writer--mysql---|--README.md
|                        | .......
|
+---bin----datax
+---examples---+---csvpostgres----config.json
|              |---db2------------config.json
|              | .......
|
+---README_USER.md
This package will provide an interface similar to Alibaba's DataX to implement an offline data synchronization framework in the Go programming language. The framework will enable users to perform data synchronization tasks efficiently and reliably, leveraging the power and flexibility of the Go language. It may include features such as pluggable data sources and destinations, support for various data formats, and robust error handling mechanisms.
readerPlugin(reader) -> Framework(Exchanger+Transformer) -> writerPlugin(writer)
The system is built using a Framework + plugin architecture. In this design, the reading and writing of data sources are abstracted into Reader/Writer plugins, which are integrated into the overall synchronization framework.
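The Framework + plugin flow described above can be sketched roughly as follows. This is only an illustration, not go-etl's actual API: the names `Record`, `Reader`, `Writer`, `Transformer`, and `Sync` are all assumptions made for this example, and channels stand in for the Exchanger.

```go
package main

import (
	"fmt"
	"strings"
)

// Record is a hypothetical row passed from a reader to a writer.
type Record []string

// Reader is a hypothetical reader plugin: it emits records from a source.
type Reader interface {
	Read(out chan<- Record) error
}

// Writer is a hypothetical writer plugin: it consumes records into a destination.
type Writer interface {
	Write(in <-chan Record) error
}

// Transformer is a hypothetical per-record transformation step.
type Transformer func(Record) Record

// sliceReader reads records from an in-memory slice.
type sliceReader struct{ rows []Record }

func (r sliceReader) Read(out chan<- Record) error {
	for _, row := range r.rows {
		out <- row
	}
	return nil
}

// memoryWriter collects records in memory.
type memoryWriter struct{ rows []Record }

func (w *memoryWriter) Write(in <-chan Record) error {
	for row := range in {
		w.rows = append(w.rows, row)
	}
	return nil
}

// upper is a sample transformer that upper-cases every field.
func upper(rec Record) Record {
	out := make(Record, len(rec))
	for i, v := range rec {
		out[i] = strings.ToUpper(v)
	}
	return out
}

// Sync plays the framework role: it runs the reader, applies the
// transformer, and hands each record to the writer. A real framework
// would also need cancellation when the writer fails early.
func Sync(r Reader, t Transformer, w Writer) error {
	src := make(chan Record)
	dst := make(chan Record)
	readErr := make(chan error, 1)
	go func() {
		defer close(src)
		readErr <- r.Read(src)
	}()
	go func() {
		defer close(dst)
		for rec := range src {
			dst <- t(rec)
		}
	}()
	if err := w.Write(dst); err != nil {
		return err
	}
	return <-readErr
}

func main() {
	r := sliceReader{rows: []Record{{"1", "alice"}, {"2", "bob"}}}
	w := &memoryWriter{}
	if err := Sync(r, upper, w); err != nil {
		fmt.Println("sync failed:", err)
		return
	}
	fmt.Println(w.rows)
}
```

Because the framework only sees the two interfaces, adding a new data source in this sketch means adding one type that implements `Reader` or `Writer`, without touching `Sync`.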
For detailed information, please refer to the go-etl Data Synchronization Developer Documentation. This documentation provides guidance on how to use the go-etl framework for data synchronization, including information on its architecture, plugin system, and how to develop custom Reader and Writer plugins.
Currently, the data types and data type conversions in go-etl have been implemented. For more information, please refer to the go-etl Data Type Descriptions. This documentation provides details on the supported data types, their usage, and how to perform conversions between different types within the go-etl framework.
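To give a feel for what type conversion between column values can look like, here is a minimal sketch. The `ColumnValue` interface and the `StringValue` and `IntValue` types are hypothetical names chosen for this example; the real types and conversion rules are the ones described in the go-etl Data Type Descriptions.

```go
package main

import (
	"fmt"
	"strconv"
)

// ColumnValue is a hypothetical value abstraction: every value can be
// asked for a representation in another type, and may refuse.
type ColumnValue interface {
	AsInt64() (int64, error)
	AsString() (string, error)
	AsBool() (bool, error)
}

// StringValue holds a string and converts it on demand.
type StringValue string

func (s StringValue) AsInt64() (int64, error)   { return strconv.ParseInt(string(s), 10, 64) }
func (s StringValue) AsString() (string, error) { return string(s), nil }
func (s StringValue) AsBool() (bool, error)     { return strconv.ParseBool(string(s)) }

// IntValue holds an integer and converts it on demand.
type IntValue int64

func (i IntValue) AsInt64() (int64, error)   { return int64(i), nil }
func (i IntValue) AsString() (string, error) { return strconv.FormatInt(int64(i), 10), nil }
func (i IntValue) AsBool() (bool, error)     { return i != 0, nil }

func main() {
	values := []ColumnValue{StringValue("42"), IntValue(7), StringValue("abc")}
	for _, v := range values {
		n, err := v.AsInt64()
		if err != nil {
			// Conversion failures are reported, not silently coerced.
			fmt.Println("cannot convert to int64:", err)
			continue
		}
		fmt.Println("int64:", n)
	}
}
```

Returning an error from each conversion, instead of a zero value, lets a writer decide whether a bad value should fail the whole job or only the record.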
We have now implemented basic integration for databases, abstracting the database dialect (Dialect) interface. For specific implementation details, please refer to the Database Storage Developer Guide. This guide provides information on how to work with different database dialects within the framework, allowing for flexible and extensible database support.
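As an illustration of why a dialect abstraction helps, the sketch below builds the same query for two databases that quote identifiers differently. The `Dialect` interface shown here, with its `Name` and `Quote` methods, is an assumption for this example and is much smaller than whatever the Database Storage Developer Guide defines.

```go
package main

import (
	"fmt"
	"strings"
)

// Dialect is a hypothetical abstraction over database-specific SQL syntax.
type Dialect interface {
	Name() string
	Quote(identifier string) string
}

// mysqlDialect quotes identifiers with backticks.
type mysqlDialect struct{}

func (mysqlDialect) Name() string { return "mysql" }
func (mysqlDialect) Quote(id string) string {
	return "`" + strings.ReplaceAll(id, "`", "``") + "`"
}

// postgresDialect quotes identifiers with double quotes.
type postgresDialect struct{}

func (postgresDialect) Name() string { return "postgres" }
func (postgresDialect) Quote(id string) string {
	return `"` + strings.ReplaceAll(id, `"`, `""`) + `"`
}

// SelectQuery builds a SELECT statement using only the dialect's quoting rule,
// so the shared code never hard-codes one database's syntax.
func SelectQuery(d Dialect, table string, columns []string) string {
	quoted := make([]string, len(columns))
	for i, c := range columns {
		quoted[i] = d.Quote(c)
	}
	return "SELECT " + strings.Join(quoted, ", ") + " FROM " + d.Quote(table)
}

func main() {
	cols := []string{"id", "name"}
	for _, d := range []Dialect{mysqlDialect{}, postgresDialect{}} {
		fmt.Printf("%s: %s\n", d.Name(), SelectQuery(d, "users", cols))
	}
}
```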
Primarily used for parsing byte streams, such as files, message queues, Elasticsearch, etc. The byte stream format can be CSV, JSON, XML, etc.
Focused on file parsing, including CSV, Excel, etc. It abstracts the InputStream and OutputStream interfaces. For specific implementation details, refer to the Developer Guide for Tabular File Storage.
A collection of utilities for compilation, adding licenses, etc.
go generate ./...
This is the build command used to register developer-created reader and writer plugins into the program's code. Additionally, this command inserts compilation information, such as software version, Git version, Go compiler version, and compilation time, into the command line tool.
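One common way to wire plugins into a program is a name-to-factory registry that generated code fills in. The sketch below shows that general pattern only; `Plugin`, `Register`, `Names`, and the `version` variable are hypothetical, and the code that go-etl's build command actually generates may look quite different.

```go
package main

import (
	"fmt"
	"sort"
)

// Plugin is a hypothetical marker for a reader or writer implementation.
type Plugin interface {
	Name() string
}

// registry maps a plugin name to a factory that creates it.
var registry = map[string]func() Plugin{}

// Register adds a plugin factory. Generated code would call it once per
// plugin; registering the same name twice is reported as an error.
func Register(name string, factory func() Plugin) error {
	if _, ok := registry[name]; ok {
		return fmt.Errorf("plugin %q already registered", name)
	}
	registry[name] = factory
	return nil
}

// Names returns the registered plugin names in sorted order.
func Names() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// mysqlReader is a placeholder plugin used only for this example.
type mysqlReader struct{}

func (mysqlReader) Name() string { return "mysqlreader" }

// version is a placeholder that a build step could overwrite with the
// software version, Git revision, compiler version, and build time.
var version = "dev"

func main() {
	if err := Register("mysqlreader", func() Plugin { return mysqlReader{} }); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("version:", version)
	fmt.Println("plugins:", Names())
}
```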
A plugin template creation tool for data sources. It's used to create a new reader or writer template, in conjunction with the build command, to reduce the developer's workload.
A packaging tool for the data synchronization program and user documentation.
Automatically adds a license header to Go source files and formats the code with gofmt -s -w.
go run tools/license/main.go