
# TaCustomReader Plug-in

# I. Introduction

The TaCustomReader plug-in enables DataX to read data from TA. Under the hood, TaCustomReader connects to the remote TA database through JDBC and retrieves data from it by executing the corresponding SQL SELECT statement.

# II. Implementation Principle

In short, TaCustomReader connects to the remote TA database through the JDBC connector, generates a SELECT statement from the information configured by the user, and sends it to the TA cluster. The execution results are then assembled into an abstract data set using the data types defined by DataX and passed to the downstream Writer for processing.

# III. Function Description

# 3.1 Sample configuration

Configure a job that reads data from the TA cluster and prints it to the console:

{
  "job": {
    "setting": {
      "speed": {
        "channel": 1
      }
    },
    "content": [
      {
        "reader": {
          "name": "ta-custom-reader",
          "parameter": {
            "querySql": "select * from v_event_1 where \"$part_date\" = '2020-01-01'"
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "print": true,
            "encoding": "UTF-8"
          }
        }
      }
    ]
  }
}
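
The same job file can also be generated from a script, which avoids escaping the double quotes around `$part_date` by hand. The following is a minimal sketch, not part of the plug-in: the helper name `build_job` is hypothetical, and the structure simply mirrors the sample above.

```python
import json


def build_job(query_sql: str, channel: int = 1) -> dict:
    """Return a DataX job dict shaped like the sample configuration.

    build_job is a hypothetical helper for illustration only.
    """
    return {
        "job": {
            "setting": {"speed": {"channel": channel}},
            "content": [
                {
                    "reader": {
                        "name": "ta-custom-reader",
                        "parameter": {"querySql": query_sql},
                    },
                    "writer": {
                        "name": "streamwriter",
                        "parameter": {"print": True, "encoding": "UTF-8"},
                    },
                }
            ],
        }
    }


# json.dumps escapes the double quotes around $part_date automatically.
sql = "select * from v_event_1 where \"$part_date\" = '2020-01-01'"
job_json = json.dumps(build_job(sql), indent=2)
print(job_json)
```

The printed text can be saved to a file and used as the job configuration.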

# 3.2 Parameter description

  • server
    • Description: The connection information for the remote TA database, in the form SERVER:PORT.
    • Required: No
    • Default value: synchronous cluster configuration
  • querySql
    • Description: The SQL query used to select the data to synchronize. TaCustomReader uses the content of this item directly as the query, so you can write custom filtering SQL for your business scenario. For example, to synchronize the result of a multi-table join, use select a,b from table_a join table_b on table_a.id = table_b.id
    • Required: Yes
    • Default value: none
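
The rules above can be checked before a job is submitted. The following is a rough sketch under stated assumptions: `check_reader_parameter` is a hypothetical helper, not something the plug-in provides, and the regular expression is only one plausible reading of the SERVER:PORT form (it does not handle IPv6 addresses).

```python
import re

# Assumed reading of SERVER:PORT: a host without spaces or colons,
# then a colon, then a numeric port of up to five digits.
_SERVER_RE = re.compile(r"^[^\s:]+:\d{1,5}$")


def check_reader_parameter(parameter: dict) -> list:
    """Return a list of problems found in a reader "parameter" block."""
    problems = []

    # querySql is required and has no default value.
    query_sql = parameter.get("querySql")
    if not isinstance(query_sql, str) or not query_sql.strip():
        problems.append("querySql is required and must be a non-empty string")

    # server is optional; validate its shape only when it is present.
    server = parameter.get("server")
    if server is not None:
        if not (isinstance(server, str) and _SERVER_RE.match(server)):
            problems.append("server must have the form SERVER:PORT")

    return problems


# "ta-host" is a made-up host name used only for this example.
print(check_reader_parameter({"server": "ta-host"}))
```

An empty list means that the block satisfies both rules.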

# 3.3 Type conversion

Currently, TaCustomReader supports most Presto types, but a few types are not supported. Check that the column types returned by your query are covered.

The Presto type conversions that TaCustomReader uses for TA clusters are listed below:

| DataX internal type | Presto data type |
| --- | --- |
| Long | TINYINT, SMALLINT, INTEGER, BIGINT |
| Double | REAL, DOUBLE, DECIMAL |
| String | VARCHAR, CHAR, VARBINARY, JSON |
| Date | DATE, TIME, TIMESTAMP |
| Boolean | BOOLEAN |

Please note: Types other than those listed above are not supported.
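
To check a query's column types against this list ahead of time, the mapping can be expressed as a lookup. This is an illustrative sketch only: `datax_type` is a hypothetical helper, the plug-in's actual conversion code may differ, and stripping parameters such as `(32)` from type names is an assumption about how the types are written.

```python
# Mapping taken from the conversion list above.
PRESTO_TO_DATAX = {
    "TINYINT": "Long",
    "SMALLINT": "Long",
    "INTEGER": "Long",
    "BIGINT": "Long",
    "REAL": "Double",
    "DOUBLE": "Double",
    "DECIMAL": "Double",
    "VARCHAR": "String",
    "CHAR": "String",
    "VARBINARY": "String",
    "JSON": "String",
    "DATE": "Date",
    "TIME": "Date",
    "TIMESTAMP": "Date",
    "BOOLEAN": "Boolean",
}


def datax_type(presto_type: str) -> str:
    """Return the DataX internal type for a Presto type name.

    Raises ValueError for any type that is not in the list.
    """
    # Drop parameters such as VARCHAR(32) or DECIMAL(10, 2).
    base = presto_type.split("(", 1)[0].strip().upper()
    try:
        return PRESTO_TO_DATAX[base]
    except KeyError:
        raise ValueError(f"Unsupported Presto type: {presto_type}") from None


for name in ("bigint", "DECIMAL(10, 2)", "varchar(32)"):
    print(name, "->", datax_type(name))
```

A type outside the list, such as an array type, raises `ValueError` in this sketch, which mirrors the note that other types are not supported.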