# TaDataWriter Plug-ins
# I. Introduction
TaDataWriter enables DataX to transfer data to a TA cluster. The data is sent to the TA receiver.
# II. Functions and Limitations
TaDataWriter converts data from the DataX protocol into the TA cluster's internal data format. It has the following capabilities and constraints:
- Supports writing to TA clusters only.
- Supports data compression. The available compression formats are gzip, lzo, lz4, and snappy.
- Supports multi-threaded transmission.
- Can be used on TA nodes only.
# III. Function Description
# 3.1 Sample Configuration

    {
      "job": {
        "setting": {
          "speed": {
            "channel": 1
          }
        },
        "content": [
          {
            "reader": {
              "name": "streamreader",
              "parameter": {
                "column": [
                  {
                    "value": "ABCDEFG-123-abc",
                    "type": "string"
                  },
                  {
                    "value": "F53A58ED-E5DA-4F18-B082-7E1228746E88",
                    "type": "string"
                  },
                  {
                    "value": "login",
                    "type": "string"
                  },
                  {
                    "value": "2020-01-01 01:01:01",
                    "type": "date"
                  },
                  {
                    "value": "abcdefg",
                    "type": "string"
                  },
                  {
                    "value": "2019-08-08 08:08:08",
                    "type": "date"
                  },
                  {
                    "value": 123456,
                    "type": "long"
                  },
                  {
                    "value": true,
                    "type": "bool"
                  }
                ],
                "sliceRecordCount": 1000
              }
            },
            "writer": {
              "name": "ta-data-writer",
              "parameter": {
                "type": "track",
                "appid": "34c703a885014208a737911748a7b51c",
                "column": [
                  {
                    "index": "0",
                    "colTargetName": "#account_id",
                    "type": "string"
                  },
                  {
                    "index": "1",
                    "colTargetName": "#distinct_id"
                  },
                  {
                    "index": "2",
                    "colTargetName": "#event_name"
                  },
                  {
                    "index": "3",
                    "colTargetName": "#time",
                    "type": "date",
                    "dateFormat": "yyyy-MM-dd HH:mm:ss.SSS"
                  },
                  {
                    "index": "4",
                    "colTargetName": "testString",
                    "type": "string"
                  },
                  {
                    "index": "5",
                    "colTargetName": "testDate",
                    "type": "date",
                    "dateFormat": "yyyy-MM-dd HH:mm:ss.SSS"
                  },
                  {
                    "index": "6",
                    "colTargetName": "testLong",
                    "type": "number"
                  },
                  {
                    "index": "7",
                    "colTargetName": "testBoolean",
                    "type": "boolean"
                  },
                  {
                    "colTargetName": "add_clo",
                    "value": "addFlag",
                    "type": "string"
                  }
                ]
              }
            }
          }
        ]
      }
    }

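
To make the writer's `column` section in the sample more concrete, the following Python sketch shows how each entry could pick its value: an entry with `index` takes the reader column at that position, and an entry with `value` produces a constant. This is only an illustration of the configuration semantics, not the plugin's actual code. The function name `map_record` and the flat dictionary it returns are hypothetical and do not represent the format TA receives.

```python
from typing import Any, Dict, List


def map_record(record: List[Any], columns: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build {colTargetName: value} for one reader record (illustrative only)."""
    out: Dict[str, Any] = {}
    for col in columns:
        if "value" in col:
            # Constant column: the value comes from the config, not the reader.
            out[col["colTargetName"]] = col["value"]
        else:
            # The sample writes index as a string ("0"), so normalise it to int.
            out[col["colTargetName"]] = record[int(col["index"])]
    return out


# The first three indexed columns and the constant column from the sample.
columns = [
    {"index": "0", "colTargetName": "#account_id", "type": "string"},
    {"index": "1", "colTargetName": "#distinct_id"},
    {"index": "2", "colTargetName": "#event_name"},
    {"colTargetName": "add_clo", "value": "addFlag", "type": "string"},
]
record = ["ABCDEFG-123-abc", "F53A58ED-E5DA-4F18-B082-7E1228746E88", "login"]

print(map_record(record, columns))
```

The sketch performs no type conversion or date formatting. It only shows which source feeds each target field.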
# 3.2 Parameter Description
- type
  - Description: The type of data to write: `user_set` or `track`.
  - Required: Yes
  - Default: None
- appid
  - Description: The appid of the corresponding project.
  - Required: Yes
  - Default: None
- thread
  - Description: Number of threads.
  - Required: No
  - Default: 3
- compress
  - Description: Text compression type. Leaving it unset means no compression. Supported compression types are zip, lzo, lzop, tgz, bzip2.
  - Required: No
  - Default: No compression
- connType
  - Description: How data is received within the cluster: through the receiver, or sent directly to Kafka.
  - Required: No
  - Default: http
- column
  - Description: The list of fields to read. `type` specifies the data type. `index` specifies which column of the `reader` the field corresponds to (starting from 0). `value` marks the field as a constant: the data is not read from the `reader`, and the column is generated automatically from the configured `value`.
    The user can specify the column field information, configured as follows:

        [
          {
            "type": "Number",
            "colTargetName": "test_col", // name of the column generated in the written data
            "index": 0 // take the first column passed from the reader through DataX as a Number field
          },
          {
            "type": "string",
            "value": "testvalue",
            "colTargetName": "test_col" // TaDataWriter itself generates the string "testvalue" as this field
          },
          {
            "index": 0,
            "type": "date",
            "colTargetName": "testDate",
            "dateFormat": "yyyy-MM-dd HH:mm:ss.SSS"
          }
        ]

  - For user-specified column information, one of `index` or `value` must be specified. `type` is optional. When `type` is `date`, `dateFormat` can also be set, and it is optional.
  - Required: Yes
  - Default: all columns are read using the reader's types
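
The rules for `column` entries can be checked before submitting a job. The sketch below is a minimal, hypothetical pre-flight check in Python and is not part of TaDataWriter. It assumes that "one of `index` or `value`" means exactly one of the two, and it treats `dateFormat` on a non-date column as a likely mistake. Both are assumptions made for illustration rather than documented plugin behavior.

```python
from typing import Any, Dict, List


def validate_columns(columns: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems found; an empty list means no rule was violated."""
    problems: List[str] = []
    for pos, col in enumerate(columns):
        has_index = "index" in col
        has_value = "value" in col
        # Assumption: exactly one of index / value is expected per entry.
        if has_index == has_value:
            problems.append(
                f"column {pos}: specify exactly one of 'index' or 'value'"
            )
        # type is optional; the document spells it both "Number" and "number",
        # so the comparison here ignores case.
        col_type = str(col.get("type", "")).lower()
        # Assumption: dateFormat only makes sense together with type "date".
        if "dateFormat" in col and col_type != "date":
            problems.append(
                f"column {pos}: 'dateFormat' is set but type is not 'date'"
            )
    return problems


columns = [
    {"type": "Number", "colTargetName": "test_col", "index": 0},
    {"type": "string", "value": "testvalue", "colTargetName": "test_col"},
    {"colTargetName": "broken_col", "type": "string"},  # neither index nor value
]

for problem in validate_columns(columns):
    print(problem)
```

Running the check on the list above reports only the third entry, which has neither `index` nor `value`.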
# 3.3 Type Conversion
DataX internal types map to TaDataWriter data types as follows:

| DataX internal type | TaDataWriter data type |
|---|---|
| Int | Number |
| Long | Number |
| Double | Number |
| String | String |
| Boolean | Boolean |
| Date | Date |
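
For scripts that generate writer configurations, the table can be expressed as a simple lookup. The following Python sketch mirrors the six rows above. The names `DATAX_TO_TA` and `ta_type` are hypothetical helpers for illustration, and raising an error for unlisted types is an assumption about how a caller might want to handle them, not a description of the plugin's behavior.

```python
# Mirrors the type conversion table: DataX internal type -> TaDataWriter type.
DATAX_TO_TA = {
    "Int": "Number",
    "Long": "Number",
    "Double": "Number",
    "String": "String",
    "Boolean": "Boolean",
    "Date": "Date",
}


def ta_type(datax_type: str) -> str:
    """Look up the TaDataWriter type for a DataX internal type name."""
    try:
        return DATAX_TO_TA[datax_type]
    except KeyError:
        # Types not listed in the table are rejected rather than guessed.
        raise ValueError(
            f"no TaDataWriter mapping listed for DataX type {datax_type!r}"
        ) from None


print(ta_type("Long"))
print(ta_type("Date"))
```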