# TaJsonFileWriter Plug-in
# I. Introduction
TaJsonWriter writes data to local files in TA-JSON format. It serves users who need to export TA cluster data to JSON text.
# II. Functions and Limitations
TaJsonWriter converts data from the DataX protocol into local files in TA-JSON format. It has the following functions and limitations:
- Only JSON text files in TA format can be written.
- Multi-threaded writing is supported, with each thread writing a different subfile.
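
The per-thread subfile idea can be illustrated with a small Python sketch. It is not the plug-in's implementation: the function names and the `<fileName>_<random hex>` naming scheme are assumptions made for illustration only.

```python
import json
import os
import tempfile
import uuid
from concurrent.futures import ThreadPoolExecutor


def write_subfile(directory, file_name, records):
    """Write one worker's records to its own subfile and return its path."""
    # Hypothetical naming scheme: "<file_name>_<random hex>", so workers never share a file.
    path = os.path.join(directory, f"{file_name}_{uuid.uuid4().hex}")
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
    return path


def write_in_parallel(directory, file_name, slices):
    """Hand each slice of records to a separate worker thread."""
    with ThreadPoolExecutor(max_workers=max(1, len(slices))) as pool:
        futures = [pool.submit(write_subfile, directory, file_name, part) for part in slices]
        return [future.result() for future in futures]


with tempfile.TemporaryDirectory() as target:
    slices = [[{"n": 1}, {"n": 2}], [{"n": 3}]]
    paths = write_in_parallel(target, "test", slices)
    print(len(paths), "subfiles written")
```

Because every worker owns its file, no locking is needed between threads in this sketch.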
# III. Function Description
# 3.1 Sample configuration
    {
      "job": {
        "setting": {
          "speed": {
            "channel": 1
          }
        },
        "content": [
          {
            "reader": {
              "name": "streamreader",
              "parameter": {
                "column": [
                  {
                    "value": 123123,
                    "type": "long"
                  },
                  {
                    "value": "123123",
                    "type": "string"
                  },
                  {
                    "value": "login",
                    "type": "string"
                  },
                  {
                    "value": "2019-08-16 08:08:08",
                    "type": "date"
                  },
                  {
                    "value": "2019-08-16 08:08:08",
                    "type": "date"
                  },
                  {
                    "value": "2222",
                    "type": "string"
                  },
                  {
                    "value": "2019-08-16 08:08:08",
                    "type": "date"
                  },
                  {
                    "value": "test",
                    "type": "bytes"
                  },
                  {
                    "value": true,
                    "type": "bool"
                  }
                ],
                "sliceRecordCount": 100
              }
            },
            "writer": {
              "name": "ta-json-writer",
              "parameter": {
                "type": "event",
                "path": "/data/export/ta_datafile/",
                "filename": "test",
                "column": [
                  {
                    "index": "0",
                    "colTargetName": "#user_id"
                  },
                  {
                    "index": "1",
                    "colTargetName": "#distinct_id"
                  },
                  {
                    "index": "2",
                    "colTargetName": "#event_name"
                  },
                  {
                    "index": "3",
                    "colTargetName": "#time",
                    "type": "date",
                    "dateFormat": "yyyy-MM-dd HH:mm:ss.SSS"
                  },
                  {
                    "index": "4",
                    "colTargetName": "#event_time",
                    "type": "string"
                  },
                  {
                    "index": "5",
                    "colTargetName": "#account_id",
                    "type": "string"
                  },
                  {
                    "index": "6",
                    "colTargetName": "timetest",
                    "type": "date",
                    "dateFormat": "yyyy-MM-dd HH:mm:ss.SSS"
                  },
                  {
                    "index": "6",
                    "colTargetName": "timetest2",
                    "type": "date",
                    "dateFormat": "yyyy-MM-dd HH:mm:ss"
                  },
                  {
                    "index": "7",
                    "colTargetName": "os_1",
                    "type": "string"
                  },
                  {
                    "index": "7",
                    "colTargetName": "os_2",
                    "type": "string"
                  },
                  {
                    "index": "8",
                    "colTargetName": "booleantest",
                    "type": "boolean"
                  },
                  {
                    "index": "0",
                    "colTargetName": "testNumber",
                    "type": "number"
                  },
                  {
                    "colTargetName": "add_clo",
                    "value": "123123",
                    "type": "string"
                  }
                ]
              }
            }
          }
        ]
      }
    }
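
In the sample above, each writer `column` entry either points at a reader column through `index` or carries a constant `value`. The Python sketch below illustrates that selection rule only. The function name `build_record` and the flat output dictionary are assumptions; the actual layout of the TA-JSON line produced by the plug-in is not described in this document.

```python
def build_record(row, columns):
    """Build one output record from a reader row and the writer column config."""
    record = {}
    for col in columns:
        if "value" in col:
            # Constant column: the reader row is not consulted.
            record[col["colTargetName"]] = col["value"]
        else:
            # The sample config writes indexes as strings, so normalise to int here.
            record[col["colTargetName"]] = row[int(col["index"])]
    return record


# A shortened reader row and column list taken from the sample configuration.
row = [123123, "123123", "login"]
columns = [
    {"index": "0", "colTargetName": "#user_id"},
    {"index": "2", "colTargetName": "#event_name"},
    {"index": "0", "colTargetName": "testNumber", "type": "number"},
    {"colTargetName": "add_clo", "value": "123123", "type": "string"},
]
print(build_record(row, columns))
```

Note that the same reader index may feed several target columns, as `#user_id` and `testNumber` do here.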
# 3.2 Parameter description
- path
  - Description: path of the target directory on the local file system. TaJsonWriter writes multiple files under this directory.
  - Required: Yes
  - Default value: none
- fileName
  - Description: name of the files written by TaJsonWriter. A random suffix is added to form the actual file name written by each thread.
  - Required: Yes
  - Default value: none
- writeMode
  - Description: how TaJsonWriter cleans up existing data before writing:
    - truncate: clear all files with the fileName prefix under the directory before writing.
    - append: do no processing before writing. TaJsonWriter writes directly using fileName and guarantees that file names do not conflict.
    - nonConflict: report an error immediately if files with the fileName prefix already exist under the directory.
  - Required: Yes
  - Default value: append
- encoding
  - Description: encoding of the written files.
  - Required: No
  - Default value: utf-8
- column
  - Description: list of fields to write. type specifies the data type, and index specifies the corresponding reader column (starting from 0). value marks the column as a constant: no data is read from the reader, and the column is generated from the given value.

The user can specify the column field information, configured as follows:

    [
      {
        "type": "Number",
        "colTargetName": "test_col", // name of the generated column
        "index": 0 // take the first column passed from the reader through DataX as a Number field
      },
      {
        "type": "string",
        "value": "testvalue",
        "colTargetName": "test_col" // generate a constant string field with the value testvalue instead of reading from the reader
      },
      {
        "index": 0,
        "type": "date",
        "colTargetName": "testDate",
        "dateFormat": "yyyy-MM-dd HH:mm:ss.SSS"
      }
    ]
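
The three `writeMode` values described in the parameter list above can be sketched in Python as a pre-write step. This is an illustration under assumptions, not the plug-in's code: `prepare_directory` is a hypothetical helper, and matching files by a plain name prefix is a simplification.

```python
import os
import tempfile


def prepare_directory(path, file_name, write_mode):
    """Run the pre-write step for one writeMode value."""
    existing = [
        os.path.join(path, name)
        for name in os.listdir(path)
        if name.startswith(file_name) and os.path.isfile(os.path.join(path, name))
    ]
    if write_mode == "truncate":
        # Remove every file that carries the fileName prefix.
        for old_file in existing:
            os.remove(old_file)
    elif write_mode == "nonConflict":
        # Refuse to write when prefixed files are already present.
        if existing:
            raise FileExistsError(f"files with prefix {file_name!r} already exist in {path}")
    elif write_mode != "append":
        # "append" needs no preparation; anything else is rejected in this sketch.
        raise ValueError(f"unsupported writeMode: {write_mode}")


with tempfile.TemporaryDirectory() as target:
    open(os.path.join(target, "test_old"), "w").close()
    prepare_directory(target, "test", "truncate")
    print(os.listdir(target))
```

In this sketch `append` is safe only because each writer thread is assumed to pick a file name that does not collide with existing files.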
# 3.3 Type conversion
TaJsonFileWriter maps DataX internal types to its own data types as follows:
DataX internal type | TaJsonWriter data type |
---|---|
Int | Number |
Long | Number |
Double | Number |
String | String |
Boolean | Boolean |
Date | Date |
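
The mapping above can be expressed as a lookup table. The Python sketch below copies the table and adds a hypothetical helper, `format_date_millis`, that renders a date the way the Java pattern `yyyy-MM-dd HH:mm:ss.SSS` used in the configuration would. How the plug-in itself formats dates is an assumption here.

```python
from datetime import datetime

# Mapping copied from the type conversion table.
DATAX_TO_TA = {
    "Int": "Number",
    "Long": "Number",
    "Double": "Number",
    "String": "String",
    "Boolean": "Boolean",
    "Date": "Date",
}


def format_date_millis(moment):
    """Render a datetime like the Java pattern yyyy-MM-dd HH:mm:ss.SSS."""
    # strftime has no millisecond directive, so append the milliseconds manually.
    return moment.strftime("%Y-%m-%d %H:%M:%S") + f".{moment.microsecond // 1000:03d}"


print(DATAX_TO_TA["Long"])
print(format_date_millis(datetime(2019, 8, 16, 8, 8, 8)))
```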