# Data Re-run Function

# I. Overview

The data re-run function queries data in the TA database with SQL statements and writes the returned results back into the TA database as new events or new user features.

# II. Instructions for Use

# 2.1 Command Description

Log in to any TA server and run `su - ta` to switch to the TA user.

Run `ta-tool user_event_import` with the `-conf` option to read the configuration file. The command is as follows:

ta-tool user_event_import -conf <config file> [--date <data date>]

# 2.2 Command Parameter Description

# 2.2.1 -conf

Required. The path of the configuration file for the data re-run task. Wildcards are supported, for example `/data/config/*` or `./config/*.json`.

# 2.2.2 --date

Optional. The data date, in YYYY-MM-DD format, used as the reference time when time macros are replaced. If it is omitted, the current date is used. For details on time macros, see section 2.4, Time Macro Use.

# 2.2.3 Example

ta-tool user_event_import -conf /data/home/ta/import_configs/*.json

The parameter is the full path of the configuration file; the wildcard lets the command read multiple configuration files.
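When the command is launched from a script, the argument list can be assembled and checked before anything runs. The following is a minimal sketch, not part of ta-tool: the helper name `build_rerun_command` is hypothetical, and it assumes the tool expands wildcards itself, so the pattern is passed through unchanged.

```python
import re
from datetime import datetime


def build_rerun_command(conf_path, data_date=None):
    """Build the argument list for `ta-tool user_event_import`.

    Only the two options described in this document are used:
    -conf (required) and --date (optional, YYYY-MM-DD).
    """
    argv = ["ta-tool", "user_event_import", "-conf", conf_path]
    if data_date is not None:
        # Enforce the exact YYYY-MM-DD shape, then reject impossible dates.
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", data_date):
            raise ValueError("data date must use the format YYYY-MM-DD")
        datetime.strptime(data_date, "%Y-%m-%d")  # raises ValueError if invalid
        argv += ["--date", data_date]
    return argv


# The list can then be handed to subprocess.run(argv) on a TA server.
command = build_rerun_command("/data/home/ta/import_configs/*.json", "2018-07-01")
```

Validating the date up front catches mistakes such as a nonexistent calendar day before the task starts.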

# 2.3 Description of Configuration File

# 2.3.1 The sample configuration file is as follows:

The core of the data re-run function is a configuration file containing a query statement and configuration parameters. Each configuration file corresponds to one data re-run task. The configuration file for a backtracking event looks like this:

{
  "event_desc": {
    "ltv_event": "User Life Cycle"
  },
  "appid": "APPID",
  "type": "event",
  "property_desc": {
    "register_date": "registration date",
    "date_prop": "LTV Days"
  },
  "sql": "SELECT 'thinkinggame' \"#account_id\",'ltv_event' \"#event_name\",register_date \"#time\",register_date,ltv,date_prop FROM  (SELECT recharge_money ltv,register_date,CASE date_trunc('day', cast(register_date AS TIMESTAMP)) WHEN CURRENT_DATE - interval '1' DAY THEN '次日' WHEN CURRENT_DATE - interval '2' DAY THEN '三日' WHEN CURRENT_DATE - interval '6' DAY THEN '七日' WHEN CURRENT_DATE - interval '13' DAY THEN '十四日' WHEN CURRENT_DATE - interval '29' DAY THEN '三十日' ELSE NULL END date_prop FROM (SELECT sum(recharge_money) recharge_money ,register_date FROM (SELECT \"#user_id\" ,sum(recharge_value) recharge_money ,\"$part_date\" recharge_date FROM v_event_0 WHERE \"$part_event\" = 'recharge' AND \"$part_date\" > '2018-06-30' AND \"$part_date\" < '2018-07-30' GROUP BY \"#user_id\" , \"$part_date\") a LEFT JOIN (SELECT \"#user_id\" , \"$part_date\" register_date FROM v_event_0 WHERE \"$part_event\" = 'player_register' AND \"$part_date\" > '2018-06-30' AND \"$part_date\" < '2018-07-30') b ON a.\"#user_id\" = b.\"#user_id\" WHERE b.\"#user_id\" IS NOT NULL AND recharge_date >= register_date GROUP BY register_date) c) d WHERE date_prop IS NOT NULL"
}

::: tips

When you need to backtrack a List type property, note that the underlying storage holds a list as a string separated by `\t`, so you need to convert it as follows:

split("arrayColumn", chr(0009)) as arrayColumn.

:::
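To make the storage format concrete, the sketch below mirrors the same idea outside SQL: a stored list is a single tab-separated string, and splitting on character code 9 recovers the elements. The function names are hypothetical and only illustrate the format described in the tip.

```python
TAB = chr(9)  # character code 9 is the tab character, the same as "\t"


def split_stored_list(raw):
    """Turn a tab-separated stored string back into a list of strings."""
    return raw.split(TAB)


def join_as_stored_list(items):
    """Produce the tab-separated form in which a list is stored."""
    return TAB.join(items)


stored = join_as_stored_list(["sword", "shield", "potion"])
items = split_stored_list(stored)
```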

# 2.3.2 Description of configuration parameters

Each configuration file is represented as JSON, and the following is the meaning of each element:

  • event_desc
    • Description: Optional. A JSON object that sets the display names of the new events
    • Key: event name
    • Value: display name
  • appid
    • Description: Required. The APPID of the target project to which the query results are written
  • type
    • Description: Required. Whether the results are written to the event table or the user table of the target project
    • Available values: event, user
  • property_desc
    • Description: Optional. A JSON object that sets the display names of the attributes
    • Key: attribute name
    • Value: display name
  • sql
    • Description: Required. A string containing the query statement. Note that the column names of the returned result determine the meaning of each column, and the result must contain the following columns:
      • Required column 1: `#account_id` or `#distinct_id`, at least one of them, corresponding to the account ID and the anonymous ID of the triggering user. If the generated event is not triggered by an individual user, such as the LTV event in the example above, it is recommended to use a fixed value that falls outside the ID rules, such as 'system' or 'admin'
      • Required column 2: `#event_name`, the event name, which identifies the event type; setting it to a fixed value is recommended
      • Required column 3: `#time`, the time at which the event occurred. The format must be `yyyy-MM-dd HH:mm:ss` or `yyyy-MM-dd HH:mm:ss.SSS`

All remaining columns are used as attributes of the event, with the column name as the attribute name.
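The rules above can be checked before a task is submitted. The following is a rough sketch of such a pre-flight check, not something ta-tool provides: `check_config` is a hypothetical helper, and it detects required columns with a simple text search for the double-quoted column name, which is a heuristic and not a real SQL parse.

```python
REQUIRED_KEYS = ("appid", "type", "sql")
ALLOWED_TYPES = ("event", "user")


def check_config(config):
    """Return a list of problems found in a parsed configuration dict."""
    problems = [
        f"missing required key: {key}" for key in REQUIRED_KEYS if key not in config
    ]

    task_type = config.get("type")
    if task_type is not None and task_type not in ALLOWED_TYPES:
        problems.append("type must be 'event' or 'user'")

    sql = config.get("sql", "")

    def names_column(column):
        # Heuristic: the column must appear as a double-quoted identifier.
        return f'"{column}"' in sql

    if not (names_column("#account_id") or names_column("#distinct_id")):
        problems.append("sql must return #account_id or #distinct_id")
    if not names_column("#time"):
        problems.append("sql must return #time")
    if task_type == "event" and not names_column("#event_name"):
        problems.append("sql must return #event_name for event tasks")
    return problems


example = {
    "appid": "APPID",
    "type": "event",
    "sql": "SELECT 'system' \"#account_id\", 'ltv_event' \"#event_name\", "
           "register_date \"#time\", ltv FROM t",
}
issues = check_config(example)
```

Load each file with `json.load` first; a file that is not valid JSON fails at that step, before any of these checks apply.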

In addition to backtracking events, query results can also be used to generate new user features or overwrite existing ones. The configuration file is as follows:

{
  "appid": "8d1820678a064397bbfcc9732f352e75",
  "type": "user",
  "property_desc": {
    "user_level": "User Level",
    "coin_num": "Gold coin stock"
  },
  "sql": "select \"#account_id\",localtimestamp \"#time\",user_level,coin_num from v_user_0"
}

Like the configuration file for a backtracking event, this file is JSON. It differs in the following ways:

  • `event_desc` is not needed
  • `type` is 'user'
  • The SQL does not need a `#event_name` column; only the following two columns are required:
    • Required column 1: `#account_id` or `#distinct_id`, at least one of them
    • Required column 2: `#time`, the time
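Because the SQL contains double-quoted column names, writing the configuration file by hand means escaping every quote. One way to avoid that is to generate the file with a JSON serializer. The sketch below assumes nothing beyond the keys described above; `make_user_config` is a hypothetical helper name.

```python
import json


def make_user_config(appid, sql, property_desc=None):
    """Serialize a user-type re-run configuration to a JSON string."""
    config = {"appid": appid, "type": "user", "sql": sql}
    if property_desc:
        config["property_desc"] = property_desc
    # json.dumps escapes the double quotes inside the SQL automatically.
    return json.dumps(config, ensure_ascii=False, indent=2)


text = make_user_config(
    "APPID",
    'select "#account_id",localtimestamp "#time",user_level,coin_num from v_user_0',
    {"user_level": "User Level", "coin_num": "Gold coin stock"},
)
```

The resulting string can be written to a `.json` file and passed to the `-conf` option.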

# 2.4 Time Macro Use

You can use time macros in place of time parameters inside the configuration file of a data re-run task. When the data re-run command is executed, ta-tool takes `--date` as the reference time, calculates the time offset from the parameters of each time macro, and replaces the macro in the configuration file with the result. Supported time macro formats include `@[{yyyyMMdd}]`, `@[{yyyyMMdd} - {nday}]`, `@[{yyyyMMdd} + {nday}]`, etc.

  • `yyyyMMdd` can be replaced with any date format that Java DateFormat can parse, for example `yyyy-MM-dd HH:mm:ss.SSS` or `yyyyMMddHH000000`
  • `n` can be any integer and represents the size of the time offset
  • `day` is the unit of the time offset and can be one of the following: day, hour, minute, week, month
  • Example: Suppose the current time is 2018-07-01 15:13:23.234
    • `@[{yyyyMMdd}]` is replaced with 20180701
    • `@[{yyyy-MM-dd} - {1day}]` is replaced with 2018-06-30
    • `@[{yyyyMMddHH} + {2hour}]` is replaced with 2018070117
    • `@[{yyyyMMddHHmm00} - {10minute}]` is replaced with 20180701150300
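The replacement rules can be sketched in a few lines of code, which is handy for previewing what a configuration will look like for a given reference time. This is an illustrative reimplementation under stated assumptions, not the code ta-tool runs: the function names are hypothetical, only the pattern letters `yyyy`, `MM`, `dd`, `HH`, `mm`, `ss` and `SSS` are handled, the reference time is passed in explicitly, and month offsets clamp the day to the end of the target month.

```python
import calendar
import re
from datetime import datetime, timedelta

# Supported Java-style pattern tokens; any other character is copied as-is.
_TOKENS = [
    ("yyyy", lambda t: f"{t.year:04d}"),
    ("SSS", lambda t: f"{t.microsecond // 1000:03d}"),
    ("MM", lambda t: f"{t.month:02d}"),
    ("dd", lambda t: f"{t.day:02d}"),
    ("HH", lambda t: f"{t.hour:02d}"),
    ("mm", lambda t: f"{t.minute:02d}"),
    ("ss", lambda t: f"{t.second:02d}"),
]
_UNITS = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
}
_MACRO = re.compile(r"@\[\{([^{}]+)\}\s*(?:([+-])\s*\{(\d+)([a-z]+)\}\s*)?\]")


def format_java_style(t, pattern):
    """Render datetime t using the subset of pattern letters listed above."""
    out, i = [], 0
    while i < len(pattern):
        for token, render in _TOKENS:
            if pattern.startswith(token, i):
                out.append(render(t))
                i += len(token)
                break
        else:
            out.append(pattern[i])  # literal character such as '-', ':' or '0'
            i += 1
    return "".join(out)


def shift_months(t, months):
    """Move t by whole months, clamping the day (an assumption of this sketch)."""
    year, month0 = divmod(t.year * 12 + (t.month - 1) + months, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return t.replace(year=year, month=month0 + 1, day=min(t.day, last_day))


def expand_macros(text, reference):
    """Replace every time macro in text, using reference as the base time."""
    def replace(match):
        pattern, sign, amount, unit = match.groups()
        t = reference
        if sign:
            n = int(amount) if sign == "+" else -int(amount)
            if unit == "month":
                t = shift_months(t, n)
            elif unit in _UNITS:
                t = t + n * _UNITS[unit]
            else:
                raise ValueError(f"unsupported offset unit: {unit}")
        return format_java_style(t, pattern)
    return _MACRO.sub(replace, text)


reference = datetime(2018, 7, 1, 15, 13, 23, 234000)
where = expand_macros("\"$part_date\" > '@[{yyyy-MM-dd} - {1day}]'", reference)
```

With the reference time from the example above, `where` becomes `"$part_date" > '2018-06-30'`.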