# Data Cleanup Tool
# I. Introduction
The data cleanup tool clears data from the TA system. It can remove event data by time period and event type, or remove all data (including user data). The metadata of the event table and the user table (i.e. the table structure) is not cleared. If you need to clear the metadata, we recommend creating a new project and deleting the old one.
WARNING
We recommend deleting data only to clear test data or anomalous data. Frequent deletion is not recommended. Please use this tool with care.
# II. Instructions for use
In version 2.7 and earlier, run `ta-data-del` to enter the data cleanup tool.
The data cleanup tool is available only to users of the self-hosted service. Log in as root to any server in the self-hosted cluster and run `su - ta`, then run `ta-tool data_del` to enter the data cleanup tool interface.
# 2.1 Enter the appid of the project to be deleted
The appid of a project can be found on the project management page in the TA backend.
# 2.2 Confirm the project name
After you enter the appid, the tool shows the name of the project whose data will be deleted. Enter 'y' to confirm or 'n' to cancel the operation.
# 2.3 Select Data Delete Type
Next, select the data to be deleted. You can delete only the event data (that is, the data in the event table) or all data (all event table data and all user table data). The following table shows what each of the two deletion types covers:
| Deletion type | Event data only | All data |
|---|---|---|
| Deletes event table data | √ | √ |
| Deletes user table data | | √ |
| Time range can be selected | √ | All time |
| Events can be selected | √ | All events |
# 2.4 Operations when deleting event data only
Next, enter the **event name** of each event to delete. The event name entered here is the key used when the data was transmitted, **not the display name**. You can look up event names on the metadata management page. To delete multiple events, separate their names with ",". After you enter them, the tool shows the event names that will be deleted.
If you confirm directly without entering any characters, the data of all events will be deleted.
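The input rule above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code: the function name `parse_event_names` is hypothetical, and trimming the spaces around each name is an assumption the source does not confirm.

```python
def parse_event_names(raw: str) -> list[str]:
    """Split a comma-separated answer into event names.

    An empty list stands for "all events", mirroring the blank-input rule.
    Trimming surrounding spaces is an assumption made for this sketch.
    """
    names = []
    for part in raw.split(","):
        name = part.strip()
        if name:  # skip empty pieces such as a trailing comma
            names.append(name)
    return names


if __name__ == "__main__":
    print(parse_event_names("login,pay"))
    print(parse_event_names(""))
```

Returning an empty list for blank input keeps the "all events" case explicit, so the caller has to handle it deliberately.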
# 2.5 Fill in the time range for deleting event data
Next, enter the time period of the data to delete. The time granularity is days. Enter dates in the `yyyy-MM-dd` format. If you press Enter without typing a date, the start date defaults to the first day of the existing data and the end date defaults to the last day. The possible combinations are as follows (an empty cell means the field is left blank):
| Deletion scope | Start date | End date |
|---|---|---|
| Delete data for all time periods | | |
| Delete data from the first day up to a specific date | | Enter date |
| Delete data from a specific date onward | Enter date | |
| Delete data for a specific period | Enter date | Enter date |
For example, to delete data for all time periods, press Enter at both the start date and the end date prompts without typing anything.
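The defaulting rules for blank dates can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the tool's implementation: `resolve_range`, `parse_day`, and the `first_day` / `last_day` arguments (standing for the first and last day of existing data) are hypothetical names, and the start-before-end check is an added safeguard.

```python
from datetime import date, datetime
from typing import Optional


def parse_day(text: str) -> Optional[date]:
    """Parse a yyyy-MM-dd string; blank input returns None."""
    text = text.strip()
    if not text:
        return None
    return datetime.strptime(text, "%Y-%m-%d").date()


def resolve_range(start_text: str, end_text: str,
                  first_day: date, last_day: date) -> tuple[date, date]:
    """Fill blank answers with the first / last day of existing data."""
    start = parse_day(start_text)
    end = parse_day(end_text)
    if start is None:
        start = first_day  # blank start date: first day of existing data
    if end is None:
        end = last_day     # blank end date: last day of existing data
    if start > end:
        raise ValueError("start date must not be later than end date")
    return start, end


if __name__ == "__main__":
    first, last = date(2023, 1, 1), date(2023, 1, 31)
    print(resolve_range("", "2023-01-10", first, last))
```

Each row of the table corresponds to one combination of blank and non-blank arguments to `resolve_range`.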
# 2.5 Use of custom conditions
**If you specify both the start date and the end date of the deletion**, a custom condition input box appears. It lets you use SQL to further refine which data is deleted. The custom condition you enter is used as the WHERE condition of the delete statement.
Custom conditions follow Presto syntax. Before deletion, the tool shows an SQL preview of the delete statement and the amount of data that matches the conditions. **We recommend** first querying the data to be deleted in the SQL IDE, then copying the conditions from its WHERE clause into the custom condition here.
WARNING
Custom conditions do not support sub-query statements.
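To show how a custom condition narrows the deletion, the Python sketch below combines a date range with a custom condition into one WHERE clause. It is not the statement the tool actually generates: the function `build_where_clause` and the column name `event_date` are hypothetical, and the keyword check for sub-queries is a deliberately crude assumption.

```python
import re
from datetime import date

DATE_COLUMN = "event_date"  # hypothetical column name, for illustration only


def build_where_clause(start_day: str, end_day: str,
                       custom_condition: str = "") -> str:
    """Combine the date range and an optional custom condition."""
    # Validate the dates; fromisoformat raises ValueError on bad input.
    date.fromisoformat(start_day)
    date.fromisoformat(end_day)

    condition = custom_condition.strip()
    # Crude check: reject anything that contains a SELECT keyword.
    if re.search(r"\bselect\b", condition, flags=re.IGNORECASE):
        raise ValueError("sub-queries are not supported in custom conditions")

    clause = f"{DATE_COLUMN} BETWEEN '{start_day}' AND '{end_day}'"
    if condition:
        # Parentheses keep an OR inside the custom condition from
        # escaping the date range.
        clause += f" AND ({condition})"
    return clause


if __name__ == "__main__":
    print(build_where_clause("2023-01-01", "2023-01-07", "os = 'iOS'"))
```

Wrapping the custom condition in parentheses is the main design point: without them, a condition such as `a = 1 OR b = 2` would match rows outside the selected dates.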
# 2.6 Final confirmation
Finally, before the data is deleted, the tool asks for a final confirmation that lists the name of the project, the names of the events, and the time period to be deleted. Enter 'y' to start deleting data. If anything is wrong, enter 'n' to exit the tool and start again.
# 2.7 Operations when deleting all data
If you choose to delete all data, the tool goes straight to the final confirmation. Enter 'y' to start deleting the data. If anything is wrong, enter 'n' to exit the tool and start again.
# III. Precautions
# 3.1 The data deletion tool only clears the data in the tables and does not clear the following data:
- Project member data
- Project information data (project ID and APPID)
- Volume of events used
- Metadata of events, event properties, and user features
- Display names, comments, and hidden status configured in metadata management, as well as kanban and report information
If you need to clear the above data, we recommend creating a new project.
# 3.2 Before using the data deletion tool, make sure that no new data for the period being deleted is sent in while the deletion runs (e.g. when you delete yesterday's data, do not send in yesterday's data during the deletion) so that the deletion takes full effect.
# 3.3 A prompt appears when performing a data deletion action:
After the background data synchronization task is over, perform deletion and try again in 5 minutes, please wait! 1st time.
You can wait for the program to retry automatically, or wait a while and run the delete action again.
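The wait-and-retry behaviour described by this prompt can be pictured with the Python sketch below. It is a generic polling loop, not the tool's code: `delete_when_sync_idle`, the `sync_running` and `delete` callables, and the `max_attempts` limit are all hypothetical. Only the 5-minute wait comes from the prompt text.

```python
import time
from typing import Callable


def delete_when_sync_idle(sync_running: Callable[[], bool],
                          delete: Callable[[], None],
                          wait_seconds: int = 300,
                          max_attempts: int = 12,
                          sleep: Callable[[float], None] = time.sleep) -> int:
    """Run delete() once no synchronization task is running.

    Returns the attempt number on which the deletion ran.
    max_attempts is an assumption; the source does not state a limit.
    """
    for attempt in range(1, max_attempts + 1):
        if not sync_running():
            delete()
            return attempt
        sleep(wait_seconds)  # wait 5 minutes before checking again
    raise TimeoutError("synchronization task did not finish in time")


if __name__ == "__main__":
    states = iter([True, False])  # busy on the first check, idle on the second
    attempt = delete_when_sync_idle(lambda: next(states),
                                    lambda: print("deleting"),
                                    sleep=lambda seconds: None)
    print(attempt)
```

Passing `sleep` as a parameter lets the loop be exercised without actually waiting.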