
# Python SDK User Guide

This guide shows you how to use the Python SDK to access your project. The source code of the Python SDK is available on GitHub.

**Latest version**: 1.7.0

**Update time**: 2021-12-29

# I. Initializing the SDK

  1. Get the Python SDK through pip:

pip install ThinkingDataSdk

Upgrade command:

pip install --upgrade ThinkingDataSdk

  2. Initialize the SDK

Import the SDK in your program's initialization code. There are two ways to import it:

from tgasdk.sdk import TGAnalytics, LoggingConsumer
ta = TGAnalytics(LoggingConsumer(log_directory))

Or

import tgasdk.sdk
ta = tgasdk.sdk.TGAnalytics(tgasdk.sdk.LoggingConsumer(log_directory))

Initialize to get an SDK instance:

# Initialize the SDK; Consumer can be one of LoggingConsumer, BatchConsumer, AsyncBatchConsumer, or DebugConsumer
# Default
ta = TGAnalytics(Consumer)
# Add UUID for de-duplication
ta = TGAnalytics(Consumer, enable_uuid=True)

There are four ways you can get an SDK instance:

**(1) LoggingConsumer**: writes data to local files in batches in real time. Files are split by day and must be used with LogBus for data upload. By default, the file is written once the cached data exceeds 8K; you can change this with the buffer_size parameter (in bytes). If you need to write immediately, call the flush() method.

# Split files by day (default)
ta = TGAnalytics(LoggingConsumer(log_directory))

If you want to split the file by the hour, you can initialize it as follows:

ta = TGAnalytics(LoggingConsumer(log_directory, rotate_mode = ROTATE_MODE.HOURLY))

If you want to split the file by size, you can initialize it as follows:

# Split files at 1 GB (log_size is in MB)
ta = TGAnalytics(LoggingConsumer(log_directory, log_size=1024))

If you want to customize the prefix name of the file, you can initialize it as follows:

# Custom file prefix
ta = TGAnalytics(LoggingConsumer(log_directory, file_suffix="xx"))

log_directory is the local folder address to write to. Set the monitored folder of LogBus to this same address, and LogBus will monitor and upload the data.
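
If you need to adjust the write buffer described above, you can set it at construction time. A minimal sketch, assuming buffer_size is accepted as a constructor keyword argument in bytes, as described above:

# Sketch: enlarge the write buffer to 64 KB (buffer_size is in bytes)
ta = TGAnalytics(LoggingConsumer(log_directory, buffer_size=64 * 1024))
# Write any buffered data to the file immediately
ta.flush()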

**(2) BatchConsumer**: transmits data to the TA server in batches in real time (synchronous, blocking); no transfer tool is needed. When a transmission fails due to network problems, it is retried 3 times; if it still fails, the data is stored in a buffer. The buffer size is configurable and defaults to 50, i.e. the buffer holds at most 50 * 20 records (20 is the batch size of each upload, also configurable). During prolonged network outages there is a risk of data loss.

ta = TGAnalytics(BatchConsumer(SERVER_URI,APP_ID))

If you want intranet transmission, you can initialize it as follows:

# compress defaults to True, which means gzip compression; set it to False for intranet transmission
batchConsumer = BatchConsumer(server_uri="url", appid="appid",
                              compress=False)
ta = TGAnalytics(batchConsumer)

SERVER_URI is the URL for data transfer, and APP_ID is the APP ID of your project.

If you are using the cloud service, enter the following URL:

http://receiver.ta.thinkingdata.cn

If you are using a private deployment, enter the following URL:

http://Data Acquisition Address

Note: for SDK versions earlier than 1.3.0, enter the following URLs instead:

http://receiver.ta.thinkingdata.cn/logagent
http://Data Acquisition Address/logagent

**(3) AsyncBatchConsumer**: transmits data to the TA server in batches in real time (asynchronous, non-blocking); no transfer tool is needed. **Not recommended for production environments.**

ta = TGAnalytics(AsyncBatchConsumer(SERVER_URI,APP_ID,flush_size=200,queue_size=100000))

SERVER_URI is the URL for data transfer, and APP_ID is the APP ID of your project.

If you are using the cloud service, enter the following URL:

http://receiver.ta.thinkingdata.cn

If you are using a private deployment, enter the following URL:

http://Data Acquisition Address

flush_size is the threshold of the queue cache; once exceeded, data is sent immediately.

queue_size is the size of the cache queue; data in excess of queue_size will be lost.

**(4) DebugConsumer**: transmits data to the TA server one record at a time in real time; no transfer tool is needed. If any part of a record is invalid, the entire record is not stored, and a detailed error description is returned. Not recommended for production environments.

ta = TGAnalytics(DebugConsumer(SERVER_URI, APP_ID))

If you want **DebugConsumer** to only validate data without writing it into the TA library, you can initialize it as follows:

# write_data defaults to True, which means the data is stored in TA
ta = TGAnalytics(DebugConsumer(server_uri="SERVER_URI", appid="APP_ID", write_data=False))

SERVER_URI is the URL for data transfer, and APP_ID is the APP ID of your project.

If you are using the cloud service, enter the following URL:

http://receiver.ta.thinkingdata.cn

If you are using a private deployment, enter the following URL:

http://Data Acquisition Address

# II. Send Events

After the SDK is initialized, you can call track to upload events. In general, you may need to upload a dozen to hundreds of different events. If you are using the TA background for the first time, we recommend uploading a few key events first.

If you are unsure which events you need to send, see the Quick Use Guide for more information.

# 2.1 Send Events

You can call track to upload events. We recommend setting the event properties and sending conditions according to the event design document you prepared earlier. Here, user payment is used as an example:

distinct_id = "ABCDEF123456"
account_id = "TA10001"
properties = {
    "#time": datetime.datetime.now(),
    # Set the time the event occurred; if omitted, the current time is used
    "#ip": "192.168.1.1",
    # Set the user's IP; TA automatically parses province and city from it
    # "#uuid": uuid.uuid1(),  # optional: only needed when the enable_uuid switch is on
    "Product_Name": "Product Name",
    "Price": 30,
    "OrderId": "Order ID abc_123"
}

# Upload event, including account ID and visitor ID
try:
   ta.track(distinct_id,account_id,"Payment",properties)
   # You can also upload only the visitor ID
   # ta.track(distinct_id = distinct_id, event_name = "Payment", properties = properties)
   # Or just upload account ID
   # ta.track(account_id = account_id, event_name = "Payment", properties = properties)
except Exception as e:
    # handle exception
    print(e)

**Note**: To ensure smooth binding of visitor IDs and account IDs, if your game uses both a visitor ID and an account ID, we strongly recommend uploading both IDs at the same time.

Otherwise, accounts may fail to match, causing a user to be counted more than once. For the specific ID binding rules, please refer to the chapter on User Identification Rules.

  • The event name can only start with a letter and can contain digits, letters, and the underscore "_". It is at most 50 characters long and case-insensitive.
  • The event properties are a dict object, where each element represents one property.
  • The Key is the property name, of type str. It can only start with a letter, can contain digits, letters, and the underscore "_", is at most 50 characters long, and is case-insensitive.
  • Values support str, int, float, bool, datetime.datetime, datetime.date, and list; a sketch of all supported types follows below
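
A minimal sketch of the supported value types, using hypothetical property names:

import datetime

# Hypothetical property names; each value illustrates one supported type
properties = {
    "channel": "AppStore",                     # str
    "level": 10,                               # int
    "amount": 9.99,                            # float
    "is_first_purchase": True,                 # bool
    "register_time": datetime.datetime.now(),  # datetime.datetime
    "birthday": datetime.date(2000, 1, 1),     # datetime.date
    "owned_items": ["sword", "shield"]         # list
}
ta.track(distinct_id, account_id, "Payment", properties)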

# 2.2 Set Public Event Properties

For properties that need to appear in all events, you can call set_super_properties to set public event properties. We recommend setting public event properties before sending events.

# Set Common Event Properties
super_properties = {
    "server_version":"1.2.3",
    "server_name":"A1001"
}
ta.set_super_properties(super_properties)

distinct_id = "ABCDEF123456"
account_id = "TA10001"
properties = {
    "Product_Name":"Product A",
    "Price":60
}

# Upload an event with the attributes of a common event and the event itself
try:
    ta.track(distinct_id,account_id,"Payment",properties)
except Exception as e:
    #handle exception
    print(e)
'''
Equivalent to performing the following operations
properties = {
    "server_version":"1.2.3",
    "server_name":"A1001",
    "Product_Name":"Product name A",
    "Price":60

try:
    ta.track(distinct_id,account_id,"Payment",properties)
except Exception as e:
    #handle exception
    print(e)
'''
  • The public event properties are also a dict object, where each element represents one property.
  • The Key is the property name, of type str. It can only start with a letter, can contain digits, letters, and the underscore "_", is at most 50 characters long, and is case-insensitive.
  • Values support str, int, float, bool, datetime.datetime, datetime.date, and list

If you call set_super_properties to set a public event property that was already set, the previous value is overwritten. If a public event property and a property of a track event share the same Key, the event's property overrides the public event property:

super_properties = {
    "server_version":"1.2.3",
    "server_name":"A1001"
}
ta.set_super_properties(super_properties)

super_properties = {"server_name":"B9999"}

ta.set_super_properties(super_properties)
# At this point the value of "server_name" becomes "B9999"

distinct_id = "ABCDEF123456"
account_id = "TA10001"
properties = {
    # cover "server_version"
    "server_version":"1.3.4",
    "Product_Name":"Product name A",
    "Price":60
}

# Upload the event when the value of "server_version" is "1.3.4" and the value of "server_name" is "B9999"
try:
    ta.track(distinct_id,account_id,"Payment",properties)
except Exception as e:
    #handle exception
    print(e)

If you want to clear all public event properties, you can call clear_super_properties:
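
A minimal example:

# Remove all previously set public event properties
ta.clear_super_properties()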

# III. User Attributes

The user property interfaces currently supported by the TA platform are user_set, user_setOnce, user_add, user_unset, user_append, and user_del.

# 3.1 user_set

For general user properties, you can call user_set to set them. Properties uploaded through this interface overwrite the original values. If a user property did not exist before, it is created, with the same type as the value passed in:

properties = {"user_name":"ABC"}
# Upload user property and the value "user_name" is "ABC"
try:
    ta.user_set(account_id = account_id, distinct_id = distinct_id, properties = properties)
    properties = {"user_name":"XYZ"}
    # Upload the user property again, and the value of "user_name" will be overwritten with "XYZ"
    ta.user_set(account_id = account_id, distinct_id = distinct_id, properties = properties)
except Exception as e:
    #handle exception
    print(e)
  • The user properties set by user_set are a dict object, where each element represents one property.
  • The Key is the property name, of type str. It can only start with a letter, can contain digits, letters, and the underscore "_", is at most 50 characters long, and is case-insensitive.
  • Values support str, int, float, bool, datetime.datetime, datetime.date, and list

# 3.2 user_setOnce

If a user property only needs to be set once, you can call user_setOnce. If the property already has a value, the new value is ignored:

properties = {"user_name":"ABC"}
# Upload user property and the value "user_name" is "ABC"
try:
    ta.user_setOnce(account_id = account_id, distinct_id = distinct_id, properties = properties)
except Exception as e:
    #handle exception
    print(e)
properties = {
    "user_name":"XYZ",
    "user_age":18
    }
# Upload the user properties again. "user_name" already has a value, so it remains "ABC"; "user_age" is set to 18
try:
    ta.user_setOnce(account_id = account_id, distinct_id = distinct_id, properties = properties)
except Exception as e:
    #handle exception
    print(e)

The user property types and restrictions of user_setOnce are consistent with those of user_set.

# 3.3 user_add

When you want to upload a numeric property, you can call user_add to accumulate it. If the property has not been set, a value of 0 is assigned before the calculation. Negative values can be passed in, which is equivalent to subtraction (see the sketch at the end of this section).

properties = {
    "total_revenue":30,
    "vip_level":1
}
# Upload user attributes with total_revenue value of 30 and vip_level value of 1
ta.user_add(account_id = account_id, distinct_id = distinct_id, properties = properties)

properties = {"total_revenue":60}
# Upload user attributes with total_revenue of 90 and vip_level of 1
try:
    ta.user_add(account_id = account_id, distinct_id = distinct_id, properties = properties)
except Exception as e:
    #handle exception
    print(e)

The user property types and restrictions of user_add are the same as those of user_set, but it is only valid for numeric user properties.
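
For example, since negative values act as subtraction, a refund could be recorded as follows (a sketch reusing the IDs above):

# Decrease total_revenue by 10 (negative values are equivalent to subtraction)
try:
    ta.user_add(account_id = account_id, distinct_id = distinct_id, properties = {"total_revenue": -10})
except Exception as e:
    # handle exception
    print(e)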

# 3.4 user_unset

When you want to clear a user's property values, you can call user_unset to clear the specified properties. If a property has not been created in the cluster, user_unset will not create it.

try:
    ta.user_unset(account_id, distinct_id, ["string1", "lasttime"])
except Exception as e:
    #handle exception
    print(e)

# 3.5 user_append

When you want to append values to a list-type user property, you can call user_append to append to the specified properties. If a property has not been created in the cluster, user_append will create it:

list1 = []
list1.append('Google')
list1.append('Runoob')
properties = {'arrkey1': list1, 'arrkey2': ['11', '22']}  # append to the array-type properties arrkey1 and arrkey2
try:
    ta.user_append(account_id=account_id, distinct_id=distinct_id, properties=properties)
except Exception as e:
    #handle exception
    print(e)

# 3.6 user_del

If you want to delete a user, you can call user_del. After deletion you can no longer query the user's properties, but events generated by the user can still be queried.

try:
    ta.user_del(account_id = account_id, distinct_id = distinct_id)
except Exception as e:
    #handle exception
    print(e)

# IV. Other Operations

# 4.1 Submit Data Immediately

ta.flush()

Submit data to the appropriate receiver immediately.

# 4.2 Close the SDK

ta.close()

Closes and exits the SDK. Please call this interface before shutting down your server to avoid losing cached data.
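
A minimal end-to-end sketch, assuming a LoggingConsumer writing to a hypothetical ./log directory:

from tgasdk.sdk import TGAnalytics, LoggingConsumer

ta = TGAnalytics(LoggingConsumer("./log"))  # "./log" is a hypothetical folder
try:
    ta.track(distinct_id = "ABCDEF123456", event_name = "Payment", properties = {"Price": 30})
    ta.flush()  # submit cached data to the receiver immediately
finally:
    ta.close()  # close before the process exits to avoid losing cached data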

# V. Relevant Preset Attributes

# 5.1 Preset Properties for All Events

The following preset properties are included in all events in the Python SDK, including automatic collection events.

| Attribute name | Display name | Description |
| --- | --- | --- |
| #ip | IP address | The user's IP address; must be set manually. TA uses it to derive the user's geographic location |
| #country | Country | The user's country, derived from the IP address |
| #country_code | Country code | The country code of the user's country (ISO 3166-1 alpha-2, i.e. two uppercase letters), derived from the IP address |
| #province | Province | The user's province, derived from the IP address |
| #city | City | The user's city, derived from the IP address |
| #lib | SDK type | The type of SDK you use, e.g. Python |
| #lib_version | SDK version | The version of the Python SDK you use |

# VI. Advanced Functions

Starting from v1.4.0, the SDK supports reporting two special types of events: updatable events and rewritable events. Both must be used with TA system v2.8 or later.

Since special events are only applicable in certain scenarios, please report data with special events under the guidance of ThinkingData's customer success team and analysts.

# 6.1 Updatable Events

Updatable events let you modify event data in specific scenarios. An updatable event must specify an event ID that identifies it, passed in when the updatable event is reported. The TA background determines which data to update based on the event name and event ID.

# Example: report an updatable event, assuming the event name is UPDATABLE_EVENT
distinct_id="65478cc0-275a-4aeb-9e6b-861155b5aca7"
account_id = "123"
event_name = "UPDATABLE_EVENT"
event_id = "123"
properties = {
    "price": 100,
    "status": 3
}
# Event attributes after reporting are status 3 and price 100
ta.track_update(distinct_id=distinct_id, account_id=account_id, event_name=event_name, event_id=event_id, properties=properties)

# After reporting with the same event name + event ID, the event property status is updated to 5 and price remains unchanged
_new_properties = {
    "status": 5
}
ta.track_update(distinct_id=distinct_id, account_id=account_id, event_name=event_name, event_id=event_id, properties=_new_properties)

# 6.2 Rewritable Events

Rewritable events are similar to updatable events, except that rewritable events will completely cover historical data with the latest data, which is equivalent to deleting the previous data and storing the latest data in effect. The TA background will determine the data that needs to be updated based on the event name and event ID.

# Example: report an overwritable event, assuming the event name is OVERWRITE_EVENT
distinct_id="65478cc0-275a-4aeb-9e6b-861155b5aca7"
account_id = "123"
event_name = "OVERWRITE_EVENT"
event_id = "123"
properties = {
    "price": 100,
    "status": 3
}
# Event attributes after reporting are status 3 and price 100
ta.track_overwrite(distinct_id=distinct_id, account_id=account_id, event_name=event_name, event_id=event_id, properties=properties)

# After reporting, the event property status is updated to 5 and the price property is deleted
_new_properties = {
    "status": 5
}
ta.track_overwrite(distinct_id=distinct_id, account_id=account_id, event_name=event_name, event_id=event_id, properties=_new_properties)

# ChangeLog

# v1.7.0 (2021/12/29)

  • Added support for complex structure types

# v1.6.2 (2021/08/05)

  • Fix the problem that the flush method does not refresh the file date

# v1.6.1 (2021/06/04)

  • Changed LoggingConsumer's default file-write condition to write once every 5 records
  • Changed BatchConsumer's default maximum buffer limit to 50 batches

# v1.6.0 (2021/03/22)

  • Added a retry mechanism for network failures
  • Added a buffer area: data that still fails after 3 retries is stored in the buffer; when cached data exceeds the upper limit, the earliest data is discarded
  • Support #app_id attribute

# v1.5.0 (2020/11/21)

  • LoggingConsumer supports automatic directory creation

# v1.4.0 (2020/08/27)

  • Added track_update interface to support updatable events
  • Added track_overwrite interface to support rewritable events

# v1.3.3 (2020/08/12)

  • Fix the problem that AsyncBatchConsumer threads do not release under special circumstances

# v1.3.2 (2020/08/03)

  • Support LoggingConsumer configuration file prefix name
  • Added UUID configuration switch

# v1.3.1 (2020/02/26)

  • DebugConsumer error log printing compatible with python2

# v1.3.0 (2020/02/11)

  • Added support for the list data type
  • Added user_append interface to support appending to users' array-type properties
  • Added user_unset interface to clear user properties
  • BatchConsumer performance optimization: support to choose whether to compress; remove Base64 encoding
  • DebugConsumer optimization: more complete and accurate server-side data validation

# v1.2.0 (2019/09/25)

  • Supporting DebugConsumer
  • Support log file segmentation by hour
  • BatchConsumer supports multi-threading
  • Fix the problem that AsyncBatchConsumer batch sending does not take effect
  • Optimizing data reporting return code processing logic
  • LoggingConsumer: No limit on single log file size by default