
# LogBus Windows User Guide

This section describes how to use the Windows version of the data transmission tool LogBus:

Before integration, please read the data rules first; continue with this guide once you are familiar with TA's data format and data rules.

LogBus must upload data in TA's data format.

# Download LogBus Windows

Latest version: 1.3.0

Update time: 2021-10-19

Download

# I. Introduction to LogBus

LogBus is mainly used to import back-end log data into the TA backend in real time. Its core working principle is similar to Flume: it monitors the log files in a specified directory on the server, and when any log file in that directory receives new data, it validates the new data and sends it to the TA backend in real time.

We recommend the following types of users access data by using LogBus:

  1. Users who use a server-side SDK and upload data through LogBus
  2. Users who have high requirements for data accuracy and dimensions, whose needs cannot be met by the client SDK alone, or for whom integrating the client SDK is inconvenient
  3. Users who do not want to develop their own back-end data push process
  4. Users who need to transmit large amounts of historical data

# II. Data preparation before use

  1. First, use ETL to convert the data to be transmitted into TA's data format (see the sample line after this list), and write it to local files or transmit it to a Kafka cluster. If the data is already written by a server-side SDK (such as the Java SDK) using the local file or Kafka consumer, it is already correctly formatted and needs no further conversion.

  2. Determine the directory where the uploaded data files will be stored, or the Kafka address and topic, and configure LogBus accordingly. LogBus will then monitor file changes in that directory (picking up new files and tailing existing ones) or subscribe to the data in Kafka.

  3. Do not rename the uploaded data logs stored in the monitored directory. Renaming a log is equivalent to creating a new file, so LogBus may re-upload it, resulting in duplicate data.

  4. Since the LogBus data transmission component contains a data buffer, the LogBus directory may occupy a certain amount of disk space. Make sure the node where LogBus is installed has sufficient disk space, and reserve at least 10 GB of storage for each project (that is, each APP_ID added) that data is transmitted to.
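
For reference, each event in TA's data format is a single JSON object on its own line. The line below is only an illustrative sketch with placeholder values (the property names and values are hypothetical); please consult the data rules documentation for the authoritative field definitions:

{"#account_id":"demo_account","#distinct_id":"demo_device_id","#type":"track","#event_name":"payment","#time":"2021-10-19 12:00:00.000","properties":{"amount":99,"currency":"USD"}}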

# III. Installation and update of LogBus

# 3.1 Install LogBus

  1. Download and decompress the LogBus package.

  2. The decompressed directory structure is as follows:

  • bin: startup script folder
  • conf: configuration file folder
  • lib: library folder

# IV. Parameter setting of LogBus

  1. Enter the decompressed conf directory, which contains the configuration file template logBus.conf.Template. This file contains all configuration parameters of LogBus; on first use, rename it to logBus.conf.

  2. Open the logBus.conf file and set the relevant parameters.
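
For example, assuming LogBus was decompressed to C:\logbus (a hypothetical path), the configuration file can be created from the template with a standard Windows copy command:

copy C:\logbus\conf\logBus.conf.Template C:\logbus\conf\logBus.conf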

# 4.1 Project and data source setting (required)

  • Project APP_ID

The same APP_ID cannot be configured repeatedly.

##APPID comes from the project token on the TA official website. Get the APPID of your project from the project configuration page of the TA backend and fill it in here; separate multiple APPIDs with ","

APPID=APPID_1,APPID_2
  • Monitored data source configuration (choose one of the following, required)

# 4.1.1. When the data source is a local file

## Path and file name of the data files read by LogBus (the file name supports fuzzy matching); LogBus must have read permission
## Different APPIDs are separated by ",", and different directories under the same APPID are separated by spaces
## The file name in TAIL_FILE supports wildcard matching
TAIL_FILE=C:/path1/dir*/log.*,C:/path3/txt.*

TAIL_FILE supports monitoring multiple files in multiple sub-directories under multiple paths.

Corresponding parameter setting:

APPID=APPID1,APPID2

TAIL_FILE=C:/root/log_dir1/dir_*/log.* C:/root/log_dir*/log*/log.*,C:/test_log/*

Specific rules are as follows:

  • Multiple monitored paths under the same APP_ID are separated by spaces
  • Monitored paths under different APP_IDs are separated by commas ","; after splitting by comma, the monitored paths correspond one-to-one with the APP_IDs
  • Directories in the monitored path can be matched with wildcards
  • File names can be matched with wildcards
  • For path separators, use "/" or "\\"; do not use a single "\", for example, C:/root/*.log or C:\\root\\*.log

Do not store the log files that need to be monitored in the root directory of the server.


# 4.1.2. When the data source is kafka

When KAFKA_TOPICS needs to monitor multiple topics, separate the topics with spaces; if there are multiple APP_IDs, separate the topic groups of different APP_IDs with ",". KAFKA_GROUPID must be unique. KAFKA_OFFSET_RESET sets Kafka's kafka.consumer.auto.offset.reset parameter; the possible values are earliest and latest, and the default is earliest.

Note: The Kafka version of the data source must be 0.10.1.0 or higher

Example of single APP_ID:

APPID=appid1

######kafka configuration
#KAFKA_GROUPID=tga.group
#KAFKA_SERVERS=localhost:9092
#KAFKA_TOPICS=topic1 topic2
#KAFKA_OFFSET_RESET=earliest

Example of multiple APP_IDs:

APPID=appid1,appid2

######kafka configuration
#KAFKA_GROUPID=tga.group
#KAFKA_SERVERS=localhost:9092
#KAFKA_TOPICS=topic1 topic2,topic3 topic4
#KAFKA_OFFSET_RESET=earliest

# 4.2 Transmission parameter setting (required)

##Transmission setting
##Transmission url

##http transmission
PUSH_URL=https://global-receiver-ta.thinkingdata.cn/logbus
##If you use a private deployment, please modify the transmission URL to http://${data collection address}/logbus

##Maximum amount per transmission
#BATCH=10000
##Minimum transmission time interval (unit: second)
#INTERVAL_SECONDS=600
##Number of transmission threads; the default single thread is recommended under poor network conditions. Multiple threads consume more memory and CPU resources
#NUMTHREAD=1

##Compression format for file transmission: gzip, snappy, none
#COMPRESS_FORMAT=none
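
As an illustrative sketch only (the receiver address is a placeholder and the uncommented values are examples, not recommendations), a transmission block for a private deployment might look like this:

PUSH_URL=http://${data collection address}/logbus
BATCH=10000
INTERVAL_SECONDS=60
NUMTHREAD=1
COMPRESS_FORMAT=gzip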

# 4.3 Converter configuration (optional)

##Converter type; currently supported types are json, csv, regex, and splitter
#PARSE_TYPE=json

##Additional fixed attributes, format: name value, name1 value1
#LABELS=

##Property names and types, used when PARSE_TYPE is csv, regex, or splitter; format: name type, name1 type1
##Supported types: float int string date list bool
#SCHEMA=

##Specify the delimiter; must not be empty when PARSE_TYPE is csv or splitter
#SPLITTER=

##Specify the delimiter for list-type values, used when a list type exists; the default is ","
#LIST_SPLITTER=,

##Regular expression; must not be empty when PARSE_TYPE is regex
#FORMAT_REGEX=
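
As a sketch of how these parameters work together (the raw line, field names, and delimiter below are hypothetical), a line such as tom|25|true could be parsed with a splitter-type converter configured as follows:

PARSE_TYPE=splitter
SPLITTER=|
SCHEMA=name string,age int,is_vip bool
LABELS=channel demo_channel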

# 4.4 Monitor file deletion configuration (optional)

# Delete uploaded files from the monitored directory; uncomment the following to enable the "delete file" function
# Deletion granularity: only by day or hour
# UNIT_REMOVE=hour
# Delete uploaded files older than this offset (in UNIT_REMOVE units)
# OFFSET_REMOVE=20
# How frequently to delete the monitored files that have already been uploaded
# FREQUENCY_REMOVE=60
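
For example, based only on the parameters above (the values are illustrative), a configuration that deletes uploaded files older than 2 days might look like this:

UNIT_REMOVE=day
OFFSET_REMOVE=2
FREQUENCY_REMOVE=60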

# 4.5 Configuration file example

##################################################################################
##    ThinkingData analytics platform data transmission tool logBus configuration file
##Uncommented parameters are required; commented parameters are optional and can be configured as appropriate
##Environment requirement: Java 8; please refer to the TA official website for more detailed requirements
##http://doc.thinkinggame.cn/tdamanual/installation/logbus_installation.html
##################################################################################

##APPID comes from the project token on the TA official website
##Different APPIDs are separated by "," and cannot be configured repeatedly
APPID=from_tga1,from_tga2

#-----------------------------------source----------------------------------------

######file-source
##Path and file name of the data files read by LogBus (the file name supports fuzzy matching); LogBus must have read permission
##Different APPIDs are separated by ",", and different directories under the same APPID are separated by spaces
##The file name in TAIL_FILE supports Java regular expressions
TAIL_FILE=C:/path1/log.* C:/path2/txt.*,C:/path3/log.* C:/path4/log.* C:/path5/txt.*

######kafka-source
#KAFKA_GROUPID=tga.flume
#KAFKA_SERVERS=
#KAFKA_TOPICS=
#KAFKA_OFFSET_RESET=earliest

#------------------------------------sink-----------------------------------------
##Transmission setting
##Transmission url
##If you use a private deployment, please modify the transmission URL to http://${data collection address}/logbus
##PUSH_URL=https://global-receiver-ta.thinkingdata.cn/logbus
PUSH_URL=http://${data collection address}/logbus

##Maximum amount per transmission
#BATCH=10000

##Minimum transmission time interval (unit: second)
#INTERVAL_SECONDS=60

##### http transmission
##Compression format for file transmission:gzip,snappy,none
#COMPRESS_FORMAT=none

##Whether to add a uuid property to each piece of data
#IS_ADD_UUID=true

#------------------------------------parse----------------------------------------
##Converter type; currently supported types are json, csv, regex, and splitter
#PARSE_TYPE=json

##Additional fixed attributes, format: name value, name1 value1
#LABELS=

##Property names and types, used when PARSE_TYPE is csv, regex, or splitter; format: name type, name1 type1
##Supported types: float int string date list bool
#SCHEMA=

##Specify the delimiter; must not be empty when PARSE_TYPE is csv or splitter
#SPLITTER=

##Specify the delimiter for list-type values, used when a list type exists; the default is ","
#LIST_SPLITTER=,

##Regular expression; must not be empty when PARSE_TYPE is regex
#FORMAT_REGEX=

#------------------------------------other----------------------------------------
##Delete uploaded files from the monitored directory; uncommenting the following two fields enables the file deletion function, and the deletion program runs every hour
##Files older than the offset (in UNIT_REMOVE units) are deleted
##Offset of the files to delete
#OFFSET_REMOVE=
##Delete by day or hour
#UNIT_REMOVE=

# V. Start LogBus

Please check the following before first start:

  1. Check the Java version

Enter the bin directory, which contains two scripts: check_java.bat and logbus.bat

Among them, check_java.bat is used to test whether the Java version meets the requirements. Run the script; if the Java version does not meet the requirements, you will see prompts such as Java version is less than 1.8 or Can't find java, please install jre first.

You can update the JDK version, or see the next section to install a separate JDK for LogBus.

  2. Install an independent JDK for LogBus

If the JDK version on the node where LogBus is deployed does not meet LogBus's requirements, and it cannot be replaced with a compliant JDK because of environment constraints, you can enable this feature.

Enter the bin directory, which contains install_logbus_jdk.bat.

After running this script, a java directory is added to the LogBus working directory, and LogBus will use the JDK environment in this directory by default.

  3. Configure logBus.conf and run the parameter check command

For the configuration of logBus.conf, please refer to the "Parameter setting of LogBus" section above.

After configuration, run the env command to check whether the configuration parameters are correct

logbus.bat env

If a red exception message is output, there is a problem with the configuration; modify it until the check reports that the configuration file has no exceptions.

After you modify the configuration of logBus.conf, you need to restart LogBus to put the new configuration into effect

  4. Start LogBus
logbus.bat start

After LogBus starts, open logkit.exe; otherwise, data may be uploaded repeatedly.
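
Putting the above steps together, a typical first startup from the bin directory looks like the following sequence (all commands are described in the sections above and below):

check_java.bat
logbus.bat env
logbus.bat start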

# VI. Details of LogBus command

# 6.1 Help information

If logbus.bat is run without a parameter, or with --help or -h, the help information will be shown.

The main LogBus commands are described below:

usage: logbus <command|auxiliary command> [option]
Command:
        start                                                        Start logBus.
        restart                                                      Restart logBus.
        stop                                                         Stop logBus.
        reset                                                        Reset logBus read records.
        stop_atOnce                                                  Stop logBus immediately.
Auxiliary command:
        env                                                          Verify runtime environment
        server [-url <url>|-url <url> -appid <appid>]                Test the receiver network
        show_conf                                                    Show current logBus configuration information.
        version                                                      Show the version number.
        update                                                       Update logbus to the latest version.

Options:
 -appid <appid>   Project appid
 -h,--help        Show the help information and exit.
 -path <path>     Specify the absolute path of the test file
 -url <url>       Specify the test url
Example:
   logbus.bat start                                                    Start logBus.
   logbus.bat stop                                                     Stop logBus.
   logbus.bat restart                                                  Restart logBus.
   logbus.bat server -url http://${receiver address}/logbus -appid *****      Test the receiver network

# 6.2 Transmission channel check server -url

After verifying the data format, you should check whether the data channel is open; you can do this with the server -url command. You can also pass the APP_ID obtained from the TA platform when running the check. Note that the APP_ID is bound to your project, so make sure the APP_ID you enter corresponds to your project.

 logbus.bat server -url http://${receiver address}/logbus -appid ${appid}

# 6.3 Show configuration information show_conf

You can view the LogBus configuration information by issuing the show_conf command:

logbus.bat show_conf

# 6.4 Startup environment check env

You can check the startup environment by using the env command. If any line of the output is followed by an asterisk, the corresponding configuration is faulty and needs to be modified until no asterisks remain.

logbus.bat env

# 6.5 Update LogBus version update

You can update the version online by using the update command; after that, LogBus will be updated to the latest version.

logbus.bat update

# 6.6 Start start

After the format verification, data channel check, and environment check have passed, you can start LogBus to upload data. LogBus will automatically detect whether new data has been written to your files and, if there is new data, upload it.

logbus.bat start

# 6.7 Stop stop

If you want to stop LogBus, please issue the stop command. This command may take some time to complete, but no data will be lost.

logbus.bat stop

# 6.8 Stop stop_atOnce

If you want to stop LogBus immediately, please issue the stop_atOnce command; note that it may cause data loss.

logbus.bat stop_atOnce

# 6.9 Restart restart

You can restart LogBus by issuing the restart command to put a new configuration into effect after modifying the configuration parameters.

logbus.bat restart

# 6.10 Reset reset

reset will reset LogBus, so please issue this command with caution. Once issued, the file transfer records are cleared and LogBus will re-upload all data. Issuing this command without fully understanding its effects may cause your data to be duplicated, so we recommend issuing it only after communicating with TA staff.

logbus.bat reset

After issuing the reset command, you need to execute start to resume data transmission.

# 6.11 View version number version

If you want to know your LogBus version, you can use the version command. If your LogBus does not have this command, you are running an earlier version.

logbus.bat version

# VII. ChangeLog

# Version 1.3.0 --- 2021/10/19

  • Support non-TA data upload

# Version 1.2.0 --- 2021/05/26

  • Support Cygwin

# Version 1.1.0 --- 2020/08/28

  • Add #UUID
  • Support #event_id and #first_check_id
  • Support multi-thread sending
  • Support splitter parsing and regular parsing

# Version 1.0.0 --- 2020/06/25

  • LogBus-Windows release