Spider configuration file reference

Source: Internet
Author: User


Spider has a single configuration file, spider.xml, in XML format. spider.xml is governed by a DTD and controls all of spider's features, routing, and high-availability settings.

The location of the configuration file can be specified in three ways:

1. Environment variable. The SPIDER_CONFIG environment variable specifies the location of the spider startup file.

2. Java system property. The Java system property spider.config specifies the location of the spider startup file.

3. Classpath. The configuration file must be stored in a classpath*: directory. When the spider middleware starts, it searches the classpath and uses the first spider.xml file it finds as the configuration file. Because the configuration file usually needs to be modified, it is generally not packaged inside a jar.

The three sources are checked in priority order. The SPIDER_CONFIG environment variable is checked first; if it is set, the startup file it specifies is used. If it is empty, the Java system property is checked. If that is also unset, the file is loaded from the default classpath*: directory. If none of the three locations yields a configuration file, startup fails.

The spider configuration file structure is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<spider>
    <nodeName value="client" cloud="false" role="production"
        serviceCenter="localhost:7070" appVersion="" charset="UTF-8" dev="true" detectInterval="60000"/>
    <plugins>
        <plugin pluginId="spider.localService" serviceTimeout="60000"
            zlibCompress="false" encrypt="false" anonymous="true"
            serviceProxyPackage="com.ld.net.spider.demo.parallel;com.ld.net.spider.manage.api;com.ld.net.spider.demo.broadcast;com.ld.net.spider.demo.bs.wx;com.ld.net.spider.demo.pl">
            <server enable="false" port="17070" reliable="false"
                threadCount="200" serviceExportPackage="" />
        </plugin>
        <plugin pluginId="spider.channel">
            <cluster clusterName="ANB" connectionSize="10">
                <workNode address="localhost" port="18051" />
            </cluster>
        </plugin>
        <plugin pluginId="spider.filter">
        </plugin>
    </plugins>
    <routeItems consistent="true">
        <routeItem serviceId="*" clusterName="ANB" />
        <!-- <routeItem serviceId="*" appVersion="" subSystemId=""
            systemId="" companyId="" clusterName="spider-server" /> -->
    </routeItems>
</spider>
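The sample above describes a client node: its server is disabled and every request is routed to the cluster ANB. As a rough sketch of the other side, a server node that such a client could reach might look like the following. It is assembled only from the attribute descriptions in this reference and is not taken from the spider distribution, so the concrete values are assumptions: the nodeName value is chosen to equal the client's clusterName, the server port is chosen to equal the client's workNode port, and the exported package is simply one of the packages the client lists in serviceProxyPackage.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical server-side node; values are illustrative assumptions. -->
<spider>
    <!-- value equals the clusterName ("ANB") used by the client sample -->
    <nodeName value="ANB" cloud="false" serviceCenter="localhost:7070"
        charset="UTF-8" dev="false"/>
    <plugins>
        <plugin pluginId="spider.localService" anonymous="true">
            <!-- port equals the workNode port in the client sample;
                 the exported package is one the client proxies -->
            <server enable="true" port="18051"
                serviceExportPackage="com.ld.net.spider.demo.pl" />
        </plugin>
        <!-- no downstream clusters are defined on this node -->
        <plugin pluginId="spider.channel">
        </plugin>
        <plugin pluginId="spider.filter">
        </plugin>
    </plugins>
    <routeItems>
        <!-- handle every function with the local processing plug-in -->
        <routeItem serviceId="*" clusterName="spider.localService" />
    </routeItems>
</spider>
```

Whether the DTD accepts an empty spider.channel plug-in or a routeItems element without the consistent attribute is not confirmed here; both are inferred from those items being described as optional.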

 

All element and attribute names in the configuration file are case-sensitive and use camelCase. Full English words are used wherever possible.

The convention for choosing between an element and an attribute is as follows: if the item is a characteristic, it is expressed as an attribute; if the item is an entity in its own right, it is expressed as an element.
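In the sample configuration, for example, a cluster and its work nodes can be read as entities, so they appear as elements, while the name, connection size, address, and port are characteristics of those entities, so they appear as attributes. The fragment below is excerpted from that sample; only the comments are added.

```xml
<!-- cluster and workNode are entities, so they are elements -->
<cluster clusterName="ANB" connectionSize="10">
    <!-- address and port are characteristics of a workNode, so they are attributes -->
    <workNode address="localhost" port="18051" />
</cluster>
```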

 

Elements and attributes of each node are described as follows (implemented features are marked in green):

Element | Attribute (-- indicates the element itself) | Optional | Default value, meaning, and value range

Element: spider | Attribute: -- | Optional: No
Root element of the spider configuration file.

Element: nodeName | Attribute: -- | Optional: No
Basic information about the spider node.

Element: nodeName | Attribute: value | Optional: No
Spider node name; any string. Spider nodes with the same name automatically form a cluster. Used in cloud mode.

Element: nodeName | Attribute: dev | Optional: Yes
Running mode, which controls the log output level. true: development mode, all log information is output. false: production mode, debug-level logs are automatically disabled. Default: false.

Element: nodeName | Attribute: cloud | Optional: No
Running environment of the spider node. true: runs in service center mode, automatically receiving changes to downstream nodes pushed from the service center; suitable for large-scale deployments. false: runs in standalone management mode, in which node changes are managed through RESTful APIs.

Element: nodeName | Attribute: role | Optional: Yes
Role of the spider node. prod/nb/np: runs as a production server. SC: runs as the service center. Default: production server; any value other than SC is treated as a production server. The parallel processing plug-in takes effect only when the node is configured as np. For details, see the section "Parallel Execution plug-in".

Element: nodeName | Attribute: serviceCenter | Optional: No
Address of the service center, in ip:port format.

Element: nodeName | Attribute: appVersion | Optional: Yes
Version of the application service provided by this spider node, used for grayscale upgrades. Any string with a maximum length of 8 characters; the format xx.xx.xx is recommended. Default: "", meaning no specific version. For details, see the grayscale upgrade section.

Element: nodeName | Attribute: charset | Optional: No
Global character encoding, UTF-8 or GBK. Use a single encoding across the whole environment, either all UTF-8 or all GBK; mixing the two is error-prone.

Element: nodeName | Attribute: slowLongTime | Optional: Yes
Slow-request threshold. Requests whose execution time exceeds this value are automatically written to the local slow log. Default: 3000 milliseconds.

Element: nodeName | Attribute: dumpStat | Optional: Yes
Whether to periodically dump service performance metrics to local storage. true: metrics are dumped automatically every 5 minutes. false: no dump. This parameter is independent of the cloud parameter. Default: true.

Element: nodeName | Attribute: tcpdump | Optional: Yes
Whether to enable dynamic packet capture. true: enabled. false: disabled. Default: false. If multiple clients are configured to capture packets with the same function number, all of those clients receive the result. This feature seriously degrades performance and poses serious security risks, so enable it in a production environment only with caution.

Element: nodeName | Attribute: detectInterval | Optional: Yes
Heartbeat interval. Default: 60000 milliseconds.

 

     

Element: plugins | Attribute: -- | Optional: No
List of spider plug-ins. The current version has three plug-ins. The plug-in identifiers (pluginId) must not be modified; otherwise spider may fail to start properly.

Element: plugin (pluginId="spider.localService") | Attribute: -- | Optional: No
Spider plug-in information; different plug-ins have different attributes. This is the spider core engine plug-in, which sets the basic features of the spider core.
In the netty implementation, the TCP queue length is taken directly from /proc/sys/net/core/somaxconn and no API is provided to change it, so it must be changed at the OS level. Spider originally planned to support this setting; it will be removed later.

Element: plugin (pluginId="spider.localService") | Attribute: serviceTimeout | Optional: Yes
Service timeout, which can be overridden at the service level. Positive integer, in milliseconds. Default: 300 seconds (that is, 300000).

Element: plugin (pluginId="spider.localService") | Attribute: zlibCompress | Optional: Yes
Whether to enable global zlib compression of request packets. Allowed values: true or false. Default: false. It is recommended to leave it disabled within a LAN and to enable it outside a LAN.

Element: plugin (pluginId="spider.localService") | Attribute: encrypt | Optional: Yes
Whether to enable AES256 encryption of request packets. Allowed values: true or false. Default: false.

Element: plugin (pluginId="spider.localService") | Attribute: serviceProxyPackage | Optional: Yes
Package paths of the remote services that this node calls when acting as a spider client, separated by ; or ,. Once a path is set here and the server sets the corresponding path in its serviceExportPackage parameter, the client can call the services provided by the remote server through @Autowired injection.
For remote calls to work correctly, make sure this node does not also contain an implementation of the proxied interface. Otherwise, because Spring injects by type by default, a multiple-implementation exception occurs at startup; in that case you need to use the Qualifier or Resource annotation.

Element: plugin (pluginId="spider.localService") | Attribute: anonymous | Optional: Yes
Whether the server allows unauthenticated connections. true: allowed. false: not allowed. Default: true. For details, see the Security section. This parameter is set on the server side; the client acts passively according to the server's response message. When a node acts only as a client, this parameter has no effect and does not need to be set.

Element: server | Attribute: -- | Optional: No
Settings used when spider runs in server mode.

Element: server | Attribute: enable | Optional: No
Whether to enable the server. false: the server is not enabled. true: the server is enabled. If this parameter is set to true, port cannot be empty.

Element: server | Attribute: port | Optional: Yes
Port number used when acting as a server, 1025-63335.

Element: server | Attribute: threadCount | Optional: Yes
Number of service processing threads when acting as a server. The recommended value is 20 to 50 times the number of CPU cores. Default: 20 times the number of CPU cores.

Element: server | Attribute: serviceExportPackage | Optional: Yes
Package paths of the spider services that are published automatically when acting as a server, separated by ; or ,. Once the server sets a path here and the client sets the corresponding path in its serviceProxyPackage parameter, the client can call the services in this server package directly through @Autowired injection.

Element: plugin (pluginId="spider.channel") | Attribute: -- | Optional: No
Spider plug-in information; different plug-ins have different attributes. This is the channel plug-in. Each cluster under it represents one server cluster, made up of its workNode members.

Element: cluster | Attribute: -- | Optional: Yes
Defines a downstream server cluster.

Element: cluster | Attribute: clusterName | Optional: No
Name of the cluster. It must be the same as the nodeName defined by the downstream server. Every clusterName within a configuration file must be different.

Element: cluster | Attribute: reverseRegister | Optional: Yes
Whether the nodes in the cluster are reverse-registration servers.

Element: workNode | Attribute: -- | Optional: Yes
Defines a member node of the downstream server cluster. Within a cluster, each workNode must have a unique address + port combination.

Element: workNode | Attribute: address | Optional: No
IP address of the member node.

Element: workNode | Attribute: port | Optional: No
Port number of the member node. It corresponds to the port defined under plugin pluginId="spider.localService" -> server in the remote node's spider.xml.

Element: plugin (pluginId="spider.filter") | filter | Optional: Yes
Filter plug-in. Each filter represents one filter instance. For more information, see "1.3 pipeline plug-in".

Element: routeItems | Attribute: -- | Optional: No
Defines the route table. Routing configures how different service requests are forwarded to the corresponding spider server.
Route entries are parsed from top to bottom. When an earlier and a later route entry conflict, the earlier entry is used.

Element: routeItems | Attribute: consistent | Optional: Yes
Whether to enable the consistent routing policy. true: enabled. At run time, spider sorts the route table by function number > version number > organization number > subsystem number > system number (product system number), so the route destination is the same regardless of the order in which the entries are written. false: spider matches route entries in the order in which they are defined, so a different entry order may lead to a different routing result. Default: false. Enabling it is recommended.

Element: routeItem | Attribute: -- | Optional: No
Defines a route entry. Route entries can match on multiple dimensions to flexibly cover the business scenarios of enterprise systems. Currently the function number, version number, subsystem number, system number, and organization number can be combined. Their priority is: function number > version number > organization number > subsystem number > system number (product system number).

Of the five dimensions, the function number and the subsystem number are static attributes that must be fixed at compile time (generally, either of them is required). The version number, organization number, and system number (product system number) are runtime attributes that can be set at run time; they mainly serve multi-tenant and phased-upgrade scenarios.

The function number must be defined; to match all functions, use *. The function number is combined with each of the other dimensions in an AND relationship. The other dimensions are optional; a dimension that is not defined matches everything, that is, *.

Define at least one entry that points to the local processing plug-in. The simplest is <routeItem serviceId="*" clusterName="spider.localService"/>.

The order of multiple route entries affects the final routing result. For example:

<routeItem serviceId="11*" appVersion="1.0.2" clusterName="BSNP-C00001v2"/>

<routeItem serviceId="11*;21*" companyId="C00001" clusterName="BSNP-C00001"/>

With the entries above, a request for function 11xxxxxx from organization C00001 at version 1.0.2 that reaches this node is forwarded to BSNP-C00001v2; if the order of the two entries is reversed, it is forwarded to BSNP-C00001.

Element: routeItem | Attribute: serviceId | Optional: No
Function number that the route entry applies to, an 8-character ASCII string. Wildcards are supported: * matches all functions, and ? matches one visible character. Multiple function numbers can be separated by ; or ,.

Element: routeItem | Attribute: appVersion | Optional: Yes
Application version number that the route entry matches. Wildcards are not supported. Multiple application versions can be separated by commas.

Element: routeItem | Attribute: subSystemId | Optional: Yes
Subsystem numbers that the route entry matches. Wildcards are not supported. Multiple subsystem numbers can be separated by commas.

Element: routeItem | Attribute: systemId | Optional: Yes
System numbers that the route entry matches. Wildcards are not supported. Multiple system numbers can be separated by commas.

Element: routeItem | Attribute: companyId | Optional: Yes
Organization numbers that the route entry matches. Wildcards are not supported. Multiple organization numbers can be separated by commas.

Element: routeItem | Attribute: clusterName | Optional: No
Spider remote server cluster to which the functions in this entry are forwarded. Make sure each independent route entry has a different clusterName. If several routes share the same target, merge them into one entry by listing the values, with separators, in the corresponding attributes.
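To tie the route attributes together, here is a sketch of a channel plug-in and route table that reuses the two example entries from the routeItem description and adds a catch-all local entry. The work node addresses are made-up placeholders, and the fragment leaves out nodeName and the other plug-ins, so it is an illustration under those assumptions and not a complete spider.xml.

```xml
<spider>
    <!-- nodeName and the spider.localService / spider.filter plug-ins omitted -->
    <plugins>
        <plugin pluginId="spider.channel">
            <!-- each clusterName must equal the nodeName of the downstream servers -->
            <cluster clusterName="BSNP-C00001v2">
                <!-- placeholder addresses; address + port is unique within the cluster -->
                <workNode address="192.168.1.11" port="18051" />
                <workNode address="192.168.1.12" port="18051" />
            </cluster>
            <cluster clusterName="BSNP-C00001">
                <workNode address="192.168.1.21" port="18051" />
            </cluster>
        </plugin>
    </plugins>
    <routeItems consistent="true">
        <!-- functions starting with 11, only for application version 1.0.2 -->
        <routeItem serviceId="11*" appVersion="1.0.2" clusterName="BSNP-C00001v2" />
        <!-- functions starting with 11 or 21, only for organization C00001 -->
        <routeItem serviceId="11*;21*" companyId="C00001" clusterName="BSNP-C00001" />
        <!-- everything else is handled by the local processing plug-in -->
        <routeItem serviceId="*" clusterName="spider.localService" />
    </routeItems>
</spider>
```

With consistent="true", the table is described as being sorted by the function > version > organization > subsystem > system priority, so the written order of the first two entries should not change where a request ends up. With consistent="false", swapping them would change the result, as described for routeItem.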
