E-Commerce Order integration

Last Update: 2016-02-23 Source: Internet

Author: User

Overview:

Core Technical Requirements: No loss of orders, distributed order fetching

Choosing how to bring orders in: should the platform push them, or should we fetch them? At present, fetching on a timer is the more reasonable choice, because it lets us control the rate at which orders flow in and keeps the back-end system from being overwhelmed. We generally use time slicing: each task execution fetches the orders that fall within one time period. To ensure that no order is lost, adjacent slices overlap by a few seconds at their boundaries.
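The time slicing with overlapping boundaries can be sketched as follows. This is a minimal illustration, not the original system's code; the function name time_slices and the parameters interval_s and overlap_s are assumptions made for the example.

```python
from datetime import datetime, timedelta


def time_slices(start, end, interval_s, overlap_s):
    """Yield (slice_start, slice_end) pairs that together cover [start, end).

    Every slice after the first begins overlap_s seconds before the previous
    slice ended, so an order stamped right at a boundary is fetched twice
    rather than missed.
    """
    if overlap_s >= interval_s:
        raise ValueError("overlap must be shorter than the slice interval")
    interval = timedelta(seconds=interval_s)
    overlap = timedelta(seconds=overlap_s)
    cursor = start
    while cursor < end:
        # The first slice has nothing before it to overlap with.
        slice_start = cursor - overlap if cursor > start else cursor
        slice_end = min(cursor + interval, end)
        yield slice_start, slice_end
        cursor = slice_end


# Hypothetical usage: 25 minutes of orders, 10-minute slices, 5-second overlap.
begin = datetime(2016, 1, 1, 0, 0, 0)
slices = list(time_slices(begin, begin + timedelta(minutes=25), 600, 5))
```

Because of the overlap, the same order can be returned by two adjacent slices, so whatever consumes the fetched orders would presumably need to deduplicate them, for example by order ID.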

As order middleware, we often fetch orders from multiple platforms, so we can spread the fetching tasks across multiple servers to improve throughput. A platform with many orders can be assigned more than one server, while a single server can handle several platforms that have few orders. Dispatching tasks efficiently is the challenge. In particular, when multiple servers serve a single platform, each server must fetch the orders of a different time period in parallel, so that no order is fetched twice or missed. One approach is to dedicate a server to task scheduling, but that server becomes a single point of failure and a potential bottleneck: once it goes down, the entire fetching process stalls. A better practice is to have the servers negotiate through ZooKeeper to determine which time period each server takes.

Configuration Management:

ZooKeeper is currently used for the configuration management of order fetching. Its biggest advantage here is that configuration information is managed centrally, and when it changes, all nodes are notified automatically and stay consistent. Every configuration option supports two levels: a value shared by all platforms (platform number = default), or a platform-specific value that overrides the default.

ZooKeeper data structure for configuration management:

Path	ZooKeeper data	Node type
/tasks/cfg/starttime/[platform number]	Start time of the order fetching task for each platform (accurate to the second). For example, to fetch the most recent 3 months of orders, the start time is the current time minus 3 months.	Persistent
/tasks/cfg/interval/[platform number]	Time slice length (in seconds) of each platform's order fetching task	Persistent
/tasks/cfg/timeout/[platform number]	Timeout of each platform's order fetching task	
/tasks/cfg/retries/[platform number]	Number of retries allowed when a platform's order fetching task fails	
/tasks/cfg/retryintv/[platform number]	Interval between retries when a platform's order fetching task fails	
…/[platform number]	Number of concurrent threads on each node for each platform's order fetching task	
/schedulers/assignment/[platform number]/[server node number]	Empty string	Ephemeral
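
The two-level lookup (platform-specific value first, then the shared default) could look like the sketch below. A plain dict stands in for ZooKeeper here, and the function name resolve_config and the platform numbers 1001 and 2002 are hypothetical.

```python
def resolve_config(store, option, platform):
    """Return the platform-specific value of an option, else the shared default.

    store is a dict mapping node paths to values; it only stands in for
    ZooKeeper in this sketch.
    """
    specific = f"/tasks/cfg/{option}/{platform}"
    if specific in store:
        return store[specific]
    # Raises KeyError if no default has been configured either.
    return store[f"/tasks/cfg/{option}/default"]


# Hypothetical configuration: platform 1001 overrides the default interval.
config = {
    "/tasks/cfg/interval/default": 600,
    "/tasks/cfg/interval/1001": 300,
    "/tasks/cfg/retries/default": 3,
}
interval_1001 = resolve_config(config, "interval", "1001")
interval_2002 = resolve_config(config, "interval", "2002")
```

With a real client, each node would additionally be watched so that a change to either level is picked up without a restart.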

Note: The mapping between servers and platforms cannot yet be assigned automatically; it must be created manually in ZooKeeper before the servers are started.

Task scheduling algorithm:

The nodes coordinate through ZooKeeper, using the following data structure:

Path	ZooKeeper data	Node type
/tasks/runtime/prevcomplete/[platform number]	Most recent execution time of the platform's order fetching task	
/tasks/runtime/inprogress/[platform number]-[task time slice start]-[task time slice end]	Time at which the task actually started executing	Persistent

Get task:

1. Read the most recent execution time from /tasks/runtime/prevcomplete/ in ZooKeeper. If it is empty, use the configured start time instead.
2. Calculate the next execution time slice from the time slice length.
3. Save the start and end time of the next execution time slice under /tasks/runtime/inprogress/, together with the new start time of the current platform's task. Note: The ZooKeeper multi command is used to make the two saves atomic (both succeed or both fail). When two different nodes try to acquire the same time slice, only the first save succeeds; the second throws a KeeperException.
4. If the save fails, read the most recent execution time again (repeat steps 1-3). After n unsuccessful attempts, report that fetching a task failed.
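
The steps above can be sketched with an in-memory stand-in for ZooKeeper. Everything named here (InMemoryStore, NodeExistsError, create_and_set, claim_next_slice) is an assumption made for illustration and is not a real ZooKeeper client API; times are plain epoch seconds.

```python
import threading


class NodeExistsError(Exception):
    """Stand-in for the KeeperException raised when a node already exists."""


class InMemoryStore:
    """Tiny stand-in for ZooKeeper; not a real client."""

    def __init__(self):
        self._nodes = {}
        self._lock = threading.Lock()

    def get(self, path):
        with self._lock:
            return self._nodes.get(path)

    def create_and_set(self, create_path, create_value, set_path, set_value):
        """Apply both writes or neither, mimicking a multi operation."""
        with self._lock:
            if create_path in self._nodes:
                raise NodeExistsError(create_path)
            self._nodes[create_path] = create_value
            self._nodes[set_path] = set_value


def claim_next_slice(store, platform, cfg_start, interval, now, max_attempts=3):
    """Claim the next time slice for a platform, or return None on failure."""
    prev_path = f"/tasks/runtime/prevcomplete/{platform}"
    for _ in range(max_attempts):
        # Step 1: most recent execution time, else the configured start time.
        prev = store.get(prev_path)
        slice_start = cfg_start if prev is None else prev
        # Step 2: the next slice is one interval long.
        slice_end = slice_start + interval
        task_path = f"/tasks/runtime/inprogress/{platform}-{slice_start}-{slice_end}"
        try:
            # Step 3: record the task and advance the start time atomically.
            store.create_and_set(task_path, now, prev_path, slice_end)
            return slice_start, slice_end
        except NodeExistsError:
            # Step 4: another node claimed this slice first; read again.
            continue
    return None


store = InMemoryStore()
first = claim_next_slice(store, "1001", cfg_start=0, interval=600, now=1000)
second = claim_next_slice(store, "1001", cfg_start=0, interval=600, now=1001)
```

In this sketch a stale read is only rejected while the earlier task's node still exists. A real implementation might also put a version check on the prevcomplete node inside the same multi; the source does not say whether it does.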

Task Completion:

Delete the task's node under /tasks/runtime/inprogress/ in ZooKeeper.

Exception Handling:

When a fetch task fails, it is retried up to a configured number of times (/tasks/cfg/retries). The interval between retries increases gradually (retry count * /tasks/cfg/retryintv), so that frequent retries do not make a poor network condition even worse. When the number of retries exceeds the limit, the task is placed in a failed queue and the administrator is notified; such tasks may require troubleshooting and subsequent manual processing.
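
A minimal sketch of this retry policy follows. The function name run_with_retries and its parameters are assumptions; the sleep function is injectable only so that the example can run without actually waiting.

```python
import time


def run_with_retries(task, max_retries, retry_interval_s, sleep=time.sleep):
    """Run task(); after each failure wait retries * retry_interval_s, then retry.

    Returns (True, result) on success, or (False, last_error) once the retry
    limit is exceeded so the caller can move the task to a failed queue.
    """
    retries = 0
    while True:
        try:
            return True, task()
        except Exception as exc:
            retries += 1
            if retries > max_retries:
                return False, exc
            # The wait grows linearly with the number of retries so far.
            sleep(retries * retry_interval_s)


# Hypothetical task that fails twice before succeeding.
attempts = {"count": 0}
waits = []


def flaky_fetch():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("platform API unavailable")
    return "orders fetched"


ok, result = run_with_retries(flaky_fetch, max_retries=3, retry_interval_s=5,
                              sleep=waits.append)
```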

Another situation is that, after a task has been acquired, server downtime, a crashed fetch thread, or some other fault prevents the node under /tasks/runtime/inprogress/ from being deleted. The task is then incomplete and its processing state is unknown. To handle this, the cluster elects a leader server through ZooKeeper to monitor the task list (ZooKeeper leader election is described in the official recipes and is not discussed here). The monitoring thread is active only on the leader and periodically scans all tasks under the /tasks/runtime/inprogress/ node. If a task has existed for longer than a threshold, it is deleted and placed in the failed queue for subsequent processing.
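
The leader's periodic scan could be sketched as below. Leader election itself is left out; a dict of task path to start time stands in for the children of /tasks/runtime/inprogress/, and the function name reap_stale_tasks and the sample values are assumptions.

```python
def reap_stale_tasks(inprogress, failed_queue, now, threshold_s):
    """Move tasks that have been in progress longer than threshold_s to the failed queue.

    inprogress maps a task path to the time (epoch seconds) at which its
    execution started. Returns the list of paths that were moved.
    """
    stale = sorted(
        path for path, started in inprogress.items()
        if now - started > threshold_s
    )
    for path in stale:
        del inprogress[path]        # delete the stuck task node
        failed_queue.append(path)   # keep it for subsequent processing
    return stale


# Hypothetical state: one task stuck for an hour, one started a minute ago.
tasks = {
    "/tasks/runtime/inprogress/1001-0-600": 1000,
    "/tasks/runtime/inprogress/1001-600-1200": 4540,
}
failed = []
moved = reap_stale_tasks(tasks, failed, now=4600, threshold_s=1800)
```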

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
