**A Cost-Effective Recommender System for Taxi Drivers**

**Summary**

Advances in GPS technology and new forms of urban geography have changed the shape of mobile services. For example, rich taxi GPS trajectory data enables new ways of doing business in the taxi industry. Indeed, much recent work uses taxi GPS traces to develop mobile recommender systems for taxi drivers. These systems recommend a sequence of pick-up points so as to maximize the probability of finding a passenger within the shortest driving distance. In the real world, however, a taxi's income is closely tied to its effective driving time. In other words, it matters more to a taxi driver to know an actual driving route that minimizes the driving time before a passenger is found. To this end, in this paper we propose a cost-effective recommender system for taxi drivers. Its design goal is to maximize the profit drivers earn when they follow the recommended routes to find passengers. Specifically, we first define a net profit objective function for evaluating the potential profit of a driving route. Then, by mining historical taxi data, we build a graph representation of the road network and provide a brute-force strategy to generate the optimal driving route for recommendation. A key challenge along this line, however, is the high computational cost of the graph-based approach. We therefore develop a novel recursion strategy, based on the special form of the net profit function, to search for optimal candidate routes efficiently. In particular, instead of recommending a sequence of pick-up points and letting the driver decide how to reach them, our recommender system provides an entire driving route, along which the driver can find customers with the largest potential profit. This makes our recommender system more practical and profitable than existing ones. Finally, we carry out experiments on a real-world dataset collected in the San Francisco Bay Area, and the results clearly validate the effectiveness of the proposed recommender system.

Keywords: Cost-effective, Mobile Recommender Systems, Taxi Drivers

**1. Introduction**

(Note: the paper's introduction is lengthy and covers the same ground as the summary above; it is not the key part.)

**2. Related work**

Much work has been done on personalized recommender systems, but these systems are built around traditional approaches: content-based recommendation, collaborative filtering, hybrid recommendation, and so on. They mostly use information available on the Internet to recommend movies, articles, books, web pages and the like to users. Most of the data comes from user ratings and rarely from mobile systems.

Personalized recommendation in mobile settings is more challenging than in traditional domains, mainly because of the complexity of spatial data, the intrinsic relationships among spatio-temporal data, the unclear roles of context-aware information, and the growing availability of environment-sensing capabilities. Recommender systems in mobile environments have indeed been studied before [1, 3, 5, 6, 12, 19, 20]. For example, [1, 5] target mobile route navigation. A framework for personalized context-aware recommendation for mobile users was proposed in [30]. Some technical opportunities related to mobile recommender systems were outlined in [20]. Averjanova et al. developed a map-based mobile recommender system that provides personalized recommendations to users. However, most of this work relies on user ratings and interactions, and the corresponding systems are developed for smart devices such as smartphones. The problem of building mobile recommender systems for the taxi industry is still largely open.

Recently, rich taxi GPS trajectory data has opened up new ways of doing business in the taxi industry. Much effort has gone into using taxi trajectory data to build mobile recommender systems. These systems extract energy-efficient transportation patterns from historical trajectories and recommend potential pick-up points to taxi drivers, typically as a sequence of optimal pick-up points. Powell et al. [16] proposed a grid-based approach that suggests profitable locations to taxi drivers by constructing a spatio-temporal profitability map. In addition, Yuan et al. [24, 25, 26] carried out a series of studies on mobile intelligence by mining taxi trajectories, for example detecting pick-up places with a probabilistic model and providing location recommendations for both taxi drivers and passengers. Unlike the above research, we propose a novel recommender system that provides a full driving route rather than discrete pick-up points, along which the driver can find customers with the largest potential profit.

**3. Problem Statement**

In this section, we first introduce some preliminaries and then formally define the maximum net profit (MNP) recommendation problem for taxi drivers.

**3.1 Preliminary Knowledge**

*3.1.1 Road Network Formulation*

Definition 1 (Road segment). A long street can be divided by its intersections into several road segments r. Each segment r consists of a start point r.s and an end point r.e. In addition, each segment r has a set of adjacent segments r.next[]: if ri.s = r.e, then ri belongs to r.next[].

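Definition 2 (Path). A path R is a sequence of connected road segments, for example R = r1 → r2 → ... → rN, where each segment after the first belongs to the next[] set of its predecessor. The start and end points of R can then be written as R.s = r1.s and R.e = rN.e.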

Definition 3 (Road network). A road network G can be represented as a graph G = <V, E>, where V = {ri} is the set of nodes (all road segments) and E is the set of edges, satisfying

$$E = \{(r_i, r_j) \mid r_j \in r_i.next[]\}$$

Figure 1 shows an example road network in which each node represents a road segment. Note that every edge has a single direction. This is because we do not allow a taxi to make a U-turn on the same road, which is not recommended in real life because it is likely to cause traffic accidents. A driver can, however, circle around three segments, such as r1, r2, r7.
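The definitions above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `Segment`, the helper `build_network`, and the toy coordinates are hypothetical and do not come from the paper. Points are compared for exact equality, and the U-turn rule is approximated by refusing to link a segment to its exact reverse.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A directed road segment r with start point r.s and end point r.e."""
    name: str
    s: tuple  # (lat, lon) of the start point
    e: tuple  # (lat, lon) of the end point
    next: list = field(default_factory=list)  # names of adjacent segments


def build_network(segments):
    """Fill r.next[] so that ri is in r.next[] whenever ri.s == r.e."""
    for r in segments:
        r.next = [
            ri.name
            for ri in segments
            # Assumption: a segment leading straight back to r.s is a U-turn.
            if ri.s == r.e and ri.e != r.s
        ]
    return {seg.name: seg for seg in segments}


# Toy triangle: r1 -> r2 -> r7 -> r1, a lap around three segments.
A, B, C = (0.0, 0.0), (0.0, 1.0), (1.0, 0.0)
net = build_network([Segment("r1", A, B), Segment("r2", B, C), Segment("r7", C, A)])
print({name: seg.next for name, seg in net.items()})
```

Storing `next` on the node keeps the structure close to Definition 1, where adjacency belongs to the segment rather than to a separate edge list.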

*3.1.2 Calculation of Net Profit*

For each segment r, the net profit g(r) consists of two parts: the potential earning and the potential cost. Specifically, we define the potential earning of segment r as E(r), computed as:

$$E(r) = P(r) \cdot \frac{\sum_{i=1}^{n_r} Fee(i; r)}{n_r} \tag{1}$$

(My understanding of this formula: it is the average fare per passenger on this segment, i.e. the mathematical expectation of the earning.)

Here, n_r is the number of pick-ups on segment r during a specific time period, Fee(i; r) is the earning from the i-th passenger picked up on segment r, and P(r) is the probability of picking up a passenger on segment r (described in Section 4). On the other hand, the potential cost of segment r is computed as:

$$C(r) = L(r) \cdot Gas + T(r) \cdot CompanyFee \tag{2}$$

Here, L(r) is the length of segment r, Gas is the fuel cost per unit distance (for example, per mile), T(r) is the time needed to drive through segment r, and CompanyFee is the operating cost per unit time (say, per minute). T(r) is closely tied to real-time traffic conditions. A traffic jam, for example, leads to a higher T(r) and therefore a higher cost T(r) * CompanyFee, in which case the segment will not be recommended by our model. The net profit of segment r, denoted g(r), can therefore be expressed as:

$$g(r) = E(r) - C(r) \tag{3}$$

With the above definitions, we can further define the net profit of a path R. When a given path R = r1 → ... → rM departs from r1, its total net profit is:

$$G(R, r_1) = \sum_{i=1}^{M} \Big[ \prod_{j=1}^{i-1} \big(1 - P(r_j)\big) \Big] \, g(r_i) \tag{4}$$

Intuitively, the net profit of path R is the sum of the net profits of the segments {ri} in R, where the profit of each segment counts only on the condition that no passenger was picked up on the preceding segments (r1 to r(i-1)).
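As a minimal sketch of the calculation described in this subsection, the code below assumes three things. The potential earning is the pick-up probability times the mean historical fare. The potential cost is fuel plus a time-based company fee. A segment contributes to the path profit only if every earlier segment yielded no passenger. All function and parameter names are hypothetical, and the numbers are made up.

```python
def expected_earning(p, fees):
    """Pick-up probability times the mean historical fare on the segment."""
    return p * sum(fees) / len(fees) if fees else 0.0


def cost(length, gas, time, company_fee):
    """Fuel cost over the segment length plus the time-based company fee."""
    return length * gas + time * company_fee


def segment_profit(p, fees, length, gas, time, company_fee):
    """Net profit g(r): potential earning minus potential cost."""
    return expected_earning(p, fees) - cost(length, gas, time, company_fee)


def path_profit(segments):
    """Net profit of a path given (g, p) pairs in driving order.

    Each g is weighted by the probability that the cab is still vacant,
    i.e. that no passenger was picked up on any earlier segment.
    """
    total, still_vacant = 0.0, 1.0
    for g, p in segments:
        total += still_vacant * g
        still_vacant *= 1.0 - p
    return total


# Two segments, each with a 50% pick-up probability.
print(path_profit([(2.0, 0.5), (4.0, 0.5)]))  # 2.0 + 0.5 * 4.0 = 4.0
```

Tracking the running "still vacant" probability avoids recomputing the product of (1 - P) terms for every segment.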

In fact, judging by the probabilistic net profit, a driver will not be sent to road segments far from the current location, because the expected profit there is very low. More specifically, when a segment is appended to a path, we can define the average growth rate of net profit to quantify the gain. Figure 2 shows how this growth rate changes as the number of segments increases, for different pick-up probabilities. The growth rate drops below 10% after 5 segments have been added.

**In fact, in our experiments the average pick-up probability per segment is usually below 0.1, so it is reasonable to set an upper bound on the path length M in Equation 4. With the above definitions, we can formally define the MNP recommendation problem as follows.**

**(There should be a heading here: 3.2 Problem Statement)**

Definition 4 (Problem statement). Given the current position lcab of a cab on segment r1, a specific driving length M, and a set of candidate paths 𝓡 such that every R in 𝓡 starts from r1, the MNP recommendation problem is to recommend a path R* in 𝓡 with the largest net profit, i.e.:

$$R^* = \arg\max_{R \in \mathcal{R}} G(R, r_1)$$

**Unlike existing recommender systems for taxi drivers, which focus mainly on extracting energy-efficient transportation patterns (in terms of time or distance) and recommending a sequence of potential pick-up points to taxi drivers [25, 8], MNP recommendation focuses on providing a full driving route.** Along this line, there are two major challenges in solving the MNP recommendation problem. First, how to estimate the parameters g(r) and P(r) of each segment r from historical pick-up data. Second, how to search efficiently for the best path in a complex, directed road network that contains cycles. In the next section, we address these two challenges with our solution.

(I think the section structure of this paper is poor: 3, 3.1, 3.1.1, 3.1.2. Wouldn't plain 3.1 and 3.2 have been enough?)

**4. Maximum Net Profit (MNP) Recommendation**

In this section, we present the technical details of our solution to the MNP recommendation problem.

**4.1 Estimating Parameters via the Road Buffer**

To accurately locate a taxi driver and estimate the net profit parameters, such as P(r) and g(r), we build a *road buffer* for each road segment. In a geographic information system, a buffer is the zone within a specified distance of a spatial object. The boundary of the buffer is the solid line at equal distance from the object (this looks like a slip by the paper's authors; it should be the dashed line). Figure 3(a) illustrates different buffer operations, such as point, polyline (three segments), and triangle buffers [21]. Intuitively, people wait for taxis at the roadside rather than in the middle of the road, so pick-up points usually lie along the roadside. Therefore, when counting historical pick-up events, we need to build a buffer around each segment, typically a rectangle enclosing the road. The size of the buffer depends on the requirements of the particular real-world problem.

To build the *road buffer*, we first need to distinguish vertical roads from horizontal roads. Using the longitude and latitude of the start and end points of each road segment, we compute the tangent of the segment's direction. If the tangent is greater than 1, we treat the road as vertical; otherwise it is horizontal. For a vertical road, we keep the latitudes of its start and end points unchanged and extend the longitudes to the east and west. For a horizontal road, we keep the longitudes of its start and end points and extend the latitudes to the north and south. Figure 3(b) shows the buffer operation for vertical and horizontal roads.

Given the historical pick-up data and the *road buffers*, we can count the pick-up events on each segment r, which tells us how often a pick-up occurs when a taxi passes through the segment. Let N_r denote the number of taxis that passed through the buffer of segment r, and n_r the number of pick-ups made by taxis within that buffer. Then, for each segment r, the pick-up probability P(r) can be estimated as:

$$P(r) = \frac{n_r}{N_r}$$

We can also obtain the earning Fee(i; r) in Equation 1 from the i-th historical pick-up event on segment r. In addition, the segment length L(r) and the real-time driving time T(r) can be estimated from historical data or from additional sources such as Google Maps. The net profit g(r) then follows from Equation 3. The values of T(r), g(r) and P(r) for each segment can be pre-stored on the corresponding node of the road network (for example, Figure 1).
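The buffer construction and the probability estimate can be sketched roughly as follows. Several assumptions are made for illustration only. The buffer is an axis-aligned rectangle. A road counts as vertical when the absolute slope exceeds 1 or the longitudes coincide. A vertical road is widened east and west (in longitude) and a horizontal road north and south (in latitude). P(r) is the number of pick-ups in the buffer divided by the number of taxi passes through it. The function names and the one-point-per-pass input format are hypothetical.

```python
def road_buffer(start, end, width):
    """Return (lat_min, lat_max, lon_min, lon_max) of a rectangular buffer."""
    (lat_s, lon_s), (lat_e, lon_e) = start, end
    d_lat, d_lon = lat_e - lat_s, lon_e - lon_s
    # Vertical if the road runs mostly north-south (assumed |slope| > 1).
    vertical = d_lon == 0 or abs(d_lat / d_lon) > 1
    lat_min, lat_max = sorted((lat_s, lat_e))
    lon_min, lon_max = sorted((lon_s, lon_e))
    if vertical:
        lon_min, lon_max = lon_min - width, lon_max + width  # widen east-west
    else:
        lat_min, lat_max = lat_min - width, lat_max + width  # widen north-south
    return lat_min, lat_max, lon_min, lon_max


def in_buffer(point, buf):
    """True if a (lat, lon) point falls inside the rectangular buffer."""
    lat, lon = point
    lat_min, lat_max, lon_min, lon_max = buf
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max


def pickup_probability(pickups, passes, buf):
    """Share of taxi passes through the buffer that ended in a pick-up.

    `pickups` and `passes` are lists of (lat, lon) points, one per event.
    """
    n_pick = sum(in_buffer(p, buf) for p in pickups)
    n_pass = sum(in_buffer(p, buf) for p in passes)
    return n_pick / n_pass if n_pass else 0.0


buf = road_buffer((0.0, 0.0), (1.0, 0.0), 0.1)  # a north-south road
print(buf)
print(pickup_probability([(0.5, 0.05)], [(0.5, 0.05), (0.2, -0.05), (0.5, 0.5)], buf))
```

A real pipeline would work in projected coordinates and handle diagonal roads more carefully; the rectangle here only mirrors the simple vertical/horizontal split described in the text.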

**4.2 MNP Path Recommendation**

In this subsection, we describe how to solve the MNP recommendation problem with different strategies.

*4.2.1 Brute-Force Recommendation Strategy*

Once the road network is available, we can use it to generate candidate paths for MNP recommendation. To this end, we first propose a brute-force strategy based on breadth-first search, shown in Algorithm 1. The algorithm maintains a path queue Q to generate the candidate path set C; then, in step 5, the function MNP(C) finds the best path in C, i.e. the one with the maximum net profit. However, this brute-force search for the MNP path is inefficient, because it examines all possible paths of length M in G.
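Algorithm 1 itself is not reproduced here, so the following is only a sketch of what a breadth-first brute-force search could look like. The adjacency dictionary `next_of`, the per-segment dictionaries `g` and `p`, and all function names are hypothetical. Paths that hit a dead end before reaching M segments are simply dropped, which is an assumption of this sketch.

```python
from collections import deque


def candidate_paths(next_of, start, m):
    """Breadth-first enumeration of every path with exactly m segments."""
    queue, candidates = deque([[start]]), []
    while queue:
        path = queue.popleft()
        if len(path) == m:
            candidates.append(path)
            continue
        for nxt in next_of.get(path[-1], []):
            queue.append(path + [nxt])
    return candidates


def net_profit(path, g, p):
    """Sum of g(r_i), each weighted by the chance the cab is still vacant."""
    total, still_vacant = 0.0, 1.0
    for r in path:
        total += still_vacant * g[r]
        still_vacant *= 1.0 - p[r]
    return total


def brute_force_mnp(next_of, g, p, start, m):
    """Score every candidate path and return the most profitable one."""
    return max(
        candidate_paths(next_of, start, m),
        key=lambda path: net_profit(path, g, p),
        default=None,
    )


next_of = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
g = {"a": 1.0, "b": 2.0, "c": 5.0, "d": 1.0}
p = {"a": 0.5, "b": 0.5, "c": 0.1, "d": 0.2}
print(brute_force_mnp(next_of, g, p, "a", 3))  # ['a', 'c', 'd']
```

Every candidate is materialised before scoring, which is exactly why this approach becomes expensive as M grows.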

Lemma 1. Given a fixed driving length M and a road network G = <V, E> with |V| = N, the computational complexity of finding the optimal MNP path with the brute-force algorithm is […].

Proof: The total number of candidate paths in the road network is […], and computing the net profit of each path takes M operations. Therefore, the complexity of searching for the best MNP path is […].

Intuitively, the computational complexity of this brute-force algorithm is too high for real-world applications. Some algorithms can reduce the search time on real-world road networks. To this end, we further propose another recommendation strategy based on the recursive nature of the net profit function.

*4.2.2 Recursive Recommendation Strategy*

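By inspecting the expression for the net profit of a path, we can rewrite Equation 4 in the following form:

$$G(R, r_1) = g(r_1) + \big(1 - P(r_1)\big)\Big[ g(r_2) + \big(1 - P(r_2)\big)\big[ g(r_3) + \cdots + \big(1 - P(r_{M-1})\big)\, g(r_M) \big] \Big]$$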

In fact, this particular form of the total net profit can be computed recursively. To this end, for each road segment r1, we can organize all candidate paths starting from r1 into a recursion tree, defined as follows.

Definition 5 (Recursion tree). The recursion tree of segment r1 is a tree whose nodes each represent a road segment and whose root is r1. In addition, for each node ri in the tree, its set of child nodes equals ri.next[].

For example, Figure 4 shows a sample recursion tree for segment a. In this paper, we present a method RTree(r, M) that constructs a recursion tree of depth M for r, given as Algorithm 2. The tree obtained by our algorithm keeps M node sets, where the i-th set holds the nodes in layer i of the tree. With this structure, the MNP recommendation starting from r1 can be recursively divided into several simpler MNP recommendations. Taking Figure 4 as an example, we can use a bottom-up method to compute the MNP path of length 3, whose net profit is denoted G(a, 3). By the definition of net profit, G(a, 3) = g(a) + (1 - P(a)) * max{G(b, 2), G(c, 2), G(f, 2), G(e, 2)}, where the net profit of each MNP path of length 2 can in turn be computed from its sub-paths. For example, in the figure we have G(b, 2) = g(b) + (1 - P(b)) * max{G(d, 1), G(i, 1)}, and the profit of each single segment (i.e. each leaf node) can be computed directly, for example G(d, 1) = g(d). Therefore, given a recursion tree, we can obtain the MNP path of length M after M - 1 levels of recursion.

Specifically, we develop a recursive algorithm for MNP recommendation, RMNP(r, k), shown in Algorithm 3. Calling it with r = r1 and k = M yields the MNP path of length M starting from segment r1, together with its MNP value.
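The recursion G(r, k) = g(r) + (1 - P(r)) * max over children of G(child, k - 1) can be sketched as below. This is not the paper's Algorithm 3, only an illustration of the same idea: `rmnp`, `next_of`, `g` and `p` are hypothetical names, and a segment without successors is treated as a leaf even when k > 1, which is an assumption of this sketch.

```python
def rmnp(next_of, g, p, r, k):
    """Return (best net profit, best path) for paths of up to k segments from r."""
    children = next_of.get(r, [])
    if k == 1 or not children:
        # Leaf of the recursion tree: the profit of the single segment.
        return g[r], [r]
    # Solve the smaller problem for every child and keep the best one.
    best_value, best_tail = max(
        (rmnp(next_of, g, p, child, k - 1) for child in children),
        key=lambda result: result[0],
    )
    return g[r] + (1.0 - p[r]) * best_value, [r] + best_tail


next_of = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
g = {"a": 1.0, "b": 2.0, "c": 5.0, "d": 1.0}
p = {"a": 0.5, "b": 0.5, "c": 0.1, "d": 0.2}
value, path = rmnp(next_of, g, p, "a", 3)
print(path, round(value, 2))  # ['a', 'c', 'd'] 3.95
```

Because (1 - P(r)) is never negative, maximising each child subproblem independently also maximises the parent, which is what makes the bottom-up evaluation valid.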

Lemma 2. Assume a recursion tree of depth M in which |r.next[]| <= n for every segment r. Then the complexity of finding the optimal MNP path with the recursive method is […].

Proof: Suppose the cost of computing G(R, r1, M) is T(M); then we obtain […]. Furthermore, for any r with |r.next[]| = n, the computation splits into n sub-problems. For paths with only one segment, T(1) = 1. After M - 1 levels of recursion, we get […]. Therefore, the computational complexity of finding the best MNP path through the recursion tree is […].

Although the recursion tree gives recommendations more efficiently than the brute-force algorithm, the computational cost still rises sharply as M grows. Following the discussion in Section 3, we can set an upper bound on M, because the average growth of net profit becomes very low once M > 5. We therefore set M = 5 in our experiments.

**4.3 Top-k Path Recommendation**

With the above algorithms, our recommender system can recommend an MNP path to a single driver. In real life, however, an ideal recommender system must serve multiple taxi drivers in the same area at the same time. In this subsection, we focus on this issue and introduce a minimal-redundancy strategy for this real-world recommendation scenario.

Intuitively, a straightforward strategy is to recommend the best driving path to all drivers. However, if we recommend the same path to too many drivers at the same time, this leads to an overload problem and degrades the performance of the recommender system. The overload problem is a classic problem that has been studied extensively. For example, load balancing mechanisms distribute requests among multiple web servers in order to reduce execution time [22, 10]. In our problem, we can think of the empty taxis as tasks and of the best driving paths as computers. Instead of solving the overload problem with existing load-balancing algorithms, we focus on the directional characteristics of the mobile recommender system and use a direction-based clustering approach (DEN) [29] to distribute empty cabs along the top-k best driving paths [9, 23].

Before recommending driving routes to taxi drivers, we first rank all candidate paths by net profit and obtain the top-k driving routes. After recommending the top-ranked path to the first driver, we compute the correlation between this path and the remaining k - 1 candidate paths, and then recommend the least correlated path to the second driver.

To compute the correlation between candidate paths, we first divide the space into grid cells and turn the movement data in each cell into a vector that represents the probability of moving in each direction within the cell. To do so, we convert the direction of each taxi movement into the same format by dividing every cell into 8 direction bins. For example, in Figure 5(a), each bin covers an angular range of 45° (π/4). Each cell is then represented by a vector g = (p1, p2, p3, ..., p8), where pi is the probability of moving in direction i within the cell:

$$p_i = \frac{f_i}{\sum_{j=1}^{8} f_j}$$

Here fi is the frequency of moving objects that pass through the cell along direction i.

For example, as shown in Figure 5(b), we first recommend path A to the first driver, while paths B, C and D are the other candidate paths at the same time and place. We then divide the space into grid cells and obtain the vector of each cell. The candidate path with the lowest correlation to the previously recommended path usually differs in its initial driving direction. Therefore, we only need to analyze the first n cells to determine the driving direction. We concatenate the vectors of these n cells to obtain a vector of 8 * n elements for each candidate path, for example g(A) = (p11, p12, ..., pn7, pn8) for path A. We then compute the correlation between the vectors of each pair of candidate paths, so that the similarity of A and B can be computed […]. If the correlation between path B and path A is the lowest, we recommend B to the next empty cab.
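A rough sketch of the direction vectors and the least-correlated choice is given below. The correlation measure is not specified in this text, so Pearson correlation is used here purely as an assumption. The function names, the use of planar (x, y) coordinates, and the toy vectors are likewise hypothetical.

```python
import math


def direction_bin(p_from, p_to, bins=8):
    """Index 0..bins-1 of the heading from p_from to p_to, both (x, y)."""
    angle = math.atan2(p_to[1] - p_from[1], p_to[0] - p_from[0]) % (2 * math.pi)
    return int(angle // (2 * math.pi / bins)) % bins


def cell_vector(moves, bins=8):
    """Normalised heading frequencies p_i = f_i / sum_j f_j for one grid cell."""
    freq = [0] * bins
    for p_from, p_to in moves:
        freq[direction_bin(p_from, p_to, bins)] += 1
    total = sum(freq)
    return [f / total if total else 0.0 for f in freq]


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (assumed measure)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0


def least_correlated(recommended, candidates):
    """Name of the candidate whose vector correlates least with `recommended`."""
    return min(candidates, key=lambda name: pearson(recommended, candidates[name]))


path_a = [1.0, 0, 0, 0, 0, 0, 0, 0]          # heads east
others = {
    "B": [0.9, 0.1, 0, 0, 0, 0, 0, 0],       # almost the same heading
    "C": [0, 0, 0, 0, 1.0, 0, 0, 0],         # heads west
}
print(least_correlated(path_a, others))  # C
```

For a path described by its first n cells, the per-cell vectors would be concatenated into one list of 8 * n values before calling `pearson`.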

**5. Experimental results**

To verify the efficiency and effectiveness of the proposed system, we ran extensive experiments on a 30-day dataset collected in the real-world San Francisco area.

**5.1 Experimental Data**

*Taxi GPS traces.* In this experiment, we use real-world taxi GPS traces collected by the Exploratorium. A mobility trace records the continuous driving state of a vehicle, and each record is a tuple (latitude, longitude, fare identifier, time stamp). After cleaning the dataset, we obtain 89,897 pick-up and drop-off events in total. We assume that most drivers follow the directions suggested by Google Maps, so we can obtain the price of a particular trip, and the fare information can also be used to compute the profit of the trips we are interested in. Figure 6 shows the pick-up points of 100 taxi drivers in the San Francisco Bay Area over 30 days; each red dot represents a pick-up event. Figure 7 is a heat map of the pick-up probability, in which different colors and circle sizes represent different pick-up probabilities. The figure shows a lot of pick-up activity on San Francisco's Commercial Street, a very busy street with many shops and museums. Other pick-up hotspots include Fisherman's Wharf, Divisadero St, Cathedral Hill and Western Addition.

*Road network data.* Because the available road network data for San Francisco is insufficient, we build the road network with the Google Maps API. First, we collect the names of all streets in San Francisco. Second, we query the Google API to find out whether two roads intersect, and record an entry for each intersection. Figure 8(a) illustrates our intersection points. For each intersection, we then find the nearest intersections in the 4 directions and connect the five points. This gives 4 connected segments, each consisting of a start point and an end point. However, as the yellow line in Figure 8(b) shows, we may connect two intersections that have no road between them. To solve this problem, we compute the distance between the two intersections from their coordinates and compare it with the driving distance returned by Google Maps. If there is a road between the two points, the two distances should be very close. If not, there is no road between the two intersections, and we remove the segment from the road network.
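The distance check for spurious segments can be sketched as follows. The straight-line distance uses the standard haversine formula. The driving distance is passed in as a plain number because the routing call is outside the scope of this sketch. The `tolerance` threshold and both function names are hypothetical, since the text only says the two distances should be "very close".

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def keep_segment(a, b, driving_km, tolerance=1.2):
    """Keep the segment only if driving distance is close to straight-line distance.

    `driving_km` is assumed to come from an external routing service;
    `tolerance` is an illustrative threshold, not a value from the paper.
    """
    return driving_km <= tolerance * haversine_km(a, b)


a, b = (37.7749, -122.4194), (37.7849, -122.4194)  # about 1.1 km apart
print(round(haversine_km(a, b), 2))
print(keep_segment(a, b, driving_km=1.15))  # direct road: keep
print(keep_segment(a, b, driving_km=3.0))   # long detour: drop
```

A ratio test is used rather than an absolute difference so that the same threshold works for both short and long blocks.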

The San Francisco road network consists of 5,391 road segments, each with an ID, a start point, an end point, the historical pick-up probability we computed, and the net profit of the segment. For each segment, the coordinates of many intermediate points may be recorded, some of which are noise. After removing these noise points, we select 2,149 segments with high occupancy rates for our experiments. We then build the road buffers from the start and end points of these segments.

By matching the pick-up coordinates in the taxi dataset against the road network, we obtain 87,688 valid pick-up events that can be located on road segments, so that the two datasets are joined and each pick-up point is mapped to one of the road buffers we built. To run the proposed algorithms, we also need the occupancy rate and the net profit of each segment, computed as described in Section 4.

In the end, for each segment we have its start and end points together with its occupancy rate, net profit, and average driving time. Note that the driving time is estimated as the segment length divided by the average driving speed in the San Francisco area.

**5.2 Experimental Study of the Recommendations**

Here, we present two groups of experiments. The first group covers cost-effective path recommendation, and the second covers top-k recommendation.

*5.2.1 Experiments on Cost-Effective Path Recommendation*

Here we show two examples of MNP paths recommended by our two algorithms and compare them with the paths recommended by Google Maps. In Figures 9 and 10, we plot the best driving routes recommended by our system for a target cab with a randomly selected starting position. We assume the driver's expected cruising length is 5 segments; after every 5 segments, the system takes the current position as the new starting point and restarts the recommendation process. For comparison, we compute the actual driving time of each taxi trip and keep restarting our recommender until the driving time along the MNP paths equals the actual driving time. We then connect these MNP paths into one complete route to be recommended to the driver. In each figure, the left map shows the route recommended by the MNP recommender system, and the right map shows the route recommended by Google Maps, which uses the shortest driving distance. The route recommended by Google Maps, however, does not maximize the taxi's profit.

At present, most recommender systems can only recommend a sequence of hotspots to taxi drivers; none of them can recommend an entire driving route. If a taxi driver does not know how to drive to the nearest hotspot, he or she has to follow the route provided by Google Maps. However, both the pick-up rate and the potential net profit along that route can be very low, and the driver is likely to lose money before reaching the next hotspot. Compared with the route suggested by Google Maps, our recommender system can improve the potential net profit for taxi drivers.

*5.2.2 Experiments on Top-k Recommendation*

In Section 4, we introduced a minimal-redundancy strategy for recommending the top-k driving paths and solving the overload problem. Figure 11 illustrates the top-k driving routes starting from the same location, with k = 4. The figure shows that each path has a different driving direction and that the correlation between these routes is very small. The minimal-redundancy strategy can therefore improve the performance of our recommender system.

**5.3 Route Recommendation for Inexperienced Taxi Drivers**

Given a specific location, the proposed algorithm can recommend several paths with high expected profit to taxi drivers. This is particularly useful for inexperienced taxi drivers, because they lack knowledge of the roads and of how to choose a profitable driving route. To validate the proposed algorithm, we first divide all drivers into two groups according to their average net profit. The top 10% of drivers are considered experienced in this dataset, while the others are considered inexperienced. The driving routes of the experienced drivers are used as the training set, which is then used to recommend driving routes to the inexperienced drivers.
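The split into experienced and inexperienced drivers might look like the sketch below. The dictionary of per-driver average net profits is made-up data, and `split_drivers` and `top_fraction` are hypothetical names. Rounding the group size and keeping at least one experienced driver are assumptions of this sketch.

```python
def split_drivers(avg_profit, top_fraction=0.10):
    """Split driver ids into (experienced, inexperienced) by average net profit."""
    ranked = sorted(avg_profit, key=avg_profit.get, reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    return ranked[:n_top], ranked[n_top:]


# Ten hypothetical drivers with average net profits 0.0 .. 9.0.
profits = {f"d{i}": float(i) for i in range(10)}
experienced, inexperienced = split_drivers(profits)
print(experienced)         # ['d9']
print(len(inexperienced))  # 9
```

The experienced group would then supply the historical trips used to estimate the segment parameters, and the recommendations are evaluated against the trips of the remaining drivers.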

We define a driver's event e as a time-ordered sequence, and we can reconstruct each event by extracting the pick-up and drop-off behavior of each driver. For each taxi driver, we define the location where he starts looking for passengers as L0; after cruising for some time, the driver picks up a passenger at L1 and, after some driving time, drops the passenger off at L2. Let ri,j denote the segment between Li and Lj; then event e can be represented […], and the profit per unit time of the event can be computed […]. The proposed algorithm therefore starts at a location close to L0 and returns a sequence of recommended pick-up points and segments.

The recommended driving path is evaluated by its average net profit per unit time, Pr, and compared with the average earning per unit time of inexperienced drivers, Pe.

The results of recommending driving routes to inexperienced drivers are shown in Table 1: the average net profit per unit time of the recommended routes is higher than that actually achieved by inexperienced drivers.

We first plot the distribution of net profit per unit time, i.e. the number of events at each profit value, in Figure 12. The net income per unit time along the paths recommended by our system is compared with the performance of inexperienced taxi drivers recorded in the historical data. In the histogram, the blue bars show the net income obtained with our recommender system, and the red bars show the net income of inexperienced taxi drivers. We can see that most of the recommended events fall in the high-value region. This implies that our recommender system yields a higher profit than the actual paths of inexperienced drivers.

To further investigate the performance of the recommender system, we also study, for each event, the difference in net profit per unit time between the recommended path and the driver's actual path, i.e. Pr - Pe. In Figure 13, the x-coordinate is the difference between the profit of the recommender system and that of inexperienced taxi drivers. Most of the points lie to the right of the line x = 0, which means that our recommender system yields a higher profit than the actual paths of inexperienced drivers.

We then compare the performance of the brute-force recommendation strategy with that of the recursive recommendation strategy. This experiment uses 1000 randomly selected starting points, and we only compare the running time for paths of five segments, because the probability of picking up a passenger beyond 5 segments is below 10%, as discussed for Equation 4. In Figure 14, the red line is the running time of the brute-force strategy and the black line is the running time of the recursive strategy. We can see that the recursive strategy is more efficient than the brute-force strategy. Note that all experiments were run on Windows 7 with an Intel(R) Core(TM) i5-3210 CPU and 6.0 GB of RAM.

To summarize, these experiments show that the cost-effective recommender system can help inexperienced taxi drivers find better paths that maximize their potential profit. In addition, the recursive strategy helps to identify the best recommended path efficiently.

**6. Conclusion**

In this paper, we presented a cost-effective recommender system for taxi drivers that maximizes their profit by providing profitable driving routes. More specifically, we first defined a net profit objective function to evaluate a driving route before a passenger is found. We then proposed a graph-based approach to efficiently generate candidate paths for finding passengers. With the net profit objective function, we can rank the candidate paths and provide recommendations to taxi drivers in a cost-effective manner. A unique feature of our recommender system is its ability to provide an entire driving route rather than a sequence of discrete pick-up points. In addition, by following the recommended driving route, taxi drivers can maximize their earnings over a fixed period of time. Finally, extensive experiments on a dataset collected in the real-world San Francisco area verify the effectiveness of the proposed recommender system.

**Reference:**

**Qu M, Zhu H, Liu J, et al. A cost-effective recommender system for taxi drivers[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2014: 45-54.**
