I first learned about the SimPy package when I met one of its founders, Klaus Müller. Dr. Müller had read several Charming Python columns that presented techniques for implementing semi-coroutines and "weightless" threads using Python 2.2+ generators. In particular (and to my delight), he found these techniques useful for implementing Simula-67 style simulations in Python.
It turned out that Tony Vignaux and Chang Chui had previously created another Python library that was conceptually closer to Simscript, and that used standard threading rather than my semi-coroutine technique. Working together, the group decided that the generator-based style was much more efficient, and recently launched a GPL-licensed project on SourceForge called SimPy (see Resources for a link to the SimPy home page), which is currently in beta status. Professor Vignaux hopes to use the unified SimPy package in his future teaching at Victoria University of Wellington, and I believe the library is also well suited to all sorts of practical problems.
I confess that before this recent correspondence and research I knew essentially nothing about simulation as a field of programming. I suspect that most readers of this column are, like me, largely unfamiliar with it. While this style of programming may seem somewhat novel, simulations are useful for understanding the behavior of real systems with limited resources. Whether you are interested in limited-bandwidth networks, automobile traffic behavior, market and commercial optimization, biological/evolutionary interactions, or other "stochastic" systems, SimPy provides a simple Python tool for such modeling.
A definition of stochastic
Like "connect," this is one of those words that describes its job so well that no other word will do:
stochastic (adjective), from Greek stokhastikos
1) Conjectural: of, relating to, or characterized by conjecture; skilled at guessing.
2) Statistics: involving or containing a random variable or variables; involving chance or probability.
Source: dictionary.com
Throughout this column, I use one fairly simple example: a grocery store checkout area with multiple aisles. Using the simulation, we can ask questions about the economic and waiting-time implications of various changes to scanner technology, shopper habits, staffing requirements, and so on. The advantage of this kind of modeling is that it lets you plan strategy in advance, with a clear idea of the implications of the changes you make. Most readers obviously do not run grocery stores, but the same techniques apply widely to other systems.
Concepts of simulation
The SimPy library provides only three abstract/parent classes, which correspond to the three basic concepts of simulation. A number of other general functions and constants control the running of a simulation, but the important concepts are tied to these classes.
The core concept in simulation is the process. A process is simply an object that performs some task, then sometimes waits a while before it is ready to perform the next task. In SimPy, you can also "passivate" a process, which means that after completing a task, it does nothing further until some other process calls on it. It is often useful to think of a process as trying to accomplish a goal. A process is usually written as a loop within which several actions are performed. Between actions, you insert a Python "yield" statement, which lets the simulation scheduler carry out the actions of every other waiting process before returning control.
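To make the role of "yield" concrete, here is a minimal sketch of a generator-based scheduler, written in modern Python 3 with only the standard library. It is not SimPy's implementation or API: the names run and ticker are hypothetical, and each process is assumed to yield nothing but the delay it wants to wait before its next action.

```python
import heapq

def run(processes, until):
    """Interleave generator-based processes in simulated-time order.

    Each process yields the delay it wants to wait before its next step.
    Returns a log of (time, name) pairs, one per step that was executed.
    """
    log = []
    queue = []  # heap entries: (wake_time, sequence_number, name, generator)
    for seq, (name, gen) in enumerate(processes):
        heapq.heappush(queue, (0.0, seq, name, gen))
    seq = len(processes)
    while queue:
        now, _, name, gen = heapq.heappop(queue)
        if now > until:
            break
        try:
            delay = next(gen)   # run the process up to its next yield
        except StopIteration:
            continue            # this process has finished
        log.append((now, name))
        # The sequence number breaks ties so generators are never compared
        heapq.heappush(queue, (now + delay, seq, name, gen))
        seq += 1
    return log

def ticker(interval, times):
    """Hypothetical process: act, then wait `interval`, `times` times over."""
    for _ in range(times):
        yield interval

events = run([("fast", ticker(1.0, 3)), ("slow", ticker(2.0, 2))], until=10.0)
for time, name in events:
    print(time, name)
```

Each process is written as an ordinary loop, yet the scheduler decides when every step runs, which is the essence of the generator-based style described above.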
Many of the actions a process performs depend on the use of resources. A resource is simply something that is limited in availability. In a biological model, a resource might be a food supply; in a network model, a resource can be a router or a limited-bandwidth channel; in our market simulation, the resources are checkout aisles. The only job a resource performs is to limit, at any given time, which processes may use it. Under the SimPy programming model, the process alone decides how long it holds a resource, and the resource itself is passive. This model may or may not fit a given real system conceptually; it is easy to imagine resources that intrinsically limit their own use (for example, a server that drops a connection when it does not receive a satisfactory response within a required time frame). But as a programming matter, it does not matter much whether the process or the resource is the "active" party (just make sure you understand your intent).
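The idea of a passive, capacity-limited resource can be sketched in a few lines of Python 3. This is an illustration under assumptions, not SimPy's Resource class: the class name SimpleResource and its request and release methods are hypothetical, and waiting requesters are assumed to be served first come, first served.

```python
from collections import deque

class SimpleResource:
    """A passive resource: it only tracks who holds it and who is waiting."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.users = []          # requesters currently holding a unit
        self.waiting = deque()   # requesters queued in arrival order

    def request(self, who):
        """Grant a unit if one is free; otherwise queue the requester."""
        if len(self.users) < self.capacity:
            self.users.append(who)
            return True
        self.waiting.append(who)
        return False

    def release(self, who):
        """Free a unit and hand it to the next waiter, if there is one."""
        self.users.remove(who)
        if self.waiting:
            next_user = self.waiting.popleft()
            self.users.append(next_user)
            return next_user
        return None

aisles = SimpleResource(capacity=2)
granted = [aisles.request(name) for name in ("ann", "bob", "cy")]
next_up = aisles.release("ann")   # the holder, not the resource, decides when
```

Note that the resource never evicts anyone: a unit comes free only when its holder calls release, which mirrors the passive role described above.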
The last SimPy class is the monitor. A monitor is not really essential, but it is convenient. A monitor's whole job is to record the events reported to it and to keep statistics about those events (mean, count, variance, and so on). The Monitor class provided by the library is a useful tool for recording simulation measurements, but you can equally record events with any other technique you like. In fact, my example subclasses Monitor to provide some (slightly) enhanced capabilities.
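As a rough picture of what such a monitor does, the following Python 3 sketch tallies observations and reports a count, a mean, and a variance. The class name TallyMonitor is hypothetical, and the use of the population variance (dividing by the count) is an assumption of this sketch rather than a statement about SimPy's Monitor.

```python
class TallyMonitor:
    """Record observations and report simple statistics about them."""

    def __init__(self):
        self.values = []

    def tally(self, x):
        """Record one observation."""
        self.values.append(x)

    def count(self):
        return len(self.values)

    def mean(self):
        return sum(self.values) / len(self.values)

    def var(self):
        """Population variance (an assumption; see the note above)."""
        m = self.mean()
        return sum((v - m) ** 2 for v in self.values) / len(self.values)

monitor = TallyMonitor()
for wait in (1.0, 2.0, 3.0, 6.0):
    monitor.tally(wait)
print(monitor.count(), monitor.mean(), monitor.var())
```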
Setting up shop: programming the simulation
In most articles I write, I present a sample application right away, but here I think it is more useful to walk through the grocery store application step by step. You can paste the parts together if you wish; the SimPy creators plan to include my example in a future release.
The first step in the SimPy simulation is a few general import statements:
Listing 1. Import SimPy Library
    #!/usr/bin/env python
    from __future__ import generators
    from SimPy import Simulation
    from SimPy.Simulation import hold, request, release, now
    from SimPy.Monitor import Monitor
    import random
    from math import sqrt
Some of the examples included with SimPy use the import * style, but I prefer to be more explicit about what goes into my namespace. For Python 2.2 (the minimum version SimPy requires), you need to import the generators feature as shown. This is not necessary for Python 2.3 and later.
For my application, I define several run-time constants that describe the scenario I am interested in for a particular simulation run. When I change the scenario, I have to edit these constants in the main script. If the application were more substantial, I might configure these parameters with command-line options, environment variables, or a configuration file. But for now, this style is sufficient:
Listing 2. Configuring Simulation Parameters
    AISLES = 5         # Number of open aisles
    ITEMTIME = 0.1     # Time to ring up one item
    AVGITEMS =         # Average number of items purchased
    CLOSING = 60*12    # Minutes from store open to store close
    AVGCUST =          # Average number of daily customers
    RUNS = 10          # Number of times to run the simulation
The main job of our simulation is to define one or more processes. For the simulated grocery store, the process we are interested in is a customer checking out at an aisle.
Listing 3. Define the actions of the customer
    class Customer(Simulation.Process):
        def __init__(self):
            Simulation.Process.__init__(self)
            # Randomly pick how many items the customer is buying
            self.items = 1 + int(random.expovariate(1.0/AVGITEMS))
        def checkout(self):
            start = now()           # Customer decides to check out
            yield request, self, checkout_aisle
            at_checkout = now()     # Customer gets to front of line
            waittime.tally(at_checkout-start)
            yield hold, self, self.items*ITEMTIME
            leaving = now()         # Customer completes purchase
            checkouttime.tally(leaving-at_checkout)
            yield release, self, checkout_aisle
Each customer decides to buy a certain number of items. (Our simulation does not model picking items off the grocery shelves; customers simply arrive at the checkout with their carts.) I am not sure the exponential distribution used here is really an accurate model. It feels right at the low end, but I suspect it is a bit off about the upper limit on how many items real shoppers buy. In any case, you can see how easy it would be to adjust the simulation if better modeling information became available.
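One way to get a feel for that doubt about the upper end is to sample the same expression many times and look at the tail. The Python 3 sketch below does this; the helper name sample_items is hypothetical, and the average of 20 items is an illustrative assumption, not a value taken from the listing above.

```python
import random

def sample_items(avg_items, rng):
    """Same formula as Customer.__init__: at least 1 item, exponential tail."""
    return 1 + int(rng.expovariate(1.0 / avg_items))

AVG_ITEMS = 20                      # assumed value, for illustration only
rng = random.Random(42)             # fixed seed so the run is repeatable
samples = [sample_items(AVG_ITEMS, rng) for _ in range(100000)]

mean_items = sum(samples) / len(samples)
share_over_60 = sum(1 for s in samples if s > 60) / len(samples)
print("mean items: %.1f" % mean_items)
print("share buying more than 60 items: %.3f" % share_over_60)
```

With this formula, a customer exceeds three times the average with probability e^-3, or roughly 5%, so a noticeable minority of simulated shoppers arrive with very full carts. Whether real shoppers behave that way is exactly the modeling question raised above.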
What we care about is the actions a customer takes. The customer's "execution method" is .checkout(). This process method is often named .run() or .execute(), but in my example .checkout() seemed most descriptive; you can give it any name you like. The actual actions a Customer object takes are simply to check the simulation time at a few points and to record the durations in the waittime and checkouttime monitors. Between these actions, however, are the crucial yield statements. In the first, the customer requests a resource (a checkout aisle). Only once the customer has obtained the resource can it do anything else. Once at the checkout aisle, the customer is actually checking out, which takes time proportional to the number of items purchased. Finally, after checking out, the customer releases the resource so that other customers can use it.
The code above defines the behavior of the Customer class, but we need to create some actual customer objects before running the simulation. We could generate a customer object for every customer who will shop during the day and assign each one a checkout time. A cleaner approach, however, is to let a factory object generate each customer object as that customer arrives at the store. The simulation is not really concerned with all the customers who will shop during the day at once, only with those who compete for checkout aisles at the same time. Note that the Customer_factory class is itself part of the simulation: it is a process. Although the phrase "customer factory" might bring to mind robot workers (à la Fritz Lang's Metropolis), it should be regarded simply as a programming convenience; it does not correspond directly to anything in the modeled domain.
Listing 4. Generate Customer Flow
    class Customer_factory(Simulation.Process):
        def run(self):
            while 1:
                c = Customer()
                Simulation.activate(c, c.checkout())
                arrival = random.expovariate(float(AVGCUST)/CLOSING)
                yield hold, self, arrival
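The rate passed to expovariate, AVGCUST/CLOSING, is the number of customers per minute, so the gaps between arrivals average CLOSING/AVGCUST minutes and a full day produces roughly AVGCUST customers. The Python 3 sketch below checks this without SimPy; the function name arrival_times is hypothetical, and the figure of 1500 daily customers is an assumption chosen only for illustration.

```python
import random

CLOSING = 60 * 12    # minutes from store open to store close, as in Listing 2
AVGCUST = 1500       # assumed average number of daily customers

def arrival_times(rate, until, rng):
    """Return the arrival times produced by exponential inter-arrival gaps."""
    times = []
    t = 0.0                          # the factory creates a customer at time 0
    while t < until:
        times.append(t)
        t += rng.expovariate(rate)   # wait an exponentially distributed gap
    return times

rng = random.Random(7)               # fixed seed so the run is repeatable
arrivals = arrival_times(float(AVGCUST) / CLOSING, CLOSING, rng)
print("customers today:", len(arrivals))
```

The count varies from day to day around the chosen average, which is one source of the run-to-run differences in the statistics.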
As I mentioned earlier, I want to collect some statistics that the SimPy Monitor class does not currently provide. Specifically, I am interested not only in the average checkout time but also in the worst case in a given scenario. So I created an enhanced monitor that also records the minimum and maximum tallied values.
Listing 5. Monitoring the simulation with a Monitor
    class Monitor2(Monitor):
        def __init__(self):
            Monitor.__init__(self)
            self.min, self.max = (int(2**31-1), 0)
        def tally(self, x):
            Monitor.tally(self, x)
            self.min = min(self.min, x)
            self.max = max(self.max, x)
The final step of our simulation is, of course, to run it. Most of the standard examples run a simulation only once. For my grocery store, however, I decided to loop over several simulations, each corresponding to one day of business. This seems like a good idea because some of the statistics vary quite a bit from day to day (the number of customers arriving and the number of items they buy are drawn from random values).
Listing 6. Run Simulations Daily
    for run in range(RUNS):
        waittime = Monitor2()
        checkouttime = Monitor2()
        checkout_aisle = Simulation.Resource(AISLES)
        Simulation.initialize()
        cf = Customer_factory()
        Simulation.activate(cf, cf.run(), 0.0)
        Simulation.simulate(until=CLOSING)
        #print "Customers:", checkouttime.count()
        print "Waiting time average: %.1f" % waittime.mean(), \
              "(Std dev %.1f, maximum %.1f)" % (sqrt(waittime.var()), waittime.max)
        #print "Checkout time average: %.1f" % checkouttime.mean(), \
        #      "(Standard deviation %.1f)" % sqrt(checkouttime.var())
    print 'AISLES:', AISLES, '  ITEM TIME:', ITEMTIME
Three's a crowd: some results (and what they mean)
When I first thought about the grocery store model, I expected the simulation to answer a few straightforward questions. For example, I imagined that the owner might choose either to buy improved scanners (reducing ITEMTIME) or to hire more staff (increasing AISLES). I wanted to run the simulation under each scenario (given the costs of staff and technology) and determine which of the two options was more cost-effective.
Only after running the simulations did I realize that something more interesting than I had expected might be going on. Looking at all the data collected, I realized I did not know what I was trying to optimize. For example, which matters more: reducing the average checkout time or reducing the worst-case time? Which would do more for overall customer satisfaction? And how should the time customers wait before checking out be weighed against the time taken to scan their purchases? In my own experience, I get impatient waiting in line, but I am not much bothered while my items are being scanned (even if that takes a while).
Of course, I do not run a grocery store, so I do not know the answers to all these questions. But the simulation does let me determine exactly what the trade-offs are, and it is simple enough to adjust that it can be adapted to many behaviors, including ones not yet explicitly parameterized (for example, "Do customers really arrive at a steady rate all day long?").
I want to demonstrate the value of this kind of modeling with one last example. I wrote above that the behavior of complex systems is hard to conceptualize, and I think this example proves the point. What do you think happens when the number of available aisles is reduced from 6 to 5 (with all other parameters unchanged)? My initial guess was that the worst-case checkout time would increase slightly. That is not what happens:
Listing 7. Two sample runs, before and after the number of aisles changes
    % python market.py
    Waiting time average: 0.5 (Std dev 0.9, maximum 4.5)
    Waiting time average: 0.3 (Std dev 0.6, maximum 3.7)
    Waiting time average: 0.4 (Std dev 0.8, maximum 5.6)
    Waiting time average: 0.4 (Std dev 0.8, maximum 5.2)
    Waiting time average: 0.4 (Std dev 0.8, maximum 5.8)
    Waiting time average: 0.3 (Std dev 0.6, maximum 5.2)
    Waiting time average: 0.5 (Std dev 1.1, maximum 5.2)
    Waiting time average: 0.5 (Std dev 1.0, maximum 5.4)
    AISLES: 6   ITEM TIME: 0.1
    % python market.py
    Waiting time average: 2.1 (Std dev 2.3, maximum 9.5)
    Waiting time average: 1.8 (Std dev 2.3, maximum 10.9)
    Waiting time average: 1.3 (Std dev 1.7, maximum 7.3)
    Waiting time average: 1.7 (Std dev 2.1, maximum 9.5)
    Waiting time average: 4.2 (Std dev 5.6, maximum 21.3)
    Waiting time average: 1.6 (Std dev 2.6, maximum 12.0)
    Waiting time average: 1.3 (Std dev 1.6, maximum 7.5)
    Waiting time average: 1.5 (Std dev 2.1, maximum 11.2)
    AISLES: 5   ITEM TIME: 0.1
Removing one checkout aisle does not increase the average waiting time by 1/5 or anything like it; it roughly quadruples it. Moreover, the wait for the unluckiest customer (in these particular runs) rises from about 6 minutes to 21 minutes. If I were a manager, I would consider knowing about this threshold extremely important for customer satisfaction. Who would have known that in advance?
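Readers without SimPy installed can explore the same effect with a small stand-alone approximation in Python 3. The sketch below is not the article's program: the function simulate_day is hypothetical, the defaults of 20 items and 1500 customers are assumed values, and, unlike the SimPy run, it lets every customer who arrives before closing finish waiting. It models a single first-come, first-served line feeding whichever aisle frees up first.

```python
import heapq
import random

def simulate_day(aisles, seed, avg_items=20, item_time=0.1,
                 avg_cust=1500, closing=60 * 12):
    """Return (mean wait, longest wait) in minutes for one simulated day."""
    rng = random.Random(seed)
    free_at = [0.0] * aisles     # when each aisle next becomes free
    heapq.heapify(free_at)
    waits = []
    t = 0.0                      # arrival time of the current customer
    while t < closing:
        items = 1 + int(rng.expovariate(1.0 / avg_items))
        gap = rng.expovariate(float(avg_cust) / closing)
        # The customer takes the aisle that frees up earliest
        start = max(t, heapq.heappop(free_at))
        waits.append(start - t)
        heapq.heappush(free_at, start + items * item_time)
        t += gap
    return sum(waits) / len(waits), max(waits)

# The same seed gives both scenarios identical customers and arrival times
mean6, worst6 = simulate_day(aisles=6, seed=1)
mean5, worst5 = simulate_day(aisles=5, seed=1)
print("6 aisles: mean wait %.1f, longest wait %.1f" % (mean6, worst6))
print("5 aisles: mean wait %.1f, longest wait %.1f" % (mean5, worst5))
```

Because both runs see exactly the same stream of customers, any difference in the waits comes from the missing aisle alone. Trying several seeds, or changing item_time instead of aisles, is a quick way to compare the trade-offs discussed above.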