Abstract: This article introduces Sikuli, a graphical programming technology released by a research team at MIT. Built on image retrieval, Sikuli provides a Jython-based scripting language and an integrated development environment, so that users can refer to GUI elements directly by their screenshots when programming and automating interactions. The article uses practical application analysis and program examples to illustrate how Sikuli can be applied.
Introduction
In GUI testing, identifying custom controls, simulating user behavior, and verifying what is actually displayed on screen often become the bottlenecks of test automation, and in most cases such scenarios still rely on manual testing. This article introduces Sikuli, a new graphical programming technology that removes the dependence on control APIs: it obtains the objects to operate on through real-time image retrieval on the current screen, simulates user behavior, and matches screen regions to verify the actual visual result. Practical application analysis and program examples from GUI test automation are used to explain how it can be applied.
What is Sikuli?
Sikuli is a new graphical programming technology released by a research team at MIT. Built on image retrieval technology, it provides a Jython-based scripting language and an integrated development environment, and lets users refer to GUI elements directly by their screenshots when programming and automating interactions. The name Sikuli comes from the language of the Huichol Indians of Mexico and means "God's eye". As Zhang Yuxiang, one of its developers, put it, Sikuli allows computers to "see" the "real world" as humans do.
Installation and IDE use of Sikuli
Currently, the latest version of Sikuli is Sikuli X-1.0rc2. The download area of the official website provides installation files and installation instructions for Mac OS X, Windows, and Linux.
Note that on Windows a Java 6 runtime environment is required. To use Sikuli Guide, which was added in version 1.0rc2, update Java to the latest version. On Linux, in addition to the Java 6 runtime environment, you also need to install wmctrl and the OpenCV 2.0 packages libcv4, libcvaux4, and libhighgui4.
Sikuli provides a simple script development environment. Its default interface consists of a menu bar, toolbar, sidebar, editing area, console, and status bar, as shown in Figure 1.
Figure 1. Layout of the Sikuli IDE interface
The toolbar provides five commonly used buttons, arranged in two groups, and a text search box:
- Take screenshot: click this button to enter screen capture mode. Drag the crosshair guides to select the interface element to capture, then release the left mouse button; the captured image is inserted automatically at the current cursor position in the editing area. You can also enter capture mode with the shortcut key Ctrl + Shift + 2 (Command + Shift + 2), which makes it possible to capture controls such as pop-up menus and drop-down lists while they are displayed. The shortcut key can be customized through the File -> Preferences menu.
- Insert image: click this button to import an existing PNG image file.
- Create region: click this button to enter screen region selection mode. Drag the crosshair to select a screen region, then release the left mouse button to insert the screen coordinates of the selected region into the editing area.
- Run: click this button to execute the current script. The shortcut key is Ctrl + R (Command + R).
- Run in slow motion: click this button to execute the script in slow motion, which makes it easier to follow the focus while debugging. The shortcut key is Ctrl + Alt + R (Command + Alt + R).
Some common functions are listed in the sidebar on the left. Click a function name to insert it quickly into the editing area; if the function takes an image as a parameter, the IDE switches to screen capture mode automatically. The status bar at the bottom shows the current line number and the column number, that is, the tab indentation level at the start of the line.
Sikuli script
Sikuli scripts follow Python syntax. Sikuli provides a variety of custom classes and methods; for details, see the documentation on the official website. Because Sikuli is based on Jython and its core code is written in Java, it can also be referenced as a standard Java class library in your own Java projects, and the official website provides Javadoc for reference.
To get a quick look at what makes Sikuli scripts distinctive, we start with a simple example that automatically opens the Firefox browser and logs in to Gmail.
Figure 2. Automatically log on to Gmail
The Sikuli script shown in Figure 2 clicks to expand the Start menu and then clicks the Firefox icon to start the browser. Once the Firefox toolbar appears, the script uses the toolbar position as a reference point and offsets 300 to the right to locate the address bar, places the cursor there, and types the Gmail URL. On the login page, it clicks the user name input box and types the user name, presses the Tab key to move the focus to the password input box, types the password, and clicks Sign in to complete the login.
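The flow described for Figure 2 can be sketched roughly as follows. This is only an illustration written in plain Python against a stand-in `screen` object so that it can be dry-run without Sikuli; in the Sikuli IDE, `click`, `wait`, and `type` are global functions that take captured screenshots as arguments. The `DryRunScreen` helper, all image file names, the `offset` argument, the URL, and the credentials are assumptions made for this sketch and do not come from the original script.

```python
class DryRunScreen:
    """Hypothetical stand-in for Sikuli's screen functions: records calls only."""

    def __init__(self):
        self.log = []

    def click(self, image, offset=(0, 0)):
        self.log.append(("click", image, offset))

    def wait(self, image):
        self.log.append(("wait", image))

    def type(self, text):
        self.log.append(("type", text))


def login_gmail(screen, url, user, password):
    """Replay the steps described for Figure 2 on the given screen object."""
    screen.click("start_menu.png")          # expand the Start menu
    screen.click("firefox_icon.png")        # start the browser
    screen.wait("firefox_toolbar.png")      # wait until the toolbar appears
    # The address bar is located 300 to the right of the toolbar image.
    screen.click("firefox_toolbar.png", offset=(300, 0))
    screen.type(url + "\n")                 # enter the URL and confirm
    screen.wait("username_box.png")         # wait for the login page
    screen.click("username_box.png")
    screen.type(user)
    screen.type("\t")                       # Tab moves focus to the password box
    screen.type(password)
    screen.click("sign_in_button.png")


# Placeholder values, chosen only for the dry run.
screen = DryRunScreen()
login_gmail(screen, "https://mail.google.com/", "user@example.com", "secret")
for step in screen.log:
    print(step)
```

Passing the screen object in as a parameter is a choice made for this sketch so that the sequence of actions can be inspected; a real Sikuli script would call the global functions directly with screenshots captured in the IDE.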
This example makes Sikuli's most distinctive feature easy to see: screenshots of GUI objects are used directly as function arguments, so the meaning of the whole script is clear and readable. During execution, an image retrieval algorithm analyzes the current screen, matches the corresponding controls, and drives the mouse or keyboard to operate on them. As a result, you do not need to deal with tedious application-specific APIs or obtain web content objects when writing scripts.
An edited Sikuli script can be saved through File -> Save. On Windows it is saved as a .sikuli folder that contains all the PNG images used in the script, the script source code, and an HTML file that displays the source code.
You can also use File -> Export executable to generate an executable file with the .skl suffix. Such a file can be run conveniently from the command line or by double-clicking it.
Application Instance of Sikuli
The emergence of Sikuli offers a new approach to automated GUI testing. Conventionally, automated GUI tests identify and obtain GUI objects through their APIs and then operate on them, while verification of what is displayed is done by specifying absolute screen coordinates and comparing the actual pixels at that position. The former is accurate but relatively complex: it requires understanding how the GUI is implemented in code, it depends on how open the APIs are, and the objects to be verified cannot always be obtained. The latter places strict requirements on the absolute position of GUI elements and has no tolerance for displacement, so a slight change in an object's location may seriously affect the verification result. In real applications, changes to the size and position of GUI objects and rearrangement of the UI are common, which reduces the stability and reliability of this kind of verification. Sikuli's way of working fits both scenarios and greatly simplifies operation and verification. The following instances demonstrate how to use Sikuli in some typical cases to complete GUI automation quickly.
Instance 1: Verifying the cell comment display on hover in Excel
In conventional automated GUI testing, this verification requires code to locate controls, simulate mouse events, capture objects, and judge the displayed result, which is not easy to implement. With Sikuli, the short script in Figure 3 is enough to complete the task.
Figure 3. Display and verify cell comments
In the script segment shown in Figure 3, lines 15 to 19 open Excel and create the cell comment. Triggering and verifying the display of the comment takes only lines 22 to 24. The hover() method matches the region given by its parameter against the current screen to obtain its position, moves the cursor to the center of that rectangle, and thereby triggers the comment display. Thanks to Sikuli's "visual" capability, calling the verifyresult() method is all that is needed to check whether the comment is displayed correctly. As its implementation in Figure 4 shows, this method calls exists() to determine whether the expected comment is displayed on the current screen.
Figure 4. Implementation of the verifyresult() method
Similar applications include triggering and verifying tooltip information for controls and verifying hover effects in web applications.
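As a rough illustration of the hover-then-verify idea in instance 1, the sketch below pairs a `verify_result` helper with a stand-in screen. It assumes, as in Sikuli, that `exists()` returns a match when the image is found and `None` otherwise. The `FakeExcelScreen` class, the image names, and the "PASS"/"FAIL" return values are hypothetical; the original verifyresult() implementation is not reproduced here.

```python
class FakeExcelScreen:
    """Hypothetical stand-in: the comment becomes visible only after hovering the cell."""

    def __init__(self):
        self.visible = {"commented_cell.png"}

    def hover(self, image):
        if image not in self.visible:
            raise LookupError("image not found on screen: " + image)
        if image == "commented_cell.png":
            self.visible.add("cell_comment.png")   # hovering pops up the comment

    def exists(self, image):
        # Mirrors the Sikuli convention: a match if found, otherwise None.
        return image if image in self.visible else None


def verify_result(screen, expected_image):
    """Report whether the expected image is currently displayed."""
    return "PASS" if screen.exists(expected_image) else "FAIL"


screen = FakeExcelScreen()
before = verify_result(screen, "cell_comment.png")   # comment not shown yet
screen.hover("commented_cell.png")                   # trigger the comment display
after = verify_result(screen, "cell_comment.png")
print(before, after)
```

The point of the sketch is that verification reduces to a single visual check once the hover has been performed; no control handles or pixel coordinates are involved.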
Instance 2: Multi-object selection on a web page
In this instance, multiple objects on a web page are selected both at intervals and in batches. The script segment is shown in Figure 5.
Figure 5. Multi-object selection on the web page
The script creates an array for the 12 numbered objects on the page and holds the expected selection results in result_list. The openweb() method opens the browser automatically at run time and navigates to the specified page. The selectobjs() and selectrange() methods are then called to perform three different kinds of multiple selection. The implementation of these custom methods is shown in Figure 6.
Figure 6. Implementation of custom methods
The selectobjs() method calls Sikuli's click() method, passing in the list of objects and a key modifier defined by Sikuli, so that the objects are clicked one by one with the Ctrl key held down to complete the multiple selection.
The selectrange() method implements range selection. The start and end positions of the selection are specified with the obj_from and obj_to parameters; alternatively, only obj_from is given as the start position, and the horizontal and vertical offsets x and y define the selection area. The method then calls dragDrop() with the start and end positions to complete the range selection.
The verification method is the same as described in instance 1.
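The two selection helpers described in instance 2 might look roughly like the sketch below. It is written against a stand-in screen so that it runs without Sikuli. The `RecordingScreen` class, the image names, and the coordinates are assumptions, and the `CTRL` constant merely stands in for Sikuli's `KeyModifier.CTRL`; the original implementations in Figure 6 are not reproduced here.

```python
CTRL = "CTRL"  # stands in for Sikuli's KeyModifier.CTRL


class RecordingScreen:
    """Hypothetical stand-in: looks up image positions and records actions."""

    def __init__(self, positions):
        self.positions = positions   # image name -> (x, y) center on screen
        self.log = []

    def find(self, image):
        return self.positions[image]

    def click(self, image, modifiers=None):
        self.log.append(("click", image, modifiers))

    def dragDrop(self, start, end):
        self.log.append(("dragDrop", start, end))


def select_objs(screen, objs):
    """Click each object with Ctrl held down to build a multiple selection."""
    for obj in objs:
        screen.click(obj, CTRL)


def select_range(screen, obj_from, obj_to=None, x=0, y=0):
    """Drag from obj_from to obj_to, or to an (x, y) offset from obj_from."""
    start = screen.find(obj_from)
    if obj_to is not None:
        end = screen.find(obj_to)
    else:
        end = (start[0] + x, start[1] + y)
    screen.dragDrop(start, end)


screen = RecordingScreen({"obj_1.png": (100, 100), "obj_8.png": (400, 200)})
select_objs(screen, ["obj_1.png", "obj_8.png"])   # selection at intervals
select_range(screen, "obj_1.png", "obj_8.png")    # range between two objects
select_range(screen, "obj_1.png", x=150, y=50)    # range given by an offset
for entry in screen.log:
    print(entry)
```

Resolving both range variants to a start point and an end point before calling `dragDrop` keeps the drag itself identical in the two cases.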
Instance 3: Dragging and dropping objects on a web page
In this instance, Sikuli's dragDrop() method is used to drag and drop objects. The code in Figure 7 moves the specified images to the trash area by dragging.
Figure 7. Dragging an image to the trash area
The script defines the images to be moved to the trash in the droppable array, uses the title bar of the trash area as the search target, obtains the corresponding match object with the find() method, and passes it to the moveTo() method as the target region parameter. The moveTo() method searches the current screen for all objects that match the image passed in through the obj parameter and drags each of them to the region specified by des. Its implementation is shown in Figure 8.
Figure 8. Implementation of the moveTo() method
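One plausible shape for such a method is a loop that keeps looking for the image and drags each match to the target until none is left. The sketch below illustrates that loop against a stand-in page; it is not the original implementation from Figure 8. The `FakePage` class, the precomputed similarity scores, the thumbnail and image names, and the `max_moves` guard are all assumptions made for illustration.

```python
class FakePage:
    """Hypothetical stand-in for the web page shown on screen."""

    def __init__(self, scores):
        # scores: {target image: {thumbnail name: similarity to that target}}
        self.scores = scores
        self.trash = []

    def exists(self, image, similarity=0.7):
        """Return the first thumbnail that matches image well enough, else None."""
        for name, score in self.scores.get(image, {}).items():
            if score >= similarity:
                return name
        return None

    def dragDrop(self, match, des):
        """Take the matched thumbnail off the page and record it in the trash."""
        for thumbnails in self.scores.values():
            thumbnails.pop(match, None)
        self.trash.append((match, des))


def move_to(page, obj, des, similarity=0.7, max_moves=20):
    """Drag every on-screen match of obj to des; return how many were moved."""
    moved = 0
    while moved < max_moves:              # guard against looping forever
        match = page.exists(obj, similarity)
        if match is None:
            break
        page.dragDrop(match, des)
        moved += 1
    return moved


page = FakePage({"flower.png": {"flower_thumb_1": 0.98,
                                "flower_thumb_2": 0.95,
                                "car_thumb": 0.30}})
moved = move_to(page, "flower.png", "trash_title.png")
print(moved, page.trash)
```

Searching again after every drag, rather than collecting all matches first, reflects the fact that the page layout may shift once a thumbnail has been removed.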
In this method, exists() is used to search for the image thumbnails, and the search depends on an image similarity value in the range 0 to 1. In Sikuli the default similarity is 0.7. To distinguish correctly between images with similar content, the similar() method can be used to raise the required similarity so that the search does not match other, similar-looking areas.
To check what the image matches under the current similarity setting, click the image in the script to open the Pattern Settings dialog box. In this instance, clicking the second image in droppable on line 25 opens the dialog box shown in Figure 9. In the area marked (1), two images are highlighted, one in red and one in purple, which indicates that both are recognized as matches for the search target under the current similarity. The closer the highlight color is to red, the higher the similarity between that area and the target image; the closer it is to purple, the lower the similarity. Drag the similarity slider marked (2) to change the setting and watch the number and color of the matched areas change in the preview area. Choose a similarity setting on this basis so that the program locates the target region precisely and uniquely.
Figure 9. Pattern Settings dialog box
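The effect of the similarity setting can be illustrated with a toy matcher in plain Python. The sketch slides a small grayscale template over a tiny "screen" and keeps every position whose score reaches the threshold. The scoring formula used here (1 minus the mean absolute pixel difference) is only a stand-in chosen for simplicity; this article does not describe how Sikuli computes its score, and the pixel values are made up. In a Sikuli script the threshold itself would be set with something like `Pattern("image.png").similar(0.9)`.

```python
def similarity(screen, template, top, left):
    """Score in [0, 1]: 1 minus the mean absolute difference of 0-255 gray values."""
    total = 0
    count = 0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            total += abs(screen[top + r][left + c] - value)
            count += 1
    return 1 - total / (255 * count)


def find_all(screen, template, min_similarity=0.7):
    """Return (top, left, score) for every position scoring at least min_similarity."""
    th, tw = len(template), len(template[0])
    matches = []
    for top in range(len(screen) - th + 1):
        for left in range(len(screen[0]) - tw + 1):
            score = similarity(screen, template, top, left)
            if score >= min_similarity:
                matches.append((top, left, round(score, 2)))
    return matches


template = [[255, 0],
            [0, 255]]
# Columns 0-1 hold an exact copy of the template; columns 3-4 hold a look-alike.
screen = [[255, 0, 100, 255, 60],
          [0, 255, 100, 60, 255]]

default_matches = find_all(screen, template)        # default threshold of 0.7
strict_matches = find_all(screen, template, 0.9)    # raised threshold
print(default_matches)
print(strict_matches)
```

With the default threshold both the exact copy and the look-alike are reported; raising the threshold to 0.9 leaves only the exact copy, which is the same narrowing effect as moving the similarity slider in the dialog box.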
Advantages and limitations of Sikuli
Sikuli gives scripts a human point of view: the computer no longer just captures interface data and return values in the background, but "sees" the real GUI. Users reference the target GUI element directly in the script to obtain the object and specify operations on it, which is simple and efficient. GUI automation is freed from its dependence on the application's internal implementation: there is no need to obtain APIs or understand the internal code of the GUI, and the differences between standard and non-standard controls in how they are obtained and operated no longer matter. Sikuli is applicable to any application running on an operating system with a graphical user interface; wherever a GUI is displayed, it can be located and operated. Because images are searched and located in real time, controls are not mislocated or missed when their position changes through displacement or UI rearrangement. Python-compatible syntax on top of a Java core gives Sikuli strong extensibility, and being open source gives it more room and opportunity to develop. Sikuli scripts written with GUI elements are also highly readable, as the instances above show: a short script reads almost like natural language. This greatly narrows the gap between manual test cases and automated test scripts and makes automatic conversion and integration between the two possible.
In addition, programming with GUI elements lets users get started with only basic programming knowledge and easily create automation scripts that perform all kinds of GUI operations. This greatly lowers the barrier to programming and allows more people to create customized desktop applications of their own.
However, in practice we also found that Sikuli still has some limitations at its current stage:
- Because it depends on screenshots, a separate set of image files has to be maintained for each operating system, each browser, and even each display resolution, which hampers its cross-platform capability.
- Because retrieval depends on what the desktop displays in real time, unexpected occlusion of the interface or a focus switch outside the program logic (such as a pop-up window) will affect execution.
- The IDE is at an early stage of development and has some stability and usability problems. It supports only basic code editing, which is inconvenient for developing and debugging larger amounts of code. It also runs slightly less stably on Windows and Linux than on Mac OS X.
Therefore, it is still difficult to complete programs of any significant size with Sikuli alone. As an effective complement to existing automated testing tools, however, Sikuli can play to its strengths and make daily work more convenient.
Summary
This article introduced the graphical programming technology Sikuli to give testers a first understanding of the features and usage of Sikuli scripts. It showed through examples how to use Sikuli to write automation scripts for GUI interaction and verification, and analyzed the advantages and limitations of Sikuli in current use.