Similarity Detection for Python Programs Based on SIM

Source: Internet
Author: User

It has been nearly a month since I started studying SIM, and the effort has gone into adding Python support to it. Looking back, the amount of code I actually had to write myself was very small; most of the time was spent getting familiar with the HUSTOJ code.

In HUSTOJ, similarity checking is done by judge_client calling SIM. The comparison corpus is the accepted (AC) code that the server stores for each problem; for example, the AC code for problem 1000 is stored in the Data/1000/ac folder. To inspect the results that SIM produces, set debug to 1 in judge_client.cc and judged.cc, recompile both files, copy the resulting executables to the /usr/bin folder, and restart the judged process. SIM's output is then saved in the file named sim in /home/judge/run0. You can also edit the sim.sh script under /fps/core/sim so that the results are written in a different form to a location of your choosing.

Lexical analysis for Python is actually quite simple: only the pythonlang.l file needs to be rewritten. The first step is to update the set of recognized reserved words, because SIM's similarity rules give reserved words more weight than ordinary identifiers. The second step is to recognize material that should be ignored, such as comments, spaces, line breaks, and tabs. Python comments come in two kinds: single-line comments are matched by singlelinecom ("#".*), and block comments can be handled by rewriting Multilinecom so that it matches text enclosed in triple quotes. Finally, put pythonlang.l into the corresponding SIM folder, modify the Makefile so that it builds a sim_py executable for Python, copy sim_py to /usr/bin, and verify it. During initial debugging, the result can be checked from the command line with ./sim_py -p 1.py 2.py | grep ^1.py | awk '{print $4}'. Note that HUSTOJ supports similarity checking for only 5 languages by default, so adding a new language requires deleting the condition &&lang<5 at line 2268 of judge_client.cc.
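
As a rough illustration of these two steps, the Python sketch below mimics what the lex rules do: it discards comments and whitespace, keeps reserved words as themselves, and collapses every other identifier into one generic token. The regular expressions, the normalize helper, and the ID placeholder are assumptions made for this illustration; they are not taken from pythonlang.l or from the SIM source, and the sketch ignores corner cases such as a # inside a string literal.

```python
import keyword
import re

# Assumed Python counterparts of the two comment rules described above.
SINGLE_LINE_COMMENT = re.compile(r"#.*")
MULTI_LINE_COMMENT = re.compile(r"'''.*?'''", re.DOTALL)

# Words, integers, or any single non-space symbol.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(source):
    """Return a token list with comments removed and identifiers collapsed."""
    # Strip triple-quoted blocks first so a '#' inside one cannot
    # swallow its closing quotes.
    source = MULTI_LINE_COMMENT.sub("", source)
    source = SINGLE_LINE_COMMENT.sub("", source)
    tokens = []
    for tok in TOKEN.findall(source):
        if keyword.iskeyword(tok):
            tokens.append(tok)   # reserved words stay distinct
        elif tok[0].isalpha() or tok[0] == "_":
            tokens.append("ID")  # every other name looks the same
        else:
            tokens.append(tok)   # numbers and punctuation
    return tokens


first = "'''\nLdf:a+b\n'''\nfor line in sys.stdin:  # read input\n    b = line.split()\n"
second = "for line in sys.stdin:\n    a = line.split()\n"
print(normalize(first) == normalize(second))
```

The two snippets differ only in a comment block, a trailing comment, and a variable name, so their normalized token lists are equal and the script prints True.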

The two programs used for the comparison are as follows:

File 1:

###################
'''
Ldf:a+b
'''
import sys
for line in sys.stdin:
    b = line.split()
    print(int(b[0]) + int(b[1]))

File 2:

import sys
for line in sys.stdin:
    a = line.split()
    print(int(a[0]) + int(a[1]))


The result shows that the similarity between file 1 and file 2 is 100%.
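
For scripting this check, the grep and awk pipeline from the command line can be mirrored in Python. This is a minimal sketch that assumes each line of the sim_py -p output has the form "1.py consists for 100 % of 2.py material", so that the percentage is the fourth whitespace-separated field. The sample text and the helper name percentages_for are made up for illustration and were not captured from a real run.

```python
def percentages_for(sim_output, filename):
    """Mimic `grep ^filename | awk '{print $4}'` on SIM's -p output."""
    values = []
    for line in sim_output.splitlines():
        fields = line.split()
        # grep ^filename: keep lines that start with the file name
        if line.startswith(filename) and len(fields) >= 4:
            values.append(fields[3])  # awk's $4 is the fourth field
    return values


# Assumed sample output, not captured from a real run.
sample = (
    "1.py consists for 100 % of 2.py material\n"
    "2.py consists for 100 % of 1.py material\n"
)
print(percentages_for(sample, "1.py"))
```

With the assumed sample, only the line that begins with 1.py is kept, and its fourth field is the percentage.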
