Online code compiler (I)-editing and compilation


This article is published by the Netease cloud community with the authorization of its author, Yao Taihang.


Online Compiler

An online code compiler is a tool for writing and running code online. It provides the functions needed on the way from writing code to starting and running it: online editing, code prompts, code diagnosis, compilation, and execution. In effect it implements the core functions of an IDE, and it has a wide range of applications. There are roughly two types of application scenarios:

General scenarios
  • Function basis: relies only on the syntax features of the development language and common native libraries.

  • Content Description: In this scenario, extreme operation types such as read/write and external requests are supported to a high degree. The code runtime environment usually uses a sandbox to meet security requirements.

  • Application Scope: The main business areas include online code-assisted editing tools (tools, etc.), online examination platforms (Niuke network, etc.), and algorithm competition and problem-practice platforms (LeetCode, etc.).

Special scenarios
  • Function basis: relies mainly on the many tool APIs provided by the platform, and uses only the necessary common native libraries.

  • Content Description: In this scenario, the code written by the user is restricted to the boundaries specified by the platform, and its style, format, and structure must also follow the platform's specifications. In addition to basic syntax detection, the compiler inspects in detail the content and methods used in the code, and strictly restricts sensitive operations involving I/O, read/write, and network requests. Because the code has to use the APIs provided by the platform itself, a simple sandbox can no longer meet the need; the runtime environment requires security protection tailored to the characteristics of the business.

  • Application Scope: The service scope is bounded by the platform and depends on the purpose of the tool APIs the platform provides. In quantitative trading, for example, most quantification platforms provide online compilation of policy code in Python and Java, together with related APIs that users call to develop quantitative policies.

Because general scenarios are common, there are already many mature examples of how to develop and set them up, so this article does not discuss them. For special scenarios, this article uses the Java online compiler of the Netease precious metal quantification platform as a case to describe how the online compilation part can be implemented.

Case Study

The core of the Netease precious metal quantification platform is built on the principles of an online compiler. It currently provides the function of developing quantitative policies for precious metal transactions. The following sections use this platform as the case and describe it alongside the theoretical summary. To make the later discussion easier, here is a brief description of the system:

  • Core Business Description: users turn their own market investment experience into a strategy (called a policy below). In back-test or live-trading mode, the platform uses historical or real-time quotations to simulate transactions according to the policy, over a given period or in real time, and outputs the profit and loss of the policy's transactions. Users can then verify and optimize the strategy and accumulate investment experience.

  • Policy: the logic that determines the conditions for triggering a transaction. The conditions are based on time and product quotations, and may also draw on machine learning results, trained model results, and economic indicators. On the quantification platform, a policy is a piece of Java (or other language) code that performs logical judgment and transaction operations by calling the interfaces provided by the platform.

  • Policy output: the direct output of a policy is the transaction signals and transaction records. These are used to calculate, for a given period, the total profit and loss of the policy, the maximum drawdown, the Sharpe ratio, and other common profit and loss evaluation indicators.
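As an illustration of how one of these indicators can be derived from the recorded results, the following minimal sketch computes the maximum drawdown, assuming it is defined as the largest peak-to-trough decline of the account value relative to the peak. The class name `DrawdownSketch` and the input format (a list of account values, oldest first) are assumptions made for this example and are not taken from the platform.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: computes the maximum drawdown of an equity curve. */
final class DrawdownSketch {

    /**
     * @param equity account value after each simulated transaction, oldest first
     * @return the largest peak-to-trough decline as a fraction of the peak (0.0 if there is none)
     */
    static double maxDrawdown(List<Double> equity) {
        double peak = Double.NEGATIVE_INFINITY; // highest value seen so far
        double worst = 0.0;                     // largest relative decline seen so far
        for (double value : equity) {
            peak = Math.max(peak, value);
            if (peak > 0) {
                worst = Math.max(worst, (peak - value) / peak);
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        // Peak 120 followed by a trough of 80 gives a drawdown of one third.
        System.out.println(maxDrawdown(Arrays.asList(100.0, 120.0, 90.0, 110.0, 80.0)));
    }
}
```

A real platform would likely compute such indicators from its own transaction records rather than from a plain list.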

The process is as follows:

  • Policy writing

  • Platform transaction Simulation

  • Transaction Result Statistics

In short: users write policies, the platform simulates the transactions, and the results are calculated.

Online Editing and compilation

A complete online compilation process starts with the code written by the user (although code may come from other sources as well) and runs from code construction (writing or assembling) through compilation and execution to the final output, which produces the expected effect. The process includes:

  • Code Construction

  • Syntax Detection

  • Code Diagnosis

  • Code compilation

  • Code running

  • Content feedback

Code Construction

Code construction involves the language type, code structure, and the final code generation method.

Language type

Before building an online compiler platform, you must decide which language types the platform supports. The language type affects the following:

  • Compilation Method: languages fall into the following three types:

    • Interpreted: a program written in an interpreted language, such as JavaScript, is executed by its interpreter and involves no separate compilation step. Such code can be executed dynamically, without a tedious compilation process in a back-end program. When designing the platform architecture, you can place the code processing in the upper layer of the platform (such as the browser itself) if that suits your needs and feed results back directly, instead of pushing request processing down to the lower layers and complicating the logic.

    • Compiled: compiled languages, such as C and C++, are usually relatively powerful and relatively low-level. The code must first be compiled into a machine-code file for the target program, which can then be run many times on the computer without the source code. For user code in such a language, the code the user submits must be handed to a server or other designated computer for processing; the program is then run there and its result fed back.

    • Hybrid: unlike compiled languages, hybrid languages such as Java and Python produce bytecode files rather than machine code when compiled. Bytecode files can also be loaded and run many times, but only inside a special runtime environment, because the computer cannot execute them directly. User code in such languages must likewise be processed by a server or similar computer, and specifically by one that provides the required runtime environment.

  • Code style: this mainly determines whether the language has special formatting requirements, so that the prompt process can be optimized and later code detection made easier. For example, Python depends heavily on indentation, so code prompts and the editing experience need special handling.

  • Code prompt: code prompts can only be settled once the language type is confirmed. Browser-based front-end online editing frameworks generally ship ready-made prompts for the native APIs of some languages. If you also need to prompt users with additional APIs developed by the platform, you must arrange that additional content in the format the prompt mechanism requires and import it as a supplement.

Code structure

Generally, there are no special requirements on the structure of user code; that is, the editor behaves like a general IDE. In special scenarios, however, the purpose of the code is relatively clear and its content is predictable, so a fixed code structure can be provided before the user starts writing in order to limit what the user code contains and how it is composed. In the later code detection phase, a preliminary plausibility check can also be made against this fixed format.

Taking Java as an example, the fixed structure includes:

  • Disable the specified Package Structure

  • Prohibited class Import

  • Parent class that must be inherited

  • Required interfaces

  • Class uniqueness

  • Required Methods

  • Hint comments at fixed locations in the code

Generation Method

The route by which users produce code differs at the interaction level depending on the writing methods the platform supports, but the end result is the same: reasonable code is generated.

On a quantification platform, user code implements a strategy that computes over and learns from past data in order to make future decisions. Taking quantification platforms with distinctive features on the market as examples, the generation methods include:

  • Original code editing Method

(Sample image source: Netease precious metal quantification platform.) With this method, even with the help of code prompts and explanatory comments, users face a relatively high barrier when constructing code. For experienced programmers, on the other hand, it offers a high degree of freedom.

  • Componentization Mode

(Sample image source: bigquant.) In this method, the predictable code content is turned into components. The user selects the required components, and the platform splices the code together based on the selection. This greatly lowers the barrier to coding, is very friendly to users who have domain expertise but are not computer specialists, and goes a long way toward guaranteeing that the code is reasonable.

  • Visual component creation method

(Sample image source: bigquant.) This method is a higher-level packaging of componentization. It lowers the barrier to writing code once again, and it is remarkably effective at expressing the logic of the code.

In fact, depending on the needs of the business field, many other user-friendly generation methods are possible. Of the three methods above, the latter two are friendlier and more accessible, but they may reduce the degree of freedom.

If the code content is predictable and the structure is relatively fixed, it is recommended that, conditions permitting, you provide other generation methods built on the component idea in addition to original code editing. Doing so not only improves the user experience and greatly lowers the barrier to use, but also effectively reduces the likelihood of syntax errors and unreasonable logic in user code.

Case Study

Based on the summary of this part of code building, the relevant sections in the case are as follows:

  • Language type: Java 8

  • Code structure: when a user edits policy code, the platform supplies a template in advance and provides code prompts for all related APIs. The template contains the interface that must be implemented and the methods that must be included, and it carries hint comments at the fixed points of the flow. What the user writes is not restricted during editing; code detection currently does not run while the user is editing.

  • Generation Method: currently, only original code editing is provided. Componentization is planned for the future.

Code Detection

User code detection checks the content and syntax of user code before diagnosis and execution: it verifies that the syntax is valid and that the code has the structure expected from the construction phase, to judge whether the user code is reasonable. If the platform is designed so that user code is only part of the final code and the common part is assembled by the system, the assembly can also be completed during this detection process.

Code assembly

Code assembly adds the remaining common parts to the part edited by the user. This reduces the amount of code the user has to write and, to some extent, keeps unexpected content out of the user code.

Taking Java as an example, the assembled content can include:

  • Package path: restrict the path of the final generated class

  • Import class: limits the scope of classes that can be used by users.

  • Annotation: You can append other behaviors to user code at the class or method granularity.

  • General methods: append the common methods that must be called at run time (usually placed in an abstract class)

Combine the assembled content with the user's edited content into one complete code file that can be detected.
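The assembly step can be sketched as plain string concatenation. The following is only an illustration under assumptions: the class `SourceAssembler`, its methods, and the package and import names used in the usage note are hypothetical and not the platform's actual code. The sketch also records how many header lines were prepended, so that line numbers reported later for the assembled file can be mapped back to the lines the user sees in the editor.

```java
import java.util.List;

/** Sketch of source assembly; all names here are illustrative assumptions. */
final class SourceAssembler {

    /** Result of assembly: the full source plus the number of lines placed before the user's code. */
    static final class Assembled {
        final String source;
        final int headerLines;

        Assembled(String source, int headerLines) {
            this.source = source;
            this.headerLines = headerLines;
        }
    }

    /** Prepends a package declaration and a fixed import list to the user's code. */
    static Assembled assemble(String packageName, List<String> imports, String userCode) {
        StringBuilder header = new StringBuilder();
        header.append("package ").append(packageName).append(";\n");
        for (String imported : imports) {
            header.append("import ").append(imported).append(";\n");
        }
        int headerLines = 1 + imports.size(); // one package line plus one line per import
        return new Assembled(header.toString() + userCode, headerLines);
    }

    /** Maps a line number in the assembled file back to the line in the user's editor. */
    static long toUserLine(long assembledLine, int headerLines) {
        return assembledLine - headerLines;
    }
}
```

For example, `SourceAssembler.assemble("strategy.u42", Arrays.asList("java.util.List"), userCode)` would yield a source whose first two lines are the package declaration and the import, so a problem reported on line 5 of the assembled file corresponds to line 3 of the user's code.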

Syntax Detection

Simple syntax detection can be done by inspecting the file directly, or by running the diagnostic process and seeing whether the file's syntax is accepted. More complex architectures need a parser, as described in compiler theory, to build an abstract syntax tree for detailed analysis.

Structure Detection

Check the code against the structure defined in the code construction phase, including:

  • File Path (package path)

  • Validity of class names

  • Whether the required parent class is inherited and the required interfaces are implemented

  • Whether required parameters are included

  • Whether required methods are included

  • Whether it meets other necessary fixed structures

After the checks above, we can basically tell whether a piece of user code qualifies for compilation and diagnosis.

Case Study

Combined with this summary of code detection, the relevant sections in the case are as follows:

  • Code Assembly: on the Netease precious metal quantification platform, users only need to implement the interface and its three main methods, and the purpose of each method is described in detail in the template. Users do not need to write anything outside the class. The assembled content includes:

    • A package path that places the class source file in a uniform location, with a sub-path generated from the user information and a timestamp to prevent duplicate class names under the path

    • Imports for packages other than java.lang, limited to the parts actually needed: computation, data structures, and time processing

    • Imports for all APIs provided by the platform

  • Syntax check: performed directly by running the diagnostic process. An abstract syntax tree is not used at this stage.

  • Structure Detection: on the quantification platform, the user code consists of a single class and nothing else. The structure checks are:

    • The user code must not be empty.

    • The user code must not import classes itself.

    • The class must implement the policy template interface.

    • The class must be a complete implementation of the policy template.

    • The class keyword appears only once: there is exactly one class, with no inner classes.
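Some of these checks can be approximated at the text level before any compilation. The sketch below is a rough illustration, not the platform's implementation: the class `StructureCheck`, its messages, and its regular expressions are hypothetical, and the interface name `Strategy` is assumed from the template interface shown later in this article. Because it works on raw text, it also matches keywords inside comments, string literals, and class literals such as `Foo.class`; a stricter check would work on a syntax tree. Completeness of the implementation is left to the compiler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Text-level sketch of the structure checks; names and messages are illustrative. */
final class StructureCheck {
    private static final Pattern IMPORT = Pattern.compile("(?m)^\\s*import\\s");
    private static final Pattern CLASS_KEYWORD = Pattern.compile("\\bclass\\b");
    private static final Pattern IMPLEMENTS_STRATEGY =
            Pattern.compile("\\bclass\\s+\\w+\\s+implements\\s+Strategy\\b");

    /** Returns a list of problems; an empty list means the code passed these checks. */
    static List<String> check(String userCode) {
        List<String> problems = new ArrayList<>();
        if (userCode == null || userCode.trim().isEmpty()) {
            problems.add("code is empty");
            return problems;
        }
        if (IMPORT.matcher(userCode).find()) {
            problems.add("imports are not allowed");
        }
        // Count occurrences of the class keyword: exactly one is expected.
        int classCount = 0;
        Matcher matcher = CLASS_KEYWORD.matcher(userCode);
        while (matcher.find()) {
            classCount++;
        }
        if (classCount != 1) {
            problems.add("exactly one class is required, found " + classCount);
        }
        if (!IMPLEMENTS_STRATEGY.matcher(userCode).find()) {
            problems.add("class must implement Strategy");
        }
        return problems;
    }
}
```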

After this preliminary detection, code that passes the checks moves on to code diagnosis and code compilation, which further establish whether it is usable.

Code Diagnosis

Diagnosis is the process of checking, during compilation, whether the code can run and reporting where any problem is located. An IDE generally performs diagnosis as part of its build process. A diagnostic reports the type of the problem and the line number where it occurs, although not every diagnostic has a line number. Diagnosis covers:

  • Syntax validity: whether the statement is legal

  • File structure validity: whether the file content meets the basic requirements of a language

  • Call validity: whether other classes or methods involved in the file exist

Code that passes diagnosis can run in the current environment, but runtime errors are not checked at this stage. Code diagnosis is generally carried out together with compilation: diagnostic information is obtained by listening to the compilation process. In non-compiled languages, however, code diagnosis is performed as a separate step.
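For Java, listening to the compilation process can be sketched with the standard javax.tools API. The example below is a minimal sketch under assumptions: the class names `DiagnoseSketch` and `StringSource` are hypothetical, the source is held in a string rather than a file, and only error diagnostics are reported. It needs a JDK, because `ToolProvider.getSystemJavaCompiler()` returns null when no compiler is available.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

/** Sketch: compile a source string and return its error diagnostics. */
final class DiagnoseSketch {

    /** Source code held in a String instead of a file on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + Kind.SOURCE.extension),
                    Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Returns one message per compilation error; an empty list means the code compiled. */
    static List<String> diagnose(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler available (a JDK is required)");
        }
        DiagnosticCollector<JavaFileObject> collector = new DiagnosticCollector<>();
        Path outputDir;
        try {
            // Class files go to a temporary directory; this sketch does not clean it up.
            outputDir = Files.createTempDirectory("diagnose");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> options = Arrays.asList("-d", outputDir.toString());
        JavaCompiler.CompilationTask task = compiler.getTask(
                null, null, collector, options, null,
                Collections.singletonList(new StringSource(className, code)));
        task.call();

        List<String> messages = new ArrayList<>();
        for (Diagnostic<? extends JavaFileObject> d : collector.getDiagnostics()) {
            if (d.getKind() == Diagnostic.Kind.ERROR) {
                messages.add("line " + d.getLineNumber() + ": " + d.getMessage(null));
            }
        }
        return messages;
    }
}
```

Calling `DiagnoseSketch.diagnose("Bad", "class Bad { int x = ; }")` returns a non-empty list whose entries carry the line number and the compiler's message.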

The case is described together with code compilation below.

Code compilation

Code compilation uses the source files to generate lower-level target files. In a compiled language, the source is translated into content that the computer can execute directly, such as machine code. In a hybrid language, the source file is compiled into content that a runtime environment such as the JVM can recognize.

The code provided by the user is only part of the final code and, given the actual environment, may not be portable: it can only be compiled in a fixed environment. Because the user is not working in an IDE, the platform itself must provide the diagnosis, compilation, and loading of code; that is, platform developers must build these functions from the language's features and built-in tools.

The compilation principles and procedures are not described here.

Case Study

The language environment used by the Netease precious metal quantification platform is Java. For Java compilation, the javax.tools package of the JDK provides the key classes needed to compile Java source files into .class files. The relevant classes are as follows:

javax.tools.JavaCompiler:
/**
 * Interface to invoke Java&trade; programming language compilers from
 * programs.
 *
 * <p>The compiler might generate diagnostics during compilation (for
 * example, error messages).  If a diagnostic listener is provided,
 * the diagnostics will be supplied to the listener.  If no listener
 * is provided, the diagnostics will be formatted in an unspecified
 * format and written to the default output, which is {@code
 * System.err} unless otherwise specified.  Even if a diagnostic
 * listener is supplied, some diagnostics might not fit in a {@code
 * Diagnostic} and will be written to the default output.
 * ...
 */

The Java compilation tool. Diagnostic information is produced during compilation. You can compile directly with the run method, or use the getTask method to create a CompilationTask and then invoke its call method to execute the compilation.

javax.tools.JavaFileObject:
/**
 * File abstraction for tools operating on Java&trade; programming language
 * source and class files.
 *
 * <p>All methods in this interface might throw a SecurityException if
 * a security exception occurs.
 *
 * <p>Unless explicitly allowed, all methods in this interface might
 * throw a NullPointerException if given a {@code null} argument.
 * ...
 */

Abstraction of a Java source or class file; it is responsible for loading the source into memory.

javax.tools.JavaFileManager:
/**
 * File manager for tools operating on Java&trade; programming language
 * source and class files.  In this context, <em>file</em> means an
 * abstraction of regular files and other sources of data.
 * ...
 */

The Java source file management class, used to manage a set of JavaFileObject instances.

javax.tools.Diagnostic:
/**
 * Interface for diagnostics from tools.  A diagnostic usually reports
 * a problem at a specific position in a source file.  However, not
 * all diagnostics are associated with a position or a file.
 * ...
 */

Diagnostic information for a Java file.

javax.tools.DiagnosticListener:
/**
 * Interface for receiving diagnostics from tools.
 *
 * @param <S> the type of source objects used by diagnostics received
 * by this listener
 * ...
 */

The diagnostic information listener, triggered during compilation. When creating a compilation task (JavaCompiler.getTask()) or obtaining a file manager (JavaCompiler.getStandardFileManager()), pass a DiagnosticListener to collect the diagnostic information yourself; passing null falls back to the compiler's default way of reporting diagnostics.

Based on the above related classes, the call method is as follows:

    public static void compile(File srcFile, String targetClassPath) {
        // Obtain the system compiler (null when no compiler is available, e.g. on a plain JRE)
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        // Collect diagnostics instead of letting them be written to System.err
        DiagnosticCollector<JavaFileObject> diagnosticListener = new DiagnosticCollector<>();
        // Read the Java source file
        StandardJavaFileManager fileManager = compiler.getStandardFileManager(null, null, null);
        Iterable<? extends JavaFileObject> it = fileManager.getJavaFileObjects(srcFile);
        // Make sure the output directory for class files exists
        createClassPathIfNotExists(targetClassPath);
        // Compilation options: reuse the current class path and write output to targetClassPath
        List<String> options = new ArrayList<>();
        options.add("-classpath");
        StringBuilder sb = new StringBuilder();
        // This cast assumes the context class loader is a URLClassLoader, as is typical on Java 8;
        // on Java 9 and later the default application class loader is not a URLClassLoader.
        URLClassLoader urlClassLoader = (URLClassLoader) Thread.currentThread().getContextClassLoader();
        for (URL url : urlClassLoader.getURLs()) {
            sb.append(url.getFile().replace("%20", " ")).append(File.pathSeparator);
        }
        options.add(sb.toString());
        options.add("-d");
        options.add(targetClassPath);
        try {
            JavaCompiler.CompilationTask task =
                    compiler.getTask(null, fileManager, diagnosticListener, options, null, it);
            boolean success = task.call();
            if (!success) {
                StringBuilder errorMsg = new StringBuilder();
                for (Diagnostic<? extends JavaFileObject> diagnostic : diagnosticListener.getDiagnostics()) {
                    // Subtract the assembled header lines so the number matches the user's editor
                    errorMsg.append("line:")
                            .append(diagnostic.getLineNumber() - StrategyCodeConstant.DEFAULT_PRE_LINE)
                            .append(", ").append(diagnostic.getMessage(null)).append("\n");
                }
                throw new CompileException(RetCode.COMPILE_ERROR, errorMsg.toString());
            }
        } catch (CompileException e) {
            throw e;
        } catch (Exception e) {
            throw new CompileException(RetCode.COMPILE_ERROR, e.getMessage(), e);
        }
    }

The basic flow of the method above is as follows:

  1. Obtain the system Compiler

  2. Create a diagnostic listener

  3. Read the Java source file

  4. Create the target class directory if it does not exist

  5. Set compilation parameters such as class path

  6. Execute the compilation task

  7. Throw the diagnostic information if compilation fails

After the above process, if the listener has received no error diagnostics, the generated class file can be loaded and run directly by a class loader.

How class files are retained can be decided according to your requirements. If you do not need to retain the compiled output, you can generate the binary content of the class directly in memory. If you do need to retain it, you can keep the compiled class file and convert or store it in some other way.
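Keeping the compiled class in memory can be sketched with a forwarding file manager that captures the bytecode instead of writing a .class file, followed by a small class loader that defines the class from those bytes. This is a minimal sketch under assumptions, not the platform's implementation: `InMemoryCompiler`, `Source`, and `ClassBytes` are hypothetical names, diagnostics are left to the compiler's default reporting, and the security restrictions a real platform needs are not addressed here.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

/** Sketch: compile a source string and load the resulting class without touching the disk. */
final class InMemoryCompiler {

    /** Source text held in memory. */
    static final class Source extends SimpleJavaFileObject {
        private final String code;

        Source(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + Kind.SOURCE.extension),
                    Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Receives the bytecode the compiler would otherwise write to a .class file. */
    static final class ClassBytes extends SimpleJavaFileObject {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();

        ClassBytes(String className) {
            super(URI.create("mem:///" + className.replace('.', '/') + Kind.CLASS.extension),
                    Kind.CLASS);
        }

        @Override
        public OutputStream openOutputStream() {
            return bytes;
        }
    }

    /** Compiles one class and returns it, loaded from memory; throws if compilation fails. */
    static Class<?> compileAndLoad(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler available (a JDK is required)");
        }
        final Map<String, ClassBytes> outputs = new HashMap<>();
        StandardJavaFileManager standard = compiler.getStandardFileManager(null, null, null);
        JavaFileManager manager = new ForwardingJavaFileManager<StandardJavaFileManager>(standard) {
            @Override
            public JavaFileObject getJavaFileForOutput(Location location, String name,
                                                       JavaFileObject.Kind kind, FileObject sibling) {
                ClassBytes out = new ClassBytes(name); // capture output in memory
                outputs.put(name, out);
                return out;
            }
        };
        boolean ok = compiler.getTask(null, manager, null, null, null,
                Collections.singletonList(new Source(className, code))).call();
        if (!ok) {
            throw new IllegalArgumentException("compilation failed for " + className);
        }
        ClassLoader loader = new ClassLoader(InMemoryCompiler.class.getClassLoader()) {
            @Override
            protected Class<?> findClass(String name) throws ClassNotFoundException {
                ClassBytes out = outputs.get(name);
                if (out == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] b = out.bytes.toByteArray();
                return defineClass(name, b, 0, b.length);
            }
        };
        try {
            return loader.loadClass(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Using a fresh class loader per compilation keeps each user's classes separate and lets them be garbage-collected once no references remain.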

Code running

Code running loads the compiled content into a specified environment and runs it. Every language provides the means to do this according to its own characteristics, and that part is not difficult. What is discussed here is how to combine user code with the runtime environment of the online compiler platform itself.

  • In general scenarios, user code relies only on native tools and is self-contained. If the language has a runtime environment such as the JVM, you can use that environment directly to build a simple sandbox for running the code.

  • In special scenarios, in addition to the necessary native tools, user code depends heavily on the APIs provided by the platform. Because those APIs cover a wide range of functions and may involve external access or the memory of shared servers, a sandbox may not fully meet the need.

Since a sandbox alone cannot meet the requirements, user code may have to be loaded into the same runtime environment as the platform. The top priority is then to regulate how user code accesses and calls the platform. It is also essential to keep the platform secure while meeting the basic requirements for running user code (security issues will be covered in another article).

To standardize the access and call actions of user code, you can start from the following aspects to solve the problem:

  • Determine what user code calls: whether user code must use the APIs provided by the platform, and whether all of its behaviors can be enumerated.

  • Clarify the user code structure: given clearly defined behaviors, whether the structure of user code is predictable and, if so, whether user code exposes an interface for external interaction.

  • Specify how user code is called: whether the user code is called only once or needs to be called multiple times.

  • Identify possible problems in user code: even after code diagnosis, user code may still throw runtime exceptions. You must anticipate these exceptions and decide how to handle them, for example whether to skip the current execution or to interrupt the whole execution process.

Based on these considerations, we determine how the platform calls user code and bridge the gap between user code and the platform.

By the time user code has passed every step before running, it should be predictable: what the user wrote, what it will do, how it will be used, and how the platform will call it have all become clear.

Case Study

The Netease precious metal quantification platform directly defines a template for users' policy code. The Java class written by the user must follow the policy template structure and implement the relevant methods.

Content of the policy template interface:

    /**
     * Policy class
     */
    public interface Strategy {
        /**
         * Called once when the policy is initialized. It is used to select the
         * product type, set the service fee, the amount, etc.
         *
         * @param context
         */
        void init(Context context);

        /**
         * Main implementation of the policy.
         *
         * @param context
         */
        void handle(Context context);

        /**
         * Called when the execution of the policy ends.
         *
         * @param context
         */
        void onExit(Context context);
    }

The user's policy code needs to be called and executed many times over time to simulate actual transactions. This calling process is called scheduling, and it consists of three parts:

  1. Before scheduling: the init method is called. In this method, the user initializes the parameters used during scheduling and gives them initial values.

  2. Scheduling: the handle method, which holds the main logic of the policy, is executed repeatedly, driven by the timeline or by quote messages. Variables used in handle are updated along the way, and fields of the object can be used to hold values between calls. The work done during scheduling includes quotation queries, simulated position opening and closing, and mathematical calculation.

  3. After scheduling: the onExit method is executed. In this method, users can do whatever processing they need when policy scheduling completes, such as custom statistical calculations or log output.

The quantification platform regulates the calling method of user code by specifying the user code structure, so that all user code behaviors are unified during the calling process.
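The three scheduling phases can be sketched as a small driver loop. This is an illustration only: the `Context` fields, the `Scheduler` class, and the choice to skip a failing call to handle and continue are assumptions made for the example, not the platform's actual behavior. The `Strategy` interface is repeated in simplified form so that the sketch is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the platform's context type. */
final class Context {
    double quote;                               // latest price supplied by the scheduler
    final List<String> log = new ArrayList<>(); // messages collected as feedback for the user
}

/** Simplified copy of the policy template interface. */
interface Strategy {
    void init(Context context);
    void handle(Context context);
    void onExit(Context context);
}

/** Sketch of a back-test scheduler: init once, handle once per quote, onExit once. */
final class Scheduler {

    static Context run(Strategy strategy, List<Double> quotes) {
        Context context = new Context();
        strategy.init(context);                 // before scheduling
        try {
            for (double quote : quotes) {       // scheduling, driven here by a list of quotes
                context.quote = quote;
                try {
                    strategy.handle(context);
                } catch (RuntimeException e) {
                    // Assumed handling for this sketch: record the failure, skip this call, continue.
                    context.log.add("handle failed at quote " + quote + ": " + e.getMessage());
                }
            }
        } finally {
            strategy.onExit(context);           // after scheduling, even if the loop was aborted
        }
        return context;
    }
}
```

Interrupting the whole run on the first exception is an equally valid choice; the point is that the platform decides this in one place because every policy is called through the same three methods.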

Content feedback

Content feedback: once user code has been written and run, the user needs to know what effect it had. In general scenarios, that is, in a simple online compiler, the feedback consists of the compilation result and whatever the code writes to the console. In special scenarios, however, these two kinds of feedback are far from enough for users.

Feedback can be generated at the following three stages:

  • Before the run, feedback covers:

    • User code syntax check

    • Diagnosis of user code compilation

    • User code environment loading

  • During the run, feedback covers:

    • Intermediate values of the running user code

    • Exceptions and errors raised by the running user code

    • User code running log

  • After the run, feedback covers:

    • User code call end notification

    • User code method Return Value

    • Data generated by user code

    • User data computing, statistics, and graphical processing results

The platform can choose which of the above to feed back to users. The feedback method can follow the way the platform presents its results, or be split into synchronous and asynchronous notifications. The forms of notification include:

  • Synchronous notification

    • Output Console

    • Real-time push of messages

    • Log message

  • Asynchronous notification

    • Running Log File

    • Running status report file

    • Raw user data

    • User data computing, statistics, and graphical processing results

Case Study

The Netease precious metal quantification platform implements content feedback in the following parts:

  1. During code detection and diagnosis, the detection and diagnosis results are fed back.

  2. While the code is running, system logs and user-defined logs are fed back.

  3. After the code has run, the policy log file, the original transaction information, and the statistical summary are fed back.

(Screenshots in the original: log files, original transaction information, statistical summary.)

The feedback content should be adjusted iteratively according to user requirements and feedback, but it should be limited to what concerns the user's code. The running status and important parameters of the server and of the platform itself must not be disclosed.

Post Link

Based on the actual application scenario of the Netease precious metal quantification platform, this article has described the idea behind building an online compiler, analyzed the possible application scenarios and the key points to consider, and walked through the editing and compilation process in detail. The next article describes how to ensure the security detection and safe running of user code:

Online code compiler (II)-user code Security Detection
