Comparison between Rust and Erlang

During my two years as a programmer working on a telecommunications network simulator, I made full use of Erlang's concurrency, fault tolerance, and distributed computing features in many CPU-intensive applications.

Erlang is a high-level, dynamic, functional language that provides lightweight processes, immutability, location-transparent distribution, message passing, and supervision behaviors. Unfortunately, it is not well suited to low-level work, which was never its main purpose. XML parsing is a typical case that Erlang handles poorly. XML stanzas have to be read from the command line or the network, and processing anything outside the Erlang virtual machine is cumbersome. You may have run into this issue yourself. In such cases, it is worth considering a different language. In particular, Rust has recently come to the forefront thanks to its mix of features. It makes promises similar to Erlang's in many respects and adds benefits in low-level performance and safety.

Rust compiles to a native binary and runs directly on the hardware, just like a C/C++ program. How does it differ from C/C++? Quite a lot. Its motto is: "Rust is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety."

This article compares Erlang and Rust, highlighting their similarities and differences. It may interest Erlang developers who are learning Rust and Rust developers who are learning Erlang. The last section details the strengths and weaknesses of each language.

Immutability

Erlang: variables are immutable in Erlang. Once bound, they cannot be changed or rebound to a different value.

Rust: variables in Rust are also immutable by default, but they can easily be made mutable by adding the mut keyword. Rust also introduces the concepts of ownership and borrowing to manage memory allocation efficiently. For example, string literals are stored in the executable, and strings are moved when assigned to another variable, while primitives such as integers (i32, i64, u32 ...) and floats (f32, f64) are stored directly on the stack.
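The Rust side of this can be sketched in a few lines. The example below is only an illustration; the function names count_to_three and move_and_copy are made up for this sketch and do not come from any library.

```rust
/// Returns the final value of a counter that is mutated three times.
fn count_to_three() -> i32 {
    let mut counter = 0; // `mut` opts in to mutation; a plain `let` would not compile below
    for _ in 0..3 {
        counter += 1;
    }
    counter
}

/// Shows that an integer is copied while a `String` is moved on assignment.
fn move_and_copy() -> (i32, i32, String) {
    let a: i32 = 7;
    let b = a; // i32 is `Copy`: `a` remains usable

    let s1 = String::from("hello");
    let s2 = s1; // the heap buffer is moved; using `s1` after this line is a compile error
    (a, b, s2)
}

fn main() {
    println!("{}", count_to_three());
    println!("{:?}", move_and_copy());
}
```

Removing the mut keyword, or reading s1 after the move, turns each case into a compile-time error rather than a runtime surprise.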

Pattern Matching

Erlang: the brevity of Erlang code comes from its pattern matching. Pattern matching can be used anywhere, through case statements and the "=" (equals) sign, and function clauses are selected by matching on the function name, the number of parameters, and the parameters themselves.

Rust: in a let binding, the = symbol can be used for binding or for pattern matching. In addition, Rust's match is similar to the case statement in Erlang and the switch statement in most other languages. It tries to match a value against multiple patterns and branches to the one that matches. Function/method overloading is not built into Rust, but traits can be used to achieve it. Irrefutable patterns match anything and always succeed. For example, in let x = 5, x is always bound to the value 5. In contrast, refutable patterns may fail to match in some cases. For example, in if let Some(x) = somevalue, the pattern matches every value of somevalue except None. Irrefutable patterns can be used directly in a let binding, whereas refutable patterns are used in if let, while let, or match constructs.
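A small sketch may make the difference between irrefutable and refutable patterns concrete. The names describe and drain_sum are invented for this example.

```rust
/// Classifies an optional number with `match`; every case must be covered.
fn describe(value: Option<i32>) -> &'static str {
    match value {
        None => "nothing",
        Some(0) => "zero",
        Some(n) if n < 0 => "negative",
        Some(_) => "positive",
    }
}

/// Sums a stack by popping until the refutable pattern `Some(top)` stops matching.
fn drain_sum(mut stack: Vec<i32>) -> i32 {
    let mut total = 0;
    while let Some(top) = stack.pop() {
        total += top;
    }
    total
}

fn main() {
    let (x, y) = (1, 2); // irrefutable: a tuple pattern always matches a tuple
    println!("{} {}", x, y);

    let somevalue = Some(5);
    if let Some(n) = somevalue {
        // refutable: this block is skipped when `somevalue` is None
        println!("got {}", n);
    }
    println!("{} {}", describe(Some(-3)), drain_sum(vec![1, 2, 3]));
}
```

The compiler rejects a match that leaves a case uncovered, which is the closest Rust equivalent to a missing Erlang clause, except that it is caught before the program runs.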

Loop

Erlang: loops in Erlang are written with recursion or list comprehensions.

Rust: as in any imperative language, loops come in the usual forms: for, while, and loop, the basic looping construct. In addition, there are iterators.
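As a rough illustration, the same sum can be written with each looping construct, and also recursively in the style an Erlang programmer would use. All function names here are made up for the sketch.

```rust
/// Sums 1..=n with a `for` loop over a range.
fn sum_for(n: u32) -> u32 {
    let mut total = 0;
    for i in 1..=n {
        total += i;
    }
    total
}

/// The same sum with `while`.
fn sum_while(n: u32) -> u32 {
    let (mut i, mut total) = (1, 0);
    while i <= n {
        total += i;
        i += 1;
    }
    total
}

/// The same sum with `loop`, which runs until an explicit `break`.
fn sum_loop(n: u32) -> u32 {
    let (mut i, mut total) = (1, 0);
    loop {
        if i > n {
            break total; // `loop` can yield a value through `break`
        }
        total += i;
        i += 1;
    }
}

/// The same sum written recursively, as it would be in Erlang.
fn sum_rec(n: u32) -> u32 {
    if n == 0 { 0 } else { n + sum_rec(n - 1) }
}

fn main() {
    println!("{} {} {} {}", sum_for(10), sum_while(10), sum_loop(10), sum_rec(10));
}
```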

Closures and anonymous Functions

Erlang: Erlang has anonymous functions, declared by wrapping a block of code in the fun and end keywords. All anonymous functions are closures over the current context, and they can be passed across processes on the same node or on other connected nodes. Anonymous functions add great value to Erlang's distribution mechanism.

Rust: Rust also supports closures through anonymous functions. These can also "capture" their environment and be executed elsewhere (in a different method or thread context). Anonymous functions can be stored in a variable and can be passed as arguments to functions and across threads.
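The following sketch shows a closure capturing its environment, being passed to a function, and being moved into another thread. It uses only the standard library; make_adder, apply, and sum_in_thread are hypothetical names.

```rust
use std::thread;

/// Builds a closure that captures `offset` from its environment.
fn make_adder(offset: i32) -> impl Fn(i32) -> i32 {
    move |x| x + offset
}

/// Applies any closure or function with a matching signature.
fn apply<F: Fn(i32) -> i32>(f: F, value: i32) -> i32 {
    f(value)
}

/// Moves a closure, together with the data it captured, into another thread.
fn sum_in_thread(numbers: Vec<i32>) -> i32 {
    let handle = thread::spawn(move || numbers.iter().sum::<i32>());
    handle.join().expect("worker thread panicked")
}

fn main() {
    let add_ten = make_adder(10); // closures can be stored in variables
    println!("{}", apply(add_ten, 5));
    println!("{}", sum_in_thread(vec![1, 2, 3]));
}
```

Unlike an Erlang fun, a Rust closure cannot be shipped to another node by the language itself; it can only cross thread boundaries inside one process.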

Lists and Tuples

Erlang: a list is a dynamic singly linked list that can store any Erlang data type as an element. Elements cannot be accessed by index; the list must be traversed from the beginning (unlike an array in Rust). Tuples are fixed in size and cannot be resized at runtime. They can be pattern matched.

Rust: like lists in Erlang, Rust has vectors and arrays. An array has a fixed size and can be used when the number of elements is known at compile time. A vector is used when the size changes dynamically. Unlike an Erlang list, it is not a linked list: a vector (Vec) is a contiguous, growable buffer on the heap that supports indexed access and grows at the back, while the double-ended variant (VecDeque) is a growable ring buffer that can grow at both ends. Rust also has tuples, which cannot be resized at runtime. If a function needs to return multiple values, it can return a tuple. Tuples can also be pattern matched.
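A short sketch of these four types, using only the standard library. The helper names div_rem and both_ends are made up for illustration.

```rust
use std::collections::VecDeque;

/// Returns the quotient and the remainder together as a tuple.
fn div_rem(a: i32, b: i32) -> (i32, i32) {
    (a / b, a % b)
}

/// Grows a double-ended queue at both ends and returns its contents in order.
fn both_ends() -> Vec<i32> {
    let mut deque = VecDeque::new();
    deque.push_back(2);
    deque.push_back(3);
    deque.push_front(1);
    deque.into_iter().collect()
}

fn main() {
    let fixed: [i32; 3] = [10, 20, 30]; // array: the length is part of the type
    let mut grown = vec![10, 20]; // vector: contiguous and growable
    grown.push(30);
    println!("{} {}", fixed[1], grown[2]); // both support indexed access

    let (q, r) = div_rem(17, 5); // tuples can be destructured by pattern
    println!("{} {} {:?}", q, r, both_ends());
}
```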

Iterator

Erlang: iteration in Erlang works on lists. The lists module provides various iteration functions, such as map, filter, zip, and drop. In addition, Erlang supports list comprehensions, which take a list as the generator and apply an operation to each element that satisfies a predicate. The result is another list.

Rust: vectors, double-ended vectors, and arrays can all be consumed through iterators. In Rust, iterators are lazy by default: the source is not consumed unless there is a collector at the end. Compared with traditional looping constructs (such as loops), iterators provide a more natural way to work with any list-like data type because they never go out of bounds.
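Laziness can be observed directly by counting how often the mapping closure runs. This is a minimal sketch; visited_when_taking is a hypothetical helper, and the Cell counter exists only to make the behaviour visible.

```rust
use std::cell::Cell;

/// Counts how many elements a lazy `map` actually visits when only the
/// first `take` results are collected.
fn visited_when_taking(take: usize) -> (usize, Vec<i32>) {
    let visited = Cell::new(0);
    let data = vec![1, 2, 3, 4, 5];
    let doubled: Vec<i32> = data
        .iter()
        .map(|x| {
            visited.set(visited.get() + 1); // runs only when a value is pulled
            x * 2
        })
        .take(take)
        .collect(); // the "collector" that drives the whole chain
    (visited.get(), doubled)
}

fn main() {
    println!("{:?}", visited_when_taking(2));
}
```

Although the vector holds five elements, the closure runs only for the elements that the collector asks for.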

Record and Map

Erlang: a record is a fixed-size structure defined at compile time, while a map is dynamic: its structure can be declared or modified at runtime. Maps are similar to hash maps in other languages and are used as key-value stores.

Rust: Rust supports declaring structs at compile time. A struct cannot be modified at runtime; for example, fields cannot be added or removed. Because Rust is a low-level language, structs can store references. References need lifetime parameters to prevent dangling references. Rust has a standard collections library that supports many other data structures, such as maps, sets, and sequences. All of these data structures can also be iterated lazily.
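The sketch below shows a struct that stores a reference, with the lifetime parameter this requires, next to the standard library's HashMap. Excerpt, first_word, and word_counts are names invented for the example.

```rust
use std::collections::HashMap;

/// A struct holding a reference; `'a` ties it to the data it borrows from.
struct Excerpt<'a> {
    first_word: &'a str,
}

/// Borrows the first whitespace-separated word of `text`.
fn first_word(text: &str) -> Excerpt<'_> {
    Excerpt {
        first_word: text.split_whitespace().next().unwrap_or(""),
    }
}

/// Counts word occurrences with the standard library's `HashMap`.
fn word_counts(text: &str) -> HashMap<&str, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    println!("{}", first_word("hello world").first_word);
    println!("{:?}", word_counts("a b a").get("a"));
}
```

The lifetime parameter lets the compiler reject any program in which an Excerpt would outlive the text it points into.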

String, Binary, and Bitstring

Erlang: a string in Erlang is just a list of the character codes of its characters, stored as a singly linked list. Prepending a character to a string is therefore always cheaper than appending one at the end. Binaries are special in Erlang: a binary is like a contiguous byte array and is made up of bytes (8-bit sequences). A bitstring generalizes a binary to bit sequences of arbitrary length, such as three 1-bit sequences and one 4-bit sequence, so the length of a bitstring does not have to be a multiple of 8. Strings, binaries, and bitstrings support convenient high-level syntax that makes pattern matching easier. As a result, packing and unpacking network protocol packets is very easy when doing network programming.

Rust: in Rust, there are two types of strings. String literals are stored in the executable rather than on the heap or the stack, and they are immutable. A String can have a dynamic size; in that case its contents are stored on the heap, and a reference to them is kept on the stack. Strings that are known at compile time are stored as literals, while strings that are not known until runtime are stored on the heap. This is an efficient way to decide the memory allocation strategy at compile time and apply it at runtime.
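In code, the two string types look like this. The function names greeting and shout are made up for the sketch.

```rust
/// A string literal has type `&'static str`: its bytes live in the binary itself.
fn greeting() -> &'static str {
    "hello"
}

/// Builds a `String` at runtime; its bytes live in a growable heap buffer.
fn shout(word: &str) -> String {
    let mut owned = String::from(word); // copies the bytes to the heap
    owned.push('!');
    owned.to_uppercase()
}

fn main() {
    println!("{} -> {}", greeting(), shout(greeting()));
}
```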

Lifetime

Erlang: a variable is bound only within a function and is freed by the garbage collector specific to the current process. The lifetime of each variable is therefore the same as that of the function that uses it. In other words, a program should be broken down into functions as much as possible to use memory efficiently. In addition, you can trigger garbage collection explicitly: call erlang:garbage_collect() whenever it is needed.

Rust: Rust has no garbage collector. Rust uses lifetimes to manage memory. Each variable within a scope (delimited by braces or by the body of a function) gets a new lifetime, unless it is borrowed or referenced from the parent scope. The lifetime of a borrowed variable does not end when the borrowing scope ends; it ends only when the parent scope that owns it ends. The lifetime of every variable is therefore managed either by the current scope or by a parent scope, and the compiler enforces this. At compile time, Rust quietly inserts code that drops the values associated with a variable when its lifetime ends. This avoids the need for a garbage collector to work out which variables can be freed. By managing lifetimes within functions, Rust gives fine-grained control over memory. Unlike Erlang, where garbage collection is triggered when a function ends, in Rust you can use {} to divide your code into multiple scopes, and the compiler places the drop code at the end of each scope.
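The drop code inserted by the compiler can be made visible by implementing the Drop trait. In this sketch, Noisy and drop_order are hypothetical names, and the shared log exists only to record the order of events.

```rust
use std::cell::RefCell;

/// Records its name in a shared log when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order in which values are dropped across nested scopes.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
            log.borrow_mut().push("end of inner block");
        } // `_inner` is dropped here
        log.borrow_mut().push("end of outer block");
    } // `_outer` is dropped here
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```

Each value is released at the closing brace of the scope that owns it, with no collector involved.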

Variable binding, ownership, and borrowing

Erlang: Erlang has a simple binding scheme. If a variable is not yet bound, its first appearance binds it to the value on the right-hand side; otherwise, the expression is a pattern match. Any type in Erlang can be bound to a variable. Variables are bound only within the context of the function in which they appear, and they are freed by the garbage collector specific to the current process once they are no longer in use. Ownership of data cannot be transferred to a different variable. If another variable in the same function context wants the same data, it has to clone it. This fits Erlang's share-nothing philosophy and allows cloned values to be sent safely to different nodes or processes without data races. Erlang has no references and therefore no borrowing. All data is allocated on the heap.

Rust: ownership and borrowing are two powerful concepts in Rust that make the language unique among mainstream languages. They are a major reason why Rust is regarded as a low-level language that is free of data races. Because they provide memory safety without a garbage collector, runtime overhead is kept to a minimum. A piece of data is owned by exactly one variable, which means no other variable can share ownership of it. When needed, ownership is transferred to a different variable by assignment, and the old variable is no longer valid. Ownership is also transferred when the variable is passed to a function as an argument. This operation is called a move, because ownership of the data moves. Ownership helps manage memory efficiently.

Ownership rules: each value has exactly one owner at any given time; when the owner goes out of scope, the value is dropped.

When a value is temporarily lent by the variable that owns it to a function or to another variable, it is said to be borrowed, either mutably or immutably. The borrow ends when the borrowing function or {}-delimited block goes out of scope, and full access returns to the owner. During a mutable borrow, the parent function/scope cannot use the variable until the borrowing function/scope ends.

Borrowing rules: a variable can have any number of immutable references, but only one mutable reference within a scope. In addition, mutable and immutable references cannot coexist within the same scope.
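The ownership and borrowing rules above can be sketched as follows. The function names are made up, and the commented-out line shows the kind of code the compiler rejects.

```rust
/// Reads through a shared (immutable) borrow.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Mutates through an exclusive (mutable) borrow.
fn push_twice(values: &mut Vec<i32>, item: i32) {
    values.push(item);
    values.push(item);
}

/// Walks through shared borrows, then a mutable borrow, then a shared borrow again.
fn borrow_demo() -> (i32, i32, usize) {
    let mut data = vec![1, 2, 3];

    let first = &data; // any number of shared borrows may coexist
    let second = &data;
    let before = total(first) + total(second);

    push_twice(&mut data, 4); // allowed: `first` and `second` are no longer used
    // let broken = (&data, &mut data); // rejected: shared and mutable borrows overlap

    let after = total(&data); // `data` was only lent out, never given away
    (before, after, data.len())
}

fn main() {
    println!("{:?}", borrow_demo());
}
```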

Reference count

Reference counting tracks how many processes or threads are using a value. When a new process or thread takes hold of the value, the reference count is incremented; when a process or thread exits, it is decremented. When the count reaches 0, the value is dropped.

Erlang: when data is passed between processes in Erlang, it is sent as a message. This means it is copied to the heap of the other process rather than reference counted. Data copied into a process is collected by that process's garbage collector at the end of its lifetime. However, binaries larger than 64 bytes are reference counted when they are passed across Erlang processes.

Rust: when data is shared among threads, it is not copied, for efficiency. Instead, it is wrapped in a reference counter. Reference counters are special in that mutable data can be handed to multiple threads, provided that access is synchronized with a mutex. References to immutable data need no mutex. All the related checks are done at compile time and help prevent data races in Rust.
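For immutable data, sharing through a reference counter might look like the sketch below, which uses the standard library's Arc. The helper name shared_sum is an assumption of this example.

```rust
use std::sync::Arc;
use std::thread;

/// Shares one read-only vector with `workers` threads without copying it.
fn shared_sum(workers: usize) -> (i32, usize) {
    let data = Arc::new(vec![1, 2, 3]);
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let data = Arc::clone(&data); // bumps the count; the vector is not copied
            thread::spawn(move || data.iter().sum::<i32>())
        })
        .collect();
    let total: i32 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    // Every worker has finished and dropped its handle, so only ours remains.
    (total, Arc::strong_count(&data))
}

fn main() {
    println!("{:?}", shared_sum(4));
}
```

No mutex is needed here because none of the threads mutates the vector.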

Message passing

Erlang: message passing in Erlang is asynchronous. Suppose a process sends a message to another process. If the lock on the receiver's mailbox is immediately available, the message is copied into that mailbox; otherwise, it is copied into a heap fragment, which the receiving process picks up later. This achieves true asynchrony and freedom from data races, at the cost of copying the message into the other process's heap.

Rust: Rust has channels, which work like water flowing between two points: if you put something on the stream, it flows to the other end. Whenever a Rust channel is created, a transmitter handle and a receiver handle are created with it. The transmitter handle is used to put messages onto the channel, and the receiver handle reads them. Once the transmitter puts a value onto the channel, ownership of that value is transferred to the channel. When another thread reads the value from the channel, ownership is transferred to that thread. Channels thus preserve the ownership principle: every value has exactly one owner. When the last thread exits, the resource is reclaimed.
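A minimal sketch of this hand-over using the standard library's mpsc channel. The function name relay is made up for the example.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends owned `String`s from a worker thread and collects them on the receiving side.
fn relay(messages: Vec<String>) -> Vec<String> {
    let (tx, rx) = mpsc::channel();
    let producer = thread::spawn(move || {
        for message in messages {
            // `send` takes ownership: `message` cannot be used after this call.
            tx.send(message).expect("receiver hung up");
        }
        // `tx` is dropped here, which closes the channel.
    });
    let received: Vec<String> = rx.iter().collect(); // ends once the channel is closed
    producer.join().expect("producer panicked");
    received
}

fn main() {
    println!("{:?}", relay(vec![String::from("ping"), String::from("pong")]));
}
```

The strings are never copied on the way through; only their ownership changes hands.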

Shared mutation

Erlang: sharing is a sin in Erlang, but Erlang allows controlled mutation through Erlang Term Storage (ETS). ETS tables can be shared across multiple processes and are synchronized internally to prevent races. ETS can be tuned for high read concurrency or high write concurrency. An entire table can be attached to a group of processes; if all of these processes exit, the whole table is reclaimed.

Rust: as a low-level language, Rust provides a way to share mutable resources. By combining reference counting with a mutex, access to a resource is synchronized when multiple threads mutate it. When the threads sharing a resource exit, the resource is reclaimed by the last thread to exit. This provides a clean and efficient way to share, mutate, and clean up resources.
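The combination of reference counting and a mutex is usually written as Arc<Mutex<T>>. The sketch below assumes a simple shared counter; shared_counter is a hypothetical name.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Lets `workers` threads each add `per_worker` to one shared counter.
fn shared_counter(workers: u32, per_worker: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    // The lock is released at the end of this statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", shared_counter(4, 1000));
}
```

Leaving out the Mutex does not produce a race at runtime; the program simply fails to compile.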

Behavior

Erlang: a behavior is a formalization of a common pattern. The idea is to divide the code for a process into a generic part (the behavior module) and a specific part (the callback module). You only need to implement a few callbacks and call a specific API to use a behavior. There are various standard behaviors, such as gen_server, gen_fsm, and supervisor. For example, if you want a standalone process that runs continuously like a server and listens for synchronous and asynchronous calls or messages, you can implement the gen_server behavior. You can also implement custom behaviors.

Rust: if you have a set of methods that are common to multiple data types, they can be declared as a trait. Traits are Rust's version of interfaces, and they are extensible. Traits eliminate the need for traditional method overloading and provide a simple pattern for operator overloading.
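Both uses of traits can be sketched briefly: a shared interface across unrelated types, and operator overloading through the standard Add trait. Area, Square, Rect, and Point are types invented for this example.

```rust
use std::ops::Add;

/// A set of methods shared by several unrelated types.
trait Area {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Area for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Area for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Works with any mix of types that implement `Area`.
fn total_area(shapes: &[&dyn Area]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Implementing `Add` is what makes `point + point` compile.
impl Add for Point {
    type Output = Point;
    fn add(self, other: Point) -> Point {
        Point { x: self.x + other.x, y: self.y + other.y }
    }
}

fn main() {
    let square = Square(2.0);
    let rect = Rect(2.0, 3.0);
    let shapes: [&dyn Area; 2] = [&square, &rect];
    println!("{}", total_area(&shapes));
    println!("{:?}", Point { x: 1, y: 2 } + Point { x: 3, y: 4 });
}
```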

Memory Allocation

Erlang: variables are dynamically and strongly typed in Erlang. No type definitions are supplied before runtime, and type conversion at runtime is kept to a minimum to prevent type errors. When the program runs, variables are allocated dynamically on the heap of the underlying OS thread and are freed during garbage collection.

Rust: Rust is a static, strict, and inferred language. Static means that the Rust compiler checks types at compile time to prevent type errors at runtime. Some types are inferred at compile time. For example, if a variable originally declared as a String is assigned to another variable, the type of the new variable does not need to be declared explicitly; the compiler infers it. The compiler tries to determine which variables can be allocated on the stack and which must be allocated on the heap, so memory allocation in Rust is very efficient and fast. Unlike Erlang, Rust uses the stack for all data types whose size is known at compile time, while dynamic data types (such as Strings and Vectors) are allocated on the heap at runtime.
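The sketch below illustrates inference and the split between stack and heap. The function names are invented, and size_of_val from the standard library is used only to show that the stack part of a vector does not grow with its contents.

```rust
use std::mem::size_of_val;

/// Type inference: no annotations on the bindings, yet each has a fixed static type.
fn inferred() -> (i64, String) {
    let n = 40; // inferred as i64 from the return type
    let s = String::from("text"); // inferred as String
    let moved = s; // `moved` is inferred as String too
    (n + 2, moved)
}

/// The stack part of a Vec keeps the same size however much heap data it owns.
fn handle_size_is_constant() -> bool {
    let small: Vec<u8> = Vec::new();
    let large: Vec<u8> = vec![0; 10_000];
    size_of_val(&small) == size_of_val(&large)
}

fn main() {
    println!("{:?} {}", inferred(), handle_size_is_constant());
}
```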

Scalability, fault tolerance, and distribution

The BEAM virtual machine is what makes Erlang unique. It is built to provide scalability, fault tolerance, distribution, and concurrency as basic guarantees.

How does Erlang scale? Unlike native operating system threads, BEAM supports lightweight processes, also known as green threads, which are spun from a small number of native operating system threads. Literally 1 million or more Erlang processes can be spun from a native operating system thread. This is made possible by allocating large blocks to the native threads and sharing them among many Erlang processes. Each Erlang process gets a block that holds all of its variables. Because that block can be as small as 233 words, the heap of a native operating system thread can easily accommodate 1 million processes. In addition, thanks to Erlang's built-in asynchronous message passing, communication between processes is hardly ever a bottleneck. A process never blocks when sending a message to another process: it either tries to acquire a lock on the other process's mailbox and places the message there directly, or it adds the message to a separate heap fragment and attaches that fragment to the other process's heap. The Erlang virtual machine also has built-in distribution, which can run processes across machines and interact with them transparently.

How does concurrency work in Erlang? When you use native operating system threads, they are scheduled by the operating system scheduler. On Linux, for example, scheduling efficiency degrades as the number of threads grows. Erlang's BEAM, by contrast, spins many green threads from one native operating system thread and manages them itself. By default, each process is assigned 2000 reductions (every operation in Erlang has a reduction budget, where 1 reduction is roughly equivalent to one function call) and is allowed to run until it has used up its reductions, at which point it is preempted. On preemption, the next Erlang process in the run queue is scheduled to run. This is how every Erlang process is scheduled.

How does BEAM manage memory? As mentioned, the heap of each native operating system thread is shared among many Erlang processes. Whenever an Erlang process needs more memory, it looks for available memory in the heap of the native operating system thread and takes it if there is any. Otherwise, depending on the type of data requested, the appropriate memory allocator tries to obtain a block of memory from the operating system using malloc or mmap. BEAM uses this memory efficiently across processes by splitting the block into multiple carriers (containers of memory blocks managed by an allocator) and serving each Erlang process from the right carrier. Based on current demand, such as reading a large XML stanza from a network socket, BEAM works out dynamically how much memory to allocate, how many carriers to split it into, and how many carriers to keep after a GC cycle frees them. Freed memory blocks are merged almost immediately after deallocation, which makes the next allocation faster.

How does Erlang garbage collection work? Erlang provides a garbage collector for each process, which uses a generational garbage collection algorithm. Combined with Erlang's built-in share-nothing approach, this means that garbage collection in one process does not interfere with any other process. Each process has a young heap and an old heap. The young heap is collected more frequently. Data that survives two consecutive young-heap collection cycles is moved to the old heap. The old heap is collected only when it reaches a specified size.

How does Erlang fault tolerance work? Erlang treats failure as inevitable and prepares to handle it. Any ordinary Erlang application should follow a supervision hierarchy in which every Erlang process is monitored by a supervisor. The supervisor is responsible for restarting the worker processes under its control according to the type of failure. A supervisor can also be configured with a restart strategy for its workers, for example one-for-one (if a worker exits, only that worker is restarted) or one-for-all (if one worker exits, all workers are restarted). BEAM provides links to propagate exit signals between processes and monitors to observe exit signals, both between processes in the same BEAM VM and, with location transparency, across distributed BEAM VMs. Erlang's BEAM can also load code dynamically on one or all VMs at once. BEAM takes care of loading code changes into memory and applying them. Some additional effort is needed to tell BEAM the order in which modules should be loaded and to manage state, so that no process ends up in an unknown state.

In contrast to Erlang, Rust does most of its work at compile time and very little at runtime. Because most systems programming languages lack memory safety at runtime, Rust does its best to ensure that code which compiles will run without such problems. BEAM does ensure memory safety at runtime, but the overhead can sometimes become significant and complex, so Rust opts to do the checks at compile time.

Rust's core language is designed to be as small as possible. For example, Rust once had lightweight green threads (similar to Erlang processes) built into its nightly releases. At some point the feature was deliberately removed, because it was not considered a general requirement for every application and it came with a runtime cost. Instead, the feature can be provided by a crate when needed. Although Erlang can also import external libraries, its core features (such as green threads) are built into the VM and cannot be switched off or swapped for native threads. That said, green threads in the Erlang VM are very efficient, as decades of use have shown, and switching them off is not a common requirement for people who choose Erlang.

How does Rust scale? The limits of scaling usually depend on the communication and distribution mechanisms available. As for communication, it is debatable whether Erlang's model of message passing, per-process garbage collection, and ETS is more efficient than Rust's channels with single ownership and shared mutation.

In Erlang, any message can be sent to any other process by copying it. The garbage collector then has a lot of cleanup to do in both the sending and the receiving process. Rust channels, on the other hand, are multi-producer, single-consumer. When a message is sent to the consumer, it is not copied; its ownership is transferred to the consumer. The compiler then inserts drop code at the end of the consumer's scope to reclaim the value. The same message can be sent to multiple consumers by cloning the value once for each channel. In some cases, the combination of Rust's ownership model and predictable memory cleanup may beat Erlang's garbage collection.
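Both directions can be sketched with the standard library's mpsc channel: several producers feeding one consumer, and one message cloned for several consumers. The names fan_in and broadcast are made up for illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// Several producers feed one consumer through cloned `Sender` handles.
fn fan_in(producers: u32) -> u32 {
    let (tx, rx) = mpsc::channel();
    for id in 0..producers {
        let tx = tx.clone(); // one sending handle per producer
        thread::spawn(move || tx.send(id).expect("receiver hung up"));
    }
    drop(tx); // drop the original so the channel closes once the producers finish
    rx.iter().sum()
}

/// One message reaches several consumers by cloning it once per channel.
fn broadcast(message: &str, consumers: usize) -> Vec<String> {
    let mut receivers = Vec::new();
    for _ in 0..consumers {
        let (tx, rx) = mpsc::channel();
        tx.send(message.to_string()).expect("receiver hung up"); // a fresh copy per consumer
        receivers.push(rx);
    }
    receivers
        .into_iter()
        .map(|rx| thread::spawn(move || rx.recv().expect("sender hung up")))
        .map(|handle| handle.join().expect("consumer panicked"))
        .collect()
}

fn main() {
    println!("{}", fan_in(4));
    println!("{:?}", broadcast("hello", 3));
}
```

The copy that Erlang makes implicitly on every send has to be requested explicitly in Rust, so the cost is visible in the code.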

Another important aspect of communication is shared mutation. In theory, Erlang's ETS is similar to shared mutation in Rust, which combines a mutex with reference counting. However, while Rust's unit of mutation is very fine-grained, as small as a single variable, the unit of mutation in ETS is the whole table. Another major difference is that Rust lacks a built-in distribution mechanism.

How does concurrency work in Rust? Rust threads are native threads by default, and the operating system manages them with its own scheduler. Scheduling is therefore a property of the operating system, not of the language. Native operating system threads can significantly improve the performance of operating system library calls for networking, file I/O, and cryptography. Alternatively, you can use a green thread library or a library with a built-in scheduler, and there are plenty to choose from. Unfortunately, none of these crates is stable yet. Rayon is a data-parallelism library that implements a work-stealing algorithm to balance load across native threads.

How does Rust manage memory? As discussed, it uses ownership and lifetimes to perform extensive static analysis and determine which variables can be allocated on the stack and which on the heap. Rust does a good job here: it tries to allocate as much data as possible on the stack rather than on the heap, which greatly improves memory read and write speed.

How does garbage collection work? As described above, Rust determines the lifetime of each variable at compile time. In addition, most variables in Rust tend to live on the stack, which is easier to manage. In Erlang, the garbage collector has to be triggered at intervals to find unused data across the whole process heap and free it. In languages that allow shared references, such as Java, this becomes harder still because no such guarantees exist. Predictable garbage collection times are hard to achieve in these languages. Java is less predictable than Erlang, and Erlang is less predictable than Rust.

How does fault tolerance work? Rust itself has no built-in mechanism for detecting and recovering from runtime failures. Rust offers basic error handling through the Result and Option types, but these cannot cover every unexpected situation unless the language ships with a runtime fault management framework. Erlang wins on this point: with its supervision framework and hot code loading, it can deliver at least five nines of uptime. Rust has to work harder to get there.
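A rough sketch of what Rust offers out of the box: Option and Result for expected failures, and a thread boundary for containing a panic. This is not a supervision framework, and all function names are hypothetical. Deciding whether and how to restart a failed job is left entirely to the caller.

```rust
use std::thread;

/// `Option` models absence: an empty string has no first character.
fn first_char(text: &str) -> Option<char> {
    text.chars().next()
}

/// `Result` models a recoverable failure that the caller must handle.
fn parse_port(text: &str) -> Result<u16, String> {
    text.trim()
        .parse::<u16>()
        .map_err(|e| format!("invalid port {:?}: {}", text, e))
}

/// Runs `job` in its own thread and reports a panic as an `Err` instead of
/// taking the caller down with it.
fn run_isolated<F>(job: F) -> Result<i32, String>
where
    F: FnOnce() -> i32 + Send + 'static,
{
    thread::spawn(job)
        .join()
        .map_err(|_| String::from("worker panicked"))
}

fn main() {
    println!("{:?}", first_char("erlang"));
    println!("{:?}", parse_port("8080"));
    println!("{:?}", run_isolated(|| 6 * 7));
}
```

Everything an Erlang supervisor does beyond this point, such as restart strategies, restart limits, and escalation, has to be built by hand or taken from a crate.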

Conclusion

Erlang and Rust are both strong in their own domains. Erlang has been around for a long time and has proven to be a powerful, industry-ready ecosystem in terms of scalability, concurrency, distribution, and fault tolerance. Rust has defining features of its own: it combines high-level language features, safe programming, and common facilities (such as concurrency support and error handling) with the ability to run at a low level and exploit native performance.

In my opinion, if a very complex use case needs all of the features above, an interesting option is to combine Rust with Erlang, using Rust as a shared library or native implemented function (NIF). All the number crunching, I/O operations, and operating system calls can be offloaded to Rust, with the results synced back to the Erlang virtual machine. The goal is to make things easier.

Is Rust a replacement for Erlang? My answer is no. Over decades, Erlang's BEAM has proven to offer excellent scalability, concurrency, distribution, and fault tolerance. Erlang handles these through BEAM and abstracts away many common problems so that programmers do not have to worry about them and can focus on the problem at hand. With Rust, by contrast, we get many options through community-created crates, but as a programmer I have to combine them in the right way myself. Another challenge with Rust is its steep learning curve. It is certainly a big leap for people who are just starting out or who come from dynamic programming languages. Simply put, the two languages target different audiences and solve different problems. Combining what each of them is good at may be the best approach.

About the author

Krishna Kumar Thokala is currently an application developer at ThoughtWorks. Previously, he spent some time as a developer on a telecommunications network simulator written in Erlang. As an architect, he built a configuration management system using YANG modeling over NETCONF. Besides building software systems, his interests include robotics, electronics, and industrial automation. You can follow him on Medium, LinkedIn, and Twitter.
