Eclipse String Partition Sharing Optimization Mechanism

Source: Internet
Author: User
In languages such as Java and C#, which handle strings with reference semantics, strings are immutable objects, so strings with identical content can be reused through a sharing mechanism. As far as content is concerned, two references to separate string objects with the same content behave no differently from two references to a single string object. When a program holds many strings, for example while parsing a large number of XML files, this optimization can greatly reduce memory usage. The SAX parser standard, for instance, defines a dedicated feature, http://xml.org/sax/features/string-interning, for string reuse.

At the language level, Java and C# support this directly through String.intern (String.Intern in C#), and the two implementations are very similar. In Java, calling String.intern places the current string in a global hash table, with its content as the key and the object reference as the value.
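
The effect of interning can be seen in a short example. This is a minimal sketch; the class name InternDemo and the helper sameAfterIntern are made up for illustration, and only String.intern itself comes from the Java library.

```java
// Minimal demonstration of String.intern(); the class name is hypothetical.
final class InternDemo {
    /** Returns true when both strings map to the same object after interning. */
    static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        // Two distinct heap objects with equal content.
        String a = new String("workspace");
        String b = new String("workspace");
        System.out.println(a == b);                 // false: different objects
        System.out.println(a.equals(b));            // true: same content
        System.out.println(sameAfterIntern(a, b));  // true: one canonical object
    }
}
```

After interning, the program can drop the duplicate instances and keep only the canonical one, which is where the memory saving comes from.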

Code:

//
// java/lang/String.java
//

public final class String
{
  //...
  public native String intern(); // implemented natively (via JNI) for efficiency
}

//
// hotspot/src/share/vm/prims/jvm.cpp
//

JVM_ENTRY(jstring, JVM_InternString(JNIEnv *env, jstring str))
  JVMWrapper("JVM_InternString");
  if (str == NULL) return NULL;
  oop string = JNIHandles::resolve_non_null(str); // resolve the reference to the internal object
  oop result = StringTable::intern(string, CHECK_0); // perform the actual string intern operation
  return (jstring) JNIHandles::make_local(env, result); // create a local reference to the internal object
JVM_END

//
// hotspot/src/share/vm/memory/symbolTable.cpp
//

oop StringTable::intern(oop string, TRAPS)
{
  if (string == NULL) return NULL;
  ResourceMark rm(THREAD); // protects the thread resource area
  int length;
  Handle h_string(THREAD, string);
  jchar* chars = java_lang_String::as_unicode_string(string, length); // obtain the actual string content
  oop result = intern(h_string, chars, length, CHECK_0); // complete the string intern operation
  return result;
}

oop StringTable::intern(Handle string_or_null, jchar* name, int len, TRAPS)
{
  int hashValue = hash_string(name, len); // calculate the hash value from the string content
  stringTableBucket* bucket = bucketFor(hashValue); // select the target bucket from the hash value
  oop string = bucket->lookup(name, len); // check whether the string already exists
  // Found
  if (string != NULL) return string;
  // Otherwise, add to symbol to table
  return basic_add(string_or_null, name, len, hashValue, CHECK_0); // put the string into the hash table
}
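
The lookup-then-add pattern above can be modeled in a few lines of Java. This is only a simplified sketch, not HotSpot code: the class InternTableSketch is hypothetical, it uses String.hashCode in place of hash_string, and it ignores locking and garbage collection.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified Java model of a bucketed intern table; not HotSpot code.
final class InternTableSketch {
    private final List<List<String>> buckets = new ArrayList<>();

    InternTableSketch(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    /** Returns the canonical instance for the given content. */
    String intern(String s) {
        if (s == null) return null;
        // Hash the content to select a bucket.
        int index = Math.floorMod(s.hashCode(), buckets.size());
        List<String> bucket = buckets.get(index);
        // Look for an existing entry with equal content.
        for (String existing : bucket) {
            if (existing.equals(s)) return existing;
        }
        // Not found: this instance becomes the canonical one.
        bucket.add(s);
        return s;
    }
}
```

The first string with a given content becomes the canonical instance, and every later call with equal content returns that same object.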

There is no way to explicitly remove strings from the global string table. Only when a string is no longer in use does the garbage collection thread, having marked the reachable objects, call StringTable::unlink to traverse the table and remove the unreachable entries.

Code:

//
// hotspot/src/share/vm/memory/genMarkSweep.cpp
//

void GenMarkSweep::mark_sweep_phase1(...)
{
  //...
  StringTable::unlink();
}

//
// hotspot/src/share/vm/memory/symbolTable.cpp
//

void StringTable::unlink() {
  // Readers of the string table are unlocked, so we should only be
  // removing entries at a safepoint.
  assert(SafepointSynchronize::is_at_safepoint(), "must be at safepoint");
  for (stringTableBucket* bucket = firstBucket(); bucket <= lastBucket(); bucket++) {
    for (stringTableEntry** p = bucket->entry_addr(); *p != NULL; ) {
      stringTableEntry* entry = *p;
      assert(entry->literal_string() != NULL, "just checking");
      if (entry->literal_string()->is_gc_marked()) { // the string object is reachable
        // Is this one of calls those necessary only for verification? (DLD)
        entry->oops_do(&MarkSweep::follow_root_closure);
        p = entry->next_addr();
      } else { // not reachable: unlink the entry and return it to the free list
        *p = entry->next();
        entry->set_next(free_list);
        free_list = entry;
      }
    }
  }
}
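
The sweep over one bucket chain can be modeled as follows. This is a sketch under stated assumptions, not HotSpot code: the class SweepSketch and its Entry type are hypothetical, and a plain boolean field stands in for is_gc_marked().

```java
// Simplified Java model of sweeping one bucket chain; not HotSpot code.
final class SweepSketch {
    static final class Entry {
        final String literal;
        final boolean marked; // stands in for is_gc_marked()
        Entry next;

        Entry(String literal, boolean marked, Entry next) {
            this.literal = literal;
            this.marked = marked;
            this.next = next;
        }
    }

    /** Removes unmarked entries and returns the new head of the chain. */
    static Entry sweep(Entry head) {
        // A dummy node in front of the chain avoids a special case for the head.
        Entry dummy = new Entry(null, true, head);
        Entry prev = dummy;
        while (prev.next != null) {
            if (prev.next.marked) {
                prev = prev.next;           // keep the reachable entry
            } else {
                prev.next = prev.next.next; // unlink the unreachable entry
            }
        }
        return dummy.next;
    }

    /** Counts the entries in a chain. */
    static int length(Entry head) {
        int n = 0;
        for (Entry e = head; e != null; e = e.next) n++;
        return n;
    }
}
```

Unlike the original, this sketch simply drops unlinked entries instead of returning them to a free list.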

From the code above we can see that in the JVM (Sun JDK 1.4.2), String.intern provides sharing based on a global hash table. This implementation is simple and shares strings to the greatest possible extent, but the sharing granularity is too coarse, the benefit of the optimization cannot be measured, and a large number of strings may degrade the performance of the global string table.

For these reasons, Eclipse does not rely on the JVM-level string sharing mechanism. Instead it provides a fine-grained, fully controllable, and measurable partitioned string sharing mechanism, which mitigates these problems to some extent. The core interface, IStringPoolParticipant, is implemented explicitly by the user, who submits the strings to be shared in its shareStrings method.

Code:

//
// org.eclipse.core.runtime.IStringPoolParticipant
//

public interface IStringPoolParticipant {
  /**
   * Instructs this participant to share its strings in the provided
   * pool.
   */
  public void shareStrings(StringPool pool);
}

For example, the MarkerInfo type implements the IStringPoolParticipant interface. In its shareStrings method it submits the strings it wants shared and asks its child nodes to submit theirs.

Code:

//
// org.eclipse.core.internal.resources.MarkerInfo
//

public class MarkerInfo implements ..., IStringPoolParticipant
{
  public void shareStrings(StringPool set) {
    type = set.add(type);
    Map map = attributes;
    if (map instanceof IStringPoolParticipant)
      ((IStringPoolParticipant) map).shareStrings(set);
  }
}
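
To make the recursion and the measurability concrete, here is a self-contained sketch of the participant pattern. All types in it (PoolSketch, Pool, Participant, Marker, Container) and the savedCount method are hypothetical stand-ins written for this example; they are not the Eclipse classes, whose actual API may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the Eclipse types; names are illustrative only.
final class PoolSketch {
    /** A pool that canonicalizes strings and counts the duplicates it replaced. */
    static final class Pool {
        private final Map<String, String> map = new HashMap<>();
        private int saved;

        String add(String s) {
            if (s == null) return null;
            String canonical = map.putIfAbsent(s, s);
            if (canonical == null) return s; // first occurrence of this content
            if (canonical != s) saved++;     // a duplicate instance was replaced
            return canonical;
        }

        int savedCount() {
            return saved;
        }
    }

    interface Participant {
        void shareStrings(Pool pool);
    }

    /** Leaf node: submits its own string and keeps the canonical instance. */
    static final class Marker implements Participant {
        String type;

        Marker(String type) {
            this.type = type;
        }

        @Override
        public void shareStrings(Pool pool) {
            type = pool.add(type);
        }
    }

    /** Inner node: forwards the pool to each of its children. */
    static final class Container implements Participant {
        final List<Participant> children = new ArrayList<>();

        @Override
        public void shareStrings(Pool pool) {
            for (Participant child : children) {
                child.shareStrings(pool);
            }
        }
    }
}
```

Because the pool is an ordinary object owned by the caller, it can be discarded after each pass, and its counter shows how many duplicate instances that pass eliminated.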

In this way, as long as the nodes at each level of an object tree selectively implement the IStringPoolParticipant interface, all strings to be shared can be submitted recursively to a string pool for reuse. Workspace, for example, is one such root entry point for string sharing. When its open method is called, it adds the cache management objects that need string sharing optimization to the global list of string pool partitions.

Code:

//
// org.eclipse.core.internal.resources.Workspace
//

public class Workspace ...
{
  protected SaveManager saveManager;
  public IStatus open(IProgressMonitor monitor) throws CoreException
  {
    // Open the workspace
    // Register a new string buffer pool partition.
    InternalPlatform.getDefault().addStringPoolParticipant(saveManager, getRoot());
    return Status.OK_STATUS;
  }
}
