Continuing from the previous article
First, simple ways to reduce memory usage
1. Reuse objects instead of allocating more memory. After y <- x, the new variable y points to the same memory block that holds x; the data is copied to a new memory block only when y is modified.
Generally, as long as a vector is not referenced by any other object, it can be modified in place, which avoids the CPU and RAM overhead of copying it. From the program's point of view, R passes arguments by value,
so you should avoid functions like sort, which return a copy whose resource cost is at least as large as that of the original object.
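A minimal sketch of the behavior described above, using made-up values. It only shows the visible semantics (x is untouched when y changes, and sort returns a new vector); it does not measure where the copy happens.

```r
x <- c(5, 3, 8, 1)

y <- x        # y refers to the same data as x; nothing is copied yet
y[1] <- 0     # modifying y makes R copy the data first, so x is not affected

sorted <- sort(x)  # sort() returns a new vector and leaves x as it was

print(x)       # original values, still unsorted
print(y)       # modified copy
print(sorted)  # sorted copy
```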
2. Delete intermediate data that is not required
Note that calling rm() does not immediately release the memory and return it to the operating system.
R's garbage collector frees it automatically when necessary, or when the memory held by deleted objects exceeds a threshold.
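As a rough illustration, assuming a hypothetical intermediate vector that is only needed to produce one result, the pattern could look like this. The explicit gc() call is optional and is shown only to make the collection step visible.

```r
# Hypothetical intermediate data, needed only to compute `result`
intermediate <- numeric(1e6)
result <- sum(intermediate)

# Drop the binding; the memory becomes eligible for garbage collection
rm(intermediate)

# Optional: ask the garbage collector to run now instead of waiting
invisible(gc())
```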
3. Compute values at run time instead of keeping them in persistent storage
4. Swap active and inactive data: save data that is not currently needed to disk with saveRDS and load it back with readRDS
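A short sketch of swapping data out to disk and back. The data frame and the temporary file path are made up for illustration.

```r
# Hypothetical data that is not needed for the next step
inactive <- data.frame(id = 1:5, value = c(2.5, 3.1, 4.7, 1.2, 9.9))

path <- tempfile(fileext = ".rds")
saveRDS(inactive, path)  # write the object to disk
rm(inactive)             # remove it from the workspace

# ... work on other data here ...

restored <- readRDS(path)  # load it back when it is needed again
unlink(path)               # clean up the temporary file
```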
Second, using limited memory to process large datasets
1. Use memory-saving data structures
How does R store data structures? Vectors are the most basic building block for all data types, and R provides several atomic vector types (logical, integer, numeric, complex, character, raw).
Many other data structures are built from these vector types; at bottom, R's internal storage structure is the vector.
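One way to see why the vector type matters, as a sketch: an integer vector needs about half the memory of a numeric (double) vector of the same length, and a matrix is just a vector with a dim attribute. The length n is an arbitrary choice.

```r
n <- 1e6

size_int <- as.numeric(object.size(integer(n)))  # 4 bytes per element
size_num <- as.numeric(object.size(numeric(n)))  # 8 bytes per element

cat("integer:", size_int, "bytes\n")
cat("numeric:", size_num, "bytes\n")

# A matrix is stored as an atomic vector plus a dim attribute
m <- matrix(1:6, nrow = 2)
print(typeof(m))
print(attributes(m))
```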
2. Sparse matrices: for a matrix that contains a large number of zero or empty values, use the sparse parameter to store it in sparse form
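A sketch of the saving, assuming the sparse parameter refers to Matrix() from the Matrix package (the notes do not name the package at this point). The matrix size and the share of non-zero cells are made up, and the sparse part is skipped if Matrix is not installed.

```r
set.seed(1)

# Hypothetical 1000 x 1000 matrix with about 0.1% non-zero cells
dense <- matrix(0, nrow = 1000, ncol = 1000)
dense[sample(length(dense), 1000)] <- 1
dense_bytes <- as.numeric(object.size(dense))

sparse_bytes <- NA
if (requireNamespace("Matrix", quietly = TRUE)) {
  # sparse = TRUE asks Matrix() for a sparse representation
  sparse <- Matrix::Matrix(dense, sparse = TRUE)
  sparse_bytes <- as.numeric(object.size(sparse))
}

cat("dense: ", dense_bytes, "bytes\n")
cat("sparse:", sparse_bytes, "bytes\n")
```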
3. Symmetric matrices: dspMatrix (Matrix package), which stores only one triangle of the matrix
4. Bit vectors: a logical value in R takes 4 bytes (32 bits), whereas a bit vector stores each logical value in a single bit, which is 32 times less memory, but it is not suitable for storing NA values (bit package)
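The arithmetic can be checked with base R alone. The sketch below uses packBits as a stand-in for the bit package (it is not the bit package's API); the vector length is arbitrary but must be a multiple of 8 for packing into raw bytes.

```r
n <- 1e6  # arbitrary length, a multiple of 8

flags <- rep(c(TRUE, FALSE), length.out = n)

# Base R stand-in for a bit vector: 8 logical values per raw byte
packed <- packBits(flags, type = "raw")

logical_bytes <- as.numeric(object.size(flags))   # roughly 4 bytes per value
packed_bytes  <- as.numeric(object.size(packed))  # roughly 1 bit per value

cat("logical vector:", logical_bytes, "bytes\n")
cat("packed bits:   ", packed_bytes, "bytes\n")

# Unpack again to confirm nothing was lost
unpacked <- as.logical(rawToBits(packed))
```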
5. Use memory-mapped files and process the data in chunks: if the data still does not fit into memory no matter how much it is optimized, it has to be stored on disk as memory-mapped files.
The data is then processed chunk by chunk and the results are merged; how easy this is to implement depends on the algorithm itself.
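The chunk-then-merge idea can be sketched with base R. This example reads an ordinary text file through a connection rather than a memory-mapped file, and the file contents and chunk size are made up. It computes a mean, which is easy to merge because only a running sum and a count have to be kept.

```r
# Hypothetical input: one number per line
path <- tempfile(fileext = ".txt")
writeLines(as.character(1:1000), path)

chunk_size <- 100
total <- 0
count <- 0

con <- file(path, open = "r")
repeat {
  lines <- readLines(con, n = chunk_size)  # read the next chunk only
  if (length(lines) == 0) break            # end of file
  values <- as.numeric(lines)
  total <- total + sum(values)             # partial result for this chunk
  count <- count + length(values)
}
close(con)
unlink(path)

chunk_mean <- total / count                # merge the partial results
print(chunk_mean)
```

Statistics such as the median cannot be merged this simply, which is where the choice of algorithm starts to matter.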
bigmemory: big.matrix supports many of the operations available on ordinary R matrices. CRAN packages that support big.matrix objects include biganalytics and bigtabulate.
ff and ffbase: in a simple test on my own machine (16 GB of memory, I7700 u), they could handle a data volume of 200 million. Chunked computation places heavy demands on the algorithm.
--------------------
Up to this point we have covered the various optimizations of serial R code. The next part describes parallel computing on multi-core CPUs.
To be continued .....
R Language High performance programming (II.)