Unity Mobile Performance Optimization

Source: Internet
Author: User
Tags: foreach, lua, thread, cpu usage

https://zhuanlan.zhihu.com/p/26030252


1. Rendering
Use reflection probes instead of real-time reflection and refraction, and as far as possible avoid RTT (render to texture), GrabPass, RenderWithShader, and CommandBuffer.Blit(BuiltinRenderTextureType.CurrentActive ...). Use one unified post-processing framework (bloom, HDR, DOF, etc.) instead of several independent post-processing effects, so that shared steps such as blur can be reused and the number of blits reduced. Keep an eye on RTT dimensions as well. Effects such as air refraction and heat-haze distortion are often built on GrabPass, which not all hardware supports; implement them with RTT or post-processing instead.

Build a unified shader and material rather than many single-purpose shaders, make full use of shader_feature and multi_compile, and expose the macro switches in the material interface. Use texture blending instead of multi-pass texturing, and remove passes such as shadow casting, shadow receiving, Meta, and ForwardAdd when they are not needed.

Use alpha test, discard, clip, alpha-to-coverage, and similar features sparingly, because they interfere with early-Z culling and HSR (hidden surface removal). Avoid alpha-blend interpenetration problems in the content itself, because transparency-sorting techniques such as weighted blending and depth peeling are too expensive.

Use lightmaps instead of dynamic shadows and avoid real-time shadows where possible. Use 16 bits instead of 32 bits for shadow maps and environment maps, and replace real-time shadows with Projector + RTT or a simple circular blob shadow. Manage environment parameters (wind, rain, sun) and other global shader parameters in one place.

Non-hero characters can use MatCap instead of PBR, and non-metallic objects do not need PBR. Choose the F, D, and G terms of physically based rendering carefully:
F: Schlick, Cook-Torrance, lerp (a power of 4 is enough when requirements are not high)
D: Blinn-Phong, Beckmann, GGX, anisotropic GGX
G: Neumann, Cook-Torrance, Kelemen, Smith-GGX
With the Standard shader, pay attention to which of BRDF1 to BRDF3 is selected. When rendering requirements are not high, GGX is unnecessary; where GGX is used, it can be optimized with an approximation based on L·H.
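To put numbers on the advice about RTT dimensions and 16-bit versus 32-bit maps, the following is a minimal arithmetic sketch. The class and method names are hypothetical, and it assumes an uncompressed, single-sample target without mipmaps, with the bit depth counted per pixel; real GPU allocations may be padded or laid out differently.

```csharp
using System;

// Hypothetical helper (not a Unity API): estimates the raw memory used by
// one render target, assuming no mipmaps, no MSAA and no driver padding.
public static class RenderTextureBudget
{
    // Returns width * height * bitsPerPixel / 8, in bytes.
    public static long SizeInBytes(int width, int height, int bitsPerPixel)
    {
        if (width <= 0 || height <= 0 || bitsPerPixel <= 0)
        {
            throw new ArgumentOutOfRangeException();
        }
        // Cast first so the multiplication is done in 64-bit arithmetic.
        return (long)width * height * bitsPerPixel / 8;
    }
}
```

Under these assumptions a 1024 x 1024 target at 32 bits per pixel takes 4,194,304 bytes, the same target at 16 bits per pixel takes half of that, and halving both dimensions to 512 x 512 takes a quarter.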
Prefer fixed and half over float, and standardize the types used across shaders (fixed is cited as roughly 4 times as efficient as float, and half as roughly twice). Choose shader variable qualifiers (uniform, static, global) carefully. Choose shaders from the Mobile or Unlit directories, and provide high and low rendering quality tiers. If memory allows, consider turning on mipmaps.

When using surface shaders, switch off the features you do not need, such as noshadow, noambient, novertexlights, nolightmap, nodynlightmap, nodirlightmap, nofog, nometa, and noforwardadd.

The Standard shader has too many variants (more than 30,000), which leads to long compile times and a striking memory footprint (close to 1 GB). If you use it, turn off unneeded shader_feature keywords such as _PARALLAXMAP, SHADOWS_SOFT, DIRLIGHTMAP_COMBINED, DIRLIGHTMAP_SEPARATE, _DETAIL_MULX2, and _ALPHAPREMULTIPLY_ON, and remove the extra passes.

Shaders generated by Shader Forge or Amplify Shader Editor contain redundant code and need to be optimized by a programmer. Amplify Shader Editor is the more powerful of the two and its source is available, so it is worth studying.

Avoid Unity's built-in Terrain: even with only 3 splat maps, the shader still handles 4. Use T4M or convert the terrain to a mesh. Where many objects share the same model and material, such as grass, use instancing. Use lookup textures (LUTs) to optimize complex lighting, such as skin, hair, and paint. Avoid the procedural sky where possible, because computing Rayleigh and Mie scattering is inefficient. Avoid SpeedTree where possible and use plain models with simple leaf animation instead; that said, SpeedTreeWind.cginc contains a rich set of animation functions, and SmoothTriangleWave in TerrainEngine.cginc is very useful.

Make frequent use of debugging tools to check shader performance. Commonly used tools include the Frame Debugger, Nsight, RenderDoc, AMD GPU ShaderAnalyzer, PVRShaderEditor, Adreno Profiler, Tencent Cube, and UWA.
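The reason keyword-heavy shaders such as the Standard shader reach tens of thousands of variants is that every multi_compile or shader_feature line multiplies the number of combinations. The sketch below only illustrates that arithmetic. The class name is hypothetical, the keyword lines in the usage note are illustrative rather than the Standard shader's actual declarations, and the result is an upper bound for a single pass, since unused shader_feature combinations can be stripped from a build.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a Unity API): upper bound on shader variants
// for one pass, given the keyword options declared on each pragma line.
public static class ShaderVariantEstimate
{
    // Each array models one multi_compile / shader_feature line; its length
    // is the number of options on that line, counting the "off" option ("_").
    public static long CountVariants(IEnumerable<string[]> keywordLines)
    {
        long total = 1;
        foreach (string[] line in keywordLines)
        {
            // An empty line contributes a factor of 1 (no choice to make).
            total *= Math.Max(1, line.Length);
        }
        return total;
    }
}
```

For example, three lines modelled as { "_", "_PARALLAXMAP" }, { "_", "SHADOWS_SOFT" }, and { "_", "DIRLIGHTMAP_COMBINED", "DIRLIGHTMAP_SEPARATE" } give 2 x 2 x 3 = 12 combinations, and each further two-option keyword doubles the total, which is why turning off unneeded keywords pays off quickly.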
For on-device debugging, it also helps to build in a GM (debug) interface that can, for example, toggle shadows or batch-replace shaders.

2. Scripts
Reduce calls to GetComponent, Find, and other lookup functions inside Update and other per-frame functions. Use go.CompareTag instead of go.tag. Reduce SendMessage and similar synchronous calls. Reduce string concatenation. Use for instead of foreach (foreach has been optimized in version 5.5 and later), and use less LINQ. Put networking on a separate thread.

Release optimization: turn off logging and strip unused code. Use pseudo-random values. Move logic from scripts mounted on individual objects into manager-style global classes. In Lua, avoid implementing Update, FixedUpdate, and other per-frame functions as far as possible, because calls between Lua and C# are inefficient.

3. Memory Management
Pool particles, floating-text UI, and other small resources, because frequent GC causes stutter; call GC.Collect() explicitly when needed. Manage resource lifecycles according to resource type and device. Put Resources.Load and AssetBundle loading behind a unified interface, use reference counting to manage lifecycles, and make those lifecycles printable and observable. Ensure that resources are unloaded with their scene and do not stay resident in memory, and determine which resources are preloaded and which are leaks.

Memory leaks (reducing resident memory): a resource that is still held in a container and never removed cannot be unloaded by Resources.UnloadUnusedAssets. To detect this case, use Take Sample in the Profiler's Memory view: by looking at the AssetBundle names under WebStream or SerializedFile, you can tell whether there is a "leak". You can also check the app's memory as reported by Android PSS or iOS Instruments.

Heap memory too large: avoid large one-off heap allocations. Once Mono's heap has been allocated it is not returned to the system, which means the Mono heap only ever grows. Common causes are calling new at high frequency and log output.

High CPU consumption: NGUI mesh rebuilds make UIPanel.LateUpdate expensive (split panels into static, moving, and frequently moving elements), and the NGUI anchor's own update logic consumes a lot of CPU. Even if a control is stationary, its anchor is updated every frame (see the UIWidget.OnUpdate function), and the update is recursive, which raises CPU usage further. So we modified NGUI's internal code so that the anchor is updated only when necessary, generally when the control is initialized and when the screen size changes. The cost of this optimization is that when the vertex positions of a control change (for example when the control moves or is resized), the calling logic is responsible for updating the anchor.
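The pooling advice at the start of this section can be sketched as a small generic pool. This is a minimal illustration under stated assumptions, not the author's implementation: the SimplePool name, the create and reset callbacks, and the maxFree cap are all hypothetical, and the code is plain C# so that it runs outside the engine. In a Unity project the callbacks would presumably instantiate and deactivate a GameObject.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal pool: hands out recycled instances so that
// high-frequency objects do not produce a steady stream of garbage.
public sealed class SimplePool<T> where T : class
{
    private readonly Stack<T> free = new Stack<T>();
    private readonly Func<T> create;
    private readonly Action<T> reset;
    private readonly int maxFree;

    public SimplePool(Func<T> create, Action<T> reset, int maxFree)
    {
        if (create == null) throw new ArgumentNullException("create");
        this.create = create;
        this.reset = reset;       // optional; may be null
        this.maxFree = maxFree;   // cap on idle instances kept alive
    }

    public int FreeCount { get { return free.Count; } }

    // Reuses an idle instance if one exists, otherwise creates a new one.
    public T Get()
    {
        return free.Count > 0 ? free.Pop() : create();
    }

    // Resets the instance and keeps it for reuse unless the pool is full.
    // Note: releasing the same instance twice is not detected here.
    public void Release(T item)
    {
        if (item == null) throw new ArgumentNullException("item");
        if (reset != null) reset(item);
        if (free.Count < maxFree) free.Push(item);
    }
}
```

The cap on idle instances is a design choice: without it, a short burst of activity would leave the pool holding memory for the rest of the session, which works against the goal of keeping resident memory low.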
The more dynamic UI elements there are in the same UIPanel, the larger the mesh that is created and the greater the cost of rebuilding it. For example, HUD health bars can appear in large numbers during battle; in that case the development team should split the moving health bars across several UIPanels, with roughly 5 to 10 dynamic UI elements per UIPanel. In essence, this reduces the probability of a large UIPanel rebuild cost landing in any single frame.

Resource redundancy: the same resource packed into several AssetBundles, or resources modified at run time so that multiple instances are created (for example by dynamically modifying materials through Renderer.material, or calling Animation.AddClip).

Trading disk space for memory: for AssetBundle files with a large WebStream footprint (such as the AssetBundles for UI atlases), load them with LoadFromCacheOrDownload or CreateFromFile instead, so that the decompressed AssetBundle data is stored in the local cache and used from there. This suits projects where memory is very tight, since it trades local disk space for memory.

4. Art
Establish resource review specifications and a review tool: a production specification for PBR materials and textures, a resource budget specification for scenes, a production specification for characters, and a production specification for special effects. Use AssetPostprocessor to build the review tool.

Compress textures, optimize sprite fill rate, compress animations, compress audio, and compress UI assets (nine-slice sprites are better than stretched ones). Strictly control polygon counts, texture counts, and character bone counts.

Particles: use recorded animations in place of particles, reduce the number of particles, and do not use particle collision.

Characters: enable Optimize Game Objects to reduce nodes, and use tools such as SimpleLOD or Cruncher to reduce polygon counts.

Models: on import, check the Read/Write, Optimize Mesh, normal and tangent, and color settings, and disable mipmaps where they are not needed.

Texture compression issues: compression can leave too few color levels (banding). Use ETC1 for textures without an alpha channel. Fewer than 5% of Android devices now lack ETC2 support, so it is recommended to abandon the separate alpha channel approach.

UI: wherever possible, separate dynamic UI elements from static ones into different UIPanels (UI rebuilds happen per UIPanel), so that the rebuilds caused by changing elements are confined to as small a range as possible. Dynamic elements should in turn be grouped by how they move, so that elements with different movement frequencies sit in different UIPanels. UGUI: make full use of separate Canvases to split up different elements.

Large textures can cause stuttering when loaded and can be cut up and loaded in several pieces.

Audio: use MP3 compression on iOS and Vorbis compression on Android.

5. Batch
Turn on static batching.

Turn on dynamic batching: it requires models with fewer than 900 vertices, fewer than 300 when normals are used, and fewer than 180 when tangents are used as well. Non-uniform scaling, lightmaps, multi-pass materials, and so on will break dynamic batching.

Reduce the number of GameObjects: the number of models in a scene has a huge impact on FPS. Fewer batches are generally better, but very large batches of rendering data put pressure on bus transfer.

6. Physics
Set objects that do not need to move to static. Do not use mesh colliders, and do not use collision for characters. Optimize trigger logic, and tune the pathfinding frequency, the AI logic frequency, and Fixed Timestep; reduce the frame rate to 30.

Complex calculations that cause stutter, such as pathfinding or bulk resource loading, can be spread across frames or handled asynchronously with coroutines.
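Spreading heavy work across frames can be sketched as a job queue that is drained against a per-frame time budget. This is a minimal illustration, not the author's code: FrameSlicer and its members are hypothetical names, and a plain queue is swapped in for Unity coroutines so the sketch runs without the engine. It assumes the work has already been cut into small jobs, such as one path request or one asset per job.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical time-slicing queue: runs queued jobs until the frame's
// budget is used up and leaves the rest for later frames.
public sealed class FrameSlicer
{
    private readonly Queue<Action> pending = new Queue<Action>();

    public int PendingCount { get { return pending.Count; } }

    public void Enqueue(Action job)
    {
        if (job == null) throw new ArgumentNullException("job");
        pending.Enqueue(job);
    }

    // Call once per frame. Always runs at least one job when the queue is
    // not empty, so it keeps making progress even with a tiny budget.
    // Returns the number of jobs executed during this call.
    public int RunFrame(double budgetMilliseconds)
    {
        Stopwatch timer = Stopwatch.StartNew();
        int executed = 0;
        while (pending.Count > 0)
        {
            Action job = pending.Dequeue();
            job();
            executed++;
            if (timer.Elapsed.TotalMilliseconds >= budgetMilliseconds)
            {
                break;
            }
        }
        return executed;
    }
}
```

In a Unity project, RunFrame would presumably be called from a MonoBehaviour's Update with a budget of a few milliseconds. The budget is only checked between jobs, so a single oversized job can still overrun the frame; the jobs themselves need to stay small.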
