000

Index Labels

Garbage Collection and Memory Allocation Sizes

.

As a performance conscious programmer in a soft-realtime environment I've never been too fond of garbage collection.

Incremental garbage collectors (like the one in Lua) make it tolerable (you get rid of the horrible garbage collection stalls), but there is still something unsettling about it. I keep looking at the garbage collection time in the profiler, and I can't shake the feeling that all that time is wasted, because it doesn't really do anything.

Of course that isn't true. Garbage collection frees the programmers from a lot of busywork. And the time they gain can go into optimizing other systems, which leads to a net performance win.

It also simplifies some of the hairy ownership questions that arise when data is transferred between systems. Without garbage collection, those questions must be solved in some other way. Either by reference counting (error-prone) or by making local copies of the data to assume ownership (ugly and costly).

But still, there is that annoying performance hit.

I was pretty surprised to see that the developers Go, a language that looks well-designed and targets low-level programmers, decided to go with garbage collection rather than manual memory management. It seemed like a strange choice.

But recently I've started to see things differently.

One thing I've noticed as I delve deeper and deeper into data-oriented design is that I tend to allocate memory in much larger chunks than before. It's a natural consequence of trying to keep things continuous in memory, treating resources as large memory blobs and managing arrays of similar objects together.

This has interesting consequences for garbage collection, because when the garbage collector only has to keep track of a small number of large chunks, rather than a large number of small chunks, it can perform a lot better.

Let's look at a simple example in Lua. Say we want to write a class for managing bullets. In the non-data-oriented solution, we allocate each bullet as a separate object:

function Bullet:update(dt)
self.position = self.position + self.velocity * dt
end

function Bullets:update(dt)
for i,bullet in ipairs(self.bullets) do
bullet:update(dt)
end
end

In the data-oriented solution, we instead use two big arrays to store the position and velocity of all the bullets:

function Bullets:update(dt)
for i=1,#self.pos do
self.pos[i] = self.pos[i] + dt * self.vel[i]
end
end

I tested these two solutions with a large number of bullets and got two interesting results:

  • The data-oriented solution runs 50 times faster.

  • The data-oriented solution only needs half as much time for garbage collection.

That the data-oriented solution runs so much faster shows what cache coherence can do for you. It is also a testament to how awesome LuaJIT is when you give it tight inner loops to work with.

Note that in this test, the Bullet code itself did not create any garbage. The speed-up comes from being faster at collecting the garbage created by other systems. And the reason for this is simply that with fewer, larger memory allocations, there is less stuff that the garbage collector has to trawl through. If we add in the benefit that the data-oriented solution will create fewer objects and generate less garbage, the benefits will be even greater.

So maybe the real culprit in isn't garbage collection, but rather having many small memory allocations. And having many small memory allocations does not just hurt the garbage collector, it is bad for other reasons as well. It leads to bad cache usage, high overhead in the memory allocator, fragmentation and bad allocator performance. It also makes all kinds of memory problems harder to deal with: memory leaks, dangling pointers, tracking how much memory is used by each system, etc.

So it is not just garbage-collected languages like Lua that would benefit from allocating memory in larger chunks, but manually managed languages like C++ as well.

Recently, I've come to think that the best solution to memory management issues in C++ is to avoid the kitchen-sink global memory allocator as much as possible and instead let each subsystem take a much more hands-on approach to managing its own memory.

What I mean by this is that instead of having the sound system (for example) send lots of memory requests to the kitchen-sink memory manager, it would only request a few large memory blocks. Then, it would be the responsibility of the system to divide that up into smaller, more manageable pieces that it can make practical use of.

This approach has a number of advantages:

  • Since the system knows the usage patterns for its data, it can arrange the memory efficiently. A global memory allocator has no such knowledge.

  • It becomes much easier to track memory use by system. There will be a relatively small number of global memory allocations, each tagged by system. It becomes obvious how much memory each system is consuming.

  • Memory inside a system can be easily tracked, since the system knows what the memory means and can thus give useful information about it (such as the name of the object that owns it).

  • When a system shuts down it can quickly and efficiently free all of its memory.

  • Fragmentation problems are reduced.

  • It actively encourages good memory behavior. It makes it easier to do good things (achieve cache locality, etc) and harder to do bad things (lots of small memory allocations).

  • Buffer overflows will tend to overwrite data within the same system or cause page faults, which will make them easier to find.

  • Dangling pointer access will tend to cause page faults, which will make them easier to find.

I'm tempted to go so far as to only allow whole page allocations on the global level. I.e., a system would only be allowed to request memory from the global manager in chunks of whole system pages. Then it would be up to the system to divide that up into smaller pieces. For example, if we did the bullet example in C++, we might use one such chunk to hold our array of Bullet structs.

This has the advantage of completely eliminating external fragmentation. (Since everything is allocated in chunks of whole pages and they can be remapped by the memory manager.) We can still get address space fragmentation, but using a 64-bit address space should take care of that. And with this approach using 64-bit pointers is less expensive, because we have fewer individually allocated memory blocks and thus fewer pointers.

Instead we get internal fragmentation. If we allocate the bullet array as a multiple of the page size (say 4 K), we will on average have 2 K of wasted space at the end of the array (assuming the number of bullets is random).

But internal fragmentation is a much nicer problem to deal with than external fragmentation. When we have internal fragmentation, it is one particular system that is having trouble. We can go into that system and do all kinds of things to optimize how its handling memory and solve the problem. With external fragmentation, the problem is global. There is no particular system that owns it and no clear way to fix it other than to try lots of things that we hope might "improve" the fragmentation situation.

The same goes for out-of-memory problems. With this approach, it is very clear which system is using too much memory and easy to fix that by reducing the content or doing optimizations to that system.

Dealing with bugs and optimizations on a system-by-system simplifies things enormously. It is quite easy to get a good grasp of everything that happens in a particular system. Grasping everything happens in the entire engine is a superhuman task.

Another nice thing about this approach is that it is quite easy to introduce it on a system-by-system basis. All we have to do is to change one system at a time so that it allocates its memory using the page allocator, rather than the kitchen-sink allocator.

And if we have some messy systems left that are too hard to convert to this approach we can just let them keep using the kitchen-sink allocator. Or, even better, we can give them their own private heaps in memory that they allocate from the page allocator. Then they can make whatever mess they want there, without disturbing the other systems.

Blog Archive

Labels

.NET Programming 2D Drafting 3D Animation 3D Art 3D Artist 3D design 3D effects 3D Engineering 3D Materials 3D Modeling 3D models 3D presentation 3D Printing 3D rendering 3D scanning 3D scene 3D simulation 3D Sketch Inventor 3D Texturing 3D visualization 3D Web App 3ds Max 4D Simulation ACC Adaptive Clearing adaptive components Add-in Development Additive Manufacturing Advanced CAD features Advanced Modeling AEC Technology AEC Tools affordable Autodesk tools AI AI animation AI Assistance AI collaboration AI Design AI Design Tools AI Experts AI for Revit AI Guide AI in CAD AI in CNC AI in design AI in Manufacturing AI in Revit AI insights AI lighting AI rigging AI Tips AI Tools AI troubleshooting AI workflow AI-assisted AI-assisted rendering AI-enhanced Animation animation pipeline animation tips Animation workflow annotation AR architectural design architectural modeling architectural preservation architectural visualization Architecture architecture design Architecture Engineering Architecture Firm Architecture Productivity architecture software architecture technology Architecture Workflow Arnold Renderer Arnold Shader Artificial Intelligence As-Built Model Asset Management augmented reality AutoCAD AutoCAD advice AutoCAD API AutoCAD Basics AutoCAD Beginner AutoCAD Beginners AutoCAD Civil 3D AutoCAD Civil3D AutoCAD commands AutoCAD efficiency AutoCAD Expert Advice AutoCAD features AutoCAD File Management AutoCAD Layer AutoCAD Layers AutoCAD learning AutoCAD print settings AutoCAD productivity AutoCAD Teaching AutoCAD Techniques AutoCAD tips AutoCAD tools AutoCAD training. AutoCAD tricks AutoCAD Tutorial AutoCAD workflow AutoCAD Xref Autodesk Autodesk 2025 Autodesk 2026 Autodesk 3ds Max Autodesk AI Autodesk AI Tools Autodesk Alias Autodesk AutoCAD Autodesk BIM Autodesk BIM 360 Autodesk Certification Autodesk Civil 3D Autodesk Cloud Autodesk community forums Autodesk Construction Cloud Autodesk Docs Autodesk Dynamo Autodesk features Autodesk for Education Autodesk Forge Autodesk FormIt Autodesk Fusion Autodesk Fusion 360 Autodesk help Autodesk InfraWorks Autodesk Inventor Autodesk Inventor Frame Generator Autodesk Inventor iLogic Autodesk Knowledge Network Autodesk License Autodesk Maya Autodesk mistakes Autodesk Navisworks Autodesk news Autodesk plugins Autodesk productivity Autodesk Recap Autodesk resources Autodesk Revit Autodesk Software Autodesk support ecosystem Autodesk Takeoff Autodesk Tips Autodesk training Autodesk tutorials Autodesk update Autodesk Upgrade Autodesk Vault Autodesk Video Autodesk Viewer Automated Design Automation Automation Tutorial automotive design automotive visualization Backup Basic Commands Basics Batch Plot Beginner Beginner Tips beginner tutorial beginners guide Big Data BIM BIM 360 BIM Challenges BIM collaboration BIM Compliance BIM Coordination BIM Data BIM Design BIM Efficiency BIM for Infrastructure BIM Implementation BIM Library BIM Management BIM modeling BIM software BIM Standards BIM technology BIM tools BIM Trends BIM workflow Block Editor Block Management Block Organization Building Design Software Building Maintenance building modeling Building Systems Building Technology ByLayer CAD CAD API CAD assembly CAD Automation CAD Blocks CAD CAM CAD commands CAD comparison CAD Customization CAD Data Management CAD Design CAD errors CAD Evolution CAD File Size Reduction CAD Integration CAD Learning CAD line thickness CAD management CAD Migration CAD mistakes CAD modeling CAD Optimization CAD plugins CAD Productivity CAD Rendering CAD Security CAD Skills CAD software CAD software 2026 CAD software training CAD standards CAD technology CAD Tips CAD Tools CAD tricks CAD Tutorial CAD workflow CAM car design software Case Study CEO Guide CGI design Character Rig cinematic lighting Civil 3D Civil 3D hidden gems Civil 3D productivity Civil 3D tips civil design software civil engineering Civil engineering software Clash Detection Class-A surfacing clean CAD file cleaning command client engagement Cloud CAD Cloud Collaboration Cloud design platform Cloud Engineering Cloud Management Cloud Storage Cloud-First CNC CNC machining collaboration command abbreviations Complex Renovation concept car conceptual workflow Connected Design construction Construction Analytics Construction Automation Construction BIM Construction Cloud Construction Planning Construction Scheduling Construction Technology contractor tools Contractor Workflow Contraints corridor design Cost Effective Design cost estimation Create resizable blocks Creative Teams CTB STB Custom visual styles Cutting Parameters Cybersecurity Data Backup data management Data Protection Data Reference Data Security Data Shortcut Design Automation Design Career Design Collaboration Design Comparison Design Coordination design efficiency Design Engineering Design Hacks Design Innovation design optimization Design Options design productivity design review Design Rules design software design software tips Design Technology design tips Design Tools Design Workflow design-to-construction Designer Designer Tools Digital Art Digital Assets Digital Construction Digital Construction Technology Digital Content Digital Design Digital engineering digital fabrication Digital Manufacturing digital marketing digital takeoff Digital Thread Digital Tools Digital Transformation Digital Twin Digital Twins digital workflow dimension dimensioning Disaster Recovery drafting Drafting Efficiency Drafting Shortcuts Drafting Standards Drafting Tips Drawing Drawing Automation drawing tips Dref Dynamic Block Dynamic Block AutoCAD Dynamic Blocks Dynamic doors Dynamic windows Dynamo Dynamo automation early stage design eco design editing commands Electrical Systems Emerging Features Energy Analysis energy efficiency Engineering Engineering Automation engineering data Engineering Design Engineering Innovation Engineering Productivity Engineering Skills engineering software Engineering Technology engineering tools Engineering Tools 2025 Engineering Workflow Excel Export Workflow Express Tools External Reference facial animation Facial Rigging Facility Management Families Fast Structural Design Field Documentation File Optimization File Recovery Flame flange tips flat pattern Forge Development Forge Viewer FreeCAD Fusion 360 Fusion 360 API Fusion 360 tutorial Future of Design Future Skills Game Development Gamification Generative Design Geospatial Data GIS Global design teams global illumination grading optimization green building Green Technology Grips Handoff HDRI health check Healthcare Facilities heavy CAD file Heavy CAD Files heritage building conservation hidden commands Hospital Design HVAC HVAC Design Tools HVAC Engineering Hydraulic Modeling IK/FK iLogic Import Workflow Industry 4.0 Infrastructure infrastructure design Infrastructure Monitoring Infrastructure Planning Infrastructure Technology InfraWorks innovation Insight intelligent modeling Interactive Design interactive presentation Interior Design Inventor Inventor API Inventor Drawing Template Inventor Frame Generator Inventor Graphics Issues Inventor IDW Inventor Tips Inventor Tutorial IoT ISO 19650 joints Keyboard Shortcuts keyframe animation Keyframe generation Landscape Design Large Projects Laser Scan Layer Management Layer Organization Learn AutoCAD Legacy CAD Licensing light techniques Lighting and shading Lighting Techniques Linked Models Machine Learning Machine Learning in CAD Machine Optimization Machining Efficiency maintenance command Management manufacturing Manufacturing Innovation Manufacturing Technology Mapping Technology marketing visuals Material Creation Maya Maya character animation Maya lighting Maya Shader Maya Tips Maya tutorial measurement Mechanical Design Mechanical Engineering Media & Entertainment MEP Modeling Mesh-to-BIM Metal Structure modal analysis Model Management Model Optimization Modeling Secrets Modular Housing Motion capture motion graphics motion simulation MotionBuilder Multi Office Workflow Multi-User Environment multileader Navisworks Navisworks Best Practices Net Zero Design ObjectARX .NET API Open Source CAD Organization OVERKILL OVERKILL AutoCAD Page Setup Palette Parametric Components parametric design parametric family Parametric Modeling particle effects particle systems PDF PDM system Personal Brand Phasing PlanGrid Plot Settings Plot Style Plot Style AutoCAD Plotting Plugin Tutorial Plumbing Design point cloud Portfolio Post Construction Post-Processing Practice Drawing preconstruction workflow predictive analysis predictive animation Predictive Maintenance Predictive rigging Prefabrication Presentation-ready visuals Printing Printing Quality Procedural animation procedural motion Procedural Rig Procedural Textures Product Design Product Development product lifecycle product rendering Productivity productivity tools Professional 3D design Professional CAD Professional Drawings professional printing Professional Tips Project Documentation project efficiency project management Project Management Tools Project Visualization PTC Creo PURGE PURGE AutoCAD Rail Transit Rapid Prototyping realistic rendering ReCap Redshift Shader reduce CAD file size Render Render Passes Render Quality Render Settings Rendering rendering engine Rendering Engines Rendering Optimization rendering software Rendering Tips Rendering Workflow RenderMan Renewable Energy Renovation Project Renovation Workflow Reports Resizable Block restoration workflow Revit Revit add-ins Revit API Revit automation Revit Best Practices Revit Collaboration Revit Documentation Revit Family Revit integration Revit MEP Revit Performance Revit Phasing Revit Plugins Revit Scripting Revit skills Revit Standards Revit Template Revit Tips Revit tutorial Revit Workflow Ribbon Rigging robotics ROI Scale Autodesk Schedules screen Sculpting Secure Collaboration Sensor Data Shader Networks Sheet Metal Design Sheet Metal Tricks Sheet Set Manager shortcut keys Shortcuts Siemens NX Simulation simulation tools Sketch Sketching Tricks Small Firms Smart Architecture Smart Block Smart Building Design Smart City Smart Design Smart Engineering Smart Factory Smart Infrastructur Software Compliance software ecosystem Software Management Software Trends software troubleshooting Software Update Solar Energy Solar Panels SolidWorks Startup Design static stress Steel Structure Design Structural Optimization subscription model Subscription Value Surface Modeling sustainability sustainable design Sustainable Manufacturing system performance T-Spline team training guide Technical Drawing technical support Template Setup text style Texture Mapping Texturing thermal analysis Time Management time saving tools Title Blocks toolbar Toolpath Optimization Toolpaths Topography Troubleshooting Tutorial Tutorials urban planning User Interface (UI) UV Mapping UV Unwrap V-Ray Vault Best Practices Vault Lifecycle Vault Mistakes Vector Plotting vehicle modeling VFX Viewport configuration Virtual Environments virtual reality visual effects visualization workflow VR VR Tools VRED Water Infrastructure Water Management Weight Painting What’s New in Autodesk Wind Energy Wind Turbines Workbook workflow Workflow Automation workflow efficiency Workflow Optimization Workflow Tips Worksets Worksharing Workspace XLS Xref Xrefs เขียนแบบ