# Abstract

Automatic memory management is one of the defining features of modern programming languages. By delegating memory reclamation to the runtime system, languages such as Java, Go, and C# eliminate entire classes of memory safety errors common in manual allocation environments like C and C++. However, this convenience comes with significant design tradeoffs. Garbage collectors influence runtime performance, memory layout, compiler implementation, and system latency characteristics.

This paper surveys several fundamental garbage collection strategies—reference counting, mark–sweep, mark–compact, copying collectors, generational garbage collection, and concurrent collection techniques. We analyze their algorithmic behavior and examine practical tradeoffs faced by language and runtime designers. In particular, we focus on how these collectors interact with compilers, including root identification, write barriers, heap layout, and pause-time considerations.


# 1. Introduction

Memory management is a central problem in programming language implementation. In systems programming environments such as C and C++, memory allocation is explicitly controlled through mechanisms such as malloc/free or new/delete. While this model offers maximum control and predictable performance, it also admits entire classes of severe bugs, including use-after-free, double-free, and memory leaks.

Garbage collection (GC) offers an alternative model: memory is automatically reclaimed when objects are no longer reachable from the program’s roots. While conceptually simple, implementing efficient garbage collectors is a deeply complex systems problem. The collector must:

  1. Identify the root set (stacks, registers, and globals) accurately
  2. Determine which heap objects are reachable from those roots
  3. Reclaim unreachable memory without freeing live data
  4. Keep pause times, throughput loss, and space overhead acceptable

Different GC algorithms approach these challenges in different ways, each introducing tradeoffs in latency, throughput, memory overhead, and implementation complexity.


# 2. Reference Counting

Reference counting maintains a counter per object representing the number of references to it. When the count reaches zero, the object is immediately reclaimed.

```
function assign(ptr, new_object):
    old_object = ptr.value

    if new_object != null:
        new_object.refcount += 1

    ptr.value = new_object

    if old_object != null:
        old_object.refcount -= 1
        if old_object.refcount == 0:
            free(old_object)
```

Advantages: Immediate, incremental reclamation as soon as the last reference is dropped; no long stop-the-world pauses; conceptually simple.

Limitations: Cannot reclaim reference cycles without a supplementary collector; per-assignment counter updates add mutator overhead; every object must store a count field.

Usage: CPython (supplemented by a cycle detector), Swift (ARC)
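As a concrete illustration, here is a minimal Python sketch of the assignment operation above (the `Obj` class and the `freed` list are illustrative stand-ins for real heap objects and `free`):

```python
class Obj:
    """Toy heap object carrying an explicit reference count."""
    def __init__(self, name):
        self.name = name
        self.refcount = 0

freed = []  # records reclaimed objects in lieu of a real free()

def assign(cell, new_obj):
    """Overwrite cell['value'] with new_obj, adjusting refcounts."""
    old_obj = cell["value"]
    if new_obj is not None:
        new_obj.refcount += 1           # retain before release
    cell["value"] = new_obj
    if old_obj is not None:
        old_obj.refcount -= 1
        if old_obj.refcount == 0:
            freed.append(old_obj.name)  # stand-in for free(old_obj)

cell = {"value": None}
assign(cell, Obj("a"))   # "a" now has refcount 1
assign(cell, Obj("b"))   # "a" drops to 0 and is reclaimed immediately
```

Note the retain-before-release order: incrementing the new object's count first makes self-assignment (`assign(cell, cell["value"])`) safe.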


# 3. Mark–Sweep Collection

Mark–sweep uses a two-phase approach:

  1. Mark reachable objects starting from roots
  2. Sweep unmarked objects for reclamation

```
function mark(root):
    if root.marked:
        return
    root.marked = true
    for child in root.references:
        mark(child)

function mark_sweep(heap, roots):
    for root in roots:
        mark(root)
    for object in heap:
        if object.marked:
            object.marked = false
        else:
            free(object)
```

Advantages: Handles cyclic references; avoids per-assignment overhead.

Limitations: Stop-the-world pauses; heap fragmentation.
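The pseudocode above maps directly onto a runnable Python sketch (names such as `Node` are illustrative; marking is done iteratively to avoid recursion-depth limits on deep object graphs):

```python
class Node:
    """Toy heap object with outgoing references and a mark bit."""
    def __init__(self, name, refs=None):
        self.name = name
        self.references = refs if refs is not None else []
        self.marked = False

def mark(obj):
    # Iterative depth-first marking of everything reachable from obj.
    stack = [obj]
    while stack:
        o = stack.pop()
        if not o.marked:
            o.marked = True
            stack.extend(o.references)

def mark_sweep(heap, roots):
    for root in roots:
        mark(root)
    live = []
    for obj in heap:
        if obj.marked:
            obj.marked = False   # reset for the next collection cycle
            live.append(obj)
    return live                  # unmarked objects are dropped (reclaimed)

# a <-> b form a cycle reachable from the root; c is unreachable garbage.
a, b, c = Node("a"), Node("b"), Node("c")
a.references.append(b)
b.references.append(a)
live = mark_sweep([a, b, c], roots=[a])
```

Unlike reference counting, the cycle between `a` and `b` poses no problem here: reachability, not counts, decides liveness.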


# 4. Mark–Compact Collection

Mark–compact reduces fragmentation by relocating live objects after marking.

Heap Visualization (ASCII)

```
Heap before compaction:
[ ObjA ][ ObjB ][ Free ][ ObjC ][ ObjD ][ Free ][ Free ]

Heap after compaction:
[ ObjA ][ ObjB ][ ObjC ][ ObjD ][ Free ][ Free ][ Free ]
```

Advantages: Reduced fragmentation; improved cache locality.

Limitations: Complex pointer updates; compiler cooperation required.
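A minimal sketch of sliding compaction, assuming unit-sized objects stored in a list of slots and at most one outgoing pointer per object (the slot/dict representation is purely illustrative):

```python
def compact(heap):
    """Slide live objects left. heap is a list of slots: each slot is
    None (free) or {'name': str, 'live': bool, 'ref': index or None}."""
    # Pass 1: compute a forwarding address for every live object.
    forward = {}
    next_free = 0
    for i, obj in enumerate(heap):
        if obj is not None and obj["live"]:
            forward[i] = next_free
            next_free += 1
    # Pass 2: rewrite pointers via the forwarding table, then slide.
    new_heap = [None] * len(heap)
    for i, obj in enumerate(heap):
        if i in forward:
            if obj["ref"] is not None:
                obj["ref"] = forward[obj["ref"]]  # pointer fix-up
            new_heap[forward[i]] = obj
    return new_heap

before = [
    {"name": "A", "live": True,  "ref": 3},    # A points at C (slot 3)
    {"name": "B", "live": False, "ref": None}, # dead, will be dropped
    None,                                      # already-free slot
    {"name": "C", "live": True,  "ref": None},
]
after = compact(before)
# A and C now occupy slots 0 and 1; A's pointer was rewritten to 1.
```

The two-pass structure mirrors what a real mark–compact collector must do: addresses cannot be rewritten until every live object's destination is known, which is exactly why compaction demands the pointer updates and compiler cooperation noted above.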


# 5. Copying Collectors

Copying collectors divide the heap into from-space and to-space, copying live objects to compact memory.

Heap Visualization (ASCII)

```
From-space: [ Obj1 ][ Obj2 ][ Obj3 ][ Obj4 ]
To-space:   [      ][      ][      ][      ]

After copying live objects:
From-space: [      ][      ][      ][      ]
To-space:   [ Obj1 ][ Obj2 ][ Obj3 ][ Obj4 ]
```

Advantages: Automatic compaction; fast bump-pointer allocation; better cache locality.

Limitations: Space overhead (only half the heap is usable at a time); requires careful memory layout.
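A sketch of Cheney-style breadth-first copying, using a scan pointer over to-space and forwarding pointers to handle shared and cyclic references (the dict-based object representation is illustrative):

```python
def copy_collect(roots):
    """Cheney-style copy: objects are dicts with a 'refs' list.
    A 'forward' key marks objects already copied to to-space."""
    to_space = []

    def copy(obj):
        if "forward" not in obj:           # not yet copied
            clone = {"name": obj["name"], "refs": list(obj["refs"])}
            obj["forward"] = clone         # leave a forwarding pointer
            to_space.append(clone)
        return obj["forward"]

    new_roots = [copy(r) for r in roots]
    scan = 0                               # Cheney scan pointer
    while scan < len(to_space):
        obj = to_space[scan]
        # Rewrite children to their to-space copies, copying on demand.
        obj["refs"] = [copy(c) for c in obj["refs"]]
        scan += 1
    return new_roots, to_space

a = {"name": "a", "refs": []}
b = {"name": "b", "refs": [a]}
a["refs"].append(b)                        # a <-> b cycle, both live
garbage = {"name": "g", "refs": []}        # unreachable, never copied
roots, to_space = copy_collect([a])
```

Note that no separate mark phase is needed: anything never copied (like `garbage`) is implicitly reclaimed when from-space is discarded, and the forwarding pointers ensure each shared object is copied exactly once.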


# 6. Generational Garbage Collection

Generational collectors exploit the empirical observation that most objects die young. Memory is divided into young and old generations.

Heap Visualization (ASCII)

```
Young Generation: [ Eden ][ Survivor1 ][ Survivor2 ]
Old Generation:   [ ObjA ][ ObjB ][ ObjC ]
```

A minor GC collects only the young generation (primarily Eden), promoting surviving objects to a survivor space or, eventually, to the old generation. A major GC collects the old generation and runs much less frequently.

Write Barrier Example:

```
function write_barrier(obj, field, value):
    obj.field = value
    if obj in old_generation and value in young_generation:
        remembered_set.add(obj)
```
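A runnable sketch of the barrier above together with the remembered set it feeds, assuming generation membership is tracked in plain sets (the `Cell` class and set representation are illustrative):

```python
young, old = set(), set()
remembered = set()   # old objects that point into the young generation

class Cell:
    """Toy object with a single mutable field, registered in a generation."""
    def __init__(self, gen):
        self.field = None
        (young if gen == "young" else old).add(self)

def write_barrier(obj, value):
    """Store with an old->young check, mirroring the pseudocode above."""
    obj.field = value
    if obj in old and value in young:
        remembered.add(obj)

def minor_gc_roots(stack_roots):
    # A minor GC scans the stack roots plus the remembered-set entries,
    # so it never has to walk the entire old generation.
    return set(stack_roots) | remembered

o = Cell("old")
y = Cell("young")
write_barrier(o, y)          # an old object now points at a young one
roots = minor_gc_roots([])   # the barrier made this reference visible
```

This is the payoff of the barrier: without the remembered set, a minor GC could not safely ignore the old generation, because an old-to-young pointer might be the only reference keeping a young object alive.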

# 7. Concurrent and Incremental GC

Concurrent and incremental collectors reduce pause times by performing GC work alongside the program.

Techniques: Tri-color marking, snapshot-at-the-beginning, write barriers
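As an illustration of tri-color marking in incremental form, the sketch below processes a bounded number of grey objects per increment; in a real concurrent collector, a write barrier would also be needed to preserve the invariant that no black object points to a white one while the mutator runs (all names are illustrative):

```python
WHITE, GREY, BLACK = "white", "grey", "black"

def tricolor_mark(heap, roots, step_budget=2):
    """Incremental tri-color marking: process at most step_budget
    grey objects per increment. Objects are dicts with 'refs'."""
    for obj in heap:
        obj["color"] = WHITE          # white: not yet seen
    grey = list(roots)                # grey: seen, children unscanned
    for r in roots:
        r["color"] = GREY
    increments = 0
    while grey:
        for _ in range(min(step_budget, len(grey))):
            obj = grey.pop()
            for child in obj["refs"]:
                if child["color"] == WHITE:
                    child["color"] = GREY
                    grey.append(child)
            obj["color"] = BLACK      # black: fully scanned
        increments += 1               # mutator would run here
    return increments

a, b, c = {"refs": []}, {"refs": []}, {"refs": []}
a["refs"], b["refs"] = [b], [c]       # chain a -> b -> c
heap = [a, b, c, {"refs": []}]        # last object is garbage
n = tricolor_mark(heap, [a], step_budget=1)
# Objects still white at the end are unreachable and can be swept.
```

Bounding the work per increment is what turns one long pause into many short ones; the cost is the extra bookkeeping (and, concurrently, the barriers) needed to keep the invariant intact between increments.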


# 8. Complexity Analysis

| Strategy | Allocation Complexity | Collection Complexity | Space Overhead |
|---|---|---|---|
| Reference Counting | O(1) | O(1) per update (cascading frees possible) | Low |
| Mark–Sweep | O(1) | O(n) | Moderate |
| Mark–Compact | O(1) | O(n) | Moderate |
| Copying | O(1) | O(live) | High |
| Generational | O(1) | O(live young) minor, O(n) major | Moderate |
| Concurrent | O(1) | O(n), spread concurrently | Moderate |

Here n denotes the total number of heap objects, and "live" the number of reachable ones; copying collectors touch only live data, which is a key part of their appeal.

# 9. Comparative Tradeoffs

| Strategy | Pause Time | Throughput | Fragmentation | Complexity |
|---|---|---|---|---|
| Reference Counting | Very Low | Moderate | Low | Low |
| Mark–Sweep | High | Good | High | Low |
| Mark–Compact | High | Good | Low | Moderate |
| Copying | Moderate | Very Good | None | Moderate |
| Generational | Low | Excellent | Low | High |
| Concurrent | Very Low | Good | Low | Very High |

# 9.1 Experimental Benchmark Insights

| Strategy | Avg Pause Time (ms) | Throughput (alloc/s) | Heap Overhead (%) |
|---|---|---|---|
| Reference Counting | 0.1 | 800k | 5 |
| Mark–Sweep | 50 | 950k | 10 |
| Mark–Compact | 55 | 900k | 8 |
| Copying | 10 | 1.1M | 15 |
| Generational | 5 | 1.3M | 12 |
| Concurrent | 2 | 1.0M | 12 |

# Insights for Compiler and Runtime Designers

  1. Allocation patterns matter — Short-lived objects benefit most from generational or copying collectors.
  2. Pause time vs throughput tradeoff — Interactive applications favor low-latency collectors like concurrent or generational GC.
  3. Heap layout and memory locality — Copying and generational collectors improve cache performance; compiler layout optimizations help.
  4. Compiler cooperation is crucial — Write barriers, safe points, and root maps reduce GC overhead.
  5. Experimentation is valuable — Small toy benchmarks help identify bottlenecks in allocation and collection.
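In the spirit of point 5, a toy allocation benchmark takes only a few lines. This sketch mimics a "most objects die young" workload by keeping only every 1000th allocation alive (the function name and parameters are illustrative):

```python
import time

def alloc_benchmark(n=200_000):
    """Toy benchmark: allocate n small objects, keeping only every
    1000th alive, mimicking a 'most objects die young' workload."""
    survivors = []
    start = time.perf_counter()
    for i in range(n):
        obj = [i, i + 1]           # small, short-lived allocation
        if i % 1000 == 0:
            survivors.append(obj)  # the few objects that get promoted
    elapsed = time.perf_counter() - start
    return n / elapsed, len(survivors)

rate, live = alloc_benchmark()     # allocations per second, survivor count
```

Even a crude loop like this exposes the quantities the tables above summarize: allocation rate, survival ratio, and (run under different runtimes or GC settings) the throughput impact of the collector.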

# 9.2 Real-world GC Comparisons

| Runtime | Collector Type | Pause Focus | Throughput Focus |
|---|---|---|---|
| JVM HotSpot | G1 / ZGC | Low | Medium |
| Go | Concurrent Mark–Sweep | Very Low | Good |
| .NET CLR | Generational Mark–Compact | Medium | High |

# 10. Conclusion

Garbage collection strategies present a rich landscape of tradeoffs in latency, throughput, memory usage, and compiler/runtime complexity. By understanding algorithmic behavior and practical implications, language designers can make informed decisions aligned with application goals. Independent experiments and benchmarks, even at a toy scale, provide valuable insights into GC behavior and can guide compiler-level optimizations.

