On this page:
3.9.1 Writing your Garbage Collectors
3.9.1.1 Required Functions
init-allocator
gc: alloc-flat
gc: cons
gc: cons?
gc: first
gc: set-first!
gc: rest
gc: set-rest!
gc: flat?
gc: deref
3.9.1.2 Provided Functions
heap-value?
location?
heap-size
heap-set!
heap-ref
root?
get-root-set
read-root
set-root!
procedure-roots
allocator-setup
3.9.2 Testing your Garbage Collectors
3.9.3 Sample Code
3.9.3.1 Trivial Collector
3.9.4 Notes

3.9 Garbage Collectors

In this assignment, you will implement two garbage collectors: mark & sweep, and stop & copy. As we have seen, garbage collection involves starting from a root set of references, which reside in local variables and stack frames, and searching for reachable values on the heap.

3.9.1 Writing your Garbage Collectors

You must complete this assignment using a specific language. Choose Determine language from source in the DrRacket menu, then write
as the first line of your program, overwritting the default #lang at the top of the file.

This language defines an interface to a program’s stack and heap that you will use to implement garbage collection.

3.9.1.1 Required Functions

Your garbage collectors must implement the following functions:

procedure

(init-allocator)  void

The mutator implicitly calls this function after initializing the heap and before calling any allocation routines. It is essentially a callback into the allocator indicating that the heap is ready.

procedure

(gc:alloc-flat v)  location?

  v : heap-value?
Allocates space for a flat value and returns the base address of the allocated block.

procedure

(gc:cons first-addr rest-addr)  location?

  first-addr : location?
  rest-addr : location?
Returns the address of a new cons cell with the given field addresses; the field addresses are presumed to be already allocated.

procedure

(gc:cons? addr)  boolean?

  addr : location?
Returns a boolean that indicates whether the given address refers to a cons cell.

procedure

(gc:first cons-addr)  location?

  cons-addr : location?
Returns the address of the first part of a cons.

procedure

(gc:set-first! cons-addr first-addr)  void

  cons-addr : location?
  first-addr : location?
Sets the address of the first part of a cons.

procedure

(gc:rest cons-addr)  location?

  cons-addr : location?
Returns the address of the rest part of a cons.

procedure

(gc:set-rest! cons-addr rest-addr)  void

  cons-addr : location?
  rest-addr : location?
Sets the address of the rest part of a cons.

procedure

(gc:flat? addr)  boolean?

  addr : location?
Returns a boolean that indicates whether the given address refers to an atomic value.

procedure

(gc:deref addr)  heap-value?

  addr : location?
Returns the atomic value stored at the given address.

3.9.1.2 Provided Functions

To help you write these functions, the GC Collector Racket language defines an interface for the heap and the roots:

procedure

(heap-value? v)  boolean?

  v : any/c
Equivalent to

(or (boolean? v) (number? v) (procedure? v) (symbol? v) (empty? v))

procedure

(location? v)  boolean?

  v : any/c
A location is an integer between 0 and (- (heap-size) 1) inclusive.

procedure

(heap-size)  number?

Returns the size of the heap.

procedure

(heap-set! addr v)  void

  addr : location?
  v : heap-value?
Stores a value at the specified address.

procedure

(heap-ref addr)  heap-value?

  addr : location?
Returns the value at the specified address.

procedure

(root? v)  boolean?

  v : any/c
Determines if v is a root.

syntax

(get-root-set root-identifier ...)

 
  root-identifier : identifier?
Returns the roots of the collection, including local roots created for each root-identifier.

Note: Sometimes you have to fake roots in case a live object is only reachable from arguments passed to a collector function. For example, you should pass the first and rest addresses to get-root-set in your implementation of gc:cons. (This is not the only example!)

procedure

(read-root r)  location?

  r : root?
Returns the address the root references.

procedure

(set-root! r a)  void

  r : root?
  a : location?
Updates the root to reference the specified address.

procedure

(procedure-roots proc)  (listof root?)

  proc : procedure?
Given a closure stored on the heap, returns a list of the roots reachable from the closure’s environment. If proc is not reachable, the empty list is returned.

syntax

(allocator-setup collector-file heap-size)

 
  collector-file : string?
  heap-size : number?
Uses collector-file for the collector with a heap initialized to heap-size.

3.9.2 Testing your Garbage Collectors

You must complete this assignment using a specific language. Choose Determine language from source in the DrRacket menu, then write
as the first line of your program, overwritting the default #lang at the top of the file.

This language is a subset of Racket that uses a garbage collector that you specify. The first line of a test program must be:

(allocator-setup "collector.rkt" heap-size)

where "collector.rkt" must be the name of your collector’s file. heap-size is the size of the heap your collector will use.

The remainder of the program is in a subset of Racket with numbers, symbols, lists, etc. The primitives of the language map directly to the procedures you define in your garbage collector.

3.9.3 Sample Code

To get you started, we’ve provided a trivial collector that signals an error when the heap fills up.

3.9.3.1 Trivial Collector

"collector.rkt":

#lang plai/collector
(define heap-ptr 'uninitialized-heap-ptr)
 
(define (init-allocator)
 (set! heap-ptr 0))
 
(define (gc:alloc-flat p)
 (begin
   (when (> (+ heap-ptr 2) (heap-size))
     (error 'gc:alloc-flat "out of memory"))
   (heap-set! heap-ptr 'prim)
   (heap-set! (+ 1 heap-ptr) p)
   (set! heap-ptr (+ 2 heap-ptr))
 
   (- heap-ptr 2)))
 
(define (gc:cons f r)
 (begin
   (when (> (+ heap-ptr 3) (heap-size))
     (error 'gc:cons "out of memory"))
   (heap-set! heap-ptr 'cons)
   (heap-set! (+ 1 heap-ptr) f)
   (heap-set! (+ 2 heap-ptr) r)
   (set! heap-ptr (+ 3 heap-ptr))
   (- heap-ptr 3)))
 
(define (gc:cons? a)
 (eq? (heap-ref a) 'cons))
 
(define (gc:first a)
 (heap-ref (+ 1 a)))
 
(define (gc:rest a)
 (heap-ref (+ 2 a)))
 
(define (gc:set-first! a f)
 (if (gc:cons? a)
     (heap-set! (+ 1 a) f)
     (error 'set-first! "expects address of cons")))
 
(define (gc:set-rest! a r)
 (heap-set! (+ 2 a) r))
 
(define (gc:flat? a)
 (eq? (heap-ref a) 'prim))
 
(define (gc:deref a)
 (heap-ref (+ 1 a)))
3.9.4 Notes

You must store bookkeeping data on the heap provided by GC Collector Racket. You may store 2-3 atomic values, such as addresses into the heap as variables in your garbage collector. We will assume they represent machine registers. However, all compound data structures must be on the heap.

In particular, in the mark & sweep collector, you should maintain a free list in the heap. That is, you should not use any auxiliary data structure; instead, use the available space in the heap to keep track of the free list. You may use one extra box (“register”) to point to the start of the free list. You may need to adjust your allocation accordingly to have enough space to maintain free list pointers!

Some final words of advice:
  • You will want to test your program with small heap sizes, so that the garbage collector runs frequently.

  • You do not need to compact memory in mark-and-sweep.

  • You may find it convenient to use the Racket construct set!, which allows you to mutate a variable without using boxes. We recommend you use this feature only in one instance: when swapping the semispaces in the stop & copy collector. This will save you the trouble of unboxing the heap every time you use it.