On this page:
2.15.1 Writing your Garbage Collectors
2.15.1.1 Required Functions
init-allocator
gc: alloc-flat
gc: cons
gc: cons?
gc: first
gc: set-first!
gc: rest
gc: set-rest!
gc: flat?
gc: deref
2.15.1.2 Provided Functions
heap-value?
location?
heap-size
heap-set!
heap-ref
root?
get-root-set
read-root
set-root!
procedure-roots
allocator-setup
2.15.2 Testing your Garbage Collectors
2.15.3 Sample Code
2.15.3.1 Trivial Collector
2.15.3.2 Sample Mutator
2.15.4 Notes
2.15.5 Grading

2.15 Garbage Collectors

Complete this assignment with Team Two.

You must submit this in an archive named "gc.zip".

You must submit this in a file named "mark-and-sweep.ss".

You must submit this in a file named "stop-and-copy.ss".

In this assignment, you will implement two garbage collectors: mark & sweep, and stop & copy. As we have seen, garbage collection involves starting from a root set of references, which reside in local variables and stack frames, and searching for reachable values on the heap.

2.15.1 Writing your Garbage Collectors

You must complete this part of the assignment in the GC Collector Scheme language. You can do so by choosing it from the Language|Choose Language... menu in DrScheme or via the language chooser at the bottom-left. You can also choose the Module language and have the following as the first line of your program:
  #lang planet plai/plai/collector

This language defines an interface to a program’s stack and heap that you will use to implement garbage collection.

2.15.1.1 Required Functions

Your garbage collectors must implement the following functions:

The mutator implicitly calls this function after initializing the heap and before calling any allocation routines. It is essentially a callback into the allocator indicating that the heap is ready.

(gc:alloc-flat v)  location?
  v : heap-value?
Allocates space for a flat value and returns the base address of the allocated block.

(gc:cons first-addr rest-addr)  location?
  first-addr : location?
  rest-addr : location?
Returns the address of a new cons cell with the given field addresses; the field addresses are presumed to be already allocated.

(gc:cons? addr)  boolean?
  addr : location?
Returns a boolean that indicates whether the given address refers to a cons cell.

(gc:first cons-addr)  location?
  cons-addr : location?
Returns the address of the first part of a cons.

(gc:set-first! cons-addr first-addr)  void
  cons-addr : location?
  first-addr : location?
Sets the address of the first part of a cons.

(gc:rest cons-addr)  location?
  cons-addr : location?
Returns the address of the rest part of a cons.

(gc:set-rest! cons-addr rest-addr)  void
  cons-addr : location?
  rest-addr : location?
Sets the address of the rest part of a cons.

(gc:flat? addr)  boolean?
  addr : location?
Returns a boolean that indicates whether the given address refers to an atomic value.

(gc:deref addr)  heap-value?
  addr : location?
Returns the atomic value stored at the given address.

2.15.1.2 Provided Functions

To help you write these functions, the GC Collector Scheme language defines an interface for the heap and the roots:

(heap-value? v)  boolean?
  v : any/c
Equivalent to
  (or (boolean? v) (number? v) (procedure? v) (symbol? v) (empty? v))

(location? v)  boolean?
  v : any/c
A location is an integer between 0 and (- (heap-size) 1) inclusive.

Returns the size of the heap.

(heap-set! addr v)  void
  addr : location?
  v : heap-value?
Stores a value at the specified address.

(heap-ref addr)  heap-value?
  addr : location?
Returns the value at the specified address.

(root? v)  boolean?
  v : any/c
Determines if v is a root.

(get-root-set root-identifier ...)
 
  root-identifier : identifier?
Returns the roots of the collection, including local roots created for each root-identifier.

Note: Sometimes you have to fake roots in case a live object is only reachable from arguments passed to a collector function. For example, you should pass the first and rest addresses to get-root-set in your implementation of gc:cons. (This is not the only example!)

(read-root r)  location?
  r : root?
Returns the address the root references.
(set-root! r a)  void
  r : root?
  a : location?
Updates the root to reference the specified address.

(procedure-roots proc)  (listof root?)
  proc : procedure?
Given a closure stored on the heap, returns a list of the roots reachable from the closure’s environment. If proc is not reachable, the empty list is returned.

(allocator-setup collector-file heap-size)
 
  collector-file : string?
  heap-size : number?
Uses collector-file for the collector with a heap initialized to heap-size.

2.15.2 Testing your Garbage Collectors

You must complete this part of the assignment in the GC Mutator Scheme language. You can do so by choosing it from the Language|Choose Language... menu in DrScheme or via the language chooser at the bottom-left. You can also choose the Module language and have the following as the first line of your program:
  #lang planet plai/plai/mutator

This language is a subset of Scheme that uses a garbage collector that you specify. The first line of a test program must be:
  (allocator-setup "collector.ss" heap-size)
where "collector.ss" must be the name of your collector’s file. heap-size is the size of the heap your collector will use.

The remainder of the program is in a subset of Scheme with numbers, symbols, lists, etc. The primitives of the language map directly to the procedures you define in your garbage collector.

2.15.3 Sample Code

To get you started, we’ve provided a sample mutator and a trivial collector that signals an error when the heap fills up. (In fact, our collector does signal an error with the mutator we’ve provided, if you don’t increase the size of the heap.)

2.15.3.1 Trivial Collector

"collector.ss":

  #lang planet plai/plai/collector
  (define heap-ptr 'uninitialized-heap-ptr)
  
  (define (init-allocator)
   (set! heap-ptr 0))
  
  (define (gc:alloc-flat p)
   (begin
     (when (> (+ heap-ptr 2) (heap-size))
       (error 'gc:alloc-flat "out of memory"))
     (heap-set! heap-ptr 'prim)
     (heap-set! (+ 1 heap-ptr) p)
     (set! heap-ptr (+ 2 heap-ptr))
  
     (- heap-ptr 2)))
  
  (define (gc:cons f r)
   (begin
     (when (> (+ heap-ptr 3) (heap-size))
       (error 'gc:cons "out of memory"))
     (heap-set! heap-ptr 'cons)
     (heap-set! (+ 1 heap-ptr) f)
     (heap-set! (+ 2 heap-ptr) r)
     (set! heap-ptr (+ 3 heap-ptr))
     (- heap-ptr 3)))
  
  (define (gc:cons? a)
   (eq? (heap-ref a) 'cons))
  
  (define (gc:first a)
   (heap-ref (+ 1 a)))
  
  (define (gc:rest a)
   (heap-ref (+ 2 a)))
  
  (define (gc:set-first! a f)
   (if (gc:cons? a)
       (heap-set! (+ 1 a) f)
       (error 'set-first! "expects address of cons")))
  
  (define (gc:set-rest! a r)
   (heap-set! (+ 2 a) r))
  
  (define (gc:flat? a)
   (eq? (heap-ref a) 'prim))
  
  (define (gc:deref a)
   (heap-ref (+ 1 a)))

2.15.3.2 Sample Mutator

"mutator.ss":

  #lang planet plai/plai/mutator
  (allocator-setup "collector.ss" 84)
  
  (define (fact x)
   (if (zero? x)
       1
       (* x (fact (sub1 x)))))
  
  (define (fact-help x a)
   (if (zero? x)
       a
       (fact-help (sub1 x) (* x a))))
  
  (define lst (cons 1 (cons 2 (cons 3 empty))))
  
  (define (map-add n lst)
   (map (lambda (x) (+ n x)) lst))
  
  (define (map f lst)
   (if (cons? lst)
       (cons (f (first lst)) (map f (rest lst)))
       empty))
  
  (define (filter p lst)
   (if (cons? lst)
       (if (p (first lst))
           (cons (first lst) (filter p (rest lst)))
           (filter p (rest lst)))
       lst))
  
  (define (append l1 l2)
   (if (cons? l1)
       (cons (first l1) (append (rest l1) l2))
       l2))
  
  (define (length lst)
   (if (empty? lst)
       0
       (add1 (length (rest lst)))))
  
  (define tail (cons 1 empty))
  (define head (cons 4 (cons 3 (cons 2 tail))))
  (set-rest! tail head)
  
  (printf "res ~a~n" head)
  (set! head empty)
  (set! tail head)
  (printf "res ~a~n" lst)
  (printf "res ~a~n" (length '(hello goodbye)))
  (printf "res ~a~n" (map sub1 lst))
  
  (printf "(fact-help 15 1): ~a~n" (fact-help 15 1))
  (printf "(fact 9): ~a~n" (fact 9))
  
  (printf "(append lst lst): ~a~n" (append lst lst))
  
  (printf "(map-add 5 lst): ~a~n" (map-add 5 lst))
  (printf "(filter even? (map sub1 lst)): ~a~n"
          (filter even? (map sub1 lst)))
  (printf "(length lst): ~a~n" (length lst))

If your collector works correctly on the heap size you’ve specified, this should result in:

  > (printf "res ~a~n" head)

  res #0={4 3 2 1 . #0#}

  > (set! head empty)
  > (set! tail head)
  > (printf "res ~a~n" lst)

  res (1 2 3)

  > (printf "res ~a~n" (length '(hello goodbye)))

  res 2

  > (printf "res ~a~n" (map sub1 lst))

  res (0 1 2)

  > (printf "(fact-help 15 1): ~a~n" (fact-help 15 1))

  (fact-help 15 1): 1307674368000

  > (printf "(fact 9): ~a~n" (fact 9))

  (fact 9): 362880

  > (printf "(append lst lst): ~a~n" (append lst lst))

  (append lst lst): (1 2 3 1 2 3)

  > (printf "(map-add 5 lst): ~a~n" (map-add 5 lst))

  (map-add 5 lst): (6 7 8)

  > (printf "(filter even? (map sub1 lst)): ~a~n"
            (filter even? (map sub1 lst)))

  (filter even? (map sub1 lst)): (0 2)

  > (printf "(length lst): ~a~n" (length lst))

  (length lst): 3

2.15.4 Notes

You must store bookkeeping data on the heap provided by GC Collector Scheme. You may store 2-3 atomic values, such as addresses into the heap as variables in your garbage collector. We will assume they represent machine registers. However, all compound data structures must be on the heap.

In particular, in the mark & sweep collector, you should maintain a free list in the heap. That is, you should not use any auxiliary data structure; instead, use the available space in the heap to keep track of the free list. You may use one extra box (“register”) to point to the start of the free list. You may need to adjust your allocation accordingly to have enough space to maintain free list pointers!

Some final words of advice:
  • You will want to test your program with small heap sizes, so that the garbage collector runs frequently.

  • You do not need to compact memory in mark-and-sweep.

  • You may find it convenient to use the Scheme construct set!, which allows you to mutate a variable without using boxes. We recommend you use this feature only in one instance: when swapping the semispaces in the stop & copy collector. This will save you the trouble of unboxing the heap every time you use it.

2.15.5 Grading

This assignment is graded as a written assignment. The coding style guidelines count as a single question.