* README.md: Replace overview.html with overview.md.
* doc/doc.am (dist_doc_DATA): Likewise.
* doc/doc.am (dist_doc_DATA): Replace tree.html with tree.md.
* doc/gcdescr.md (Mark phase): Likewise.
* doc/overview.html: Change file suffix to .md; convert text format
from HTML to Markdown.
* doc/tree.html: Likewise.
(December 1995), pp. 1315-1330. This is very similar to the "rope" package
in Xerox Cedar, or the "rope" package in the SGI STL or the g++ distribution.)
-Further collector documentation can be found
-in [overview.html](doc/overview.html).
+Further collector documentation can be found in the
+[overview](doc/overview.md).
## General Description
doc/gcdescr.md \
doc/gcinterface.md \
doc/leak.md \
- doc/overview.html \
+ doc/overview.md \
doc/porting.md \
doc/scale.md \
doc/simple_example.md \
- doc/tree.html
+ doc/tree.md
actually consist of multiple such pages. HBLKSIZE is usually the page size
divided by a small power of two.)
* The page address part of the candidate pointer is looked up in
- a [table](tree.html). Each table entry contains either 0, indicating that
+ a [table](tree.md). Each table entry contains either 0, indicating that
the page is not part of the garbage collected heap, a small integer _n_,
indicating that the page is part of large object, starting at least _n_
pages back, or a pointer to a descriptor for the page. In the first case,
+++ /dev/null
-<!DOCTYPE HTML>
-<html><head><title>A garbage collector for C and C++</title></head>
-<body>
-<table bgcolor="#f0f0ff" cellpadding="10%">
- <tbody><tr>
- <td><a href="gcinterface.md">Interface Overview</a></td>
- <td><a href="http://www.hboehm.info/gc/04tutorial.pdf">Tutorial Slides</a></td>
- <td><a href="http://www.hboehm.info/gc/faq.html">FAQ</a></td>
- <td><a href="simple_example.md">Example</a></td>
- <td><a href="https://github.com/ivmai/bdwgc/wiki/Download">Download</a></td>
- <td><a href="http://www.hboehm.info/gc/license.txt">License</a></td>
- </tr>
-</tbody></table>
-<h1>A garbage collector for C and C++</h1>
-<ul>
-<li><a href="#platforms">Platforms</a>
-</li><li><a href="#multiprocessors">Scalable multiprocessor versions</a>
-</li><li><a href="#details">Some collector details</a>
-</li><li><a href="#further">Further reading</a>
-</li><li><a href="#users">Current users</a>
-</li><li><a href="#collector">Local Links for this collector</a>
-</li><li><a href="#background">Local Background Links</a>
-</li><li><a href="#contacts">Contacts and Mailing List</a>
-</li></ul>
-[ This is an updated version of the page formerly at
-<tt>www.hpl.hp.com/personal/Hans_Boehm/gc/</tt>,
-before that at
-<tt>http://reality.sgi.com/boehm/gc.html</tt>
-and before that at
-<tt>ftp://ftp.parc.xerox.com/pub/gc/gc.html</tt>. ]
-<p>
-The <a href="http://www.hboehm.info">Boehm</a>-<a href="http://www.cs.cornell.edu/annual_report/00-01/bios.htm#demers">Demers</a>-<a href="http://www.ubiq.com/hypertext/weiser/weiser.html">Weiser</a>
-conservative Garbage Collector (<b>BDWGC</b>) can
-be used as a garbage collecting
-replacement for C <tt>malloc</tt> or C++ <tt>new</tt>.
-It allows you to allocate memory basically as you normally would,
-without explicitly deallocating memory that is no longer useful.
-The collector automatically recycles memory when it determines
-that it can no longer be otherwise accessed.
-A simple example of such a use is given
-<a href="simple_example.md">here</a>.
-</p><p>
-The collector is also used by a number of programming language
-implementations that either use C as intermediate code, want
-to facilitate easier interoperation with C libraries, or
-just prefer the simple collector interface.
-For a more detailed description of the interface, see
-<a href="gcinterface.md">here</a>.
-</p><p>
-Alternatively, the garbage collector may be used as
-a <a href="leak.md">leak detector</a>
-for C or C++ programs, though that is not its primary goal.
-</p><p>
-Typically several versions are offered for
-<a href="https://github.com/ivmai/bdwgc/wiki/Download">downloading</a>:
-preview, stable, legacy.
-Usually you should use the one marked as the <i>latest stable</i> release.
-Preview versions may contain additional features, platform support,
-but are likely to be less well tested.
-The list of changes for each version is specified on the
-<a href="https://github.com/ivmai/bdwgc/releases">releases</a> page.
-</p><p>
-The arguments for and against conservative garbage collection
-in C and C++ are briefly
-discussed in
-<a href="http://www.hboehm.info/gc/issues.html">issues.html</a>.
-The beginnings of a frequently-asked-questions list are
-<a href="http://www.hboehm.info/gc/faq.html">here</a>.
-</p><p>
-The garbage collector code is copyrighted by
-<a href="http://www.hboehm.info">Hans-J. Boehm</a>,
-Alan J. Demers,
-<a href="http://www.xerox.com/">Xerox Corporation</a>,
-<a href="http://www.sgi.com/">Silicon Graphics</a>,
-and
-<a href="http://www.hp.com/">Hewlett-Packard Company</a>.
-It may be used and copied without payment of a fee under minimal restrictions.
-See the README file in the distribution or the
-<a href="http://www.hboehm.info/gc/license.txt">license</a> for more details.
-<b>IT IS PROVIDED AS IS,
-WITH ABSOLUTELY NO WARRANTY EXPRESSED OR IMPLIED. ANY USE IS AT YOUR OWN RISK</b>.
-</p><p>
-Empirically, this collector works with most unmodified C programs,
-simply by replacing
-<tt>malloc</tt> with <tt>GC_malloc</tt> calls,
-replacing <tt>realloc</tt> with <tt>GC_realloc</tt> calls, and removing
-free calls. Exceptions are discussed
-in <a href="http://www.hboehm.info/gc/issues.html">issues.html</a>.
-</p><h2><a name="platforms">Platforms</a></h2>
-The collector is not completely portable, but the distribution
-includes ports to most standard PC and UNIX/Linux platforms.
-The collector should work on Linux, *BSD, recent Windows versions,
-MacOS X, HP/UX, Solaris,
-Tru64, Irix and a few other operating systems.
-Some ports are more polished than others.
-<p>
-Irix pthreads, Linux threads, Win32 threads, Solaris threads
-(pthreads only),
-HP/UX 11 pthreads, Tru64 pthreads, and MacOS X threads are supported
-in recent versions.
-</p><h3>Separately distributed ports</h3>
-For MacOS 9/Classic use, Patrick Beard's latest port is available from
-<tt>http://homepage.mac.com/pcbeard/gc/</tt>.
-(Unfortunately, that's now quite dated.
-I'm not in a position to test under MacOS. Although I try to
-incorporate changes, it is impossible for
-me to update the project file.)
-<p>
-Precompiled versions of the collector for NetBSD are available
-<a href="ftp://ftp.netbsd.org/pub/pkgsrc/current/pkgsrc/devel/boehm-gc/README.html">here</a>.
-</p><p>
-<a href="http://www.debian.org/">Debian Linux</a> includes prepackaged
-versions of the collector.
-</p><h2><a name="multiprocessors">Scalable multiprocessor versions</a></h2>
-Kenjiro Taura, Toshio Endo, and Akinori Yonezawa have made available
-a <a href="http://ieeexplore.ieee.org/abstract/document/1592629/">parallel collector</a>
-based on this one. Their collector takes advantage of multiple processors
-during a collection. Starting with collector version 6.0alpha1
-we also do this, though with more modest processor scalability goals.
-Our approach is discussed briefly in
-<a href="scale.md">this document</a>.
-<h2><a name="details">Some Collector Details</a></h2>
-The collector uses a <a href="http://www.hboehm.info/gc/complexity.html">mark-sweep</a> algorithm.
-It provides incremental and generational
-collection under operating systems which provide the right kind of
-virtual memory support. (Currently this includes SunOS[45], IRIX,
-OSF/1, Linux, and Windows, with varying restrictions.)
-It allows <a href="finalization.md"><i>finalization</i></a> code
-to be invoked when an object is collected.
-It can take advantage of type information to locate pointers if such
-information is provided, but it is usually used without such information.
-See the README and
-<tt>gc.h</tt> files in the distribution for more details.
-<p>
-For an overview of the implementation, see <a href="gcdescr.md">here</a>.
-</p><p>
-The garbage collector distribution includes a C string
-(<a type="text/plain" href="../include/cord.h"><i>cord</i></a>) package that provides
-for fast concatenation and substring operations on long strings.
-A simple curses- and win32-based editor that represents the entire file
-as a cord is included as a
-sample application.
-</p><p>
-Performance of the nonincremental collector is typically competitive
-with malloc/free implementations. Both space and time overhead are
-likely to be only slightly higher
-for programs written for malloc/free
-(see Detlefs, Dosser and Zorn's
-<a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.30.3073&rep=rep1&type=ps">Memory Allocation Costs in Large C and C++ Programs</a>.)
-For programs allocating primarily very small objects, the collector
-may be faster; for programs allocating primarily large objects it will
-be slower. If the collector is used in a multi-threaded environment
-and configured for thread-local allocation, it may in some cases
-significantly outperform malloc/free allocation in time.
-</p><p>
-We also expect that in many cases any additional overhead
-will be more than compensated for by decreased copying etc.
-if programs are written
-and tuned for garbage collection.
-</p><h1><a name="further">Further Reading:</a></h1>
-<b>The beginnings of a frequently asked questions list for this
-collector are <a href="http://www.hboehm.info/gc/faq.html">here</a></b>.
-<p>
-<b>The following provide information on garbage collection in general</b>:
-</p><p>
-Paul Wilson's <a href="ftp://ftp.cs.utexas.edu/pub/garbage">garbage collection ftp archive</a> and <a href="ftp://ftp.cs.utexas.edu/pub/garbage/gcsurvey.ps">GC survey</a>.
-</p><p>
-The Ravenbrook <a href="http://www.memorymanagement.org/">
-Memory Management Reference</a>.
-</p><p>
-David Chase's
-<a href="http://www.iecc.com/gclist/GC-faq.html">GC FAQ</a>.
-</p><p>
-Richard Jones'
-<a href="https://www.cs.kent.ac.uk/people/staff/rej/gc.html">
-Garbage Collection Page</a> and
-<a href="http://www.cs.kent.ac.uk/people/staff/rej/gcbook/gcbook.html">
-his book</a>.
-</p><p>
-<b>The following papers describe the collector algorithms we use
-and the underlying design decisions at
-a higher level.</b>
-</p><p>
-(Some of the lower level details can be found
-<a href="gcdescr.md">here</a>.)
-</p><p>
-The first one is not available
-electronically due to copyright considerations. Most of the others are
-subject to ACM copyright.
-</p><p>
-Boehm, H., "Dynamic Memory Allocation and Garbage Collection", <i>Computers in Physics
-9</i>, 3, May/June 1995, pp. 297-303. This is directed at an otherwise sophisticated
-audience unfamiliar with memory allocation issues. The algorithmic details differ
-from those in the implementation. There is a related letter to the editor and a minor
-correction in the next issue.
-</p><p>
-Boehm, H., and <a href="http://www.ubiq.com/hypertext/weiser/weiser.html">M. Weiser</a>,
-<a href="http://www.hboehm.info/spe_gc_paper/">"Garbage Collection in an Uncooperative Environment"</a>,
-<i>Software Practice & Experience</i>, September 1988, pp. 807-820.
-</p><p>
-Boehm, H., A. Demers, and S. Shenker, <a href="http://www.hboehm.info/gc/papers/pldi91.ps.Z">"Mostly Parallel Garbage Collection"</a>,
-Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design and Implementation,
-<i>SIGPLAN Notices 26</i>, 6 (June 1991), pp. 157-164.
-</p><p>
-Boehm, H., <a href="http://www.hboehm.info/gc/papers/pldi93.ps.Z">"Space Efficient Conservative Garbage Collection"</a>,
-Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design
-and Implementation, <i>SIGPLAN Notices 28</i>, 6 (June 1993), pp. 197-206.
-</p><p>
-Boehm, H., "Reducing Garbage Collector Cache Misses",
-<i> Proceedings of the 2000 International Symposium on Memory Management </i>.
-<a href="http://portal.acm.org/citation.cfm?doid=362422.362438">
-Official version.</a>
-<a href="http://www.hpl.hp.com/techreports/2000/HPL-2000-99.html">
-Technical report version.</a> Describes the prefetch strategy
-incorporated into the collector for some platforms. Explains why
-the sweep phase of a "mark-sweep" collector should not really be
-a distinct phase.
-</p><p>
-M. Serrano, H. Boehm,
-"Understanding Memory Allocation of Scheme Programs",
-<i>Proceedings of the Fifth ACM SIGPLAN International Conference on
-Functional Programming</i>, 2000, Montreal, Canada, pp. 245-256.
-<a href="http://dl.acm.org/citation.cfm?id=351264">
-Official version.</a>
-<a href="http://www.hpl.hp.com/techreports/2000/HPL-2000-62.html">
-Earlier Technical Report version.</a> Includes some discussion of the
-collector debugging facilities for identifying causes of memory retention.
-</p><p>
-Boehm, H.,
-"Fast Multiprocessor Memory Allocation and Garbage Collection",
-<a href="http://www.hpl.hp.com/techreports/2000/HPL-2000-165.html">
-HP Labs Technical Report HPL 2000-165</a>. Discusses the parallel
-collection algorithms, and presents some performance results.
-</p><p>
-Boehm, H., "Bounding Space Usage of Conservative Garbage Collectors",
-<i>Proceedings of the 2002 ACM SIGPLAN-SIGACT Symposium on Principles of
-Programming Languages</i>, Jan. 2002, pp. 93-100.
-<a href="http://portal.acm.org/citation.cfm?doid=503272.503282">
-Official version.</a>
-<a href="http://www.hpl.hp.com/techreports/2001/HPL-2001-251.html">
-Technical report version.</a>
-Includes a discussion of a collector facility to much more reliably test for
-the potential of unbounded heap growth.
-</p><p>
-<b>The following papers discuss language and compiler restrictions necessary to guaranteed
-safety of conservative garbage collection.</b>
-</p><p>
-We thank John Levine and JCLT for allowing
-us to make the second paper available electronically, and providing PostScript for the final
-version.
-</p><p>
-Boehm, H., <a href="http://www.hboehm.info/gc/papers/pldi96.ps.gz">"Simple Garbage-Collector-Safety"</a>,
-Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design
-and Implementation.
-</p><p>
-Boehm, H., and D. Chase, <a href="http://www.hboehm.info/gc/papers/boecha.ps.gz">"A Proposal for Garbage-Collector-Safe C Compilation"</a>,
-<i>Journal of C Language Translation 4</i>, 2 (December 1992), pp. 126-141.
-</p><p>
-<b>Other related information: </b>
-</p><p>
-The Detlefs, Dosser and Zorn's <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.30.3073&rep=rep1&type=ps">Memory Allocation Costs in Large C and C++ Programs</a>.
- This is a performance comparison of the Boehm-Demers-Weiser collector to malloc/free,
-using programs written for malloc/free.
-</p><p>
-Joel Bartlett's <a href="ftp://gatekeeper.dec.com/pub/compaq/WRL/research-reports/WRL-TN-12.ps">mostly copying conservative garbage collector for C++</a>.
-</p><p>
-John Ellis and David Detlef's
-<a href="http://dl.acm.org/citation.cfm?id=1267983">Safe Efficient Garbage Collection for C++</a>
-proposal.
-</p><p>
-Henry Baker's <a href="http://home.pipeline.com/%7Ehbaker1/">paper collection</a>.
-</p><p>
-Slides for Hans Boehm's <a href="http://www.hboehm.info/gc/myths.ps">Allocation and GC Myths</a> talk.
-</p><h1><a name="users">Current users:</a></h1>
-Known current users of some variant of this collector include:
-<p>
-The runtime system for
-<a href="https://gcc.gnu.org/onlinedocs/gcc-4.8.5/gcj/">GCJ</a>,
-the static GNU java compiler.
-</p><p>
-<a href="http://w3m.sourceforge.net/">W3m</a>, a text-based web browser.
-</p><p>
-Some versions of the Xerox DocuPrint printer software.
-</p><p>
-The <a href="http://www.mozilla.org/">Mozilla</a> project, as leak
-detector.
-</p><p>
-The <a href="http://www.mono-project.com">Mono</a> project,
-an open source implementation of the .NET development framework.
-</p><p>
-The <a href="http://www.gnu.org/projects/dotgnu/">DotGNU Portable.NET
-project</a>, another open source .NET implementation.
-</p><p>
-The <a href="http://irssi.org/">Irssi IRC client</a>.
-</p><p>
-<a href="http://titanium.cs.berkeley.edu/">The Berkeley Titanium project</a>.
-</p><p>
-<a href="http://www.nag.co.uk/nagware/NP/NP_desc.asp">The NAGWare f90 Fortran 90 compiler</a>.
-</p><p>
-Elwood Corporation's Eclipse Common Lisp system, C library, and translator.
-</p><p>
-The <a href="http://www-sop.inria.fr/mimosa/fp/Bigloo/">Bigloo Scheme</a>
-and <a href="https://github.com/samoht/camloo">Camloo ML compilers</a>
-written by Manuel Serrano and others.
-</p><p>
-Brent Benson's <a href="http://www.cs.indiana.edu/scheme-repository/libscheme-vhll/final.html">libscheme</a>.
-</p><p>
-The MzScheme scheme implementation.
-</p><p>
-The <a href="http://www.cs.washington.edu/research/projects/cecil/www/cecil-home.html">University of Washington Cecil Implementation</a>.
-</p><p>
-<a href="http://www1.icsi.berkeley.edu/~sather/">The Berkeley Sather implementation</a>.
-</p><p>
-<a href="http://www.cs.berkeley.edu/~harmonia/harmonia/index.html">The Berkeley Harmonia Project</a>.
-</p><p>
-The <a href="http://www.cs.arizona.edu/projects/sumatra/toba/">Toba</a> Java Virtual
-Machine to C translator.
-</p><p>
-The <a href="http://www.gwydiondylan.org/">Gwydion Dylan compiler</a>.
-</p><p>
-The <a href="http://gcc.gnu.org/onlinedocs/gcc/Objective-C.html">
-GNU Objective C runtime</a>.
-</p><p>
-<a href="http://www.math.uiuc.edu/Macaulay2">Macaulay 2</a>, a system to support
-research in algebraic geometry and commutative algebra.
-</p><p>
-The <a href="http://www.vestasys.org/">Vesta</a> configuration management
-system.
-</p><p>
-<a href="http://www.visual-prolog.com/">Visual Prolog 6</a>.
-</p><p>
-<a href="http://asymptote.sf.net/">Asymptote LaTeX-compatible
-vector graphics language.</a>
-</p><h1><a name="collector">More information on the BDWGC primary site</a></h1>
-<a href="simple_example.md">A simple illustration of how to build and
-use the collector</a>.
-<p>
-<a href="gcinterface.md">Description of alternate interfaces to the
-garbage collector.</a>
-</p><p>
-<a href="http://www.hboehm.info/gc/04tutorial.pdf">Slides from an ISMM 2004 tutorial about the GC</a>.
-</p><p>
-<a href="http://www.hboehm.info/gc/faq.html">A FAQ (frequently asked questions) list</a>.
-</p><p>
-<a href="leak.md">How to use the garbage collector as a leak detector.</a>
-</p><p>
-<a href="debugging.md">Some hints on debugging garbage collected
-applications.</a>
-</p><p>
-<a href="gcdescr.md">An overview of the implementation of the
-garbage collector.</a>
-</p><p>
-<a href="tree.html">The data structure used for fast pointer lookups.</a>
-</p><p>
-<a href="scale.md">Scalability of the collector to multiprocessors.</a>
-</p><p>
-<a href="http://www.hboehm.info/gc/gc_source/">Directory</a> containing
-the distribution files of all garbage collector releases.
-It duplicates
-<a href="https://github.com/ivmai/bdwgc/wiki/Download">Download</a> page on
-GitHub.
-</p><h1><a name="background">More background information</a></h1>
-<a href="http://www.hboehm.info/gc/bounds.html">An attempt to establish a bound on space usage of
-conservative garbage collectors</a>.
-<p>
-<a href="http://www.hboehm.info/gc/complexity.html">Mark-sweep versus copying garbage collectors
-and their complexity</a>.
-</p><p>
-<a href="http://www.hboehm.info/gc/conservative.html">Pros and cons of conservative garbage collectors,
-in comparison to other collectors</a>.
-</p><p>
-<a href="http://www.hboehm.info/gc/issues.html">Issues related to garbage collection vs.
-manual memory management in C/C++</a>.
-</p><p>
-<a href="http://www.hboehm.info/gc/example.html">An example of a case in which garbage collection
-results in a much faster implementation as a result of reduced synchronization</a>.
-</p><p>
-<a href="http://www.hboehm.info/gc/nonmoving/">Slide set discussing performance of nonmoving
-garbage collectors</a>.
-</p><p>
-<a href="http://www.hboehm.info/popl03/web/">
-Slide set discussing <i>Destructors, Finalizers, and Synchronization</i>
-(POPL 2003)</a>.
-</p><p>
-<a href="http://portal.acm.org/citation.cfm?doid=604131.604153">
-Paper corresponding to above slide set</a>
-(<a href="http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html">
-Technical Report version</a>).
-</p><p>
-<a href="http://www.hboehm.info/gc/gc_bench/">A Java/Scheme/C/C++ garbage collection benchmark</a>.
-</p><p>
-<a href="http://www.hboehm.info/gc/myths.ps">Slides for talk on memory allocation myths</a>.
-</p><p>
-<a href="http://www.hboehm.info/gc/gctalk.ps">Slides for OOPSLA 98 garbage collection talk</a>.
-</p><p>
-<a href="http://www.hboehm.info/gc/papers/">Related papers</a>.
-</p><h1><a name="contacts">Contacts and new release announcements</a></h1>
-GitHub and Stack Overflow are the major two places for communication.
-<p>
-Technical questions (how to, how does it work, etc.) should be posted to
-<a href="https://stackoverflow.com/questions/tagged/boehm-gc">Stack Overflow</a>
-with "boehm-gc" tag.
-</p><p>
-To contribute, please rebase your code to the latest
-<a href="https://github.com/ivmai/bdwgc/tree/master/">master</a> and submit
-a <a href="https://github.com/ivmai/bdwgc/pulls">pull request</a> to GitHub.
-</p><p>
-To report a bug, or propose (request) a new feature, create
-a <a href="https://github.com/ivmai/bdwgc/issues">GitHub issue</a>.
-Please make sure it has not been reported yet by someone else.
-</p><p>
-To receive notifications on every release, please subscribe to
-<a href="https://github.com/ivmai/bdwgc/releases.atom">Releases RSS feed</a>.
-Notifications on all issues and pull requests are available by
-<a href="https://github.com/ivmai/bdwgc/watchers">watching</a> the project.
-</p><p>
-Mailing lists (bdwgc-announce@lists.opendylan.org, bdwgc@lists.opendylan.org,
-and the former gc-announce@linux.hpl.hp.com and gc@linux.hpl.hp.com) are not
-used at this moment. Their content is available in
-<a href="https://github.com/ivmai/bdwgc/files/1037650/bdwgc-announce-mailing-list-archive-2014_02.tar.gz">bdwgc-announce</a>
-and
-<a href="https://github.com/ivmai/bdwgc/files/1038163/bdwgc-mailing-list-archive-2017_04.tar.gz">bdwgc</a>
-archive files, respectively.
-The gc list archive may also be read at
-<a href="http://bdwgc.opendylan.narkive.com">Narkive</a>.
-</p><p>
-Some prior discussion of the collector has taken place on the gcc
-java mailing list, whose archives appear
-<a href="http://gcc.gnu.org/ml/java/">here</a>, and also on
-<a href="http://lists.tunes.org/mailman/listinfo/gclist">gclist@iecc.com</a>.
-</p></body></html>
--- /dev/null
+[Interface Overview](gcinterface.md) | [Tutorial Slides](http://www.hboehm.info/gc/04tutorial.pdf) | [FAQ](http://www.hboehm.info/gc/faq.html) | [Example](simple_example.md) | [Download](https://github.com/ivmai/bdwgc/wiki/Download) | [License](http://www.hboehm.info/gc/license.txt)
+---|---|---|---|---|---
+
+# A garbage collector for C and C++
+
+ * Platforms
+ * Scalable multiprocessor versions
+ * Some collector details
+ * Further reading
+ * Current users
+ * Local Links for this collector
+ * Local Background Links
+ * Contacts and Mailing List
+
+[ This is an updated version of the page formerly at
+`www.hpl.hp.com/personal/Hans_Boehm/gc/`, before that at
+`http://reality.sgi.com/boehm/gc.html` and before that at
+`ftp://ftp.parc.xerox.com/pub/gc/gc.html`. ]
+
+The
+[Boehm](http://www.hboehm.info)-[Demers](http://www.cs.cornell.edu/annual_report/00-01/bios.htm#demers)-[Weiser](http://www.ubiq.com/hypertext/weiser/weiser.html)
+conservative Garbage Collector (**BDWGC**) can be used as a garbage collecting
+replacement for C `malloc` or C++ `new`. It allows you to allocate memory
+basically as you normally would, without explicitly deallocating memory that
+is no longer useful. The collector automatically recycles memory when
+it determines that it can no longer be otherwise accessed. A simple example
+of such a use is given [here](simple_example.md).
+
+The collector is also used by a number of programming language implementations
+that either use C as intermediate code, want to facilitate easier
+interoperation with C libraries, or just prefer the simple collector
+interface. For a more detailed description of the interface, see
+[here](gcinterface.md).
+
+Alternatively, the garbage collector may be used as a [leak detector](leak.md)
+for C or C++ programs, though that is not its primary goal.
+
+Typically several versions are offered for
+[downloading](https://github.com/ivmai/bdwgc/wiki/Download): preview, stable,
+legacy. Usually you should use the one marked as the _latest stable_ release.
+Preview versions may contain additional features, platform support, but are
+likely to be less well tested. The list of changes for each version
+is specified on the [releases](https://github.com/ivmai/bdwgc/releases) page.
+
+The arguments for and against conservative garbage collection in C and C++ are
+briefly discussed [here](http://www.hboehm.info/gc/issues.html). The
+beginnings of a frequently-asked-questions list are
+[here](http://www.hboehm.info/gc/faq.html).
+
+The garbage collector code is copyrighted by
+[Hans-J. Boehm](http://www.hboehm.info), Alan J. Demers,
+[Xerox Corporation](http://www.xerox.com/),
+[Silicon Graphics](http://www.sgi.com/), and
+[Hewlett-Packard Company](http://www.hp.com/). It may be used and copied
+without payment of a fee under minimal restrictions. See the README.md file
+in the distribution or the [license](http://www.hboehm.info/gc/license.txt)
+for more details. **IT IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY
+EXPRESSED OR IMPLIED. ANY USE IS AT YOUR OWN RISK**.
+
+Empirically, this collector works with most unmodified C programs, simply
+by replacing `malloc` with `GC_malloc` calls, replacing `realloc` with
+`GC_realloc` calls, and removing free calls. Exceptions are discussed
+[here](http://www.hboehm.info/gc/issues.html).
+
+## Platforms
+
+The collector is not completely portable, but the distribution includes ports
+to most standard PC and UNIX/Linux platforms. The collector should work
+on Linux, *BSD, recent Windows versions, MacOS X, HP/UX, Solaris, Tru64, Irix
+and a few other operating systems. Some ports are more polished than others.
+
+Irix pthreads, Linux threads, Win32 threads, Solaris threads (pthreads only),
+HP/UX 11 pthreads, Tru64 pthreads, and MacOS X threads are supported in recent
+versions.
+
+### Separately distributed ports
+
+For MacOS 9/Classic use, Patrick Beard's latest port is available from
+`http://homepage.mac.com/pcbeard/gc/`. (Unfortunately, that's now quite dated.
+I'm not in a position to test under MacOS. Although I try to incorporate
+changes, it is impossible for me to update the project file.)
+
+Precompiled versions of the collector for NetBSD are available
+[here](ftp://ftp.netbsd.org/pub/pkgsrc/current/pkgsrc/devel/boehm-gc/README.html).
+
+
+[Debian Linux](http://www.debian.org/) includes prepackaged versions of the
+collector.
+
+## Scalable multiprocessor versions
+
+Kenjiro Taura, Toshio Endo, and Akinori Yonezawa have made available
+a [parallel collector](http://ieeexplore.ieee.org/abstract/document/1592629/)
+based on this one. Their collector takes advantage of multiple processors
+during a collection. Starting with GC v6.0alpha1 we also do this, though with
+more modest processor scalability goals. Our approach is discussed briefly
+in [this document](scale.md).
+
+## Some Collector Details
+
+The collector uses a [mark-sweep](http://www.hboehm.info/gc/complexity.html)
+algorithm. It provides incremental and generational collection under operating
+systems which provide the right kind of virtual memory support. (Currently
+this includes SunOS[45], IRIX, OSF/1, Linux, and Windows, with varying
+restrictions.) It allows [_finalization_](finalization.md) code to be invoked
+when an object is collected. It can take advantage of type information
+to locate pointers if such information is provided, but it is usually used
+without such information. See the README and `gc.h` files in the distribution
+for more details.
+
+For an overview of the implementation, see [here](gcdescr.md).
+
+The garbage collector distribution includes a C string (`cord.h`) package that
+provides for fast concatenation and substring operations on long strings.
+A simple curses- and win32-based editor that represents the entire file as
+a cord is included as a sample application.
+
+Performance of the non-incremental collector is typically competitive with
+`malloc`/`free` implementations. Both space and time overhead are likely to be
+only slightly higher for programs written for `malloc`/`free` (see Detlefs,
+Dosser and Zorn's
+[Memory Allocation Costs in Large C and C++ Programs](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.30.3073&rep=rep1&type=ps)).
+For programs allocating primarily very small objects, the collector may be
+faster; for programs allocating primarily large objects it will be slower.
+If the collector is used in a multi-threaded environment and configured for
+thread-local allocation, it may in some cases significantly outperform
+`malloc`/`free` allocation in time.
+
+We also expect that in many cases any additional overhead will be more than
+compensated for by decreased copying etc. if programs are written and tuned
+for garbage collection.
+
+# Further Reading:
+
+**The beginnings of a frequently asked questions list for this collector are
+[here](http://www.hboehm.info/gc/faq.html)**.
+
+**The following provide information on garbage collection in general**: Paul
+Wilson's [garbage collection ftp archive](ftp://ftp.cs.utexas.edu/pub/garbage)
+and [GC survey](ftp://ftp.cs.utexas.edu/pub/garbage/gcsurvey.ps).
+
+The Ravenbrook
+[Memory Management Reference](http://www.memorymanagement.org/).
+
+David Chase's [GC FAQ](http://www.iecc.com/gclist/GC-faq.html).
+
+Richard Jones'
+[Garbage Collection Page](https://www.cs.kent.ac.uk/people/staff/rej/gc.html)
+and his [book](http://www.cs.kent.ac.uk/people/staff/rej/gcbook/gcbook.html).
+
+**The following papers describe the collector algorithms we use and the
+underlying design decisions at a higher level.**
+
+(Some of the lower level details can be found [here](gcdescr.md).)
+
+The first one is not available electronically due to copyright considerations.
+Most of the others are subject to ACM copyright.
+
+Boehm, H., Dynamic Memory Allocation and Garbage Collection,
+_Computers in Physics 9_, 3, May/June 1995, pp. 297-303. This is directed
+at an otherwise sophisticated audience unfamiliar with memory allocation
+issues. The algorithmic details differ from those in the implementation. There
+is a related letter to the editor and a minor correction in the next issue.
+
+Boehm, H., and [M. Weiser](http://www.ubiq.com/hypertext/weiser/weiser.html),
+[Garbage Collection in an Uncooperative Environment](http://www.hboehm.info/spe_gc_paper/),
+_Software Practice & Experience_, September 1988, pp. 807-820.
+
+Boehm, H., A. Demers, and S. Shenker,
+[Mostly Parallel Garbage Collection](http://www.hboehm.info/gc/papers/pldi91.ps.Z),
+Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design
+and Implementation, _SIGPLAN Notices 26_, 6 (June 1991), pp. 157-164.
+
+Boehm, H.,
+[Space Efficient Conservative Garbage Collection](http://www.hboehm.info/gc/papers/pldi93.ps.Z),
+Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design
+and Implementation, _SIGPLAN Notices 28_, 6 (June 1993), pp. 197-206.
+
+Boehm, H., Reducing Garbage Collector Cache Misses,
+_Proceedings of the 2000 International Symposium on Memory Management_.
+[Official version](http://portal.acm.org/citation.cfm?doid=362422.362438).
+[Technical report](http://www.hpl.hp.com/techreports/2000/HPL-2000-99.html)
+version. Describes the prefetch strategy incorporated into the collector for
+some platforms. Explains why the sweep phase of a _mark-sweep_ collector
+should not really be a distinct phase.
+
+M. Serrano, H. Boehm, Understanding Memory Allocation of Scheme Programs,
+_Proceedings of the Fifth ACM SIGPLAN International Conference on Functional
+Programming_, 2000, Montreal, Canada, pp. 245-256.
+[Official version](http://dl.acm.org/citation.cfm?id=351264). Earlier
+[Technical Report](http://www.hpl.hp.com/techreports/2000/HPL-2000-62.html)
+version. Includes some discussion of the collector debugging facilities for
+identifying causes of memory retention.
+
+Boehm, H., Fast Multiprocessor Memory Allocation and Garbage Collection,
+[HP Labs Technical Report HPL 2000-165](http://www.hpl.hp.com/techreports/2000/HPL-2000-165.html).
+Discusses the parallel collection algorithms, and presents some performance
+results.
+
+Boehm, H., Bounding Space Usage of Conservative Garbage Collectors,
+_Proceedings of the 2002 ACM SIGPLAN-SIGACT Symposium on Principles
+of Programming Languages_, Jan. 2002, pp. 93-100.
+[Official version](http://portal.acm.org/citation.cfm?doid=503272.503282).
+[Technical report](http://www.hpl.hp.com/techreports/2001/HPL-2001-251.html)
+version. Includes a discussion of a collector facility to much more reliably
+test for the potential of unbounded heap growth.
+
+**The following papers discuss language and compiler restrictions necessary
+to guaranteed safety of conservative garbage collection.**
+
+We thank John Levine and JCLT for allowing us to make the second paper
+available electronically, and providing PostScript for the final version.
+
+Boehm, H.,
+[Simple Garbage-Collector-Safety](http://www.hboehm.info/gc/papers/pldi96.ps.gz),
+Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design
+and Implementation.
+
+Boehm, H., and D. Chase,
+[A Proposal for Garbage-Collector-Safe C Compilation](http://www.hboehm.info/gc/papers/boecha.ps.gz),
+_Journal of C Language Translation 4_, 2 (December 1992), pp. 126-141.
+
+**Other related information:**
+
+The Detlefs, Dosser and Zorn's
+[Memory Allocation Costs in Large C and C++ Programs](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.30.3073&rep=rep1&type=ps).
+This is a performance comparison of the Boehm-Demers-Weiser collector
+to `malloc`/`free`, using programs written for `malloc`/`free`.
+
+Joel Bartlett's
+[mostly copying conservative garbage collector for C++](ftp://gatekeeper.dec.com/pub/compaq/WRL/research-reports/WRL-TN-12.ps).
+
+John Ellis and David Detlef's
+[Safe Efficient Garbage Collection for C++](http://dl.acm.org/citation.cfm?id=1267983)
+proposal.
+
+Henry Baker's [paper collection](http://home.pipeline.com/%7Ehbaker1/).
+
+Slides for Hans Boehm's
+[Allocation and GC Myths](http://www.hboehm.info/gc/myths.ps) talk.
+
+# Current users:
+
+Known current users of some variant of this collector include:
+
+The runtime system for [GCJ](https://gcc.gnu.org/onlinedocs/gcc-4.8.5/gcj/),
+the static GNU java compiler.
+
+[W3m](http://w3m.sourceforge.net/), a text-based web browser.
+
+Some versions of the Xerox DocuPrint printer software.
+
+The [Mozilla](http://www.mozilla.org/) project, as leak detector.
+
+The [Mono](http://www.mono-project.com) project, an open source implementation
+of the .NET development framework.
+
+The [DotGNU Portable.NET project](http://www.gnu.org/projects/dotgnu/),
+another open source .NET implementation.
+
+The [Irssi IRC client](http://irssi.org/).
+
+[The Berkeley Titanium project](http://titanium.cs.berkeley.edu/).
+
+[The NAGWare f90 Fortran 90 compiler](http://www.nag.co.uk/nagware/NP/NP_desc.asp).
+
+Elwood Corporation's Eclipse Common Lisp system, C library, and translator.
+
+The [Bigloo Scheme](http://www-sop.inria.fr/mimosa/fp/Bigloo/) and
+[Camloo ML compilers](https://github.com/samoht/camloo) written by Manuel
+Serrano and others.
+
+Brent Benson's
+[libscheme](http://www.cs.indiana.edu/scheme-repository/libscheme-vhll/final.html).
+
+The MzScheme scheme implementation.
+
+The
+[University of Washington Cecil Implementation](http://www.cs.washington.edu/research/projects/cecil/www/cecil-home.html).
+
+[The Berkeley Sather implementation](http://www1.icsi.berkeley.edu/~sather/).
+
+[The Berkeley Harmonia Project](http://www.cs.berkeley.edu/~harmonia/harmonia/index.html).
+
+The
+[Toba](http://www.cs.arizona.edu/projects/sumatra/toba/) Java Virtual Machine
+to C translator.
+
+The [Gwydion Dylan compiler](http://www.gwydiondylan.org/).
+
+The
+[GNU Objective C runtime](http://gcc.gnu.org/onlinedocs/gcc/Objective-C.html).
+
+[Macaulay 2](http://www.math.uiuc.edu/Macaulay2), a system to support research
+in algebraic geometry and commutative algebra.
+
+The [Vesta](http://www.vestasys.org/) configuration management system.
+
+[Visual Prolog 6](http://www.visual-prolog.com/).
+
+[Asymptote LaTeX-compatible vector graphics language](http://asymptote.sf.net/).
+
+# More information on the BDWGC primary site
+
+[A simple illustration of how to build and use the collector](simple_example.md).
+
+[Description of alternate interfaces to the garbage collector](gcinterface.md).
+
+[Slides from an ISMM 2004 tutorial about the GC](http://www.hboehm.info/gc/04tutorial.pdf).
+
+[A FAQ (frequently asked questions) list](http://www.hboehm.info/gc/faq.html).
+
+[How to use the garbage collector as a leak detector](leak.md).
+
+[Some hints on debugging garbage collected applications](debugging.md).
+
+[An overview of the implementation of the garbage collector](gcdescr.md).
+
+[The data structure used for fast pointer lookups](tree.md).
+
+[Scalability of the collector to multiprocessors](scale.md).
+
+[Directory](http://www.hboehm.info/gc/gc_source/) containing the distribution
+files of all garbage collector releases. It duplicates
+[Download](https://github.com/ivmai/bdwgc/wiki/Download) page on GitHub.
+
+# More background information
+
+[An attempt to establish a bound on space usage of conservative garbage collectors](http://www.hboehm.info/gc/bounds.html).
+
+[Mark-sweep versus copying garbage collectors and their complexity](http://www.hboehm.info/gc/complexity.html).
+
+[Pros and cons of conservative garbage collectors, in comparison to other collectors](http://www.hboehm.info/gc/conservative.html).
+
+[Issues related to garbage collection vs. manual memory management in C/C++](http://www.hboehm.info/gc/issues.html).
+
+[An example of a case in which garbage collection results in a much faster implementation as a result of reduced synchronization](http://www.hboehm.info/gc/example.html).
+
+[Slide set discussing performance of nonmoving garbage collectors](http://www.hboehm.info/gc/nonmoving/).
+
+[Slide set discussing _Destructors, Finalizers, and Synchronization_, POPL 2003](http://www.hboehm.info/popl03/web/).
+
+[Paper corresponding to above slide set](http://portal.acm.org/citation.cfm?doid=604131.604153)
+([Technical Report](http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html)
+version).
+
+[A Java/Scheme/C/C++ garbage collection benchmark](http://www.hboehm.info/gc/gc_bench/).
+
+[Slides for talk on memory allocation myths](http://www.hboehm.info/gc/myths.ps).
+
+[Slides for OOPSLA 98 garbage collection talk](http://www.hboehm.info/gc/gctalk.ps).
+
+[Related papers](http://www.hboehm.info/gc/papers/).
+
+# Contacts and new release announcements
+
+GitHub and Stack Overflow are the major two places for communication.
+
+Technical questions (how to, how does it work, etc.) should be posted
+to [Stack Overflow](https://stackoverflow.com/questions/tagged/boehm-gc) with
+_boehm-gc_ tag.
+
+To contribute, please rebase your code to the latest
+[master](https://github.com/ivmai/bdwgc/tree/master/) and submit
+a [pull request](https://github.com/ivmai/bdwgc/pulls) to GitHub.
+
+To report a bug, or propose (request) a new feature, create
+a [GitHub issue](https://github.com/ivmai/bdwgc/issues). Please make sure
+it has not been reported yet by someone else.
+
+To receive notifications on every release, please subscribe to
+[Releases RSS feed](https://github.com/ivmai/bdwgc/releases.atom).
+Notifications on all issues and pull requests are available
+by [watching](https://github.com/ivmai/bdwgc/watchers) the project.
+
+Mailing lists (bdwgc-announce@lists.opendylan.org, bdwgc@lists.opendylan.org,
+and the former gc-announce@linux.hpl.hp.com and gc@linux.hpl.hp.com) are not
+used at this moment. Their content is available
+in
+[bdwgc-announce](https://github.com/ivmai/bdwgc/files/1037650/bdwgc-announce-mailing-list-archive-2014_02.tar.gz)
+and
+[bdwgc](https://github.com/ivmai/bdwgc/files/1038163/bdwgc-mailing-list-archive-2017_04.tar.gz)
+archive files, respectively. The gc list archive may also be read
+at [Narkive](http://bdwgc.opendylan.narkive.com).
+
+Some prior discussion of the collector has taken place on the gcc java mailing
+list, whose archives appear [here](http://gcc.gnu.org/ml/java/), and also
+on [gclist@iecc.com](http://lists.tunes.org/mailman/listinfo/gclist).
+++ /dev/null
-<HTML>
-<HEAD>
- <TITLE> Two-Level Tree Structure for Fast Pointer Lookup</TITLE>
- <AUTHOR> Hans-J. Boehm, Silicon Graphics (now at HP)</author>
-</HEAD>
-<BODY>
-<H1>Two-Level Tree Structure for Fast Pointer Lookup</h1>
-<P>
-The BDWGC conservative garbage collector uses a 2-level tree
-data structure to aid in fast pointer identification.
-This data structure is described in a bit more detail here, since
-<OL>
-<LI> Variations of the data structure are more generally useful.
-<LI> It appears to be hard to understand by reading the code.
-<LI> Some other collectors appear to use inferior data structures to
-solve the same problem.
-<LI> It is central to fast collector operation.
-</ol>
-A candidate pointer is divided into three sections, the <I>high</i>,
-<I>middle</i>, and <I>low</i> bits. The exact division between these
-three groups of bits is dependent on the detailed collector configuration.
-<P>
-The high and middle bits are used to look up an entry in the table described
-here. The resulting table entry consists of either a block descriptor
-(<TT>struct hblkhdr *</tt> or <TT>hdr *</tt>)
-identifying the layout of objects in the block, or an indication that this
-address range corresponds to the middle of a large block, together with a
-hint for locating the actual block descriptor. Such a hint consist
-of a displacement that can be subtracted from the middle bits of the candidate
-pointer without leaving the object.
-<P>
-In either case, the block descriptor (<TT>struct hblkhdr</tt>)
-refers to a table of object starting addresses (the <TT>hb_map</tt> field).
-The starting address table is indexed by the low bits if the candidate pointer.
-The resulting entry contains a displacement to the beginning of the object,
-or an indication that this cannot be a valid object pointer.
-(If all interior pointer are recognized, pointers into large objects
-are handled specially, as appropriate.)
-
-<H2>The Tree</h2>
-<P>
-The rest of this discussion focuses on the two level data structure
-used to map the high and middle bits to the block descriptor.
-<P>
-The high bits are used as an index into the <TT>GC_top_index</tt> (really
-<TT>GC_arrays._top_index</tt>) array. Each entry points to a
-<TT>bottom_index</tt> data structure. This structure in turn consists
-mostly of an array <TT>index</tt> indexed by the middle bits of
-the candidate pointer. The <TT>index</tt> array contains the actual
-<TT>hdr</tt> pointers.
-<P>
-Thus a pointer lookup consists primarily of a handful of memory references,
-and can be quite fast:
-<OL>
-<LI> The appropriate <TT>bottom_index</tt> pointer is looked up in
-<TT>GC_top_index</tt>, based on the high bits of the candidate pointer.
-<LI> The appropriate <TT>hdr</tt> pointer is looked up in the
-<TT>bottom_index</tt> structure, based on the middle bits.
-<LI> The block layout map pointer is retrieved from the <TT>hdr</tt>
-structure. (This memory reference is necessary since we try to share
-block layout maps.)
-<LI> The displacement to the beginning of the object is retrieved from the
-above map.
-</ol>
-<P>
-In order to conserve space, not all <TT>GC_top_index</tt> entries in fact
-point to distinct <TT>bottom_index</tt> structures. If no address with
-the corresponding high bits is part of the heap, then the entry points
-to <TT>GC_all_nils</tt>, a single <TT>bottom_index</tt> structure consisting
-only of NULL <TT>hdr</tt> pointers.
-<P>
-<TT>Bottom_index</tt> structures contain slightly more information than
-just <TT>hdr</tt> pointers. The <TT>asc_link</tt> field is used to link
-all <TT>bottom_index</tt> structures in ascending order for fast traversal.
-This list is pointed to be <TT>GC_all_bottom_indices</tt>.
-It is maintained with the aid of <TT>key</tt> field that contains the
-high bits corresponding to the <TT>bottom_index</tt>.
-
-<H2>64 bit addresses</h2>
-<P>
-In the case of 64 bit addresses, this picture is complicated slightly
-by the fact that one of the index structures would have to be huge to
-cover the entire address space with a two level tree. We deal with this
-by turning <TT>GC_top_index</tt> into a chained hash table, instead of
-a simple array. This adds a <TT>hash_link</tt> field to the
-<TT>bottom_index</tt> structure.
-<P>
-The "hash function" consists of dropping the high bits. This is cheap to
-compute, and guarantees that there will be no collisions if the heap
-is contiguous and not excessively large.
-
-<H2>A picture</h2>
-<P>
-The following is an ASCII diagram of the data structure.
-This was contributed by Dave Barrett several years ago.
-<PRE>
-
- Data Structure used by GC_base in gc3.7:
- 21-Apr-94
-
-
-
-
- 63 LOG_TOP_SZ[11] LOG_BOTTOM_SZ[10] LOG_HBLKSIZE[13]
- +------------------+----------------+------------------+------------------+
- p:| | TL_HASH(hi) | | HBLKDISPL(p) |
- +------------------+----------------+------------------+------------------+
- \-----------------------HBLKPTR(p)-------------------/
- \------------hi-------------------/
- \______ ________/ \________ _______/ \________ _______/
- V V V
- | | |
- GC_top_index[] | | |
- --- +--------------+ | | |
- ^ | | | | |
- | | | | | |
- TOP +--------------+<--+ | |
- _SZ +-<| [] | * | |
-(items)| +--------------+ if 0 < bi< HBLKSIZE | |
- | | | | then large object | |
- | | | | starts at the bi'th | |
- v | | | HBLK before p. | i |
- --- | +--------------+ | (word- |
- v | aligned) |
- bi= |GET_BI(p){->hash_link}->key==hi | |
- v | |
- | (bottom_index) \ scratch_alloc'd | |
- | ( struct bi ) / by get_index() | |
- --- +->+--------------+ | |
- ^ | | | |
- ^ | | | |
- BOTTOM | | ha=GET_HDR_ADDR(p) | |
-_SZ(items)+--------------+<----------------------+ +-------+
- | +--<| index[] | |
- | | +--------------+ GC_obj_map: v
- | | | | from / +-+-+-----+-+-+-+-+ ---
- v | | | GC_add < 0| | | | | | | | ^
- --- | +--------------+ _map_entry \ +-+-+-----+-+-+-+-+ |
- | | asc_link | +-+-+-----+-+-+-+-+ MAXOBJSZ
- | +--------------+ +-->| | | j | | | | | +1
- | | key | | +-+-+-----+-+-+-+-+ |
- | +--------------+ | +-+-+-----+-+-+-+-+ |
- | | hash_link | | | | | | | | | | v
- | +--------------+ | +-+-+-----+-+-+-+-+ ---
- | | |<--MAX_OFFSET--->|
- | | (bytes)
-HDR(p)| GC_find_header(p) | |<--MAP_ENTRIES-->|
- | \ from | =HBLKSIZE/WORDSZ
- | (hdr) (struct hblkhdr) / alloc_hdr() | (1024 on Alpha)
- +-->+----------------------+ | (8/16 bits each)
-GET_HDR(p)| word hb_sz (words) | |
- +----------------------+ |
- | struct hblk *hb_next | |
- +----------------------+ |
- |mark_proc hb_mark_proc| |
- +----------------------+ |
- | char * hb_map |>-------------+
- +----------------------+
- | ushort hb_obj_kind |
- +----------------------+
- | hb_last_reclaimed |
- --- +----------------------+
- ^ | |
- MARK_BITS| hb_marks[] | *if hdr is free, hb_sz
-_SZ(words)| | is the size of a heap chunk (struct hblk)
- v | | of at least MININCR*HBLKSIZE bytes (below),
- --- +----------------------+ otherwise, size of each object in chunk.
-
-Dynamic data structures above are interleaved throughout the heap in blocks of
-size MININCR * HBLKSIZE bytes as done by gc_scratch_alloc which cannot be
-freed; free lists are used (e.g. alloc_hdr). HBLK's below are collected.
-
- (struct hblk) HDR_BYTES
- --- +----------------------+ < HBLKSIZE --- (bytes)
- ^ +-----hb_body----------+ (and WORDSZ) ^ --- ---
- | | | aligned | ^ ^
- | | | | hb_sz |
- | | | | (words) |
- | | Object 0 | | | |
- | | | i |(word- v |
- | + - - - - - - - - - - -+ --- (bytes)|aligned) --- |
- | | | ^ | ^ |
- | | | j (words) | | |
- n * | Object 1 | v v hb_sz BODY_SZ
- HBLKSIZE | |--------------- | (words)
- (bytes) | | v MAX_OFFSET
- | + - - - - - - - - - - -+ --- (bytes)
- | | | !ALL_INTERIOR_POINTERS ^ |
- | | | sets j only for hb_sz |
- | | Object N | valid object offsets. | |
- v | | All objects WORDSZ v v
- --- +----------------------+ aligned. --- ---
-
-</PRE>
-</body>
--- /dev/null
+# Two-Level Tree Structure for Fast Pointer Lookup
+
+The Boehm-Demers-Weiser conservative Garbage Collector uses a 2-level tree
+data structure to aid in fast pointer identification. This data structure
+is described in a bit more detail here, since
+
+ 1. Variations of the data structure are more generally useful.
+ 2. It appears to be hard to understand by reading the code.
+ 3. Some other collectors appear to use inferior data structures to solve the
+ same problem.
+ 4. It is central to fast collector operation. A candidate pointer
+ is divided into three sections, the _high_, _middle_, and _low_ bits. The
+ exact division between these three groups of bits is dependent on the
+ detailed collector configuration.
+
+The high and middle bits are used to look up an entry in the table described
+here. The resulting table entry consists of either a block descriptor
+(`struct hblkhdr *` or `hdr *`) identifying the layout of objects in the
+block, or an indication that this address range corresponds to the middle of
+a large block, together with a hint for locating the actual block descriptor.
+Such a hint consist of a displacement that can be subtracted from the middle
+bits of the candidate pointer without leaving the object.
+
+In either case, the block descriptor (`struct hblkhdr`) refers to a table
+of object starting addresses (the `hb_map` field). The starting address table
+is indexed by the low bits if the candidate pointer. The resulting entry
+contains a displacement to the beginning of the object, or an indication that
+this cannot be a valid object pointer. (If all interior pointer are
+recognized, pointers into large objects are handled specially,
+as appropriate.)
+
+## The Tree
+
+The rest of this discussion focuses on the two level data structure used
+to map the high and middle bits to the block descriptor.
+
+The high bits are used as an index into the `GC_top_index` (really
+`GC_arrays._top_index`) array. Each entry points to a `bottom_index` data
+structure. This structure in turn consists mostly of an array `index` indexed
+by the middle bits of the candidate pointer. The `index` array contains the
+actual `hdr` pointers.
+
+Thus a pointer lookup consists primarily of a handful of memory references,
+and can be quite fast:
+
+ 1. The appropriate `bottom_index` pointer is looked up in `GC_top_index`,
+ based on the high bits of the candidate pointer.
+ 2. The appropriate `hdr` pointer is looked up in the `bottom_index`
+ structure, based on the middle bits.
+ 3. The block layout map pointer is retrieved from the `hdr` structure. (This
+ memory reference is necessary since we try to share block layout maps.)
+ 4. The displacement to the beginning of the object is retrieved from the
+ above map.
+
+In order to conserve space, not all `GC_top_index` entries in fact point
+to distinct `bottom_index` structures. If no address with the corresponding
+high bits is part of the heap, then the entry points to `GC_all_nils`,
+a single `bottom_index` structure consisting only of `NULL` `hdr` pointers.
+
+`Bottom_index` structures contain slightly more information than just `hdr`
+pointers. The `asc_link` field is used to link all `bottom_index` structures
+in ascending order for fast traversal. This list is pointed to be
+`GC_all_bottom_indices`. It is maintained with the aid of `key` field that
+contains the high bits corresponding to the `bottom_index`.
+
+## 64-bit addresses
+
+In the case of 64-bit addresses, this picture is complicated slightly by the
+fact that one of the index structures would have to be huge to cover the
+entire address space with a two level tree. We deal with this by turning
+`GC_top_index` into a chained hash table, instead of a simple array. This adds
+a `hash_link` field to the `bottom_index` structure.
+
+The _hash function_ consists of dropping the high bits. This is cheap
+to compute, and guarantees that there will be no collisions if the heap
+is contiguous and not excessively large.
+
+## A picture
+
+The following is an _ASCII_ diagram of the data structure used by GC_base (as
+of GC v3.7, Apr 21, 1994). This was contributed by Dave Barrett.
+
+
+ 63 LOG_TOP_SZ[11] LOG_BOTTOM_SZ[10] LOG_HBLKSIZE[13]
+ +------------------+----------------+------------------+------------------+
+ p:| | TL_HASH(hi) | | HBLKDISPL(p) |
+ +------------------+----------------+------------------+------------------+
+ \-----------------------HBLKPTR(p)-------------------/
+ \------------hi-------------------/
+ \______ ________/ \________ _______/ \________ _______/
+ V V V
+ | | |
+ GC_top_index[] | | |
+ --- +--------------+ | | |
+ ^ | | | | |
+ | | | | | |
+ TOP +--------------+<--+ | |
+ _SZ +-<| [] | * | |
+ (items)| +--------------+ if 0 < bi< HBLKSIZE | |
+ | | | | then large object | |
+ | | | | starts at the bi'th | |
+ v | | | HBLK before p. | i |
+ --- | +--------------+ | (word- |
+ v | aligned) |
+ bi= |GET_BI(p){->hash_link}->key==hi | |
+ v | |
+ | (bottom_index) \ scratch_alloc'd | |
+ | ( struct bi ) / by get_index() | |
+ --- +->+--------------+ | |
+ ^ | | | |
+ ^ | | | |
+ BOTTOM | | ha=GET_HDR_ADDR(p) | |
+ _SZ(items)+--------------+<----------------------+ +-------+
+ | +--<| index[] | |
+ | | +--------------+ GC_obj_map: v
+ | | | | from / +-+-+-----+-+-+-+-+ ---
+ v | | | GC_add < 0| | | | | | | | ^
+ --- | +--------------+ _map_entry \ +-+-+-----+-+-+-+-+ |
+ | | asc_link | +-+-+-----+-+-+-+-+ MAXOBJSZ
+ | +--------------+ +-->| | | j | | | | | +1
+ | | key | | +-+-+-----+-+-+-+-+ |
+ | +--------------+ | +-+-+-----+-+-+-+-+ |
+ | | hash_link | | | | | | | | | | v
+ | +--------------+ | +-+-+-----+-+-+-+-+ ---
+ | | |<--MAX_OFFSET--->|
+ | | (bytes)
+ HDR(p)| GC_find_header(p) | |<--MAP_ENTRIES-->|
+ | \ from | =HBLKSIZE/WORDSZ
+ | (hdr) (struct hblkhdr) / alloc_hdr() | (1024 on Alpha)
+ +-->+----------------------+ | (8/16 bits each)
+ GET_HDR(p)| word hb_sz (words) | |
+ +----------------------+ |
+ | struct hblk *hb_next | |
+ +----------------------+ |
+ |mark_proc hb_mark_proc| |
+ +----------------------+ |
+ | char * hb_map |>-------------+
+ +----------------------+
+ | ushort hb_obj_kind |
+ +----------------------+
+ | hb_last_reclaimed |
+ --- +----------------------+
+ ^ | |
+ MARK_BITS| hb_marks[] | * if hdr is free, hb_sz is the size of
+ _SZ(words)| | a heap chunk (struct hblk) of at least
+ v | | MININCR*HBLKSIZE bytes (below),
+ --- +----------------------+ otherwise, size of each object in chunk.
+
+
+Dynamic data structures above are interleaved throughout the heap in blocks
+of size `MININCR * HBLKSIZE` bytes as done by `gc_scratch_alloc` which cannot
+be freed; free lists are used (e.g. `alloc_hdr`). `HBLK`'s below are
+collected.
+
+
+ (struct hblk) HDR_BYTES
+ --- +----------------------+ < HBLKSIZE --- (bytes)
+ ^ +-----hb_body----------+ (and WORDSZ) ^ --- ---
+ | | | aligned | ^ ^
+ | | | | hb_sz |
+ | | | | (words) |
+ | | Object 0 | | | |
+ | | | i |(word- v |
+ | + - - - - - - - - - - -+ --- (bytes)|aligned) --- |
+ | | | ^ | ^ |
+ | | | j (words) | | |
+ n * | Object 1 | v v hb_sz BODY_SZ
+ HBLKSIZE | |--------------- | (words)
+ (bytes) | | v MAX_OFFSET
+ | + - - - - - - - - - - -+ --- (bytes)
+ | | | !ALL_INTERIOR_POINTERS ^ |
+ | | | sets j only for hb_sz |
+ | | Object N | valid object offsets. | |
+ v | | All objects WORDSZ v v
+ --- +----------------------+ aligned. --- ---