src/backend/utils/mmgr/README
-Notes About Memory Allocation Redesign
-======================================
-
-Up through version 7.0, Postgres had serious problems with memory leakage
-during large queries that process a lot of pass-by-reference data. There
-was no provision for recycling memory until end of query. This needed to be
-fixed, even more so with the advent of TOAST which allows very large chunks
-of data to be passed around in the system. This document describes the new
-memory management system implemented in 7.1.
-
+Memory Context System Design Overview
+=====================================
Background
----------
allocated in.
At all times there is a "current" context denoted by the
-CurrentMemoryContext global variable. The backend macro palloc()
-implicitly allocates space in that context. The MemoryContextSwitchTo()
-operation selects a new current context (and returns the previous context,
-so that the caller can restore the previous context before exiting).
+CurrentMemoryContext global variable. palloc() implicitly allocates space
+in that context. The MemoryContextSwitchTo() operation selects a new current
+context (and returns the previous context, so that the caller can restore the
+previous context before exiting).
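The save-and-restore idiom described above can be sketched in isolation. This is a toy model only: ToyContext, toy_current, toy_switch_to() and toy_work_in() are hypothetical names invented for illustration, not part of the backend, and no memory is actually allocated.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a memory context; only its identity matters here. */
typedef struct ToyContext
{
    const char *name;
} ToyContext;

static ToyContext toy_top = {"top"};

/* Plays the role of the CurrentMemoryContext global. */
static ToyContext *toy_current = &toy_top;

/* Select a new current context and hand back the previous one. */
static ToyContext *
toy_switch_to(ToyContext *ctx)
{
    ToyContext *old = toy_current;

    assert(ctx != NULL);
    toy_current = ctx;
    return old;
}

/* Work in a chosen context, then restore the caller's context on exit. */
static const char *
toy_work_in(ToyContext *ctx)
{
    ToyContext *old = toy_switch_to(ctx);
    const char *used = toy_current->name;   /* allocations would land here */

    toy_switch_to(old);
    return used;
}
```

The point of returning the old context is that a routine never needs to know which context its caller was using; it only has to put back whatever it found.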
The main advantage of memory contexts over plain use of malloc/free is
that the entire contents of a memory context can be freed easily, without
malloc and friends, but there are some deliberate differences too. Here
are some notes to clarify the behavior.
-* If out of memory, palloc and repalloc exit via elog(ERROR). They never
-return NULL, and it is not necessary or useful to test for such a result.
+* If out of memory, palloc and repalloc exit via elog(ERROR). They
+never return NULL, and it is not necessary or useful to test for such
+a result. With palloc_extended() that behavior can be overridden
+using the MCXT_ALLOC_NO_OOM flag.
* palloc(0) is explicitly a valid operation. It does not return a NULL
pointer, but a valid chunk of which no bytes may be used. However, the
* pfree and repalloc do not accept a NULL pointer. This is intentional.
-pfree/repalloc No Longer Depend On CurrentMemoryContext
--------------------------------------------------------
-
-Since Postgres 7.1, pfree() and repalloc() can be applied to any chunk
-whether it belongs to CurrentMemoryContext or not --- the chunk's owning
-context will be invoked to handle the operation, regardless. This is a
-change from the old requirement that CurrentMemoryContext must be set
-to the same context the memory was allocated from before one can use
-pfree() or repalloc().
-
-There was some consideration of getting rid of CurrentMemoryContext entirely,
-instead requiring the target memory context for allocation to be specified
-explicitly. But we decided that would be too much notational overhead ---
-we'd have to pass an appropriate memory context to called routines in
-many places. For example, the copyObject routines would need to be passed
-a context, as would function execution routines that return a
-pass-by-reference datatype. And what of routines that temporarily
-allocate space internally, but don't return it to their caller? We
-certainly don't want to clutter every call in the system with "here is
-a context to use for any temporary memory allocation you might want to
-do". So there'd still need to be a global variable specifying a suitable
-temporary-allocation context. That might as well be CurrentMemoryContext.
+The Current Memory Context
+--------------------------
+
+Because it would be too much notational overhead to always pass an
+appropriate memory context to called routines, there always exists the
+notion of the current memory context, CurrentMemoryContext. Without it,
+for example, the copyObject routines would need to be passed a context, as
+would function execution routines that return a pass-by-reference
+datatype. The same goes for routines that temporarily allocate space
+internally, but don't return it to their caller. We certainly don't
+want to clutter every call in the system with "here is a context to
+use for any temporary memory allocation you might want to do".
The upshot of that reasoning, though, is that CurrentMemoryContext should
generally point at a short-lifespan context if at all possible. During
permanent memory leaks.
-Additions to the Memory-Context Mechanism
------------------------------------------
-
-Before 7.1 memory contexts were all independent, but it was too hard to
-keep track of them; with lots of contexts there needs to be explicit
-mechanism for that.
-
-We solved this by creating a tree of "parent" and "child" contexts. When
-creating a memory context, the new context can be specified to be a child
-of some existing context. A context can have many children, but only one
-parent. In this way the contexts form a forest (not necessarily a single
-tree, since there could be more than one top-level context; although in
-current practice there is only one top context, TopMemoryContext).
-
-We then say that resetting or deleting any particular context resets or
-deletes all its direct and indirect children as well. This feature allows
-us to manage a lot of contexts without fear that some will be leaked; we
-only need to keep track of one top-level context that we are going to
-delete at transaction end, and make sure that any shorter-lived contexts
-we create are descendants of that context. Since the tree can have
-multiple levels, we can deal easily with nested lifetimes of storage,
-such as per-transaction, per-statement, per-scan, per-tuple. Storage
-lifetimes that only partially overlap can be handled by allocating
-from different trees of the context forest (there are some examples
-in the next section).
-
-Actually, it turns out that resetting a given context should almost
-always imply deleting, not just resetting, any child contexts it has.
-So MemoryContextReset() means that, and if you really do want a tree of
-empty contexts you need to call MemoryContextResetOnly() plus
-MemoryContextResetChildren().
+pfree/repalloc Do Not Depend On CurrentMemoryContext
+----------------------------------------------------
+
+pfree() and repalloc() can be applied to any chunk whether it belongs
+to CurrentMemoryContext or not --- the chunk's owning context will be
+invoked to handle the operation, regardless.
+
+
+"Parent" and "Child" Contexts
+-----------------------------
+
+If all contexts were independent, it'd be hard to keep track of them,
+especially in error cases. This is solved by creating a tree of
+"parent" and "child" contexts. When creating a memory context, the
+new context can be specified to be a child of some existing context.
+A context can have many children, but only one parent. In this way
+the contexts form a forest (not necessarily a single tree, since there
+could be more than one top-level context; although in current practice
+there is only one top context, TopMemoryContext).
+
+Deleting a context deletes all its direct and indirect children as
+well. Resetting a context should almost always delete, rather than
+merely reset, its child contexts, so MemoryContextReset() does exactly
+that. If you really do want a tree of empty contexts you need to call
+MemoryContextResetOnly() plus MemoryContextResetChildren().
+
+These features allow us to manage a lot of contexts without fear that
+some will be leaked; we only need to keep track of one top-level
+context that we are going to delete at transaction end, and make sure
+that any shorter-lived contexts we create are descendants of that
+context. Since the tree can have multiple levels, we can deal easily
+with nested lifetimes of storage, such as per-transaction,
+per-statement, per-scan, per-tuple. Storage lifetimes that only
+partially overlap can be handled by allocating from different trees of
+the context forest (there are some examples in the next section).
For convenience we also provide operations like "reset/delete all children
of a given context, but don't reset or delete that context itself".
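The cascading behavior described in this section can be illustrated with a small stand-alone model. It is a sketch under stated assumptions: the ToyContext struct and the toy_* functions are hypothetical, the contexts hold no memory of their own, and the live-context counter exists only to make the effect observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy context node: just the tree links, no allocator state. */
typedef struct ToyContext
{
    struct ToyContext *parent;      /* NULL for a top-level context */
    struct ToyContext *firstchild;  /* head of the list of children */
    struct ToyContext *nextchild;   /* next child of the same parent */
} ToyContext;

static int toy_live_contexts = 0;

/* Create a context, optionally linking it under a parent. */
static ToyContext *
toy_create(ToyContext *parent)
{
    ToyContext *ctx = calloc(1, sizeof(ToyContext));

    assert(ctx != NULL);
    ctx->parent = parent;
    if (parent != NULL)
    {
        ctx->nextchild = parent->firstchild;
        parent->firstchild = ctx;
    }
    toy_live_contexts++;
    return ctx;
}

static void toy_delete_children(ToyContext *ctx);

/* Delete a context together with all its direct and indirect children. */
static void
toy_delete(ToyContext *ctx)
{
    toy_delete_children(ctx);

    /* Unlink from the parent's child list, if there is a parent. */
    if (ctx->parent != NULL)
    {
        ToyContext **link = &ctx->parent->firstchild;

        while (*link != ctx)
            link = &(*link)->nextchild;
        *link = ctx->nextchild;
    }
    free(ctx);
    toy_live_contexts--;
}

/* "Delete all children of a given context, but not the context itself". */
static void
toy_delete_children(ToyContext *ctx)
{
    /* Each toy_delete() unlinks the child, so firstchild advances. */
    while (ctx->firstchild != NULL)
        toy_delete(ctx->firstchild);
}
```

With this shape, the caller only has to remember the one context at the root of a lifetime; everything created beneath it goes away with it.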
+Memory Context Reset/Delete Callbacks
+-------------------------------------
+
+A feature introduced in Postgres 9.5 allows memory contexts to be used
+for managing more resources than just plain palloc'd memory. This is
+done by registering a "reset callback function" for a memory context.
+Such a function will be called, once, just before the context is next
+reset or deleted. It can be used to give up resources that are in some
+sense associated with an object allocated within the context. Possible
+use-cases include
+* closing open files associated with a tuplesort object;
+* releasing reference counts on long-lived cache objects that are held
+ by some object within the context being reset;
+* freeing malloc-managed memory associated with some palloc'd object.
+That last case would just represent bad programming practice for pure
+Postgres code; better to have made all the allocations using palloc,
+in the target context or some child context. However, it could well
+come in handy for code that interfaces to non-Postgres libraries.
+
+Any number of reset callbacks can be established for a memory context;
+they are called in reverse order of registration. Also, callbacks
+attached to child contexts are called before callbacks attached to
+parent contexts, if a tree of contexts is being reset or deleted.
+
+The API for this requires the caller to provide a MemoryContextCallback
+memory chunk to hold the state for a callback. Typically this should be
+allocated in the same context it is logically attached to, so that it
+will be released automatically after use. The reason for asking the
+caller to provide this memory is that in most usage scenarios, the caller
+will be creating some larger struct within the target context, and the
+MemoryContextCallback struct can be made "for free" without a separate
+palloc() call by including it in this larger struct.
+
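The callback rules above (called once, in reverse order of registration, with caller-provided storage embedded in a larger struct) can be sketched as follows. This is a toy model, not the backend API: ToyCallback, ToyContext, ToyResource and the toy_* functions are hypothetical names, and the release log exists only so the ordering can be checked.

```c
#include <assert.h>
#include <stddef.h>

/* Caller-provided callback state, linked into the context's list. */
typedef struct ToyCallback
{
    void        (*func) (void *arg);
    void       *arg;
    struct ToyCallback *next;
} ToyCallback;

typedef struct ToyContext
{
    ToyCallback *reset_cbs;     /* most recently registered first */
} ToyContext;

/* Push onto the head, so the latest registration runs first. */
static void
toy_register_reset_callback(ToyContext *ctx, ToyCallback *cb)
{
    assert(cb->func != NULL);
    cb->next = ctx->reset_cbs;
    ctx->reset_cbs = cb;
}

/* Run every callback exactly once; unlink each before calling it. */
static void
toy_reset(ToyContext *ctx)
{
    ToyCallback *cb;

    while ((cb = ctx->reset_cbs) != NULL)
    {
        ctx->reset_cbs = cb->next;
        cb->func(cb->arg);
    }
    /* A real context would release its memory at this point. */
}

/* A larger struct that carries its callback "for free". */
typedef struct ToyResource
{
    ToyCallback cb;
    int         id;
} ToyResource;

static int toy_release_log[8];
static int toy_release_count = 0;

static void
toy_release(void *arg)
{
    ToyResource *res = arg;

    toy_release_log[toy_release_count++] = res->id;
}

/* Attach a resource to a context so it is released on reset. */
static void
toy_attach(ToyContext *ctx, ToyResource *res, int id)
{
    res->id = id;
    res->cb.func = toy_release;
    res->cb.arg = res;
    toy_register_reset_callback(ctx, &res->cb);
}
```

Embedding the callback struct in ToyResource mirrors the design rationale given above: no separate allocation is needed just to hold the callback state.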
+
+Memory Contexts in Practice
+===========================
+
Globally Known Contexts
-----------------------
Mechanisms to Allow Multiple Types of Contexts
----------------------------------------------
-We may want several different types of memory contexts with different
-allocation policies but similar external behavior. To handle this,
-memory allocation functions will be accessed via function pointers,
-and we will require all context types to obey the conventions given here.
-(As of 2015, there's actually still just one context type; but interest in
-creating other types has never gone away entirely, so we retain this API.)
-
-A memory context is represented by an object like
-
-typedef struct MemoryContextData
-{
- NodeTag type; /* identifies exact kind of context */
- MemoryContextMethods methods;
- MemoryContextData *parent; /* NULL if no parent (toplevel context) */
- MemoryContextData *firstchild; /* head of linked list of children */
- MemoryContextData *nextchild; /* next child of same parent */
- char *name; /* context name (just for debugging) */
-} MemoryContextData, *MemoryContext;
-
-This is essentially an abstract superclass, and the "methods" pointer is
-its virtual function table. Specific memory context types will use
+To efficiently support different allocation patterns, and to permit
+experimentation, we allow different types of memory contexts with
+different allocation policies but similar external behavior. To
+handle this, memory allocation functions are accessed via function
+pointers, and we require all context types to obey the conventions
+given here.
+
+A memory context is represented by struct MemoryContextData (see
+memnodes.h). This struct identifies the exact type of the context, and
+contains information common between the different types of
+MemoryContext like the parent and child contexts, and the name of the
+context.
+
+This is essentially an abstract superclass, and the behavior is
+determined by the "methods" pointer, which is its virtual function table
+(struct MemoryContextMethods). Specific memory context types will use
derived structs having these fields as their first fields. All the
-contexts of a specific type will have methods pointers that point to the
-same static table of function pointers, which look like
-
-typedef struct MemoryContextMethodsData
-{
- Pointer (*alloc) (MemoryContext c, Size size);
- void (*free_p) (Pointer chunk);
- Pointer (*realloc) (Pointer chunk, Size newsize);
- void (*reset) (MemoryContext c);
- void (*delete) (MemoryContext c);
-} MemoryContextMethodsData, *MemoryContextMethods;
-
-Alloc, reset, and delete requests will take a MemoryContext pointer
-as parameter, so they'll have no trouble finding the method pointer
-to call. Free and realloc are trickier. To make those work, we
-require all memory context types to produce allocated chunks that
-are immediately preceded by a standard chunk header, which has the
-layout
-
-typedef struct StandardChunkHeader
-{
- MemoryContext mycontext; /* Link to owning context object */
- Size size; /* Allocated size of chunk */
-};
-
-It turns out that the pre-existing aset.c memory context type did this
-already, and probably any other kind of context would need to have the
-same data available to support realloc, so this is not really creating
-any additional overhead. (Note that if a context type needs more per-
-allocated-chunk information than this, it can make an additional
-nonstandard header that precedes the standard header. So we're not
-constraining context-type designers very much.)
-
-Given this, the pfree routine looks something like
-
- StandardChunkHeader * header =
- (StandardChunkHeader *) ((char *) p - sizeof(StandardChunkHeader));
-
- (*header->mycontext->methods->free_p) (p);
+contexts of a specific type will have methods pointers that point to
+the same static table of function pointers.
+
+While operations like allocating from and resetting a context take the
+relevant MemoryContext as a parameter, operations like free and
+realloc are trickier. To make those work, we require all memory
+context types to produce allocated chunks that are immediately,
+without any padding, preceded by a pointer to the corresponding
+MemoryContext.
+
+If a type of allocator needs additional information about its chunks,
+such as the size of the allocation, that information can in turn
+precede the MemoryContext pointer. This means the only overhead implied
+by the memory context mechanism is a pointer to its context, so we're not
+constraining context-type designers very much.
+
+Given this, routines like pfree determine the chunk's owning context
+with an operation like the following (usually encapsulated in
+GetMemoryChunkContext())
+
+    MemoryContext context = *(MemoryContext *) (((char *) pointer) - sizeof(void *));
+
+and then invoke the corresponding method for the context
+
+    (*context->methods->free_p) (pointer);
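To make the dispatch mechanism concrete, here is a self-contained toy allocator that stores a context pointer immediately before each chunk and routes frees through a method table. All names (ToyContext, ToyMethods, toy_alloc, toy_free and so on) are hypothetical, the method signatures are a choice made for this sketch rather than the backend's, and the toy does not preserve maximum alignment for the returned chunk the way a real allocator must.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ToyContext ToyContext;

/* The "virtual function table" shared by all contexts of one type. */
typedef struct ToyMethods
{
    void       *(*alloc) (ToyContext *ctx, size_t size);
    void        (*free_p) (ToyContext *ctx, void *pointer);
} ToyMethods;

struct ToyContext
{
    const ToyMethods *methods;
    int         nchunks;        /* live chunks, for demonstration only */
};

/* Recover the owning context from the word just before the chunk. */
static ToyContext *
toy_chunk_context(void *pointer)
{
    return *(ToyContext **) ((char *) pointer - sizeof(void *));
}

/* One concrete context type: a thin wrapper around malloc/free. */
static void *
toy_malloc_alloc(ToyContext *ctx, size_t size)
{
    char       *raw = malloc(sizeof(ToyContext *) + size);

    assert(raw != NULL);
    *(ToyContext **) raw = ctx;     /* header: pointer to owning context */
    ctx->nchunks++;
    return raw + sizeof(ToyContext *);
}

static void
toy_malloc_free(ToyContext *ctx, void *pointer)
{
    ctx->nchunks--;
    free((char *) pointer - sizeof(ToyContext *));
}

static const ToyMethods toy_malloc_methods = {
    toy_malloc_alloc,
    toy_malloc_free
};

/* Allocation takes the context explicitly, so dispatch is trivial. */
static void *
toy_alloc(ToyContext *ctx, size_t size)
{
    return ctx->methods->alloc(ctx, size);
}

/* Freeing takes only the chunk; the header tells us whom to call. */
static void
toy_free(void *pointer)
{
    ToyContext *ctx = toy_chunk_context(pointer);

    ctx->methods->free_p(ctx, pointer);
}
```

Note that toy_free() never consults any notion of a current context, which is the property the pfree/repalloc section relies on.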
More Control Over aset.c Behavior
---------------------------------
-Previously, aset.c always allocated an 8K block upon the first allocation
-in a context, and doubled that size for each successive block request.
-That's good behavior for a context that might hold *lots* of data, and
-the overhead wasn't bad when we had only a few contexts in existence.
-With dozens if not hundreds of smaller contexts in the system, we need
-to be able to fine-tune things a little better.
+By default, aset.c allocates an 8K block upon the first allocation in
+a context, and doubles that size for each successive block request.
+That's good behavior for a context that might hold *lots* of data.
+But with dozens if not hundreds of smaller contexts in the system, we
+need to be able to fine-tune things a little better.
-The creator of a context is now able to specify an initial block size
-and a maximum block size. Selecting smaller values can prevent wastage
-of space in contexts that aren't expected to hold very much (an example is
-the relcache's per-relation contexts).
+The creator of a context is able to specify an initial block size and
+a maximum block size. Selecting smaller values can prevent wastage of
+space in contexts that aren't expected to hold very much (an example
+is the relcache's per-relation contexts).
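The effect of an initial and a maximum block size can be seen in a few lines of arithmetic. The sketch below is an assumption-laden simplification: ToyBlockSizer and its functions are hypothetical, it models only the doubling-up-to-a-cap policy described here, and it ignores everything else an allocator does (for example, how requests larger than a block are handled).

```c
#include <assert.h>
#include <stddef.h>

/* Tracks the size to use for the next block of a toy context. */
typedef struct ToyBlockSizer
{
    size_t      next_size;      /* size of the next block to allocate */
    size_t      max_size;       /* blocks never grow beyond this */
} ToyBlockSizer;

static void
toy_sizer_init(ToyBlockSizer *s, size_t init_size, size_t max_size)
{
    assert(init_size > 0 && init_size <= max_size);
    s->next_size = init_size;
    s->max_size = max_size;
}

/* Return the size for this block, then double for the next, up to the cap. */
static size_t
toy_next_block_size(ToyBlockSizer *s)
{
    size_t      size = s->next_size;

    if (s->next_size <= s->max_size / 2)
        s->next_size *= 2;
    else
        s->next_size = s->max_size;
    return size;
}
```

For instance, with an initial size of 1024 and a maximum of 8192, successive blocks come out as 1024, 2048, 4096, 8192, 8192, so a context that stays small never pays for a large first block.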
Also, it is possible to specify a minimum context size. If this
value is greater than zero then a block of that size will be grabbed
pattern cheap, the first block allocated in a context is not given
back to malloc() during reset, but just cleared. This avoids malloc
thrashing.
-
-
-Memory Context Reset/Delete Callbacks
--------------------------------------
-
-A feature introduced in Postgres 9.5 allows memory contexts to be used
-for managing more resources than just plain palloc'd memory. This is
-done by registering a "reset callback function" for a memory context.
-Such a function will be called, once, just before the context is next
-reset or deleted. It can be used to give up resources that are in some
-sense associated with an object allocated within the context. Possible
-use-cases include
-* closing open files associated with a tuplesort object;
-* releasing reference counts on long-lived cache objects that are held
- by some object within the context being reset;
-* freeing malloc-managed memory associated with some palloc'd object.
-That last case would just represent bad programming practice for pure
-Postgres code; better to have made all the allocations using palloc,
-in the target context or some child context. However, it could well
-come in handy for code that interfaces to non-Postgres libraries.
-
-Any number of reset callbacks can be established for a memory context;
-they are called in reverse order of registration. Also, callbacks
-attached to child contexts are called before callbacks attached to
-parent contexts, if a tree of contexts is being reset or deleted.
-
-The API for this requires the caller to provide a MemoryContextCallback
-memory chunk to hold the state for a callback. Typically this should be
-allocated in the same context it is logically attached to, so that it
-will be released automatically after use. The reason for asking the
-caller to provide this memory is that in most usage scenarios, the caller
-will be creating some larger struct within the target context, and the
-MemoryContextCallback struct can be made "for free" without a separate
-palloc() call by including it in this larger struct.