9. Memory Use


One of the earlier machines I programmed in assembly was the Hewlett-Packard 2116 processor, part of the HP 2000F timesharing computer system I had access to. This machine had no concept of a stack inherently understood by the CPU. When a subroutine was called, for example, the CPU simply overwrote the first word of that routine with the return address, then continued with the next instruction of the routine. Leaving the routine meant jumping indirectly through that first location of the subroutine.

This scheme imposed severe limitations. Subroutines couldn't call themselves recursively, nor could they call another routine that might call them back. Pre-emption was also a big problem: if the CPU switched to another task and that task needed to call an interrupted subroutine, disaster would result.

These days, it's easy for programmers to forget (or never know about) the days before such important concepts as the stack were developed, refined, and implemented. Most CPUs today support a stack concept in their hardware -- it's just that useful. But there was a time when these things were hard-learned by experience and imagined only through arcane theory.

The Operating System Point of View

While your program lives in its own world, the operating system or operating environment (to use a more general term) has a somewhat different view of things, including of your program. The O/S has the responsibility of managing memory and loading your program somewhere into it. It may also need to protect itself from your program, and to protect your program from other programs, as well. It also needs to perform relatively quickly. The O/S also provides various common services that programs need, and it may have to virtualize resources on the computer (make it look like your program is the only program using them, even though there are actually a dozen other programs using them, too).

Operating system designers, over the years, have arrived at a rather useful, simplified view about what a program or task should look like when it is located in memory. This simplified view is quite broadly useful and, although there are myriad variations on myriad computers to deal with myriad issues, continues to show its value over and over again as a clarifying concept.

As I suggested earlier, a program is often the combined product of several translation units or modules or source files, linked together in a useful fashion. Whether a program is written in assembly language, C, Pascal, other languages, or some combination of them, the memory layout for the program usually follows a standardized template.

Although linkers support many complex details for building correct programs from separately compiled modules, the final, linked product tends to break down into six distinct sections. All of the code is collected together, constants are collected together, static data which needs to be initialized is collected together, and static data which does not need any initialization is also collected. Automatic data is placed on the stack, and heap space grows from the end of the static data (initialized, uninitialized, and constant) up towards the stack, while the stack grows downwards. These six distinct areas are placed into consecutive areas of memory, ordered in a precise way, like this:

Typical Program Memory Layout

Description         Minimum Required Access   Requires Non-Volatile Storage?   Run-Time Behavior
Code                Execute-Only              Yes                              Fixed/Static
Constants           Read-Only                 Yes                              Fixed/Static
Initialized Data    Read/Write                Yes                              Fixed/Static
Uninitialized Data  Read/Write                No                               Fixed/Static
Heap                Read/Write                No                               Variable/Grows Upwards
Stack               Read/Write                No                               Variable/Grows Downwards

"Non-volatile" indicates whether the section must be preserved in some form of non-volatile storage. If yes, this can be on disk or in some hardware ROM or flash memory. But it needs to be saved somewhere.

Also, keep in mind that the heap and stack are both part of a fixed block of memory, with the stack starting at the end of it and growing down and the heap starting at the beginning of it and growing up. The remaining "unused" memory in between the active heap and the active stack, like a double-ended candle, burns at both ends.

The code section is much like the constant data section, except that it is for program instructions. (Code is nothing more than just constant data, really.) The size of the code space is usually set at link time and does not need to change during execution.

The constant data section, as I use the term above, is meant for data which cannot be modified during execution. This region is different from initialized data because constants do not ever require write-access during run-time. Examples of this kind of data are PI and sine tables, both used in trigonometry. For some systems, these constants might be co-located with code, if the code space permits reading-as-data. For others, the constants will be treated just like initialized data, except that none of the code will attempt to change any of it.

The initialized data section is a region of read/write memory whose contents must be preset to specific values before the program is allowed to start running. The area still requires read/write access at run time, because the values can (and probably will) be changed while the program runs. An example of this kind of data might be the seed for a random number generator, which is initialized to some default value before the program starts running, or some default values that may later be changed by the operator/user of the program.

The uninitialized data section is for data which does not require initial values. This means that the data could start out randomized and that this would not impact the correct behavior of the program. An example here might be a common, shared variable which is initialized from some command line parameter value after the program starts running.

The heap section isn't static at run time. Normally, this section starts out with zero size and then grows and shrinks during execution. This is the area used by routines like C's malloc(), for example. In assembly programs, the heap is not used as often as it is in C, but the idea is still valid and may be useful. Either way, a simple design for the heap has it growing upwards, away from the last memory location required by all the collected static regions and towards the stack. In some languages or systems, heap space is managed with "garbage collection" methods in order to compact it when memory is freed from the middle.

The stack section also isn't static at run time. Normally, this section also starts out with zero size and grows and shrinks during execution. It usually grows downward, away from the last possible memory location for the program and towards the changing end of the heap section. The result of the operation of the stack and heap is that there is an invisible area of read/write memory, a "no man's land" so to speak, between the heap and the stack; an area which often shrinks like a candle burning at both ends. If, during execution, the heap grows into the stack (or vice versa) then the program will probably fail to operate correctly. Some operating systems are able to automatically detect this event and correct for it by adding more memory into the gap.

Most conventional programming languages and their linkers are designed with the above run-time model in mind, with the sections arranged in that order in memory. For example, no matter how many source code files you have, no matter how many different fragments of code you write, it's the linker's job to collect all those various fragments and place them into one big, common code section so that all the code is in a single convenient place. In addition, the linker should try to collect all of your constants and literal text strings into another common constant section and, if appropriate, place it just after the code section. Since both represent fixed, immutable constant bytes, that plan makes sense. Then, it makes sense to collect all your initialized variables -- those needing a starting value when the program begins -- into yet another common section and to place that one just after the other two. With all those collected together, those three sections are all that actually needs to be saved in some form of non-volatile storage, like flash or a floppy or hard disk. These are the only parts of your program that the operating system can't compute or generate on its own. The uninitialized variables, heap, and stack are easily built at run time without any additional information.

Imagine what would happen if these things weren't organized this way: bits of uninitialized variables mixed in with initialized ones, constants salted into random locations in uninitialized memory space, and so on. The entire mixed-up mess would need to be kept in non-volatile storage, perhaps excepting the heap and stack, with a potentially serious waste of good non-volatile memory holding unnecessary values for uninitialized variable memory that is going to be initialized later, once the program starts up, anyway. It really does help for the linker to organize things according to these uses.

This scheme also allows rather simple program loaders and protection schemes, since only the first three sections need to be saved on disk, and those can be quickly and directly loaded into the first part of the memory allocated for the program. Also, as another example, an operating system with appropriate hardware support can arrange things so that the first two sections are kept in memory marked as read-only and the last four sections in read/write memory. Some types of hardware only support a single division of memory in this way, and organizing the program like this helps it fit inexpensive hardware designs. Some operating systems, like DOS, manage only the read/write, dynamic RAM available directly to their CPU, so all of the memory is read-write. That's fine: it allows the code and constants to be overwritten, but if you don't try to do that, there is no problem.

In operating systems that can share memory sections between instances of running programs (for example, when you run two or three separate copies of Microsoft Word), it makes it much nicer if the common code and common constants are collected up because that permits those memory sections to simply be mapped into the local instance memory of each process without having to reload or allocate separate memory for each. If these fragments were all mixed up with other types of memory, then the operating system would have no choice but to allocate the entire program memory space for each instance, without any sharing.

In von Neumann architectures, like the DOS-based PC, the operating system stores only the code, constant, and initialized data sections (sometimes just called CODE, CONST, and INIT) on disk, in a specially formatted executable file -- the .COM or .EXE file. When the program is to be started, the operating system allocates enough RAM from its own memory heap to hold the entire program with all its sections and then loads just those three on-disk portions from the file into the associated places in RAM. It then starts the program. There is no need to load the other sections -- the stack and heap are empty, anyway, and the uninitialized data doesn't need to be initialized from data on the disk. Some languages, like C, do guarantee that the uninitialized data area will be initialized to a semantic zero, though -- but this is handled by the C start-up code found in the library that all C programs link with.

In Harvard architectures, it's somewhat more difficult to arrange some of these details, as any operating system trying to load the program must have access to the CODE segment as if it were a writable data segment. This is one reason why many Harvard style CPUs support a special type of "code memory treated as data" instruction, just for that purpose.

In protected mode environments, such as what the 80286+ can provide, where code segments are not even writable when they are configured as code memory, the operating system is often forced to forge additional, writable data selectors as aliases to the code area of the program it is loading and use those instead.

Summary

When you write the assembly code for your program, it's helpful to consider which one of the six basic types of memory each part of your program requires. The assembler pre-defines directives for just these purposes, too: the .code directive specifies the first type; the .const directive specifies the second; the .data directive specifies the third; the .data? directive specifies the fourth; and the .stack directive specifies the sixth (the ML assembler doesn't have a special directive just for the fifth type, the heap). Note this fact and place your code and data where they belong.

I haven't discussed the meaning of "Align" or "Combine" or "Class" and haven't said much, if anything, about the group called DGROUP. So, I'd recommend reading some of the documentation noted on my PC Docs web page -- this includes some PDF versions of the Microsoft MASM manuals as well as a Microsoft web page containing the technical documentation. Randy Hyde's excellent tutorial is also worth reading. There are many excellent sources, and my discussion is only a tiny drop by comparison.

But here's a table borrowed from Appendix E of the Microsoft MASM Programmer's Guide. It may be helpful. Take note of the different directive types in each model and relate these back to my comments about operating systems, generally, noted earlier on this page. This may help guide you in properly using these directives.

Default Segments and Types for Standard Memory Models

Model            Directive  Name       Align  Combine  Class       Group
Tiny             .CODE      _TEXT      WORD   PUBLIC   'CODE'      DGROUP
                 .FARDATA   FAR_DATA   PARA   PRIVATE  'FAR_DATA'
                 .FARDATA?  FAR_BSS    PARA   PRIVATE  'FAR_BSS'
                 .DATA      _DATA      WORD   PUBLIC   'DATA'      DGROUP
                 .CONST     CONST      WORD   PUBLIC   'CONST'     DGROUP
                 .DATA?     _BSS       WORD   PUBLIC   'BSS'       DGROUP
Small            .CODE      _TEXT      WORD   PUBLIC   'CODE'
                 .FARDATA   FAR_DATA   PARA   PRIVATE  'FAR_DATA'
                 .FARDATA?  FAR_BSS    PARA   PRIVATE  'FAR_BSS'
                 .DATA      _DATA      WORD   PUBLIC   'DATA'      DGROUP
                 .CONST     CONST      WORD   PUBLIC   'CONST'     DGROUP
                 .DATA?     _BSS       WORD   PUBLIC   'BSS'       DGROUP
                 .STACK     STACK      PARA   STACK    'STACK'     DGROUP
Medium           .CODE      name_TEXT  WORD   PUBLIC   'CODE'
                 .FARDATA   FAR_DATA   PARA   PRIVATE  'FAR_DATA'
                 .FARDATA?  FAR_BSS    PARA   PRIVATE  'FAR_BSS'
                 .DATA      _DATA      WORD   PUBLIC   'DATA'      DGROUP
                 .CONST     CONST      WORD   PUBLIC   'CONST'     DGROUP
                 .DATA?     _BSS       WORD   PUBLIC   'BSS'       DGROUP
                 .STACK     STACK      PARA   STACK    'STACK'     DGROUP
Compact          .CODE      _TEXT      WORD   PUBLIC   'CODE'
                 .FARDATA   FAR_DATA   PARA   PRIVATE  'FAR_DATA'
                 .FARDATA?  FAR_BSS    PARA   PRIVATE  'FAR_BSS'
                 .DATA      _DATA      WORD   PUBLIC   'DATA'      DGROUP
                 .CONST     CONST      WORD   PUBLIC   'CONST'     DGROUP
                 .DATA?     _BSS       WORD   PUBLIC   'BSS'       DGROUP
                 .STACK     STACK      PARA   STACK    'STACK'     DGROUP
Large -or- Huge  .CODE      name_TEXT  WORD   PUBLIC   'CODE'
                 .FARDATA   FAR_DATA   PARA   PRIVATE  'FAR_DATA'
                 .FARDATA?  FAR_BSS    PARA   PRIVATE  'FAR_BSS'
                 .DATA      _DATA      WORD   PUBLIC   'DATA'      DGROUP
                 .CONST     CONST      WORD   PUBLIC   'CONST'     DGROUP
                 .DATA?     _BSS       WORD   PUBLIC   'BSS'       DGROUP
                 .STACK     STACK      PARA   STACK    'STACK'     DGROUP
Flat             .CODE      _TEXT      DWORD  PUBLIC   'CODE'
                 .FARDATA   _DATA      DWORD  PUBLIC   'DATA'
                 .FARDATA?  _BSS       DWORD  PUBLIC   'BSS'
                 .DATA      _DATA      DWORD  PUBLIC   'DATA'
                 .CONST     CONST      DWORD  PUBLIC   'CONST'
                 .DATA?     _BSS       DWORD  PUBLIC   'BSS'
                 .STACK     STACK      DWORD  PUBLIC   'STACK'

 

Last updated: Monday, July 12, 2004 01:06