Softools WinIDE Tips: Using xmem

xmem starts at the end of the stack segment and is allocated up from there. So in your projects, don't set the end address of STACK to the end of memory. It is usually best to lower it as much as possible to maximize available xmem.

An xmem buffer is used as follows:

#define BUF_SIZE 40000l
char far * buff;
Then allocate xmem:
buff=(char far *)xalloc(BUF_SIZE);

For other types, adjust the xalloc size, e.g. for longs:
long far * buff;
buff=(long far *)xalloc(BUF_SIZE*sizeof(long));

For an array of structures:
struct my_struct far * sbuff;
sbuff=(struct my_struct far *)xalloc(NUM_ENTRIES*sizeof(struct my_struct));

Make sure that any pointer that refers to an xmem (far) buffer is declared as a far pointer.

Memory Map:

The Rabbit has four 256k blocks of memory and three chip selects. Normally the first two blocks (00000-7ffff) are set to Flash, and the second two (80000-fffff) are set to RAM. When running under the debugger, the chip selects are reversed, with RAM at 0.
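
As a quick check of the block boundaries above, here is a minimal host-side sketch in standard C. The helper name phys_bank is made up for illustration and is not part of any Softools library; it only assumes a 20-bit (1 meg) physical address split into four 256k blocks.

```c
#include <stdint.h>

/* Size of one memory block as described above: 256k. */
#define BANK_SIZE 0x40000UL

/* Hypothetical helper: returns which of the four 256k blocks (0-3)
   a 20-bit physical address falls in. */
unsigned phys_bank(uint32_t phys)
{
    return (unsigned)((phys & 0xFFFFFUL) / BANK_SIZE);
}
```

In the normal (flash) arrangement, blocks 0 and 1 are Flash and blocks 2 and 3 are RAM, so phys_bank(0x80000) landing in block 2 matches the RAM start address used throughout this tip.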

The Rabbit has segment registers that allow data and stack to be relocated. ZW (Z-World) and Softools (ST) do opposite things here. ZW recompiles everything when compiling to RAM or flash, so the addresses are changed at compile time. ST just relocates the RAM segments at run time as needed using the segment base registers, so no recompile is needed to switch between the two modes. Both compilers remap RAM to 0 when running in RAM.

The code is built to run in RAM at 0. Typically:

0000:
CODE
CONST
DATA
FARCODE
FARCODE2
FARCONST
BSS
IDATA
STACK
(xmem)

Everything at addresses below BSS (CODE through FARCONST) is meant to be in flash and is never written to; the rest is RAM. When running under the debugger, the above arrangement is used as-is, in RAM with no address changes (other than that the RAM chip select is mapped to 0). When burned to flash, the segment registers are used to remap BSS and everything after it to 0x80000 (the RAM chip select start address).

Boards that run at 44 MHz or higher have fast RAM for code and data. The cstart.asm will automatically copy flash to RAM on startup. So the first 512k is fast RAM, and the next 256k is the slow RAM. The top 256k is usually mapped to flash; this setting can be changed in cstart if needed. This has a useful side effect: code written to flash and code run under the debugger run with identical memory maps. Both use RAM at 0, and STACK is not remapped. The fast RAM can run with 0 wait states for full speed (flash needs 1 wait state on 44 MHz boards).

Rabbit CPU Segment Registers

One of the more confusing things about the Rabbit CPU is logical/physical memory mapping. The Rabbit only has 16-bit registers, so it can only directly address 64k. So the CPU chops the 64k area into blocks of code (const), data, stack and far code. With Softools, data and stack are the same. We need to specify where in the 64k code ends and data starts. This is why the stack end address is set in the link settings. This allows stack to grow down and code+consts to grow up. What is not obvious is that we set the physical end of STACK for the debug configuration; the logical address is calculated automatically.

Xmem (xalloc()'ed RAM) is simply calculated at run time from wherever STACK is mapped to until the end of RAM. Far code uses its own memory remapping so it can address the full 1 meg independently.
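
To see why lowering the end of STACK gives more xmem, here is a small host-side sketch of that calculation. It is not the actual cstart code; xmem_bytes is a hypothetical helper, and it assumes both addresses are inclusive physical addresses (xmem starts at STACK end + 1 and runs through the last byte of RAM).

```c
#include <stdint.h>

/* Hypothetical helper, not a Softools API.
   stack_end: last physical address of the STACK segment (inclusive).
   ram_top:   last physical address of RAM (inclusive).
   Returns the number of bytes left over for xalloc(). */
uint32_t xmem_bytes(uint32_t stack_end, uint32_t ram_top)
{
    if (stack_end >= ram_top)
        return 0;               /* STACK ends at (or past) the top of RAM */
    return ram_top - stack_end; /* bytes from stack_end+1 through ram_top */
}
```

With RAM ending at 0xFFFFF, a STACK end of 0x8BFFF leaves 0x74000 bytes of xmem, while a STACK end of 0x9BFFF leaves only 0x64000. The addresses here are example values only.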

What gets confusing is Logical vs Physical addresses:

Logical	Debug (RAM) Physical	Flash Physical	Segments/Notes
0000	00000	00000	CODE, CONST
z000	BSS start address	80000	BSS, IDATA, STACK. 'z' sets logical split between RAM/FLASH. 'z' must be a 4k boundary.
N/A	0z000 and up	0z000 and up	Far code+far data (no constant logical address), uses e000-ffff logical window to access code.
DFFF	STACK end address	80000+size of BSS,IDATA and STACK	This+1 is the start of xalloc() data
E000-FFFF			8k window that gets mapped for each far function call

Notice that the debug (RAM) addresses are unchanged. When running in Flash, RAM segments get changed.

The SEGSIZE register sets the split in the logical map. For example, if it is 0x60, then addresses >=0x6000 are remapped using the STACKSEG register. The DATASEG register is not used at all; in ST, DATASEG=CODESEG. In the table above, the 'z' in the map is the start of STACKSEG. Suppose SEGSIZE is 0x60. This sets up the following map:

Logical Addr	Segment	Mapping
0000-5FFF	CODESEG	Physical=Addr
6000-DFFF	STACKSEG	Physical=Addr+(STACKSEG<<12)
E000-FFFF	XPC seg	Physical=Addr+(XPC<<12)

The upper 4 bits of SEGSIZE set the 4k address that marks the boundary from CODESEG to STACKSEG. To map globals and stack to RAM at 0x80000, STACKSEG would be set to (0x80000-0x6000)>>12 (0x7A). You might think STACKSEG would be 0x80 for this, but the logical address is added to STACKSEG<<12. So the start logical address needs to be subtracted when calculating the value, as the CPU will add it back at run time.
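
The subtraction is easy to get wrong, so here is the arithmetic as a host-side sketch in standard C. Both function names are hypothetical (this is not how cstart.asm is written), and the sketch assumes 4k-aligned inputs and a 20-bit physical address that wraps at 1 meg.

```c
#include <stdint.h>

/* Hypothetical helper: value to load into STACKSEG so that the logical
   address logical_start maps to the physical address phys_start.
   Both addresses are assumed to be on a 4k boundary. */
uint8_t stackseg_value(uint32_t phys_start, uint16_t logical_start)
{
    return (uint8_t)(((phys_start - logical_start) >> 12) & 0xFF);
}

/* What the CPU does at run time for an address in the STACKSEG region:
   physical = logical + (STACKSEG << 12), kept to 20 bits. */
uint32_t stackseg_physical(uint16_t logical, uint8_t stackseg)
{
    return ((uint32_t)logical + ((uint32_t)stackseg << 12)) & 0xFFFFFUL;
}
```

For the flash case above, stackseg_value(0x80000, 0x6000) gives 0x7A, and feeding that back in maps logical 0x6000 to physical 0x80000. For the debug case, where RAM is at 0 and nothing moves, the same call with phys_start equal to logical_start gives 0.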

Note: Dynamic C used DATAORG in the BIOS to specify the split (#define DATAORG 0xZ000). This caused problems: as the size of the root code and data varied, you would have to adjust DATAORG in the BIOS. Softools and cstart.asm calculate this dynamically, so there is no need to specify the split.

Note: the lower 4 bits of SEGSIZE are used for the DATASEG boundary. Softools does not use this segment. If SEGSIZE is set up as "ZY" (hex nibbles):

Logical Address	Segment
0000 to Y000-1	Root Code	Physical=Logical
Y000 to Z000-1	DATASEG	Physical=Logical+(DATASEG<<12)
Z000 to DFFF	STACKSEG	Physical=Logical+(STACKSEG<<12)
E000 to FFFF	XPC seg	Physical=Logical+(XPC<<12)

In Softools, DATASEG is set to 0 and 'Y' is 0x0; this is the same as the root code/data area. It is possible to use DATASEG for your own use.
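
The full four-region map can be modelled on a PC with a few lines of standard C. This is only a sketch of the table above, not vendor code: the mmu_regs struct and logical_to_physical name are made up for illustration, and the 20-bit wrap is an assumption based on the 1 meg address space.

```c
#include <stdint.h>

/* Hypothetical snapshot of the mapping registers. */
typedef struct {
    uint8_t segsize;   /* "ZY": upper nibble Z, lower nibble Y */
    uint8_t dataseg;
    uint8_t stackseg;
    uint8_t xpc;
} mmu_regs;

/* Translate a 16-bit logical address using the table above. */
uint32_t logical_to_physical(const mmu_regs *r, uint16_t logical)
{
    unsigned page = logical >> 12;       /* 4k page number, 0-15 */
    unsigned z = r->segsize >> 4;        /* start of STACKSEG region */
    unsigned y = r->segsize & 0x0F;      /* start of DATASEG region */
    uint32_t base;

    if (logical >= 0xE000)
        base = r->xpc;                   /* E000-FFFF: XPC window */
    else if (page >= z)
        base = r->stackseg;              /* Z000-DFFF */
    else if (page >= y)
        base = r->dataseg;               /* Y000 to Z000-1 */
    else
        base = 0;                        /* root code: physical = logical */

    return ((uint32_t)logical + (base << 12)) & 0xFFFFFUL;
}
```

With the Softools settings (Y=0, DATASEG=0) everything under Z000 still comes out as physical = logical, which is why the DATASEG region is indistinguishable from root code there.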

Softools Output

When creating a .bin file, an image of the physical memory is generated. IDATA is a bit special: its contents need to be initialized. The startup code copies DATA from flash to IDATA in RAM. The linker does the reverse and creates a copy of what is in IDATA in DATA. Then, when generating a .bin file, none of the RAM segments are included.
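
In C terms, the startup copy amounts to something like the sketch below. This is only an illustration of the idea; the real work is done in cstart.asm, and the array names, sizes and contents here are invented stand-ins for the DATA and IDATA segments.

```c
#include <string.h>

/* Stand-ins for the two segments; names and contents are illustrative only. */
const unsigned char data_image[4] = { 1, 2, 3, 4 }; /* DATA: initial values kept in flash */
unsigned char idata_ram[4];                         /* IDATA: the live variables in RAM */

/* Copy the initial values from the flash image into RAM,
   as the startup code does before main() runs. */
void init_idata(void)
{
    memcpy(idata_ram, data_image, sizeof idata_ram);
}
```

This is why only the DATA copy needs to be in the .bin file: the RAM side is rebuilt from it on every startup.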

The difficult part is the end of the STACK segment. This is a physical address. Stack, IDATA and BSS grow downward from this address. As code and data grow, they will eventually collide (BSS and FARCODE overlap linker error). So we change the end address of STACK as needed in the linker settings. When run under the debugger, memory addresses don't change. Under the debugger, everything is in RAM, so this is where we will run out of memory first. So, it is the "controlling" factor.

When compiled to flash, logical RAM segments are moved to 0x80000, so BSS and FARCODE can never overlap. In fact, the linker setting for the end address of STACK is not used when running from Flash. The exception is for WinIDE >=1.68: the cstart.asm will not remap STACK if it is compiled with an end address >0x80000. For boards with fast RAM (44 MHz boards), the STACK segment is not remapped when running from Flash.

Far BSS Data

The last thing is far data (FARBSS). This is where a problem occurs. FARBSS gets linked at a physical address. When switching between running in RAM and running in Flash, there is no way to remap this segment at run time. It could be located after STACK, but that is the same address that xalloc() uses. Put it before BSS and it will end up in flash when running from flash. Really, the simplest option is to fix FARBSS at the end of RAM and never xalloc() all of RAM.

Note, for the 1.68 compiler: FARBSS should follow STACK and have no fixed address. The cstart will adjust the xmem start address for FARBSS size so the change to main shown below is not needed and the end address is not set.

For debugging:

CODE
CONST
DATA
FARCODE
FARCODE2
FARCONST (may not exist)
BSS
IDATA
STACK
FARBSS (end address fixed at top of RAM, e.g. 7FFFF for 512k RAM)

For a flash build (boards without fast RAM):

CODE
CONST
DATA
FARCODE
FARCODE2
FARCONST (may not exist)
BSS
IDATA
STACK
FARBSS (end address fixed at top of RAM, e.g. FFFFF for 512k RAM)

For 1.58, add the following to your main code:

extern unsigned long _xmemsize;

main()
{
  _xmemsize-=FARBSS_SIZE;
  // rest of main code goes here.
}

You would have to define FARBSS_SIZE to the size of the FARBSS seg. This will adjust the xmem area to not use the FARBSS area.

The above will work in the debugger and in flash when running on boards with fast RAM. For boards without fast RAM, you will need two different builds (models will not work.) Copy the project file (.prj) to a new file with a different name. Then open this project and set the end of the STACK segment to 0x80000 + sizeof(BSS+IDATA). If you are unsure, set it to 0x8c000.

A simpler option is to use xalloc() for all far data. Just define a far * and initialize it with xalloc(). It will work the same in the debugger and in flash. There is a small overhead of loading the pointer vs a static compiled address. Using xalloc is simpler and will save the need for separate RAM/FLASH builds.

Using xalloc() has an advantage as the size can be dynamic. I often use all remaining xmem for a buffer. Here's a sample:

typedef struct log_data LOG_DATA;

LOG_DATA far * log_buff;
unsigned long log_size; // # of entries in log

log_size=xavail(); // get free xmem
log_size-=0x8000l; // leave enough room for the TCP/IP stack to allocate buffers
log_size/=sizeof(LOG_DATA); // calc # of structure entries
log_buff=(LOG_DATA far *)xalloc(log_size*sizeof(LOG_DATA)); // alloc log_size entries

The buffer will use the largest xmem buffer possible, both in the debugger and when running in Flash.
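
One thing to watch in the sample: log_size is unsigned, so if xavail() ever returns less than the reserve, the subtraction wraps around to a huge number. A guarded version of the arithmetic might look like the sketch below. The helper log_entries is hypothetical, not part of the Softools library, and it is plain C so it can be checked on a PC.

```c
/* Hypothetical helper: how many entries of entry_size bytes fit in
   free_bytes after holding back reserve_bytes for other users
   (e.g. the TCP/IP stack). Returns 0 instead of wrapping around. */
unsigned long log_entries(unsigned long free_bytes,
                          unsigned long reserve_bytes,
                          unsigned long entry_size)
{
    if (entry_size == 0 || free_bytes <= reserve_bytes)
        return 0;
    return (free_bytes - reserve_bytes) / entry_size;
}
```

On the target this would be called as log_size=log_entries(xavail(), 0x8000l, sizeof(LOG_DATA)); and the xalloc() skipped when the result is 0.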

