The Legend of Zelda: Oracle of * also had the Game Boy Color's 32K of RAM (4K fixed, 4K switched) and 16K of VRAM to work with, and VRAM is easy to access randomly while rendering is off. For random access with little RAM, such as on an NES without extra RAM on the cartridge, you need to use a static dictionary in ROM.
For text, try Huffword or byte pair, with pointers to each document (e.g. each page of text or each line of dialogue). I used byte pair for my robotfindskitten port, and I was going to use Huffword for an e-book reader until the project ran into serious emulator bugs.
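To make the byte pair idea concrete, here is a minimal decoder sketch. It is not the code from any of the projects mentioned; the code threshold `FIRST_PAIR_CODE`, the sample `PAIRS` dictionary, and the `OFFSETS` pointer table are all assumptions made up for illustration. Codes below the threshold are literals, and each higher code expands to two codes that may themselves be pairs.

```python
FIRST_PAIR_CODE = 128  # assumption: codes 0-127 are literal ASCII


def bpe_decode(data, pairs):
    """Expand byte-pair-coded data using a static dictionary.

    pairs[i] is the (left, right) expansion of code FIRST_PAIR_CODE + i.
    """
    out = bytearray()
    for code in data:
        stack = [code]
        while stack:
            c = stack.pop()
            if c < FIRST_PAIR_CODE:
                out.append(c)
            else:
                left, right = pairs[c - FIRST_PAIR_CODE]
                stack.append(right)  # push right first so left is expanded first
                stack.append(left)
    return bytes(out)


def decode_document(blob, offsets, index, pairs):
    """Decode one document; offsets[i] points at the start of document i."""
    return bpe_decode(blob[offsets[index]:offsets[index + 1]], pairs)


# Hypothetical static dictionary and two tiny "documents"
PAIRS = [
    (ord("t"), ord("h")),  # 128 -> "th"
    (128, ord("e")),       # 129 -> "the"
    (129, ord(" ")),       # 130 -> "the "
]
BLOB = bytes([130, ord("c"), ord("a"), ord("t"),
              130, ord("e"), ord("n"), ord("d")])
OFFSETS = [0, 4, 8]

print(decode_document(BLOB, OFFSETS, 1, PAIRS))
```

Because the dictionary never changes, the decoder needs only a small stack plus the output buffer, and any document can be decoded without touching the ones before it.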
For map data, there are either multi-layer metatiles (Mega Man; Blaster Master; Sonic the Hedgehog) or object coding (Super Mario Bros. series). The Legend of Zelda for NES is known for using metatiles that each span a whole column of the screen.
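A rough sketch of column-style metatiles, assuming a made-up data layout rather than any game's actual format: a screen is stored as a short list of column IDs, and each ID indexes a shared table of column definitions. The names `expand_columns` and `COLUMN_DEFS` are hypothetical.

```python
def expand_columns(column_ids, column_defs):
    """Return the screen as a list of rows, given one column ID per screen column."""
    columns = [column_defs[i] for i in column_ids]
    height = len(columns[0])
    return [[col[y] for col in columns] for y in range(height)]


# Hypothetical column definitions, top to bottom: "." is floor, "#" is wall
COLUMN_DEFS = {
    0: "..#",
    1: "###",
}

# A screen three columns wide costs three bytes instead of nine
for row in expand_columns([0, 1, 0], COLUMN_DEFS):
    print("".join(row))
```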
Another trick I like to use that generalizes RLE in a static-dictionary-ish way is based on a Markov chain. For each symbol, store the most common symbol that follows it, and then treat a "run" as a set of symbols each followed by its most common next symbol. (RLE is the special case of this where each symbol's most common next symbol is itself.) Background maps in Haunted: Halloween '85 for NES are stored this way with vertical runs. Level maps in Wrecking Ball Boy for Pygame are horizontal rows of tiles that are then Markov-expanded vertically.
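The Markov-chain idea can be sketched as follows. This is an illustration under my own assumptions, not the format used by either game: runs are stored here as plain `(start_symbol, length)` tuples, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_successors(data):
    """Map each symbol to the symbol that most often follows it in data."""
    counts = defaultdict(Counter)
    for a, b in zip(data, data[1:]):
        counts[a][b] += 1
    return {a: c.most_common(1)[0][0] for a, c in counts.items()}


def markov_encode(data, succ):
    """Split data into (start_symbol, length) runs that follow the successor table."""
    runs = []
    i = 0
    while i < len(data):
        length = 1
        while i + length < len(data):
            prev = data[i + length - 1]
            if prev not in succ or succ[prev] != data[i + length]:
                break
            length += 1
        runs.append((data[i], length))
        i += length
    return runs


def markov_decode(runs, succ):
    """Expand each run by repeatedly stepping to the most common next symbol."""
    out = []
    for sym, length in runs:
        out.append(sym)
        for _ in range(length - 1):
            sym = succ[sym]
            out.append(sym)
    return out


column = list("....=##....=##")  # hypothetical column: sky, ledge top, dirt
succ = build_successors(column)
runs = markov_encode(column, succ)
assert markov_decode(runs, succ) == column
```

With a successor table in which every symbol maps to itself, `markov_encode` produces ordinary RLE runs.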
It's also useful to consider how you are encoding run lengths. The best method rate-wise depends on the distribution of run lengths and literal lengths; the distribution implied by popular RLE schemes such as PackBits may not be optimal for your data. And on the NES, you often have to break VRAM uploads into fixed-size packets. I like to use a unary code, in which each bit of a control word can be 0 for a single literal symbol or 1 for part of the previous run, because it is fairly efficient on real data and easy to decode to a multiple of 8 bytes. LZSS in the Allegro 4 library and in the Game Boy Advance and Nintendo DS BIOS uses a variant of this unary code, where each bit of the control word is one way for a literal and the other way for a reference to previous data.
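Here is a minimal sketch of such a unary-coded RLE, with some details assumed for illustration: control bits are read most significant bit first, the "previous" symbol starts at 0 before the first packet, and the input length is a multiple of 8. Each control byte then always produces exactly 8 output bytes, which is what makes fixed-size packets easy.

```python
def unary_rle_encode(data):
    """Encode data (length a multiple of 8) as control bytes plus literals."""
    if len(data) % 8 != 0:
        raise ValueError("length must be a multiple of 8")
    out = bytearray()
    prev = 0  # assumption: the previous symbol starts as 0
    for i in range(0, len(data), 8):
        control = 0
        literals = bytearray()
        for bit, value in enumerate(data[i:i + 8]):
            if value == prev:
                control |= 0x80 >> bit  # 1 = continue the previous run
            else:
                literals.append(value)  # 0 = a literal symbol follows
                prev = value
        out.append(control)
        out += literals
    return bytes(out)


def unary_rle_decode(src):
    """Decode until src is exhausted; output length is a multiple of 8."""
    out = bytearray()
    pos = 0
    prev = 0
    while pos < len(src):
        control = src[pos]
        pos += 1
        for bit in range(8):
            if not control & (0x80 >> bit):
                prev = src[pos]  # literal: fetch a new symbol
                pos += 1
            out.append(prev)
    return bytes(out)


packet = bytes([5, 5, 5, 5, 7, 7, 7, 7])
encoded = unary_rle_encode(packet)
assert unary_rle_decode(encoded) == packet
```

In this sketch a run of length n costs one literal byte plus n control bits, so the implied run-length code is unary: long runs are cheap, and a lone literal costs 9 bits.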