7.5.3 says that if you are using 8 bit color mode, the 16 bit color is generated into RAM.
In 8.1, it indicates you can read via parallel, but not in serial mode.
I really wonder how true this 'read' condition is. I guess it's worth a shot.
Ah yes, that point looks pretty clear to me. Otherwise the logic for the screen would probably be much more complicated. I've already been thinking about the possibilities - like having an offscreen sprite atlas and copying it onto the front buffer. Then, however, I realized that this technique would also have a severe drawback: no transparency support. And since sending the data line by line also lacks transparency, there's no solid way to fix it. Too bad they didn't just pack in two or four times more RAM and include some transparency mode. That would make making games quite easy...
I think I'll stick to scanline rendering as it requires less fiddling.
By the way, there are two things regarding compression that I've thought about in the past: text compression and image compression. I dropped the implementation since I didn't see much use for it (yet) with the stuff I want to do, but with full screen images being mentioned, I've been wondering if you would be interested in that.
The text compression I thought of would be quite simple: A script scans a file with all the strings you want to use in your game. It determines the character usage and creates a custom compression table of all used characters. Depending on the number of distinct characters, it would pack the strings into 4-bit tuples, where certain values would tell the decompressor that the character needs another 4-bit tuple for unpacking. I wrote a simple Lua script for the compression and it pretty much works... it would save 50-70% of storage space for all strings, and the decompressor should be rather simple since the format is quite straightforward. It wouldn't really matter, however, unless you have ~1k of text.
I could finish that work if you are interested.
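To make the 4-bit idea a bit more concrete, here is a rough sketch in Python (not the Lua script mentioned above, which isn't shown here). Everything in it is an assumption for illustration: the function names, the rule that the most frequent characters get the single-tuple codes, and the way the number of "needs another tuple" prefix values is chosen. A real format would also need a string length or terminator, which this sketch leaves out.

```python
from collections import Counter

def build_table(strings):
    """Return all used characters, most frequent first (ties broken by character)."""
    counts = Counter("".join(strings))
    return [ch for ch, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

def prefix_count(n_symbols):
    """Smallest k so that (16 - k) one-tuple codes plus 16 * k two-tuple codes cover the table."""
    if n_symbols <= 16:
        return 0
    return -(-(n_symbols - 16) // 15)  # ceiling division

def encode(text, table):
    """Turn text into a list of 4-bit values (0..15)."""
    direct = 16 - prefix_count(len(table))
    index = {ch: i for i, ch in enumerate(table)}
    nibbles = []
    for ch in text:
        i = index[ch]
        if i < direct:
            nibbles.append(i)                  # frequent character: one tuple
        else:
            j = i - direct
            nibbles.append(direct + j // 16)   # prefix value: another tuple follows
            nibbles.append(j % 16)
    return nibbles

def decode(nibbles, table):
    """Inverse of encode(); expects exactly the tuples encode() produced."""
    direct = 16 - prefix_count(len(table))
    out = []
    it = iter(nibbles)
    for n in it:
        if n < direct:
            out.append(table[n])
        else:
            out.append(table[direct + (n - direct) * 16 + next(it)])
    return "".join(out)

def pack(nibbles):
    """Pack two tuples per byte, high half first; an odd count is padded with 0."""
    padded = nibbles + [0] * (len(nibbles) % 2)
    return bytes((padded[i] << 4) | padded[i + 1] for i in range(0, len(padded), 2))
```

With 16 or fewer distinct characters, every character fits in one tuple and the text shrinks to half its size; with more characters, the rare ones cost two tuples, so the saving depends on how skewed the character usage is.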
Another thing was image compression... I didn't pursue that since rendering would get significantly slower. However, again, if you are interested, I could look in that direction as well. I think it wouldn't be too much work. I'm just not sure how "good" it would look. The idea is similar to DXT/ETC compression - with some adaptations: I would slice an image into 4x4 or 8x8 blocks. Each block would have a color palette associated with it. An image would have 16 color palettes to pick from, 4 colors each. Each pixel would therefore use 2 bits, plus 4 bits per 8x8 or 4x4 block for the palette selection. That would mean using 16.5 bytes for each 8x8 block instead of 64 bytes, or 4.5 bytes for each 4x4 block instead of 16 bytes. Depending on the image content, it might look ok (of course it's lossy compression, so quality might be problematic). The decompressor would still be rather simple to implement (read a byte, unpack pixels, load palette colors, translate colors to line pixels). However, the compressor would require some more thinking to get "good" quality.
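As a rough illustration of the block format, here is a small Python sketch of the size arithmetic and of unpacking one block. The names, the pixel order (leftmost pixel in the top two bits of each byte) and the way the palette index is passed in separately are assumptions made for this example, not a fixed format.

```python
def block_size_bytes(side):
    """Compressed size of one side x side block: 2 bits per pixel plus a 4-bit palette index."""
    return (side * side * 2 + 4) / 8

def decode_block(palette_index, pixel_bytes, palettes, side=4):
    """Expand one block into rows of 8-bit color values."""
    palette = palettes[palette_index]      # 4 colors picked out of the 16 palettes
    pixels = []
    for byte in pixel_bytes:
        for shift in (6, 4, 2, 0):         # four 2-bit pixels per byte
            pixels.append(palette[(byte >> shift) & 0b11])
    return [pixels[r * side:(r + 1) * side] for r in range(side)]
```

block_size_bytes(8) gives 16.5 and block_size_bytes(4) gives 4.5, matching the numbers above. Since the palette index is only half a byte, two neighboring blocks would probably share one byte for their indices in a real layout.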
But it's been mostly just ideas. It would be more interesting to make use of that