
Playing with Game Boy palettes
Writing an emulator is really cool! But when you want to test it, you might occasionally need something simpler than an existing ROM, something you have more control over, something you can play with.
In this article, we’ll see how to build a custom Game Boy ROM from scratch, and how to make it do mildly interesting stuff [1].
And then, I might write a couple more articles to talk about game design in general!
The simplest ROM
Several tools exist for building a Game Boy ROM. You can do it using different languages like C or assembly, or even just a dedicated IDE. I personally went with assembly to save time, and also because I have already described how the Game Boy CPU handles opcodes.
One of the most popular tools I found to generate a ROM from assembly is RGBDS, which was recommended by another blog post I read that explained in detail how to make an empty Game Boy ROM.
With that in mind, we can use the resources from the aforementioned site to end up with boilerplate code that will compile to an empty ROM. I’ll gloss over most of the specifics and just show the main source file we’ll get to work with.
If you want to see the whole thing, feel free to have a look at the repository where this code now lives.
; Boilerplate code for an empty ROM that won't crash.
; The header.asm file contains the necessary hardcoded values to generate
; a mostly valid ROM header. I say "mostly" because it doesn't contain the
; header checksum, but that is computed externally by rgbds at build time.
INCLUDE "include/header.asm"
; The header.asm file also contains basic code to ensure the Game Boy jumps to
; the `main` label below at the end of the boot process.
SECTION "default", ROM0
main:
;
; Do Things Here.
;
lock:
; Stop execution here, all the work should be done from an interrupt.
; Also weird stuff will happen if we don't put that guardrail instruction
; at the end of our code.
JR lock
If we build that code as it is, we’ll have a ROM that boots… and not much else [2].

Cool! Now, we just need to fill in the blanks to Make Things Happen.
What kind of Things? I’m glad you asked!
What about palettes?
A frightening number of years ago, I mentioned a funny bug in that emulator I was writing along with those blog posts. Namely, that my black pixels were the wrong shade, and how it related to the way the Game Boy handles palettes.
The main lesson we learned from that bug is that you can change a tile’s colors in one of two ways:
- Rewrite the tile’s data in memory, which implies writing up to 16 bytes.
- Modify the Game Boy’s palette (the BGP register), which only requires writing one byte.
Let’s see what happens if we do the latter.
A still-pretty-simple ROM
So far, we’ve only been interested in how the Game Boy executes ROM code, but now we’ll have to actually write such code.
It’s not that hard: we just need to modify the contents of the BGP register. To make the results visually interesting, we’ll want to do that repeatedly, at regular intervals.
Just doing this is an interesting exercise in itself because now we have to figure out how Game Boy code can run in a loop with a slight delay. And we do have a clue: the Game Boy’s boot ROM does exactly this when it makes the Nintendo logo scroll down.
It was kind of mentioned in an even older article, back when we were still figuring out what those video registers were.
The bit we’re interested in is that we can use the VBlank period to do stuff, and the rest of the time, we can just… wait for VBlank to happen. And since we know it occurs roughly 60 times per second, we can compute some kind of delay.
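To put a number on "roughly 60 times per second", here is a small Python sketch of the arithmetic. It is not part of the ROM; it assumes the original Game Boy's clock of 4,194,304 Hz and a frame of 154 scanlines of 456 clock cycles each, and the function names are made up for this example.

```python
# Timing constants for the original Game Boy (assumed values, see above).
CPU_CLOCK_HZ = 4_194_304          # Clock frequency in Hz.
CYCLES_PER_FRAME = 154 * 456      # 154 scanlines of 456 clock cycles each.


def frames_per_second() -> float:
    """Return how many frames (and VBlank periods) occur each second."""
    return CPU_CLOCK_HZ / CYCLES_PER_FRAME


def delay_seconds(frames: int) -> float:
    """Return how long waiting for `frames` VBlank periods takes."""
    return frames * CYCLES_PER_FRAME / CPU_CLOCK_HZ


print(f"{frames_per_second():.2f} frames per second")
print(f"Waiting 60 frames takes {delay_seconds(60):.4f} seconds")
```

So counting 60 VBlank periods gives a delay that is very slightly longer than one second, which is close enough for blinking a palette.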
Please stand by
The boot ROM code waits for the next frame by continually checking the value of PPU register LY in a loop:
Addr_0064:
LD A, ($ff00+$44) ; Wait for screen frame
CP $90
JR NZ, Addr_0064
This is called an “active” wait, in that the CPU is going to be busy the whole time. This actually works just fine, but it does use 100% of your CPU power. It doesn’t really matter to an emulator, but on real hardware, that means draining the battery faster.
Wouldn’t it be nicer if the Game Boy itself could tell our code that a full frame has just been drawn? Well, then, good news everyone: there is such a thing, it’s called an interrupt!
I haven’t mentioned interrupts in those articles because they’re not used in the boot ROM code. So, as usual, I will oversimplify: an interrupt is a way for the Game Boy hardware to tell the software that something specific happened — a button was pressed, a timer overflowed, the VBlank period just started, serial communication with another Game Boy just completed…
When one of those things happens, provided your code subscribed to the corresponding interrupt, the Game Boy will automatically jump to a pre-defined address in memory so your code can react to the event.
Wait, wait, wait, wasn’t one of those things about VBlank? Isn’t that exactly what we need?
Yes! Good catch! There is, indeed, a VBlank interrupt that will occur whenever LY reaches 144, which is exactly what the boot ROM code was checking in the first place.
Feel free to interrupt me!
A cleaner way to wait for VBlank is to enable the corresponding interrupt (this is done by writing the proper value to the IE register at address 0xffff) and then enable interrupt handling with the EI (Enable Interrupts) opcode.
We still need a way to tell the Game Boy to not do anything outside of that interrupt. We could, once again, put an infinite loop somewhere, which will run on its own between each interrupt call, but that’s still an active wait. Right now this is actually the only thing our empty ROM does:
lock:
JR lock
Enter the HALT instruction, which will put the Game Boy into a low-power mode until an interrupt is handled. We can simply insert that into our loop, and it will then put the CPU to sleep between each execution of the VBlank interrupt handler. Again, this won’t make a visible difference, but it would be a lot nicer to the hardware if you were to run that ROM on an actual Game Boy [3].
lock:
HALT
JR lock
Cool! Now we have to actually write code for that interrupt handler. A common way to wait until some time has elapsed is to load a counter value into a register, decrement it each time the VBlank interrupt fires, and only do something when it reaches zero.
Let’s try and just invert the palette bits every second. We can add this code at the end of our empty ROM source file, using predefined labels provided in the header.asm source file, which the Game Boy will automatically jump to when the interrupt we want occurs.
INCLUDE "include/header.asm"
; RGBDS supports defining internal variables, for readability.
DEF WAIT_FRAMES EQU 60
SECTION "default", ROM0
main:
; Initialize frame counter here. We'll wait 60 frames each time.
LD C, WAIT_FRAMES
; Subscribe to vblank interrupt by setting bit zero of register IE ($ffff)
; to 1.
LD A, $01
LD ($ff00+$ff), A
; Enable interrupts so that the Game Boy CPU will jump to the `vblank` label
; whenever a frame is done.
EI
lock:
; Stop execution here, all the work should be done from the vblank interrupt.
HALT
JR lock
; Entry point for the VBlank interrupt.
vblank:
; Invert background palette value in hardware register $ff47, but only
; when our frame counter reaches zero.
; Decrement, then check frame counter value. Return if it's not zero.
DEC C
JR NZ, vblankDone
; If we got here, our counter is zero, so we Do The Thing.
LD A, ($ff00+$47) ; Load current BGP value in A.
XOR A, $ff ; Invert the value by XORing every bit with 1.
LD ($ff00+$47), A ; Store the inverted value back in BGP.
; Reload our frame counter to the initial value for the next iteration.
LD C, WAIT_FRAMES
vblankDone:
; Return from interrupt call and re-enable interrupts.
RETI
; All other interrupts are unimplemented and will simply return without doing
; anything special.
stat:
timer:
serial:
joypad:
RETI
And let’s see what it looks like (warning: mild flashing).
Well, that wasn’t too bad, but it still doesn’t look like much. Yet!
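If you want to convince yourself that the counter logic does what we expect without firing up an emulator, here is a rough Python model of the handler. It is only a sketch: the Demo class and run function are hypothetical names, and it assumes BGP holds 0xfc when our code starts (the value the boot ROM leaves behind).

```python
WAIT_FRAMES = 60  # Same value as the DEF in the assembly source.


class Demo:
    """Models the two pieces of state our ROM touches: register C and BGP."""

    def __init__(self, bgp: int = 0xFC):
        self.bgp = bgp            # Assumed BGP value after the boot ROM.
        self.c = WAIT_FRAMES      # LD C, WAIT_FRAMES

    def vblank(self) -> None:
        """Mirror of the `vblank` interrupt handler."""
        self.c = (self.c - 1) & 0xFF   # DEC C
        if self.c != 0:                # JR NZ, vblankDone
            return
        self.bgp ^= 0xFF               # XOR A, $ff on the BGP value
        self.c = WAIT_FRAMES           # LD C, WAIT_FRAMES


def run(frames: int) -> list[int]:
    """Fire `frames` VBlank interrupts and record BGP after each one."""
    demo = Demo()
    history = []
    for _ in range(frames):
        demo.vblank()
        history.append(demo.bgp)
    return history


history = run(120)
print(hex(history[58]), hex(history[59]), hex(history[119]))
```

The palette flips on the 60th interrupt and flips back on the 120th, so one full blink takes about two seconds.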
Once upon a tile
We totally could do more funny stuff just using whatever is present in video RAM after the Game Boy boots up. Still, this seems like a good opportunity to come up with our own graphics.
Well, I say “graphics” but I’m going to keep it very simple: let’s start with a single tile that we’ll use to fill the background map.
Remember tiles? We talked about them that one time, way back!
How about we make one from scratch? A very simple one. One that would look somewhat like this:

Figure: our tile, rendered with the BGP palette.
I saved some time by directly translating those pixels into bits so we can just insert the following hardcoded data somewhere in our ROM code. There’s a nicer way to do it, but it’s just one tile, we can do this!
; Raw tile data. One tile always uses 16 bytes: 2 per 8-pixel row. The first
; byte contains the lowest bit for all 8 pixels, the second byte the highest.
; Have I ever mentioned that endianness is a pain?
tileData:
DB $00, $00 ; Row 0, color 0b00.
DB $ff, $00 ; Row 1, color 0b01.
DB $00, $ff ; Row 2, color 0b10.
DB $ff, $ff ; Row 3, color 0b11.
DB $00, $00 ; Row 4, color 0b00.
DB $ff, $00 ; Row 5, color 0b01.
DB $00, $ff ; Row 6, color 0b10.
DB $ff, $ff ; Row 7, color 0b11.
That’s our tile. Easy! Now the fun can start.
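For the curious, this is roughly how pixels turn into those bytes. The Python sketch below is a host-side illustration, not something that goes into the ROM, and encode_row, encode_tile and STRIPES are names I made up for it. It assumes the leftmost pixel of a row lands in bit 7 of each byte.

```python
def encode_row(pixels: list[int]) -> bytes:
    """Encode 8 pixels (color indices 0 to 3) into the 2 bytes of a tile row.

    The first byte collects the low bit of every pixel, the second byte
    collects the high bit. The leftmost pixel ends up in bit 7.
    """
    assert len(pixels) == 8
    low = high = 0
    for color in pixels:
        low = (low << 1) | (color & 1)
        high = (high << 1) | ((color >> 1) & 1)
    return bytes([low, high])


def encode_tile(rows: list[list[int]]) -> bytes:
    """Encode 8 rows of 8 pixels into the 16 bytes of a tile."""
    assert len(rows) == 8
    return b"".join(encode_row(row) for row in rows)


# Our striped tile: every pixel in row N uses color N modulo 4.
STRIPES = [[row % 4] * 8 for row in range(8)]

print(encode_tile(STRIPES).hex(" "))
```

Running this on the striped tile gives back the same sixteen bytes as the DB lines above.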
Have you tried turning it off and on again?
That solitary tile will do absolutely nothing inside our ROM code. We first need to copy it over to the Game Boy’s video RAM. We also need to update the background map with the proper IDs so that it will only display this tile.
This would be nearly trivial if not for a tiny little detail: you can’t actually access video RAM while the PPU is drawing pixels.
Quoting the Pan Docs:
WARNING
When the PPU is drawing the screen, it is often directly reading from Video Memory (VRAM) and from the Object Attribute Memory (OAM). During these periods, the Game Boy CPU cannot access VRAM and OAM.
That means that any attempts to write to VRAM or OAM are ignored (data remains unchanged). And any attempts to read from VRAM or OAM will return undefined data (typically $FF).
Okay, this isn’t that big a deal, we’ll do the easy thing and turn off the PPU long enough to copy our tile to video RAM.
Yeah, er, about that… still quoting the Pan Docs:
CAUTION
Stopping LCD operation (Bit 7 from 1 to 0) may be performed during VBlank ONLY, disabling the display outside of the VBlank period may damage the hardware by burning in a black horizontal line similar to that which appears when the GB is turned off. This appears to be a serious issue. Nintendo is reported to reject any games not following this rule.
Okay fine, we’ll put a small wait loop at the beginning of our code as they did in the Pan Docs’ example, and turn the PPU off when it’s safe to do so [4].
main:
; Initialize tile data. We first need to turn the PPU off. Which requires
; us to wait until VBlank. We could use the vblank interrupt too, but this
; is a short, one-time initialization so we'll do it the easy way.
waitForFrame:
LD A, ($ff00+$44) ; Wait for LY to reach value 144 (0x90 in hexadecimal)
CP $90
JR NZ, waitForFrame
; VBlank started, turn PPU off now.
XOR A ; Set A to zero
LD ($ff00+$40), A ; Set all bits in LCDC to zero, turning off the PPU.
Okay, now we can finally put that tile in video RAM.
Moving data around in assembly is kind of tedious. Fortunately RGBDS is helping a bit by computing some addresses for us, so there is that.
; Copy tile data from wherever the assembler stored it to 0x8000.
; We'll use DE to hold the source address and HL the destination.
LD DE, tileData ; RGBDS will replace tileData with its actual address.
LD HL, $8000
copyTileData:
LD A, (DE) ; Load current tile byte into A.
INC DE ; Point DE to next tile byte.
LD (HL+), A ; Write data byte to address [HL], then increment HL.
BIT 4, L ; Check whether L has reached 0x10 (16).
JR Z, copyTileData ; If not (bit 4 of L is still zero), keep copying data.
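The exit condition of that loop is a bit sneaky, so here is a Python model of it. This is a sketch with hypothetical names (copy_tile, and a dict standing in for video RAM), meant only to show that testing bit 4 of L stops the copy after exactly 16 bytes. The trick relies on the destination starting at an address whose low byte is zero, like 0x8000.

```python
def copy_tile(rom: bytes, src: int, dest: int = 0x8000) -> dict[int, int]:
    """Model of the copyTileData loop; returns the bytes written to VRAM."""
    vram: dict[int, int] = {}
    de, hl = src, dest
    while True:
        a = rom[de]            # LD A, (DE)
        de += 1                # INC DE
        vram[hl] = a           # LD (HL+), A: write, then increment HL
        hl += 1
        if hl & 0x10:          # BIT 4, L: set once L reaches 0x10
            break              # JR Z falls through when the bit is set
    return vram


# Pretend the tile lives at offset 4 of a small fake ROM.
fake_rom = bytes(range(32))
written = copy_tile(fake_rom, src=4)
print(len(written), hex(min(written)), hex(max(written)))
```

If the tile had to go anywhere else than the start of a 32-byte aligned block, a plain byte counter would be the safer choice.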
What’s left to do is updating the background map. I copied our tile data to address 0x8000, the very beginning of the memory space the PPU reads tiles from, so we want the whole background map to contain ID zero.
This is one of the very first things the boot ROM does, so I’ll happily copy and paste that code here. I’m lazy like that. I’ll just put it before the previous chunk of code so it doesn’t overwrite the tile data we just put in video RAM.
; Clear VRAM from 0x8000 to 0x9fff (borrowed from boot ROM code).
; This relies on A still being zero from the XOR A above.
LD HL, $9fff
clearVRAM:
LD (HL-), A ; Set byte at address HL to zero, then decrement HL.
BIT 7, H ; Check whether H is 0x80 or larger (bit 7 non-zero).
JR NZ, clearVRAM ; If so, keep clearing VRAM.
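Same idea here: a quick Python model (hypothetical clear_vram function, with a bytearray standing in for the whole address space) to check that watching bit 7 of H clears exactly 0x8000 to 0x9fff and nothing below.

```python
def clear_vram(memory: bytearray) -> int:
    """Model of the clearVRAM loop; returns the final value of HL."""
    a = 0                          # A is expected to hold zero already.
    hl = 0x9FFF                    # LD HL, $9fff
    while True:
        memory[hl] = a             # LD (HL-), A: write, then decrement HL
        hl = (hl - 1) & 0xFFFF
        h = hl >> 8
        if (h & 0x80) == 0:        # BIT 7, H: clear once HL drops to 0x7fff
            break                  # JR NZ falls through
    return hl


# Fill memory with a marker value so untouched bytes are easy to spot.
memory = bytearray([0xAA] * 0x10000)
final_hl = clear_vram(memory)
print(hex(final_hl))
```

The loop stops with HL at 0x7fff, one byte below video RAM, without having written to it.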
Whew. Where were we now…
I shift you not!
We can already run the ROM again as it is: we’ll see the tiles’ colors alternate every second.
Oh wait, this doesn’t look like our tile yet. Remember that tile colors are only indices into the background palette, and after boot up that palette only defines two colors: white for color index 0, and black for all the others. Which means our tile actually looks like this to the Game Boy right now:

Figure: our tile, rendered with the value of the BGP register at startup.
Let’s redefine BGP so that it actually shows all four colors the Game Boy can display. We’re just packing numbers from 0 to 3 in that byte, so 0b00, 0b01, 0b10 and 0b11, which translates to 0x1b once all put together [5].
; Configure BGP to show all four colors.
LD A, $1b
LD ($ff00+$47), A
And now, this should look more like what we expected, though it still doesn’t look like much.
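Here is the packing spelled out in Python, as a sketch to double-check the values rather than anything the ROM needs. The function names are mine. It assumes color index 0 lives in the two lowest bits of BGP, and that shade 0 is white while shade 3 is black.

```python
def pack_bgp(shades: list[int]) -> int:
    """Pack four shades into a BGP byte. shades[i] is used for color index i."""
    value = 0
    for index, shade in enumerate(shades):
        value |= (shade & 0b11) << (2 * index)
    return value


def unpack_bgp(value: int) -> list[int]:
    """Return the shade assigned to each of the four color indices."""
    return [(value >> (2 * index)) & 0b11 for index in range(4)]


print(hex(pack_bgp([3, 2, 1, 0])))   # Color 0 is black, color 3 is white.
print(hex(pack_bgp([0, 1, 2, 3])))   # Color 0 is white, color 3 is black.
print(unpack_bgp(0xFC))              # The palette left by the boot ROM.
```

Under that assumption, 0x1b maps color 0 to the darkest shade and color 3 to the lightest, while 0xe4 is the other way around. Unpacking 0xfc shows why the tile only had two visible colors at startup.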
If we wanted to achieve the same result “manually”, we’d have to rewrite all sixteen bytes of that tile in video RAM, and we would need to do that only during VBlank too. Changing BGP instead is much easier and faster. It will also allow us to write other kinds of effects.
An easy one to do is cycling the palette. That is, rotating through all four colors: color 0 will become color 1, color 1 will become color 2, color 2 will become color 3, and color 3 will wrap back to color 0.
In essence, we are shifting all the bits in BGP to the left (though the direction doesn’t really matter) and copying the leftmost bit that we shifted out back to bit 0 of the register.
Does that remind you of something?
Figure: RL C at work in a loop. The register goes back to its initial value every 9 rotations.
Yep, we can just use one assembly instruction to rotate the value in BGP, though in this instance we will want to bypass the Carry Flag entirely. It’s cool, that’s what the RLC instruction [6] is for.
We’ll simply rewrite the line that used to invert the bits in BGP to rotate them instead.
LD A, ($ff00+$47) ; Load current BGP value in A.
RLC A ; Rotate the BGP value held in A left twice
RLC A ; since a BGP entry is 2 bits wide.
LD ($ff00+$47), A ; Store the cycled value back in BGP.
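To see the difference between the two rotations on paper, here is a Python sketch of both. The helpers rlc, rl and cycle_palette are hypothetical names that only model the value and the carry bit, not the other CPU flags.

```python
def rlc(value: int) -> int:
    """8-bit circular rotate left: bit 7 wraps around to bit 0."""
    return ((value << 1) | (value >> 7)) & 0xFF


def rl(value: int, carry: int) -> tuple[int, int]:
    """Rotate left through carry: returns the new (value, carry) pair."""
    return ((value << 1) | carry) & 0xFF, value >> 7


def cycle_palette(bgp: int) -> int:
    """Rotate BGP by one 2-bit entry, like the two RLC A instructions."""
    return rlc(rlc(bgp))


# Cycling four times brings the palette back to where it started.
bgp = 0x1B
for _ in range(4):
    bgp = cycle_palette(bgp)
    print(hex(bgp))
```

Starting from 0x1b, the palette goes through 0x6c, 0xb1 and 0xc6 before coming back to 0x1b. Going through the carry instead would need nine single rotations to come back around, which is not what we want for four 2-bit entries.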
This effect will also look better if we reduce the delay a little.
DEF WAIT_FRAMES EQU 10
And there we go!

That’s an effect you might have seen in old games, though it looks better with more colors. We could also simulate a fade-to-black effect using a similar technique where we would gradually increment all color values in BGP by one until they are all 3 (black).
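That fade could work along these lines. This is a Python sketch of the arithmetic only, with a made-up fade_step function. An actual ROM would do the same thing in the VBlank handler, which is not shown here.

```python
def fade_step(bgp: int) -> int:
    """Darken every 2-bit entry of BGP by one shade, stopping at 3 (black)."""
    result = 0
    for index in range(4):
        shade = (bgp >> (2 * index)) & 0b11
        result |= min(shade + 1, 3) << (2 * index)
    return result


# Starting from a palette that uses all four shades.
bgp = 0xE4
while bgp != 0xFF:
    bgp = fade_step(bgp)
    print(hex(bgp))
```

From 0xe4 it takes three steps to reach 0xff, at which point every color index is black and further steps change nothing.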
Thank you sir. May I have some more?
This article grew a lot longer than I planned and I really should stop here. But I mentioned earlier that there was a nicer way to include tile data in your assembly code and I’d love to show you how, if you can bear with me for a little longer [7].
What if we wanted slightly more detailed graphics, with more tiles? Again, I say “graphics” but what I have in mind is still pretty simple:
Now this requires eight distinct tiles, and I don’t really want to write 128 byte values by hand. Guess what, RGBDS already thought of that, and it provides a couple tools to include bitmap data in your code.
First, the RGBDS suite of utilities includes rgbgfx, which converts good old PNG files into binary data suitable for display on the Game Boy.
For instance it can convert the following PNG into a list of bytes defining all eight tiles in it.
Second, the assembly syntax supported by RGBDS contains the INCBIN directive, which can include binary files generated with rgbgfx directly in your source code, as if you had written all those DB instructions yourself. Neat!
We could start from the picture above, which contains our eight tiles, convert it to binary and include that in the code. But then, we still have one problem to solve: initializing the tile map.
In the previous example, it was easy: write ID zero across the whole background map. Except now we need to be smarter than that. This is what the visible part of our background map should look like:
Oof.
The visible tile map is 20×18 tiles, that’s still 360 bytes to define in our code and then write to that part of video RAM somehow. I originally did it manually, using some trickery to encode how many times a tile repeats along with its ID.
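To give an idea of what that kind of trickery can look like, here is a generic run-length encoding sketch in Python. It is an illustration only, not the encoding actually used in the repository: the (count, tile ID) byte pairs and the function names are assumptions made for this example.

```python
def rle_encode(tile_ids: bytes) -> bytes:
    """Encode a tile map as (count, tile ID) byte pairs. Counts fit in a byte."""
    out = bytearray()
    i = 0
    while i < len(tile_ids):
        tile = tile_ids[i]
        run = 1
        while (i + run < len(tile_ids)
               and tile_ids[i + run] == tile
               and run < 255):
            run += 1
        out += bytes([run, tile])
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Expand (count, tile ID) pairs back into the full tile map."""
    out = bytearray()
    for k in range(0, len(data), 2):
        count, tile = data[k], data[k + 1]
        out += bytes([tile]) * count
    return bytes(out)


# A mostly empty 20x18 map with three distinct tiles somewhere in it.
tilemap = bytes([0] * 40 + [1, 2, 3] + [0] * 317)
encoded = rle_encode(tilemap)
print(len(tilemap), "bytes become", len(encoded))
```

With long runs of the same tile, the 360 bytes shrink to a handful of pairs, at the cost of a slightly more involved copy loop on the Game Boy side.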
That was fun, but not really necessary because rgbgfx can also do that for us: if we provide it with the full picture of our Game Boy screen, it is capable of generating both the data for the tiles themselves as well as the tile map!
rgbgfx -v -u -o tiles.bin -t tilemap.bin tilemap.png
So in the end, we can just include those two binary files in our code. We’ll still need a loop to copy these bytes where they need to go (I’m not going to show it here, you already got the general idea) but we saved ourselves a lot of work writing tedious DB entries.
tileData:
INCBIN "tiles.bin"
.end
tilemapData:
INCBIN "tilemap.bin"
.end
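One detail worth keeping in mind for the tile map copy: the background map in video RAM is 32 tiles wide, while the visible screen is only 20. The Python sketch below models that stride. It assumes the generated tile map holds the 20×18 visible tiles row by row and that the background map starts at 0x9800. The names are hypothetical.

```python
SCREEN_W, SCREEN_H = 20, 18   # Visible screen size, in tiles.
MAP_W = 32                    # Background map width, in tiles.
MAP_BASE = 0x9800             # Assumed start address of the background map.


def copy_tilemap(vram: bytearray, tilemap: bytes) -> None:
    """Copy a 20x18 tile map into the top-left corner of the 32-wide map."""
    assert len(tilemap) == SCREEN_W * SCREEN_H
    for row in range(SCREEN_H):
        src = row * SCREEN_W
        dst = MAP_BASE + row * MAP_W
        # The 12 remaining entries of each map row are left untouched.
        vram[dst:dst + SCREEN_W] = tilemap[src:src + SCREEN_W]


# Fake tile map cycling through our eight tile IDs.
fake_map = bytes(i % 8 for i in range(SCREEN_W * SCREEN_H))
vram = bytearray(0x10000)
copy_tilemap(vram, fake_map)
print(hex(MAP_BASE + (SCREEN_H - 1) * MAP_W + SCREEN_W - 1))
```

In assembly, this boils down to adding 12 to the destination pointer after every 20 bytes copied.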
And now we have something that almost looks like the basis for a demo of some kind!

Was there a point to this?
Actually, just writing the code for this article and trying to test it on our demo emulator highlighted a couple small bugs left in there, so there was that. Also, shapes and colors are nice!
But more importantly, doing this made me start thinking in terms of ROM development, as opposed to the emulator side. That’s the first toe I dipped in the deep and scary pool of old-school game programming.
And I will hopefully write more about that topic in the near future.
In the meantime, thank you for reading!
Source code
You can find the source code for all these custom ROMs on GitHub. They don’t ship with an emulator so you can pick your favorite one. You will also need a recent version of RGBDS.
Then you should be able to just run make in the source folder.
git clone https://github.com/lazy-stripes/blog-code.git
cd blog-code/playing-with-gameboy-palettes/
# Make sure the RGBDS binaries are in your PATH, or edit the Makefile.
make
The generated ROMs will all have the .gb extension. Have fun!
[1] Please note that I’ll run the ROMs we build in this article on Goholint, not the barebones demo emulator we came up with in the Writing an Emulator series of articles, because this is already going to be a long enough article without me delving into the finer points of implementing missing CPU instructions or interrupt handling. Also Goholint can save GIFs, which made it a lot easier to include pictures in this article.
[2] Actually, if we don’t put any code in there, the Game Boy will happily keep trying to execute all the subsequent bytes in the ROM and, depending on what they are (most likely zero or 0xff), it will either wrap around the whole memory space and try to execute some invalid opcode, or go into an RST 38H infinite loop that will definitely make your stack overflow.
[3] In fact it might also be nicer to your computer’s CPU as it should reduce the workload of your emulator.
[4] For the record, Goholint used to let you write to VRAM at any time. I have since made it behave as the Pan Docs describe, and I have not really noticed much of a difference, but at least now it’s a bit more accurate.
[5] Or 0xe4 if you put them in reverse order, this won’t matter much here.
[6] RLC is usually read as Rotate Left Circular: bit 7 wraps straight around to bit 0 (it is also copied into the Carry Flag, but the previous carry value never enters the register). The actual instruction then takes a register name as parameter. Not to be confused with RL C: Rotate Left, through the Carry Flag, the value in register C. Yes, I know. There’s a good reason why we usually work with higher-level languages.
[7] I won’t mind if you don’t, thank you for reading this far! ♥