Writing an emulator: scrolling at last

2019-12-22 / Development, Emulator

The last article left us with a display that works well enough, except for the scrolling part. We’ll cover it today, and when we’re done, our no-longer-tiny example program ought to be able to run the Boot ROM¹ and read the logo to scroll directly from a cartridge.

Define “scrolling”

I’m only going to look at vertical scrolling today. That is, the ability to make the screen’s output appear to move up or down over time.

We’ve actually done most of the work with our existing PPU and Fetcher. The PPU is already continuously outputting graphical data from memory to screen, but what we need to keep in mind is that the memory zone used for graphics is noticeably larger than the actual screen.

Notice how the area actually shown on screen only covers 20×18 tiles of the 32×32 background map.

That means we’d only need to change some offsets between two frames to make the PPU display a different region of video RAM entirely. By tweaking the right offsets at the right speed, the boot ROM code can tell the PPU to “move” the screen’s output up in video RAM, which will gradually display the logo as if it was scrolling down the screen.

In essence, we’re only going to adjust the offsets we compute for the Fetcher each frame by mixing in some extra parameters that can be set by Game Boy code.

In the Game Boy, one of those parameters is SCY, the PPU’s vertical scrolling register. Quoting the Pan Docs again:

FF42 – SCY – Scroll Y (R/W)
FF43 – SCX – Scroll X (R/W)

Specifies the position in the 256×256 pixels BG map (32×32 tiles) which is to be displayed at the upper/left LCD display position.

Values in range from 0-255 may be used for X/Y each, the video controller automatically wraps back to the upper (left) position in BG map when drawing exceeds the lower (right) border of the BG map area.
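To make the wrap-around in that quote concrete, here is a minimal sketch. It assumes a 144-line screen (18 tiles of 8 pixels each) and uses a hypothetical mapY helper that is not part of the emulator. Since SCY and the scanline number are both 8-bit values, a plain uint8 addition already wraps back to the top of the 256-pixel-tall map:

```go
package main

import "fmt"

// screenHeight is the number of visible scanlines: 18 tiles of 8 pixels each.
const screenHeight = 144

// mapY is a hypothetical helper: it returns the line of the 256-pixel-tall
// background map shown on scanline ly when the scroll register holds scy.
// Unsigned 8-bit addition wraps around at 256, just like the map itself.
func mapY(scy, ly uint8) uint8 {
    return scy + ly
}

func main() {
    for _, scy := range []uint8{0, 100, 200} {
        top := mapY(scy, 0)
        bottom := mapY(scy, screenHeight-1)
        fmt.Printf("SCY=%3d: map lines %3d to %3d\n", scy, top, bottom)
    }
}
```

With SCY set to 200, the last scanline lands on map line 87: the screen shows the bottom of the map, then carries on from its top.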

Remember how we got the PPU’s Fetcher to read from specific addresses in video RAM? So far we always had it start fetching pixels from the screen’s top-left position, so it’s only showing the first 20×18 tiles in the background map.

A single frame being fetched from the background map, with no scrolling: for each line, the Y-coordinate of the current scanline is equal to the current value of LY.

If we want to give Game Boy code any control over this, we need to use the values stored in SCX and SCY to tell the Fetcher precisely where to start reading data in video RAM to show that part of the background map. Now, SCX is not needed here: the boot ROM program only scrolls a logo vertically and exclusively uses SCY for that. Moreover, horizontal scrolling has extra gotchas I don’t want to get into just now.

A few frames of the same video content rendered with a different SCY offset each time.

Again, it seems like it should be pretty quick to implement.

Yet another register

We’re going to need to add a fourth register to our PPU, and then a fifth when we start using SCX, not to mention the few others we’ll need down the road. If you recall the previous article, managing all these extra register addresses was growing tedious, with switch/case blocks that did little more than map an address to a variable.

This looks like a good time to make it a little better. Many of our registers are only used for storage and don’t do anything specific. Just writing to them or reading from them given their specific address is something we’ll end up doing a lot.

To this effect, I came up with a new Addressable type which doesn’t even require a structure.

// Registers type allows mapping a 16-bit address to an 8-bit register variable.
// It also implements the Addressable interface.
type Registers map[uint16]*uint8

Go lets us create a new type based on existing ones, and attach methods to it. What we want to do here is associate a 16-bit register address to an 8-bit variable we should be able to read or modify. A map of 16-bit integers to 8-bit integer pointers will let us do just that.

Then, it only needs to implement the three methods required by the Addressable interface.

// Contains returns true if the given address corresponds to a known register.
func (r Registers) Contains(addr uint16) bool {
    return r[addr] != nil
}

// Read returns the byte in the register at the given address.
func (r Registers) Read(addr uint16) uint8 {
    if regPtr := r[addr]; regPtr != nil {
        return *regPtr
    }
    panic("invalid register read address")
}

// Write sets the value of the register at the given address. If you need some
// extra checks (for read-only registers for instance), you can just override
// this method in types embedding it.
func (r Registers) Write(addr uint16, value uint8) {
    if regPtr := r[addr]; regPtr != nil {
        *regPtr = value
    } else {
        panic("invalid register write address")
    }
}

Checking whether a map contains the given address is trivial, since looking up a key that isn’t in a Go map returns the zero value of the map’s value type (nil, here, as we store pointers)².

Similarly, reading the value of the register at a given address or writing to it just amounts to looking it up in the map and using the stored pointer to get or set that value.
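As a quick illustration, here is a self-contained sketch that repeats the Registers type from above and maps it to a single stand-alone variable. The scy variable and the main function are only there for the demonstration; they are not part of the emulator.

```go
package main

import "fmt"

// Registers maps a 16-bit address to a pointer to an 8-bit register variable.
type Registers map[uint16]*uint8

// Contains returns true if the given address corresponds to a known register.
func (r Registers) Contains(addr uint16) bool {
    return r[addr] != nil
}

// Read returns the byte in the register at the given address.
func (r Registers) Read(addr uint16) uint8 {
    if regPtr := r[addr]; regPtr != nil {
        return *regPtr
    }
    panic("invalid register read address")
}

// Write sets the value of the register at the given address.
func (r Registers) Write(addr uint16, value uint8) {
    if regPtr := r[addr]; regPtr != nil {
        *regPtr = value
        return
    }
    panic("invalid register write address")
}

func main() {
    var scy uint8 // Stand-alone variable playing the part of a register.
    regs := Registers{0xff42: &scy}

    regs.Write(0xff42, 100)
    fmt.Println(scy) // The write went through the stored pointer.

    scy = 7
    fmt.Println(regs.Read(0xff42)) // The read sees the variable's new value.

    fmt.Println(regs.Contains(0xff42), regs.Contains(0xff43))
}
```

The map only stores pointers, so the variable and the map entry always agree: there is a single copy of the register's value.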

And now, other structured types can embed this new Registers type and benefit from those three methods, which will make them Addressable with almost no extra code needed… except a little bit of initialization.

We can do that with our PPU type right away, and add that new SCY register while we are at it:

type PPU struct {
    Registers // Embeds a mapping of addresses to register variables.

    LCDC uint8 // LCD Control register.
    LY   uint8 // Number of the scanline currently being displayed.
    BGP  uint8 // Background map tiles palette.
    SCY  uint8 // Y-scrolling (from the top of the screen).

    // Other fields omitted...
}

That was easy enough: we only had to add a single Registers entry in there to embed it. We can now remove the Contains(), Read() and Write() methods from the PPU type itself: from now on, it will use the methods promoted from the embedded Registers type by default… provided we initialize this map of registers.

This is done with a tiny snippet of extra code in the NewPPU() function:

// NewPPU returns an instance of PPU using the given display object.
func NewPPU(screen Display) *PPU {
    // Pre-instantiate the PPU object so we can refer to its registers.
    ppu := PPU{Screen: screen}

    // Associate addresses with the corresponding register variables.
    ppu.Registers = Registers{
        0xff40: &ppu.LCDC,
        0xff42: &ppu.SCY,
        0xff44: &ppu.LY,
        0xff47: &ppu.BGP,
    }

    return &ppu
}

First, an actual PPU instance is created, so we can refer to the addresses of its LCDC, SCY, LY and BGP registers immediately afterwards.

Then, those registers are each associated with a 16-bit address, just as they would be in a regular map variable.

And now, calling mmu.Read(0xff42) from pretty much anywhere in our program will automatically return whatever is currently stored in our brand new ppu.SCY register. We can add any number of registers to our PPU type, and simply add an entry to that map when we instantiate the PPU. There is no more need to add some OR condition in Contains() or an extra case in Read() or Write(). This is all taken care of by the embedded Registers type³.
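Overriding one of those methods could look like the following sketch. It is a stripped-down PPU (no display, only two registers) that assumes we want writes to LY to be silently ignored; the addrLY constant is a name made up for this example.

```go
package main

import "fmt"

// Registers maps a 16-bit address to a pointer to an 8-bit register variable.
// Only Write is reproduced here to keep the example short.
type Registers map[uint16]*uint8

// Write sets the value of the register at the given address.
func (r Registers) Write(addr uint16, value uint8) {
    if regPtr := r[addr]; regPtr != nil {
        *regPtr = value
        return
    }
    panic("invalid register write address")
}

// addrLY is a hypothetical constant holding the address of the LY register.
const addrLY = 0xff44

// PPU is a stripped-down version of the type above, for illustration only.
type PPU struct {
    Registers

    LY  uint8
    SCY uint8
}

// NewPPU returns a PPU whose registers are mapped to their addresses.
func NewPPU() *PPU {
    ppu := PPU{}
    ppu.Registers = Registers{
        0xff42: &ppu.SCY,
        addrLY: &ppu.LY,
    }
    return &ppu
}

// Write shadows the method promoted from Registers: writes to LY are
// dropped, everything else is delegated to the embedded type.
func (p *PPU) Write(addr uint16, value uint8) {
    if addr == addrLY {
        return
    }
    p.Registers.Write(addr, value)
}

func main() {
    ppu := NewPPU()
    ppu.LY = 42

    ppu.Write(0xff44, 0)   // Ignored: LY keeps its value.
    ppu.Write(0xff42, 100) // Delegated to Registers.Write.

    fmt.Println(ppu.LY, ppu.SCY)
}
```

Read() and Contains() would still come from the embedded type, so only the behavior that differs needs any code.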

We now also have a new register dedicated to vertical scrolling that’s just sitting there, doing nothing so far.

SCY U NO SCROLL?

As I mentioned earlier, “scrolling” in our case only means fetching pixels from a different memory area over time. In effect, it amounts to adding a mere variable in the bunch of offsets we computed a couple articles ago.

If you recall, that was taking place at the very end of the OAM Search state in our PPU’s Tick() method:

        if p.ticks == 80 {
            // Move to Pixel Transfer state. Initialize the fetcher to start
            // reading background tiles from VRAM. The boot ROM does nothing
            // fancy with map addresses, so we just give the fetcher the base
            // address of the row of tiles we need in video RAM, adjusted with
            // the value in our vertical scrolling register.
            //
            // In the present case, we only need to figure out in which row of
            // the background map our current line (at position Y) is. Then we
            // start fetching pixels from that row's address in VRAM, and for
            // each tile, we can tell which 8-pixel line to fetch by computing
            // Y modulo 8.
            p.x = 0
            y := p.SCY + p.LY // Real Y value taking scrolling into account

            // The following is almost identical to the non-scrolling version,
            // substituting our computed Y value instead of only using LY.
            tileLine := y % 8
            tileMapRowAddr := 0x9800 + (uint16(y/8) * 32)
            p.Fetcher.Start(tileMapRowAddr, tileLine)

            p.state = PixelTransfer
        }

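This is almost exactly the same code as before: we merely added a y variable that’s just the sum of LY (the current scanline in our frame) and SCY (the vertical offset starting from which we need to read pixels in video RAM). Since both are 8-bit values, the sum wraps around at 256, which matches the wrap-around behavior described in the Pan Docs.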

And believe it or not, that’s all we need!
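To check the arithmetic by hand, the computation can be pulled out into a tiny stand-alone program. This is only a sketch: fetcherStart is a hypothetical helper that mirrors the three lines from Tick() above, and it assumes the background map starts at 0x9800 as in the excerpt.

```go
package main

import "fmt"

// fetcherStart is a hypothetical helper mirroring the offsets computed at the
// end of OAM Search: it returns the VRAM address of the row of tiles to fetch
// and the line to read within each of those tiles.
func fetcherStart(scy, ly uint8) (tileMapRowAddr uint16, tileLine uint8) {
    y := scy + ly // Wraps around at 256, like the background map.
    tileLine = y % 8
    tileMapRowAddr = 0x9800 + uint16(y/8)*32
    return tileMapRowAddr, tileLine
}

func main() {
    examples := []struct{ scy, ly uint8 }{
        {0, 0},    // No scrolling, first scanline.
        {0, 143},  // No scrolling, last scanline.
        {100, 10}, // Scrolled down by 100 pixels.
        {250, 10}, // Scrolled past the bottom of the map.
    }
    for _, e := range examples {
        addr, line := fetcherStart(e.scy, e.ly)
        fmt.Printf("SCY=%3d LY=%3d -> row address 0x%04x, tile line %d\n",
            e.scy, e.ly, addr, line)
    }
}
```

For instance, with SCY at 100 and LY at 10, y is 110: that’s line 6 of the tiles in row 13 of the map, whose address is 0x9800 + 13 × 32 = 0x99a0.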

Ta-dah again!

Now, depending on your actual hardware, the image might be scrolling too fast to be noticeable. To obtain the picture above I used a feature from SDL that attempts to synchronize texture updates with your screen’s refresh rate. Unfortunately, this is unreliable because:

Your screen’s refresh rate may vary.
Your screen’s refresh rate may not match the original Game Boy screen’s.
Some GPU drivers just can’t synchronize with your screen at all.

As of this article, I have yet to find a reliable way to run the emulator at a speed that feels natural. I’m hoping (and probably gonna have) to use sound in some way to clock the whole system. So, until then, you know the drill: we’ll make do.

This ███████^® sucks

Yeah, I know, this registered trademark black bar is getting really boring. Especially now that we have everything we need to make it scroll!

Since the Boot ROM reads the Nintendo logo from the cartridge itself, we now need to implement support for actual ROM files. This, too, will need a whole article when we get to special memory controllers found in a lot of Game Boy cartridges to extend their size beyond the usual 32KB of the oldest and smallest ROMs.

However, we have almost all the components we need to just map a 32KB (or smaller) ROM file into our emulator’s memory. Just like the Boot ROM, the most basic cartridge can be seen as a 32KB-long byte array, which covers all addresses from 0x0000 to 0x7fff.

Didn’t we implement something similar, way back? Our RAM type provides exactly the features we need, except we need to initialize the underlying byte array with the contents of the ROM file we want to run. We should also forbid memory writes to those addresses since the simplest cartridges are just read-only ROMs. Some others do accept writes to special addresses related to memory banking, but, again, that will need its own article.

In the meantime, we can simply embed our RAM type, write a tiny constructor function to read its contents from a file, and override the embedded RAM type’s Write() method to make it a no-op.

// Cartridge type acting like a read-only extension of our RAM type, initialized
// with a file just like the BootROM type. This type directly embeds RAM so the
// Read() and Contains() methods are already implemented. We can just add an
// empty Write() method to make it fully read-only.
type Cartridge struct {
    RAM // Also embeds `bytes` and `start` properties.
}

// NewCartridge returns a new address space containing the cartridge's contents
// and starting from zero. Returns nil in case of error, so that if there is
// no cartridge file in the current folder, we silently ignore it.
func NewCartridge(filename string) *Cartridge {
    cart, err := ioutil.ReadFile(filename)
    if err != nil {
        fmt.Println(err)
        return nil
    }
    return &Cartridge{RAM{cart, 0}}
}

// Write does nothing for our cartridge. Some actual cartridges with extra chips
// in them actually do some specific stuff when you write to them, but that is
// way beyond the scope of this program.
func (c *Cartridge) Write(addr uint16, val uint8) {
    // Ignore all writes to this address space.
}

See how quick that was? I didn’t even need to omit error handling for brevity this time!

This cartridge file should ideally be provided on the command line when running the emulator⁴, and this should be done in our main() function.

    // MMU looking up addresses in boot ROM or BOOT register first,
    // then in the PPU, then in RAM, then in the cartridge (if any).
    // So even if the RAM object technically contains addresses shadowing the
    // BOOT, LCDC, LY or SCY registers, the boot or ppu objects will take
    // precedence.
    mmu := MMU{[]Addressable{boot, ppu, ram}}

    // If a cartridge file is given as parameter, try to load it and add it to
    // our MMU. Otherwise, the emulator will still behave as if no game was
    // inserted.
    if len(os.Args) == 2 {
        if cart := NewCartridge(os.Args[1]); cart != nil {
            mmu.Add(cart)
        }
    }

The only real change is that we now try reading any file whose path is given as parameter to the program. I quickly implemented an Add() method for the MMU type that simply appends a new Addressable object at the very end of its list. Since we pretty much want every other component (especially the boot ROM object) to have priority over the cartridge itself, this works fine.
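For reference, such an Add() method could be as small as the sketch below. The spaces field name, the fixed stand-in type and the 0xff value returned for unmapped addresses are all assumptions made for this example; the actual MMU in the example program may differ.

```go
package main

import "fmt"

// Addressable is the three-method interface our address spaces implement.
type Addressable interface {
    Contains(addr uint16) bool
    Read(addr uint16) uint8
    Write(addr uint16, value uint8)
}

// MMU looks addresses up in each of its address spaces, in order.
// The field name is an assumption made for this sketch.
type MMU struct {
    spaces []Addressable
}

// Add appends an address space at the very end of the list, which gives it
// the lowest priority of all.
func (m *MMU) Add(space Addressable) {
    m.spaces = append(m.spaces, space)
}

// Read returns the byte from the first address space containing addr.
func (m *MMU) Read(addr uint16) uint8 {
    for _, space := range m.spaces {
        if space.Contains(addr) {
            return space.Read(addr)
        }
    }
    return 0xff // Assumption: unmapped addresses read as 0xff.
}

// fixed is a hypothetical stand-in for the boot ROM or a cartridge: every
// address between start and end reads as the same value.
type fixed struct {
    start, end uint16
    value      uint8
}

func (f fixed) Contains(addr uint16) bool      { return addr >= f.start && addr <= f.end }
func (f fixed) Read(addr uint16) uint8         { return f.value }
func (f fixed) Write(addr uint16, value uint8) {}

func main() {
    boot := fixed{0x0000, 0x00ff, 0xb0} // 256-byte boot ROM.
    cart := fixed{0x0000, 0x7fff, 0xca} // 32KB cartridge.

    mmu := MMU{[]Addressable{boot}}
    mmu.Add(cart)

    fmt.Printf("0x%02x\n", mmu.Read(0x0000)) // Boot ROM shadows the cartridge.
    fmt.Printf("0x%02x\n", mmu.Read(0x0104)) // Past the boot ROM: cartridge.
}
```

Because the lookup stops at the first match, appending the cartridge last is enough to let the boot ROM shadow its first 256 bytes.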

We can now pass any ROM to the test program and see if it can read a logo from it. Just as with the boot ROM binary image, I can’t really provide any old ROM here. Sure, there are some in the public domain, but I thought it’d be funnier to actually provide a completely custom ROM you could play with, or even open in a text editor. You can download it from the same place as the test program and try it out if you don’t have a 32KB ROM lying around.

Thank you for bearing with me so far.

Since this is still not the genuine logo, the Boot ROM will hang there as well. If you do try out an actual ROM, there is a fair chance the test program will exit soon after displaying the logo since it will then try to continue executing code, and there are most likely machine instructions in the cartridge that we haven’t implemented yet.

Remember, we initially set out to execute the Boot ROM itself and nothing more. This was also my goal when I started writing my emulator, and I didn’t think I’d get that far, to be honest.

Yet getting that scrolling display to work was just encouraging enough that I went and decided I’d at least try and run some small ROM, like Tetris or Mario Land.

But this is another story for another time. Right now, I’m rather happy with that example program which, using just over one thousand lines of code (not counting comments), is running almost the whole boot ROM. Obviously, we’re still missing sound, and that’s something I actually haven’t solved yet. I kind of hope to figure it out as I write the next article.

In the meantime, thank you for reading, and all my best wishes for 2020!

References

The SDL wiki
Pan Docs: Graphics
Pan Docs: Memory Map
The Ultimate Game Boy Talk (the whole Pixel Processing Unit part, still)
Example program: scrolling at last
Example ROM

You can download the example program above and run it anywhere from the command line:

$ go run scrolling-at-last.go

It expects a dmg-rom.bin file to be present in the same folder. Note that it might take a little while to start, as Go will build the SDL bindings the first time you run the program.

You can also download the example ROM in the same folder and run it from the command line:

$ go run scrolling-at-last.go cartridge.gb

Finally, you can replace cartridge.gb with the path to any GB ROM you have and see what happens!

¹ With no sound yet. I’m still working on it at this time.
² There is a cleaner way to do it, where you get an extra boolean variable when querying a map, which tells you whether the value was indeed present. In our case, though, this would do the exact same thing and just be more verbose.
³ You can still override them, for instance to make sure writes to LY are ignored.
⁴ Well, ideally there would be some GUI to directly select your ROM file from your hard drive, but this is the one thing I said SDL wasn’t very good at. Command-line parameters will do juuuust fine.