Posted: 2022-02-15
Last modified: 2022-02-16 @ 5273e29

Oversights and Redesigns

This post is part of the following series: Rusty Game Boy Emulator.

For the third installment of this series, I’m going to have to redesign the emulator a bit due to an oversight I made in the opcodes.

The oversight

I was happily hacking along, implementing opcodes from the table when I noticed something about some of the instructions. I’ll use two instructions to show what I mean.

Here’s a nice and normal instruction from the table:

INC A   # Instruction
1 4     # Size in bytes, and how many clock cycles it takes
Z 0 H - # Affects flags Z and H, clears flag N, no effect on flag C

And here’s one that confused me for a bit, and then made me realize I have to redo a couple of things:

JP Z, a16   # If the Z flag is set, jump to given address
3 16/12     # Takes 16/12 clock cycles?
- - - -

16/12 cycles? What does that mean?

What it means

This is a conditional instruction that takes either 16 or 12 cycles, depending on the outcome of the Z flag check. If the check is true (the flag is set), the jump is taken and the instruction takes 16 cycles; otherwise it takes 12. That’s 4 extra clock cycles, or one machine cycle, which I assume comes from loading the program counter with the address given as the argument to the instruction.
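To make the 16/12 notation concrete, here is a minimal sketch of how the two costs could be modelled. The `ConditionalCycles` type and the `JP_Z_A16` constant are hypothetical names invented for this example; they are not part of the emulator.

```rust
/// Cycle cost of a conditional instruction, as listed in the opcode table.
/// Hypothetical helper type, used only for this illustration.
#[derive(Clone, Copy)]
pub struct ConditionalCycles {
    /// Clock cycles when the condition holds and the branch is taken.
    pub taken: u8,
    /// Clock cycles when the condition fails and execution falls through.
    pub not_taken: u8,
}

impl ConditionalCycles {
    /// Pick the cost that matches the outcome of the flag check.
    pub fn cost(self, condition_met: bool) -> u8 {
        if condition_met {
            self.taken
        } else {
            self.not_taken
        }
    }
}

/// `JP Z, a16` is listed as 16/12 in the table.
pub const JP_Z_A16: ConditionalCycles = ConditionalCycles {
    taken: 16,
    not_taken: 12,
};
```

Calling `JP_Z_A16.cost(flag_is_set)` yields 16 when the Z flag is set and 12 otherwise, a difference of one machine cycle.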

I wasn’t reading every instruction in the table very closely when I started this project, so the code was not made for this. I really should have foreseen it, but I was so eager to start coding things up that I forgot how processors work for a bit.

What needs to change

The CPU needs to somehow get information about whether or not extra cycles were needed when executing an instruction. I see two ways this could be implemented.

The tedious way

Remember the OpCode struct?

struct OpCode<'a>(
    /// Mnemonic
    pub &'a str,

    /// Function to call
    pub fn(&mut Cpu),

    /// Instruction size in bytes
    pub u8,

    /// Machine cycles
    pub u8,
);

One solution is to modify the function pointer type to a function that returns a u8, the number of cycles the instruction took. Then the Cpu struct can directly save this return value and delay for the appropriate amount of time. There is no timing code yet, so nothing is done with the information of how many cycles an instruction took, but it will be needed in the future.
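A rough sketch of what that could look like, under some loud assumptions: the `Cpu` here is a stripped-down stand-in with only the fields the example needs, the jump target is hard-coded instead of being read from memory, and the handler advances the program counter itself. None of this is the emulator’s actual code.

```rust
/// Stand-in for the real Cpu struct (assumed fields, illustration only).
pub struct Cpu {
    pub pc: u16,
    pub zero_flag: bool,
    pub machine_cycles: u8,
}

/// Same shape as the original OpCode, except the handler reports its own
/// cycle count, so the separate cycles field is dropped.
pub struct OpCode<'a>(pub &'a str, pub fn(&mut Cpu) -> u8, pub u8);

/// JP Z, a16 with a hard-coded target of 0x0150 (hypothetical; a real
/// handler would read the 16-bit operand from memory).
fn jp_z(cpu: &mut Cpu) -> u8 {
    if cpu.zero_flag {
        cpu.pc = 0x0150;
        16 // branch taken
    } else {
        cpu.pc = cpu.pc.wrapping_add(3); // skip opcode + 2 operand bytes
        12 // branch not taken
    }
}

pub const JP_Z: OpCode<'static> = OpCode("JP Z, a16", jp_z, 3);

/// The step function stores whatever the handler returned.
pub fn step(cpu: &mut Cpu, opcode: &OpCode) {
    let func = opcode.1;
    cpu.machine_cycles = func(cpu);
}
```

The cost is that every handler, conditional or not, has to return a number, which is exactly the tedious part.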

Why is this tedious? Well, now every already implemented instruction has to change its function signature and return how many cycles the instruction took. I have already implemented some 200 instructions, and even though most of them are generated by macros, the thought of going through them all and cross-referencing the table again does not exactly fill me with joy.

The easy, slightly hacky way

My current function for stepping through instructions looks like this:

pub fn step(&mut self) {
    let OpCode(_mnemonic, func, size, cycles) =
        opcodes::OPCODES[self.read_byte(self.registers.pc) as usize];
    func(self);
    self.registers.pc = self.registers.pc.wrapping_add(size as u16);
    self.machine_cycles = cycles;
}

The function simply extracts the instruction mnemonic, function, size, and number of cycles from the big table of instructions. It runs the instruction’s function, passing in itself, updates the program counter to point at the next instruction, and registers how many cycles it took. Looking at the code now, I see that it should execute func after updating the program counter, as otherwise the jump instructions will be off by as many bytes as they take up in memory. So that’s another oversight to add to the to-do list (see the update at the bottom). But what I want to focus on is the easy fix for the conditional cycles conundrum ¹.

Since the functions take in a &mut Cpu, they could modify the machine_cycles field before returning, and the step function could simply do a += assignment. As long as machine_cycles is set to 0 before calling func, this would work, and code would only have to be added to the relatively few instructions that suffer from the problem.

pub fn step(&mut self) {
    let OpCode(_mnemonic, func, size, cycles) =
        opcodes::OPCODES[self.read_byte(self.registers.pc) as usize];
+   self.machine_cycles = 0;
    func(self);
    self.registers.pc = self.registers.pc.wrapping_add(size as u16);
-   self.machine_cycles = cycles;
+   self.machine_cycles += cycles;
}
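On the instruction side, the change might look roughly like the sketch below. It only shows the cycle bookkeeping: the `Cpu` is a cut-down stand-in, the jump itself is left out, and `step_jp_z` and `JP_Z_BASE_CYCLES` are hypothetical names standing in for the table lookup.

```rust
/// Stand-in for the real Cpu struct (assumed fields, illustration only).
pub struct Cpu {
    pub zero_flag: bool,
    pub machine_cycles: u8,
}

/// Base cost from the table: the not-taken case of JP Z, a16.
const JP_Z_BASE_CYCLES: u8 = 12;

/// Cycle bookkeeping of a conditional jump; the jump itself is omitted.
fn jp_z(cpu: &mut Cpu) {
    if cpu.zero_flag {
        // Branch taken: add the extra 4 cycles on top of the base cost.
        cpu.machine_cycles += 4;
    }
}

/// Mirrors the modified step function, hard-wired to a single instruction.
pub fn step_jp_z(cpu: &mut Cpu) {
    cpu.machine_cycles = 0; // reset so extra cycles from the last step don't leak
    jp_z(cpu);
    cpu.machine_cycles += JP_Z_BASE_CYCLES;
}
```

Unconditional instructions never touch `machine_cycles`, so they end up with exactly the value from the table.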

This is a very easy fix, but for some reason it feels slightly hacky to have the instructions touch anything other than the registers (and, indirectly, the memory). I can’t quite put my finger on why. Maybe it is because the instructions now need knowledge of the internal workings of the Cpu struct. I have been considering wrapping a bunch of CPU functions in a trait for some abstraction, and this fix does not gel with that possibility for the same reason.

Why I chose the easy way

This is a project for having fun and learning a bit. I don’t really want to get bogged down with going over the table and cross-referencing everything I’ve implemented so far again. Besides, if each instruction returned how many cycles it took, that information would be spread out over several files. Taking the easy way ensures that all the information is available in a single place.

I may revisit this decision in the future, but for now I will leave it at that.

2022-02-16 Update: Not quite that simple

I wrote the above article late at night, and didn’t actually test my simple solution before going to bed. The actual solution ended up being slightly less simple, but I still didn’t have to go back and change all the code from before. I know I’ll have to in the future, though…

Anyways, I did make the branching instructions poke at the machine_cycles field if the branch was taken, but that was not quite enough. Originally I thought I had to move the increment of the program counter to before the func call. However, these and many other instructions take an argument, and my current implementation for retrieving that argument depends on the state of the program counter. Moving the increment caused loads of tests to fail because the instructions were given the wrong arguments.

So instead, I made the increment of the program counter conditional on a boolean that is set by any instruction that updates the program counter itself. That way, the step function will not add the instruction size on top of the jump target and end up off by a few bytes. I also save the instruction’s argument before executing it, in another new field in the Cpu struct, current_argument:

pub struct Cpu {
    // ...
+   current_argument: Option<Argument>,
}

Argument is just an enum that looks like this:

enum Argument {
    Byte(u8),
    Word(u16),
}
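Put together, the flow described in this update could be sketched as follows. This is a guess at the shape, not the actual code: the `pc_modified` field name, the tiny 16-byte memory, and the `read_word_argument` helper are all assumptions, and `step` is hard-wired to JP Z, a16 to keep it short.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Argument {
    Byte(u8),
    Word(u16),
}

/// Stand-in for the real Cpu struct (assumed fields, illustration only).
pub struct Cpu {
    pub pc: u16,
    pub zero_flag: bool,
    pub memory: [u8; 16],
    pub current_argument: Option<Argument>,
    /// Hypothetical name for the boolean: set by any handler that writes
    /// the program counter itself.
    pub pc_modified: bool,
}

impl Cpu {
    /// Read the little-endian 16-bit operand following the opcode at `pc`.
    fn read_word_argument(&self) -> Argument {
        let lo = self.memory[self.pc as usize + 1] as u16;
        let hi = self.memory[self.pc as usize + 2] as u16;
        Argument::Word((hi << 8) | lo)
    }

    /// JP Z, a16: jump to the saved argument only when the Z flag is set.
    fn jp_z(&mut self) {
        if let (true, Some(Argument::Word(target))) = (self.zero_flag, self.current_argument) {
            self.pc = target;
            self.pc_modified = true;
        }
    }

    /// One step, hard-wired to JP Z, a16 (3 bytes).
    pub fn step(&mut self) {
        // Fetch the argument while pc still points at the opcode.
        self.current_argument = Some(self.read_word_argument());
        self.pc_modified = false;
        self.jp_z();
        // Only advance past the instruction if the handler did not jump.
        if !self.pc_modified {
            self.pc = self.pc.wrapping_add(3);
        }
    }
}
```

Because the argument is fetched before the handler runs, the handler no longer cares where the program counter points while it executes.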

Thinking about it now, the OpCode struct’s function pointer can be changed to accept an Option<Argument> as an argument. That way the instructions don’t have to ask the Cpu for it by reading memory themselves. Dang it, I’m going to have to refactor everything and implement it with traits anyway, huh?
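A minimal sketch of that idea, assuming a cut-down `Cpu` with a single `a` register and using LD A, d8 (2 bytes, 8 cycles) as the example instruction. The handler name and the constant are made up for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Argument {
    Byte(u8),
    Word(u16),
}

/// Stand-in for the real Cpu struct (assumed field, illustration only).
pub struct Cpu {
    pub a: u8,
}

/// Same layout as the original OpCode, but the handler is handed its argument.
pub struct OpCode<'a>(
    pub &'a str,
    pub fn(&mut Cpu, Option<Argument>),
    pub u8,
    pub u8,
);

/// LD A, d8: load the immediate byte into register A.
fn ld_a_d8(cpu: &mut Cpu, argument: Option<Argument>) {
    // Anything other than a byte argument is ignored in this sketch.
    if let Some(Argument::Byte(value)) = argument {
        cpu.a = value;
    }
}

pub const LD_A_D8: OpCode<'static> = OpCode("LD A, d8", ld_a_d8, 2, 8);
```

The step function would then decode the argument once, based on the instruction size, and pass it along, for example `(LD_A_D8.1)(&mut cpu, Some(Argument::Byte(0x42)))`.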

¹ CCC for short, of course. Although you should always avoid alliteration.