Memory Mapping – Super Nintendo Entertainment System Features Pt. 09

This video is part 9 in a series about Super
Nintendo Entertainment System features. The SNES’s CPU allowed for a 24-bit address
space, and here we’ll see exactly how it is used and how each component gets mapped to
regions of memory. Afterwards, there will be a brief overview
of what the interrupt vectors are and how they are used. There are many components of the Super Nintendo
architecture that need to be accessible by the CPU. This includes things like memory–work RAM,
video RAM, object attribute memory, color graphics RAM, audio RAM, save RAM, any co-processor
memory; the game’s ROM image; any CPU, PPU, and co-processor control registers; or joypad
registers. As mentioned in a previous video, most of
these are Memory Mapped Input/Output devices. This means that a portion of the SNES’s address
space is reserved to access each region. The address space is 24 bits wide; this allows
for 2^24 or 16,777,216 different locations to allocate to any and all of these different
things. The benefit of memory mapping is that the
CPU can treat all of them the same. In part 7 about DMA, we saw that accessing
data from OAM, CGRAM, or VRAM required using a handful of special registers to write and
read data to and from these areas. However, with things like work RAM, save RAM,
or cartridge ROM, data can be read and written just by using a certain 24-bit address that
maps to that area. Take this code for example. Even though all of these instructions are
just referencing some location in the 24-bit address space, they refer to a PPU register,
work RAM, ROM, and save RAM respectively. This video will explain how the address space
is divvied up between all the different types of memory. Before looking at how memory is mapped, let’s
look at the address space itself first. Each location of this 24-bit address space
refers to a single 8-bit byte. So $000000 refers to the very first byte,
$000001 refers to the second byte, and so on. 256 bytes make up what is called a page. You can tell the page number by looking at
the highest 4 hex digits of the address. So page zero goes from $000000 to $0000FF,
page one from $000100 to $0001FF and so on. 256 pages make up one bank. So one bank is equal to 65,536 bytes, or 64
kilobytes. Again, you can tell the bank number by looking
at the highest 2 hex digits of the address. So bank zero goes from $000000 to $00FFFF,
bank one from $010000 to $01FFFF and so on. And then the entire address space is made
up of 256 banks. This covers the entire range from $000000
to $FFFFFF. 64 kilobytes times 256 banks equals 16 megabytes
of mappable regions. Here I’ve drawn the address space going from
top left to bottom right, but most diagrams you’ll find elsewhere will have this flipped
upside-down, so I’ll do that too for consistency. To make some things easier to digest later,
let’s split up the address space into four equal sized quadrants. Quadrant I goes from bank $00 to $3F, Quadrant
II from bank $40 to $7F, Quadrant III from bank $80 to $BF, and Quadrant IV from bank
$C0 to $FF. It will also be helpful to split all these
banks in half. The bottom half of a bank is from $0000 to
$7FFF, and the top half is from $8000 to $FFFF. This is why I flipped the diagram upside-down,
so that the top half is actually on the top. Okay, this is what we’ll be looking at for
the rest of the video, so please get comfortable with this diagram. It should be simple to identify regions of
the address space, such as the top half of Quadrant III, the bottom half of bank $70,
or the entirety of banks $FE and $FF. The memory and registers inside the SNES console
are mapped to the same region no matter what the mapping mode is. The first of these is the 128 kilobytes of
work RAM. 128 kilobytes works out to be exactly two
banks’ worth of space. And so work RAM is mapped to banks $7E and
$7F, and it will always be there no matter the mapping mode. Now the first 32 pages of work RAM, from $7E0000
to $7E1FFF, are special in that they are mirrored. A byte in some type of memory or register
is said to be mirrored if it can be accessed from more than one location in the address
space. In this case, those 32 pages can be accessed
from not only bank $7E, but all of the banks in Quadrants I and III. For example, accessing byte $7E1234 will access
the exact same byte as $001234 or $9A1234. This is done so that at least some parts of
work RAM can be accessed at the same time as accessing other things like ROM and CPU
registers. Otherwise, the instruction would have to specify
to access from bank $7E every time you write to or read from work RAM. The other things in the console are the PPU,
CPU, and joypad registers. Just like the work RAM mirror, these are mirrored
across all banks in Quadrants I and III. The PPU registers are mapped to various bytes
in page $21, while CPU registers are mapped to pages $42 and $43. The manual joypad registers are mapped to
bytes $16 and $17 in page $40 to remain backwards compatible with the NES, since this was the
same address used for that console. And that’s it. The rest of the unmapped space so far is available
for anything else to use. Any devices plugged into the cartridge slot
or the expansion port could listen for some or all of these addresses and respond with
the appropriate data. While there is hardly anything stopping a
developer from mapping whatever they want to these areas, there were some conventions
that were followed. Sidenote here, this diagram is drawn to scale
at the moment. To make viewing things easier, let’s expand
these smaller sections. Just keep in mind that these areas are smaller
than they look. The remaining space can be divided into two
particular regions. The first region being all of the parts of
the bottom halves of Quadrants I and III that aren’t taken up by the work RAM mirror and
hardware register mirrors. The second region being everything else that’s
left, which is the top half of Quadrants I and III, all of Quadrant IV, and the remaining
part of Quadrant II that doesn’t belong to work RAM. By convention, ROM data could only be mapped
to the second region. There is a special pin connected to the cartridge
slot called ROMSEL, short for ROM Select. This line is normally high and gets pulled
low when an address from the second region, the ROMSEL region, is being accessed by the
CPU. This line is used as sort of a shortcut to
let the cartridge know that the address on the bus at the moment could be a ROM address
and to send back some data over the data lines. This allowed for very simple circuitry for
cartridges that only contained a single ROM chip. Add in some SRAM under certain mapping modes,
or any sort of enhancement device and this line isn’t as helpful since these chips need
to look out for addresses outside the ROMSEL region. Speaking of mapping modes, let’s finally dive
into those. Most of the commercially licensed games released
for the Super Nintendo can be categorized into six different mapping modes. There are many different variations of each
mode depending on how large the ROM and SRAM chips needed to be, and if there were any
particular enhancement chips on the board. The simplest of the mapping modes is mode
0, sometimes known as mode 20 or LoROM. The first half bank of the ROM image is mapped
to the top half of bank $80, from $808000 to $80FFFF. The second half of the first ROM bank gets
mapped to the top half of bank $81. This pattern continues until the entire ROM
is mapped to the address space. The end result is that the ROM takes up the
top half of Quadrant III, and the top half of Quadrant IV as well if it is large enough. The largest ROM size supported in this mapping
mode is 4 megabytes. If the cartridge contains SRAM, it gets mapped
to the address space as well. It starts at address $F00000 and counts up
through the bottom half of the bank. 32 kilobytes of SRAM will run up to $F07FFF. Even larger amounts of SRAM will get mapped
to the bottom half of the following banks. Now generally, all of the area left in the
ROMSEL region will be some sort of mirror of the ROM or SRAM images. It really depends on how the circuit on the
cartridge is built, but in general, any leftover areas in the top halves of Quadrants III and
IV are ROM mirrors. The bottom half of Quadrant IV will also be
a ROM mirror. If the cartridge has SRAM, the remaining areas
of the bottom halves of banks $F0-$FF are SRAM mirrors. And then finally, Quadrants I and II are mirrors
of Quadrants III and IV. There is a small thing to point out here that
applies to all of the mapping modes. Quadrant II usually is a mirror of Quadrant
IV, except for the work RAM in banks $7E and $7F.
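
The LoROM layout described above can be sketched as a small address translation function. This is only an illustration under stated assumptions, not a description of any particular cartridge: the function name `lorom_rom_offset` and the `has_sram` flag are made up for this example, the ROM is assumed to be a full 4 megabytes (so no wrap-around for smaller images is modeled), and the bottom halves of banks $40-$6F and $C0-$EF are assumed to mirror the top half of the same bank.

```python
def lorom_rom_offset(address, has_sram=False):
    """Return the ROM offset a 24-bit CPU address maps to in LoROM, or None.

    Hypothetical helper: real boards differ in how they mirror leftover areas.
    """
    bank = (address >> 16) & 0xFF      # highest 2 hex digits
    offset_in_bank = address & 0xFFFF  # lowest 4 hex digits

    # Work RAM always owns banks $7E and $7F, whatever the mapping mode.
    if bank in (0x7E, 0x7F):
        return None

    # Quadrants I and II mirror Quadrants III and IV: drop the top bank bit.
    rom_bank = bank & 0x7F

    if offset_in_bank < 0x8000:
        # Bottom halves of Quadrants I and III hold the work RAM mirror
        # and the hardware registers, never ROM.
        if rom_bank < 0x40:
            return None
        # Bottom halves of banks $F0-$FF (and their mirrors) belong to SRAM.
        if has_sram and rom_bank >= 0x70:
            return None
        # Otherwise: assumed to be a mirror of the top half of this bank.

    # Each bank contributes one 32 KB slice of the ROM image.
    return (rom_bank << 15) | (offset_in_bank & 0x7FFF)


for cpu_address in (0x808000, 0x818000, 0x008000, 0xFFFFFF, 0x7E0000, 0x001234):
    result = lorom_rom_offset(cpu_address)
    shown = "not ROM" if result is None else f"ROM offset ${result:06X}"
    print(f"${cpu_address:06X} -> {shown}")
```

The shift by 15 rather than 16 is the whole trick: since only the top 32 kilobytes of each bank carry ROM, consecutive banks advance through the image in 32 kilobyte steps.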
Because work RAM occupies those two banks, a very large ROM image may have some parts that are only accessible
from Quadrant IV and not Quadrant II. The next mapping mode is mode 1, also known
as mode 21 or HiROM. The main difference between HiROM and LoROM
is that the banks aren’t broken in half and shoved into only the top half of the address
space. The first bank of the ROM image is mapped
to bank $C0, all the way from $C00000 to $C0FFFF. Then the next bank gets mapped to bank $C1,
and so on, all the way to bank $FF if necessary. With this mapping mode, the entire ROM image
is accessible from just Quadrant IV. The largest ROM size supported in this mapping
mode is also 4 megabytes. Now if the cartridge contains SRAM, it will
be mapped to Quadrant III. It starts at address $B06000 and counts up
through the bottom half of the bank. 8 kilobytes of SRAM will run up to $B07FFF. Even larger amounts of SRAM will get mapped
to the same pages of the following banks. And again just like before, the leftover area
will be a mirror of the ROM or SRAM. The leftover parts of Quadrant IV will mirror
the ROM area in banks before it. The top half of Quadrant III will mirror the
top half of Quadrant IV. If SRAM is present, it will get mirrored all
the way through bank $BF. Then it may or may not also be mirrored to
the same pages of banks $80-$AF. And finally, again, Quadrants I and II are
mirrors of Quadrants III and IV. With HiROM, the ROM banks are mapped contiguously
into the address space, but at the cost of not being able to see the work RAM mirrors
or hardware registers when working in one of these banks. If working in a bank in Quadrants I or III,
only the top halves of the ROM banks are accessible. The next three mapping modes were reserved
for cartridges with certain enhancement chips. These special chips will have entire videos
dedicated to them, so I won’t go into much detail here. Mapping mode 2, or 22, or 2A, was reserved
for use with special ROM mappers, such as the Super Memory Management Controller, or
Super MMC. A ROM mapper allows for an amount of ROM that
is larger than 4 megabytes to be dynamically mapped to the address space by only mapping
certain sections of the ROM at once. For example, a 6 megabyte ROM could be broken
into 6 sections that are 1 megabyte each, and then any 4 of these sections could be
mapped onto the entirety of Quadrant IV. The ID 22 was given to the S-DD1 chip, and
2A was given to the SPC7110 chip, both of which will be covered in a future video. Mapping mode 3, or 23, was reserved for use
with the Super Accelerator System, or SAS. The SAS includes the SA-1 chip, which emulates
the Super MMC from the previous mapping. It allows for ROM images of up to 8 megabytes,
of which 4 are selected like before. The SA-1 chip includes 2 kilobytes of internal
RAM that gets mapped to the address space, and the SAS also includes up to 128 kilobytes
of backup RAM which is also memory mapped. The exact memory mapping along with the great
capabilities of the SA-1 chip will be covered in a future video as well. Mapping mode 4, or 24, was reserved for use
with the GSU, also known as the SuperFX. Although all commercial games that used the
SuperFX chip were supposedly marked as using LoROM, the actual mapping is much different. Games that used the GSU could support up to
8 megabytes of ROM, although only 2 megabytes could be shared between both processors. The GSU also came with 128 kilobytes of backup
RAM, and up to 128 kilobytes of SRAM could be addressed too. More details about how these are laid out
in memory and how they are mirrored will be–you guessed it–covered in a later video. Now mapping mode 5, also known as mode 25
or ExHiROM, allows for up to 7.9375 megabytes of ROM without the use of the Super MMC mapper
or any other hardware. In this mode, Quadrants I and II aren’t mirrors
of Quadrants III and IV, which allows for more ROM to be mapped at once. The first 4 megabytes of the ROM are mapped
the same as HiROM, from bank $C0 to bank $FF. Then, any additional ROM gets mapped from
bank $40 onward to bank $7D. You can see why a full 8 megabytes aren’t
available, since work RAM takes up the last two banks of Quadrant II. SRAM is mapped the same way as HiROM, from
$B06000 to $BF7FFF. It gets mirrored the same way as well, filling
up Quadrants III and I. The top half of the ROM banks from Quadrant
IV are mirrored to Quadrant III, and similarly from Quadrant II to Quadrant I. The amount of ROM mapped right now is 7.875
megabytes, but an additional 64 kilobytes can be mapped to the top halves of banks $3E
and $3F. This requires a null buffer in the ROM image–what
would have been the bottom halves of banks $7E and $7F aren’t mapped to anything and
are inaccessible. There is also one last mapping mode that is
worth pointing out, but it is only used by homebrewers and ROM hackers. It is known as ExLoROM, and it is to LoROM
as ExHiROM is to HiROM. The ROM starts at $808000 and takes up the
top halves of Quadrants III and IV up to $FFFFFF. Then, it continues at $008000 and fills up
the top halves of Quadrants I and II up to $7DFFFF. This brings the maximum ROM size up to 7.9375
megabytes just like ExHiROM. Like LoROM, SRAM is mapped to the lower halves
of banks $F0-$FF, and mirrored to banks $70-$7D. Then the remaining parts of Quadrants II and
IV are mirrors of the ROM in the same bank. Even with all of these mapping modes, there
are still some spaces outside of the ROMSEL region that never get mapped to anything. Other enhancement chips that don’t require
a special mapping are free to utilize this space to communicate with the CPU. This includes things like the OBC-1, CX4,
ST018, S-RTC, or the DSP chips for example. However, they won’t end up mapping every single
remaining byte to something. Addresses that don’t map to anything are said
to be open bus. See, when a byte is retrieved from some address
outside of the CPU (like ROM or RAM), it is put into an intermediate register called the
memory data register, or MDR for short. So a byte from, say, ROM goes to the MDR,
then to the CPU register like the accumulator. Even the bytes that correspond to the next
instruction fetched from ROM go through the MDR first. If the address specified isn’t mapped to anything,
the MDR isn’t overwritten with a new value, so whatever was in the MDR previously will
be treated as the read value. In this example, the value loaded into the
accumulator is the $34 that was part of the operand of this very instruction. This open bus behavior can be very odd and
unintuitive, which is why it’s declared as undefined behavior and should be avoided. The entire address space can also be split
up based on the length of its wait states, or how quickly data can be read or written
at each location. The areas designated for work RAM and its
mirrors can be accessed at a speed of 2.68 MHz. The ROMSEL region in Quadrants I and II is
also accessible at 2.68 MHz, while that in Quadrants III and IV can be accessed at 2.68
MHz or 3.58 MHz. This depends on the lowest bit (bit 0) of CPU register
$420D, and only works if the hardware used in the game cartridge can support the shorter
access time. Games that take advantage of the faster ROM
access are said to be FastROM, while those that don’t are just SlowROM. Pages $20-$5F of the remaining banks can be
accessed at 3.58 MHz, except for pages $40 and $41 which are limited to only 1.79 MHz. And finally pages $60-$7F are accessible at
the standard 2.68 MHz. Games with FastROM should be sure to access
ROM in the faster region instead of the slower mirrors. Enhancement chips can also map their interface
to a faster or slower region depending on how quickly they can operate. The last thing to touch on in this video is
the interrupt vectors. The interrupt vectors are a set of 10 16-bit
pointers that define where in bank $00 execution should immediately jump to if an interrupt
occurs. They are found at the very end of bank $00. They are split into two groups of 5, one for
when the CPU is in 6502 emulation mode, and one for when it’s in 65816 native mode. The COP vector points to the routine that
will be run whenever a COP instruction is executed. The BRK vector points to the routine that
will be run whenever a BRK instruction is executed in native mode only. The ABORT vector is unused on the Super Nintendo,
but would have pointed to the routine that would be run whenever a page fault or memory
access violation occurred. The NMI vector points to the routine that
will be run at the start of the vertical blanking period when the non-maskable interrupt is
enabled. The RESET vector points to the routine that
will be run upon booting up or soft resetting the console. Note that the SNES always starts in emulation
mode, which is why this vector doesn’t exist in native mode. Finally, the IRQ vector points to the routine
that will be run whenever a hardware interrupt request occurs, and also when a BRK instruction is
executed in emulation mode. These vectors are located in the ROM, so the developer
must make sure that they will be mapped to the proper location in the address space. So for LoROM mapped games, the vectors appear
at offset $7FE4 in the ROM, while HiROM mapped games will have them at offset $FFE4. Similarly, ExLoROM mapped games will have
them appear all the way at offset $407FE4, while ExHiROM mapped games will have them
at offset $40FFE4. Thank you for watching. You may have noticed the complete absence
of anything related to the audio system in the memory map. The next video will be about the SNES’s audio
processor, the SPC-700, and how it communicates with the main CPU to produce sound effects
and music.
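
For readers who want to check the vector offsets quoted above against a ROM dump, here is a rough sketch. It assumes a headerless ROM image (no 512-byte copier header) and uses the standard 65816 vector addresses in bank $00, which are not listed in the transcript itself. The names `bank0_file_offset` and `read_vectors` are hypothetical helpers invented for this example.

```python
import struct

# Standard 65816 vector addresses in bank $00 (assumption: taken from the
# CPU's documented vector table, not from the transcript above).
NATIVE_VECTORS = {"COP": 0xFFE4, "BRK": 0xFFE6, "ABORT": 0xFFE8,
                  "NMI": 0xFFEA, "IRQ": 0xFFEE}
EMULATION_VECTORS = {"COP": 0xFFF4, "ABORT": 0xFFF8, "NMI": 0xFFFA,
                     "RESET": 0xFFFC, "IRQ/BRK": 0xFFFE}


def bank0_file_offset(cpu_address, mode):
    """Map an address in the top half of bank $00 to a ROM file offset."""
    if not 0x8000 <= cpu_address <= 0xFFFF:
        raise ValueError("only the top half of bank $00 holds ROM")
    if mode == "LoROM":
        return cpu_address - 0x8000             # first 32 KB of the image
    if mode == "HiROM":
        return cpu_address                      # top half of the first 64 KB
    if mode == "ExLoROM":
        return 0x400000 + cpu_address - 0x8000  # first 32 KB past 4 MB
    if mode == "ExHiROM":
        return 0x400000 + cpu_address           # first 64 KB bank past 4 MB
    raise ValueError(f"unknown mapping mode: {mode}")


def read_vectors(rom, mode):
    """Read all ten 16-bit little-endian vectors from a ROM image."""
    def fetch(table):
        return {name: struct.unpack_from("<H", rom, bank0_file_offset(addr, mode))[0]
                for name, addr in table.items()}
    return {"native": fetch(NATIVE_VECTORS), "emulation": fetch(EMULATION_VECTORS)}


# Tiny made-up 32 KB LoROM image whose RESET vector points at $8000.
demo_rom = bytearray(0x8000)
struct.pack_into("<H", demo_rom, bank0_file_offset(0xFFFC, "LoROM"), 0x8000)
demo_vectors = read_vectors(demo_rom, "LoROM")
print(f"RESET -> ${demo_vectors['emulation']['RESET']:04X}")
```

The vectors are only 16 bits wide because the handlers they point to always live in bank $00, which is why the ROM has to be mapped so that its vector table shows up at the end of that bank.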

  • I'm looking forward to the SPC-700 video, but I'm really looking forward to that SA-1 future video. "Our processor just can't handle this workload. I know, let's bolt a (virtually) identical chip with a faster clock onto the cart." Ahh, those were the good old days…

  • So, are you saying the SNES is actually a 24-bit console with a 15-bit graphics card? You heard right, 15-bit. Because it can only choose from roughly 33,000 colors. 16-bit would be over 65,000 colors. Although I bet the video card itself does have a 16-bit wide data bus.

  • Some Game Boy games might have had those null buffers too. I always wonder if they contain anything interesting. Even a ROM dump usually won't include them, unless it's done by extracting the actual chip.

    (The GB games would be any that use MBC1 chip with more than 1MB of ROM. But I'm not sure any exist… 😅)

  • I understood very little of this but you never fail to hold my interest. Will you be doing a video on more obscure expansion chips? Everyone knows about the SuperFX and the SA1, but there's a bunch of other, more situational chips which never seem to get any attention.

  • I can't wait until 2020 when you will cover the S-DSP and SPC700 and we will see the true extent of pitch modulation, white noise, the delay module, and what's up with that BRR compression.

  • This is super interesting and really helps me understand how mashups like Zelda/Metroid randomize works. One question for you: I thought Star Ocean used ExLoROM but you said it was never used by official games. What Map does Star Ocean use?

    Thanks again! I love the series!

  • Here's a more circuitry-based analysis of LoROM and HiROM

  • 16:38 Actually there's slightly more space because only the top half of WRAM is obstructing ROM, unlike in ExHiROM where all of WRAM is obstructing.

  • This is why I never got very far into learning ASM when ROM hacking Super Mario World as a kid 😅 this stuff went way over my head. I'm sure I could learn it much easier now as an adult, but that scene has pretty much died out. Excited for the SPC video!

  • So, there's H-Blank DMA and V-Blank interrupts, but no H-Blank interrupts, so one can't do anything too crazy between lines?

  • … so all of these mirrors are so if you're on a particular vertical line/bank, what you can access from there without changing banks?

  • Brilliant stuff. Hoping for Mega Drive, PC Engine, Neo-Geo in the future. Good to know how my favorite stuff work.

  • Imagine if Nintendo had actually gone through with full NES backwards compatibility. I wonder how they would've implemented the audio in that case.

  • Firstly, I love your videos, and especially the amount of effort you put into making not just very good, but pleasing diagrams. I’d have some questions though. Is the point of memory mirroring to improve access times? From Coroutines or what else?

  • I have a hard time entirely understanding these videos, but in every one I learn a little bit something new about the wonderful SNES. Keep up the good work.

  • Thank you so much for explaining the memory mapping! That was the confusing part of most of your prior videos having to do with RAM.

  • All this block mirroring stuff makes me wonder if there is something going on that I don't know about. For example, is using a 24-bit address sufficiently burdensome that trying to use a 16-bit address within the same bank is preferable? Besides the obvious fact that 24-bit memory instructions are 32-bits in size, and the difficulty of assembling a 24-bit address from two registers.

    That's the only reason I can come up with for why mirroring would be so frequently used, when you could just have more available storage in the ROM.

  • I have to wonder why there was never a MidROM configuration that uses the entire ROMSEL region to its fullest. You always lose 2MB of addressing space in either configuration. (4MB for the Ex**ROM variants)

  • I still dont get why the Address space wastes so many addresses on mirroring why mirror if the entire 24bit space is addressable? is it to save clock ticks on higher address access? Just dont get the logic.


    From what I got (I'm no expert, but I have at least some understanding of memory and programming) …. picture a Playstation Game on 4 discs. Say, FF8. As you go through the game, you can revisit older areas. Why? Because those older areas' data are mirrored on the later discs. If they didn't mirror the older areas' data, then you would get a "Please Insert Disc 1" every time you wanted to step into Balamb Garden, for example. The memory mapping works much the same way: If you're working in a specific area of the RAM, it is much slower to jump out of that section, and into another section to find the data you need. UNLESS, of course, you copied that data and mirrored it into several different sections. Now, they didn't mirror ALL of the data, only some of it, just like in FF8 …. dialogue and cutscenes, for example, the FMV video that plays during the assault at the beginning of the game only exists on Disc1, as does the FMVs where Squall is dancing with Rinoa. Why? Because you'll never see that cutscene past that point in the game, no point in copying that onto the other discs. But, Balamb Garden's assets need to be copied onto 3 of the discs, because it is possible to visit Balamb Garden at multiple points in the game. You don't want the player switching discs constantly, so you copy some of those assets that you think the player will be needing periodically throughout the game. Make more sense?

  • I'm glad you explained why all the different mapping modes exist. It seemed unnecessarily complicated to me when I read it (compared to the Genesis, for example, where the map is almost always the same), but having different modes for different tradeoffs w.r.t. data contiguity and access time makes a lot more sense.

  • The funny thing is that, Nintendo spent all this effort designing the memory map so programmers could use 16-bit addressing mode, yet 99% of programmers went with long addressimg mode for everything anyway.

  • One thing to add is that the cpu can have 24 bit instructions. But works faster if you restrict yourself to instructions using only 16 or 8 bit addresses. This makes it useful to layout the memory in such a complicated way

  • Wait, what!? 6502 emulation mode? I never heard about that before. You mean there's a way to execute 6502 assembly in the SNES CPU? What else is compatible with the NES? Not much I assume, otherwise we would have seen retrocompatibility back then. But I guess that explains some ports that are way too close to their NES counterparts, like Ninja Gaiden Trilogy. At least the logic could be ported, right?

  • When you flipped the graph at the beginning, my brain started to 'unflip' everything, making it nonsensical. Had to pause and look away for a minute.

  • Another great video, thank you. So, why does the SNES always start up in emulation mode? Is this because of the intended backwards compatibility, where if a NES/Fami cartridge was inserted, it would boot correctly, as those cartridges wouldn’t contain the code to switch the SNES mode if it started in native mode?

  • Wouldn’t it be funny if this guy at one point said «I have no idea what I’m talking about» and had been bamboozling us the whole time

  • The only thing that kneecaps the SNES is the CPU's 8-bit bus and low clock speed. Realistically, you'll never see it reach that 3.58 MHz max speed, under full load you'd be looking at closer to 2 MHz on average. But it was an intentional design choice by Nintendo, their whole philosophy with the SNES was to accentuate the use of custom chips in the game carts to build upon the stock console's capabilities and they certainly did that. In accomplishing what they set out to do, the SNES hardware can be best described as a "dependable solid foundation."

  • I hope one of those 9 dislikes be from somebody who worked in the manufacture and develop stage of this console. Yeah cuz you know…theres always the savy disliker who doesnt really know shit. lol

  • Now imagine, on NES you only have the first column of this graph, 0000-FFFF, so only 16-bit addresses. This 64kB area is then broken down similarly:
    0000-07FF – 2kB Work RAM
    0800-1FFF – Mirrors of Work RAM
    2000-2007 – PPU registers
    2008-3FFF – Mirrors of PPU registers
    4000-401F – APU (sound) & Joypad registers, Factory testing registers (locked without modding)
    4020-7FFF – Not mapped / Open bus (typically used by extra Work or SRAM and more advanced MMC chips)
    8000-FFF9 – Cartridge ROM (an MMC chip in the cart if any, is responsible for banking the larger ROM into this area)
    FFFA-FFFF – Interrupt Vectors

  • I might of missed this but where can you write to memory? like where is mario's lives/coins stored? everything seems to already be taken by mirrors or open bus space.

  • Please do a supplemental explanation for the Super Metroid/Link to the Past crossover randomiser so there's a decent explanation for linking literally anywhere.

  • Going from the VIC 20 memory map to this was not a good idea, it was enjoyable tho.
    not even planning to learn assembler, just wanna know more or less the ins and outs of 8 and 16 bit machines.

  • I’m not even gonna pretend I can understand everything you’re talking about, but I still can’t get enough of your videos. You are too good! Keep it up.

  • Why are so many people using decimal number to say hexadecimal numbers? If you say forty i hear decimal number but you actually mean sixty four. And then it gets inconsistent.
    When you get to 0xC0 it is no longer "Cty" but C 0.
    And if 0x59 it is "fifty nine" then 0x5A must be "fifty A", no?
