This past October marked 20 years since Dolphin's initial public release as an experimental GameCube emulator. It's been a long ride, with twists and turns. I don't know if anyone back in 2003 expected Dolphin not only to still be under active development 20 years later, but to also support the GameCube's successor in the Wii.
You might be wondering, where is all the pageantry? The honest truth is that things aren't ready yet. We have a few massive changes on the horizon that we wanted to be ready for the 20th anniversary, but that date was not an excuse to release something in a broken and incomplete state. For now, development will continue as normal, but we promise that there is some excitement to be had on the horizon.
In the meantime, we have some great changes for you in this Dolphin Progress Report!
The waiting game is over. After much discussion and work, AdmiralCurtiss bit the bullet and added a dark mode to Dolphin. Some of you may be wondering right now, "Wait, didn't Dolphin already have dark mode?" Yes. Also no.
Essentially, on macOS, Android, and Linux, Dolphin has supported dark mode for years. On those platforms, if your system is configured to dark mode, Dolphin will switch to darker colors to match without requiring any input from the user. It is completely seamless on macOS and Android, and has been since early 2020, nearly four years ago. Dolphin on Linux has supported dark mode for even longer; however, it may or may not work out of the gate depending on the packaged Qt defaults and other Linux quibbles. If you're a Linux user, you know the dance.
However, the above fails to mention a very significant operating system that we support - Windows. As that is nearly half of our userbase, a lot of our users were missing out on nihilistic bliss. There were many reasons why we didn't support dark mode on Windows until now, but the core of the matter is our desktop GUI toolkit - Qt.
While Qt 5 supported automatic dark mode switching on macOS and Linux, it did not respond to the Windows dark mode setting at all. But there was a lot of demand for the feature, so Qt promised that a solution was in progress. For us as an application that uses Qt, we could have bodged together some sort of dark mode switching ourselves, but it would have inevitably been jank and weird. So upon hearing that Qt was working on it, we decided to wait for their solution. That eventually came with Qt 6.5, which indeed delivered automatic dark mode switching on Windows as promised... in their new "Fusion" style. The Windows-matching QWindowsVistaStyle Dolphin has used for years was abandoned and will not be getting dark mode at all. Qt gave many reasons for their decision (spoiler: Windows be Windowsing), but fundamentally, they did what they thought was best given the circumstances.
With Qt's unexpected move to a new style, we had to make some decisions. Do we adopt the new Fusion style, affecting all of our Windows users both light and dark, or do we use the switching they built into Qt 6.5 and create something ourselves? So people experimented, with nearly a half dozen pull requests of all kinds! We tried Fusion, but the look of the style proved to be unpopular. So we started experimenting with custom styles of our own, with multiple people going different directions and bikeshedding for months.
The end result is a hand-crafted dark mode alternate QWindowsVistaStyle created by our very own AdmiralCurtiss. It is what we had hoped Qt's default support for dark mode on Windows would have been - Dolphin on Windows, but dark.
Every platform we support now has dark mode and automatic switching! It switches automatically by default, but on Windows you can also manually select between our dark and light styles, and even custom user styles, with the new Style dropdown in Config → Interface. This replaces the old Custom User Style control, combining our official styles and custom user style controls in one place.
For years now, whenever a user has asked us for settings that would give them the best image quality possible from Dolphin, there's been a familiar back and forth. In response to the question, we'd tell them to set Dolphin's Internal Resolution to match their screen (following our recommendations in the GUI), set Anisotropic Filtering to x16, and use 4x or 8x SSAA (Super Sample Anti-Aliasing).
On a sufficiently-capable gaming PC, that combination will give pristine visual quality. However, users would counter our recommendation by saying that since SSAA runs at a higher resolution than the screen and uses downsampling for its resolve, why can't they just do the same by raising Dolphin's Internal Resolution higher than their screen? And we'd reply by reminding them that Dolphin didn't have its own output resampling, and the basic downsampling from the GPU meant that their idea would create more aliasing rather than less.
This song and dance has happened dozens of times, enough that we would remind users about this whenever new Internal Resolution options were added. However, as of this change, things are different. Dolphin now has its own Output Resampler! With Area, Bicubic, Sharp Bilinear and more, users can now take upscaling and downscaling into their own hands. In fact, as far as we know no other game, emulator, or even GPU driver implements such an advanced and comprehensive resampler!
Has this changed our recommendation for the best possible visual settings? Not really. While you can use output resampling to create a "DIY SSAA" that can meet SSAA's visuals, hardware SSAA is faster at equivalent fidelity (on an Nvidia GPU). SSAA is an easy recommendation.
However, performance was never the point. Output Resampling is an extremely powerful new tool that was built to give users control of Dolphin's final look. From an especially soft and temporally stable image, to razor sharp pixels, to the highest fidelity imaginable, our new output resampling features can do it all! There's a lot to cover here, so let's get started.
What is Output Resampling?
Before we continue, let's briefly go over the basics for any readers who are unfamiliar with these topics. Whenever rendering at a resolution that doesn't match the render window, the pixels of the game's output must be resampled (effectively remapped) into a new image that matches the new pixel grid. For example, a 1x Native (640x528) frame in a 1920x1080 canvas must be scaled up to fill the canvas. As it is not a 1:1 ratio and not even an integer multiplier, we can't just directly translate pixels from our output onto pixels of the screen, it must be resampled during upscaling. With the familiar Bilinear upscaling, it will fill the canvas by interpolating the source pixels into the new pixel grid, for a consistent, though soft, resulting image.
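To make the idea concrete, here is a minimal sketch of bilinear resampling on a tiny grayscale image. It is written in Python purely for illustration and is not Dolphin's implementation, which runs on the GPU. The function name and the list-of-rows image format are assumptions of this example.

```python
import math


def bilinear_resample(src, dst_w, dst_h):
    """Resample a grayscale image (a list of rows) to dst_w x dst_h."""
    src_h, src_w = len(src), len(src[0])

    def clamp(i, hi):
        return max(0, min(i, hi))

    out = []
    for y in range(dst_h):
        # Map the centre of the destination pixel back into source space.
        fy = (y + 0.5) * src_h / dst_h - 0.5
        y0 = math.floor(fy)
        ty = fy - y0
        row = []
        for x in range(dst_w):
            fx = (x + 0.5) * src_w / dst_w - 0.5
            x0 = math.floor(fx)
            tx = fx - x0
            # Fetch the four nearest source pixels, clamping at the edges.
            p00 = src[clamp(y0, src_h - 1)][clamp(x0, src_w - 1)]
            p10 = src[clamp(y0, src_h - 1)][clamp(x0 + 1, src_w - 1)]
            p01 = src[clamp(y0 + 1, src_h - 1)][clamp(x0, src_w - 1)]
            p11 = src[clamp(y0 + 1, src_h - 1)][clamp(x0 + 1, src_w - 1)]
            # Blend horizontally, then vertically.
            top = p00 * (1 - tx) + p10 * tx
            bottom = p01 * (1 - tx) + p11 * tx
            row.append(top * (1 - ty) + bottom * ty)
        out.append(row)
    return out


# A hard black/white edge becomes a soft ramp when stretched from 2 to 4 pixels.
print(bilinear_resample([[0, 100]], 4, 1))
```

The ramp in the output is the "consistent, though soft" look described above: new pixels are blends of their neighbours rather than copies of them.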
The GameCube and Wii were built with analog displays in mind, where exactness simply didn't exist. Every game is close to a standard aspect ratio but subtly nonstandard, and frustratingly uniquely subtly nonstandard. Plus, widescreen is achieved by just... making the pixels wider, not adding more of them, so that means non-square pixels are a factor too! To accurately recreate these games on modern displays, we have no choice but to account and adjust for these behaviors, which we do by resampling the image. As such, any time you play a game in Dolphin, the image you are seeing will have been resampled at least a little in all cases.
Resampling is unavoidable in Dolphin.
Without an internal output resampler, Dolphin has been entirely reliant on host GPU's provided resampling for this important task. It's... not amazing. The GPU Bilinear Resampler was designed to be fast and simple, and not necessarily to have the best visual resolve.
Despite all this, resampling hasn't been a huge issue for us. After all, Dolphin's GPU load is pretty low* and we are always* CPU limited, so with GPU power to spare*, users should just run at the Internal Resolution that we recommend for their screen and call it a day. With modern devices, even a phone can do that*. And they probably even have enough spare GPU power to add antialiasing for an even better result! Despite all those asterisks, that is still usually the case, if all you care about are high resolutions and crisp antialiased lines. However, not everyone wants that. And even in 2023, there are still cases where not everyone can achieve that. This is where Output Resampling comes into its own.
What are the resample options?
Our Output Resampler is extremely comprehensive with many different possible settings and use cases. Here are some of the options and some ways that you can use them.
Area is a resampler that samples every pixel on the screen for the maximum possible sample count. It is built for downsampling, with a soft resolve to combat aliasing, shimmer, and moire issues inherent to downscaling. It can be used for upsampling though, to curious results.
Upsampling isn't really what Area is intended for, but it does give an interesting result - an extremely sharp look, akin to Sharp Bilinear. However, we recommend Sharp Bilinear over using Area this way, as Sharp Bilinear is explicitly designed for upscaling pixel art titles and has lower GPU usage than Area. Experiment with it yourself and compare!
Downsampling is where Area shines. The frick ton of samples and softened resolve allows Area to avoid the downsample aliasing that Bilinear and other resamplers experience. In fact, by combining the area resampler with MSAA and an Internal Resolution much higher than your screen, you can create a DIY SSAA.
However, SSAA is a graphics driver feature that is optimized with your GPU to make it surprisingly efficient, for a brute force technique anyway. A DIY SSAA cannot be faster than a hardware optimized solution! But by being within the users' control, Area provides a lot of interesting downsampling options that we have not had before. Furthermore, SSAA is not so optimal everywhere and may not even be available for everyone. More on that later.
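As a rough illustration of why an area filter resists downsampling aliasing, the sketch below shrinks a one-dimensional row of pixels in two ways. `area_resample_1d` averages every source pixel under each output pixel, weighted by coverage, while `point_sample_1d` keeps a single source sample per output pixel. Both function names are hypothetical, and point sampling is only a simplified stand-in for a resampler that reads too few samples; neither is Dolphin's shader code.

```python
def area_resample_1d(samples, dst_n):
    """Coverage-weighted average of the source samples under each output pixel."""
    scale = len(samples) / dst_n  # source pixels covered by one output pixel
    out = []
    for i in range(dst_n):
        start, end = i * scale, (i + 1) * scale
        total = 0.0
        j = int(start)
        while j < end and j < len(samples):
            # Length of source pixel j that falls inside [start, end).
            overlap = min(end, j + 1) - max(start, j)
            total += samples[j] * overlap
            j += 1
        out.append(total / scale)
    return out


def point_sample_1d(samples, dst_n):
    """Naive downsample: keep one source sample per output pixel."""
    scale = len(samples) / dst_n
    return [samples[int(i * scale)] for i in range(dst_n)]


stripes = [0, 100] * 4  # fine alternating detail, 8 pixels wide
print(point_sample_1d(stripes, 4))   # the bright stripes vanish entirely
print(area_resample_1d(stripes, 4))  # every stripe contributes to the result
```

With point sampling, which stripes survive depends on where the samples happen to land, which is what shows up as shimmer and moire in motion. The area average is stable because no source pixel is skipped.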
Bicubic uses cubic spline interpolation for its resolve. It comes in three flavors: B-Spline (soft), Catmull-Rom (sharp), and Mitchell-Netravali (medium).
B-Spline is an exceptionally soft upsampler. With extreme temporal stability and minimal noise, B-Spline can give an extremely clean and stable result.
On the opposite end of the spectrum is Catmull-Rom, an extremely sharp upsampler.
Mitchell-Netravali sits in between these two extremes.
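These three flavors are commonly described as members of one family of cubic filters controlled by two parameters, B and C. The sketch below uses the conventional textbook values for each name (B-Spline: B=1, C=0; Catmull-Rom: B=0, C=0.5; Mitchell-Netravali: B=C=1/3). Whether Dolphin's shaders use exactly these values is an assumption here, and the function names are illustrative only.

```python
def bc_spline(x, b, c):
    """Mitchell-Netravali family cubic filter evaluated at distance x."""
    x = abs(x)
    if x < 1:
        return ((12 - 9 * b - 6 * c) * x**3
                + (-18 + 12 * b + 6 * c) * x**2
                + (6 - 2 * b)) / 6
    if x < 2:
        return ((-b - 6 * c) * x**3
                + (6 * b + 30 * c) * x**2
                + (-12 * b - 48 * c) * x
                + (8 * b + 24 * c)) / 6
    return 0.0


# Conventional (B, C) pairs for each named flavor; assumed, not taken from Dolphin.
PRESETS = {
    "B-Spline": (1.0, 0.0),
    "Catmull-Rom": (0.0, 0.5),
    "Mitchell-Netravali": (1 / 3, 1 / 3),
}


def tap_weights(t, b, c):
    """Weights of the four nearest samples for a point at fractional offset t in [0, 1)."""
    return [bc_spline(t + 1, b, c), bc_spline(t, b, c),
            bc_spline(1 - t, b, c), bc_spline(2 - t, b, c)]


for name, (b, c) in PRESETS.items():
    weights = tap_weights(0.5, b, c)
    print(name, [round(w, 4) for w in weights])
```

The printed weights hint at the character of each filter: B-Spline's weights are all positive, so it only ever blends and smooths, while Catmull-Rom's outer weights are negative, which exaggerates contrast at edges and reads as sharpness.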
Sharp Bilinear was previously added to Dolphin as a post processing filter, and now has joined the suite of output resamplers. As we have already covered Sharp Bilinear quite recently, we won't be going into detail here. But in summary, Sharp Bilinear is a best-of-both-worlds combination of Bilinear and Nearest Neighbor, allowing nearest neighbor pixel clarity without its shimmering artifacts. It is primarily designed for upsampling low resolution sprites, but it can be used to upsample 3D graphics if you want unnaturally sharp jaggies. It is not intended for downsampling.
When is Output Resampling Better than Existing Solutions?
Output Resampling is primarily focused on providing visual options for players, and it typically isn't outright superior to existing solutions. However, it is very powerful, and during testing, we found several scenarios where Output Resampling is able to create a better result than what was possible before.
EFB Copy Brute Forcing
Some games use a large EFB effect that is particularly expensive, and have to render it at a fraction of the game's output resolution. A classic example is the mirrored surface of Fountain of Dreams in Super Smash Bros. Melee.
We don't have to worry about the hardware limitations that lead to these decisions, but the consequences of their choices affect us even today: as these effects are rendered proportional to the game's rendering resolution, they scale proportionally too. No matter what Internal Resolution multiplier we use, the mirrored pool in Fountain of Dreams will always be 1/8th of that. As such, the only way to increase the resolution of these effects relative to the screen (without modding the game) is to increase the Internal Resolution way beyond the screen resolution. But as we have already talked about thoroughly in this section, this will introduce aliasing with the GPU bilinear resampler. Users simply had to deal with this - it was either aliasing or a low resolution reflection.
Until now! Using our Output Resampler (specifically Area), we can raise the Internal Resolution as much as we want without adding aliasing!
Upsampling to a Close Resolution
As mentioned previously, resampling is basically unavoidable in Dolphin due to quirks of the GameCube and Wii. However, if you use the Internal Resolution recommendations we have in our GUI (such as 3x Native for a 1080p screen), the GPU Bilinear Resampler will be good enough that switching to our Area resampler will give only marginal improvements even for pixel peepers. But not everyone has a system powerful enough to run at our recommendations for their screens, and they may have to settle for the highest Internal Resolution their hardware can manage. Depending on their circumstance, our Output Resampler could help them achieve a better image than was possible before.
A good example of this is resampling a 2x Native (1280x1056) 16:9 title into a 1920x1080 canvas. 2x Native nearly matches a 1080p screen vertically (1056 → 1080), but it has significantly fewer pixels horizontally (1280 → 1920).
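The mismatch between the two axes is easy to check with a little arithmetic. This small sketch computes the per-axis scale factor for the example above; the helper names are made up for illustration.

```python
NATIVE_W, NATIVE_H = 640, 528  # 1x Native, as quoted earlier in this report


def internal_resolution(multiplier):
    """Framebuffer size for a given Internal Resolution multiplier."""
    return NATIVE_W * multiplier, NATIVE_H * multiplier


def axis_scale(src, dst):
    """Horizontal and vertical stretch needed to fit src into dst."""
    return dst[0] / src[0], dst[1] / src[1]


sx, sy = axis_scale(internal_resolution(2), (1920, 1080))
print(f"horizontal x{sx:.3f}, vertical x{sy:.3f}")
```

The vertical axis is stretched by only about 2%, while the horizontal axis is stretched by 50%, so the quality of the resampler matters far more in one direction than the other.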
This benefit of our Output Resampler is highly situational, but it can improve Dolphin's image quality on lower end systems. Experiment and see what works best for you!
Downsampling on Platforms where SSAA is Unviable
SSAA has pristine visual quality and surprisingly good performance for a brute force solution. However, that doesn't matter if your system doesn't support SSAA! In desktop land we have been spoiled by SSAA being near universal, but Dolphin's mobile users are now over half of our userbase, and SSAA is uncommon on mobile SoCs. Furthermore, SSAA may be present but not optimal. Adreno in particular hates SSAA! A DIY SSAA made with Output Resampling may be just what a user needs on those platforms.
Unfortunately, our Output Resampler is not yet available on Dolphin Android. We tried our best to hunt down some non-Android devices that lack SSAA or have poor SSAA performance, but the best we had on hand was a 2019 Surface Pro X. While its Adreno GPU absolutely hated SSAA as predicted, it was too weak to serve as a good demonstration.
Once this feature arrives on Android, we have loads of powerful Android tablets to try this on. Stay tuned!
Now that we have proper downsampling, we've decided to increase the maximum Internal Resolution exposed by default in our desktop Qt GUI from 8x Native to 12x Native (8K). This will allow users of 4K displays to increase the Internal Resolution beyond their screen for downsampling purposes without editing INIs. For the twelve of you with an 8K panel, now you don't need to set MaxResolution at all! Unless you want to. 8K users tend to have silly hardware, so do whatever you want.
However, just because Dolphin shows 12x Native in the GUI does not mean that your computer can achieve it. 12x Native is 48,660,480 pixels, or 48.7 megapixels. Currently most computers will run out of VRAM before reaching 12x Native! Statistically speaking, your computer will not run this resolution at full speed. Use this power responsibly!
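For a sense of scale, the pixel count at each multiplier can be worked out directly from the 640x528 native size. The memory column below is only a back-of-the-envelope figure that assumes a single buffer at 4 bytes per pixel; real VRAM use is higher because it involves several buffers, depth, and any antialiasing samples.

```python
def pixel_count(multiplier):
    """Pixels in one frame at the given Internal Resolution multiplier."""
    return (640 * multiplier) * (528 * multiplier)


BYTES_PER_PIXEL = 4  # assumption: one 32-bit colour buffer, nothing else

for m in (1, 3, 6, 8, 12):
    px = pixel_count(m)
    mib = px * BYTES_PER_PIXEL / 2**20
    print(f"{m:>2}x Native: {640 * m}x{528 * m} = {px:,} px, ~{mib:.0f} MiB per buffer")
```

Because the pixel count grows with the square of the multiplier, 12x Native has 2.25 times as many pixels as 8x Native, not 1.5 times.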
This change only applies to our desktop Qt GUI, and does not affect Dolphin on Android. We do not need to explain this.
If you're in the TAS community or relying on constant savestate usage for your project, this is a huge optimization that might make your life a lot easier. Malleo noticed that Dolphin was using LZO for savestates, which has a good compression ratio, but is quite a bit slower than LZ4 when it comes to decompression. By moving Dolphin's savestate compression to LZ4, savestate loading is now ~72% faster.
For the casual users, savestate loads are now snappier. For speedrunners this will make practicing a single trick over and over again take a little less time. But for people using hundreds, thousands, or tens of thousands of savestate loads for TAS creation or AI projects, the savings really start to add up.
To run game code that's been designed for the GameCube and Wii's PowerPC CPU, Dolphin's JIT translates PowerPC machine code into machine code that your computer can run natively. Once a chunk of code has been translated, the resulting code block is stored in Dolphin's JIT Cache so that the code doesn't have to be translated all over again if it needs to run again later.
Usually when we optimize the JIT, the improvement is in how the code gets translated. But this time, krnlyng has improved how quickly Dolphin can start executing the next block after a block is done running. Or to be more specific, how quickly Dolphin can do it in the case where it doesn't know in advance which block comes next.
To find out what block to run, Dolphin has a large table that maps emulated code addresses to the corresponding block of translated code. Before krnlyng's changes, this table was 256 KiB, and doing lookups in the table worked as follows:
- Dolphin grabs the lowest 16 bits of the current emulated code address. This results in a number from 0 to 65,535.
- The number calculated in the previous step is used as an index in the table. Dolphin reads the table at that index and gets a pointer to JIT block metadata.
- Dolphin checks that the pointer isn't 0, and then reads the JIT block metadata. If the pointer is 0, the code hasn't been translated yet, so Dolphin has to go and translate it before it can continue.
- Dolphin checks that the emulated code address stored in the JIT block metadata matches the current emulated code address. Sometimes, the address might not match because two different emulated addresses have been mapped to the same index in the table. In that case, Dolphin has to use a much slower piece of code to find the right block.
- Dolphin checks that the IR and DR bits of the MSR register match. We'll skip the details, but this is very similar to the previous step.
- Dolphin gets the pointer to the translated code from the JIT block metadata, and jumps to it.
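The six bullets above can be modelled in a few lines. This is a simplified Python model of the described steps, not Dolphin's C++ code: `JitBlock`, `OldBlockMap`, and the `all_blocks` dictionary standing in for the "much slower piece of code" are all hypothetical names, and strings stand in for pointers to translated code.

```python
from dataclasses import dataclass


@dataclass
class JitBlock:      # hypothetical stand-in for the JIT block metadata
    address: int     # emulated code address the block was translated from
    msr_bits: int    # IR/DR bits in effect when the block was translated
    code: str        # stand-in for the pointer to the translated code


class OldBlockMap:
    def __init__(self):
        self.table = [None] * (1 << 16)  # one slot per 16-bit index
        self.all_blocks = {}             # stand-in for the slower full search

    def insert(self, block):
        self.all_blocks[(block.address, block.msr_bits)] = block
        self.table[block.address & 0xFFFF] = block  # may overwrite a colliding block

    def lookup(self, address, msr_bits):
        block = self.table[address & 0xFFFF]   # bullets 1 and 2: index by low 16 bits
        if block is None:                      # bullet 3: nothing translated yet
            return None
        if block.address != address or block.msr_bits != msr_bits:  # bullets 4 and 5
            block = self.all_blocks.get((address, msr_bits))        # slow path
            if block is None:
                return None
        return block.code                      # bullet 6: jump to the translated code


blocks = OldBlockMap()
blocks.insert(JitBlock(0x80001234, 0b11, "block A"))
blocks.insert(JitBlock(0x80011234, 0b11, "block B"))  # same low 16 bits as block A
print(blocks.lookup(0x80001234, 0b11))  # found, but only via the slow path
```

The two addresses in the usage example differ only above bit 15, so they compete for the same slot. That collision, plus the extra metadata read on every lookup, is the cost the new table is meant to remove.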
After the changes, Dolphin has a new table that's 32 GiB in size, and doing lookups in it works as follows:
- Dolphin takes the entirety of the current emulated code address and prepends the MSR bits that were mentioned earlier. This results in a number from 0 to 17,179,869,183.
- Dolphin reads the table at the index calculated in the previous step and gets a pointer directly to the translated code.
- Dolphin checks that the pointer isn't 0. Like before, a 0 means that Dolphin has to stop what it's doing and go translate the code.
- Dolphin jumps to the pointer.
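Modelled the same way, the new lookup collapses to a single table read. In this sketch a Python dictionary plays the role of the huge, mostly empty table, and `None` plays the role of a 0 pointer. The key layout (MSR bits placed directly above a 32-bit address) follows the description above, but the exact bit packing in Dolphin is not spelled out here, so treat `make_key` as an assumption.

```python
ADDRESS_BITS = 32  # width of an emulated code address


def make_key(address, msr_bits):
    """Prepend the two MSR bits to the full emulated address."""
    return (msr_bits << ADDRESS_BITS) | address


class NewBlockMap:
    def __init__(self):
        self.table = {}  # sparse stand-in for the 32 GiB table

    def insert(self, address, msr_bits, code):
        self.table[make_key(address, msr_bits)] = code

    def lookup(self, address, msr_bits):
        # One read; None means "not translated yet, go translate it".
        return self.table.get(make_key(address, msr_bits))


blocks = NewBlockMap()
blocks.insert(0x80001234, 0b11, "block A")
blocks.insert(0x80011234, 0b11, "block B")
print(blocks.lookup(0x80001234, 0b11), blocks.lookup(0x80011234, 0b11))
print(f"largest possible key: {make_key(0xFFFFFFFF, 0b11):,}")
```

Since every address and MSR combination gets its own key, two blocks can never land on the same entry, and there is no metadata left to compare against.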
This not only gets rid of the problem of different emulated addresses sharing the same index in the table, but also lets us skip reading the JIT block metadata entirely! But you may have an important question: How the heck is Dolphin going to fit a 32 GiB table into RAM when many computers and all phones have less total RAM than that?
In reality, Dolphin isn't asking the operating system for 32 GiB of real memory. Instead, it sets up 32 GiB of address space and asks the operating system to only allocate memory as needed. The first time Dolphin tries to access any given section of the 32 GiB table that isn't backed by real memory, the operating system allocates a small chunk of memory on the fly and fills it with zeroes. In the end, this new table can use a few more megabytes of memory than the old table, but nowhere even near 32 GiB.
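The behaviour described here can be imitated in plain Python to show why the cost stays small. The class below is a simulation only: it reserves nothing from the operating system, and the 4 KiB page size, the class name, and its methods are assumptions made for the example. It hands out a zero-filled page the first time any offset inside that page is touched and counts how much is actually backed.

```python
PAGE_SIZE = 4096  # assumed page size; real systems vary


class LazyZeroedTable:
    """Simulates a huge reserved range whose pages appear on first access."""

    def __init__(self, size_bytes):
        self.size = size_bytes
        self.pages = {}  # page index -> bytearray, created on demand

    def _page_for(self, offset):
        if not 0 <= offset < self.size:
            raise IndexError("offset outside the reserved range")
        index = offset // PAGE_SIZE
        if index not in self.pages:
            self.pages[index] = bytearray(PAGE_SIZE)  # fresh pages are all zeroes
        return self.pages[index]

    def read_u64(self, offset):
        # Offsets are assumed to be 8-byte aligned, so an entry never spans pages.
        start = offset % PAGE_SIZE
        return int.from_bytes(self._page_for(offset)[start:start + 8], "little")

    def write_u64(self, offset, value):
        start = offset % PAGE_SIZE
        self._page_for(offset)[start:start + 8] = value.to_bytes(8, "little")

    def resident_bytes(self):
        return len(self.pages) * PAGE_SIZE


table = LazyZeroedTable(32 * 2**30)       # "32 GiB" of address space
table.write_u64(8 * 1_000_000, 0xDEADBEEF)
table.write_u64(2**34, 0xC0FFEE)
print(table.read_u64(8 * 1_000_000) == 0xDEADBEEF, table.resident_bytes())
```

Two entries written billions of bytes apart cost two pages in total. Untouched regions cost nothing, and any entry that was never written reads back as 0, which the lookup already treats as "not translated yet".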
In an ideal world, that would be all we have to say about the new solution. But for Windows users, there's a special quirk. On most operating systems, we can use a special flag to signal that we don't really care if the system has 32 GiB of real memory. Unfortunately, Windows has no convenient way to do this. Dolphin still works fine on Windows computers that have less than 32 GiB of RAM, but if Windows is set to automatically manage the size of the page file, which is the case by default, starting any game in Dolphin will cause the page file to balloon in size. Dolphin isn't actually writing to all this newly allocated space in the page file, so there are no concerns about performance or disk lifetime. Also, Windows won't try to grow the page file beyond the amount of available disk space, and the page file shrinks back to its previous size when you close Dolphin, so for the most part there are no real consequences... unless you like to download large files while running Dolphin.
We will look into improving the situation on Windows, but for the time being, please don't be alarmed if you see a sudden decrease in available disk space when running Dolphin.
When playing certain games for an extended time, Dolphin's JIT Cache has an annoying tendency to run out of space. The most prominent examples are games like Metroid Prime 2: Echoes, where the game can dynamically load code into different places in memory. An even more difficult case is N64 Virtual Console games like The Legend of Zelda: Ocarina of Time, which use a recompiler to generate code on the fly. Once either the FarCode or NearCode cache is full, they must both be flushed, wiping both old and current data in the cache. This forces the JIT to rebuild all the code the game currently is running on the fly, leading to a noticeable stutter that cannot be avoided.
A huge step that mostly remedied this situation was merged back in 2021, which allowed Dolphin to evict code from the JIT Cache that the game itself invalidated. However, it wasn't a perfect solution. Dolphin was reliant on the game invalidating code, but sometimes games just don't, especially if they happen to be a trashfire like True Crime: New York City. Also, it couldn't actually defragment evicted code, meaning that the longer the game was running, the more fragmented the caches would become. If there was no longer enough space to fit a chunk of code, a full flush would be required. Back in 2021, we mentioned that the ability to just use a bigger JIT Cache was still on the table, but we were worried about its ramifications and hoped it would not be necessary.
Fast forward to today, and dreamsyntax made a proposal to expand the JIT Cache. As part of it, they showcased a game that blew away all of our tricks: Shadow the Hedgehog. It generates just enough code to create a consistent, rhythmic, annoying stutter, with the JIT Caches filling up and flushing every 15 to 30 minutes.
dreamsyntax showed that enlarging the JIT Cache completely eliminated the JIT Cache flushes in Shadow the Hedgehog, and showed many other games that benefitted from it as well. They made a compelling case, so with a little hesitation, we pulled the trigger. As of this change, the JIT Cache has been enlarged!
However, all of the things we were worried about with a larger JIT Cache, such as longer, more severe stutters, just didn't happen. The JIT Cache flush stutters that we have observed take the same amount of time, yet they happen MUCH less frequently, if ever. In fact, the potential to have a cache flush stutter is now pushed so far into a play session that it is outside the usual play session for most users! Players may never see a JIT Cache flush stutter again as of this change! It makes us feel a little silly that we were so hesitant to enlarge the cache.
So far it appears that this is the JIT Cache flush stutter solution we have all been waiting for. Hopefully that holds true, as any further improvements to the JIT Cache from here will be much harder.
Note: This change only applies to our x86-64 JIT, and does not affect our AArch64 JIT for ARM systems. Due to architectural differences, raising the AArch64 JIT Cache size beyond 128 MiB would require special solutions.
In the latest version of SteamOS, the gyro is disabled whenever SteamOS thinks the gyro is not being used. This includes situations like opening the Steam Deck's menus, which users might do quite frequently! For Dolphin, this would make the gyro suddenly stop working for unknown reasons, with no clear way to reenable it. We're not sure why this behavior changed, but we have no choice but to find some way to deal with it.
ArcaneNibble came up with a clever workaround. Rather than try to detect when the gyro is enabled/disabled, Dolphin will try to re-enable the gyro roughly every second or so. If it's already enabled? No harm, no foul. If it's disabled, it'll usually get re-enabled quickly, hopefully before the player even noticed it was disabled.
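The pattern is simple enough to sketch: instead of tracking state, issue an idempotent "enable" request on a timer. Everything named below (`GyroKeepAlive`, `FakeGyro`, the one-second default) is hypothetical and only mirrors the description above; it does not use any real SteamOS or Dolphin API.

```python
class FakeGyro:
    """Test double standing in for the real device."""

    def __init__(self):
        self.enabled = True
        self.enable_calls = 0

    def enable(self):
        # Idempotent: calling this while already enabled changes nothing.
        self.enabled = True
        self.enable_calls += 1


class GyroKeepAlive:
    """Re-sends the enable request at most once per interval."""

    def __init__(self, enable, interval=1.0):
        self._enable = enable
        self._interval = interval
        self._last = None

    def poll(self, now):
        """Call regularly with the current time in seconds."""
        if self._last is None or now - self._last >= self._interval:
            self._enable()
            self._last = now
            return True
        return False


gyro = FakeGyro()
keep_alive = GyroKeepAlive(gyro.enable)
keep_alive.poll(0.0)
gyro.enabled = False          # the system switches the gyro off behind our back
keep_alive.poll(0.5)          # too soon, nothing is sent
keep_alive.poll(1.0)          # the interval has elapsed, enable is sent again
print(gyro.enabled, gyro.enable_calls)
```

The appeal of this design is that it never needs to know why or when the gyro was disabled, at the cost of up to one interval of lost input.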
This will make playing some of your favorite Wii games on the latest version of SteamOS a lot more enjoyable, without you having to worry about the motion controls suddenly breaking for previously unknown reasons.
5.0-20097 and 5.0-20109 - Allow Widescreen Heuristic to be Modified Per-Game by OatmealDome and Billiard
Because Dolphin emulates consoles from the 4:3 to 16:9 transition, players can expect to encounter many different aspect ratios as they play their library. So that users don't have to change our aspect ratio setting with each game, we have a widescreen heuristic that detects how wide of an image the game is rendering and sets Dolphin accordingly.
Unfortunately, being a heuristic, Dolphin is effectively making an informed guess as to what aspect ratio the game wants. And sometimes, a game is able to fool our heuristic and create wildly inconsistent results!
Unfortunately, we couldn't fix this by just changing what threshold the heuristic uses, as this would create false negatives - situations where the heuristic should have changed the aspect ratio but didn't. Regressions would be inevitable. Fortunately, there is a very simple solution available to us - our GameINIs. As of this change, the parameters used by our widescreen heuristic can be overridden by our GameINIs. If we know a game is problematic for our heuristic, we can tweak how the heuristic behaves for that game without affecting any other games!
With that said, none of our GameINIs have had any parameters added yet. For the time being, you'll still run into problems in games like Pokémon Colosseum, but now that we have a workable system we can move forward with figuring out what parameters are needed for each problematic game.
Sometimes playing certain games in Dolphin on a touchscreen feels like an impossible challenge that would require another couple of hands and maybe a few extra fingers. For our touchscreen gamers, ThunderousEcho comes in with the ability to change various buttons on the touchscreen to be toggled instead of having to hold them down. This is known as a latching button, and is effectively the same as a toggle button in Dolphin's input system for physical controllers.
Simply tap the button once to have it "held" and then tap it a second time to release it. This is a simple addition, but it opens up the touchscreen controls a lot to be able to play more games on a touchscreen device. If you don't have a controller for your phone/tablet, or just prefer the single device form factor, latching might be the feature you need to play your favorite game.
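As a small illustration of the difference between the two behaviours, here is a sketch of a touch button that can act either as a normal hold button or as a latching one. The class and method names are invented for this example and are not taken from Dolphin's Android code.

```python
class TouchButton:
    def __init__(self, latching=False):
        self.latching = latching
        self.pressed = False  # state reported to the emulated controller

    def touch_down(self):
        if self.latching:
            self.pressed = not self.pressed  # each tap flips the state
        else:
            self.pressed = True

    def touch_up(self):
        if not self.latching:
            self.pressed = False  # a normal button releases with the finger


button = TouchButton(latching=True)
button.touch_down()
button.touch_up()
print(button.pressed)  # still held after the finger lifts
button.touch_down()
button.touch_up()
print(button.pressed)  # second tap releases it
```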
If you're a veteran of the Dolphin Progress Reports, you're well aware of the eternal battle with PanicAlerts and the seemingly random deadlocks they can cause in combination with certain settings. Thankfully, another one bites the dust as JosJuice fixes a hang that could happen when pressing the "Ignore for this session" button in a PanicAlert.
Is this the end of the PanicAlert deadlocks? Has our lovable villain finally been felled in such an anti-climactic manner? Find out next time on Progress Report Z.