Sunday, July 07

Geek

Daily News Stuff 7 July 2019

Seven Of Seven Edition

Tech Thoughts

It's the Sunday of an extra-long weekend in the US and tech news is rather thin on the ground right now.  So here are some random thoughts about bit-banging video cards.

Those videos yesterday about building the world's worst video card made me wonder if there was something in the middle - between breadboarding an entire circuit from 74LS chips and just soldering resistors to a development kit and doing everything in software.

And, it turns out, there is one - exactly one - chip right in the middle.

http://ai.mee.nu/images/WorldsWorstGPU6.jpg?size=720x&q=95

The PIC32MX2 (PDF) is available in an old-fashioned 28-pin DIP (.3 inch, rather than the fat .6 inch type) so you can drop it into your breadboard right beside your quad J/K flip flops and your 8-input NAND gates and it won't look out of place.

But inside is a different story: It's a 50MHz 32-bit MIPS M4K with up to 256K of flash and 64K of RAM.  It has SPI and I2C and USB and DMA and counters and timers and analog/digital converters and all that stuff.  Though in the 28-pin package, which is the only one that you can plop into a breadboard, you have a total of 19 selectable I/O pins so all those features are multiplexed to hell.

It's $6.65 in Australia, quantity 1, for the exact version we want: The full 256KB of flash and 64KB RAM, 50MHz (there's a 40MHz version that is slightly cheaper, but that would screw up our video clocks), 28-pin DIP, and USB 2.0 support.

The way I'd probably approach this would be to design it to drive a standard 1080p monitor, but over VGA.  Monitors to support that are everywhere, and dirt cheap.

Unfortunately the 1080p60 pixel clock is about 150MHz (148.5MHz with the standard timings), which is far higher than we can reasonably drive with this except maybe in a monochrome character-cell mode.

http://ai.mee.nu/images/WorldsWorstGPU2.jpg?size=720x&q=95

But if we scale back to a retro-level 320x180 (1/6th the resolution horizontally and vertically) an 8-bit frame buffer takes 57600 bytes, neatly fitting in our 64K RAM.  And the pixel clock is 25MHz, half our CPU clock.  (Well, not exactly.  We might need to find a more precise crystal, like the guy in the second video.)
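
A quick sanity check on those numbers, as a minimal Python sketch.  It assumes the rounded 150MHz pixel clock used above, and the helper name `scaled_mode` is made up for the example.

```python
# Back-of-envelope numbers for a 1080p mode divided down by an integer factor.
# The 150 MHz pixel clock is the rounded figure from the text; the helper
# name is hypothetical.

FULL_WIDTH, FULL_HEIGHT = 1920, 1080
FULL_PIXEL_CLOCK_MHZ = 150.0


def scaled_mode(divisor, bytes_per_pixel=1):
    """Return (width, height, framebuffer_bytes, pixel_clock_mhz)."""
    width = FULL_WIDTH // divisor
    height = FULL_HEIGHT // divisor
    framebuffer_bytes = width * height * bytes_per_pixel
    # Each low-res pixel is held for `divisor` full-rate pixel periods.
    pixel_clock_mhz = FULL_PIXEL_CLOCK_MHZ / divisor
    return width, height, framebuffer_bytes, pixel_clock_mhz


width, height, fb_bytes, clock_mhz = scaled_mode(6)
print(f"{width}x{height}: {fb_bytes} bytes, {clock_mhz} MHz pixel clock")
print(f"RAM left over from 64K: {64 * 1024 - fb_bytes} bytes")
```

With a divisor of 6 this gives 320x180, 57600 bytes and 25MHz, leaving 7936 bytes of the 64K for everything else.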

And if we drop to 1080p30 for our timings, the pixel clock is 12.5MHz, which is the sort of speed you can mess about with as a hobbyist and have a chance of getting something working.


What I would do (if I were to do this, which, since it requires next to no soldering, I just might) is this:
  • A 57,600-byte framebuffer - or maybe slightly larger so we can scroll it around using "hardware" registers.

  • Once every six output lines, the CPU reads the next input line from the frame buffer.

  • Each byte is referenced in a 512-byte lookup table that converts it to a 16-bit value.

  • The 16-bit value is written to a 640-byte line buffer.

  • DMA streams the 640 byte line buffer a byte at a time to an 8-bit I/O port connected to a set of resistor ladders, at 25MHz, double our pixel clock.

    That's a lot, but if the internal memory is 32 bits and the DMA can buffer it appropriately, it will only be using 1/8th of the available bandwidth.

  • The resistor ladders convert six bits into 64 main colours with the extra two bits common across the LSB of each channel.  So you have 64 real colours, 16 grey levels, and some other inbetween colours.

    Though I'm sure if I look this up there's probably someone who can show that with two extra resistors and a diode you can have logarithmic output that looks 97% better.

  • Here's the clever bit (well, I think it's clever): Normally the 512 byte lookup table simply converts an 8-bit logical colour into an 8-bit physical colour, and we send the same pixel value twice.

    But we can also treat part or all of the lookup space as a lookup table for a pair of 4-bit pixels instead of one 8-bit pixel, and return two different output values - and get up to 16 colours at 640x180, with the display mode switchable at any even-numbered column (that is, any standard-resolution column).

    You can switch back and forth between resolutions in the middle of a line as often as you want.  Even the Amiga needed a couple of blank scan lines to pull off that trick.

http://ai.mee.nu/images/WorldsWorstGPU3.jpg?size=720x&q=95
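
The lookup-table steps in the list above can be modelled in a few lines.  This is a Python sketch rather than firmware: the names `pack`, `build_lut` and `expand_line` are invented here, and marking individual byte values as "two 4-bit pixels" is just one possible reading of how the lookup space might be split.

```python
# Sketch of the per-line expansion step, using plain Python lists in place
# of the chip's RAM. All names and the example colour values are invented.

WIDTH = 320  # framebuffer pixels (bytes) per line


def pack(first, second):
    """Pack two 8-bit output values into one 16-bit table entry."""
    return (first << 8) | second


def build_lut(split_mode_bytes=()):
    """Build the 256-entry (512-byte) table.

    Normal entries send the same physical colour twice. Entries listed in
    split_mode_bytes treat the byte as two 4-bit pixels instead; here each
    nibble is simply used as the physical colour, standing in for a real
    16-colour palette.
    """
    lut = [pack(value, value) for value in range(256)]
    for value in split_mode_bytes:
        lut[value] = pack(value >> 4, value & 0x0F)
    return lut


def expand_line(framebuffer_line, lut):
    """Turn 320 framebuffer bytes into the 640 bytes streamed to the DAC."""
    line_buffer = bytearray()
    for value in framebuffer_line:
        entry = lut[value]
        line_buffer.append(entry >> 8)    # first output pixel
        line_buffer.append(entry & 0xFF)  # second output pixel
    return bytes(line_buffer)


lut = build_lut(split_mode_bytes=[0x3A])
line = bytes([0x07, 0x3A] + [0x00] * (WIDTH - 2))
out = expand_line(line, lut)
print(len(out), list(out[:4]))
```

Byte 0x07 comes out as the same colour twice, while 0x3A comes out as two different pixels, so the resolution switch costs nothing extra per pixel: it is all in the table contents.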

You could use 720p30 instead, but the table on Wikipedia shows it as having the same pixel clock as 1080p30.  I have no idea why, but unless that's wrong it would actually make things a lot harder.

Doing all this stuff would use almost all the RAM and an unknown percentage of CPU, DMA, and timer resources on our $6 chip, so the idea would be to add a second $6 chip to be our "actual" CPU, and connect them over SPI, which is serial but plenty fast for this.

Of our 19 programmable I/O pins on our GPU, we need 8 for pixel data, two for sync, and four for SPI, leaving a whole 5 for whatever we want!
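
The pin budget as a tiny Python sketch, using only the counts from the paragraph above (the dictionary labels are just for illustration):

```python
# Pin budget for the 28-pin part, using the counts from the text.
TOTAL_IO_PINS = 19

pin_use = {
    "pixel data": 8,
    "sync": 2,
    "SPI link to the CPU": 4,
}

used = sum(pin_use.values())
spare = TOTAL_IO_PINS - used
print(f"{used} pins used, {spare} spare")
```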

We could construct ourselves a fancy audio chip in the same way, though that might be possible using the CPU alone.  Each chip has two SPI interfaces so the CPU can control two other chips, and you could potentially daisy-chain them.  Although I'm not sure if you can use dual SPI and USB.

Anyway, at the end of all that you have a rather nice little retro-computer: 64KB main RAM and 64KB graphics RAM is adequate, you have an 80 column mode (though only 22 lines, oh well...)  And it plugs into any cheap 1080p LCD that supports VGA.
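
The 80x22 figure falls out of the 640x180 mode if you assume an 8x8 pixel font.  The post doesn't name a glyph size, so that part is an assumption, as is the helper name `text_grid`:

```python
# Text grid for the 640x180 mode, assuming an 8x8 pixel font (an assumption;
# the post does not name a glyph size).
GLYPH_WIDTH, GLYPH_HEIGHT = 8, 8


def text_grid(width, height):
    """Return (columns, rows, unused scan lines) for a given pixel mode."""
    columns = width // GLYPH_WIDTH
    rows, leftover = divmod(height, GLYPH_HEIGHT)
    return columns, rows, leftover


print(text_grid(640, 180))
```

That gives 80 columns and 22 rows, with 4 scan lines left over at the bottom.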

You have 256KB of flash storage directly in the CPU to hold the kernel, Basic interpreter, and your code, and another 256KB sitting in the graphics chip that we can turn into a virtual floppy drive.

And very much unlike computers of the 80s, it runs at 50 goddam megaHertz.

And for all that it's just two chips on a breadboard (plus a bunch of resistors, oscillators, VRMs, capacitors, and all that little annoying stuff, including probably a CPLD for something that it turned out no we couldn't do in software).

http://ai.mee.nu/images/WorldsWorstGPU4.jpg?size=720x&q=95

Now, if we didn't need to stick to the breadboard - if we were going to actually make a hundred of these and put them up on eBay - if we happened to have a brother (hi K!) who had an entire frickin' warehouse (okay, it's a small warehouse) full of surface-mount parts and the parts for making use of surface-mount parts - we could do a couple of things differently:
  • The 44-pin SMD version of the PIC32MX2 gives us a total of 31 programmable I/Os, so we could do 15-bit colour output.  (We might - just - be able to squeeze 12-bit colour out of the 28-pin version, but I'm betting there's something I've forgotten that will already eat the 5 pins we have left.)

    Still 256 colours maximum at 320x180, but out of 32,768.

    We could also do 160x180 half-resolution mode at full 15-bit colour.

  • Since we're doing SMD anyway, there's a couple of more powerful options available.

    If we are willing to break our budget and go up to $9.96 (again A$ qty 1) we can get a PIC32MX4 part with twice the flash (nice), twice the RAM (nicer), and running up to 120MHz.

    If we push our pixel clock up to 37.5MHz (running the CPU at 3x that) we can do 480x270 low-resolution and 960x270 high-resolution.  Well, actually, no.  480x270 leaves us with just 1472 bytes of RAM.  If we don't want to go insane we'll need to leave ourselves a small margin and add some black bars top and bottom - let's say 480x250.

  • But for another 36¢ we can switch parts to the PIC32MK family - still MIPS architecture, though a different core - and get 1MB flash and 256KB RAM, 120MHz, for $10.32.

    Now we can tackle 640x360!  And it has six SPI ports, and they're twice as fast as the original version.

  • The next step up is the PIC32MZ.  At $14.56 we get 200MHz operation, 1MB flash, and 512KB of RAM, enough (maybe) to hit 960x540.  There are even 250MHz options with 2MB flash if we need that.

  • And then Microchip laughs at our endeavours, because the next step beyond that isn't a sweet 80s retro-computer, it's a sweet 80s Unix workstation.

    At $25.62 the PIC32MZ2025DAH169-I/6J - let's call him Ted - Ted runs at 200MHz, has 2MB flash, 256KB of SRAM, a built-in graphics controller including video timing and multi-mode blitter, and 32MB of DRAM.

    Basically they're saying: Yes, very cute, now here's a real processor.

http://ai.mee.nu/images/WorldsWorstGPU5.jpg?size=720x&q=95
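
To see how tight each of those options is, here is a rough Python sketch of frame buffer size against on-chip RAM at one byte per pixel.  The RAM sizes are the ones quoted in the list, and the helper name `ram_left` is hypothetical.

```python
# Frame buffer size versus on-chip RAM for the modes discussed above,
# at one byte per pixel. RAM sizes are the ones quoted in the text.

KB = 1024

candidates = [
    # (part, RAM bytes, width, height)
    ("PIC32MX2", 64 * KB, 320, 180),
    ("PIC32MX4", 128 * KB, 480, 270),
    ("PIC32MX4", 128 * KB, 480, 250),
    ("PIC32MK", 256 * KB, 640, 360),
    ("PIC32MZ", 512 * KB, 960, 540),
]


def ram_left(ram_bytes, width, height, bytes_per_pixel=1):
    """Bytes remaining after the frame buffer is allocated."""
    return ram_bytes - width * height * bytes_per_pixel


for part, ram, width, height in candidates:
    spare = ram_left(ram, width, height)
    print(f"{part:9} {width}x{height}: {spare:6} bytes spare "
          f"({100 * spare / ram:.1f}% of RAM)")
```

The 480x270 row is where the 1472 bytes comes from, and 960x540 on the PIC32MZ is nearly as tight, which is presumably why that one gets a "maybe".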

Update: Speaking of character cell mode, which I was at some point, turns out it's a lot harder than I thought.  It's not that complicated, but it's not a very efficient task to perform on a CPU.  It is easy and efficient to do in hardware, which is why we had character cell video cards before we had pixel-addressable ones.

I wanted to add a 640x360 16-colour character cell text mode, but it looks like just refilling the line buffer would use about 50% of the CPU.  (I'm doing MIPS instruction times in my head so I could be off by a factor of two, but it's still a lot.)

But what I did figure out how to do is a 640x360 16-colour graphics text mode.  It requires an extra 512 bytes for the lookup table but is otherwise just as efficient as 320x180 graphics mode, because that is generating double the number of pixels it needs to (so it can dynamically switch to high resolution) and this is generating twice as many pixels per clock.

Basically, each byte in the frame buffer in this mode specifies a palette (one of 16) and four pixels (each 0 or 1).  Each pixel can be one of two colours, but for each group of four pixels you can choose which two colours.  It's not the same freedom as character cell mode would be, but it works well enough.  (And our GPU can update the available palettes on the fly just like the Amiga did, if you really want to.)
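
A minimal Python sketch of that byte format.  The bit layout (palette in the high nibble, four pixels in the low nibble, leftmost pixel in bit 3) and the palette contents are assumptions for illustration, since the post doesn't pin them down; the function names are invented too.

```python
# One reading of the packed mode: high nibble selects one of 16 palettes,
# low nibble holds four 1-bit pixels (leftmost pixel in bit 3). The bit
# layout and the palette contents are assumptions for illustration.

def make_palettes():
    """16 palettes, each a (background, foreground) pair of output colours."""
    return [(0x00, 0x10 + index) for index in range(16)]


def decode_byte(value, palettes):
    """Expand one framebuffer byte into four output colour bytes."""
    background, foreground = palettes[value >> 4]
    pixels = value & 0x0F
    return bytes(
        foreground if pixels & (0x8 >> position) else background
        for position in range(4)
    )


def build_lut(palettes):
    """Precompute all 256 expansions: 256 entries x 4 bytes = 1024 bytes."""
    return [decode_byte(value, palettes) for value in range(256)]


palettes = make_palettes()
lut = build_lut(palettes)
print(list(lut[0x25]))                   # palette 2, pixel bits 0101
print(sum(len(entry) for entry in lut))  # table size in bytes
```

On this reading the table grows from 512 to 1024 bytes, which would account for the extra 512 bytes mentioned above, and 640x360 at four pixels per byte is the same 57600-byte frame buffer as before.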

The same trick can give us 320x360 medium-resolution and 640x180 high-resolution modes with 16, 32, 64, or 128 colours.

http://ai.mee.nu/images/WorldsWorstGPU1.jpg?size=720x&q=95

I'm still working out how to do dual playfields (320x180 only) and sprites.  Amiga-style blitter objects are easy enough as long as you have enough free RAM...  Which we kind of don't since our frame buffer uses 88% of our RAM before we do anything.

I stuck with MIPS throughout this thought experiment because I can get started with those 28-pin DIP packages, but Microchip also sell a broad range of low-cost Arm microcontrollers.  A 120MHz Cortex M4 with 1MB flash and 256KB RAM, a more capable DMA controller, built-in SDHC and Ethernet, and floating point (!) should handle this just as well as the PIC32MK and costs...

Wait, I lost my Mouser tab...  Cheeky bastards, mouser.com.au is NOT the same as au.mouser.com...  

A$7.49 qty 1.  So 84¢ more than our starting point for around twice the performance, four times the memory, Ethernet support, a small non-volatile RAM, a 4KB CPU cache, an FPU, and, well, stuff.  The datasheet is 2100 pages.


Tech News



Video of the Day

Seeing Antifa's difficulty getting their flag-burning on, I put forward my own modest proposal: A bill that requires all American flags to be made out of 100% natural nitrocellulose.





Disclaimer: Love laughs at locksmiths.  Component pricing laughs at hardware hackers.

Posted by: Pixy Misa at 09:14 PM







