SOC Design

Friday, July 15, 2005

Al's Story

My friend and neighbor Al has been in the electronics business a bit longer than I have. Two weeks ago, I spent a week traveling all over Western Nevada with Al, so we had a lot of time in the car together. Al told me that one of his first jobs was to develop a transistor tester to characterize the hand-made junction transistors his company was getting from RCA, Philco, and other vendors. Transistor-to-transistor device characteristics varied wildly back then because transistor manufacture was pure alchemy in the 1950s.

Another job Al had in that era was to linearize some signals. Al told me it was pretty tough to do this back then. All he had to work with was resistors, capacitors, diodes, and discrete transistors.

Last week, I read about a new microcontroller from Zilog called the Z8 Encore XP, which would have made this job much easier. The Z8 Encore XP is based on the 8-bit Z8 processor core, which I recall using to design the electronics for a total-organic-content water analyzer back in the early 1980s.

Today's Z8 processor comes packaged in an 8-lead SOIC and costs around a dollar. In addition to the processor core, the Z8 Encore XP includes four 10-bit A/D converters and 4K bytes of flash memory on chip. You program it in C. If I could go back 50 years to tell my friend Al that I could do a piecewise linearization of four of his signals to about 0.1% for a buck (that would be linearization for 25 cents per signal), and that I could get a prototype running in a day, I wonder what he'd say.
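For the curious, here's a rough idea of what that piecewise linearization looks like in C. This is a minimal sketch, not production firmware: the eight-segment breakpoint table is hypothetical and would really be filled in from calibration measurements of the sensor at hand.

#include <stdint.h>

/* Minimal sketch of table-driven piecewise linearization for one 10-bit
   A/D reading (0..1023), split into eight 128-count segments. The
   breakpoint values below are hypothetical placeholders. */

#define SEGMENTS 8

static const uint16_t cal[SEGMENTS + 1] = {
    0, 140, 265, 380, 505, 640, 770, 895, 1023   /* corrected outputs */
};

uint16_t linearize(uint16_t raw)      /* raw ADC code, 0..1023 */
{
    uint16_t seg  = raw >> 7;         /* which 128-count segment */
    uint16_t frac = raw & 0x7F;       /* position within the segment */
    uint16_t y0   = cal[seg];
    uint16_t y1   = cal[seg + 1];
    /* interpolate: y0 + (y1 - y0) * frac / 128 */
    return y0 + (uint16_t)(((uint32_t)(y1 - y0) * frac) >> 7);
}

Each channel's table costs all of 18 bytes, so four of them fit easily in 4K bytes of flash, and an 8-bit core handles the arithmetic without breaking a sweat.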

The point of telling you Al's story is to remind you that no technology stands still. Consequently, your approach to system design can't stand still either. We all chuckle a bit about Al's predicaments of 50 years ago because we have far more advanced ways of dealing with the problems he had to solve using much cruder tools.

However, is Al's situation still funny when you're using the same SOC design techniques that you used 10 years ago?

To infinity and beyond!

The title quote is from Buzz Lightyear, but the topic of this post is David Lammers' cover article on 65nm process technology in the July 11 EE Times. The article covers the 2005 Symposium on VLSI Technology, held last month in Kyoto, Japan. A few points in this excellent article caught my eye with respect to system design.

Point one: 65nm technology buys you a cool 10 million transistors per square millimeter! An economical chip is around 100 square millimeters, which works out to one billion transistors on the chip. Even a cheap 5x5mm chip made with 65nm technology carries 250 million transistors. Hand-coding enough RTL to fill these chips will take, like, "To infinity and beyond!" You'd better get ready to find a more efficient way to design chips.

Point two: If you sniff at point one and say that leading-edge 65nm process technology is only for high-priced, cutting-edge products, read the statement from Mark Pinto, Chief Technology Officer at Applied Materials: "Demand from China is only going to grow—and 65nm is absolutely ideal for consumer chips aimed at growing markets."

Point three: Srini Raghvendra, Senior Director of Design For Manufacturing at Synopsys, said: "Design productivity, measured in terms of gates per engineering workday, must improve fourfold at 65nm over the 130nm node." Do you have a plan to achieve that?

Point four: Process technology, especially at the 65nm node and future nodes, will no longer provide the automatic power reductions that "classical" Moore's-Law scaling has delivered for the past 40 years. "Addressing the problem requires architectural, system-level decisions," said Eric Filseth, a Cadence marketing manager.

Point five: Hardware/software codesign becomes more crucial at 65nm. "Teams must start on software creation at the same time that RTL design commences," according to Tohru Furuyama, general manager of R&D at Toshiba's SOC engineering center in Kawasaki, Japan.

Point six: Mask costs for 65nm chips are estimated at $3 million. That's still small compared to the cost of designing a chip with as many as a billion transistors, but it's not an insignificant sum. You'd better have some good simulation models that will run your application code before you tape out a mask set.

Point seven: Process variations can occur across a single die at the 65nm node, which means that functionally correct chips can still be out of spec. To weed these out, you will need more at-speed testing, which means more built-in self-test (BIST) because otherwise, you can count on leaving these chips on testers for an hour apiece. Do you have a plan to add BIST to your designs? You'd better.
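For those who haven't bumped into BIST, the heart of a logic-BIST pattern generator is a linear-feedback shift register (LFSR): a few flip-flops and XOR gates that spray pseudo-random test patterns into the logic under test at full clock speed. Here's a small C model of one, strictly a sketch for illustration; the 16-bit width, tap positions, and seed are just one common maximal-length choice, not anything specific to a particular test architecture.

#include <stdint.h>
#include <stdio.h>

/* C model of a 16-bit Fibonacci LFSR (taps at bits 16, 14, 13, 11),
   a classic maximal-length configuration. In silicon this is just
   flip-flops plus XOR gates, so it makes patterns at full clock rate. */
static uint16_t lfsr_next(uint16_t s)
{
    uint16_t bit = ((s >> 0) ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1u;
    return (uint16_t)((s >> 1) | (bit << 15));
}

int main(void)
{
    uint16_t pattern = 0xACE1u;           /* any nonzero seed works */
    for (int i = 0; i < 8; i++) {         /* first few test patterns */
        printf("pattern %d: 0x%04X\n", i, pattern);
        pattern = lfsr_next(pattern);
    }
    return 0;
}

In a real BIST scheme, a companion register (a MISR) compresses the circuit's responses into a signature that's compared against a known-good value, so the chip can pass or fail itself in milliseconds instead of occupying a tester for an hour.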

All of these issues are addressed by processor-centric SOC design. If you haven't yet read Engineering the Complex SOC by Chris Rowen and Steve Leibson, this would be a good time to do so. We keep a nice writeup on the book on the Tensilica Web site if you need more information.

Monday, July 11, 2005

Rumors of RISC's demise somewhat premature

This week in EE Times, Ron Wilson writes about a keynote given by MIPS CEO John Bourgoin addressing the rumored demise of RISC processors (MIPS' main product). When the RISC one-instruction-per-clock concept originated at IBM in the 1970s, processor clock cycles and memory access times were near parity, so the RISC concept made sense for minicomputer and mainframe processor design. RISC processor design eschews complex, multiclock processor instructions and essentially replaces microcode ROM with single-cycle instruction caches and optimizing compilers.

Even the most successful CISC instruction set ever invented, that of the 8086 microprocessor and its descendants, became "RISC-ified" in the 1990s. Inside modern x86 processors, an instruction chopper/shredder (see the end of the movie "Galaxy Quest" for a vivid visualization of this device) finely slices the CISC x86 instructions into simpler operations that are then distributed to one or more RISC engines hidden deep inside the machine.

Today's problem with the RISC concept, which Wilson addresses, is that memory access times are now much longer than processor clock cycles, at least for DRAMs. The result is a heavy reliance on increasingly large SRAM caches, which just barely keep up with processor clock rates.

However, this situation is not strictly the fault of RISC's pipelined one-instruction/clock approach. The problem stems from another RISC trait: the reduction of the instruction set to a basic repertoire of fewer than 100 instructions. Compilers can indeed create instruction streams that perform complex tasks from these simple instructions, but it takes a lot of instructions to do so. If complex programs require many instructions to function, then they need larger caches and higher clock rates to meet performance goals. Larger caches and higher clock rates ultimately increase product cost.

Enter post-RISC configurable processors. With such processors, design teams can add specialized, task-specific instructions to the processor that function like CISC instructions (by doing complex things) but adhere to RISC's pipelined, one-instruction/clock approach. These processors work well as deeply embedded task engines inside SOCs, where task specificity is easy to define and appropriate to use. In such applications, programs are typically small and do not require large caches. In addition, specialized instructions reduce the number of instructions needed to perform the target tasks, which relieves the pressure to constantly boost clock rate.
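To make the instruction-count argument concrete, here's a rough C sketch. A saturating multiply-accumulate, a staple of signal processing, costs a multiply, a wide add, and two compare-and-clamp steps per sample when built from generic RISC instructions. A configurable processor can collapse all of that into one pipelined instruction reached through a compiler intrinsic. The MAC_SAT name below is hypothetical, not an actual vendor intrinsic; on a host machine we simply model it with the generic routine.

#include <stdint.h>

/* Generic-RISC version: roughly half a dozen instructions per sample. */
int32_t mac_sat_generic(int32_t acc, int16_t a, int16_t b)
{
    int64_t s = (int64_t)acc + (int32_t)a * b;
    if (s > INT32_MAX) s = INT32_MAX;    /* clamp high */
    if (s < INT32_MIN) s = INT32_MIN;    /* clamp low  */
    return (int32_t)s;
}

/* Hypothetical single-cycle custom instruction, modeled on the host by
   the generic routine above. On a configurable processor, the compiler
   would emit one instruction here: one result per clock. */
#define MAC_SAT(acc, a, b) mac_sat_generic((acc), (a), (b))

int32_t dot_product(const int16_t *x, const int16_t *y, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc = MAC_SAT(acc, x[i], y[i]);
    return acc;
}

The inner loop shrinks from many instructions to a few, which shrinks the code footprint, and the cache pressure, right along with it.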

In short, RISC (like the dinosaurs) isn't dead; it has evolved.