Don’t reinvent the wheel if you only need to add a few spokes
Bob Colwell once worked for Intel as chief IA32 architect through the Pentium II, III, and 4 microprocessors. The guy knows a lot about traditional microprocessor design. These days, he’s a consultant and a regular columnist in the IEEE Computer Society’s publication Computer. The April issue of Computer just arrived at my house, and Colwell’s column was the first thing I turned to, as usual, because I love reading what he has to say.
This month, Colwell’s theme is the “point of highest leverage.” By that, he means putting your effort into the work that will have the greatest impact on your project. Some of his advice is excellent for any team contemplating the design of a new system.
Colwell starts by discussing a common-sense approach to developing software:
“If you have written computer programs, you have probably wrestled with computer performance analysis. Naïve programmers may just link dozens of off-the-shelf data structures and algorithms together, while more experienced coders design their program with an eye toward the resulting speed. But either way, you end up running the program and wishing it was faster.”
This paragraph succinctly sums up the experience of all computer programmers for the last 60 years, ever since the switch was first thrown to power up ENIAC. Colwell continues:
“The first thing you do is get a run-time histogram of your code, which reveals that of the top 25 sections, one of them accounts for 72 percent of the overall runtime, while the rest are in the single digits… Do you a) notice that one of the single-digit routines is something you’d previously worried about and set out to rewrite it, or b) put everything else aside and figure out what to do about that 72 percent routine?”
Developers who choose alternative b) obviously understand the principle of the “point of highest leverage.” Previously, software developers stuck with fixed-ISA processors (like Colwell’s Pentiums) had to heavily rework the troublesome code, perhaps dropping into assembly language to wring the most from the processor. Today’s SOC developers have another choice: extend a processor core’s instruction set specifically for the target code to achieve a project’s performance goals. This approach is a natural evolution in the use of microprocessor technology, now that automated tools can generate both the RTL for the extended processor and all of the required software-development tools.
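If you don’t have a profiler handy, even a crude measurement with the standard C library will expose a section that eats 72 percent of the runtime. Here is a minimal sketch; the three timed “sections” are synthetic stand-ins for the top sections of a real program, not anyone’s actual code:

    /* Crude run-time histogram using only the standard C library.
       The three timed "sections" below are synthetic stand-ins. */
    #include <stdio.h>
    #include <time.h>

    static volatile double sink;   /* keeps the compiler from deleting the work */

    static void busy_work(long iterations)
    {
        double acc = 0.0;
        for (long i = 0; i < iterations; i++)
            acc += (double)i * 0.5;
        sink = acc;
    }

    static double elapsed(clock_t a, clock_t b)
    {
        return (double)(b - a) / CLOCKS_PER_SEC;
    }

    int main(void)
    {
        clock_t t0 = clock();
        busy_work(10000000);       /* stand-in for a single-digit routine */
        clock_t t1 = clock();
        busy_work(200000000);      /* stand-in for the 72-percent routine */
        clock_t t2 = clock();
        busy_work(10000000);       /* another single-digit routine */
        clock_t t3 = clock();

        printf("section A: %6.2f s\n", elapsed(t0, t1));
        printf("section B: %6.2f s  <- attack this one first\n", elapsed(t1, t2));
        printf("section C: %6.2f s\n", elapsed(t2, t3));
        return 0;
    }

A real profiler gives you the same picture per function across a whole program, but the decision it forces is exactly the one Colwell describes: go after the dominant section first.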
But things (at least processor-related things) are no longer the way Colwell describes them later in this column:
“It is, however, crucial to identify exactly what should be at the top of your worry list. Important changes (read: risks) such as new process technologies automatically go on that list because if trouble arises there, you have few viable alternatives. If you’re contemplating a new microarchitecture, that goes at the top of the list. After all, your team hasn’t conjured up the new microarchitecture yet—you’re only asserting that you need one. The gap between the two facts may turn out to be insurmountable.”
These words rang true in the days when processors and software tools were developed by hand. Developing a new Pentium architecture is surely a years-long endeavor requiring hundreds of engineers. However, one engineer can now add new registers and instructions to a base processor in a few days using automated design tools. The resulting processor hardware, generated automatically by those tools, is correct by construction.
Two key elements are essential to the rapid creation of such extended processors. The first is a small, fast base processor architecture that can execute any program because it is a complete processor. The base processor may not execute the target code at the desired speed, but it can at least execute that code. That’s a significantly advantageous starting point.
It’s also a very logical starting point. There’s no need to reinvent a way to add two 32-bit integers. It’s been done before. However, there are very real, performance-related reasons for adding new instructions that streamline code. For example, specific registers sized to an application’s data elements (such as 48-bit, 2-element audio vectors) and instructions that explicitly manipulate those data elements (such as direct codebook lookups and customized MAC instructions) can significantly boost code performance well beyond the limits of traditional assembly-language coding while adding very few gates to the processor’s hardware design.
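To make this concrete, here is a minimal sketch in plain C of the kind of inner loop that benefits: a stereo multiply-accumulate over audio samples. The custom-instruction version shown in the comment is purely illustrative; AUDIO_MAC2, load_vec2, and audio_vec_t are names invented for this example, not any vendor’s actual API:

    /* Reference inner loop: a stereo (2-element) multiply-accumulate over
       24-bit audio samples held in 32-bit containers. Plain C that runs on
       any fixed-ISA processor. */
    #include <stdint.h>
    #include <stdio.h>

    #define N 8

    static int64_t dot_stereo_ref(const int32_t *l, const int32_t *r,
                                  const int32_t *cl, const int32_t *cr, int n)
    {
        int64_t acc = 0;
        for (int i = 0; i < n; i++) {
            acc += (int64_t)l[i] * cl[i];   /* left-channel MAC  */
            acc += (int64_t)r[i] * cr[i];   /* right-channel MAC */
        }
        return acc;
    }

    /* On an extended processor, the two MACs above could become one custom
       instruction operating on a 48-bit, 2-element register. Hypothetical
       shape only; the names below are invented for illustration:

           audio_vec_t a = load_vec2(&l[i], &r[i]);
           audio_vec_t b = load_vec2(&cl[i], &cr[i]);
           acc = AUDIO_MAC2(acc, a, b);    // one instruction per sample pair
    */

    int main(void)
    {
        int32_t l[N]  = {1, 2, 3, 4, 5, 6, 7, 8};
        int32_t r[N]  = {8, 7, 6, 5, 4, 3, 2, 1};
        int32_t cl[N] = {1, 1, 1, 1, 1, 1, 1, 1};
        int32_t cr[N] = {2, 2, 2, 2, 2, 2, 2, 2};
        printf("dot = %lld\n", (long long)dot_stereo_ref(l, r, cl, cr, N));
        return 0;
    }

The point is the pattern, not the particular operation: the register width, the data layout, and the instruction are shaped to the application’s data rather than to a general-purpose ISA.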
The second essential element is the automatic generation of the associated software-development tools. The task of manually writing compilers, assemblers, debuggers, and profilers for a new processor architecture is as time consuming, and just as important, as developing the new processor itself. The processor is useless if software developers cannot easily write and debug programs for it.
Colwell’s column wavers perilously close to the edge of reality when he writes:
“Start with the givens. Experience gives you a set of things you can take for granted: techniques, know-how, who is good at what, tools that have proven themselves, validation plans and repositories, how to work within corporate planning processes. If you’ve accumulated enough experience, you’ve learned never to take anything for granted, but some things don’t need to appear at the top of your worry list.”
Most design teams do not have processor customization at the top of their worry list because they don’t realize that it’s now possible to attack processor performance directly by designing a better processor. These people already “know” that they’re not processor designers and that it “would be foolish” for them to even consider developing a processor with instructions tailored to a specific task on an SOC.
Conventional wisdom says that when a fixed-ISA processor cannot handle a job, you need to design hardware by writing some Verilog or VHDL. That conventional wisdom is based on nearly 35 years of design experience with microprocessors. When the microprocessor cannot do the job, it needs supplemental hardware.
That conventional wisdom is now plainly wrong. The “techniques” and “know-how” that Colwell takes for granted because of experience have been superseded by the march of technology. In the 1960s, conventional wisdom largely rejected integrated circuits. Here’s a quote from a speech Gordon Moore gave to SPIE in 1995:
“In 1965 the integrated circuit was only a few years old and in many cases was not well accepted. There was still a large contingent in the user community who wanted to design their own circuits and who considered the job of the semiconductor industry to be to supply them with transistors and diodes so they could get on with their jobs.”
Things were no different a few years later when Intel introduced the first microprocessor. The Intel 4004, which appeared in 1971, did not take the system-design world by storm. Design engineers knew how to wire up hundreds or thousands of TTL gates packaged a few at a time in 7400-series logic packages. They did not know how to write and debug software. Further, early microprocessors cost one or two hundred dollars, far more than the few TTL packages they replaced. As a result, it took about a decade for microprocessors to become well established as essential elements in system design.
Things were again no different in the late 1980s, when the IC-design industry was facing a complete breakdown in design methodology. The schematic-capture methods of the day were proving completely inadequate to the task of describing the complexity of the chips that could be built. Here’s a quote from an article on VHDL written by EDA editor Michael C. Markowitz in the March 30, 1989 issue of EDN Magazine:
“The reluctance of designers to embrace new techniques over their well-worn, time-proven methods will impede VHDL’s rate of acceptance… But once the benefits become clearer and the reluctance to write code rather than draw or capture a design dissipates, VHDL will gather steam as a design language.”
Markowitz was dead on regarding the onset of hardware-description languages, although it was Verilog, not VHDL, that established a hold on designers in the United States. European designers did adopt VHDL.
Kurt Keutzer, then with AT&T Bell Labs and now a professor at UC Berkeley, summarized the situation quite well in a paper presented at the 1989 Design Automation Conference:
“One of the biggest obstacles to the acceptance of synthesis for ASIC design is the lack of education. Designing a circuit using a synthesis system is radically different from designing a circuit using most current design systems. The ability to hand optimize transistor or gate-level networks is of little use in synthesis systems, while an entirely new class of skills are demanded. The acceptance of synthesis procedures requires a significant re-education of designers currently in industry, as well as a broadened academic curriculum for the upcoming generation of designers.”
Today, there are many SOC designers who believe that things are as they always have been. They weren’t around 15 years ago to see logic synthesis take over the industry. They believe that people have been writing RTL since the dawn of time and will continue to do so until the universe expires of thermodynamic heat death. These people share much in common with the 1960s designers who wanted their discrete diodes and transistors, the 1970s designers who refused to learn how to program microprocessors, and the 1980s designers who clung to their schematics rather than embracing Verilog and VHDL. There are too many transistors on today’s SOCs to design even most of them using hardware-description languages. Once again, IC fabrication technology has outstripped our “popular” design methods and new methods are required to keep pace.
Colwell is right about a lot of things but he’s wrong about experience giving you “a set of things you can take for granted.” Thanks to the pace of technological development, you must always question your assumptions about the things you can take for granted. The industry changes. Design changes. And the companies that adapt quickly survive. The rest don’t.