A Pentium 4 processor contains 50,000,000 transistors and has a
speed of 4,000 MIPS. But an F21 processor
contains 10,000 transistors
and has a speed of 250 MIPS. If the transistors of the Pentium 4 were
used to make 5,000 F21 processors, those 5,000 processors would run
together at 1,250,000 MIPS, that is, about 300 times faster than a Pentium 4.
(This calculation is quite bold. Using the same transistor technology as
the Pentium 4, the F21 processors would be anywhere between just a few
times faster and much more than 300 times faster, depending on the kind of
algorithm. On the other hand, I didn't take into account the fact that many
of the Pentium 4's transistors are used for the on-chip cache memory.
Without cache memory the F21 processors would be faster only for a narrow
set of algorithms. And if each F21 processor gets a fair amount of cache
memory or fast RAM, far fewer than 5,000 F21 processors can be carved out
of 50,000,000 transistors.)
Common tasks are slower on today's computers than on old home computers
from twenty years ago. My current PC is slower for common office tasks
than the Acorn Archimedes RISC computer a friend lent me ten years ago,
and their prices, when new, were roughly the same.
The Archimedes computer didn't even need a hard disk to let me make
sophisticated vector drawings, design electronic circuits, publish big and
complex texts... It just needed a floppy disk and 4 MB of RAM. My current
PC has an 80 GB hard disk and 512 MB of RAM, but it still can't rival it...
When my computer has to do a few different things at the same time, it is
brought to its knees, even though it clearly has the necessary processor
power and hard disk bandwidth.
Not to mention severe reliability problems like viruses and software
crashes. Luckily I use Linux, which spares me most of these annoyances.
Why? Why this nonsense? Because every machine is virtual... Let me explain:
Maybe you have already used an "emulator". An emulator is a piece of
software that allows you to, for example, run Windows on a Macintosh
computer. Or to run Linux inside a Windows system... Whatever. To run
Windows on a Macintosh, the emulator mimics a PC. Windows truly believes
it is running on a real PC. The mimicked PC is called a "virtual machine".
It does not exist physically; it is just mimicked by software. Properly
used, emulators can be very helpful. Still, if you have tried one you
probably noticed many problems: slowness, crashes, no access to devices...
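At its core an emulator is nothing mysterious: a loop that fetches the
instructions of the mimicked machine from memory, decodes them and
reproduces their effect. Here is a minimal sketch in C; the two-opcode
instruction set is invented purely for the example.

    /* Toy emulator: the "processor" being mimicked exists only as data.
       The instruction set (OP_HALT, OP_ADD) is invented for this example. */
    #include <stdint.h>

    enum { OP_HALT = 0, OP_ADD = 1 };   /* hypothetical opcodes */

    void run(const uint8_t *code, int32_t *reg)
    {
        int pc = 0;                      /* program counter of the virtual CPU */
        for (;;) {
            uint8_t op = code[pc++];     /* fetch */
            switch (op) {                /* decode */
            case OP_ADD:                 /* execute: reg[a] += reg[b] */
                reg[code[pc]] += reg[code[pc + 1]];
                pc += 2;
                break;
            case OP_HALT:
            default:
                return;
            }
        }
    }

A real emulator adds hundreds of instructions, mimicked devices and timing,
but the principle stays the same, which is also why emulation costs speed.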
You might not believe it, but emulators and virtual machines are a key to
reliability in aerospace systems (and in all kinds of critical military and
industrial systems). Why are public emulators so bad? Basically for these
reasons:
- They are made for systems that were never meant to be emulated.
- Those systems have no proper specifications, no proper data sheets.
  Hence emulators can only be improved by trial and error.
- Those systems constantly change. And these changes are sometimes even
  made on purpose, just to annoy.
Why are aerospace emulators so reliable? For the opposite reasons. In
order to make embedded computer systems highly reliable, they were split
into two distinct domains. One domain is the flight software that controls
the plane; it is written in a highly standardized, universal executable
code. The other domain is the hardware that will run that code; that
hardware can be just about anything, provided it contains an adequate
emulator to run the standardized code. The flight software is completely
independent of the hardware. So, making aerospace computer systems is
about two distinct tasks:
- Conceive and certify hardware systems that are perfect emulators /
  virtual machines. It doesn't matter what processor or bus architecture
  the hardware uses, as long as the emulation is perfect.
- Write and certify software that will pilot the plane. You don't have to
  know what hardware it will run on; you just know the hardware will be a
  high-quality emulator.
Critical aerospace software does not need to run very fast, so the
slowness of emulation is not a problem.
Processor manufacturers like Intel, AMD or IBM have a symmetrical
problem. They too have to make hardware intended for existing software.
But they are confronted with two problems:
- The existing software was compiled for intricate little processors that
  were never meant to be improved. Furthermore, the people who wrote and
  compiled that software followed no rules.
- The sole purpose of the new hardware is to be faster, not to be more
  reliable or more efficient.
So, the engineers at Intel, AMD and IBM use tricks to make the existing
software run faster. The new processors contain all kinds of cogwheels and
levers to gain a little speed. Most of the work done by the new processors
is not the calculations required by the software; rather, it is endless
dumb computation to find a way to perform those calculations faster. While
there is a lot of genius in the tricks the engineers invented, the overall
result is incredibly stupid. Indeed, the computations performed by the new
processors could have been done far more efficiently by compilers. A good
example is the memory caches. Those caches are controlled by the hardware
in real time. If they were controlled by static directives issued by the
compiler or the source code, or by dynamic directives issued by the OS,
the speed increase would be tremendous.
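Compilers already offer a small taste of such source-level cache
directives. GCC and Clang, for instance, provide the __builtin_prefetch
hint, which lets the program rather than the hardware decide what to pull
into the cache. A minimal sketch; the prefetch distance of 64 elements is
an arbitrary value chosen for the example:

    /* Summing an array while telling the cache, from the source code,
       which data will be needed soon. */
    void sum_with_prefetch(const double *a, long n, double *out)
    {
        double s = 0.0;
        for (long i = 0; i < n; i++) {
            if (i + 64 < n)
                __builtin_prefetch(&a[i + 64]);  /* hint: fetch into cache now */
            s += a[i];
        }
        *out = s;
    }

This is only a hint squeezed into a hardware-controlled design; the point
above is about going much further and putting the caches entirely under
compiler and OS control.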
If you ask those engineers to design a much faster processor using far
fewer transistors, they can do it (Intel and IBM have already produced
such processors). But then you need to recompile the software so it can
run on the new processor architecture. This can be done in three different
ways. The first one is quick and inexpensive but yields the least
performance; the last one yields by far the most performance but needs
time and investment:
- The existing executable code is translated into the new code.
- The source code is recompiled into the new executable code.
- The source code is rewritten to adapt to the philosophy of the new
  system before it is compiled.
Serious companies, using serious computer systems, do such things. The
problem is that public software and hardware systems are very poorly
designed. They are such a mess that you cannot recompile or rewrite the
software and expect it to run correctly. Intel tried this approach for
their first public 64-bit processors and failed. So the only, short-term
solution has been to build gigantic new processors that emulate the old
processors, with lots of tricks and heat to get a little more speed.
Why are public systems poorly designed? There are many historical reasons,
many errors and a lot of incompetence. I'd say the fundamental reason is
that the customers and buyers were, and still are, very immature. The only
remedy would have been for state-funded research centers to design new
architectures and back them with strong standards. Those systems would
have been adopted because they would easily have been tens of times faster
than existing systems and because the standards could have been trusted.
State research centers have not been intellectually capable of doing this.
Small private teams did succeed in conceiving new and efficient systems,
for example the Acorn team who conceived the ARM processor and the
Archimedes system. But they didn't have the power to set international
standards. So we kept using the kludges that are found inside Intel and
Macintosh computers.
Can the public market evolve towards efficient computer systems? I
think so. Two distinct evolutions can converge towards this:
- Computers rely more and more on dedicated hardware for dedicated
  purposes. There are many examples:
  - 3D-accelerated graphics chips. Their calculation power exceeds that of
    the main processor a hundredfold.
  - Servers connected through Ethernet ports. For example, a stand-alone
    modem connected to the computer through Ethernet is far more reliable
    than a modem piloted by the computer itself.
  - PostScript printers.
  - Physics processing units. These resemble accelerated graphics chips,
    but their purpose is to simulate physical phenomena like water flow or
    the behavior of complex virtual structures. In public computers their
    purpose is mostly to make games more realistic. Again, their
    calculation power is a few orders of magnitude higher than the main
    processor's.
- Software systems rely more and more on virtual machines to gain security
  and portability:
  - Java, JavaScript, .Net and other systems are meant to execute the same
    way on different machines. This is mostly the work of the major
    software companies.
  - Many C, C++, ... libraries and development interfaces try to be system
    and platform independent, in order to ensure the software can be
    compiled for different machines (a minimal sketch of how this is done
    follows this list). This is a key contribution of the open-source
    world.
  - Critical systems, like firewalls, tend to become virtual, just like
    aerospace software.
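To illustrate the C / C++ portability point: such libraries typically hide
each platform behind one neutral interface, so the same calling code
compiles everywhere. A minimal sketch; the wrapper name portable_sleep_ms
is invented for the example:

    /* One portable function name for the caller, different system calls
       underneath. */
    #ifdef _WIN32
    #include <windows.h>
    static void portable_sleep_ms(unsigned ms) { Sleep(ms); }
    #else
    #include <unistd.h>
    static void portable_sleep_ms(unsigned ms) { usleep(ms * 1000); }
    #endif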
The next step, I hope, is that dedicated hardware will be manufactured to
execute standardized software like Java. Java is far from an
optimal language for speed and parallelization. But it is standard, it
is widely used and a hundred or so dedicated Java processors can be put
on a single cheap PCI expansion card. If every few years a new standard
is set
for the processor chipsets together with adequate compilers, if the
language philosophy evolves and if real parallelization is
implemented (not the ridiculous "dual core" systems of today), real
computer power will at last become available to the public.
I think the Acorn team made a fundamental mistake when they conceived the
ARM processor. The processor itself was perfect (it is now widely used in
low-cost devices, to emulate all kinds of things, including Intel
processors). But making a brand new machine out of it, the Archimedes,
proved to be a strategic error. They would have done better to make an
expansion card for PCs. Five years later every PC would have included an
ARM processor natively and every important piece of software would have
been ARM-oriented. This would not have made the PCs any more complicated.
The hardware of today's PCs is incredibly complex, slow and obstructive,
just like the processors. In contrast, the overall ARM architecture is
very simple, smart and fast. Installing an ARM motherboard as an expansion
card in PCs would have improved the speed and usability of the PCs
drastically. Over the years the ARM system would have taken over the whole
PC. The Intel processors would have faded away and ended up emulated. That
would have made the PCs less expensive, ten times faster and far simpler
and more standardized. Those ARM-based PCs would then have become the
hosts for further improvements and new architectures.
Apple made a symmetrical mistake: they used the Motorola 68000 processor
for the first Macintosh computers. The problem is that this processor was
not conceived to perform the calculations of a whole computer system;
rather, it was conceived to be the core of a complex computer system. The
purpose of the Motorola 68000 processors was to be a reliable, neat and
easy-to-use central unit that controls the calculations made by
specialized units. This was perfectly understood by the authors of the
Amiga system. If Apple had quickly put specialized chipsets and
calculation beasts like the ARM inside the Macintosh, they would have made
really outstanding computers. They didn't have the necessary insight, and
they still don't.
Today's public software systems are conceived just like the processors and
the motherboards: gigantic heaps of short-term tricks that accumulate and
fester on for years. The contribution of the open-source world is to at
least manage this chaos in a professional way. The SuSE Linux 10.0 system
on my PC allows me to work efficiently. We won't get small and efficient
software until the education world understands that it needs to really
teach computer science and responsibility to students, not fake it. I
would like computer science, with its mathematical and logical games, to
replace the Latin courses. It would open the students' minds and keep them
from accepting such horrors as today's computer architectures. Our
civilization depends on computers, but almost nothing is taught about
computers in schools. People are kept in the belief that major enterprises
do clever work and sell nearly the best possible machines. Just the
opposite happens: fear, short-term speculation and incompetence have led
our public computer industry to degrade into the worst and least efficient
thing it could produce.
You can lend your processor power to valuable scientific research:
http://boinc.berkeley.edu . I put an old AMD 1600+ PC to work solely for
this purpose. It consumes 70 watts, most of it for the AMD processor,
which costs almost 7 € of electricity per month (70 W around the clock is
about 50 kWh per month). Part of the calculations that computer performs
are for evaluating global warming. I don't know what 70 watts running 24/7
contributes to global warming, but there is a paradox here anyway. A
StrongARM-based computer would consume a tenth of that, or even less, for
the same calculation power...
For years now, Pentium and compatible processors have contained an integer
vector calculation unit: the "MMX". It is quite easy to get hold of a
compiler that can read MMX-oriented assembler instructions:
http://www.goof.com/pcg . Such compilers can also optimize the binary code
they produce by using MMX instructions. But I could find no C / C++
library with data structures and functions (or classes and methods) to
express algorithms aimed at the MMX calculation unit. Why do I have to
endure assembler code to benefit from the MMX? This is a good example of
"write your program the way you were told at school, and we will maybe
find a way to make it run a little faster". Why can't I express my
calculations the way the processor performs them? I would get a strong
speed increase, I would understand what I'm doing and I would learn
things. Today's processors could easily contain ten or so MMX units at
virtually no cost. If it were possible to write C / C++ programs
explicitly aimed at those MMX units, lots of scientific software and games
would be far faster and even easier to write. It's so simple, so obvious
and it would be so efficient. Why isn't it available?
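For what it's worth, the closest thing available today is the
per-instruction intrinsics of mmintrin.h, supported by GCC and other x86
compilers. They are a thin veneer over the assembler rather than the
library with data structures asked for above, but they give an idea of
what expressing a calculation in the processor's own vector terms looks
like from C. A minimal sketch:

    /* Adding two arrays of 16-bit integers, four elements per instruction,
       with MMX intrinsics instead of hand-written assembler. */
    #include <mmintrin.h>

    void add_i16(short *dst, const short *a, const short *b, int n)
    {
        int i;
        for (i = 0; i + 4 <= n; i += 4) {
            __m64 va = *(const __m64 *)(a + i);
            __m64 vb = *(const __m64 *)(b + i);
            *(__m64 *)(dst + i) = _mm_add_pi16(va, vb);  /* PADDW: 4 adds at once */
        }
        for (; i < n; i++)       /* scalar tail for leftover elements */
            dst[i] = a[i] + b[i];
        _mm_empty();             /* leave MMX state before any FPU code runs */
    }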