A Pentium 4 processor contains 50,000,000 transistors and has a
speed of 4,000 MIPS. But an F21 processor
contains 10,000 transistors
and has a speed of 250 MIPS. If the transistors of the Pentium 4 were
used to make 5,000 F21 processors, those 5,000 processors would run
together at 1,250,000 MIPS, that is, about 300 times faster than a Pentium 4.
(This calculation is quite bold. Using the same transistor technology as
the Pentium 4, the F21 processors would be anywhere between just a few
times faster and much more than 300 times faster, depending on the kind of
algorithm. On the other hand, I didn't take into account the fact that many
of the Pentium 4's transistors are used for the on-chip cache memory.
Without cache memory the F21 processors would be faster only for a narrow
set of algorithms. And if each F21 processor gets a fair amount of cache
memory or fast RAM, far fewer than 5,000 F21 processors can be carved out
of 50,000,000 transistors.)
Common tasks are slower on today's computers than on old home computers
from twenty years ago. My current PC is slower for common office tasks
than the Acorn Archimedes RISC computer a friend lent me ten years ago,
and their prices, when new, were roughly the same.
The Archimedes computer didn't even need a hard disk to let me make
sophisticated vector drawings, design electronic circuits, publish big and
complex texts... It just needed a floppy disk and 4 MB of RAM. My current
PC has an 80 GB hard disk and 512 MB of RAM, but it still can't rival it...
When my computer has to do a few different things at the same time, it is
brought to its knees, even though it clearly has the necessary processor
power and hard disk bandwidth.
Not to mention severe reliability problems like viruses and software
crashes. Luckily I use Linux, which spares me most of these annoyances.
Why? Why this nonsense? Because every machine is virtual... Let me explain:
Maybe you have already used an "emulator". An emulator is a piece of
software that allows you to, for example, run Windows on a Macintosh
computer. Or to run Linux inside a Windows system... Whatever. To run
Windows on a Macintosh, the emulator mimics a PC. Windows truly believes
it is running on a real PC. The mimicked PC is called a "virtual machine".
It does not exist physically; it is just mimicked by software. Properly
used, emulators can be very helpful. Still, if you have tried one you
probably noticed many problems: slowness, crashes, no access to devices...
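At its core an emulator is nothing mysterious: a loop that fetches the
instructions of the mimicked machine from memory, decodes them and
reproduces their effect. Here is a minimal sketch in C; the two-opcode
instruction set is invented purely for the example.

    /* Toy emulator: the "processor" being mimicked exists only as data.
       The instruction set (OP_HALT, OP_ADD) is invented for this example. */
    #include <stdint.h>

    enum { OP_HALT = 0, OP_ADD = 1 };   /* hypothetical opcodes */

    void run(const uint8_t *code, int32_t *reg)
    {
        int pc = 0;                      /* program counter of the virtual CPU */
        for (;;) {
            uint8_t op = code[pc++];     /* fetch */
            switch (op) {                /* decode */
            case OP_ADD:                 /* execute: reg[a] += reg[b] */
                reg[code[pc]] += reg[code[pc + 1]];
                pc += 2;
                break;
            case OP_HALT:
            default:
                return;
            }
        }
    }

A real emulator adds hundreds of instructions, mimicked devices and timing,
but the principle stays the same, which is also why emulation costs speed.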
You might not believe it, but emulators and virtual machines are a key to
reliability in aerospace systems (and in all kinds of critical military and
industrial systems). Why are public emulators so bad? Basically for these
reasons:
- They are made for systems that were never meant to be emulated.
- Those systems have no proper specifications, no proper data sheets.
  Hence emulators can only be improved by trial and error.
- Those systems constantly change. And these changes are sometimes even
  made on purpose, just to annoy.
Why are aerospace emulators so reliable? For the opposite reasons. In
order to make embedded computer systems highly reliable, they were split
into two distinct domains. One domain is the flight software that controls
the plane; it is written in a highly standardized, universal executable
code. The other domain is the hardware that will run that code; that
hardware can be just about anything, provided it contains an adequate
emulator to run the standardized code. The flight software is completely
independent of the hardware. So, making aerospace computer systems is
about two distinct tasks:
- Conceive and certify hardware systems that are perfect emulators /
  virtual machines. It doesn't matter what processor or bus architecture
  the hardware uses, as long as the emulation is perfect.
- Write and certify software that will pilot the plane. You don't have to
  know what hardware it will run on; you just know the hardware will be a
  high-quality emulator.
Critical aerospace software does not need to run very fast, so the
slowness of emulation is not a problem.
Processor manufacturers like Intel, AMD or IBM have a symmetrical
problem. They too have to make hardware intended for existing software.
But they are confronted with two problems:
- The existing software was compiled for intricate little processors that
  were never meant to be improved. Furthermore, the people who wrote and
  compiled that software followed no rules.
- The sole purpose of the new hardware is to be faster, not to be more
  reliable or more efficient.
So, the engineers at Intel, AMD and IBM use tricks to make the existing
software run faster. The new processors contain all kinds of cogwheels and
levers to gain a little speed. Most of the work done by the new processors
is not the calculations required by the software; rather, it is endless
dumb computation to find a way to perform those calculations faster. While
there is a lot of genius in the tricks the engineers invented, the overall
result is incredibly stupid. Indeed, the computations performed by the new
processors could have been done far more efficiently by compilers. A good
example is the memory caches. Those caches are controlled by the hardware
in real time. If they were controlled by static directives issued by the
compiler or the source code, or by dynamic directives issued by the OS,
the speed increase would be tremendous.
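Compilers already offer a small taste of such source-level cache
directives. GCC and Clang, for instance, provide the __builtin_prefetch
hint, which lets the program rather than the hardware decide what to pull
into the cache. A minimal sketch; the prefetch distance of 64 elements is
an arbitrary value chosen for the example:

    /* Summing an array while telling the cache, from the source code,
       which data will be needed soon. */
    void sum_with_prefetch(const double *a, long n, double *out)
    {
        double s = 0.0;
        for (long i = 0; i < n; i++) {
            if (i + 64 < n)
                __builtin_prefetch(&a[i + 64]);  /* hint: fetch into cache now */
            s += a[i];
        }
        *out = s;
    }

This is only a hint squeezed into a hardware-controlled design; the point
above is about going much further and putting the caches entirely under
compiler and OS control.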
If you ask those engineers to design a much faster processor using far
fewer transistors, they can do it (Intel and IBM have already produced
such processors). But then you need to recompile the software so it can
run on the new processor architecture. This can be done in three different
ways. The first one is quick and inexpensive but yields the least
performance; the last one yields by far the most performance but needs
time and investment:
- The existing executable code is translated into the new code.
- The source code is recompiled into the new executable code.
- The source code is rewritten to adapt to the philosophy of the new
  system before it is compiled.
Serious companies, using serious computer systems, do such things. The
problem is that public software and hardware systems are very poorly
designed. They are such a mess that you cannot recompile or rewrite the
software and expect it to run correctly. Intel tried this approach for
their first public 64-bit processors and failed. So the only, short-term
solution has been to build gigantic new processors that emulate the old
processors, with lots of tricks and heat to get a little more speed.
Why are public systems poorly designed? There are many historical reasons,
many errors and a lot of incompetence. I'd say the fundamental reason is
that the customers and buyers were, and still are, very immature. The only
remedy would have been for state-funded research centers to design new
architectures and back them with strong standards. Those systems would
have been adopted because they would easily have been tens of times faster
than existing systems and because the standards could have been trusted.
State research centers have not been intellectually capable of doing this.
Small private teams did succeed in conceiving new and efficient systems,
for example the Acorn team who conceived the ARM processor and the
Archimedes system. But they didn't have the power to set international
standards. So we kept using the kludges that are found inside Intel and
Macintosh computers.
Can the public market evolve towards efficient computer systems? I
think so. Two distinct evolutions can converge towards this:
- Computers rely more and more on dedicated hardware for dedicated
  purposes. There are many examples:
  - 3D-accelerated graphics chips. Their calculation power exceeds that of
    the main processor a hundredfold.
  - Servers connected through Ethernet ports. For example, a stand-alone
    modem connected to the computer through Ethernet is far more reliable
    than a modem piloted by the computer itself.
  - PostScript printers.
  - Physics processing units. These resemble accelerated graphics chips,
    but their purpose is to simulate physical phenomena like water flow or
    the behavior of complex virtual structures. In public computers their
    purpose is mostly to make games more realistic. Again, their
    calculation power is a few orders of magnitude higher than the main
    processor's.
- Software systems rely more and more on virtual machines to gain security
  and portability:
  - Java, JavaScript, .Net and other systems are meant to execute the same
    way on different machines. This is mostly the work of the major
    software companies.
  - Many C, C++, ... libraries and development interfaces try to be system
    and platform independent, in order to ensure the software can be
    compiled for different machines (a minimal sketch of how this is done
    follows this list). This is a key contribution of the open-source
    world.
  - Critical systems, like firewalls, tend to become virtual, just like
    aerospace software.
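To illustrate the C / C++ portability point: such libraries typically hide
each platform behind one neutral interface, so the same calling code
compiles everywhere. A minimal sketch; the wrapper name portable_sleep_ms
is invented for the example:

    /* One portable function name for the caller, different system calls
       underneath. */
    #ifdef _WIN32
    #include <windows.h>
    static void portable_sleep_ms(unsigned ms) { Sleep(ms); }
    #else
    #include <unistd.h>
    static void portable_sleep_ms(unsigned ms) { usleep(ms * 1000); }
    #endif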
The next step, I hope, is that dedicated hardware will be manufactured to
execute standardized software like Java. Java is far from an
optimal language for speed and parallelization. But it is standard, it
is widely used and a hundred or so dedicated Java processors can be put
on a single cheap PCI expansion card. If every few years a new standard
is set
for the processor chipsets together with adequate compilers, if the
language philosophy evolves and if real parallelization is
implemented (not the ridiculous "dual core" systems of today), real
computer power will at last become available to the public.
I think the Acorn team made a fundamental mistake when they conceived the
ARM processor. The processor itself was perfect (it is now widely used in
low-cost devices, to emulate all kinds of things, including Intel
processors). But making a brand new machine out of it, the Archimedes,
proved to be a strategic error. They would have done better to make an
expansion card for PCs. Five years later every PC would have included an
ARM processor natively and every important piece of software would have
been ARM-oriented. This would not have made the PCs any more complicated.
The hardware of today's PCs is incredibly complex, slow and obstructive,
just like the processors. In contrast, the overall ARM architecture is
very simple, smart and fast. Installing an ARM motherboard as an expansion
card in PCs would have improved the speed and usability of the PCs
drastically. Over the years the ARM system would have taken over the whole
PC. The Intel processors would have faded away and ended up emulated. That
would have made the PCs less expensive, ten times faster and far simpler
and more standardized. Those ARM-based PCs would then have become the
hosts for further improvements and new architectures.
Apple made a symmetrical mistake: they used the Motorola 68000 processor
for the first Macintosh computers. The problem is that this processor was
not conceived to perform the calculations of a whole computer system;
rather, it was conceived to be the core of a complex computer system. The
purpose of the Motorola 68000 processors was to be a reliable, neat and
easy-to-use central unit that controls the calculations made by
specialized units. This was perfectly understood by the authors of the
Amiga system. If Apple had quickly put specialized chipsets and
calculation beasts like the ARM inside the Macintosh, they would have made
really outstanding computers. They didn't have the necessary insight, and
they still don't.
Today's public software systems are conceived just like the processors and
the motherboards: gigantic heaps of short-term tricks that accumulate and
fester on for years. The contribution of the open-source world is to at
least manage this chaos in a professional way. The SuSE Linux 10.0 system
on my PC allows me to work efficiently. We won't get small and efficient
software until the education world understands that it needs to really
teach computer science and responsibility to students, not fake it. I
would like computer science, with its mathematical and logical games, to
replace the Latin courses. It would open the students' minds and keep them
from accepting such horrors as today's computer architectures. Our
civilization depends on computers, but almost nothing is taught about
computers in schools. People are kept in the belief that major enterprises
do clever work and sell nearly the best possible machines. Just the
opposite happens: fear, short-term speculation and incompetence have led
our public computer industry to degrade into the worst and least efficient
thing it could produce.
You can lend your processor power to valuable scientific research:
http://boinc.berkeley.edu . I put an old AMD 1600+ PC to work solely for
this purpose. It consumes 70 watts, most of it for the AMD processor,
which costs almost 7 € of electricity per month (70 W around the clock is
about 50 kWh per month). Part of the calculations that computer performs
are for evaluating global warming. I don't know what 70 watts running 24/7
contributes to global warming, but there is a paradox here anyway. A
StrongARM-based computer would consume a tenth of that, or even less, for
the same calculation power...
For years now, Pentium and compatible processors have contained an integer
vector calculation unit: the "MMX". It is quite easy to get hold of a
compiler that can read MMX-oriented assembler instructions:
http://www.goof.com/pcg . Such compilers can also optimize the binary code
they produce by using MMX instructions. But I could find no C / C++
library with data structures and functions (or classes and methods) to
express algorithms aimed at the MMX calculation unit. Why do I have to
endure assembler code to benefit from the MMX? This is a good example of
"write your program the way you were told at school, and we will maybe
find a way to make it run a little faster". Why can't I express my
calculations the way the processor performs them? I would get a strong
speed increase, I would understand what I'm doing and I would learn
things. Today's processors could easily contain ten or so MMX units at
virtually no cost. If it were possible to write C / C++ programs
explicitly aimed at those MMX units, lots of scientific software and games
would be far faster and even easier to write. It's so simple, so obvious
and it would be so efficient. Why isn't it available?
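For what it's worth, the closest thing available today is the
per-instruction intrinsics of mmintrin.h, supported by GCC and other x86
compilers. They are a thin veneer over the assembler rather than the
library with data structures asked for above, but they give an idea of
what expressing a calculation in the processor's own vector terms looks
like from C. A minimal sketch:

    /* Adding two arrays of 16-bit integers, four elements per instruction,
       with MMX intrinsics instead of hand-written assembler. */
    #include <mmintrin.h>

    void add_i16(short *dst, const short *a, const short *b, int n)
    {
        int i;
        for (i = 0; i + 4 <= n; i += 4) {
            __m64 va = *(const __m64 *)(a + i);
            __m64 vb = *(const __m64 *)(b + i);
            *(__m64 *)(dst + i) = _mm_add_pi16(va, vb);  /* PADDW: 4 adds at once */
        }
        for (; i < n; i++)       /* scalar tail for leftover elements */
            dst[i] = a[i] + b[i];
        _mm_empty();             /* leave MMX state before any FPU code runs */
    }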