30 Years of Computing at CERN - Part 3
Paolo Zanella, Former Division Leader of DD
This is the last of a three-part series, made out of the original (excellent) paper written by Paolo Zanella in 1990. - Miguel Marquina, editor
Note that when the word "present" is used, it refers to "1990".
7. THE HARMONIOUS GROWTH OF THE 80'S
In the 1980's large systems and the computer industry in general have undergone a process of accelerated evolution. Today the profusion of different classes of computers indicates a very rapid expansion due to the technology and to a certain specialization. Terms like microcomputers, personal computers, professional workstations, minicomputers, superminis, minisupers, mainframes and supercomputers reflect the fact that problems can now be solved with systems that are not necessarily large, or so expensive that they have to be shared with other users. The advent of distributed computing services and the continuing advances in technology are making the classification of systems ever more difficult and imprecise.
During this period the computing environment at CERN has undergone a spectacular expansion inside and especially outside the computer center. The 80's have been characterized by frantic cabling of office buildings as well as laboratories and experimental areas. Integration has been the name of the game and standardization has emerged as an obvious corollary. The scene is no longer dominated by one or two high-end machines. Mainframes are part of a network of hundreds of shared computers and thousands of powerful, single-user workstations. If anything dominates this landscape it is the interoperability, the communications and networking aspect rather than the performance of any particular machine.
8. RECENT EVOLUTION OF HIGH-PERFORMANCE COMPUTING
The high-quality IBM hardware served first with a WYLBUR sauce and then with a VM/CMS flavor was an undeniable success and the 1980's saw a number of IBM systems succeeding each other ever more rapidly and supporting a population of users which has grown by a factor of ten in the last ten years! The individual mainframes appearing on the CERN stage had IBM numbernames such as 3032, 3081/D/K, 3090/200/400/600E. They were all based on the same architectural principles, and they offered a smooth growth from the unitary value of the 168 to the power of the current 3090/600E evaluated at 39 units. The main memory size underwent an even more impressive change from the initial 4 Mbytes to the present 256 Mbytes, not to speak of caches and another 256 Mbytes of solid state extensions. But the picture would be incomplete without mentioning the contributions of IBM compatible machines, such as the Siemens 7880 and 7890S which were run under the same operating systems as the IBM mainframes (MVS and VM) and in a way practically transparent to the users. The Siemens 7890S is currently providing some 13 units of capacity and it has set new records of reliability on-site.
In the meantime the CDC line was upgraded in several steps through the replacement of ageing machines by more modern ones, e.g. the Cybers 170/720/730/835/875, until the mid-80's when it was suddenly discontinued. The main reasons were financial and it is true that the computer center had to take its share of the sacrifices required by the construction of LEP. It is also true that this type of machine, suffering from an unstable operating environment and faced with tough competition from rival services, was gradually losing popularity in the HEP community.
Exit CDC, enter Cray. Early in 1988 a Cray vector supercomputer (Cray X-MP 48) was installed both to increase the center's strength in view of the large amount of LEP data expected, and to see if this type of number cruncher with powerful scalar and vector capabilities could be efficiently exploited by the imaginative HEP community and hence reduce the gap between needed and provided computing capacity. It is clearly too early to see whether vector architectures will play a significant role in the future of computing at CERN. In order to explore this alternative computational approach as seriously as possible, six Vector Facility (VF) devices have been added to the six processors of the IBM 3090/600.
Since 1982 there has been a fourth manufacturer in the computer centre: Digital Equipment Corp. (DEC). Unlike the other vendors, they found their way to the top after having spread hundreds of PDPs and VAXes all over the site. Physicists were clearly fascinated by the user friendliness of the hardware and of the software. Digital made neither black boxes nor... blue boxes. They could be opened, connected to experimental equipment, and coexist with home-made electronics in the counting room. The fierce competition (e.g. Hewlett-Packard, Norsk Data, CII, IBM, etc.) had difficulty surviving the aggression of the 32-bit VAX family propelled by VMS and DECnet. After almost all the experiments had been contaminated (or 'VAXinated'), and even the LEP machine's database and CAD system were installed on VAXes, a DECcluster was set up in the central computer room to offer a number of services initially reserved to the LEP experiments. The current central VAX system consists of an 8800, an 8700, three 8650s and an 8530. The total VAX count at CERN is well over 300 and still growing.
Two more success stories have marked the last few of the past 30 years, namely the personal computer and the workstation. The first are gradually invading the buildings and are used for a variety of administrative and office tasks, e.g. text processing, document handling, electronic mail, preparation of presentation material, spreadsheet and database work, etc. Two types of personal computer are present at CERN and account for well over 2000 systems: the ubiquitous Apple Macintosh and the IBM or IBM-compatible PCs (e.g. Olivetti), with the first type currently growing much faster than the second. For the record, the first IBM PCs appeared on-site in 1983.
As to the workstations, they are, and will be even more so in the future, a major factor of the computing environment. Apollo workstations connected into local area distributed systems already number over 100 and are growing fast. There are also several clusters of VAX workstations which have proved particularly useful for program development. The power of these machines is already quite impressive and is increasing quite dramatically every year, while their price is dropping due to technology and competition. It is easy to predict a very healthy future for the advanced workstation as part of a distributed system providing access to file servers, communications servers, print servers and compute servers. While one is learning how to solve the technical and managerial problems posed by these new systems, the number one challenge is how to face the extremely rapid obsolescence of such devices and how to plan and finance their continuous replacement.
To conclude this quick walk through the complex evolution of computing at CERN one should consider the growth of the central computers' usage from the Mercury to the current situation, expressed in CERN CPU units. It is true that CPU units do not tell the full story. If we, however, plot the number of tapes in the vault (a few hundred thousand to date), or the disk capacity against time, we get even steeper climbing curves. It is worth noting that this exponential progression has been necessary to support a rapidly increasing population of users. Thirty years after the first attempts at running an Autocode program, there are today over 6000 registered users of the CERN computer centre depending more and more on its facilities and services. It is also interesting to note that for the last 20 years this growth has been achieved at almost constant budgets and staff numbers.
9. FROM GENERAL PURPOSE TO HEP SPECIFIC COMPUTING
During the 30 years under consideration a few major trends have been driving the evolution of computing at CERN and, as a matter of fact, everywhere else. Among these forces there is a strong attraction towards decentralization and separation of functions. This is probably one of the causes and not a consequence of the extremely diverse market offer.
There is a definite tendency to specialize systems, associating them with a particular service, e.g. CAD, LEP Database, Microprocessor support, and to integrate them in the global informatics environment. This gives better service, better reliability and easier management. The on-line computing function is no exception and it is spreading over a number of machines each assigned to a particular set of jobs. A similar tendency can be observed in the accelerator control systems.
At CERN we have had a tradition in the design of special hardware and software. We seem to believe that our problems are very special and that we can solve them better than anybody else. It is fair to say that in some cases we may even be right!
As far as the hardware is concerned, we started by building a whole family of film measuring and digitizing equipment and when the film was replaced by electronic detectors we built all sorts of special processors, programmable fast trigger devices (ESOP, XOP) and emulators (Mice, 168/E, 3081/E). The emulators have been a real success in the last decade, both on-line as a high-level trigger technique and off-line as cheap event processors. We also designed and built the first communication systems (data links, FOCUS, OMNET, CERNET, etc.) well before one could buy network components off the shelf. The fact that one can now buy high-functionality building blocks merely shifts the integration job to a higher level.
One can tell a similar story for the software side. After the initial original contributions to basic system software (paper tape drivers, assemblers, compilers, operating systems, program libraries, data base management systems, etc.) the effort has shifted to higher levels of system integration and to specific HEP problems, e.g. HEPVM, ZEBRA, PAW, GEANT, etc. Moving closer to experiments, the evolution of computer applications has involved an ever larger fraction of the participating physicists. The computer wizard of the 60's has been replaced by the local guru of the 80's, but out in the field computing can still trigger religious wars!
10. THE LANDSCAPE AT THE START OF LEP
At the beginning of the LEP era the CERN Computer Centre offers a wide range of computing and communications facilities. The main systems for physics based on IBM, Siemens/Fujitsu, Cray and Digital mainframes account for almost 90 CERN units of installed scalar capacity (plus a substantial vector capability), roughly 100 000 times the power of the Mercury. Taking into account all the machines existing on-site including the workstations and the personal computers, the overall computing capacity has expanded by almost six orders of magnitude in 30 years.
Data communications at CERN cover electronic mail, file transfer, remote job entry and terminal access services. The ageing CERNET is still in use while an expanding set of Ethernet segments bridged together covers the whole site. 2000 terminals are connected to a Gandalf terminal switch called INDEX which has been operated for the last fifteen years and still allows users to work on any one of a large number of systems. The number of personal computers and workstations connected to Ethernet, the LEP token ring or any other local area network, is already well over 2000 and growing rapidly. A number of protocols are used over the local networks, e.g. CERNET, TCP/IP, OSI, Digital DNA and IBM SNA.
CERN is linked to universities and research laboratories all over the world via leased lines and the public X.25 network. A variety of services are provided over these lines. CERN acts as the central site for X.25 and DECnet networks used by the HEP community and as the Swiss node of EARN and EUNET. Currently the prevailing line speed is 64 kbps, but an urgent upgrade to 2 Mbps circuits has been advocated and is just starting.
At the experiments, data are filtered in real time and collected on tape by complex arrays of processing devices interconnected by high-speed bus systems, e.g. FASTBUS, VMEbus. Several techniques are applied in successive stages, e.g. fast special processors, dedicated microprocessors, emulators, conventional minis, workstations and midrange computers. A very important milestone, UA-1 (1983) can be considered the first of a new generation of experiments, making extensive use of computers and microprocessors (over 300) on-line to the detectors.
The selected, compressed and recorded experimental data are then submitted to further processing off-line at CERN as well as at the physicists' home institutes. Large scale special programs are developed for this analysis job. Software engineering techniques are beginning to be adopted by the larger collaborations. It is clear that the design and production of software can no longer be regarded as the creative outburst of some gifted individuals. Analysis chains are made of millions of lines of code to be executed on millions of events stored on tens of thousands of tape cassettes. The off-line activity has become a major management job. It also includes the simulation task, another very important and fast growing aspect of computation.
The final analysis stage makes increasing use of the powerful graphics capabilities of professional workstations. Complex data patterns are visualized as 2, 3 or multi-dimensional images. Recently, the 'Physics Analysis Workstation' Project (PAW) has produced a distributed application which takes advantage of the interactive graphics power of modern workstations, e.g. Apollo, DEC, and the number crunching and large storage capabilities of central mainframes. The introduction of interactive graphics techniques at CERN dates back to the 60's when we were using large and expensive CDC first generation displays to visualize physics events and mathematical functions (e.g. GAMMA, SIGMA). The first large scale project to exploit interactive graphics was ERASME, which used DEC computers (initially PDP 10 and 11's, and later, VAXes) to scan and measure bubble chamber films. The expertise built up throughout the 60's and 70's was then used in the 80's to design the modern event visualization and analysis systems (e.g. MERLIN developed for the UA-1 and UA-2 experiments), which are now widely used in experimental High Energy Physics.
Three decades of expanding computer applications have left behind a huge amount of software which after proper selection, sorting, and documentation has formed the basis of an invaluable treasury: the CERN Program Library. Continually updated and distributed worldwide, the Library is at the same time an active tool, an extremely useful collection of numerical procedures, and a unique repository where all the experience accumulated over 30 years of computing at CERN is conserved.
After 30 years of continuous development the field of computing is still changing very fast. The attention has been shifting from input/output to CPU speed and main memory size and then back to input/output and peripherals, from algorithms to programming languages and then to operating systems, from data links to networks and to distributed systems. The current problem is all of the above plus large scale data handling: we need dramatic improvements in data storage and transmission systems. The scale is new but the problem was already there with the Mercury...
New challenges and exciting opportunities appear every year. The problems of High Energy Physics have never been so complex, but never before has technology been so rich and generous in proposing solutions. Life is hectic for planners and service providers who can hardly match users' expectations and have a hard time keeping the environment stable while replacing hardware and software at an unprecedented rate.
The landscape is totally changed. The computer is no longer a technical curiosity. Its impact has reached levels far beyond imagination. Hardly anybody is unaffected. Yet the field is moving too fast to be really close to maturity. Success or failure still depends strongly on people. While expertise is spreading and most physicists, engineers and administrators are more comfortable with computers, our systems are vulnerable and depend on a few key experts. This is after all reassuring. Contrary to popular scenarios of the 50's, and in spite of their shortcomings, human beings are still in the driver's seat.
11. THE NEXT 30 YEARS
In 30 years' time the CERN Computing School will celebrate its 50th anniversary. If I had to pick one thing likely to still be alive 30 years from now I would choose FORTRAN. It is as safe a bet as to predict that everything else is going to change.
Of course there will be limits to growth. The number of users will level off one day since we are approaching a predictable ceiling. The number of personal computers and workstations, however, still has plenty of room for growth given that we are far from the limit of 2-3 systems per employee, already experienced in 'high-tech' industries. For sure, the financial limitation is going to be felt for quite some time.
Technology promises to continue to advance at the current rate for the foreseeable future. By the time one is no longer able to squeeze any more juice out of semiconductors and magnetic recording, optical techniques will come to the rescue. So, it should be possible to continue for a while to get more Mips, Megapixels, Gigaflops and Terabytes. And surely, we will know what to do with them, if anything at all is left for the user. The long awaited standardization of communication protocols, operating systems and application environments, added to the well-known appetite for resources of graphic layers, friendly user interfaces, encryption, file and database systems, will probably keep the user in a state of constant unhappiness...
The question is whether there will be any dramatic changes. I mean something even more dramatic than moving to a 16-bit byte, seeing a complete set of ISO protocols replacing TCP/IP, or watching physicists using C and UNIX... Will people stop programming, either because every possible program has been written or because neural networks will replace programmable circuits? Will the myth of the almighty central brain be revived by some super massive parallel engine leaving only text processing, MacPaint and video games to the personal play stations? When will a super friendly human interface entertain the user in his own language and provide him with all the information he wants without having to type and mouse? When will we all be working from home? Will the computer ultimately collapse to an intelligent pinhead, whilst data will expand to fill any available space?
Waiting for all this to happen, I invite the young generation of computing physicists to a short pictorial tour of the past: the selection of 'historical' images presented at the end of this paper should help understand and remember those very intense and exciting first 30 years. I realize that this short account does not cover many important aspects of the history of computing at CERN (e.g. Accelerator Controls, Computing in Theoretical Physics, Office and Management Information Systems, etc.), and that it only mentions very briefly many developments which would deserve a more extensive presentation. The rich specialized literature existing in the laboratory should help satisfy the legitimate curiosity of the history-minded student.
Appendix 3 (part 3)
Siemens 7890S [1985-present]
Made by Fujitsu (as the M382) and commercialized in Europe by Siemens, this twin-CPU IBM compatible machine is an early version of a speeded-up series and as such has a capacity of around 6.5 CERN units per CPU, which is very similar to the power of the 3090-200. It has 64 MBytes of main memory in 256 Kbit dynamic RAM chips, and 32 channels, but no expanded storage. The 7890 has a double level cache for each CPU, the first level consisting of 64 KBytes of 5.5 nsec, 4 Kbit/chip static RAM, and the second level (global buffer) consisting of 512 KBytes of 16 nsec, 16 Kbit/chip static RAM.
The machine is capable of running the IBM extended architecture system MVS/XA but not VM/XA because it lacks an instruction introduced later by IBM. Nonetheless it is an impressive machine given its appearance three years before the 3090/200 of comparable speed. At CERN, where it has set new standards of reliability, it shares peripherals with the IBM 3090 system and it was run under the same IBM operating system (initially MVS/JES2/WYLBUR, and later VM/CMS) until the end of 1988 when the 3090 started operating under VM/XA.
IBM 3090 [1986-present]
This is the current flagship of the IBM mainframes. The 3090-200 was installed in December 1985, with 64 MBytes of main memory in 64 Kbit chips, and 64 MBytes of expanded storage. It had two processors, each rated at 6 CERN units, and 32 (later 40) channels. The cache memory consisted of 64 KBytes per processor, and the CPU cycle was 18.5 nsec. The CPU was built in ECL chips. Above all the machine could be considered as IBM's first real scientific processor since the 370/195, as it is optimized for 64-bit word arithmetic.
The machine was upgraded to a model 3090-400E in May 1988, with 256 MBytes of main memory in 1 Mbit chips, 256 MBytes of expanded storage, and a total of 80 channels. In September 1988, four vector facilities (VF) and 16 channels were added, and at the end of 1988, two more CPU's and two VFs were installed, making a full 3090-600E with 6 VFs. Each processor is rated at 6.5 CERN units of scalar power, has a cycle time of 17.5 nsec, and 64 KBytes of cache memory. This model 600 can also be considered as a twin 'triadic machine'.
The disk capacity reached 200 GBytes towards the end of 1988 and is expected to grow very fast in the coming years. Old tape units are being replaced by cassette drives (IBM 3480: 18 tracks, 19000 bpi, holding over 200 MBytes). The operating system has been gradually changed from MVS to VM. Currently the machine runs under VM/XA. The IBM VM service is the workhorse of the Computer Centre. This installation is part of the IBM European Academic Supercomputer Initiative, a small number of IBM 3090's equipped with VFs and interconnected via EASInet (currently at 64 Kbps).
DEC VAX 8600/8650/8700/8800 [1985-present]
The central VAX cluster grew out of a set of VAX 11/780s installed in the Computer Centre from 1982 and dedicated to database (ORACLE) and CAD (EUCLID) services for the LEP machine designers.
In 1985, following a major upgrade including the installation of three VAX 8600s, an interactive VMS service was started, limited to the LEP experiments. In 1987 the VAX cluster was further expanded to include a twin-processor VAX 8800 and three 8650s. The VMS service offered to up to 200 simultaneous users was a big success and the central VAX cluster has become a focal point for the VAX/VMS community at CERN. With the addition in 1988 of an 8700 and an 8530 this cluster of DEC hardware has become a non-negligible entity in the Computer Centre with over 3000 registered users.
The largest machine in the cluster, the 8800, has two CPU's each providing about 1.5 CERN units of capacity and a main memory of 96 MBytes. The machines connected to the cluster share access to 18 tape units and over 50 GBytes of disk storage.
CRAY X-MP/48 [1988-present]
The Cray X-MP/48 supercomputer was installed in January 1988. It has a clock cycle time of 9.5 nsec, features 4 CPU's and 8 Mwords (64 bits) of central memory, and runs under UNICOS, a UNIX based operating system. It has powerful vector capabilities, 64-bit registers, 128 Mwords of solid-state backing store, 6 IBM 3480 tape cartridge units and some 48 GBytes of disk storage. Users' access is via IBM VM/CMS and VAX VMS. The total scalar power of the machine is about 32 CERN units. It is hoped that selected applications can be sufficiently vectorized to take full advantage of the vector capabilities of the Cray and thus gain appreciable speedups.