For a young field, computer science has a lot of history behind it. Here are just a few facts as they pertain to the Fortran programming language.
The most common conceptual model of a computer is the so-called von Neumann model. In this model, a computer is considered to consist of a linear memory and a central processor. Although reality is more complicated than this, the model is a fair approximation to the truth, and is certainly accurate enough to give insight into the basic issues of programming. Another aspect of the von Neumann model is the concept of the stored program: the instructions are part of the memory, and therefore a form of data themselves. This is largely irrelevant for most programming languages.
Later in the course we will also consider matters pertaining to secondary storage (disks, tapes) which is organised in files, and input / output devices.
Computer memory is linear: there is a series of basic information place holders, each identified by a number called its address. These place holders are usually bytes consisting of 8 bits, or words consisting of typically 2 or 4 bytes. While these powers of two are ubiquitous nowadays, in the recent past word lengths of 36 or 60 bits could be found.
The content, that is, the bit patterns, of these bytes and words can be interpreted as numerical or other data. We will go into this elsewhere. The memory also stores the instructions in a computer program. These are processed by a part of the processor called the Control Unit, but that is not of relevance for this course.
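To illustrate how one and the same bit pattern can be read as different kinds of data, here is a minimal sketch in Fortran. It assumes that default INTEGER and REAL variables both occupy 32 bits, which is common but not guaranteed; the program and variable names are made up for the illustration.

```fortran
program bit_patterns
  implicit none
  integer :: i
  real :: r

  ! Assumption: default INTEGER and REAL have the same size (typically 32 bits).
  r = 1.0

  ! TRANSFER copies the bits of r unchanged into an integer;
  ! no numerical conversion takes place.
  i = transfer(r, i)

  print *, 'read as a real:    ', r
  print *, 'read as an integer:', i
  ! The B edit descriptor shows the bit pattern itself.
  print '(a, b32.32)', ' bit pattern: ', i
end program bit_patterns
```

On a machine that stores reals in the IEEE single precision format, the integer printed is 1065353216, which bears no obvious relation to the value 1.0.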
The data in the computer memory is processed in the CPU, that is, the processing unit gets data from memory, operates on it, and stores the result back in memory. The CPU itself can be thought of as subdivided into registers and the Arithmetic Logic Unit.
Registers are a form of memory. Normally the CPU cannot directly operate on data in the main memory; instead it has to load data into the registers before it can perform the intended operation.
The fastest computers of the last 25 years no longer satisfy the von Neumann model. Instead they have multiple CPUs. The number of CPUs can vary from 2 to several thousand. Since these processors are all active at the same time, such a machine is called a parallel computer. In some parallel computers the processors always execute the same instruction (this is called the Single Instruction Multiple Data model, or lockstep mode), in some they can execute independent instructions (called Multiple Instruction Multiple Data, and sometimes Single Program Multiple Data).
A further distinction in parallel computers is between those where the processors operate on the same memory, and those where each processor has its own memory, but where a network of some type provides the possibility of data exchange between the memories. This distinction is not hard and fast: some computers have separate physical memories, but the operating system makes them look like a single address space, and some intermediate layer, eg the caching scheme, provides for the necessary communication, which is otherwise invisible to the programmer.
Programming languages serve to instruct the processor how to operate on data in memory. Languages come in various degrees of sophistication. A sequence of statements in a programming language is called a program. In general, a program for a specific purpose can be shorter if the language is more sophisticated.
Machine language is just about the most primitive language in which to talk to a computer.
LOAD memory address 522 TO REGISTER A
LOAD memory address 708 TO REGISTER B
ADD the contents of registers A and B leaving the result in C
STORE REGISTER C in memory address 15
Writing in machine language is very tedious.
Assembly language is one step up from machine language: you still have to spell out every single instruction, but you don't have to talk about explicit machine addresses anymore. Instead, symbolic names are used, which are later translated to explicit addresses.
LOAD X TO REGISTER A
LOAD Y TO REGISTER B
ADD the contents of registers A and B leaving the result in C
STORE REGISTER C in Z
The translation to machine language is done by a program called an assembler.
In high level languages it is easier to express what is meant, rather than exactly how the result is computed.
Z <- X+Y
The program translating this into machine language (or perhaps into assembly language as an intermediate step) is called a compiler.
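Written out in Fortran, the statement above could look as follows. This is only a minimal sketch: the program name and the values given to X and Y are assumptions made for the illustration.

```fortran
program add_two
  implicit none
  real :: x, y, z

  x = 2.0        ! illustrative values
  y = 3.0
  z = x + y      ! one statement; the compiler generates the loads, the add and the store
  print *, z
end program add_two
```

Nowhere in this program are registers or memory addresses mentioned; choosing them is left to the compiler.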
An important argument for higher level languages is that they are independent of the specific machine. For each high level programming language there is a language standard, declaring what a valid program looks like; any implementation on a specific machine has to adhere to this standard. This makes the language portable.
The most important difference with lower level languages is that there is no longer a one-to-one mapping between statements in the high level language and the instructions of the machine language program they are translated into. The compiler not only has to expand simple statements like the above, but it is also at liberty to rearrange the order of computation, as long as the result stays the same.
Example: the program
X <- Z^2-1
Y <- Z^2+1
can be computed more efficiently as
t <- Z^2
X <- t-1
Y <- t+1
at the cost of having to store one more data item. Compilers will perform these and other kinds of optimisations to let a program be executed as efficiently as possible. One of the merits of Fortran over other languages is that compilers can translate programs written in it to highly efficient code.
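The two versions of the example can be compared in Fortran as sketched below. The value of Z and the program name are assumptions; the second half shows by hand what a compiler may do internally, and is not something you normally need to write yourself.

```fortran
program common_subexpression
  implicit none
  real :: x, y, z, t

  z = 3.0                ! illustrative value

  ! As written by the programmer: z**2 appears twice.
  x = z**2 - 1.0
  y = z**2 + 1.0
  print *, x, y

  ! Rewritten by hand: z**2 is computed once and kept in t.
  t = z**2
  x = t - 1.0
  y = t + 1.0
  print *, x, y
end program common_subexpression
```

Both print statements show the same pair of values, 8 and 10, which is exactly the condition under which the compiler is allowed to make this transformation.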
The oldest widely known programming language is Fortran, which stands for FORmula TRANslating system. It dates back to the mid 1950s. As the name indicates, it was meant primarily for numerical computations, and for that purpose it is still widely used. In a number of ways, Fortran shows the signs of old age: certain features (such as the source code format) were appropriate for the systems on which the language was developed, and they have remained part of the language even though there is no use for them anymore.
As the Fortran language was implemented on machines of different manufacturers, additions and modifications were made. While such additions might be useful for writing a program on one specific machine, they make porting a program to another machine a hassle. Therefore, a standard for the Fortran language was drawn up, called Fortran66, for the year in which the draft was finished. Occasionally you may run into old code written in Fortran66. Most compilers nowadays still have an option to compile programs written in this old language.
You can recognise Fortran66 programs by the following tell-tale signs:
DO loops
In the 1960s many new languages were designed and implemented, often with useful features that were conspicuously absent in Fortran. The next standard, Fortran77, included some of the more common constructs of other programming languages (eg, the block IF statement), and it rectified one of the sillier aspects of Fortran66: the semantics of DO loops.
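The change in DO loop semantics can be illustrated with a loop whose start value already exceeds its end value. We assume here that this trip count rule is the aspect meant above. The sketch uses the later free-form syntax with END DO, and the program name and the variable ntrips are made up for the illustration.

```fortran
program zero_trip
  implicit none
  integer :: i, ntrips

  ntrips = 0
  ! The start value 5 is larger than the end value 1.
  do i = 5, 1
     ntrips = ntrips + 1
  end do

  ! Since Fortran77 the body of such a loop is skipped entirely.
  print *, 'the loop body was executed', ntrips, 'times'
end program zero_trip
```

Under the rules of Fortran77 and later this reports 0 executions. To our knowledge many Fortran66 compilers executed the body of such a loop once, which is why old codes sometimes carry an explicit test in front of a loop.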
To keep up with developments in programming languages, and to make Fortran better suited to parallel computers, a new standard was drawn up in the 1980s, first called Fortran8x in the hope that it could be finished by 1988. In the end it was called Fortran90, and it includes the whole of Fortran77. The transition from Fortran77 to Fortran90 is much more drastic than the earlier transition to Fortran77: a great number of very sophisticated new language constructs have been added.
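One of the constructs new in Fortran90 is array syntax, in which whole arrays are operated on without an explicit loop. We pick it here as an example of our own choosing; the array names and sizes in this sketch are assumptions.

```fortran
program array_syntax
  implicit none
  integer, parameter :: n = 5
  real :: a(n), b(n), c(n)
  integer :: i

  a = (/ (real(i), i = 1, n) /)   ! a = 1, 2, 3, 4, 5
  b = 10.0                        ! every element of b becomes 10
  c = a + 2.0*b                   ! elementwise: c(i) = a(i) + 2*b(i)

  print *, c
  print *, sum(c)                 ! intrinsic that adds up all elements
end program array_syntax
```

Since the statement for c does not prescribe an order in which the elements are computed, a compiler for a parallel computer is free to compute them simultaneously.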
Fortran95 is a slight extension and clarification of Fortran90.
And still some people are not satisfied: specifically for use on parallel computers the Fortran90 language was extended into the HPF (High Performance Fortran) standard.
Developing a computer program involves work that does not take place on the computer. Designing an algorithm is done partly in your head, partly on paper. Do not start typing right away. It is often useful to start sketching a program by giving a high level description, or drawing up a flow chart.
Your code can be correct (or, more likely, incorrect) to various degrees. First of all, it is syntactically correct if the compiler finds no (or, as one compiler has it, none of the) errors. A correctly compiling program can still be incorrect.
There are some cases where the language standard prohibits something that the compiler cannot detect, eg because the prohibited situation arises through the interaction of program parts that were compiled in separate files. For such semantic errors there are programs such as Ftnchek that can inspect a whole code and detect likely errors.
Errors that only appear when the program is actually run are called run-time errors. Examples are division by zero, or taking an element outside of the bounds of an array.
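Both examples can be guarded against by testing the offending value before it is used. The following is a minimal sketch; the array a and the values of idx and denom are hypothetical and chosen so that both tests fail.

```fortran
program runtime_checks
  implicit none
  integer :: a(5), i, idx, denom

  a = (/ (10*i, i = 1, 5) /)
  idx = 7        ! hypothetical index, deliberately outside 1..5
  denom = 0      ! hypothetical divisor

  ! Test the index against the array bounds before using it.
  if (idx >= lbound(a, 1) .and. idx <= ubound(a, 1)) then
     print *, 'a(idx) =', a(idx)
  else
     print *, 'index', idx, 'is outside the bounds of a'
  end if

  ! Test the divisor before dividing.
  if (denom /= 0) then
     print *, 'quotient =', a(1) / denom
  else
     print *, 'refusing to divide by zero'
  end if
end program runtime_checks
```

Without the two tests this program would still compile without complaint; the trouble would only show when it is run.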
It is possible that in coding your algorithm you actually instructed the computer to compute something other than what you intended it to. Such a `bug', or logical error, can be very hard to find, and bugs are the main reason why program development takes the time it does.
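A classic example of such a bug in Fortran is integer division. In the sketch below, with variable names made up for the illustration, the programmer intends both variables to hold one half.

```fortran
program integer_division
  implicit none
  real :: half_wrong, half_right

  ! Both operands are integers, so the division is an integer division:
  ! 1/2 gives 0, which is only then converted to the real value 0.0.
  half_wrong = 1/2

  ! With real operands the result is 0.5, as intended.
  half_right = 1.0/2.0

  print *, half_wrong, half_right
end program integer_division
```

The program is a valid Fortran program and runs without any error message, yet the first variable holds zero instead of one half.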
Even if your program works correctly most of the time, there may be cases in which it doesn't. To catch these, you need to test a program, trying to stretch the boundaries.
Finally there is always the possibility that your program does exactly what you wanted it to, but that your algorithm does not solve the problem you were tackling.
Once you have keyed in your program (using a text editor), the work is not finished. When you tell the computer to compile your program, the compiler may find syntax errors. You need to edit the source to correct these. When all syntax errors have been corrected you can run your program.
If your program is large enough that it is stored in more than one file, presumably containing different subprograms, your program needs to be linked before it can be run. The separate files can be compiled independently, however.
Your program then goes through the following stages: the source code is what you have written. This gets compiled into object code, and the object code (also called object modules if your program comes in several files) gets linked, eg to the system libraries, into an executable. Only this last incarnation of the program can be executed directly by the computer.
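The stages can be followed with a program consisting of two program units, sketched below. The file names and the compiler command gfortran in the comments are assumptions; your system may use a different compiler and different options. The two units are shown together here, but each could be stored in a file of its own.

```fortran
! Hypothetical file average.f90, compiled on its own into object code:
!     gfortran -c average.f90        (produces average.o)
subroutine average(x, n, avg)
  implicit none
  integer, intent(in) :: n
  real, intent(in)    :: x(n)
  real, intent(out)   :: avg
  avg = sum(x) / real(n)
end subroutine average

! Hypothetical file main.f90, also compiled on its own:
!     gfortran -c main.f90           (produces main.o)
! Linking the object modules and the system libraries into an executable:
!     gfortran average.o main.o -o main
program main
  implicit none
  real :: vals(4), avg

  vals = (/ 1.0, 2.0, 3.0, 4.0 /)
  call average(vals, 4, avg)
  print *, avg
end program main
```

If only the main program is changed afterwards, only that file needs to be recompiled before linking again.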
In the first couple of runs of your program, you need to test it with data for which you know the output, or at least for which you can determine whether the output makes sense. If it does not, you need to debug your code. Reread the source, print out intermediate results (and if necessary simulate your program on paper with a pocket calculator to check the intermediate results), and maybe use a debugger.
It is not enough that your program gives a correct result: the source should be readable, first of all to yourself in case you want to make changes later. Secondly, often other people will need to look at your code (eg, if it is a homework assignment!) and they need to understand it too. For this purpose:
Use PARAMETER statements to define `magical' numbers.
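A minimal sketch of what this looks like in practice; the names and the value of the radius are assumptions made for the illustration.

```fortran
program circle_area
  implicit none
  real :: pi, radius, area

  ! The constant gets a name once; its value cannot be changed afterwards.
  parameter (pi = 3.14159265)

  radius = 2.0                 ! illustrative value
  area = pi * radius**2        ! reads better than 3.14159265 * radius**2
  print *, area
end program circle_area
```

Apart from readability this has the advantage that the number needs to be corrected in only one place. In Fortran90 the same can be written in one line as a declaration with the PARAMETER attribute, eg real, parameter :: pi = 3.14159265.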