Most users of g77 can be divided into two camps:
* Those writing new Fortran code to be compiled by g77.
* Those using g77 to compile existing, "legacy" code.
Users writing new code generally understand most of the necessary
aspects of Fortran to write "mainstream" code, but often need
help deciding how to handle problems, such as the construction
of libraries containing BLOCK DATA.
Users dealing with "legacy" code sometimes don't have much
experience with Fortran, but believe that the code they're compiling
already works when compiled by other compilers (and might
not understand why, as is sometimes the case, it doesn't work
when compiled by g77).
The following information is designed to help users do a better job coping with existing, "legacy" Fortran code, and with writing new code as well.
Without f2c, g77 would have taken much longer to
develop and probably would not have been as good for quite a while.
Sometimes people who notice how much g77 depends on f2c, and how
strongly its documentation encourages the use of f2c, ask why g77
was created if f2c already existed.
This section gives some basic answers to these questions, though it is not intended to be comprehensive.
g77 offers several extensions to the FORTRAN 77 language that f2c
doesn't:
* CYCLE and EXIT
* SELECT CASE
* KIND= and LEN= notation
* Constant expressions in FORMAT statements
  (such as `FORMAT(I<J>)',
  where `J' is a PARAMETER named constant)
* MvBits intrinsic
* libU77 (Unix-compatibility) library,
  with routines known to the compiler as intrinsics
  (so they work even when compiler options are used
  to change the interfaces used by Fortran routines)
g77 also implements iterative DO loops
so that they work even in the presence of certain "extreme" inputs,
unlike f2c.
See section Loops.
However, f2c offers a few that g77 doesn't, such as:
* Intrinsics in PARAMETER statements
* AUTOMATIC statement
It is expected that g77 will offer some or all of these missing
features at some time in the future.
g77 offers better diagnosis of problems in FORMAT statements.
f2c doesn't, for example, emit any diagnostic for
`FORMAT(XZFAJG10324)',
leaving that to be diagnosed, at run time, by
the libf2c run-time library.
g77 offers compiler options that f2c doesn't,
most of which are designed to more easily accommodate
legacy code.
However, f2c offers a few that g77 doesn't,
like an option to have REAL default to REAL*8.
It is expected that g77 will offer all of the
missing options pertinent to being a Fortran compiler
at some time in the future.
Saving the steps of writing and then rereading C code is a big reason
why g77 should be able to compile code much faster than using
f2c in conjunction with the equivalent invocation of gcc.
However, due to g77's youth, lots of self-checking is still being
performed.
As a result, this improvement is as yet unrealized
(though the potential seems to be there for quite a big speedup
in the future).
It is possible that, as of version 0.5.18, g77
is noticeably faster compiling many Fortran source files than using
f2c in conjunction with gcc.
g77 has the potential to better optimize code than f2c,
even when gcc is used to compile the output of f2c,
because f2c must necessarily
translate Fortran into a somewhat lower-level language (C) that cannot
preserve all the information that is potentially useful for optimization,
while g77 can gather, preserve, and transmit that information directly
to the GBE.
For example, g77 implements ASSIGN and assigned GOTO
using direct assignment of pointers to labels and direct
jumps to labels, whereas f2c maps the assigned labels to
integer values and then uses a C switch statement to encode
the assigned GOTO statements.
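For readers unfamiliar with the construct, the following is a minimal sketch of ASSIGN and assigned GOTO in FORTRAN 77. The program name `ASGN', the variable `IRET', and the label numbers are chosen here for illustration only; the sketch shows the source-level construct, not the code either compiler generates for it.

```fortran
      PROGRAM ASGN
      INTEGER IRET
C     Record which label to resume at, then jump to shared code.
      ASSIGN 20 TO IRET
      GOTO 100
20    PRINT *, 'back at label 20'
      ASSIGN 30 TO IRET
      GOTO 100
30    PRINT *, 'back at label 30'
      STOP
C     Shared code; control returns through the assigned GOTO.
100   PRINT *, 'in shared code'
      GOTO IRET, (20, 30)
      END
```

Each `GOTO IRET' resumes at whichever label was most recently assigned, which is the jump that g77 can implement directly and that f2c has to encode with a switch.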
However, as is typical, theory and reality don't quite match, at least
not in all cases, so it is still the case that f2c plus gcc
can generate code that is faster than g77.
Version 0.5.18 of g77 offered default
settings and options, via patches to the gcc
back end, that allow for better program speed, though
some of these improvements also affected the performance
of programs translated by f2c and then compiled
by g77's version of gcc.
Version 0.5.20 of g77 offers further performance
improvements, at least one of which (alias analysis) is
not generally applicable to f2c (though f2c
could presumably be changed to also take advantage of
this new capability of the gcc back end, assuming
this is made available in an upcoming release of gcc).
Because g77 compiles directly to assembler code like gcc,
instead of translating to an intermediate language (C) as does f2c,
support for debugging can be better for g77 than f2c.
However, although g77 might be somewhat more "native" in terms of
debugging support than f2c plus gcc, there still are a lot
of things "not quite right".
Many of the important ones should be resolved in the near future.
For example, g77 doesn't have to worry about reserved names
like f2c does.
Given `FOR = WHILE', f2c must necessarily
translate this to something other than
`for = while;', because C reserves those words.
However, g77 still uses things like an extra level of indirection
for ENTRY-laden procedures--in this case, because the back end doesn't
yet support multiple entry points.
Another example is that, given
      COMMON A, B
      EQUIVALENCE (B, C)
the g77 user should be able to access the variables directly, by name,
without having to traverse C-like structures and unions, while f2c
is unlikely to ever offer this ability (due to limitations in the
C language).
However, due to apparent bugs in the back end, g77 currently doesn't
take advantage of this facility at all--it doesn't emit any debugging
information for COMMON and EQUIVALENCE areas,
other than information
on the array of char it creates (and, in the case
of local EQUIVALENCE, names) for each such area.
Yet another example is arrays.
g77 represents them to the debugger
using the same "dimensionality" as in the source code, while f2c
must necessarily convert them all to one-dimensional arrays to fit
into the confines of the C language.
However, the level of support
offered by debuggers for interactive Fortran-style access to arrays
as compiled by g77 can vary widely.
In some cases, it can actually
be an advantage that f2c converts everything to widely supported
C semantics.
In fairness, g77 could do many of the things f2c does
to get things working at least as well as f2c---for now,
the developers prefer making g77 work the
way they think it is supposed to, and finding help improving the
other products (the back end of gcc; gdb; and so on)
to get things working properly.
To avoid the extensive hassle that would be needed to avoid this,
f2c uses C character constants to encode character and Hollerith
constants.
That means a constant like `'HELLO'' is translated to
`"hello"' in C, which further means that an extra null byte is
present at the end of the constant.
This null byte is superfluous.
g77 does not generate such null bytes.
This represents significant
savings of resources, such as on systems where `/dev/null' or
`/dev/zero' represent bottlenecks in the systems' performance,
because g77 simply asks for fewer zeros from the operating
system than f2c.
(Avoiding spurious use of zero bytes, each byte typically having
eight zero bits, also reduces the liabilities in case
Microsoft's rumored patent on the digits 0 and 1 is upheld.)
To ensure that block data program units are linked, especially a concern
when they are put into libraries, give each one a name (as in
`BLOCK DATA FOO') and make sure there is an `EXTERNAL FOO'
statement in every program unit that uses any common block
initialized by the corresponding BLOCK DATA.
g77 currently compiles a BLOCK DATA as if it were a SUBROUTINE,
that is, it generates an actual procedure having the appropriate name.
The procedure does nothing but return immediately if it happens to be
called.
For `EXTERNAL FOO', where `FOO' is not otherwise referenced in the
same program unit, g77 assumes there exists a `BLOCK DATA FOO'
in the program and ensures that by generating a
reference to it so the linker will make sure it is present.
(Specifically, g77 outputs in the data section a static pointer to the
external name `FOO'.)
The implementation g77 currently uses to make this work is
one of the few things not compatible with f2c as currently
shipped.
f2c currently does nothing with `EXTERNAL FOO' except
issue a warning that `FOO' is not otherwise referenced,
and, for `BLOCK DATA FOO',
f2c doesn't generate a dummy procedure with the name `FOO'.
The upshot is that you shouldn't mix f2c and g77 in
this particular case.
If you use f2c to compile `BLOCK DATA FOO',
then any g77-compiled program unit that says `EXTERNAL FOO'
will result in an unresolved reference when linked.
If you do the
opposite, then `FOO' might not be linked in under various
circumstances (such as when `FOO' is in a library, or you're
using a "clever" linker--so clever, it produces a broken program
with little or no warning by omitting initializations of global data
because they are contained in unreferenced procedures).
The changes you make to your code to make g77 handle this situation,
however, appear to be a widely portable way to handle it.
That is, many systems permit it (as they should, since the
FORTRAN 77 standard permits `EXTERNAL FOO' when `FOO'
is a block data program unit), and of the ones
that might not link `BLOCK DATA FOO' under some circumstances, most of
them appear to do so once `EXTERNAL FOO' is present in the appropriate
program units.
Here is the recommended approach to modifying a program containing a program unit such as the following:
      BLOCK DATA FOO
      COMMON /VARS/ X, Y, Z
      DATA X, Y, Z / 3., 4., 5. /
      END
If the above program unit might be placed in a library module, then
ensure that every program unit in every program that references that
particular COMMON area uses the EXTERNAL statement
to force the area to be initialized.
For example, change a program unit that starts with
      INTEGER FUNCTION CURX()
      COMMON /VARS/ X, Y, Z
      CURX = X
      END
so that it uses the EXTERNAL statement, as in:
      INTEGER FUNCTION CURX()
      COMMON /VARS/ X, Y, Z
      EXTERNAL FOO
      CURX = X
      END
That way, `CURX' is compiled by g77 (and many other
compilers) so that the linker knows it must include `FOO',
the BLOCK DATA program unit that sets the initial values
for the variables in `VARS', in the executable program.
The meaning of a DO loop in Fortran is precisely specified
in the Fortran standard...and is quite different from what
many programmers might expect.
In particular, Fortran iterative DO loops are implemented as if
the number of trips through the loop is calculated before
the loop is entered.
The number of trips for a loop is calculated from the start, end, and increment values specified in a statement such as:
DO iter = start, end, increment
The trip count is evaluated using a fairly simple formula based on the three values following the `=' in the statement, and it is that trip count that is effectively decremented during each iteration of the loop. If, at the beginning of an iteration of the loop, the trip count is zero or negative, the loop terminates. The per-loop-iteration modifications to iter are not related to determining whether to terminate the loop.
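As an illustration, the FORTRAN 77 standard gives the trip count as `MAX(INT((end - start + increment) / increment), 0)'. The following sketch evaluates that formula for a few integer loops; the program name `TRIPS' and the function name `ITRIP' are made up for this example and are not part of g77 or its library.

```fortran
      PROGRAM TRIPS
      INTEGER ITRIP
      EXTERNAL ITRIP
C     DO I = 1, 10, 3 runs for I = 1, 4, 7, 10: four trips.
      PRINT *, 'DO I = 1, 10, 3   trips:', ITRIP(1, 10, 3)
C     Start beyond end with a positive step: zero trips.
      PRINT *, 'DO I = 10, 1      trips:', ITRIP(10, 1, 1)
C     Counting down, I = 10, 7, 4, 1: four trips.
      PRINT *, 'DO I = 10, 1, -3  trips:', ITRIP(10, 1, -3)
      END

      INTEGER FUNCTION ITRIP(M1, M2, M3)
C     Trip count for DO iter = M1, M2, M3 with integer operands.
C     Integer division truncates, which plays the role of INT().
      INTEGER M1, M2, M3
      ITRIP = MAX((M2 - M1 + M3) / M3, 0)
      END
```

Because the count is fixed before the first iteration, changing the end or increment variables inside the loop body has no effect on how many times the loop runs.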
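There are two important things to remember about the trip count:
* It can be negative, in which case it is treated as if it were zero,
  meaning the loop is not executed at all.
* The type used to calculate the trip count is the same type as
  iter, but the final calculation, and thus the type of the trip
  count itself, always is INTEGER(KIND=1).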
These two items mean that there are loops that cannot
be written in straightforward fashion using the Fortran DO.
For example, on a system with the canonical 32-bit two's-complement
implementation of INTEGER(KIND=1), the following loop will not work:
      DO I = -2000000000, 2000000000
Although the start and end values are well within
the range of INTEGER(KIND=1), the trip count is not.
The expected trip count is 4000000001, which is outside
the range of INTEGER(KIND=1) on many systems.
Instead, the above loop should be constructed this way:
      I = -2000000000
      DO
        IF (I .GT. 2000000000) EXIT
        ...
        I = I + 1
      END DO
The simple DO construct and the EXIT statement
(used to leave the innermost loop)
are F90 features that g77 supports.
Some Fortran compilers have buggy implementations of DO,
in that they don't follow the standard.
They implement DO as a straightforward translation
to what, in C, would be a for statement.
Instead of creating a temporary variable to hold the trip count
as calculated at run time, these compilers
use the iteration variable iter to control
whether the loop continues at each iteration.
The bug in such an implementation shows up when the trip count is within the range of the type of iter, but the magnitude of `ABS(end) + ABS(incr)' exceeds that range. For example:
DO I = 2147483600, 2147483647
A loop started by the above statement will work as implemented
by g77, but the use, by some compilers, of a
more C-like implementation akin to
      for (i = 2147483600; i <= 2147483647; ++i)
produces a loop that does not terminate, because `i' can never be greater than 2147483647, since incrementing it beyond that value overflows `i', setting it to -2147483648. This is a large, negative number that still is less than 2147483647.
Another example of unexpected behavior of DO involves
using a nonintegral iteration variable iter, that is,
a REAL variable.
Consider the following program:
      DATA BEGIN, END, STEP /.1, .31, .007/
      DO 10 R = BEGIN, END, STEP
         IF (R .GT. END) PRINT *, R, ' .GT. ', END, '!!'
         PRINT *,R
10    CONTINUE
      PRINT *,'LAST = ',R
      IF (R .LE. END) PRINT *, R, ' .LE. ', END, '!!'
      END
A C-like view of DO would hold that the two "exclamatory"
PRINT statements are never executed.
However, this is the output of running the above program
as compiled by g77 on a GNU/Linux ix86 system:
 .100000001
 .107000001
 .114
 .120999999
 ...
 .289000005
 .296000004
 .303000003
LAST =   .310000002
 .310000002 .LE.   .310000002!!
Note that one of the two checks in the program turned up
an apparent violation of the programmer's expectation--yet,
the loop is correctly implemented by g77, in that
it has 30 iterations.
This trip count of 30 is correct when evaluated using
the floating-point representations for the begin,
end, and incr values (.1, .31, .007) on GNU/Linux
ix86.
On other systems, an apparently more accurate trip count
of 31 might result, but, nevertheless, g77 is
faithfully following the Fortran standard, and the result
is not what the author of the sample program above
apparently expected.
(Such other systems might, for different values in the DATA
statement, violate the other programmer's expectation,
for example.)
Due to this combination of imprecise representation
of floating-point values and the often-misunderstood
interpretation of DO by standard-conforming
compilers such as g77, use of DO loops
with REAL iteration variables is not recommended.
Such use can be caught by specifying `-Wsurprising'.
See section Options to Request or Suppress Warnings, for more
information on this option.
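One common way to avoid a REAL iteration variable is to iterate over an integer and compute the REAL value from it. The sketch below assumes the author of the sample program wanted the 31 values from .1 through .31; the names `RSTEP', `N', and `NSTEPS' are invented for this example.

```fortran
      PROGRAM RSTEP
      INTEGER N, NSTEPS
      REAL BEGIN, STEP, R
      DATA BEGIN, STEP /.1, .007/
C     Assumed for illustration: 31 values, .1, .107, ..., .31.
C     The count is stated explicitly instead of being derived
C     from a floating-point division.
      DATA NSTEPS /31/
      DO 10 N = 0, NSTEPS - 1
         R = BEGIN + N * STEP
         PRINT *, R
10    CONTINUE
      END
```

With an integer iteration variable the number of trips does not depend on how .1, .31, and .007 happen to be represented, and rounding error in `R' does not accumulate from one iteration to the next.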
Getting Fortran programs to work in the first place can be quite a challenge--even when the programs already work on other systems, or when using other compilers.
g77 offers some facilities that might be useful for
tracking down bugs in such programs.
A fruitful source of bugs in Fortran source code is use, or mis-use, of Fortran's implicit-typing feature, whereby the type of a variable, array, or function is determined by the first character of its name.
Simple cases of this include statements like `LOGX=9.227',
without a statement such as `REAL LOGX'.
In this case, `LOGX' is implicitly given INTEGER(KIND=1)
type, with the result of the assignment being that it is given
the value `9'.
More involved cases include a function that is defined starting
with a statement like `DOUBLE PRECISION FUNCTION IPS(...)'.
Any caller of this function that does not also declare `IPS'
as type DOUBLE PRECISION (or, in GNU Fortran, REAL(KIND=2))
is likely to assume it returns INTEGER, or some other type,
leading to invalid results or even program crashes.
The `-Wimplicit' option might catch failures to properly specify the types of variables, arrays, and functions in the code.
However, in code that makes heavy use of Fortran's
implicit-typing facility, this option might produce so
many warnings about cases that are working, it would be
hard to find the one or two that represent bugs.
This is why so many experienced Fortran programmers strongly
recommend widespread use of the IMPLICIT NONE statement,
despite it not being standard FORTRAN 77, to completely turn
off implicit typing.
(g77 supports IMPLICIT NONE, as do almost all
FORTRAN 77 compilers.)
Note that `-Wimplicit' catches only implicit typing of names. It does not catch implicit typing of expressions such as `X**(2/3)'. Such expressions can be buggy as well--in fact, `X**(2/3)' is equivalent to `X**0', due to the way Fortran expressions are given types and then evaluated. (In this particular case, the programmer probably wanted `X**(2./3.)'.)
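The following sketch pulls these cases together. The program name `TYPES' and the value assigned to `X' are chosen only for illustration. With IMPLICIT NONE in effect, leaving out the REAL declaration would be a compile-time error rather than a silent truncation to 9.

```fortran
      PROGRAM TYPES
      IMPLICIT NONE
      REAL LOGX, X
      LOGX = 9.227
      X = 8.0
C     2/3 is integer division and yields 0, so X**(2/3) is X**0.
      PRINT *, 'LOGX       =', LOGX
      PRINT *, 'X**(2/3)   =', X**(2/3)
C     REAL constants give the cube root squared, as intended.
      PRINT *, 'X**(2./3.) =', X**(2./3.)
      END
```

Note that IMPLICIT NONE does nothing for the second PRINT statement: the expression is correctly typed, just not what the programmer meant.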
Many Fortran programs were developed on systems that provided automatic initialization of all, or some, variables and arrays to zero. As a result, many of these programs depend, sometimes inadvertently, on this behavior, though to do so violates the Fortran standards.
You can ask g77 for this behavior by specifying the
`-finit-local-zero' option when compiling Fortran code.
(You might want to specify `-fno-automatic' as well,
to avoid code-size inflation for non-optimized compilations.)
Note that a program that works better when compiled with the
`-finit-local-zero' option
is almost certainly depending on a particular system's,
or compiler's, tendency to initialize some variables to zero.
It might be worthwhile finding such cases and fixing them,
using techniques such as compiling with the `-O -Wuninitialized'
options using g77.
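Once such a case is found, the fix is usually small. The sketch below, with an invented program name `ZINIT' and made-up data, initializes an accumulator and an array explicitly instead of assuming they start out as zero.

```fortran
      PROGRAM ZINIT
      REAL A(5), TOTAL
      INTEGER I
C     Explicit initialization instead of relying on zeroed storage.
      DATA A /5 * 0.0/
      TOTAL = 0.0
      A(2) = 1.5
      A(4) = 2.5
      DO 10 I = 1, 5
         TOTAL = TOTAL + A(I)
10    CONTINUE
      PRINT *, 'TOTAL =', TOTAL
      END
```

A program written this way behaves the same whether or not `-finit-local-zero' is specified.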
Many Fortran programs were developed on systems that
saved the values of all, or some, variables and arrays
across procedure calls.
As a result, many of these programs depend, sometimes
inadvertently, on being able to assign a value to a
variable, perform a RETURN to a calling procedure,
and, upon subsequent invocation, reference the previously
assigned variable to obtain the value.
They expect this despite not using the SAVE statement
to specify that the value in a variable is expected to survive
procedure returns and calls.
Depending on variables and arrays to retain values across
procedure calls without using SAVE to require it violates
the Fortran standards.
You can ask g77 to assume SAVE is specified for all
relevant (local) variables and arrays by using the
`-fno-automatic' option.
Note that a program that works better when compiled with the
`-fno-automatic' option
is almost certainly depending on not having to use
the SAVE statement as required by the Fortran standard.
It might be worthwhile finding such cases and fixing them,
using techniques such as compiling with the `-O -Wuninitialized'
options using g77.
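As a sketch of the standard-conforming alternative, the function below keeps a counter across calls by naming it in a SAVE statement. The names `KEEP', `NEXTID', and `NCALLS' are hypothetical and used only for this example.

```fortran
      PROGRAM KEEP
      INTEGER NEXTID, I, ID
      EXTERNAL NEXTID
      DO 10 I = 1, 3
         ID = NEXTID()
         PRINT *, 'ID =', ID
10    CONTINUE
      END

      INTEGER FUNCTION NEXTID()
C     NCALLS must survive between calls, so it is named in SAVE.
C     The DATA statement gives it its initial value once.
      INTEGER NCALLS
      SAVE NCALLS
      DATA NCALLS /0/
      NCALLS = NCALLS + 1
      NEXTID = NCALLS
      END
```

With the SAVE statement present, the function returns 1, 2, and 3 on successive calls regardless of whether `-fno-automatic' is used.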
The `-Wunused' option can find bugs involving implicit typing, sometimes more easily than using `-Wimplicit' in code that makes heavy use of implicit typing. An unused variable or array might indicate that the spelling for its declaration is different from that of its intended uses.
Other than cases involving typos, unused variables rarely indicate actual bugs in a program. However, investigating such cases thoroughly has, on occasion, led to the discovery of code that had not been completely written--where the programmer wrote declarations as needed for the whole algorithm, wrote some or even most of the code for that algorithm, then got distracted and forgot that the job was not complete.
As with unused variables, it is possible that unused arguments to a procedure might indicate a bug. Compile with the `-W -Wunused' options to catch cases of unused arguments.
Note that `-W' also enables warnings regarding overflow of floating-point constants under certain circumstances.
The `-Wsurprising' option can help find bugs involving
expression evaluation or in
the way DO loops with non-integral iteration variables
are handled.
Cases found by this option might indicate a difference of
interpretation between the author of the code involved, and
a standard-conforming compiler such as g77.
Such a difference might produce actual bugs.
In any case, changing the code to explicitly do what the
programmer might have expected it to do, so g77 and
other compilers are more likely to follow the programmer's
expectations, might be worthwhile, especially if such changes
make the program work better.
The `-falias-check', `-fargument-alias',
`-fargument-noalias',
and `-fno-argument-noalias-global' options,
introduced in version 0.5.20 and
g77's version 2.7.2.2.f.2 of gcc,
were withdrawn as of g77 version 0.5.23
due to their not being supported by gcc version 2.8.
These options, which control the assumptions regarding aliasing
(overlapping) of writes and reads to main memory (core) made
by the gcc back end,
might well be added back (in some form) in a future version
of gcc.
However, these options are supported by egcs.
The information below still is useful, but applies to
only those versions of g77 that support the
alias analysis implied by support for these options.
These options are effective only when compiling with `-O' (specifying any level other than `-O0') or with `-falias-check'.
The default for Fortran code is `-fargument-noalias-global'.
(The default for C code and code written in other C-based languages
is `-fargument-alias'.
These defaults apply regardless of whether you use g77 or
gcc to compile your code.)
Note that, on some systems, compiling with `-fforce-addr' in effect can produce more optimal code when the default aliasing options are in effect (and when optimization is enabled).
If your program is not working when compiled with optimization,
it is possible it is violating the Fortran standards (77 and 90)
by relying on the ability to "safely" modify variables and
arrays that are aliased, via procedure calls, to other variables
and arrays, without using EQUIVALENCE to explicitly
set up this kind of aliasing.
(The FORTRAN 77 standard's prohibition of this sort of
overlap, generally referred to therein as "storage
association", appears in Section 15.9.3.6.
This prohibition allows implementations, such as g77,
to, for example, implement the passing of procedures and
even values in COMMON via copy operations into local,
perhaps more efficiently accessed temporaries at entry to a
procedure, and, where appropriate, via copy operations back
out to their original locations in memory at exit from that
procedure, without having to take into consideration the
order in which the local copies are updated by the code,
among other things.)
To test this hypothesis, try compiling your program with
the `-fargument-alias' option, which causes the
compiler to revert to assumptions essentially the same as
made by versions of g77 prior to 0.5.20.
If the program works using this option, that strongly suggests
that the bug is in your program.
Finding and fixing the bug(s) should result in a program that
is more standard-conforming and that can be compiled by g77
in a way that results in a faster executable.
(You might want to try compiling with `-fargument-noalias',
a kind of half-way point, to see if the problem is limited to
aliasing between dummy arguments and COMMON variables--this
option assumes that such aliasing is not done, while still allowing
aliasing among dummy arguments.)
An example of aliasing that is invalid according to the standards is shown in the following program, which might not produce the expected results when executed:
      I = 1
      CALL FOO(I, I)
      PRINT *, I
      END

      SUBROUTINE FOO(J, K)
      J = J + K
      K = J * K
      PRINT *, J, K
      END
The above program attempts to use the temporary aliasing of the `J' and `K' arguments in `FOO' to effect a pathological behavior--the simultaneous changing of the values of both `J' and `K' when either one of them is written.
The programmer likely expects the program to print these values:
2 4 4
However, since the program is not standard-conforming, an implementation's behavior when running it is undefined, because subroutine `FOO' modifies at least one of the arguments, and they are aliased with each other. (Even if one of the assignment statements was deleted, the program would still violate these rules. This kind of on-the-fly aliasing is permitted by the standard only when none of the aliased items are defined, or written, while the aliasing is in effect.)
As a practical example, an optimizing compiler might schedule the `J =' part of the second line of `FOO' after the reading of `J' and `K' for the `J * K' expression, resulting in the following output:
2 2 2
Essentially, compilers are promised (by the standard and, therefore,
by programmers who write code they claim to be standard-conforming)
that if they cannot detect aliasing via static analysis of a single
program unit's EQUIVALENCE and COMMON statements, no
such aliasing exists.
In such cases, compilers are free to assume that an assignment to
one variable will not change the value of another variable, allowing
them to avoid generating code to re-read the value of the other
variable, to re-schedule reads and writes, and so on, to produce
a faster executable.
The same promise holds true for arrays (as seen by the called
procedure)---an element of one dummy array cannot be aliased
with, or overlap, any element of another dummy array or be
in a COMMON area known to the procedure.
(These restrictions apply only when the procedure defines, or writes to, one of the aliased variables or arrays.)
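One way to make a call like `CALL FOO(I, I)' conforming is to pass a copy, so the two dummy arguments name distinct storage. This is a sketch only: the program name `NALIAS' and the variable `ICOPY' are introduced here for illustration, and whether a copy gives the result the original author wanted depends on what the code was meant to compute.

```fortran
      PROGRAM NALIAS
      INTEGER I, ICOPY
      I = 1
C     Pass a copy as the second argument so that J and K
C     refer to distinct storage inside FOO.
      ICOPY = I
      CALL FOO(I, ICOPY)
      PRINT *, I
      END

      SUBROUTINE FOO(J, K)
      INTEGER J, K
      J = J + K
      K = J * K
      PRINT *, J, K
      END
```

In this version `FOO' sets `J' to 2 and `K' to 2, and the main program then prints 2, no matter how the compiler schedules the two assignments.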
Unfortunately, there is no way to find all possible cases of violations of the prohibitions against aliasing in Fortran code. Static analysis is certainly imperfect, as is run-time analysis, since neither can catch all violations. (Static analysis can catch all likely violations, and some that might never actually happen, while run-time analysis can catch only those violations that actually happen during a particular run. Neither approach can cope with programs mixing Fortran code with routines written in other languages, however.)
Currently, g77 provides neither static nor run-time facilities
to detect any cases of this problem, although other products might.
Run-time facilities are more likely to be offered by future
versions of g77, though patches improving g77 so that
it provides either form of detection are welcome.
For several versions prior to 0.5.20, g77 configured its
version of the libf2c run-time library so that one of
its configuration macros, ALWAYS_FLUSH, was defined.
This was done as a result of a belief that many programs expected
output to be flushed to the operating system (under UNIX, via
the fflush() library call) with the result that errors,
such as disk full, would be immediately flagged via the
relevant ERR= and IOSTAT= mechanism.
Because of the adverse effects this approach had on the performance
of many programs, g77 no longer configures libf2c
(now named libg2c in its g77 incarnation)
to always flush output.
If your program depends on this behavior, either insert the
appropriate `CALL FLUSH' statements, or modify the sources
to the libg2c, rebuild and reinstall g77, and
relink your programs with the modified library.
(Ideally, libg2c would offer the choice at run-time, so
that a compile-time option to g77 or f2c could
result in generating the appropriate calls to flushing or
non-flushing library routines.)
See section Always Flush Output, for information on how to modify
the g77 source tree so that a version of libg2c
can be built and installed with the ALWAYS_FLUSH
macro defined.
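A minimal sketch of the first approach follows, assuming the `FLUSH' intrinsic subroutine is called with the unit number as its argument. The unit number 10 and the file name `progress.log' are hypothetical.

```fortran
      PROGRAM LOGOUT
      INTEGER I
      OPEN (UNIT=10, FILE='progress.log', STATUS='UNKNOWN')
      DO 10 I = 1, 3
         WRITE (10, *) 'finished step', I
C        Push buffered output for unit 10 out to the system.
         CALL FLUSH(10)
10    CONTINUE
      CLOSE (10)
      END
```

Flushing only at the points where it matters, such as after each progress record, avoids most of the performance cost that led to ALWAYS_FLUSH being turned off.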
If your program crashes at run time with a message including
the text `illegal unit number', that probably is
a message from the run-time library, libg2c.
The message means that your program has attempted to use a
file unit number that is out of the range accepted by
libg2c.
Normally, this range is 0 through 99, and the high end
of the range is controlled by a libg2c source-file
macro named MXUNIT.
If you can easily change your program to use unit numbers in the range 0 through 99, you should do so.
Otherwise, see section Larger File Unit Numbers, for information on how
to change MXUNIT in libg2c so you can build and
install a new version of libg2c that supports the larger
unit numbers you need.
Note: While libg2c places a limit on the range
of Fortran file-unit numbers, the underlying library and operating
system might impose different kinds of limits.
For example, some systems limit the number of files simultaneously
open by a running program.
Information on how to increase these limits should be found
in your system's documentation.
If your program depends on exact IEEE 754 floating-point handling it may help on some systems--specifically x86 or m68k hardware--to use the `-ffloat-store' option or to reset the precision flag on the floating-point unit. See section Options That Control Optimization.
However, it might be better simply to put the FPU into double precision
mode and not take the performance hit of `-ffloat-store'. On x86
and m68k GNU systems you can do this with a technique similar to that
for turning on floating-point exceptions
(see section Floating-point Exception Handling).
The control word could be set to double precision by
replacing the __setfpucw call with one like this:
      __setfpucw ((_FPU_DEFAULT & ~_FPU_EXTENDED) | _FPU_DOUBLE);
(It is not clear whether this has any effect on the operation of the GNU maths library, but we have no evidence of it causing trouble.)
Some targets (such as the Alpha) may need special options for full IEEE conformance. See section `Hardware Models and Configurations' in Using and Porting GNU CC.
Code containing inconsistent calling sequences in the same file is
normally rejected--see section GLOBALS.
(Use, say, ftnchek to ensure
consistency across source files.
See section Generating Skeletons and Prototypes with f2c.)
Mysterious errors, which may appear to be code generation problems, can
appear specifically on the x86 architecture with some such
inconsistencies. On x86 hardware, floating-point return values of
functions are placed on the floating-point unit's register stack, not
the normal stack. Thus calling a REAL or DOUBLE PRECISION
FUNCTION as some other sort of procedure, or vice versa,
scrambles the floating-point stack. This may break unrelated code
executed later. The same can happen if, say, external C routines
are written incorrectly.
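A consistent calling sequence means that each caller declares the function with the same type its definition uses. In the sketch below, the name `IPS' is borrowed from the implicit-typing discussion, while the program name `CALLER' and the body of the function are invented for illustration.

```fortran
      PROGRAM CALLER
C     Without this declaration, the initial letter I would make
C     the caller treat IPS as an INTEGER function.
      DOUBLE PRECISION IPS, D
      EXTERNAL IPS
      D = IPS(3.0D0)
      PRINT *, D
      END

      DOUBLE PRECISION FUNCTION IPS(X)
      DOUBLE PRECISION X
      IPS = 2.0D0 * X
      END
```

When caller and function live in separate files, the compiler cannot check the declaration against the definition, which is where a tool such as ftnchek helps.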
These options should be used only as a quick-and-dirty way to determine how well your program will run under different compilation models without having to change the source. Some are more problematic than others, depending on how portable and maintainable you want the program to be (and, of course, whether you are allowed to change it at all is crucial).
You should not continue to use these command-line options to compile a given program, but rather should make changes to the source code:
-finit-local-zero
DATA
, so that
`-finit-local-zero' is not needed.
Consider using `-Wuninitialized' (which requires `-O') to
find likely candidates, but
do not specify `-finit-local-zero' or `-fno-automatic',
or this technique won't work.
-fno-automatic
SAVE
statements.)
Many other compilers do this automatically, which means lots of
Fortran code developed with those compilers depends on it.
The effect of this is that all non-automatic variables and arrays
are made static, that is, not placed on the stack or in heap storage.
This might cause a buggy program to appear to work better.
If so, rather than relying on this command-line option (and hoping all
compilers provide the equivalent one), add SAVE
statements to some or all program unit sources, as appropriate.
Consider using `-Wuninitialized' (which requires `-O')
to find likely candidates, but
do not specify `-finit-local-zero' or `-fno-automatic',
or this technique won't work.
The default is `-fautomatic', which tells g77
to try
and put variables and arrays on the stack (or in fast registers)
where possible and reasonable.
This tends to make programs faster.
Note: Automatic variables and arrays are not affected
by this option.
These are variables and arrays that are necessarily automatic,
either due to explicit statements, or due to the way they are
declared.
Examples include local variables and arrays not given the
SAVE attribute in procedures declared RECURSIVE,
and local arrays declared with non-constant bounds (automatic
arrays).
Currently, g77 supports only automatic arrays, not
RECURSIVE procedures or other means of explicitly
specifying that variables or arrays are automatic.
-fgroup-intrinsics-hide
Change the source code to use EXTERNAL for any external procedure
that might be the name of an intrinsic.
It is easy to find these using `-fgroup-intrinsics-disable'.
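The source changes recommended above can be sketched in a single small program. This is only an illustration under stated assumptions: the names DEMO, TALLY, NCALLS, and the user-written function RAND are hypothetical, and RAND was chosen merely as a name that some compilers also treat as an intrinsic.

```fortran
      PROGRAM DEMO
      INTEGER I, N
      REAL RAND, X
C     EXTERNAL forces a call to the user's own RAND below, even if
C     the compiler also knows an intrinsic with that name.
      EXTERNAL RAND
      DO 10 I = 1, 3
         CALL TALLY(N)
   10 CONTINUE
      X = RAND(N)
      PRINT *, N, X
      END

      SUBROUTINE TALLY(N)
      INTEGER N
      INTEGER NCALLS
C     SAVE keeps NCALLS alive between calls without -fno-automatic;
C     DATA gives it a defined initial value without -finit-local-zero.
      SAVE NCALLS
      DATA NCALLS /0/
      NCALLS = NCALLS + 1
      N = NCALLS
      END

      REAL FUNCTION RAND(ISEED)
      INTEGER ISEED
C     Hypothetical user-written function; not a real generator.
      RAND = REAL(MOD(ISEED * 7 + 3, 11)) / 11.0
      END
```

Only NCALLS is given SAVE and DATA here, because it is the only local variable whose value must survive from one call to the next.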
Aside from the usual gcc options, such as `-O',
`-ffast-math', and so on, consider trying some of the
following approaches to speed up your program (once you get
it working).
On some systems, such as those with Pentium Pro CPUs, programs
that make heavy use of REAL(KIND=2) (DOUBLE PRECISION)
might run much slower
than possible due to the compiler not aligning these 64-bit
values to 64-bit boundaries in memory.
(The effect also is present, though
to a lesser extent, on the 586 (Pentium) architecture.)
The Intel x86 architecture generally ensures that these programs will work on all its implementations, but particular implementations (such as Pentium Pro) perform better with more strict alignment. (Such behavior isn't unique to the Intel x86 architecture.) Other architectures might demand 64-bit alignment of 64-bit data.
There are a variety of approaches to use to address this problem:
Order your COMMON and EQUIVALENCE areas such
that the variables and arrays with the widest alignment
guidelines come first.
For example, on most systems, this would mean placing
COMPLEX(KIND=2), REAL(KIND=2), and INTEGER(KIND=2) entities first,
followed by REAL(KIND=1), INTEGER(KIND=1), and LOGICAL(KIND=1)
entities, then INTEGER(KIND=6) entities, and finally CHARACTER
and INTEGER(KIND=3) entities.
The reason to use such placement is that it makes it more likely
that your data will be aligned properly, without requiring
you to do detailed analysis of each aggregate (COMMON
and EQUIVALENCE) area.
Specifically, on systems where the above guidelines are
appropriate, placing CHARACTER entities before
REAL(KIND=2) entities can work just as well,
but only if the number of bytes occupied by the CHARACTER
entities is divisible by the recommended alignment for
REAL(KIND=2).
By ordering the placement of entities in aggregate
areas according to the simple guidelines above, you
avoid having to carefully count the number of bytes
occupied by each entity to determine whether the
actual alignment of each subsequent entity meets the
alignment guidelines for the type of that entity.
If you don't ensure correct alignment of COMMON elements, the
compiler may be forced by some systems to violate the Fortran semantics by
adding padding to get DOUBLE PRECISION data properly aligned.
If the unfortunate practice is employed of overlaying different types of
data in the COMMON block, the different variants
of this block may become misaligned with respect to each other.
Even if your platform doesn't require strict alignment,
COMMON should be laid out as above for portability.
(Unfortunately the FORTRAN 77 standard didn't anticipate this
possible requirement, which is compiler-independent on a given platform.)
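Use the (x86-specific) `-malign-double' option when compiling
programs for the Pentium and Pentium Pro architectures
(called 586 and 686 in the gcc configuration subsystem).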
The warning about this in the gcc manual isn't
generally relevant to Fortran,
but using it will force COMMON to be padded if necessary to align
DOUBLE PRECISION data.
When DOUBLE PRECISION data is forcibly aligned
in COMMON by g77 due to specifying `-malign-double',
g77 issues a warning about the need to
insert padding.
In this case, each and every program unit that uses
the same COMMON area
must specify the same layout of variables and their types
for that area
and be compiled with `-malign-double' as well.
g77 will issue warnings in each case,
but as long as every program unit using that area
is compiled with the same warnings,
the resulting object files should work when linked together
unless the program makes additional assumptions about
COMMON area layouts that are outside the scope
of the FORTRAN 77 standard,
or uses EQUIVALENCE or different layouts
in ways that assume no padding is ever inserted by the compiler.
Ensure that `crt0.o' or `crt1.o' on your system guarantees
a 64-bit aligned stack for main().
The recent one from GNU (glibc2) will do this on x86 systems,
but we don't know of any other x86 setups where it will be right.
Read your system's documentation to determine if
it is appropriate to upgrade to a more recent version
to obtain the optimal alignment.
Progress is being made on making this work
"out of the box" on future versions of g77, gcc,
and some of the relevant operating systems
(such as GNU/Linux).
A package that tests the degree to which a Fortran compiler
(such as g77)
aligns 64-bit floating-point variables and arrays
is available at ftp://alpha.gnu.org/gnu/g77/align/.
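As a rough illustration of the ordering guideline for COMMON areas, the sketch below places the widest entities first. The block and variable names (NUMS, TEXT, D, R, I, FLAG, TAG) are hypothetical, and the comments assume a typical system where DOUBLE PRECISION occupies 8 bytes and REAL, INTEGER, and LOGICAL occupy 4 bytes each.

```fortran
      PROGRAM LAYOUT
      DOUBLE PRECISION D(10)
      REAL R(5)
      INTEGER I
      LOGICAL FLAG
      CHARACTER*3 TAG
C     8-byte entities come first, then 4-byte ones, so no padding
C     is needed under the size assumptions stated above.
      COMMON /NUMS/ D, R, I, FLAG
C     Standard FORTRAN 77 keeps CHARACTER data in its own block.
      COMMON /TEXT/ TAG
      D(1) = 1.0D0
      R(1) = 2.0
      I = 3
      FLAG = .TRUE.
      TAG = 'ABC'
      CALL SHOW
      END

      SUBROUTINE SHOW
C     Every program unit declares the same layout for each block.
      DOUBLE PRECISION D(10)
      REAL R(5)
      INTEGER I
      LOGICAL FLAG
      CHARACTER*3 TAG
      COMMON /NUMS/ D, R, I, FLAG
      COMMON /TEXT/ TAG
      PRINT *, D(1), R(1), I, FLAG, TAG
      END
```

Keeping the CHARACTER variable in a separate block is a choice made for this sketch; with a compiler that accepts mixed blocks, it could instead go last in the same area, as the guidelines describe.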
If you're using `-fno-automatic' already, you probably should change your code to allow compilation with `-fautomatic' (the default), to allow the program to run faster.
Similarly, you should be able to use `-fno-init-local-zero' (the default) instead of `-finit-local-zero'. This is because it is rare that every variable affected by these options in a given program actually needs to be so affected.
For example, `-fno-automatic', which effectively SAVEs
every local non-automatic variable and array, affects even things like
DO iteration variables, which rarely need to be SAVEd,
and this often reduces run-time performance.
Similarly, `-finit-local-zero' forces such
variables to be initialized to zero. When they are SAVEd (such as when
`-fno-automatic' is specified), this by itself generally affects only
startup time for a program, but when they are not SAVEd,
it can slow down the procedure every time it is called.
See section Overly Convenient Command-line Options, for information on the `-fno-automatic' and `-finit-local-zero' options and how to convert their use into selective changes in your own code.
If you aren't linking with any code compiled using f2c,
try using the `-fno-f2c' option when
compiling all the code in your program.
(Note that libf2c is not an example of code
that is compiled using f2c; it is compiled by a C
compiler, typically gcc.)
Using an appropriate `-m' option to generate specific code for your CPU may be worthwhile, though it may mean the executable won't run on other versions of the CPU that don't support the same instruction set. See section `Hardware Models and Configurations' in Using and Porting GNU CC. For instance, on an x86 system the compiler might have been built (as shown by `g77 -v') for the target `i386-pc-linux-gnu', i.e. an `i386' CPU. In that case, to generate code best optimized for a Pentium, you could use the option `-march=pentium'.
For recent CPUs that don't have explicit support in the released version
of gcc, it might still be possible to get improvements
with certain `-m' options.
`-fomit-frame-pointer' can help performance on x86 systems and others. It will, however, inhibit debugging on the systems on which it is not turned on anyway by `-O'.