modifying constants in Fortran and elsewhere

Discussion:

gah4

2023-07-11 02:42:01 UTC

A potential bug since the earliest days of Fortran is passing a
constant to a subroutine, and then changing the value of the dummy
argument.

In at least some Fortran systems, this modifies the value of a constant
used in other places in the program.

As this was known when PL/I was designed, it is defined such that
modifiable constants are passed to called procedures. C avoids it by
not allowing the & operator on constants. (Though K&R allows
modification of string constants.)

Somehow, in all the years, that feature was never added to Fortran.

It is easy to write programs and test for it, but I wonder if there
are any stories of real programs that had this bug, and even better,
stories about the difficulty of finding it, or problems caused by it.

Thomas Koenig

2023-07-15 10:57:48 UTC

Could come in handy if the value of PI should change during the
execution of the program :-)

This is a consequence of the standard /360 calling convention.
Both arguments and local variables were put in close proximity to
the code, if possible within the range of a base register. It was
all read-write, and the compiler optimized duplicate constants.
(The explanation above is only for non-reentrant code, which was
the usual case for FORTRAN, but they could be made to use reentrant
code using a compiler option).
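
The pooling effect described above can be modelled in a few lines of
Python. This is only a toy simulation, not how any real compiler works:
Cell, ConstantPool and increment are hypothetical names, and a Cell
stands in for one read-write storage word that is passed by address.

```python
class Cell:
    """One word of read-write storage."""
    def __init__(self, value):
        self.value = value


class ConstantPool:
    """Merges duplicate literals so each distinct value has one cell."""
    def __init__(self):
        self._cells = {}

    def literal(self, value):
        # The lookup key is the literal as written in the source text,
        # not the current contents of the cell.
        if value not in self._cells:
            self._cells[value] = Cell(value)
        return self._cells[value]


def increment(dummy):
    """A subroutine that assigns to its dummy argument."""
    dummy.value += 1


pool = ConstantPool()
increment(pool.literal(1))             # like CALL INCR(1)
print(pool.literal(1).value)           # the literal "1" now reads as 2
print(pool.literal(1).value + pool.literal(1).value)  # "1 + 1" gives 4
```

Because every use of the literal shares one cell, the damage shows up
far from the call that caused it, which is what makes the bug hard to
find.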

Post by gah4
As this was known when PL/I was designed, it is defined such that
modifiable constants are passed to called procedures. C avoids it by
not allowing the & operator on constants. (Though K&R allows
modification of string constants.)

You can still try by passing a pointer to a const variable, but
chances are you will get a segfault when you try to modify it.

Post by gah4
Somehow, in all the years, that feature was never added to Fortran.

Fortran has the VALUE attribute for dummy variables now, which
generates a local copy of the variable. Compilers differ on how
they implement it; passing VALUE arguments as actual value, like
C usually does, or passing a pointer and making a local copy are
both valid choices.

Post by gah4
It is easy to write programs and test for it, but I wonder if there
are any stories of real programs that had this bug, and even better,
stories about the difficulty of finding it, or problems caused by it.

I actually got bitten by that while using a mainframe for scientific
work as a student. It's been a few decades, so I don't recall too
many details. It was difficult to find, but I was paid by the hour,
so I didn't mind too much :-)
[The constant stomping issue far predates S/360. As soon as Fortran II
added subroutines on the 704, there were constant arguments you could
change by mistake. The problem is that it took quite a while for people
to sort out the differences among call by reference, call by value,
and call by copy in/out. Fortran on the 70x and S/360 used reference
for array arguments, copy in/out for scalars. Algol 60 tried to define
its argument passing in an elegant way, and accidentally invented call
by name when they meant call by reference. -John]

Hans-Peter Diettrich

2023-07-16 09:56:27 UTC

Post by Thomas Koenig

Post by gah4
A potential bug since the earliest days of Fortran is passing a
constant to a subroutine, and then changing the value of the dummy
argument.

I remember such a bug in the 80s on the pdp-11/34 of our institute.

Post by Thomas Koenig
Algol 60 tried to define
its argument passing in an elegant way, and accidentally invented call
by name when they meant call by reference. -John]

By Name example from the German wikipedia
<https://de.wikipedia.org/wiki/Namensparameter>
A[1] := 10
A[2] := 20
i := 1
x := Funkt (A[i], i)

FUNCTION Funkt (a, i) : Real
i := i+1
RETURN a
END

x is assigned the value 20. With a value parameter x would be 10
instead, and likewise with a reference parameter.
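
For readers who want to try this, here is a rough Python sketch of the
same example. It assumes that a name parameter can be modelled as a
thunk (a small function re-evaluated on every use); funkt,
call_by_name and call_by_reference are hypothetical names, not part of
the Wikipedia example.

```python
def funkt(read_a, read_i, write_i):
    write_i(read_i() + 1)   # i := i + 1
    return read_a()         # RETURN a


def call_by_name():
    A = {1: 10, 2: 20}
    env = {"i": 1}

    def write_i(value):
        env["i"] = value

    # By name: the expression A[i] is re-evaluated each time a is used,
    # so it sees the incremented i.
    return funkt(lambda: A[env["i"]], lambda: env["i"], write_i)


def call_by_reference():
    A = {1: 10, 2: 20}
    env = {"i": 1}

    def write_i(value):
        env["i"] = value

    # By reference: the element is selected once, at the call, while i
    # is still 1. Later changes to i do not move the reference.
    bound_index = env["i"]
    return funkt(lambda: A[bound_index], lambda: env["i"], write_i)


print(call_by_name())       # 20
print(call_by_reference())  # 10
```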

DoDi
[Then there's the infamous Knuth Man or Boy program, where even he
couldn't figure out what the answer was supposed to be.
https://en.wikipedia.org/wiki/Man_or_boy_test
-John]

Thomas Koenig

2023-07-16 13:08:07 UTC

Post by Thomas Koenig
[The constant stomping issue far predates S/360. As soon as Fortran II
added subroutines on the 704, there were constant arguments you could
change by mistake. The problem is that it took quite a while for people
to sort out the differences among call by reference, call by value,
and call by copy in/out. Fortran on the 70x and S/360 used reference
for array arguments, copy in/out for scalars.

I just have the IBM System/360 Operating System FORTRAN IV (H)
Programmer's Guide, Fourth Edition, open (isn't Bitsavers great?).

It states, on page 108

Argument List

The argument list contains addresses of
variables, arrays, and subprogram names
used as arguments. Each entry in the argu-
ment list is four bytes and is aligned on a
full-word boundary. The last three bytes
of each entry contain the 24-bit address of
an argument. The first byte of each entry
contains zeros, unless it is the last entry
in the argument list. If this is the last
entry, the sign bit in the entry is set on.

So, apparently no copy in/out on that particular compiler, at least.
It also shows the (ab)use that was possible of the uppermost byte,
because clearly 24 bits are enough for everybody and for all
time, right? :-)
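
As an illustration of the layout quoted from the manual, the following
Python sketch packs and unpacks such an argument list. It assumes
big-endian 32-bit words, as on S/360; build_argument_list and
parse_argument_list are hypothetical helper names, and the addresses
are made up.

```python
import struct

LAST_ENTRY_FLAG = 0x80000000   # sign bit of the four-byte entry
ADDRESS_MASK = 0x00FFFFFF      # low three bytes: the 24-bit address


def build_argument_list(addresses):
    """Pack 24-bit addresses into four-byte entries, flagging the last."""
    entries = []
    for position, address in enumerate(addresses):
        if not 0 <= address <= ADDRESS_MASK:
            raise ValueError("address does not fit in 24 bits")
        word = address
        if position == len(addresses) - 1:
            word |= LAST_ENTRY_FLAG
        entries.append(struct.pack(">I", word))
    return b"".join(entries)


def parse_argument_list(raw):
    """Read entries until the one with the sign bit set."""
    addresses = []
    for (word,) in struct.iter_unpack(">I", raw):
        addresses.append(word & ADDRESS_MASK)
        if word & LAST_ENTRY_FLAG:
            break
    return addresses


raw = build_argument_list([0x001000, 0x00ABCD])
print(raw.hex())                  # 000010008000abcd
print(parse_argument_list(raw))   # [4096, 43981]
```

The parser has to mask off the high byte before using the address,
which is exactly the habit that later made wider addressing painful.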
[See the 360 Fortran IV Language manual, where on pages 90-95 it explains
copy in/out and how to avoid it by putting slashes around the dummy
argument names.

https://bitsavers.org/pdf/ibm/360/fortran/C28-6515-7_System_360_FORTRAN_IV_Language_1966.pdf

I was there, I actually used this stuff. Re abuse of the upper byte, as the size of
OS/360 exploded way past what they expected, programmers were under pressure to make
every bit and byte count, hence overloading the high byte. -John]

gah4

2023-07-17 02:09:36 UTC

On Sunday, July 16, 2023 at 9:43:20 AM UTC-7, Thomas Koenig wrote:

(snip)

Post by Thomas Koenig
I just have the IBM System/360 Operating System FORTRAN IV (H)
Programmer's Guide, Fourth Edition, open (isn't Bitsavers great?).
It states, on page 108
Argument List
The argument list contains addresses of
variables, arrays, and subprogram names
used as arguments. Each entry in the argu-
ment list is four bytes and is aligned on a
full-word boundary. The last three bytes
of each entry contain the 24-bit address of
an argument. The first byte of each entry
contains zeros, unless it is the last entry
in the argument list. If this is the last
entry, the sign bit in the entry is set on.
So, apparently no copy in/out on that particular compiler, at least.
It also shows the (ab)use that was possible of the uppermost byte,
because clearly 24 bits are enough for everybody and for all
time, right? :-)

Soon after I started learning OS/360 Fortran, I found the LIST compiler option.

That gives a listing, sort-of, of the generated assembly code.
(Not good enough to actually assemble, but useful to figure out
what it actually does.)

First, OS/360 Fortran, like just about all the others at the time, does
all static allocation. The MAP option gives a listing of all the variables
used, and their addresses.

Each subroutine has a prologue, which copies all scalar variables to
the address indicated in the map, and an epilogue which copies them
back.

Variables are stored just after the executable code, and so addressable
with the same base register, as long as it fits in 4K bytes.
(Or two or three base registers for 8K or 12K bytes.) Once copied,
they are easily directly addressed.

Otherwise, to use one, the address is loaded into a register, and then
used, requiring usually two instructions. And two instructions to
copy each in, and two more to copy back out. So any variable referenced
more than four times, is faster with the copy.
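
The break-even point can be checked with a little arithmetic, written
here as Python. The sketch counts instructions only and assumes that a
direct reference to the local copy costs one instruction (the post
says only that it is "easily directly addressed"); the function names
are hypothetical.

```python
def cost_through_address(references):
    # Load the address, then use it: two instructions per reference.
    return 2 * references


def cost_with_copy(references):
    # Two instructions to copy in, two to copy out, then one
    # (assumed) instruction per direct reference.
    return 2 + 2 + references


for n in range(1, 7):
    print(n, cost_through_address(n), cost_with_copy(n))
```

The two costs are equal at four references (8 and 8), and the copy
wins from the fifth reference on (10 against 9).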

And yes, you can put the dummy argument between slashes, in which case
it won't do that.

There is one reason given by IBM. A DIRECT ACCESS statement, used with
a direct access file, keeps a variable with the record number of the next
record, in case you want to read sequentially. If you need to reference that
from a called subroutine, you need the slashes.

There are a few other cases involving variables in COMMON, such that a
variable could change at an unusual time.
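
One way such a COMMON case can go wrong is sketched below in Python.
This is an illustrative guess at the kind of situation meant, not
taken from the IBM manual: memory is a plain dict, COMMON_N is a
hypothetical variable in COMMON that is also passed as the argument,
and the two functions are hypothetical stand-ins for a subroutine
compiled with and without slashes around the dummy.

```python
def sub_copy_in_out(memory, arg_address):
    x = memory[arg_address]      # prologue: copy the argument in
    memory["COMMON_N"] = 99      # store to the COMMON variable directly
    x = x + 1                    # work on the local copy
    memory[arg_address] = x      # epilogue: copy back out


def sub_by_reference(memory, arg_address):
    memory["COMMON_N"] = 99      # same store to COMMON
    memory[arg_address] = memory[arg_address] + 1   # always goes to memory


memory = {"COMMON_N": 1}
sub_copy_in_out(memory, "COMMON_N")
print(memory["COMMON_N"])        # 2: the store of 99 was overwritten

memory = {"COMMON_N": 1}
sub_by_reference(memory, "COMMON_N")
print(memory["COMMON_N"])        # 100
```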

In any case, the Fortran standard, since Fortran 66, allows for this.

I didn't know until the above note from our moderator that this went back
to the IBM 704, where Fortran originated. I don't know the addressing
modes of the 704, and how or why that helped.

The only other compiler that I know in that much detail is Fortran-10
for the PDP-10. The PDP-10 has an indirect bit on addresses, such that
the processor will indirectly address with one instruction, though likely
the time of two instructions.

It is only more recently, with non-contiguous assumed shape arrays,
that this came back again. Fixed size and assumed size arrays are
assumed by the called routine to be contiguous. If a routine with
an assumed shape argument, calls another with assumed size,
it is usual to make a copy and pass that. Then copy back again
on return. In that case, it is the caller that does the copy, not the
callee, as for IBM.
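
The caller-side copy can be pictured with a Python list and a strided
slice. This is only an analogy under stated assumptions: legacy_scale
plays the role of a routine with an assumed-size dummy that expects
contiguous storage, and call_with_section is a hypothetical name for
what the compiler generates around the call.

```python
def legacy_scale(buffer, n, factor):
    """Expects n contiguous elements, like an assumed-size dummy."""
    for k in range(n):
        buffer[k] *= factor


def call_with_section(array, section, factor):
    temp = array[section]                  # caller copies in (contiguous)
    legacy_scale(temp, len(temp), factor)
    array[section] = temp                  # caller copies back on return


data = [1, 2, 3, 4, 5, 6]
call_with_section(data, slice(0, None, 2), 10)   # every second element
print(data)   # [10, 2, 30, 4, 50, 6]
```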
[The 70x series had 15 bit word addressing with index registers. Even
though they mostly did sign/magnitude arithmetic, indexing worked by
doing a two's complement subtract of the index register from the
address in the instruction. (I've seen lots of guesses why they did
that, but never found an actual source.) So Fortran arrays were stored
in reverse order starting in high memory. 70x Fortran did not have
the /X/ argument specifier and I cannot find anything in the manuals
about argument aliasing, although the calling sequence example in the
FAP assembler program shows how to copy in arguments.

The Fortran manual does say:

A constant may not appear as an argument in the
call to a SUBROUTINE or FUNCTION subprogram if
the corresponding dummy variable in the definition
of the subprogram appeared either on the left side of an
arithmetic statement or in an input list.

-John]

gah4

2023-07-17 17:51:56 UTC

(snip, our moderator wrote)

Post by gah4
[The 70x series had 15 bit word addressing with index registers. Even
though they mostly did sign/magnitude arithmetic, indexing worked by
doing a two's complement subtract of the index register from the
address in the instruction. (I've seen lots of guesses why they did
that, but never found an actual source.) So Fortran arrays were stored
in reverse order starting in high memory. 70x Fortran did not have
the /X/ argument specifier and I cannot find anything in the manuals
about argument aliasing, although the calling sequence example in the
FAP assembler program shows how to copy in arguments.

I have known for a long time that the 704 (and I believe later 7.. machines)
indexes arrays backwards, and allocates from the end of memory.

Allocating COMMON from the end of memory is convenient, in that is the
one place (other than the beginning) that you know will always be the same.

Indexing back means that it will still work, even with different length COMMON.
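
A small Python sketch of why this works out. It assumes one word per
COMMON variable and a hypothetical top address TOP; the real 704
layout had more detail than this, and layout_common is a made-up
helper.

```python
TOP = 32767   # hypothetical highest usable address (15-bit addressing)


def layout_common(names, top=TOP):
    """Assign COMMON variables downward from the top of memory."""
    return {name: top - offset for offset, name in enumerate(names)}


first_program = layout_common(["A", "B", "C"])
second_program = layout_common(["A", "B", "C", "D", "E"])

print(first_program)    # {'A': 32767, 'B': 32766, 'C': 32765}
print(second_program["B"] == first_program["B"])   # True
```

The leading variables land on the same addresses in both programs,
whatever the size of the code or of the rest of COMMON.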

One Fortran feature that I only recently found out about, from 1130 ECAP,
is CALL LINK.

A program can CALL LINK(other program name), and replace itself with
another executable program. Not a subroutine call, but the whole program
is replaced in memory. But COMMON blocks are kept!

(That might be before subroutine overlay that I know well.)

By the time of OS/360, that seems to have disappeared, but a fancy
subroutine overlay system then appeared.

In any case, allocating from the end of memory, and indexing backward,
is convenient for aligning data in different whole executable programs.

Maybe they were just lucky.
[The 70x IBSYS had what they called chain jobs, same idea that the
code was overwritten but the common data stayed. The 360 linker had an
elaborate system of tree structured overlays that let you divide the
code into segments loaded automatically when the program called a
routine in the segment, again leaving whatever was in the root
segment, typically including common blocks. You can read all about it
in Chapter 8 of "Linkers and Loaders", available from better
booksellers everywhere. -John]

gah4

2023-07-17 02:17:22 UTC

(snip)

I was there, I actually used this stuff. Re abuse of the upper byte, as the size of
OS/360 exploded way past what they expected, programmers were under pressure to make
every bit and byte count, hence overloading the high byte. -John]

Funny how history repeats itself. (As the saying goes.)

In this case, it was only the one bit, and with 31 bit addressing, it could
still be used. Though the compiler doesn't actually check for the end of
the argument list.

Much of OS/360, at all levels, uses the high byte of addresses.
Most important, in the DCB used for much I/O. With the transition to 31 bit
and 64 bit addressing, many of the old control blocks are still there.
The DCB is in user address space, and so not easily replaced.

Much of that is still true in OS/390 and z/OS, the 31 bit and 64 bit OS.

And then years later, Apple creates the Macintosh, and (original) MacOS.
And with 24 bit addressing on the 68000, uses the high byte of addresses.

Then with the 68020, there were programs known to be 32 bit clean,
and those that were not.

I suspect that all the hard learned lessons from the mainframe days were
relearned in the microcomputer days.

David Brown

2023-07-17 11:09:43 UTC

Post by gah4
A potential bug since the earliest days of Fortran is passing a
constant to a subroutine, and then changing the value of the dummy
argument.
In at least some Fortran systems, this modifies the value of a constant
used in other places in the program.
As this was known when PL/I was designed, it is defined such that
modifiable constants are passed to called procedures. C avoids it by
not allowing the & operator on constants. (Though K&R allows
modification of string constants.)
Somehow, in all the years, that feature was never added to Fortran.
It is easy to write programs and test for it, but I wonder if there
are any stories of real programs that had this bug, and even better,
stories about the difficulty of finding it, or problems caused by it.

I don't think any language can beat Forth here:

1 .
=> 1 ok
1 1 + .
=> 2 ok

: 1 2 ;
=> ok
1 .
=> 2 ok
1 1 + .
=> 4 ok

If you want to redefine the meaning of a number, you just define it like
any other identifier.
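
The mechanism is easy to imitate. Below is a toy interpreter in
Python, a sketch that assumes only the words "+", "." and colon
definitions and is nowhere near a real Forth; make_forth and its inner
helpers are hypothetical names. The point it shows is that the
dictionary is searched before a token is tried as a number.

```python
def make_forth():
    stack = []
    output = []
    words = {
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        ".": lambda: output.append(stack.pop()),
    }

    def compile_token(token):
        if token in words:                 # dictionary lookup comes first
            return words[token]
        value = int(token)                 # only then parse as a number
        return lambda: stack.append(value)

    def make_word(body):
        def word():
            for step in body:
                step()
        return word

    def run(line):
        tokens = line.split()
        pos = 0
        while pos < len(tokens):
            if tokens[pos] == ":":         # : name body ;
                end = tokens.index(";", pos)
                name = tokens[pos + 1]
                body = [compile_token(t) for t in tokens[pos + 2:end]]
                words[name] = make_word(body)
                pos = end + 1
            else:
                compile_token(tokens[pos])()
                pos += 1
        printed = output[:]
        output.clear()
        return printed

    return run


run = make_forth()
print(run("1 1 + ."))   # [2]
run(": 1 2 ;")
print(run("1 1 + ."))   # [4]
```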