Fortran_and_cplusplus_part2

2018/07/16

Categories: Programming Tags: C++ FORTRAN

Mixing Fortran and C++ …continued

Introduction

As mentioned in my earlier article, mixing Fortran and C code is not an uncommon task. That earlier article outlined the basics by which Fortran functions and data structures (`common` blocks) could be accessed from C. In this follow-up, I will highlight a potential problem that can occur when sharing information in this way. This pitfall shows the importance of questioning your assumptions about how your programs behave.

Getting The Program

The code to accompany this article is available for download in the form of a gzipped tarball. It is very similar to the code used in the first article, and has in fact been simplified somewhat.

Compilation

When you have downloaded and unpacked the code, you can simply type make, and two executables will be produced: FortCTester and broken_FortCTester. Both executables are compiled from the same source files, cppMain.cpp and fortran_routines.F, but a preprocessor directive lets us compile the code twice, pulling in different headers each time. Thus, the differences between the programs can be seen by studying these headers. For FortCTester the relevant files are fortran_variables.h and cpp_variables.h, while for broken_FortCTester the relevant files are broken_fortran_variables.h and broken_cpp_variables.h.
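The header switch can be pictured roughly as follows. This is only a sketch, not the code from the tarball: the macro name USE_BROKEN is the one passed to the compiler in the build output, but inlining the two struct definitions (instead of including the real headers) and the helper layout_name are simplifications I have made up for illustration.

```cpp
#include <cstddef>   // offsetof

// Hypothetical stand-in for the header switch. The real code includes
// cpp_variables.h or broken_cpp_variables.h rather than inlining structs.
#ifdef USE_BROKEN
struct variables {          // integer first: the layout that misbehaves
    int    value_int;
    double value_double;
};
#else
struct variables {          // double first: the layout that works
    double value_double;
    int    value_int;
};
#endif

// Illustrative helper: reports which variant was compiled in.
constexpr const char* layout_name() {
#ifdef USE_BROKEN
    return "broken";
#else
    return "working";
#endif
}
```

Compiling with -D USE_BROKEN selects the first definition; without it, the second is used. Everything outside the #ifdef stays byte-for-byte the same, which is why the procedural code in the two executables is identical.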

Description of Problem

Observant readers may look at the names I selected for these executables and suspect that there might be something a bit off with one of them. Such is indeed the case.

Looking first at the working version, we run it as

./FortCTester

The output we get from this is as follows:

 start in C++

Setting variables in Fortran, via C++, and printing from C++
12345
5.67
 now in FORTRAN

 variables ---------------------
  int value 12345
  real*8 value  5.67
 Setting variables from Fortran
 back in C++

Try to print info from C++ side of the house
variables ---------------------
int value 12345
real*8/double value 5.67

This is essentially the same as we saw from the program in part 1.

Turning to the more suspiciously named executable, we run it and get quite different output (which may be surprising given that the procedural parts of the code are identical to those in FortCTester). The results are as follows:

 start in C++

Setting variables in Fortran, via C++, and printing from C++
12345
5.67
 now in FORTRAN

 variables ---------------------
  int value 2061584302
  real*8 value  5.31233305E-315
 Setting variables from Fortran
 back in C++

Try to print info from C++ side of the house
variables ---------------------
int value 12345
real*8/double value -6.09924e-320

Clearly something is wrong here. We write data into the structure from C without any difficulty, and immediately read it back correctly. However, when we move over to Fortran and try to read the same data, the results look like what one might expect to get when reading from uninitialised memory. In Fortran, we then have a go at writing into the data structure. Though it is not done here, the Fortran code can immediately and correctly read back the values it has written into the common block members. However, when we return to C we again have problems correctly accessing the data in memory. The integer appears to read back correctly, but the Double-Precision float is not at all what one would expect to see. What’s going on?

Examining the Problem

When I first encountered this problem (via a colleague who had noticed some unusual behaviour accessing Fortran common blocks) I found it very puzzling. A first hint to the source of the problem is the compiler warning you get during the build process:

g77  -g -ggdb -D USE_BROKEN -o broken_fortran_routines.o -c fortran_routines.F
fortran_routines.F: In subroutine `fortran_routine':
broken_fortran_variables.h:4: warning:
         common /vari/
                 ^
Initial padding for common block `vari' is 4 bytes at (^) --
consider reordering members, largest-type-size first

When I looked into this message, it appeared to be essentially a performance-related issue: depending on how values are aligned in memory, they may be easier (faster) or harder (slower) for the system to access. However, nothing I read indicated that this could cause the behaviour displayed.

However, given that the compiler warning points at the broken code, it seemed pretty likely that memory alignment would figure in the explanation of the strange behaviour. To look into this a bit more, I decided to use the GNU Debugger (gdb) to examine where the data was actually being written.

Firing up gdb is as simple as

gdb ./broken_FortCTester

Before typing run, you need to insert a breakpoint somewhere. Just before control hands over to Fortran seemed like as good a place as any:

(gdb) break 19
Breakpoint 1 at 0x8048b81: file cppMain.cpp, line 19.
(gdb) run
Starting program: /home/mconry/work/fortran_cpp_test/P2_test/FortranAndCPP_Problems/broken_FortCTester
 start in C++

Setting variables in Fortran, via C++, and printing from C++
12345
5.67

Breakpoint 1, main () at cppMain.cpp:19
19              fortran_routine__(); // This calls fortran_routine() from fortran_routines.f

So far, so good. Now, let’s examine what’s in memory, and also where in memory it is:

(gdb) print vari_
$3 = {value_int = 12345, value_double = 5.6699999999999999}
(gdb) print &vari_
$4 = (variables *) 0x8049300
(gdb) print &vari_.value_int
$5 = (int *) 0x8049300
(gdb) print &vari_.value_double
$6 = (double *) 0x8049304

So we see that the integer and double are stored consecutively in memory. The double is stored 4 bytes after the integer, which is what you might expect given that a standard integer is 4 bytes long in both C++ and Fortran (at least for the compiler and architecture used here). Next we go into Fortran and have a look at what Fortran thinks is going on:
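The same layout information can be had without a debugger. The sketch below is not part of the downloadable code: it redeclares the struct with the member order used by the broken header and prints each member's size and offset using offsetof. The helper name print_layout is made up for illustration.

```cpp
#include <cstddef>   // offsetof
#include <cstdio>    // std::printf

// Same member order as the broken C++ header: integer first, then double.
struct variables {
    int    value_int;
    double value_double;
};

// Print each member's size and its offset from the start of the struct.
inline void print_layout() {
    std::printf("value_int:    size %zu, offset %zu\n",
                sizeof(int), offsetof(variables, value_int));
    std::printf("value_double: size %zu, offset %zu\n",
                sizeof(double), offsetof(variables, value_double));
}
```

On the 32-bit system used here I would expect offsets 0 and 4, matching the gdb addresses above. On many 64-bit systems the compiler typically pads the double out to offset 8 instead, so the answer depends on the platform and compiler.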

(gdb) step
fortran_routine__ () at fortran_routines.F:13
13            print*, 'now in FORTRAN'
Current language:  auto; currently fortran
(gdb) step
 now in FORTRAN
14            print*,''
(gdb) print value_double__
$8 = 5.3123330517840809e-315
(gdb) print &value_double__
$9 = (PTR TO -> ( real*8 )) 0x8049308
(gdb) print &value_int
No symbol "value_int" in current context.
(gdb) print &value_int__
$10 = (PTR TO -> ( integer )) 0x8049304
(gdb)

Clearly, Fortran doesn’t think the variables are where our C++ code thinks they are. This is no more than we knew already from running our broken code. However, what we can see from the above transcript is _why_ the Fortran code gets confused: it thinks the integer starts 4 bytes later than C++ does (0x8049304 rather than 0x8049300). Thus, when we attempt to read the integer, we in fact read the first 4 bytes of the data that C++ wrote for the double. Then when we read the double, we get the second half of the 8-byte double, plus 4 bytes of memory that we haven’t written to at all.

When the Fortran code writes data, it does so under the same mistaken assumptions. This is why the integer is still readable when we return to C++: the Fortran code has written all of its data (all 12 bytes of it!) starting at the end of our C++ integer value.
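The overlap can be reproduced in plain C++ by treating the shared memory as a byte buffer and reading it through each language's offsets. This is a sketch under stated assumptions, not code from the tarball: the offsets are the gdb addresses relative to 0x8049300, the buffer is zeroed (as static storage typically is), and the machine is assumed to be little-endian with 4-byte integers and IEEE 754 doubles. The values Fortran writes are my guess (12345 and the single-precision literal 5.67 widened to real*8), and the names SharedBytes, after_cpp_write and after_fortran_write are hypothetical.

```cpp
#include <cassert>   // assert
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uint64_t
#include <cstring>   // std::memcpy, std::memset

// 16 bytes standing in for the shared block, zeroed like static storage.
struct SharedBytes {
    unsigned char raw[16];
    SharedBytes() { std::memset(raw, 0, sizeof raw); }

    // Copy a value into the buffer at the given byte offset.
    template <typename T> void put(std::size_t offset, T value) {
        assert(offset + sizeof(T) <= sizeof raw);
        std::memcpy(raw + offset, &value, sizeof(T));
    }
    // Copy a value of type T out of the buffer from the given byte offset.
    template <typename T> T get(std::size_t offset) const {
        assert(offset + sizeof(T) <= sizeof raw);
        T value;
        std::memcpy(&value, raw + offset, sizeof(T));
        return value;
    }
};

// Offsets taken from the gdb addresses, relative to 0x8049300.
const std::size_t kCppInt = 0, kCppDouble = 4;          // C++ view
const std::size_t kFortranInt = 4, kFortranDouble = 8;  // Fortran view

// Memory after C++ has stored its two values through its own offsets.
inline SharedBytes after_cpp_write() {
    SharedBytes mem;
    mem.put<int>(kCppInt, 12345);
    mem.put<double>(kCppDouble, 5.67);
    return mem;
}

// Memory after Fortran has then stored values through its offsets.
// ASSUMPTION: Fortran assigns 12345 and the single-precision literal 5.67.
inline SharedBytes after_fortran_write() {
    SharedBytes mem = after_cpp_write();
    mem.put<int>(kFortranInt, 12345);
    mem.put<double>(kFortranDouble, static_cast<double>(5.67f));
    return mem;
}
```

Under those assumptions, after_cpp_write().get<int>(kFortranInt) yields 2061584302 and after_cpp_write().get<double>(kFortranDouble) yields roughly 5.3123e-315, the same garbage Fortran printed. Going the other way, after_fortran_write().get<int>(kCppInt) is still 12345, while get<double>(kCppDouble) comes out as roughly -6.09924e-320, which matches the final C++ output and suggests the guess about the single-precision literal is at least plausible.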

Some experiments

It is interesting to play around with this code and see how we can break/fix it. You can try these changes out by editing the files experiment01_fortran_variables.h and experiment01_cpp_variables.h.

We have seen already that putting the 8 byte field (the double) at the start of the memory structures works. As it turns out you can achieve a similar effect by putting two 4 byte terms at the start. In Fortran we thus have:

      real*8        value_double
      integer       value_int
      integer       value_int2

      common /vari/
     +  value_int,
     +  value_int2,
     +  value_double
      save /vari/

And in C++:

struct variables
{
    int     value_int;
    int     value_int2;
    double  value_double;
};

Using this code gives identical results to the first working code.
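The reason this works is that the two integers occupy exactly 8 bytes, so the double lands on an 8-byte boundary without any padding being needed. A check along the following lines could be placed next to the struct to catch accidental reorderings at compile time. It is my own addition rather than something in the downloadable code, it assumes a C++11 compiler, and it only checks the C++ side: it cannot prove that the Fortran compiler lays the common block out the same way.

```cpp
#include <cstddef>   // offsetof

// C++ side of the two-integer experiment.
struct variables {
    int     value_int;
    int     value_int2;
    double  value_double;
};

// Compile-time guard (not in the original code): refuse to build if the
// double would not sit on an 8-byte boundary relative to the block start.
static_assert(offsetof(variables, value_double) % 8 == 0,
              "value_double is not 8-byte aligned within the block");
```

If someone later removes value_int2 and builds on a platform that packs the double at offset 4, as the 32-bit system here does, the build fails instead of silently producing the broken layout.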

Another way to fix the code is to declare the double as a float, so that all the fields are 4 bytes long. In this case, the entries get arranged contiguously and start from the same address in C++ and Fortran.

If we put a double at the start of the structure, followed by an integer, and then another double, we get a slightly different (but still undesirable) behaviour. In Fortran, the definitions used are:

      real*8        value_double
      real*8        value_double2
      integer       value_int

      common /vari/
     +  value_double2,
     +  value_int,
     +  value_double
      save /vari/

While in C++ the structure is defined as follows:

struct variables
{
    double  value_double2;
    int     value_int;
    double  value_double;
};

In this case, Fortran is able to correctly read the integer, but the second double (value_double) gets incorrectly accessed. Going through it in gdb shows the following from within the C++ code:

(gdb) print &vari_.value_double2
$1 = (double *) 0x8049300
(gdb) print &vari_.value_int
$2 = (int *) 0x8049308
(gdb) print &vari_.value_double
$3 = (double *) 0x804930c

Notice that value_double2 occupies 8 bytes, the integer comes straight after it, and value_double is stored right after the integer. Once we get into the Fortran code, printing the addresses of value_double2, value_int and value_double in turn shows where Fortran thinks our values are stored:

$4 = (PTR TO -> ( real*8 )) 0x8049300
(gdb) print &value_int__
$5 = (PTR TO -> ( integer )) 0x8049308
(gdb) print &value_double__
$6 = (PTR TO -> ( real*8 )) 0x8049310

The first two are just the same as before, but note that the last entry, value_double, is placed at 0x8049310, leaving a 4-byte gap after the end of the integer, whereas C++ placed it at 0x804930c. With this information, our observed errors are quite predictable. C++ and Fortran agree on where the integer is, so it can be read and written correctly. However, they disagree about where the second double in the structure lives, leading to an error.
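The comparison can be written down as a few lines of arithmetic on the addresses from the two gdb transcripts above. This is just a sketch for illustration: the type MemberAddress and the helpers mismatch, is_8_byte_aligned and print_report are made-up names, and I have taken the unlabelled first Fortran address to be that of value_double2.

```cpp
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uint32_t
#include <cstdio>    // std::printf

// One row per member: the address gdb reported from each language.
struct MemberAddress {
    const char*   name;
    std::uint32_t cpp;      // address as seen from C++
    std::uint32_t fortran;  // address as seen from Fortran
};

// Addresses copied from the gdb transcripts.
constexpr MemberAddress kMembers[] = {
    {"value_double2", 0x8049300u, 0x8049300u},
    {"value_int",     0x8049308u, 0x8049308u},
    {"value_double",  0x804930cu, 0x8049310u},
};
constexpr std::size_t kMemberCount = sizeof kMembers / sizeof kMembers[0];

// Number of bytes by which the two views of a member disagree.
constexpr std::uint32_t mismatch(const MemberAddress& m) {
    return m.fortran - m.cpp;
}

// True when an address falls on an 8-byte boundary.
constexpr bool is_8_byte_aligned(std::uint32_t address) {
    return address % 8 == 0;
}

// Print one line per member summarising the disagreement and alignment.
inline void print_report() {
    for (std::size_t i = 0; i < kMemberCount; ++i) {
        const MemberAddress& m = kMembers[i];
        std::printf("%-14s differs by %u bytes; C++ %s, Fortran %s\n",
                    m.name,
                    static_cast<unsigned>(mismatch(m)),
                    is_8_byte_aligned(m.cpp) ? "aligned" : "unaligned",
                    is_8_byte_aligned(m.fortran) ? "aligned" : "unaligned");
    }
}
```

Only value_double differs, by 4 bytes, and the address Fortran chose is the one that falls on an 8-byte boundary. That is consistent with the compiler warning seen earlier: Fortran here appears to insist on 8-byte alignment for real*8 values, while the C++ compiler on this system is content with 4.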