Sunday 15 April 2012

Notes on C default argument promotions and function prototypes

I spent a while researching this subject, so here are a few notes.

C has a thing called the "default argument promotions". If you are using modern standard C, the only time these will matter is when you are passing arguments to a variadic function (e.g. printf). The arguments in the variadic section may have their type converted before they are passed to the function. For example, float arguments are converted to double, and chars and shorts are converted to int. If you actually needed to pass, for example, a char instead of an int, the function would have to convert it back.
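Here is a small sketch of what that means for the function on the receiving end. The helpers 'take_char' and 'take_float' are names I made up for illustration; each one expects a single extra argument after the fixed 'tag' parameter, fetches it in its promoted type, and narrows it again.

```c
#include <stdarg.h>

/* Hypothetical helper: expects one char argument after `tag`.
   The caller's char arrives as an int, so we ask va_arg for an int
   and narrow it ourselves. */
char take_char(int tag, ...)
{
    va_list ap;
    int promoted;

    va_start(ap, tag);
    promoted = va_arg(ap, int);    /* va_arg(ap, char) would be undefined */
    va_end(ap);

    return (char)promoted;
}

/* Hypothetical helper: expects one float argument after `tag`.
   The caller's float arrives as a double. */
float take_float(int tag, ...)
{
    va_list ap;
    double promoted;

    va_start(ap, tag);
    promoted = va_arg(ap, double); /* va_arg(ap, float) would be undefined */
    va_end(ap);

    return (float)promoted;
}
```

Calling take_char(1, c) with a char c, or take_float(1, 1.5f), should hand back the original value, since the promoted types can represent every char and every float.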

This is a wart and exists because of early design decisions whose consequences we got stuck with for the sake of backwards compatibility. This post is for historical interest and to see why the rules are the way they are.

Here are some examples. We refer to the following functions:


int a (c1)
char c1;
{
printf("%%c:%c\n%%x:%04x\n", c1, c1);
}

int b (c1)
int c1;
{
printf("%%c:%c\n%%x:%04x\n", c1, c1);
}

int a1 (char c1)
{
printf("%%c:%c\n%%x:%04x\n", c1, c1);
}

int b1 (int c1)
{
printf("%%c:%c\n%%x:%04x\n", c1, c1);
}




We have some functions defined here using the pre-ANSI style, and some with the ANSI style, so we shall see what is possible.

We try different ways of calling these functions.


int
main (void)
{
int (*fp)();
int (*fp1)(char);

char c = 's';


We define 'fp' as a function pointer without a prototype. Therefore, all parameters passed via the function pointer are subject to the default argument promotion rules.

The following are correct usage:



fp = a;
fp ('st'); /* correct */


This prints:

%c:t
%x:0074


(I've decided to mix things up with some multi-character character constants, whose value is implementation-defined according to the standard. I am assuming that they can be used to define the individual bytes of an integer constant.) This is correct because even though 'a' takes a char as a parameter and we are passing it an int (character constants are of type int in C), 'a' is defined using the pre-ANSI style, so it should be called as if it were receiving an int, and the value will be automatically converted to a char within the function. (I wanted to see if I could pass both characters ('s' and 't') on to printf, but in this case only the 't' gets through, because the conversion to char keeps only the low byte here.)
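The two facts used above can be checked separately. This is only a sketch: the function names are mine, I use unsigned char so that the narrowing is fully defined, and I write the value as 0x7374 (what 'st' came out as on my compiler) rather than relying on the multi-character constant itself. It also assumes 8-bit bytes.

```c
/* Size of a character constant: in C it has type int, not char. */
unsigned long char_constant_size(void)
{
    return (unsigned long)sizeof('s');
}

/* Narrowing an int to unsigned char reduces the value modulo
   UCHAR_MAX + 1, i.e. it keeps the low byte when bytes are 8 bits. */
unsigned char keep_low_byte(int value)
{
    return (unsigned char)value;
}
```

With these, char_constant_size() should equal sizeof(int), and keep_low_byte(0x7374) should give 0x74, which is 't'.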


fp = b;
fp('st'); /* correct */


This prints:


%c:t
%x:7374


Now for something slightly more questionable:


fp = b1;
fp('st'); /* correct - mixing styles works */


This prints:


%c:t
%x:7374


This is questionable because we are calling a function defined using the ANSI style with a pointer declared using the old style - but the function is expecting an int, and gets an int - so everything is fine.


fp1 = a1;
fp1(c); /* correct */


This prints:

%c:s
%x:0073


just as you would expect it to.

Now for some incorrect examples:


/*fp = a1;
fp ('st'); /* incorrect */


It's impossible to call 'a1' correctly through fp, because it is expecting a char as a parameter, but 'fp' cannot pass chars as parameters. 'st' starts as type int, and is passed as type int to a1.


/*fp = a1;
fp (c); /* incorrect */


Even though c is of type char, the default argument promotions convert it to an int.


/* fp1 = a;
fp1(c); /* incorrect but works */


Here we are doing the opposite: calling an old-style function with a new-style pointer. c is passed as a char, but 'a' is expecting an int. It does seem to work on my computer but it is still wrong.

Here are some more functions which we will try calling:


int receive_float (f)
float f;
{
printf ("%f\n", f);
}

int receive_double (f)
double f;
{
printf ("%f\n", f);
}

int new_receive_float (float f)
{
printf ("%f\n", f);
}

int new_receive_double (double f)
{
printf ("%f\n", f);
}


Here are some attempts at calling these functions:


receive_float (1.0f); /*pass as double*/
receive_double (2.0); /*pass as double - 2.0 is of type double*/


Both of these are fine. 1.0f starts as a float, is converted to a double by the argument promotions, and within the function 'receive_float', it is converted back to a float. (Then within 'receive_float', it is converted again to a double to be passed to 'printf'.) The second line works exactly as we would expect it to.
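The round trip through double does not change the value, because every float value can be represented exactly as a double. A minimal sketch of the two conversions, written as explicit assignments rather than as an old-style call (the function name is made up):

```c
/* Widen a float to double and narrow it back, mimicking what happens
   to the argument of the old-style receive_float. */
float widen_then_narrow(float f)
{
    double promoted = f;       /* what the caller passes */
    return (float)promoted;    /* what the function body sees */
}
```

So widen_then_narrow(1.0f) compares equal to 1.0f, and the same holds for values such as 0.1f that are not exact in binary.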

Here is another example:

fp = new_receive_double;
fp(3.0);
fp(4.0f); /* use argument promotions */


'fp' is an old-style function pointer, so all float parameters are converted to doubles, as they should be.

Here is an incorrect example:

/* fp = new_receive_float; /* calling fp correctly is impossible */
fp(3.0); /* incorrect */
fp(4.0f); /* incorrect */

'new_receive_float' expects a float parameter, but fp can only pass it doubles.

Finally, some examples with variadic functions:



#if 0

/* varargs.h doesn't work anymore so I can't test calling it */

int old_variadic (va_alist)
va_dcl
{
int a, b;

va_list p;
va_start(p);

a = va_arg(p, int);
b = va_arg(p, int);

va_end(p);

printf("Args received: %d, %d\n", a, b);
}

#endif

int new_variadic (char *s, ...)
{
va_list p;
int a,b;

va_start (p, s);

a = va_arg(p, int);
b = va_arg(p, int);

va_end(p);

printf("Args received: %d, %d\n", a, b);
}



You will see I have commented out the first function. I wanted to test calling old-style variadic functions in various ways, but I can't compile any old-style functions on my computer any more.


new_variadic (0, 33, 44);


This prints:

Args received: 33, 44


as you'd expect.


int (*fpv)(char *, ...);

fpv = new_variadic; /* correct */
fpv (0, 33, 44);


works just the same.
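For reference, here is a self-contained version of the correctly prototyped case that returns a value instead of printing. 'sum_two' and 'call_via_prototyped_pointer' are hypothetical names, and the function adds its two int arguments rather than doing what 'new_variadic' does.

```c
#include <stdarg.h>

/* Hypothetical variadic function: adds the two int arguments after `s`. */
int sum_two(const char *s, ...)
{
    va_list p;
    int a, b;

    (void)s;                /* the fixed parameter is only used by va_start */
    va_start(p, s);
    a = va_arg(p, int);
    b = va_arg(p, int);
    va_end(p);

    return a + b;
}

/* Call it through a pointer whose type carries the variadic prototype. */
int call_via_prototyped_pointer(void)
{
    int (*fpv)(const char *, ...) = sum_two;

    return fpv(0, 33, 44);
}
```

Because the pointer type matches the function's type exactly, the compiler knows it is making a variadic call and can use whatever calling convention that requires.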

Now we get on to the interesting bit:


fp = new_variadic; /* incorrect but works */
fp (0, 33, 44);


is technically incorrect, but works anyway.

Why is it incorrect? Because variadic functions are supposed to be prototyped, according to the C standard. The reason for this is that compilers are allowed to use different calling conventions for variadic functions.

Wait a minute here - if all variadic functions have to be defined and called using the new-style, then why is there a rule that variadic parameters undergo default argument promotion, when default argument promotion is a left-over of the old way of doing things?

The answer, I believe, is binary compatibility. We may have a variadic library function defined using the old method.

If we have 'printf', or some other variadic function in a compiled library, which was written and compiled using the old system, then it is impossible to write a prototype properly for the function.

int printf(const char *, ...);
is incorrect, because this implies the library function 'printf' can be called using a different calling convention to non-variadic functions. However, before the introduction of function prototypes, such a difference couldn't and didn't exist, because the compiler had no way of telling what type of function it was when it was outputting the object code which was to call the function.

Hence, the correct way to declare the function is
int printf();
, with no ANSI-style prototype, which forces the parameters to undergo default promotion. But if the same calling convention is used for all functions, then
int printf(const char *, ...);
will work as well, as long as the variadic parameters undergo default promotion, which they do. For the standard library at least, you expect to be able to prototype the functions properly, and this allows this to happen. This isn't a very good justification though, because you can just recompile the library, or not use variadic prototypes. But there is a better justification.

As well as old library binaries being compatible with new programs, you also want new versions of libraries to be compatible with old program binaries, because in some cases you won't have the source code for the latter. If an old binary calls a variadic function in your library, the default argument promotions will be applied to the arguments. Then if you write a new version of the function, using stdarg.h, it still has to be called by the calling code in the old binary. So the code in the new version must provide the same interface as the old version, accepting the same types of arguments. This is the interface to any new programs as well, which declare the function using a variadic prototype.

Suppose the old program had code which was compiled from the following:

short int a;
int b;

printf("%hd", a); /* interpret parameter as a short int (but it is passed as an int) */
printf("%d", b); /* interpret parameter as an int */


We could use the same code in a new program, and the 'printf' would have to provide the same interface to both programs, so the new program has to pass 'a' as an int as well.
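As a sketch of that interface from the caller's side: a short argument reaches printf as an int either way, so both %hd and %d can read it. The two checker functions below are hypothetical; they format into a buffer with snprintf so that the result can be compared rather than printed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical checker: formats `value` with %hd and compares the text.
   `value` is promoted to int at the call; %hd tells the printf family
   to convert it back to short before printing. */
int short_prints_as(short value, const char *expected)
{
    char buf[32];

    snprintf(buf, sizeof buf, "%hd", value);
    return strcmp(buf, expected) == 0;
}

/* The same promoted argument can also be read with plain %d. */
int short_prints_as_int(short value, const char *expected)
{
    char buf[32];

    snprintf(buf, sizeof buf, "%d", value);
    return strcmp(buf, expected) == 0;
}
```

Both short_prints_as(-5, "-5") and short_prints_as_int(-5, "-5") should return 1.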

I couldn't find much information online about this subject. Here are a few links which are slightly relevant:

http://lkml.indiana.edu/hypermail/linux/kernel/0606.2/0330.html

http://www.archivum.info/comp.lang.c/2009-07/01250/Re-about-argument-pushing-order.html

http://mail-index.netbsd.org/tech-kern/1995/07/31/0002.html
