3 votes
#include <stdio.h>
int main()
{
     int *p = (int *)20;
     int *q = (int *)30;
     printf("%d", q - p);
}

Explain which one of the following is correct?

  1. Compilation error.
  2. 10
  3. 2
  4. None of these.

2 Answers

3 votes

It's undefined behavior as per the C standard.

According to the C Standard (6.5.6 Additive operators)

When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements.

Also note that it is undefined behaviour to dereference p and q unless they point to valid addresses holding int values.
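For contrast, here is a minimal sketch of the case the standard does define: both pointers point into the same array. The names index_distance and demo_same_array are hypothetical helpers made up for this illustration, not part of the question.

```c
#include <stddef.h>

/* Hypothetical helper: index distance between two elements of the SAME array.
   This is well defined only because both pointers point into one array object. */
ptrdiff_t index_distance(const int *from, const int *to)
{
    return to - from;   /* result type of pointer subtraction is ptrdiff_t */
}

/* Illustrative usage: the difference of the subscripts 5 and 2,
   whatever sizeof(int) happens to be. */
ptrdiff_t demo_same_array(void)
{
    int a[8] = {0};
    return index_distance(&a[2], &a[5]);
}
```

To print such a result, use the %td conversion, which is the printf format for ptrdiff_t; %d expects an int.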

See this discussion on stackoverflow: 

https://stackoverflow.com/questions/27885823/difference-between-two-pointer-variables

1 votes
#include <stdio.h>
int main()
{
     int *p = (int *)20;
     int *q = (int *)30;
     printf("%d", q - p);
}
On a typical implementation, q - p evaluates to the number of int-sized elements between the two addresses, not the difference in bytes.

Look at the code from the question above. Let's follow that idea and see why the output is 2.

An int occupies 2 or 4 bytes depending on the implementation ( the compiler and the target ABI, not simply whether the machine is 32-bit or 64-bit ). On most current 32-bit and 64-bit platforms, int is 4 bytes.

$$\left\lfloor \frac{30 - 20}{4} \right\rfloor = \lfloor 2.5 \rfloor = 2$$

Taking the floor of the quotient gives the answer 2.
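The arithmetic above can be reproduced with plain integers, which sidesteps the pointer subtraction altogether. This is only a sketch under the assumption that the addresses are held as uintptr_t values; whole_elements_between is a hypothetical helper name, and it does not claim to be what the compiler does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of whole elements of size elem_size that fit
   between two addresses held as plain integers (no pointer subtraction). */
size_t whole_elements_between(uintptr_t low, uintptr_t high, size_t elem_size)
{
    /* Unsigned integer division discards the fractional part. */
    return (size_t)((high - low) / elem_size);
}
```

Calling whole_elements_between(20, 30, 4) models a 4-byte int and yields 2; passing 2 as the element size models a 2-byte int and yields 5, which shows how the printed value would depend on the size of int.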

Personally, I feel pointer differences don't make sense unless both pointers refer to the same array, which is basically a consecutive block of elements. If you look at section 6.5.6 ( Additive operators ) of the open-std draft N1570, you'll find that it defines the result only for that case, so it says nothing about what q - p yields here.

Cheers.

UPDATE:  

Since this is still not entirely clear, let's look at the gdb disassembly of the executable. [ Note: it might be different for you. I'm using Apple LLVM version 8.1.0 (clang-802.0.42), default target x86_64-apple-darwin16.7.0. ]

This is the assembler source obtained from ~$ gcc -S -O2 test.c ( I have stripped the symbol table and other unnecessary parts )

_main:
LFB1:
    subq    $8, %rsp
LCFI0:
    movl    $2, %esi
    xorl    %eax, %eax
    leaq    LC0(%rip), %rdi
    call    _printf
    xorl    %eax, %eax
    addq    $8, %rsp
LCFI1:
    ret

Now let's focus on the disassembly of the part above. ( ~$ gdb test, then (gdb) disass /m main ) [ It's badly formatted, sorry. ] I have written comments for almost all the steps.

Dump of assembler code for function main:
   0x0000000100000f70 <+0>:    sub    $0x8,%rsp // reserve 8 bytes so the stack is 16-byte aligned at the call.
   0x0000000100000f74 <+4>:    mov    $0x2,%esi // move 0x2 = 2 ??!! into esi, printf's second argument. (although it follows the rule above, I don't think it'll do this always )...
   0x0000000100000f79 <+9>:    xor    %eax,%eax // zero eax: this variadic call uses no vector registers. ...also, why the compiler puts 2 here is itself undefined.
   0x0000000100000f7b <+11>:    lea    0x2c(%rip),%rdi        # 0x100000fae
   0x0000000100000f82 <+18>:    callq  0x100000f8e // call printf(), which prints the value of esi ( which is 2 )
   0x0000000100000f87 <+23>:    xor    %eax,%eax // return value of main: 0.
   0x0000000100000f89 <+25>:    add    $0x8,%rsp // release the 8 bytes reserved above.
   0x0000000100000f8d <+29>:    retq
End of assembler dump.

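All that main does with the stack is adjust the stack pointer. p and q never exist at run time: the compiler has already replaced q - p with the constant 2.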

There is no guaranteed relation between p, q and 2. It's actually futile to pursue this further: buggy input gives weird, undefined output. ( The compiler may interpret the code in an arbitrarily bizarre way without violating the ANSI C standard. )

After a lot of reading, trial and error, and experimentation, I can say with confidence that it's UNDEFINED BEHAVIOUR. Pointer arithmetic on such pointers just doesn't make sense to me, you, the compiler or the machine.

TL;DR;
It's not worth it. Move forward and understand better stuff. 
