3 votes
#include <stdio.h>
int main()
{
     int *p = (int *)20;
     int *q = (int *)30;
     printf("%d", q - p);
     return 0;
}

Which one of the following is correct? Explain.

  1. Compilation error.
  2. 10
  3. 2
  4. None of these.

2 Answers

3 votes

It's undefined behavior as per the C standard.

According to the C Standard (6.5.6 Additive operators)

When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements.

Also note that it is undefined behaviour to dereference p and q unless they point to valid addresses holding int values.

See this discussion on stackoverflow: 


1 votes
#include <stdio.h>
int main()
{
     int *p = (int *)20;
     int *q = (int *)30;
     printf("%d", q - p);
     return 0;
}
The answer is the number of ints that fit between p and q.

Look at the code above. Let's follow that idea and see why the output is 2.

int occupies 2 bytes or 4 bytes ( THIS DEPENDS ON THE IMPLEMENTATION, i.e. the compiler and its target, not simply on 32bit vs 64bit ) ( On most 32-bit and 64-bit platforms today int is 4 bytes )

$$\lfloor (30 - 20) / 4 \rfloor = \lfloor 2.5 \rfloor = 2$$

Taking the floor, the answer is 2.

Personally I feel pointer differences don't make any sense unless you're talking about an array, which is basically a consecutive block of elements. If you look at section 6.5.6 (Additive operators) of the N1570 draft of the standard, you'll find that it only says what the difference means for two pointers into the same array object (or one past its end); it says nothing about how the difference behaves otherwise.



As that still isn't very clear, let's look at the generated code and the gdb disassembly of the executable. [ Note: yours might differ. I'm using Apple LLVM version 8.1.0 (clang-802.0.42), default target x86_64-apple-darwin16.7.0. ]

This is the assembler source produced by ~$ gcc -S -O2 test.c ( I have stripped the symbol table and other unnecessary parts )

    subq    $8, %rsp
    movl    $2, %esi
    xorl    %eax, %eax
    leaq    LC0(%rip), %rdi
    call    _printf
    xorl    %eax, %eax
    addq    $8, %rsp

Now let's focus on the disassembly of the same part ( ~$ gdb test, then (gdb) disass /m main ). [ It's badly formatted, sorry. ] I have written comments for almost all the steps.

Dump of assembler code for function main:
   0x0000000100000f70 <+0>:    sub    $0x8,%rsp // reserve 8 bytes on the stack (keeps it 16-byte aligned for the call).
   0x0000000100000f74 <+4>:    mov    $0x2,%esi // put the constant 0x2 = 2 ??!! into esi, printf's second argument. (It follows the rule above, but nothing guarantees the compiler will always do this.)
   0x0000000100000f79 <+9>:    xor    %eax,%eax // zero eax: this variadic call passes no arguments in vector registers.
   0x0000000100000f7b <+11>:    lea    0x2c(%rip),%rdi        # 0x100000fae
   0x0000000100000f82 <+18>:    callq  0x100000f8e // call printf(), which prints the value in esi ( which is 2 )
   0x0000000100000f87 <+23>:    xor    %eax,%eax // return value of main = 0
   0x0000000100000f89 <+25>:    add    $0x8,%rsp // release the 8 bytes again.
   0x0000000100000f8d <+29>:    retq
End of assembler dump.

The whole of main is just stack setup and a call to printf with a constant; no subtraction is performed at run time.

Nothing in the generated code relates p and q to the 2. It's actually futile to pursue this. Buggy input gives weird, undefined output: the compiler may interpret the code in an arbitrarily bizarre way without violating the ANSI C standard.

After a lot of reading, trial and error, and experimentation I can say for sure that it's UNDEFINED BEHAVIOUR. Pointer arithmetic on pointers like these just doesn't make sense to me, you, the compiler or the machine.

It's not worth it. Move forward and understand better stuff. 
