User:Rincewind82/sandbox

Verbose assembly and code bloat

C++ has long been accused of generating code bloat.[1][2] To measure this fairly, we use the same compiler to compare idiomatic C against modern C++ that uses the C++ standard library. The task is simple: initialize an array of strings and print it. This exercises some of the most fundamental constructs: strings, arrays and I/O.

// test.cpp
#ifdef TEST_C
#include <stdlib.h>
#include <stdio.h>

void StringArrayTest()
{
	const char *Strings[] = 
	{
		"One",
		"Two",
		"Three",
		"Four",
		"Five",
	};
	for(size_t i=0; i<(sizeof(Strings)/sizeof(Strings[0])); i++)
		puts(Strings[i]);
}
#endif

#ifdef TEST_CPP
#include <cstdlib>
#include <iostream>
#include <vector>
#include <string>

void StringArrayTest() 
{
	const std::vector<std::string> strings = 
	{
		"One",
		"Two",
		"Three",
		"Four",
		"Five",
	};
	for(auto &i : strings)
		std::cout << i << std::endl;
}
#endif

int main()
{
	StringArrayTest();
	return(EXIT_SUCCESS);
}

#!/bin/sh
g++ --version
g++ -fno-exceptions -fno-asynchronous-unwind-tables -fno-dwarf2-cfi-asm -masm=intel -std=c++11 -O3 -S -o test_c.s test.cpp -DTEST_C
g++ -fno-asynchronous-unwind-tables -fno-dwarf2-cfi-asm -masm=intel -std=c++11 -O3 -S -o test_cpp.s test.cpp -DTEST_CPP
g++ -masm=intel -s -o test_c test_c.s
g++ -masm=intel -s -o test_cpp test_cpp.s
ls -l test_c test_cpp

This shell script compiles the test. We use g++ for both variants and let a preprocessor define select whether the C or the C++ code is compiled. For the C variant, exception support only generates some data structures that the linker removes later; disabling it directly with "-fno-exceptions" makes the C assembly cleaner. We also strip the DWARF unwind and call-frame annotations with "-fno-asynchronous-unwind-tables -fno-dwarf2-cfi-asm". This has no influence on the binary sizes but removes some additional noise from the assembly code.

g++ (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

....  6280 Mai 23 21:02 test_c
.... 10608 Mai 23 21:02 test_cpp

The binary sizes already show signs of code bloat: the C++ binary is 10608 bytes against 6280 bytes for the C binary, roughly 1.7 times as large. Both solve the same simple task of writing out a string array. The only difference is that the C++ version uses class templates and <iostream>, while the C version uses primitive data types. To be fair, some of this code bloat can later be removed by the linker when several source files use the same class templates. Our real interest is what the function "StringArrayTest" looks like, because the code bloat we find there is there to stay. First, the C version:

    .file   "test.cpp"
    .intel_syntax noprefix
    .section    .rodata.str1.1,"aMS",@progbits,1
.LC0:
    .string "One"
.LC1:
    .string "Two"
.LC2:
    .string "Three"
.LC3:
    .string "Four"
.LC4:
    .string "Five"
    .text
    .p2align 4,,15
    .globl  _Z15StringArrayTestv
    .type   _Z15StringArrayTestv, @function
_Z15StringArrayTestv: 
    # Above we have the function
    push    rbp
    push    rbx
    sub rsp, 56
    lea rbp, [rsp+40]
    mov QWORD PTR [rsp], OFFSET FLAT:.LC0
    mov QWORD PTR [rsp+8], OFFSET FLAT:.LC1
    mov QWORD PTR [rsp+16], OFFSET FLAT:.LC2
    mov QWORD PTR [rsp+24], OFFSET FLAT:.LC3
    mov rbx, rsp
    mov QWORD PTR [rsp+32], OFFSET FLAT:.LC4
    # Above the stack frame and local variables
.L3:
    mov rdi, QWORD PTR [rbx]
    add rbx, 8
    call    puts
    cmp rbx, rbp
    jne .L3
    # Above is the print loop from L3
    add rsp, 56
    pop rbx
    pop rbp
    ret
    # Above the return. We are done!
    .size   _Z15StringArrayTestv, .-_Z15StringArrayTestv
    .section    .text.startup,"ax",@progbits
    .p2align 4,,15
    .globl  main
    .type   main, @function
main:
    sub rsp, 8
    call    _Z15StringArrayTestv
    xor eax, eax
    add rsp, 8
    ret
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4"
    .section    .note.GNU-stack,"",@progbits

Here we have the C++ version. It contains many call instructions to I/O functions and to various constructors and destructors, along with reference-counting and exception-handling code. The compiler is unable to optimize them away, creating significant code bloat compared to the C version. Every additional function that prints a different set of strings carries the same overhead. In contrast, the C version keeps its compact assembly output even when using printf and "char strings[][STRING_MAX]". The problem doesn't seem to lie with the C++ language itself: using std::array instead of std::vector/std::string, and <cstdio> instead of <iostream>, produces the same compact assembly output as the C version. The code bloat seems to come from <iostream> and from the STL with its heap-dependent class templates.

    .file   "test.cpp"
    .intel_syntax noprefix
    .section    .rodata.str1.1,"aMS",@progbits,1
.LC0:
    .string "One"
.LC1:
    .string "Two"
.LC2:
    .string "Three"
.LC3:
    .string "Four"
.LC4:
    .string "Five"
    .text
    .p2align 4,,15
    .globl  _Z15StringArrayTestv
    .type   _Z15StringArrayTestv, @function
_Z15StringArrayTestv:
.LFB1624:
    push    r15
.LCFI0:
    mov esi, OFFSET FLAT:.LC0
    push    r14
.LCFI1:
    push    r13
.LCFI2:
    push    r12
.LCFI3:
    push    rbp
.LCFI4:
    push    rbx
.LCFI5:
    sub rsp, 72
.LCFI6:
    lea rdx, [rsp+10]
    lea rdi, [rsp+16]
.LEHB0:
    call    _ZNSsC1EPKcRKSaIcE
    lea rdi, [rsp+24]
    lea rdx, [rsp+11]
    mov esi, OFFSET FLAT:.LC1
    call    _ZNSsC1EPKcRKSaIcE
    lea rdi, [rsp+32]
    lea rdx, [rsp+12]
    mov esi, OFFSET FLAT:.LC2
    call    _ZNSsC1EPKcRKSaIcE
    lea rdi, [rsp+40]
    lea rdx, [rsp+13]
    mov esi, OFFSET FLAT:.LC3
    call    _ZNSsC1EPKcRKSaIcE
    lea rdi, [rsp+48]
    lea rdx, [rsp+14]
    mov esi, OFFSET FLAT:.LC4
    call    _ZNSsC1EPKcRKSaIcE
.LEHE0:
    mov edi, 40
.LEHB1:
    call    _Znwm
.LEHE1:
    lea rbx, [rsp+16]
    mov r14, rax
    mov rbp, rax
    lea r12, [rbx+40]
    .p2align 4,,10
    .p2align 3
.L4:
    test    rbp, rbp
    je  .L5
    mov rsi, rbx
    mov rdi, rbp
.LEHB2:
    call    _ZNSsC1ERKSs
.LEHE2:
.L5:
    add rbx, 8
    add rbp, 8
    cmp rbx, r12
    jne .L4
    mov rax, QWORD PTR [rsp+48]
    mov r15d, OFFSET FLAT:_ZL28__gthrw___pthread_key_createPjPFvPvE
    test    r15, r15
    lea rdi, [rax-24]
    je  .L6
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L102
.L8:
    mov rax, QWORD PTR [rsp+40]
    lea rdi, [rax-24]
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L103
.L10:
    mov rax, QWORD PTR [rsp+32]
    lea rdi, [rax-24]
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L104
.L12:
    mov rax, QWORD PTR [rsp+24]
    lea rdi, [rax-24]
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L105
.L14:
    mov rax, QWORD PTR [rsp+16]
    lea rdi, [rax-24]
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L106
.L31:
    cmp r14, rbp
    mov r12, r14
    jne .L91
    jmp .L53
    .p2align 4,,10
    .p2align 3
.L109:
    movzx   eax, BYTE PTR [rbx+67]
.L46:
    movsx   esi, al
    mov rdi, r13
.LEHB3:
    call    _ZNSo3putEc
    mov rdi, rax
    call    _ZNSo5flushEv
    add r12, 8
    cmp rbp, r12
    je  .L107
.L91:
    mov rsi, QWORD PTR [r12]
    mov edi, OFFSET FLAT:_ZSt4cout
    mov rdx, QWORD PTR [rsi-24]
    call    _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l
    mov r13, rax
    mov rax, QWORD PTR [rax]
    mov rax, QWORD PTR [rax-24]
    mov rbx, QWORD PTR [r13+240+rax]
    test    rbx, rbx
    je  .L108
    cmp BYTE PTR [rbx+56], 0
    jne .L109
    mov rdi, rbx
    call    _ZNKSt5ctypeIcE13_M_widen_initEv
    mov rax, QWORD PTR [rbx]
    mov esi, 10
    mov rdi, rbx
    call    [QWORD PTR [rax+48]]
    jmp .L46
    .p2align 4,,10
    .p2align 3
.L107:
    test    r15, r15
    mov rbx, r14
    je  .L54
    .p2align 4,,10
    .p2align 3
.L58:
    mov rax, QWORD PTR [rbx]
    lea rdi, [rax-24]
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L110
.L56:
    add rbx, 8
    cmp rbp, rbx
    jne .L58
.L53:
    test    r14, r14
    je  .L1
    mov rdi, r14
    call    _ZdlPv
.L1:
    add rsp, 72
.LCFI7:
    pop rbx
.LCFI8:
    pop rbp
.LCFI9:
    pop r12
.LCFI10:
    pop r13
.LCFI11:
    pop r14
.LCFI12:
    pop r15
.LCFI13:
    ret
.L108:
.LCFI14:
    call    _ZSt16__throw_bad_castv
.LEHE3:
.L74:
    test    r15, r15
    mov r12, rax
    mov rbx, r14
    je  .L71
.L67:
    mov rax, QWORD PTR [rbx]
    lea rdi, [rax-24]
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L111
.L61:
    add rbx, 8
    cmp rbp, rbx
    jne .L67
.L70:
    test    r14, r14
    je  .L64
    mov rdi, r14
    call    _ZdlPv
.L64:
    mov rdi, r12
.LEHB4:
    call    _Unwind_Resume
.LEHE4:
    .p2align 4,,10
    .p2align 3
.L110:
    mov edx, -1
    lock xadd   DWORD PTR [rax-8], edx
    test    edx, edx
    jg  .L56
    lea rsi, [rsp+16]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L56
.L73:
    mov rdi, rax
    mov rbx, r14
    lea r12, [rsp+15]
    call    __cxa_begin_catch
    cmp r14, rbp
    je  .L37
.L90:
    mov rax, QWORD PTR [rbx]
    mov rsi, r12
    add rbx, 8
    lea rdi, [rax-24]
    call    _ZNSs4_Rep10_M_disposeERKSaIcE
    cmp rbp, rbx
    jne .L90
.L37:
.LEHB5:
    call    __cxa_rethrow
.LEHE5:
.L75:
    lea r12, [rsp+15]
    mov rbp, rax
.L34:
    lea rbx, [rsp+48]
    lea r13, [rsp+8]
.L39:
    mov rax, QWORD PTR [rbx]
    mov rsi, r12
    sub rbx, 8
    lea rdi, [rax-24]
    call    _ZNSs4_Rep10_M_disposeERKSaIcE
    cmp rbx, r13
    jne .L39
    mov rdi, rbp
.LEHB6:
    call    _Unwind_Resume
.LEHE6:
.L72:
    mov rbp, rax
    call    __cxa_end_catch
    test    r14, r14
    je  .L34
    mov rdi, r14
    call    _ZdlPv
    .p2align 4,,2
    jmp .L34
.L54:
    mov rax, QWORD PTR [rbx]
    lea rdi, [rax-24]
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L112
.L51:
    add rbx, 8
    cmp rbp, rbx
    jne .L54
    jmp .L53
.L106:
    mov edx, -1
    lock xadd   DWORD PTR [rax-8], edx
    test    edx, edx
    jg  .L31
.L100:
    lea rsi, [rsp+15]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L31
.L112:
    mov edx, DWORD PTR [rax-8]
    lea ecx, [rdx-1]
    test    edx, edx
    mov DWORD PTR [rax-8], ecx
    jg  .L51
    lea rsi, [rsp+16]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L51
.L104:
    mov edx, -1
    lock xadd   DWORD PTR [rax-8], edx
    test    edx, edx
    jg  .L12
    lea rsi, [rsp+15]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L12
.L105:
    mov edx, -1
    lock xadd   DWORD PTR [rax-8], edx
    test    edx, edx
    jg  .L14
    lea rsi, [rsp+15]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L14
.L6:
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L113
.L18:
    mov rax, QWORD PTR [rsp+40]
    lea rdi, [rax-24]
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L114
.L21:
    mov rax, QWORD PTR [rsp+32]
    lea rdi, [rax-24]
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L115
.L24:
    mov rax, QWORD PTR [rsp+24]
    lea rdi, [rax-24]
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L116
.L27:
    mov rax, QWORD PTR [rsp+16]
    lea rdi, [rax-24]
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    je  .L31
    mov edx, DWORD PTR [rax-8]
    lea ecx, [rdx-1]
    test    edx, edx
    mov DWORD PTR [rax-8], ecx
    jg  .L31
    jmp .L100
.L103:
    mov edx, -1
    lock xadd   DWORD PTR [rax-8], edx
    test    edx, edx
    jg  .L10
    lea rsi, [rsp+15]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L10
.L116:
    mov edx, DWORD PTR [rax-8]
    lea ecx, [rdx-1]
    test    edx, edx
    mov DWORD PTR [rax-8], ecx
    jg  .L27
    lea rsi, [rsp+15]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L27
.L115:
    mov edx, DWORD PTR [rax-8]
    lea ecx, [rdx-1]
    test    edx, edx
    mov DWORD PTR [rax-8], ecx
    jg  .L24
    lea rsi, [rsp+15]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L24
.L114:
    mov edx, DWORD PTR [rax-8]
    lea ecx, [rdx-1]
    test    edx, edx
    mov DWORD PTR [rax-8], ecx
    jg  .L21
    lea rsi, [rsp+15]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L21
.L113:
    mov edx, DWORD PTR [rax-8]
    lea ecx, [rdx-1]
    test    edx, edx
    mov DWORD PTR [rax-8], ecx
    jg  .L18
    lea rsi, [rsp+15]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L18
.L102:
    mov edx, -1
    lock xadd   DWORD PTR [rax-8], edx
    test    edx, edx
    jg  .L8
    lea rsi, [rsp+15]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L8
.L111:
    mov edx, -1
    lock xadd   DWORD PTR [rax-8], edx
    test    edx, edx
    jg  .L61
    lea rsi, [rsp+16]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L61
.L71:
    mov rax, QWORD PTR [rbx]
    lea rdi, [rax-24]
    cmp rdi, OFFSET FLAT:_ZNSs4_Rep20_S_empty_rep_storageE
    jne .L117
.L69:
    add rbx, 8
    cmp rbp, rbx
    jne .L71
    jmp .L70
.L117:
    mov edx, DWORD PTR [rax-8]
    lea ecx, [rdx-1]
    test    edx, edx
    mov DWORD PTR [rax-8], ecx
    jg  .L69
    lea rsi, [rsp+16]
    call    _ZNSs4_Rep10_M_destroyERKSaIcE
    jmp .L69
.LFE1624:
    .globl  __gxx_personality_v0
    .section    .gcc_except_table,"a",@progbits
    .align 4
.LLSDA1624:
    .byte   0xff
    .byte   0x3
    .uleb128 .LLSDATT1624-.LLSDATTD1624
.LLSDATTD1624:
    .byte   0x1
    .uleb128 .LLSDACSE1624-.LLSDACSB1624
.LLSDACSB1624:
    .uleb128 .LEHB0-.LFB1624
    .uleb128 .LEHE0-.LEHB0
    .uleb128 0
    .uleb128 0
    .uleb128 .LEHB1-.LFB1624
    .uleb128 .LEHE1-.LEHB1
    .uleb128 .L75-.LFB1624
    .uleb128 0
    .uleb128 .LEHB2-.LFB1624
    .uleb128 .LEHE2-.LEHB2
    .uleb128 .L73-.LFB1624
    .uleb128 0x1
    .uleb128 .LEHB3-.LFB1624
    .uleb128 .LEHE3-.LEHB3
    .uleb128 .L74-.LFB1624
    .uleb128 0
    .uleb128 .LEHB4-.LFB1624
    .uleb128 .LEHE4-.LEHB4
    .uleb128 0
    .uleb128 0
    .uleb128 .LEHB5-.LFB1624
    .uleb128 .LEHE5-.LEHB5
    .uleb128 .L72-.LFB1624
    .uleb128 0
    .uleb128 .LEHB6-.LFB1624
    .uleb128 .LEHE6-.LEHB6
    .uleb128 0
    .uleb128 0
.LLSDACSE1624:
    .byte   0x1
    .byte   0
    .align 4
    .long   0

.LLSDATT1624:
    .text
    .size   _Z15StringArrayTestv, .-_Z15StringArrayTestv
    .section    .text.startup,"ax",@progbits
    .p2align 4,,15
    .globl  main
    .type   main, @function
main:
.LFB1626:
    sub rsp, 8
.LCFI15:
    call    _Z15StringArrayTestv
    xor eax, eax
    add rsp, 8
.LCFI16:
    ret
.LFE1626:
    .size   main, .-main
    .p2align 4,,15
    .type   _GLOBAL__sub_I__Z15StringArrayTestv, @function
_GLOBAL__sub_I__Z15StringArrayTestv:
.LFB1853:
    sub rsp, 8
.LCFI17:
    mov edi, OFFSET FLAT:_ZStL8__ioinit
    call    _ZNSt8ios_base4InitC1Ev
    mov edx, OFFSET FLAT:__dso_handle
    mov esi, OFFSET FLAT:_ZStL8__ioinit
    mov edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
    add rsp, 8
.LCFI18:
    jmp __cxa_atexit
.LFE1853:
    .size   _GLOBAL__sub_I__Z15StringArrayTestv, .-_GLOBAL__sub_I__Z15StringArrayTestv
    .section    .init_array,"aw"
    .align 8
    .quad   _GLOBAL__sub_I__Z15StringArrayTestv
    .local  _ZStL8__ioinit
    .comm   _ZStL8__ioinit,1,1
    .weakref    _ZL28__gthrw___pthread_key_createPjPFvPvE,__pthread_key_create
    .section    .eh_frame,"a",@progbits
.Lframe1:
    .long   .LECIE1-.LSCIE1
.LSCIE1:
    .long   0
    .byte   0x3
    .string "zPLR"
    .uleb128 0x1
    .sleb128 -8
    .uleb128 0x10
    .uleb128 0x7
    .byte   0x3
    .long   __gxx_personality_v0
    .byte   0x3
    .byte   0x3
    .byte   0xc
    .uleb128 0x7
    .uleb128 0x8
    .byte   0x90
    .uleb128 0x1
    .align 8
.LECIE1:
.LSFDE1:
    .long   .LEFDE1-.LASFDE1
.LASFDE1:
    .long   .LASFDE1-.Lframe1
    .long   .LFB1624
    .long   .LFE1624-.LFB1624
    .uleb128 0x4
    .long   .LLSDA1624
    .byte   0x4
    .long   .LCFI0-.LFB1624
    .byte   0xe
    .uleb128 0x10
    .byte   0x8f
    .uleb128 0x2
    .byte   0x4
    .long   .LCFI1-.LCFI0
    .byte   0xe
    .uleb128 0x18
    .byte   0x8e
    .uleb128 0x3
    .byte   0x4
    .long   .LCFI2-.LCFI1
    .byte   0xe
    .uleb128 0x20
    .byte   0x8d
    .uleb128 0x4
    .byte   0x4
    .long   .LCFI3-.LCFI2
    .byte   0xe
    .uleb128 0x28
    .byte   0x8c
    .uleb128 0x5
    .byte   0x4
    .long   .LCFI4-.LCFI3
    .byte   0xe
    .uleb128 0x30
    .byte   0x86
    .uleb128 0x6
    .byte   0x4
    .long   .LCFI5-.LCFI4
    .byte   0xe
    .uleb128 0x38
    .byte   0x83
    .uleb128 0x7
    .byte   0x4
    .long   .LCFI6-.LCFI5
    .byte   0xe
    .uleb128 0x80
    .byte   0x4
    .long   .LCFI7-.LCFI6
    .byte   0xa
    .byte   0xe
    .uleb128 0x38
    .byte   0x4
    .long   .LCFI8-.LCFI7
    .byte   0xe
    .uleb128 0x30
    .byte   0x4
    .long   .LCFI9-.LCFI8
    .byte   0xe
    .uleb128 0x28
    .byte   0x4
    .long   .LCFI10-.LCFI9
    .byte   0xe
    .uleb128 0x20
    .byte   0x4
    .long   .LCFI11-.LCFI10
    .byte   0xe
    .uleb128 0x18
    .byte   0x4
    .long   .LCFI12-.LCFI11
    .byte   0xe
    .uleb128 0x10
    .byte   0x4
    .long   .LCFI13-.LCFI12
    .byte   0xe
    .uleb128 0x8
    .byte   0x4
    .long   .LCFI14-.LCFI13
    .byte   0xb
    .align 8
.LEFDE1:
.LSFDE3:
    .long   .LEFDE3-.LASFDE3
.LASFDE3:
    .long   .LASFDE3-.Lframe1
    .long   .LFB1626
    .long   .LFE1626-.LFB1626
    .uleb128 0x4
    .long   0
    .byte   0x4
    .long   .LCFI15-.LFB1626
    .byte   0xe
    .uleb128 0x10
    .byte   0x4
    .long   .LCFI16-.LCFI15
    .byte   0xe
    .uleb128 0x8
    .align 8
.LEFDE3:
    .hidden __dso_handle
    .ident  "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4"
    .section    .note.GNU-stack,"",@progbits
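One way to read the C++ listing above: the five calls to _ZNSsC1EPKcRKSaIcE build temporary std::string objects from the literals, _Znwm allocates the vector's storage, and the loop at .L4 calls the copy constructor _ZNSsC1ERKSs once per element, because the elements of a std::initializer_list are const and can only be copied. The sketch below makes that visible by counting. Tracked and TrackedArrayTest are hypothetical names; Tracked stands in for std::string and is not part of the original test.

```cpp
// Sketch: count how a brace-initialized std::vector builds its elements.
// Tracked is a hypothetical stand-in for std::string.
#include <cstdio>
#include <vector>

struct Tracked
{
	static int constructed; // built from a string literal
	static int copied;      // copy-constructed into the vector's heap buffer
	static int destroyed;   // temporaries plus vector elements

	const char *text;

	Tracked(const char *t) : text(t) { ++constructed; }
	Tracked(const Tracked &other) : text(other.text) { ++copied; }
	~Tracked() { ++destroyed; }
};

int Tracked::constructed = 0;
int Tracked::copied = 0;
int Tracked::destroyed = 0;

void TrackedArrayTest()
{
	// Same shape as the std::vector<std::string> in test.cpp: the braces
	// create a temporary array, which the vector then copies from.
	const std::vector<Tracked> items = {"One", "Two", "Three", "Four", "Five"};
	for (const Tracked &t : items)
		std::puts(t.text);
}
```

After one call to TrackedArrayTest(), mainstream compilers at default settings are expected to leave constructed at 5, copied at 5 and destroyed at 10: each element is built once as a temporary, copied once into the heap buffer, and both objects are destroyed.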
  1. "Stroustrup C++ spoof 'interview'". Stroustrup: Well, almost. The executable was so huge, it took five minutes to load, on an HP workstation, with 128MB of RAM. Then it ran like treacle. Actually, I thought this would be a major stumbling-block, and I'd get found out within a week, but nobody cared. Sun and HP were only too glad to sell enormously powerful boxes, with huge resources just to run trivial programs. You know, when we had our first C++ compiler, at AT&T, I compiled 'Hello World', and couldn't believe the size of the executable. 2.1MB Interviewer: What? Well, compilers have come a long way, since then. Stroustrup: They have? Try it on the latest version of g++ - you won't get much change out of half a megabyte. Also, there are several quite recent examples for you, from all over the world. British Telecom had a major disaster on their hands but, luckily, managed to scrap the whole thing and start again. They were luckier than Australian Telecom. Now I hear that Siemens is building a dinosaur, and getting more and more worried as the size of the hardware gets bigger, to accommodate the executables. Isn't multiple inheritance a joy?
  2. "Why is the code generated for the "Hello world" program ten times larger for C++ than for C?". It isn't on my machine, and it shouldn't be on yours. I have even seen the C++ version of the "hello world" program smaller than the C version. In 2004, I tested using gcc -O2 on a Unix and the two versions (iostreams and stdio) yielded identical sizes. There is no language reason why the one version should be larger than the other. It is all an issue on how an implementor organizes the standard libraries (e.g. static linking vs. dynamic linking, locale support by default vs. locale support enabled through and option, etc.). If one version is significantly larger than the other, report the problem to the implementor of the larger.