LLVM Discussion Forums

Wrong LLVM code after applying transform passes

Hi all,

I’m trying to access the members of a 32-byte host-side struct from LLVM-generated code. I inject the address of the struct into the LLVM IR as a constant pointer, since I know it never changes at runtime. I’m currently using the code below, which works correctly: the members are accessed at the correct offsets.

%struct.tc_t = type { i32, i32, i32, %struct.tc_t* (%struct.cpu_ctx_t*)*, [3 x %struct.tc_t* (%struct.cpu_ctx_t*)*], i32 }

%7 = inttoptr i32 9789200 to %struct.tc_t*
%8 = getelementptr inbounds %struct.tc_t, %struct.tc_t* %7, i32 0, i32 0
%9 = load volatile i32, i32* %8
%10 = getelementptr inbounds %struct.tc_t, %struct.tc_t* %7, i32 0, i32 1
%11 = load volatile i32, i32* %10
%12 = getelementptr inbounds %struct.tc_t, %struct.tc_t* %7, i32 0, i32 2
%13 = load volatile i32, i32* %12
%14 = getelementptr inbounds %struct.tc_t, %struct.tc_t* %7, i32 0, i32 3
%15 = load volatile %struct.tc_t* (%struct.cpu_ctx_t*)*, %struct.tc_t* (%struct.cpu_ctx_t*)** %14
%16 = getelementptr inbounds %struct.tc_t, %struct.tc_t* %7, i32 0, i32 4, i32 0
%17 = load volatile %struct.tc_t* (%struct.cpu_ctx_t*)*, %struct.tc_t* (%struct.cpu_ctx_t*)** %16
%18 = getelementptr inbounds %struct.tc_t, %struct.tc_t* %7, i32 0, i32 4, i32 1
%19 = load volatile %struct.tc_t* (%struct.cpu_ctx_t*)*, %struct.tc_t* (%struct.cpu_ctx_t*)** %18
%20 = getelementptr inbounds %struct.tc_t, %struct.tc_t* %7, i32 0, i32 4, i32 2
%21 = load volatile %struct.tc_t* (%struct.cpu_ctx_t*)*, %struct.tc_t* (%struct.cpu_ctx_t*)** %20
%22 = getelementptr inbounds %struct.tc_t, %struct.tc_t* %7, i32 0, i32 5
%23 = load volatile i32, i32* %22

However, after I run the transform passes below, the generated code becomes wrong.

legacy::FunctionPassManager pm(cpu->mod);
 
pm.add(createPromoteMemoryToRegisterPass());
pm.add(createInstructionCombiningPass());
pm.add(createConstantPropagationPass());
pm.add(createDeadStoreEliminationPass());
pm.add(createDeadCodeEliminationPass());
pm.run(*cpu->bb->getParent());

The generated code now seems to assume that the function pointer members are 8 bytes large (as opposed to 4 bytes, their real size), which causes getelementptr to compute the wrong addresses. Is there a way to solve this other than skipping the passes (which I don’t want to do)?

%4 = load volatile i32, i32* inttoptr (i64 9789200 to i32*), align 16
%5 = load volatile i32, i32* inttoptr (i64 9789204 to i32*), align 4
%6 = load volatile i32, i32* inttoptr (i64 9789208 to i32*), align 8
%7 = load volatile %struct.tc_t* (%struct.cpu_ctx_t*)*, %struct.tc_t* (%struct.cpu_ctx_t*)** inttoptr (i64 9789216 to %struct.tc_t* (%struct.cpu_ctx_t*)**), align 32
%8 = load volatile %struct.tc_t* (%struct.cpu_ctx_t*)*, %struct.tc_t* (%struct.cpu_ctx_t*)** inttoptr (i64 9789224 to %struct.tc_t* (%struct.cpu_ctx_t*)**), align 8
%9 = load volatile %struct.tc_t* (%struct.cpu_ctx_t*)*, %struct.tc_t* (%struct.cpu_ctx_t*)** inttoptr (i64 9789232 to %struct.tc_t* (%struct.cpu_ctx_t*)**), align 16
%10 = load volatile %struct.tc_t* (%struct.cpu_ctx_t*)*, %struct.tc_t* (%struct.cpu_ctx_t*)** inttoptr (i64 9789240 to %struct.tc_t* (%struct.cpu_ctx_t*)**), align 8
%11 = load volatile i32, i32* inttoptr (i64 9789248 to i32*), align 64

Additional information: the code above was generated with LLVM 8.0.1 and the data layout of the machine is “e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32”.
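To make the size mismatch concrete, here is a minimal sketch of the offset arithmetic for %struct.tc_t under 4-byte and 8-byte pointers. It assumes an ordinary non-packed layout in which a pointer is aligned to its own size; the helper names (Field, fieldOffsets, structSize, tcFields) are made up for this illustration and are not LLVM APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One struct member: its size and ABI alignment in bytes.
struct Field {
    std::size_t size;
    std::size_t align;
};

// Round `offset` up to the next multiple of `align`.
static std::size_t alignTo(std::size_t offset, std::size_t align) {
    assert(align != 0);
    return (offset + align - 1) / align * align;
}

// Byte offset of each member of a non-packed struct.
static std::vector<std::size_t> fieldOffsets(const std::vector<Field>& fields) {
    std::vector<std::size_t> offsets;
    std::size_t offset = 0;
    for (const Field& f : fields) {
        offset = alignTo(offset, f.align);
        offsets.push_back(offset);
        offset += f.size;
    }
    return offsets;
}

// Total size of the struct, padded to its largest member alignment.
static std::size_t structSize(const std::vector<Field>& fields) {
    std::size_t offset = 0;
    std::size_t maxAlign = 1;
    for (const Field& f : fields) {
        offset = alignTo(offset, f.align) + f.size;
        maxAlign = std::max(maxAlign, f.align);
    }
    return alignTo(offset, maxAlign);
}

// Members of %struct.tc_t for a given pointer size.
// Assumption: pointers are aligned to their own size.
static std::vector<Field> tcFields(std::size_t ptrSize) {
    return {
        {4, 4}, {4, 4}, {4, 4},   // three i32 members
        {ptrSize, ptrSize},       // function pointer
        {3 * ptrSize, ptrSize},   // [3 x function pointer]
        {4, 4},                   // trailing i32
    };
}
```

With 4-byte pointers this gives member offsets 0, 4, 8, 12, 16, 28 and a total size of 32 bytes, which agrees with the struct size stated in the question. With 8-byte pointers it gives 0, 4, 8, 16, 24, 48; added to the base address 9789200, these are exactly the addresses 9789216, 9789224 and 9789248 in the optimized IR. That is what one would expect if the passes worked with 64-bit pointers instead of the p:32:32 layout quoted above, and the i64 operands of the folded inttoptr expressions point the same way.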

Can you provide a reproducer with an IR file and an opt invocation that shows the issue?

The program I’m using to trigger the problem is test386.asm, which is available at https://github.com/barotto/test386.asm. I generated the LLVM IR by adding the line cpu->mod->dump(); to my program, where mod is the module. I then ran opt 5 times, adding one more pass each time from left to right, ending with the full command: opt -mem2reg -instcombine -constprop -dse -dce -o out.txt -S -verify-each mod_dump.txt. With mem2reg alone, the output was identical to the input (out1.txt); every later invocation produced the same output, out2.txt.

mod_dump.txt -> https://pastebin.com/tLgzm5tN

out1.txt -> https://pastebin.com/PQnZ5VKJ
out2.txt -> https://pastebin.com/Z3aKfqvq