[Nasm-bugs] [Bug 3392783] New: obj output format may discard offsets in segment-relative constants
noreply-nasm at dev.nasm.us
Fri Sep 17 08:01:51 PDT 2021
https://bugzilla.nasm.us/show_bug.cgi?id=3392783
Bug ID: 3392783
Summary: obj output format may discard offsets in segment-relative constants
Product: NASM
Version: 2.16 (development)
Hardware: All
OS: All
Status: OPEN
Severity: normal
Priority: Medium
Component: Assembler
Assignee: nobody at nasm.us
Reporter: david at bamsoftware.com
CC: chang.seok.bae at intel.com, gorcunov at gmail.com,
hpa at zytor.com, nasm-bugs at nasm.us
Obtained from: Built from git using configure, From OS distribution
Created attachment 411829
--> https://bugzilla.nasm.us/attachment.cgi?id=411829&action=edit
Input file that shows constant offsets being discarded in the obj output format
In the obj output format, in certain contexts, a segment-relative constant
(`code+0xaaaa` in the example to follow) is emitted as a relocation pointing at
the constant 0x0000 (not 0xaaaa). In other contexts, such as the target of a
jmp instruction, the constant is emitted as a relocation entry pointing to the
constant 0xaaaa (which is what I expect).
The attached file t.asm shows what I mean. The comments show what each
instruction assembles to, with the ones that are unexpected to me marked with
"??".
    bits 16
    section code
    jmp code+0xaaaa         ; -> e9aaaa with reloc
    mov ax, 0xaaaa          ; -> b8aaaa
    mov ax, code+0xaaaa     ; -> b80000 with reloc ??
    dw 0xaaaa               ; -> aaaa
    dw code+0xaaaa          ; -> 0000 with reloc ??
I produced an executable with:
$ nasm -f obj -o t.obj t.asm
$ djlink -o t.exe t.obj
(http://www.delorie.com/djgpp/16bit/djlink/)
The rabin2 program from radare2 (https://book.rada.re/tools/rabin2/intro.html)
shows that the relocation table contains 3 entries, one for each `code+...`,
which is what I expect.
$ rabin2 -R t.exe
[Relocations]
vaddr paddr type name
―――――――――――――――――――――――――――――――――
0x00000001 0x00000201 SET_16
0x00000007 0x00000207 SET_16
0x0000000b 0x0000020b SET_16
3 relocations
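A minimal sketch of why the stored word matters, assuming the usual MZ load-time fixup (the loader adds the load segment to the 16-bit word at each relocation entry) and assuming the rabin2 vaddr values can be read as offsets into the load image. The image bytes are taken from the comments in t.asm; the load segment 0x1000 and the helper name are hypothetical.

```python
def apply_mz_relocations(image: bytes, reloc_offsets, load_segment: int) -> bytes:
    """Add load_segment to the little-endian 16-bit word at each offset."""
    out = bytearray(image)
    for off in reloc_offsets:
        word = int.from_bytes(out[off:off + 2], "little")
        word = (word + load_segment) & 0xFFFF  # 16-bit wraparound
        out[off:off + 2] = word.to_bytes(2, "little")
    return bytes(out)

# Bytes as listed in the t.asm comments: jmp, mov, mov, dw, dw.
IMAGE = bytes.fromhex("e9aaaa" "b8aaaa" "b80000" "aaaa" "0000")
RELOCS = [0x1, 0x7, 0xB]   # vaddr column of the rabin2 output
LOAD_SEGMENT = 0x1000      # hypothetical load segment

patched = apply_mz_relocations(IMAGE, RELOCS, LOAD_SEGMENT)
words = [int.from_bytes(patched[o:o + 2], "little") for o in RELOCS]
print([hex(w) for w in words])  # ['0xbaaa', '0x1000', '0x1000']
```

Under these assumptions only the jmp operand ends up as 0xaaaa plus the load segment; the two entries marked "??" end up as the bare load segment, because the word being relocated was emitted as 0x0000.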
If I use `$$+0xaaaa` instead of `code+0xaaaa`, the constant values in the
output file are 0xaaaa as I expect, but no relocations are emitted.
I have tested version 2.14-1 from Debian buster, and commit e2ed7b7e from the
Git repository, with the same behavior.
The following patch results in the output I expect with t.asm, though I imagine
this change is not generally correct, and that a better place to make a change
would be higher in the call stack.
diff --git a/output/legacy.c b/output/legacy.c
index d2785387..9d28faf7 100644
--- a/output/legacy.c
+++ b/output/legacy.c
@@ -90,5 +90,5 @@ void nasm_do_legacy_output(const struct out_data *data)
     case OUT_SEGMENT:
         type = OUT_ADDRESS;
-        dptr = zero_buffer;
+        dptr = &data->toffset;
         size = (data->flags & OUT_SIGNED) ? -data->size : data->size;
         tsegment |= 1;
For comparison, if I write a similar program and use the elf output format, I
get 0xaaaaaaaa in the program text and relocations where needed.
    bits 32
    section .text
    global _start
_start:
    jmp 0xaaaaaaaa
    jmp _start+0xaaaaaaaa
    mov eax, 0xaaaaaaaa
    mov eax, _start+0xaaaaaaaa
    dd 0xaaaaaaaa
    dd _start+0xaaaaaaaa
Made into an object file like this:
$ nasm -f elf -o u.o u.asm
Disassembly with relocations marked:
$ objdump -M intel -dr u.o
...
00000000 <_start>:
0: e9 a6 aa aa aa jmp aaaaaaab <_start+0xaaaaaaab>
1: R_386_PC32 *ABS*
5: e9 a0 aa aa aa jmp aaaaaaaa <_start+0xaaaaaaaa>
a: b8 aa aa aa aa mov eax,0xaaaaaaaa
f: b8 aa aa aa aa mov eax,0xaaaaaaaa
10: R_386_32 .text
14: aa stos BYTE PTR es:[edi],al
15: aa stos BYTE PTR es:[edi],al
16: aa stos BYTE PTR es:[edi],al
17: aa stos BYTE PTR es:[edi],al
18: aa stos BYTE PTR es:[edi],al
18: R_386_32 .text
19: aa stos BYTE PTR es:[edi],al
1a: aa stos BYTE PTR es:[edi],al
1b: aa stos BYTE PTR es:[edi],al
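The disassembly above can be checked against the i386 ELF relocation formulas, where the addend A is the value already stored in the field: R_386_32 computes S + A and R_386_PC32 computes S + A - P. The following is a sketch only; TEXT_BASE is a hypothetical final address for .text, and the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF

def r_386_32(s: int, a: int) -> int:
    """R_386_32: symbol value S plus the addend A stored in place."""
    return (s + a) & MASK32

def r_386_pc32(s: int, a: int, p: int) -> int:
    """R_386_PC32: S + A - P, where P is the address of the patched field."""
    return (s + a - p) & MASK32

TEXT_BASE = 0x08048000  # hypothetical final address of .text

# jmp 0xaaaaaaaa at offset 0: the field at offset 1 holds a6 aa aa aa,
# and the relocation is against *ABS*, so S = 0.
rel = r_386_pc32(0, 0xAAAAAAA6, TEXT_BASE + 1)
jmp_target = (TEXT_BASE + 5 + rel) & MASK32  # next instruction + rel32

# mov eax, _start+0xaaaaaaaa: the field at offset 0x10 holds aa aa aa aa,
# and the relocation is against .text, so S = TEXT_BASE.
mov_imm = r_386_32(TEXT_BASE, 0xAAAAAAAA)

print(hex(jmp_target))  # 0xaaaaaaaa
print(hex(mov_imm))     # 0xb2af2aaa
```

In both cases the constant survives in the section contents as the addend, which is the behavior the obj output does not show for the mov and dw cases.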
I came across this issue while writing an EXE-producing program and checking
its support for writing relocations in the header. I have a sample program with
a dw array of constant values, which are also relocation targets. The program
prints the contents of the array, as a quick visual check that the relocations
have been effected. I have been doing this with a custom EXE header writer that
includes a constant relocation table, but having the assembler output the
relocation offsets would be less fragile. I am not sure what I'm doing is the
best or the correct way to do what I want, but in any case the output with obj
is surprising and apparently inconsistent with other output formats.