Does [ebp*2] reference DS or SS segment?

Question

The IDM (Intel's developer manual) says a memory operand uses the SS segment if EBP is used as the base register. As a result, [ebp + esi] and [esi + ebp] reference the SS and DS segments, respectively. See NASM's doc: 3.3 Effective Address.
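
To make the base/index distinction concrete, here is a small NASM sketch of the two operand orders. It assumes NASM takes the first register written as the base. The byte encodings in the comments were worked out by hand from the ModRM/SIB rules, not copied from assembler output, so confirm them with ndisasm before relying on them.

```nasm
bits 32

; EBP is the base, ESI the index -> default segment SS.
; A base of EBP cannot be encoded with mod=00, so a disp8 of 0 is needed.
mov eax, [ebp + esi]    ; expected: 8B 44 35 00

; ESI is the base, EBP the index -> default segment DS.
mov eax, [esi + ebp]    ; expected: 8B 04 2E
```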

In the same section, NASM's documentation describes how to generate shorter machine code by replacing [eax*2] with [eax+eax].
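
A minimal sketch of that size difference, assuming 32-bit mode. The `nosplit` keyword is the one NASM's documentation gives for keeping the index-only form; the encodings in the comments are hand-derived, so treat them as expectations to check, not as verified output.

```nasm
bits 32

; Index-only operand, which NASM's documentation says is rewritten as [eax+eax].
mov eax, [eax*2]            ; expected: 8B 04 00 (3 bytes)

; Literal index-only form: with no base register, a disp32 of 0 is required.
mov eax, [nosplit eax*2]    ; expected: 8B 04 45 00 00 00 00 (7 bytes)
```

With EAX both forms default to DS, so the rewrite changes only the size, not the segment.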

However, NASM also generates [ebp+ebp] for [ebp*2], an operand that has no base register as written.

I suspect [ebp+ebp] references SS segment, and [ebp*2] references DS segment.

I asked NASM this question. They think [ebp*2] and [ebp+ebp] are the same, but that doesn't make sense to me. Obviously, [ebp+ebp] (ebp as base register) references the SS segment. If they're the same, [ebp*2] must reference SS too. This would mean SS is referenced whenever ebp is the base or the index register, which in turn means both [ebp + esi] and [esi + ebp] reference the SS segment, so they must be the same.

Does anyone know which segment [ebp*2] uses?

This NASM optimization ([ebp*2] -> [ebp+ebp]) assumes a flat memory model where ss and ds are equivalent, which is the case under all the major mainstream x86 OSes. It's an interesting corner case because a pure [idx*2] addressing mode without a register or 32-bit absolute base is also very unusual (except for LEA to copy-and-shift). Normally people use real pointers instead of faking word-addressable memory by scaling them by 2, or whatever you're doing. — Peter Cordes
"I asked NASM this question." Do you mean you asked the NASM developers? Or that you assembled code with NASM and/or disassembled with ndisasm to see what the program itself "thought"? Because the info you got was wrong: [esi + ebp] uses ds. And if you're assuming that ss and ds are interchangeable, you'd optimize [ebp + esi] to [esi + ebp] to avoid needing a disp8 = 0. (EBP as a base register is only encodable with a disp8 or disp32; the encoding that would mean EBP + no displacement actually means there's a disp32 with no base register, but potentially an index.) — Peter Cordes
@PeterCordes : He originally asked on the old (defunct) NASM forum that was on Sourceforge sourceforge.net/p/nasm/discussion/167169/thread/18e79c06 . He had a problem getting email activated on nasm.us — Michael Petch
@PeterCordes thanks for your comments, very good point about flat memory model x86 os uses. i was focusing on the correctness of the assembler. i'm writing a simple assembler, so this assumption doesn't apply to me, but very good point. — wildpie
Indeed, any assumption of a flat memory model should be optional. This just explains why it was overlooked, since NASM does it even for [symbol + ebp*2]. bin is a flat binary, with no implications about what you might do with the resulting machine code. e.g. use it as a .COM executable, a boot sector, or embed it into something else. (The default mode for bin is bits 16, i.e. 16-bit mode.) — Peter Cordes

Sep Roland · Accepted Answer · 2018-04-08T21:36:33

The Intel manual tells us below figure 3-11, which deals with Offset = Base + (Index * Scale) + Displacement:

The uses of general-purpose registers as base or index components are restricted in the following manner:

The ESP register cannot be used as an index register.

When the ESP or EBP register is used as the base, the SS segment is the default segment. In all other cases, the DS segment is the default segment.

This means that NASM is wrong when it changes [ebp*2] into [ebp+ebp] (in order to avoid the 32-bit displacement).

[ebp*2] uses DS because ebp is not used as base
[ebp+ebp] uses SS because one of the ebp is used as base
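
The two cases can be sketched side by side in NASM. This assumes 32-bit mode and uses NASM's `nosplit` keyword to keep the index-only form; the encodings in the comments are derived by hand from the ModRM/SIB rules and should be checked with ndisasm.

```nasm
bits 32

; No base, EBP scaled as index: mod=00 with SIB base field 101
; means "disp32, no base". Default segment: DS.
mov eax, [nosplit ebp*2]    ; expected: 8B 04 6D 00 00 00 00

; EBP as base and as index: mod=01 with a disp8 of 0.
; Default segment: SS.
mov eax, [ebp + ebp]        ; expected: 8B 44 2D 00
```

The second form is three bytes shorter, which is why the rewrite is attractive, but the two forms are equivalent only when DS and SS address the same memory.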

It would then be best to specify that you don't want this behaviour from NASM.
Until the time NASM authors realize their mistake, you can disable this behaviour (where EBP is used as an index) by writing:

[NoSplit ebp*2]
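
In a full instruction this might look like the sketch below. The explicit segment overrides are an alternative added here for illustration, not something the answer itself proposes: they pin the segment no matter which base/index form the assembler emits, at the cost of one prefix byte.

```nasm
bits 32

; Keep EBP as a pure index so the default segment stays DS.
mov eax, [nosplit ebp*2]

; Or state the segment explicitly, so the result does not depend on
; how the assembler chooses base and index.
mov eax, [ds:ebp*2]     ; DS even if this is emitted as [ebp+ebp]
mov eax, [ss:ebp*2]     ; SS whichever form is emitted
```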
