Why does SEG not give an error message with this code fragment?

Question

The line below has no error generated by the assembler:

mov ax,seg TEXT:frewd

(see program fragment below)

I would expect the assembler to produce an error message because frewd is not in the TEXT segment but in the TEXT1 segment, and no GROUP statement exists.

Am I missing something?

I have filled both segments with dummy data so that there are 2 different segments but still no error.

 .386
 TEXT segment para private
   example dw ?
   dummy byte 65531 dup(0)
   sample dw ?
 TEXT ends
 TEXT1 segment word private 'CODE'
   frew dw ?
   dummy1 byte 65531 dup(0)
   frewd dw ?
 TEXT1 ends
 Cseg segment
   mov ax, seg TEXT:frewd ;no error is generated here by the assembler
   mov es,ax
 Cseg ends
 end

It probably put TEXT and TEXT1 in the same segment? Or the assembler is not smart enough to do such checks. — Alexis Wilke
How can it put both segments in the same segment? I thought the max size per segment was 65535? Am I wrong about that? — StillLearning
Oh, it would be two different blocks both using the same segment register with two different values. I just think that the assembler is not smart enough to detect the problem. — Alexis Wilke
The problem is that you will not notice the wrong segment is being used, and it will be hard to detect. I tried it using .8086 instead of .386 and still got no error message. There should be an error message for this. — StillLearning
I'm so glad that x86-16 is obsolete so we don't have to use quirky old toolchains or segmentation anymore. — Peter Cordes

Ross Ridge · Accepted Answer · 2019-08-11T18:47:02

There is no reason for MASM to give an error for this line. The SEG operator obtains the segment address (paragraph address in real mode) of the frame of the expression it's used on. A frame is the segment or group relative to which the linker determines the final offset of an address. By default the frame of an address is the same as the segment of the address, unless that segment belongs to a group. Note that this means addresses in MASM have three parts: a frame, a segment, and an offset.

Since TEXT:frewd is an address, it has a frame, so the SEG operator evaluates to the frame of that expression. As it turns out, the frame of the expression TEXT:frewd isn't what it would appear to be, but even if it were actually TEXT, there would be no reason why MASM couldn't evaluate the expression without error. The problem of frewd not being within 64K of the base of TEXT isn't something that would be known until link time.

MASM only allows the frame of an address to be different from the segment of an address if the frame is a group. This means that when a segment name is used as a segment override on a label, it doesn't actually change the frame of the address to be the named segment. Instead, the frame gets changed to the label's segment if the frame used to be a group; otherwise the frame remains the same as the label's segment. However, the segment override does change which segment register MASM will use for the instruction, based on previous ASSUME statements, if the label is used in a memory operand.
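Applied to the line from the question, the rule above suggests the following reading. This is a minimal sketch, assuming the rule holds as described; the segments are cut down from the question's fragment (the large dummy arrays and the .386 directive are left out), and the values noted in the comments are what the rule predicts, not assembler output I have captured.

```asm
TEXT    segment para private
example dw ?
TEXT    ends

TEXT1   segment word private 'CODE'
frewd   dw ?
TEXT1   ends

Cseg    segment
        ; TEXT is a segment name, not a group, so the override does not
        ; change the frame: it stays TEXT1, the segment frewd is defined in.
        mov ax, seg TEXT:frewd  ; predicted frame: TEXT1, hence no error
        mov ax, seg frewd       ; predicted frame: TEXT1 as well
        mov ax, TEXT1           ; names the segment directly
        mov es, ax
Cseg    ends
        end
```

If that reading is right, all three loads put the same paragraph address in AX, and the TEXT override in the first one is effectively ignored by SEG.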

I've created an example to try to demonstrate how MASM behaves. Each MOV instruction that loads a value from memory is commented with a description of what segment register ("sreg") the assembler uses for the instruction, and the frame and segment that are used for the address of the memory operand. The frame and segment used appear in the relocations (or fixups) that the assembler puts in the object file it outputs.

DATA1   SEGMENT PARA
data1_label DW  0
DATA1   ENDS

DATA2   SEGMENT PARA
    DW  2
data2_label DW  4
DATA2   ENDS

DGROUP  GROUP   DATA1, DATA2

FARDATA SEGMENT PARA
    DW  6, 8
fardata_label DW 10
FARDATA ENDS

CODESEG SEGMENT PARA PUBLIC 'CODE'
start:
    mov ax, DGROUP
    mov ds, ax
    mov ax, FARDATA
    mov es, ax
    mov ax, DATA2
    mov ss, ax
    ASSUME  ds:DGROUP
    ASSUME  es:FARDATA
    ASSUME  ss:DATA2
    ASSUME  cs:CODESEG

    ; Since data1_label and data2_label belong to segments that belong to DGROUP,
    ; they're accessed relative to DGROUP by default and through the segment 
    ; register assumed to point to DGROUP. The correct code is generated
    ; without any overrides.

    mov ax, [data1_label]           ; sreg: DS, frame: DGROUP,  segment: DATA1
    mov ax, [data2_label]           ; sreg: DS, frame: DGROUP,  segment: DATA2

    ; No surprises here, fardata_label is accessed relative to the segment it's
    ; defined in and using the segment register assumed for that segment.

    mov ax, [fardata_label]         ; sreg: ES, frame: FARDATA, segment: FARDATA

    ; Changing the last three instructions to use DATA2 as a segment override causes
    ; them all to use the SS segment register, the one assumed for DATA2.  It also
    ; overrides using DGROUP as the frame for data1_label and data2_label, but
    ; doesn't change the frame to DATA2 for either data1_label or fardata_label.
    ; Only the second instruction will work correctly.  The other two instructions
    ; use the wrong segment register to access the label at the offset the linker
    ; will end up using.

    mov ax, [DATA2:data1_label]     ; sreg: SS, frame: DATA1,   segment: DATA1
    mov ax, [DATA2:data2_label]     ; sreg: SS, frame: DATA2,   segment: DATA2
    mov ax, [DATA2:fardata_label]   ; sreg: SS, frame: FARDATA, segment: FARDATA

    ; Overriding with CODESEG has the same effect as overriding with DATA2,
    ; except the CS register is used instead. None of the instructions will
    ; work, since none of them will have offsets relative to CODESEG, the segment
    ; loaded into CS.

    mov ax, [CODESEG:data1_label]   ; sreg: CS, frame: DATA1,   segment: DATA1
    mov ax, [CODESEG:data2_label]   ; sreg: CS, frame: DATA2,   segment: DATA2
    mov ax, [CODESEG:fardata_label] ; sreg: CS, frame: FARDATA, segment: FARDATA

    ; Using DGROUP as an override on fardata_label will work so long as
    ; fardata_label doesn't end up getting placed before the start of the DGROUP,
    ; or someplace 64K beyond the start of DGROUP.  If it does end up outside
    ; DGROUP then the linker will give an error.  The assembler is unable to
    ; detect this.

    mov ax, [DGROUP:fardata_label]  ; sreg: DS, frame: DGROUP,  segment: FARDATA

    ; Using a single segment override on a number works as expected, but
    ; with multiple overrides only the left-most override has an effect.

    mov ax, [CODESEG:0]             ; sreg: CS, frame: CODESEG, segment: CODESEG
    mov ax, [DGROUP:0]              ; sreg: DS, frame: DGROUP,  segment: DGROUP
    mov ax, [DGROUP:CODESEG:0]      ; sreg: DS, frame: DGROUP,  segment: DGROUP
    mov ax, [CODESEG:DGROUP:0]      ; sreg: CS, frame: CODESEG, segment: CODESEG
    mov ax, [FARDATA:CODESEG:0]     ; sreg: ES, frame: FARDATA, segment: FARDATA
CODESEG ENDS

    END start

If you change the MOV instructions so that they use SEG on the address (e.g. mov ax, SEG data1_label or mov ax, SEG DATA2:data1_label), then the SEG operator will evaluate to what's given as the "frame" in the comments.
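As a sketch of that substitution, here is a cut-down version of the example above with a few of the loads rewritten to use SEG. The "expected" values are simply copied from the frame column of the corresponding comments in the example; they are not separately verified output.

```asm
DATA1   SEGMENT PARA
data1_label DW  0
DATA1   ENDS

DATA2   SEGMENT PARA
data2_label DW  4
DATA2   ENDS

DGROUP  GROUP   DATA1, DATA2

FARDATA SEGMENT PARA
fardata_label DW 10
FARDATA ENDS

CODESEG SEGMENT PARA PUBLIC 'CODE'
start:
    mov ax, SEG data1_label          ; expected frame: DGROUP
    mov ax, SEG data2_label          ; expected frame: DGROUP
    mov ax, SEG fardata_label        ; expected frame: FARDATA
    mov ax, SEG DATA2:data1_label    ; expected frame: DATA1 (not DATA2)
    mov ax, SEG DATA2:data2_label    ; expected frame: DATA2
    mov ax, SEG DGROUP:fardata_label ; expected frame: DGROUP (linker checks the range)
CODESEG ENDS

    END start
```

No ASSUME statements are needed here because SEG yields an immediate value and no memory operand is involved.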

The moral of the story is that you almost never want to use segment names as segment overrides with MASM, as it almost certainly won't do what you want. There's also rarely any need to use group names as segment overrides, as the assembler will use the group by default for anything defined in a group. (Note this was different with MASM 5 or earlier, where you had to use OFFSET DGROUP:label if the label was part of DGROUP in order to get the right result.)

The only really useful way to use segment overrides with MASM is with a segment register on the left-hand side. That can serve as an alternative to using ASSUME, or it can be used when there's no label involved in a memory operand, as with indexed addressing (e.g. mov ax, es:[di]).
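A minimal sketch of both uses, reusing the FARDATA names from the example above. The deliberate absence of an ASSUME for ES is my own choice for illustration, to show the register override doing the work that ASSUME would otherwise do.

```asm
FARDATA SEGMENT PARA
    DW  6, 8
fardata_label DW 10
FARDATA ENDS

CODESEG SEGMENT PARA PUBLIC 'CODE'
    ASSUME  cs:CODESEG
start:
    mov ax, FARDATA
    mov es, ax

    ; Nothing is ASSUMEd for ES, so the register override is what tells
    ; the assembler to reach fardata_label through ES.
    mov ax, es:[fardata_label]

    ; No label in the memory operand: only the register override selects
    ; the segment register (the default for [di] would be DS).
    mov di, OFFSET fardata_label
    mov ax, es:[di]
CODESEG ENDS

    END start
```

Unlike a segment-name override, a register override names the segment register directly, so there is no ASSUME lookup involved.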
