Showing posts with label assembly. Show all posts

Thursday, April 22, 2021

Boolean and Comparison Instructions

AND

  • Performs a bitwise AND on each pair of matching bits of the source and destination operands
  • Result is stored in the destination operand

AND reg,reg
AND reg,mem
AND reg,imm
AND mem,reg
AND mem,imm


  • Example:

mov al,00000001b  ; al = 00000001b
and al,00000000b  ; al = 00000000b


  • Can be used to clear selected bits while leaving the others unchanged (bit masking)

and AL,11110110b    ; clear bits 0 and 3, leave others unchanged


  • Flags being modified:
    • Always clears overflow and carry flags
    • Modifies Sign, Zero, and Parity flags
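The masking and flag rules above can be checked with a small Python model. This is only a sketch of an 8-bit AND, not assembly; the function name and8 and the sample value in AL are assumptions made for illustration.

```python
def and8(dest, src):
    """Model of an 8-bit AND: returns the result and the flags it affects."""
    result = dest & src & 0xFF
    ones = bin(result).count("1")
    flags = {
        "CF": 0,                    # always cleared by AND
        "OF": 0,                    # always cleared by AND
        "ZF": int(result == 0),     # set when the result is zero
        "SF": result >> 7,          # copy of the highest bit (bit 7)
        "PF": int(ones % 2 == 0),   # set when the result has an even number of 1 bits
    }
    return result, flags

# Hypothetical AL value; the mask is the one from the example above.
result, flags = and8(0b10111111, 0b11110110)
print(f"{result:08b}", flags)
```

Bits 0 and 3 of the result come out as 0 while every other bit keeps the value it had in the first operand.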

OR

  • Similar to AND but performs OR instead

OR reg,reg
OR reg,mem
OR reg,imm
OR mem,reg
OR mem,imm


  • Can be used to modify a single bit leaving others unchanged

mov al,00000000b
or al,00000100b   ; set bit 2, leave others unchanged 


  • Flags being modified (same as AND):
    • Always clears overflow and carry flags
    • Modifies Sign, Zero, and Parity flags according to the result


XOR

  • Can be used to flip bits. XOR is reversible (applying the same operand twice restores the original value), which makes it useful for simple symmetric encryption


  • Can be used to check whether a byte has even parity (parity flag set, even number of 1’s) or odd parity (parity flag clear, odd number of 1’s). XOR with 0 leaves the value unchanged but updates the flags.

mov al,10110101b  ; 5 bits = odd parity
xor al,0          ; Parity flag clear (odd)
mov al,11001100b  ; 4 bits = even parity
xor al,0          ; Parity flag set (even)


  • Can be also used to check parity of a 16-bit integer - 16-bit parity

mov ax,64C1h ; 0110 0100 1100 0001
xor ah,al    ; Parity flag set (even) 


  • XOR’ing an integer with itself results in 0. That means it sets the Zero flag, and also the Parity flag since the result is all 0’s (an even number of 1’s).

mov al,22h   ; al = 22h
xor al,22h   ; al = 0
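The XOR properties above (reversibility, byte parity, and the 16-bit parity trick) can be reproduced in a short Python sketch. The helper names, the sample text, and the key 5Ah are assumptions for illustration only.

```python
def parity_flag(value):
    """PF as x86 defines it: 1 when the low byte has an even number of 1 bits."""
    return int(bin(value & 0xFF).count("1") % 2 == 0)

def xor_bytes(data, key):
    """XOR every byte with a one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)

# Reversible property: "encrypt" and "decrypt" are the same operation.
secret = xor_bytes(b"ABCDE", 0x5A)
restored = xor_bytes(secret, 0x5A)
print(restored)

# 16-bit parity: XOR the high byte with the low byte, then look at PF.
ax = 0x64C1
ah, al = ax >> 8, ax & 0xFF
print(parity_flag(ah ^ al))
```

XOR-ing the two halves works because a pair of 1 bits in the same position cancels out, which never changes whether the total count of 1 bits is even or odd.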


NOT

  • Inverts bits

NOT reg
NOT mem


  • Doesn’t affect any flags
  • Example:

mov al,11110000b
not al               ; AL = 00001111b


TEST

  • Performs an implied AND (doesn’t modify the destination operand)
  • Example:

test al,00001001b     ; test bits 0 and 3


  • Can be used to check whether a bit is set (also works on multiple bits at once) - modifies the Zero flag



; Example for testing status of a device using bit 5
mov al,status

; If bit 5 in status is set, this will clear zero flag.
; Otherwise it will set zero flag.
test al,00100000b
jnz DeviceOffline

; Can also be used on multiple bits: here bits 0, 1 and 4.
; Zero flag is cleared if any of them is set.
test al,00010011b
jnz InputDataByte


  • Can be used to test if an integer is even or odd

mov eax,18
test eax,1   ; ZF = 1
mov eax,17
test eax,1   ; ZF = 0


CMP

  • Performs an implied subtraction of the source operand from the destination operand (neither is modified)

CMP destination,source


  • Modifies the Overflow, Sign, Zero, Carry, Auxiliary Carry, and Parity flags, exactly as SUB would



  • Example:

; Let’s look at three code fragments showing how flags are affected by the CMP
; instruction. When AX equals 5 and is compared to 10, the Carry flag is set
; because subtracting 10 from 5 requires a borrow
mov ax,5
cmp ax,10 ; ZF = 0 and CF = 1

; Comparing 1000 to 1000 sets the Zero flag because subtracting the source from
; the destination produces zero:
mov ax,1000
mov cx,1000
cmp cx,ax ; ZF = 1 and CF = 0

; Comparing 105 to 0 clears both the Zero and Carry flags because subtracting 0
; from 105 generates a positive, nonzero value
mov si,105
cmp si,0 ; ZF = 0 and CF = 0
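The three CMP fragments above can be replayed with a Python model of a 16-bit comparison. This is a sketch that only tracks four of the flags; the function name cmp16 is an assumption.

```python
def cmp16(dest, src):
    """Model of a 16-bit CMP: compute dest - src and keep only the flags."""
    dest &= 0xFFFF
    src &= 0xFFFF
    diff = (dest - src) & 0xFFFF

    def sign(v):
        return v >> 15                # highest bit of a 16-bit value

    return {
        "ZF": int(diff == 0),         # operands are equal
        "CF": int(dest < src),        # unsigned borrow was needed
        "SF": sign(diff),             # result looks negative
        "OF": int(sign(dest) != sign(src) and sign(diff) != sign(dest)),
    }

print(cmp16(5, 10))
print(cmp16(1000, 1000))
print(cmp16(105, 0))
```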


Manipulating individual CPU flags

We can manipulate (set or clear) CPU flags using the boolean and comparison instructions. Here are some examples.


  • Zero flag

test al,0   ; set Zero flag
and al,0    ; set Zero flag
or al,1     ; clear Zero flag


  • Sign flag - operates against the highest bit of the destination operand

or al,80h    ; set Sign flag
and al,7Fh   ; clear Sign flag


  • Carry flag

stc   ; set Carry flag
clc   ; clear Carry flag


  • Overflow flag

mov al,7Fh   ; AL = +127
inc al       ; AL = 80h (-128), OF=1
or eax,0     ; clear Overflow flag


How do we mimic conditional statements like in higher level languages?

Here is an example program that finds the larger of 2 integers.


; filename: LargerOfTwoIntegers.asm

mov edx,eax     ; assume EAX is larger
cmp eax,ebx     ; if EAX is >= EBX (unsigned comparison)
jae L1          ; jump to L1
mov edx,ebx     ; else move EBX to EDX
L1:             ; EDX contains the larger integer
  ...
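JAE treats both registers as unsigned values. The Python sketch below contrasts that with a signed comparison (what a JGE version would do) on the same bit patterns. The function names are assumptions, and the registers are modelled as plain integers.

```python
def to_signed32(value):
    """Reinterpret a 32-bit pattern as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def larger_unsigned(eax, ebx):
    """What the JAE version computes: bit patterns compared as unsigned."""
    return eax if (eax & 0xFFFFFFFF) >= (ebx & 0xFFFFFFFF) else ebx

def larger_signed(eax, ebx):
    """What a JGE version would compute: the same patterns compared as signed."""
    return eax if to_signed32(eax) >= to_signed32(ebx) else ebx

# FFFFFFFFh is 4,294,967,295 when unsigned but -1 when signed.
print(hex(larger_unsigned(0xFFFFFFFF, 1)))
print(hex(larger_signed(0xFFFFFFFF, 1)))
```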

Sunday, April 11, 2021

Conditional Structures

Example 1 - Simple conditional

Pseudocode:


if( op1 == op2 )
{
  X = 1; 
  Y = 2;
} 


Assembly code:


  mov eax,op1
  cmp eax,op2
  jne L1
  mov X,1
  mov Y,2 
L1:
  ...


Example 2 - NTFS

Pseudocode:


    clusterSize = 8192;
    if terrabytes < 16
      clusterSize = 4096;


Assembly code:


  mov clusterSize,8192 
  cmp terrabytes, 16
  jae next
  mov clusterSize,4096 
next:
  ...


Example 3 - If Else

Pseudocode:


if op1 > op2
  call Routine1
else
  call Routine2
end if 


Assembly code:


  mov  eax,op1
  cmp  eax,op2
  jg   A1
  call Routine2
  jmp  A2
A1:
  call Routine1
A2:
  ...


Example 4 - nested If Else

Pseudocode:


if op1 == op2
  if X > Y
    call Routine1
  else
    call Routine2
  end if
else
  call Routine3
end if 


Assembly code:


  mov eax,op1 
  cmp eax,op2
  jne L2
  mov eax,X
  cmp eax,Y
  jg L1
  call Routine2
  jmp L3 
L1:
  call Routine1
  jmp L3
L2:
  call Routine3 
L3:
  ...

Tuesday, March 16, 2021

INC and DEC Instructions

  • Adds or subtracts 1 from the target operand

inc ax
dec ax


  • If the target operand is a smaller register, the other parts of the larger register will not be affected.

mov eax,1002FFFFh
; eax = 1002FFFFh

inc ax
; FFFFh + 1 will be 10000h but the higher 16 bits will not
; be affected so eax = 10020000h

mov eax,30020000h
dec ax
; The same is true for subtracting: the higher bits will not be
; affected, so eax = 3002FFFFh
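The partial-register behaviour can be written as a mask operation. In this Python sketch, inc_ax and dec_ax are assumed helper names that model EAX as a 32-bit integer and touch only its low 16 bits.

```python
def inc_ax(eax):
    """INC AX: add 1 to the low 16 bits, leave the high 16 bits alone."""
    return (eax & 0xFFFF0000) | ((eax + 1) & 0xFFFF)

def dec_ax(eax):
    """DEC AX: subtract 1 from the low 16 bits, leave the high 16 bits alone."""
    return (eax & 0xFFFF0000) | ((eax - 1) & 0xFFFF)

print(f"{inc_ax(0x1002FFFF):08X}")
print(f"{dec_ax(0x30020000):08X}")
```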

Saturday, March 13, 2021

Working with Strings

  • A string is just an array with characters as its elements. Each character is stored as its ASCII code (e.g. "A" is 41h).

.data
myString BYTE "A", "B", "C", "D", "E"

.code
mov esi,offset myString



  • The declaration above can be shortened in this way to mimic a traditional “string” in high level programming languages.

.data
myString BYTE "ABCDE"

Friday, March 12, 2021

ALIGN Directive

  • Aligns the next variable on an address that is a multiple of the given bound (e.g. an even address for a bound of 2) so that the CPU can access it faster

ALIGN <bound>


  • Bound can be 1, 2, 4, 8 or 16 bytes
  • Here is an actual implementation. If we didn’t use align, wVal would be located at 0x003E4001.

.data
bVal BYTE 11h         ; offset = 0x003E4000
align 2
wVal WORD 2222h       ; offset = 0x003E4002
bVal2 BYTE 33h        ; offset = 0x003E4004
align 4
dVal DWORD 44444444h  ; offset = 0x003E4008
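The offsets in the comments above follow from rounding the next free address up to a multiple of the bound. A minimal Python sketch, assuming a hypothetical helper align_up and a power-of-two bound:

```python
def align_up(offset, bound):
    """Round offset up to the next multiple of bound (bound must be a power of two)."""
    return (offset + bound - 1) & ~(bound - 1)

# bVal occupies 0x003E4000, so the next free byte is 0x003E4001.
print(hex(align_up(0x003E4001, 2)))
# bVal2 occupies 0x003E4004, so the next free byte is 0x003E4005.
print(hex(align_up(0x003E4005, 4)))
```

An offset that is already a multiple of the bound is returned unchanged, so no padding is added in that case.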



Thursday, March 11, 2021

Using Scale Factors

To extract an array element, we normally compute its distance in bytes from the beginning of the array.


.data
myArray BYTE 10h,20h,30h,40h

.code
mov al,myArray     ; gets 10h
mov al,myArray+2   ; gets 30h


Extracting an element from a DWORD array is more tedious because each element takes 4 bytes, so the subscript has to be multiplied by 4 by hand.


.data
myArray DWORD 10h,20h,30h,40h,50h,60h

.code
mov eax,myArray     ; gets 10h
mov eax,myArray+4   ; gets 20h
mov eax,myArray+12  ; gets 40h


With scale factors, this job becomes easy. This code works for any element size; you just need to specify the subscript (the element you want to extract) via ESI.


.data
myArray DWORD 10h,20h,30h,40h,50h,60h

.code
mov esi,3                          ; subscript 3 (the fourth element)
mov eax,myArray[esi*type myArray]  ; gets 40h
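The address arithmetic behind a scale factor is subscript times element size. The Python sketch below models the array as raw little endian bytes; the helper element and the constant TYPE_DWORD are assumed names standing in for the scaled operand and for type myArray.

```python
import struct

TYPE_DWORD = 4   # what "type myArray" evaluates to for a DWORD array

# Same six DWORD values as myArray, packed in little endian order.
my_array = struct.pack("<6I", 0x10, 0x20, 0x30, 0x40, 0x50, 0x60)

def element(memory, index, type_size):
    """Read the element at byte offset index * type_size."""
    start = index * type_size
    return int.from_bytes(memory[start:start + type_size], "little")

print(hex(element(my_array, 3, TYPE_DWORD)))
```

The same function works for a BYTE array by passing 1 as the element size, which is the point of letting the assembler supply the scale.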

Wednesday, March 10, 2021

Processor and Memory Architectures

Von Neumann





  • Uses a shared memory region
  • Has several security implications because code and data share the same memory (e.g. a buffer overflow can overwrite data that is later executed as code)
  • Uses one bus for data and instruction

Harvard Architecture



  • Uses separate memory regions for instructions (opcode) and data
  • Provides parallel access
  • Uses different bus for data and instruction
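  • Modern x86 and ARM processors use a modified form of this (separate instruction and data caches in front of a shared main memory)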

Harvard vs Von Neumann



Virtual Memory

  • A method of memory management
  • Enables each application to operate in its own memory space
  • Allows multiple programs to access memory without interfering with one another’s data
  • Allows extending physical memory size by moving sections of memory to secondary storage (e.g. disk) to free up memory space
    • Page - fixed size memory sections that are being allocated or moved to secondary storage
    • Page swapping - process of moving pages
    • Swap file - contains the pages
  • Access violation - an attempt to access memory that the process is not permitted to access
  • A memory page can be:
    • readonly
    • executable
    • shared (for interprocess communication)
  • Example virtual-to-physical address translation in Windows NT


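  • Page Fault - exception raised when a program accesses a page that is not currently mapped in physical memory; control transfers to the OS, which resolves the mapping and resumes the program (not really an error)
    • Soft Fault - the page is still in physical memory, so no disk access is needed
    • Hard Fault - the page has been swapped to secondary storage and must be read back in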
  • Page status bits
  • Page Frames
    • Contiguous 4KB section of Windows NT physical memory

Memory Pools

Example below are for Windows NT


  • Non-paged pool - contains page frames that must remain resident in memory at all times (e.g. for performance reasons)
  • Paged pool - can be swapped out to disk
  • PFN (Page Frame Number) database
    • Active
    • Standby
    • Modified
    • Free
    • Zeroed
    • Bad

Memory Management Unit (MMU)

  • Processor component that controls memory allocation, address translation and protection functions
  • Contains a cache (the translation lookaside buffer, or TLB) - improves the speed of memory access by avoiding the need to traverse the page table directory and perform a page table lookup during each memory access

Each time the processor needs to access physical memory, which may occur multiple times during the execution of a single instruction, it first checks the TLB's associative memory to determine whether the translation information is resident in the TLB. If it is, the instruction immediately uses the information stored in the TLB to access physical memory. If the TLB does not contain the requested translation, a TLB miss occurs and the processor must traverse the page table directory and a page table to determine the translated address, assuming the referenced page is resident in memory.
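The lookup order described above (TLB first, page tables second) can be modelled with two dictionaries. This is a heavily simplified Python sketch: the single-level page_table, the 4KB PAGE_SIZE, the function translate, and the frame numbers are all assumptions for illustration, not how any particular MMU stores its tables.

```python
PAGE_SIZE = 4096   # assumed 4KB pages

def translate(vaddr, tlb, page_table):
    """Return (physical address, "hit" or "miss"); raise if the page is not resident."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    if page in tlb:
        # Translation is cached: no page table walk needed.
        return tlb[page] * PAGE_SIZE + offset, "hit"
    if page not in page_table:
        # Page is not resident in memory: the OS would have to handle this.
        raise LookupError("page fault: page is not resident")
    tlb[page] = page_table[page]          # cache the translation for next time
    return tlb[page] * PAGE_SIZE + offset, "miss"

page_table = {2: 7}   # hypothetical mapping: virtual page 2 -> physical frame 7
tlb = {}
print(translate(0x2010, tlb, page_table))
print(translate(0x2010, tlb, page_table))
```

The first access walks the page table and fills the TLB; the second access to the same page is served from the TLB.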


  • Other MMU functions
    • Separation of virtual memory into kernel space and user space
    • Isolation of process memory
    • Page-level access restrictions
    • Detection of software problems
  • Null pointer exception - program error caused by an attempt to access an invalid memory location (e.g. $00000000)

Tuesday, March 9, 2021

PTR Operator

  • Overrides the declared size of an operand, which allows moving data between source and destination operands even though they are declared with different sizes
  • A common example is to move a part of larger operand to a smaller operand.

.data
bigVar DWORD 12345678h

.code
; mov ax,bigVar
; This will not be allowed.

mov ax,WORD PTR bigVar
; This will move 5678h into ax (remember little endian format)

Monday, March 8, 2021

Manipulating x86-32 Flags

Here is a sample program to demonstrate how the respective flags are affected by the ADD, SUB, INC, DEC, and NEG instructions.


.386
.model flat,stdcall
.stack 4096
ExitProcess proto,dwExitCode:dword

.data
Rval SDWORD ?
Xval SDWORD 26
Yval SDWORD 30
Zval SDWORD 40

.code
main PROC
  ; INC and DEC
  mov ax,1000h
  inc ax ; 1001h
  dec ax ; 1000h
  ; Expression: Rval = -Xval + (Yval - Zval)
  mov eax,Xval
  neg eax                  ; -26
  mov ebx,Yval
  sub ebx,Zval             ; -10
  add eax,ebx
  mov Rval,eax             ; -36
  ; Zero flag example:
  mov cx,1
  sub cx,1                 ; ZF = 1
  mov ax,0FFFFh
  inc ax                   ; ZF = 1
  ; Sign flag example:
  mov cx,0
  sub cx,1                 ; SF = 1
  mov ax,7FFFh
  add ax,2                 ; SF = 1
  ; Carry flag example:
  mov al,0FFh
  add al,1                 ; CF = 1, AL = 00
  ; Overflow flag example:
  mov al,+127
  add al,1                 ; OF = 1
  mov al,-128
  sub al,1                 ; OF = 1
  INVOKE ExitProcess,0
main ENDP
END main
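The Carry and Overflow comments in the program above can be cross-checked with a Python model of an 8-bit ADD. This is a sketch; add8 is an assumed name and only four flags are tracked.

```python
def add8(a, b):
    """Model of an 8-bit ADD: returns the result and four of the flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF

    def sign(v):
        return v >> 7                 # highest bit of an 8-bit value

    return result, {
        "CF": int(total > 0xFF),      # unsigned result did not fit in 8 bits
        "ZF": int(result == 0),
        "SF": sign(result),
        "OF": int(sign(a) == sign(b) and sign(result) != sign(a)),  # signed result did not fit
    }

print(add8(0xFF, 1))    # Carry flag example
print(add8(127, 1))     # Overflow flag example
```

Carry describes the unsigned interpretation and Overflow the signed one, which is why 0FFh + 1 sets CF but not OF, while 127 + 1 sets OF but not CF.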

Loops

  • LOOP is a conditional jump that repeats a block of statements a specific number of times, using ECX as the counter
  • The jump target must be within -128 to +127 bytes of the current location
  • Each time LOOP executes, it decrements ECX by 1 and then checks ECX (if 0, the loop stops; otherwise it jumps back to the label)
  • ECX must be initialized to a positive value first
  • If ECX is initialized to 0, the loop runs 4,294,967,296 times (the first decrement wraps 0 around to FFFFFFFFh)
  • Example:

; This adds 1 to eax 5 times 
mov eax,0
mov ecx,5
L1:
  add eax,1
  loop L1

; eax = 5
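The decrement-then-test behaviour, including the wraparound when the counter starts at 0, can be simulated in Python. In this sketch loop_iterations is an assumed helper, and the bits parameter exists only so the wraparound can be shown with an 8-bit counter instead of waiting for a 32-bit one.

```python
def loop_iterations(initial, bits=32):
    """Count how many times a LOOP body runs for a given starting counter."""
    mask = (1 << bits) - 1
    counter = initial & mask
    runs = 0
    while True:
        runs += 1                        # the body runs before LOOP tests the counter
        counter = (counter - 1) & mask   # LOOP decrements first...
        if counter == 0:                 # ...then falls through once it reaches zero
            return runs

print(loop_iterations(5))
print(loop_iterations(0, bits=8))        # 8-bit stand-in for the ECX = 0 case
```

With an 8-bit counter a starting value of 0 gives 2^8 runs; by the same reasoning a 32-bit ECX gives 2^32 runs.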

Assembling and Linking Process in 1 picture

 


Sunday, March 7, 2021

Tip in moving data to memory

In this code, we want to replace one WORD in myVar with 3.


.data
myVar DWORD 12345678h

.code
mov ax,3
mov WORD PTR myVar+2,ax
mov eax,myVar


The question is: which WORD will be replaced?


If we look closely, myVar has 2 WORD parts.


1234h - high-order WORD
5678h - low-order WORD


Remember that this will be stored in little endian format in memory, with the least significant byte at the lowest address, so the layout would be:


78 -- myVar+0 (1 byte storage)
56 -- myVar+1 (1 byte storage)
34 -- myVar+2 (1 byte storage)
12 -- myVar+3 (1 byte storage)


In our code above, we used PTR to replace the WORD that starts at myVar+2. From the layout, that is the high-order WORD:


34 -- myVar+2 (1 byte storage)
12 -- myVar+3 (1 byte storage)


So if we run the code, the resulting value of EAX is:


00035678h
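The result can be double-checked by replaying the store on a little endian byte buffer. This Python sketch uses the standard struct module; store_word is an assumed helper standing in for the mov WORD PTR instruction.

```python
import struct

def store_word(dword, offset, word):
    """Overwrite 2 bytes of a little endian DWORD at the given byte offset."""
    memory = bytearray(struct.pack("<I", dword))        # 12345678h -> 78 56 34 12
    memory[offset:offset + 2] = struct.pack("<H", word)
    return struct.unpack("<I", memory)[0]

# mov WORD PTR myVar+2,ax  with ax = 3
print(f"{store_word(0x12345678, 2, 3):08X}")
```

Using offset 0 instead would replace the low-order WORD (5678h) and leave 1234h in place.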

MOV Instruction

  • Copies data from source operand to destination operand

MOV   destination, source


  • Both operands must be the same size, so a smaller operand cannot be copied directly to a larger one. For example, to copy 16-bit data to a 32-bit register, first set the 32-bit register to 0 and then use its lower 16-bit part.

.data
count WORD 1 ; 16-bit value

.code
mov ecx,0    ; 32-bit register
mov cx,count ; lower 16-bit part


  • An immediate operand (no data label) has no fixed size, so the assembler encodes it at the size of the destination. A small constant therefore works with a larger register, and the upper bits of the destination end up as 0.

.code
mov eax,0ffh
; eax = 0x000000ff


  • If the source operand is a data label, it copies data from that memory location to the destination operand. To copy only the memory address (and not the data), use the OFFSET operator.

mov eax,offset myVal ; copies memory address 0x00B8102B to eax
mov eax,myVal       ; copies data contained in 0x00B8102B to eax
mov eax,[myVal]     ; same as above


  • If the source operand is a register and contains a memory address, you must dereference it if you want to get the actual data.

mov esi,offset myVal ; copies memory address 0x00B8102B to esi
mov eax,esi          ; copies memory address 0x00B8102B to eax
mov ecx,[esi]        ; copies data located in memory address 0x00B8102B to ecx


MOVZX (MOV with zero-extend)

  • Allows directly moving a smaller operand to a larger operand by filling the remaining upper bits with 0’s (like padding)


.data
byteVal BYTE 10001111b  ; byte is 8 bits 

.code
movzx ax, byteVal  ; ax is 16-bit; ax = 008Fh


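  • This can only be used with the following operand combinations (the destination must be a register):

MOVZX reg32,reg/mem8
MOVZX reg32,reg/mem16
MOVZX reg16,reg/mem8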



  • If the higher bits of the destination register have contents, those will be overwritten by 0.

MOVSX (MOV with sign-extend)


  • Similar to MOVZX
  • The upper bits of the destination operand are filled with copies of the source operand's sign bit (MSB), so the signed value is preserved

.data
byteVal BYTE 10001111b

.code
movsx ax, byteVal  ; ax = FF8Fh
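The difference between the two extensions is only in how the upper byte is filled. A Python sketch for the 8-bit to 16-bit case, with movzx16 and movsx16 as assumed helper names:

```python
def movzx16(byte):
    """Zero-extend an 8-bit value to 16 bits: the upper byte becomes 00h."""
    return byte & 0xFF

def movsx16(byte):
    """Sign-extend an 8-bit value to 16 bits: the upper byte copies bit 7."""
    byte &= 0xFF
    return byte | 0xFF00 if byte & 0x80 else byte

value = 0b10001111   # same bit pattern as byteVal (8Fh)
print(f"{movzx16(value):04X}")
print(f"{movsx16(value):04X}")
```

When bit 7 of the source is 0, both functions return the same value, so the two instructions only differ for bytes that are negative when read as signed.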


Offset Operator

  • Offset represents the distance of a data label from beginning of data segment
  • Let’s say we have these data definitions.

.data
bVal BYTE 10h
wVal WORD 2030h
dVal DWORD 40506070h
dVal2 BYTE 0ABh


  • The 4 variables above will be laid out in memory in this manner. bVal has offset 0, meaning it sits at the very start of the data segment. wVal has offset 1 because the 1 byte occupied by bVal (a BYTE) comes before it, and wVal itself takes 2 bytes since it's a WORD. dVal therefore has offset 3 and takes 4 bytes. The same is true for dVal2: the previous data is a DWORD, so 4 more bytes are skipped and its offset is 7.



  • Offset also represents the memory location where the data resides. So if we would convert the above diagram into arbitrary memory addresses, it can look like this. That means that bVal is located at exactly address 00404000h while wVal is at address 00404001h.



  • Given the above statements, offset operator can be used to determine the location of data or the memory address.

.data
myVal BYTE 10h

.code
mov eax,offset myVal
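The offsets of consecutive variables are a running sum of the sizes declared before them. The Python sketch below recomputes the layout of bVal, wVal, dVal and dVal2 from the earlier data definitions; the function offsets and the SIZES table are assumptions, and no ALIGN padding is modelled.

```python
SIZES = {"BYTE": 1, "WORD": 2, "DWORD": 4}

def offsets(declarations):
    """Map each variable name to its distance from the start of the data segment."""
    result, position = {}, 0
    for name, type_name in declarations:
        result[name] = position          # variable starts at the current position
        position += SIZES[type_name]     # next variable starts right after it
    return result

layout = offsets([("bVal", "BYTE"), ("wVal", "WORD"),
                  ("dVal", "DWORD"), ("dVal2", "BYTE")])
print(layout)
```

Adding the segment's base address (00404000h in the arbitrary example above) to each offset gives the memory address of the variable.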


Varying destination operand sizes

  • If we have this data definition,

myBytes  BYTE 10h,20h,30h,40h


  • If we get the offset of the first element of the array and load from it into a byte-size register, the result is straightforward.

mov esi,offset myBytes
mov al,[esi]            ; al = 10h


  • But if we change the destination operand into a word-size register, the behavior changes.

mov ax,[esi]        ; ax = 2010h


  • That happened because mov always reads as many bytes as the destination operand holds. A 16-bit destination reads 2 bytes starting at [esi], so the second element (20h) is picked up as well. Because of little endian order, the byte at the lower address (10h) becomes the low byte of ax and 20h becomes the high byte.
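The effect of the destination size on a load can be reproduced by reading 1, 2 or 4 bytes from the same address. In this Python sketch, load is an assumed helper and my_bytes mirrors the myBytes array.

```python
my_bytes = bytes([0x10, 0x20, 0x30, 0x40])   # same contents as myBytes

def load(memory, address, size):
    """Read size bytes starting at address and combine them in little endian order."""
    return int.from_bytes(memory[address:address + size], "little")

print(hex(load(my_bytes, 0, 1)))   # byte-size destination (al)
print(hex(load(my_bytes, 0, 2)))   # word-size destination (ax)
print(hex(load(my_bytes, 0, 4)))   # dword-size destination (eax)
```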