Solar Assembler Reference Manual


for Sol_Asm version 0.36.62.00

updated 09.05.2018

©   Copyright 2007,2018 Bogdan Valentin Ontanu. All rights reserved.


Chapter.1 Introduction

This document presents an overview of the syntax and usage of Solar Assembler. It assumes that the reader is familiar with assemblers and with assembly language programming.

Throughout this document the following terms and abbreviations are used:

Abbreviation      Description
Sol_Asm           Solar Assembler
OS                Operating system
Win32 or Win64    Windows 32 or 64 bits operating system
PE32 or PE64      Portable Executable Format - 32 or 64 bits
DLL               Dynamic Link Library
CDECL             C default calling convention
STDCALL           Win32 API default calling convention
OMF               Object Module Format - OBJ format specification
COFF              Common Object File Format - OBJ format specification
ELF               Executable and Linking Format - OBJ format specification
HLL               High Level Language

In syntax definitions, curly braces "{}" enclose a text, name or value that you have to supply.

The one exception to this rule is the STRUCTURE initialization section, where "{}" are part of the syntax.

1.1.Design Goals

SOL_ASM is designed from the point of view of a developer who uses ASM as the main programming language. Hence Sol_Asm tries to ease the development of huge ASM-only projects.

However Sol_Asm can also be used as a low level assembler, without any help from the HLL directives.

Sol_Asm main features are:

In daily usage this means that:

It also means that SOL_ASM does contain a decent amount of HLL features like:

All HLL statements are implemented internally in SOL_ASM and code is generated for them at compile time (not by user included macros). This means that all those features can be used to start development with minimal includes.

Of course Sol_Asm is written in assembly language and compiled by Sol_Asm itself. That is why it is named sol_asm2 ... because sol_asm is building sol_asm2 ;)

1.2 Targets

Short term targets

The short term targets until alpha stage have been:

All of the short term targets have been achieved.

Long term targets

The long term targets are:

Most (but not all) of the long term targets have also been achieved.

1.3 Fair warning

Solar Assembler is stable and functional but still in development. This means that it still contains bugs and has a few missing features. Caution is advised to the user.
However it should be ok for personal / big ASM projects.

Lately I have used Sol_Asm to develop:

Those are all big and complex projects and Sol_Asm has proved itself suitable for them.

1.4 OS specific versions

This document assumes you are using the Win32 version of Sol_Asm. Other OS versions and details are not fully presented here.

However the OS specific versions of SOL_ASM are almost the same and share 99% of their code with the Win32 version.

The only differences are:

Windows

Linux

MacOSX

Chapter.2 Running Solar assembler

2.1 Invocation

You can execute SOL_ASM from the command line like this:

Syntax:

	sol_asm2 {input_file} {output_file} {-options}

or

	sol_asm2 {-options} {input_file} {output_file} 

Example:

	sol_asm2  -pe32  my_game.asm  my_game.exe 

2.2 Options

All command line options must be specified with the "-" prefix character. On Windows you can also use the "/" prefix character.

Help options:

Option Action Obs.
-h, -help This will print all help text and then exit  
-h0, -h1, -h2, -h3, -h4 This will print a limited part of help text and then exit
  • h0: help options
  • h1: format options
  • h2: PE sub-systems
  • h3: Other options
  • h4: Info options

Output options:

Option      Action and observations
-pe32       This will generate a Win32 Portable Executable.
-pe64       This will generate a Win64 PE executable.
-console    This will set the console sub-system (for Win32).
-dll        This will set the DLL characteristics of the PE (for Win32), that is, make a DLL.
-binary     This will generate a plain binary, useful for OS development or handcrafted formats.
-omf32      This will generate an OBJ file in OMF format. This OBJ can be linked with the ALINK linker.
-coff32     This will generate a 32 bit OBJ file in MS-COFF format. This OBJ can be linked with MS link, Polink, GOlink and other linkers, and used in projects that link multiple modules together as OBJ.
-coff64     This will generate a COFF 64 OBJ file.
-elf32      This will generate a 32 bit OBJ file in ELF format. It can be linked with LD or GCC on Unix-like systems.
-elf64      This will generate a 64 bit OBJ file in ELF format. It can be linked with LD or GCC on Unix-like systems.
-mac32      This will generate an OBJ file in MACHO format. This is still experimental or not finished.

For MacOSX it is still recommended that you use the ELF32/64 format and then convert from ELF to the MachO OBJ format before linking (e.g. by using objconv by Agner Fog).

Other options:

Option         Action and observations
-q             Be quiet: only error messages are shown (for makefiles).
-d equ_name    This will define the equ_name symbol at the command line. The value of the symbol is 1 (one) and can be tested later in source code.
-size          This will optimize the output for size. Using this option will usually result in more passes being done.
-dbg           This will generate debug info. Works for PE32, ELF and COFF OBJ; other debug formats and levels will follow.
-list          This will generate a listing file named: output_filename_list.lst. One extra pass will be done for the listing.
-list_pass     This will generate a series of listing files named: _list_1.lst list_2.lst ... one file for each pass. Compile speed will be slower with this option.
-bench         This will show a compiler/parser speed benchmark.

Info options:

Option         File name suffix    Action
-info          _info.lst           This will generate OllyDbg specific info. This can be loaded into OllyDbg with the Labelmaster plugin.
-info_proc     _proc.lst           This will generate a list of all PROC's and their arguments.
-info_stru     _stru.lst           This will generate a list of all STRUC's and their members info.
-info_equ      _equ.lst            This will generate a list of all EQU's items and their values.
-info_enum     _enum.inc           This will generate a list of all ENUM items and their values as EQU definitions.
-info_tkn      _tkn.lst            This will generate a list of known opcodes and directives.
-info_files    _files.lst          This will generate a list of include files and folders.
-info_reloc    _reloc.lst          This will generate a list of relocations.
-info_sect     _sections_N.lst     This will generate a list of sections for each pass.
-info_all                          This will generate all of the above info.

Notes
The info_files option

If you have a main ASM file that includes all other files then Sol_ASM will parse this tree and generate a list of all the files included in your project. The sub folders of your project's main ASM file will become "groups".

The generated list of files is in RadASM INI format. You can copy and paste it into a dummy project in order to transfer your existing non-RadASM project into a RadASM project.

2.3 OllyDbg specific debug info

Sol_Asm can generate a file named: output_filename_info.txt that will contain a list of your application LABELS, PROC's and their addresses. This file can be loaded in OllyDbg by the LabelMaster plug-in and can help symbolic debugging a lot. You will be able to see familiar code labels, PROC names, variable names and call stack in OllyDbg.

You can obtain Labelmaster plugin here

The same thing can be obtained by using the -dbg command line option, which will generate debug info inside the OBJ or PE32 files. However this simple ASCII format has some advantages (multiple addresses with the same name) and can be used with ease by your own custom debugging utilities.

Chapter 3. Program Setup

A series of initial statements is required for making a valid program. This is usually called the "red tape" that wraps the package.

Here is a sample of the most simple Sol_Asm program:

; minimal test file
section "code" class_code

	nop
	ret	

Only one declaration is absolutely required by SOL_ASM: the section declaration.

3.1 Sections

SOL_ASM divides a program into multiple sections.

You define a section like this:

Syntax:

	section {"section name"} {section_type}

Example:

	section "code" 	class_code
	section "data"  class_data
	section "idata" class_imports

At least one section must be defined before any code generation.

Section type     Description                 Attributes
class_code       for code                    CODE, EXECUTE, READ
class_data       for initialized data        INITIALIZED, READ, WRITE
class_bss        for not initialized data    READ, WRITE, RAW size = 0
class_imports    for imports                 INITIALIZED, READ, WRITE
class_relocs     relocations                 INITIALIZED, READ, WRITE
class_exports    exports                     INITIALIZED, READ
class_rsrc       for resources               work in progress

After you have defined your sections, you can switch between sections in the program body with ".section_name" like this:

.code
	; enter your code here
.data
	; enter some data definitions here
.code
	; return / continue to code section
Notes:

3.1.2 Section Name Alias

When defining a section you can provide an alias name like this:

Syntax

	SECTION	{section_name_for_program} {section_type} ALIAS {section_name_for_OS}

This is useful for linkers that have default section naming conventions or like to unite sections based on section name.

For example:

	SECTION "code"	CLASS_CODE ALIAS ".text"

This allows you to use the familiar ".code" and ".data" section selectors in your program and still output a section name that follows your linker's preferences.

Alternatively you can name your section ".text" and select it with "..text".

3.2 Imports

If your program uses the PE32 or PE64 format then you can specify the imported DLLs and the functions imported from each DLL.

3.2.1 Imports Definition

You define imports like this:

Syntax:

	FROM	 	{dll_name}
	IMPORT		{function_name} {[param_count]} {calling convention} ALIAS {alias_name} 

Example:

	from	 	kernel32.dll
	import		ExitProcess
	import		GetStdHandle [1] STDCALL ALIAS _GetStdHandle@4

	from		user32.dll
	import		MessageBox ALIAS MessageBoxA

The above example will import:

Each "import" statement belongs to the previous "from" statement.

Notes:

Alternative names for import keywords are:

3.2.2 Imports Alias

When importing an API name you can provide an alias name like this:

Syntax

	IMPORT	{function_name_for_program} ALIAS {function_name_for_OS}

For example:

	import	MessageBox alias MessageBoxA

This will allow your program to refer to ASCII or UNICODE versions of API using a single API name across your code.

3.2.3 Calling convention for imports

By default Sol_Asm considers all imported functions to be STDCALL for the binary and 32 bits formats, and WIN64 for the 64 bits formats.

You can establish a different calling convention of an imported function like this:

Example:

	import	Str_Printf CDECL/win64/stdcall/lin64

Additionally you can add the "varg" statement to mark an imported function that takes a variable number of arguments. This is useful for the lin64 calling convention.

3.2.4 Argument count for Imports

Sol_Asm does not need procedure prototypes because it extracts this information from your PROC definition, even if the definition appears after the procedure is used. However imported APIs are not defined in your sources, and hence by default Sol_Asm will not check the argument count for imported functions.

You can define the argument count for imports like this

Syntax

	IMPORT	{function_name} [{argument_count}]

Example:

	import	MessageBox [4] alias MessageBoxA

In this case Sol_Asm will check that an INVOKE statement for this function has 4 parameters. The IMPORT statement acts as a mini prototype.

3.3 EXTERN

With EXTERN you can define symbols external to your module (defined in other modules). This is useful when you generate OBJ files and link them together with an external linker.

Syntax

	EXTERN	{function_name} [{argument_count}] ALIAS {function_alias}

Example:

	extern AddAtom [1] alias _AddAtomA@4
Notes:

EXTERN is similar to IMPORT but it does not need a FROM_DLL statement

EXTERN symbols will be resolved by the linker at link time. Depending on your linker configuration they can be linked in statically or dynamically.

3.4 EXPORT

Using EXPORT you can export a procedure or a label from your program. It works for the PE32, DLL, COFF and ELF output formats.

For OBJ output formats it has the effect of making your symbol PUBLIC.

3.4.1 Define Exports

You define exported functions like this:

Syntax

	EXPORT {proc_or_label_name} ALIAS {export_name_for_output}
Notes:

3.5 Entry point

By default the entry point is at the start of the first section defined.

However you can specify another location like this:

Syntax:

	.ENTRY {symbol_name}

Example:

	.entry App_Init
...
App_Init:
		call	Main
		invoke	ExitProcess
		ret		
...

PROC MAIN
	...
	ret 
ENDP		
	

This will define "App_Init" label as the entry point of your program.

Notes:

3.6 Base address, ORG and DISP

These keywords allow you to set up the address where your code will or should be placed in memory at run time.

3.6.1 Base address

The default base address is setup like this:

You can setup a base address like this:

Syntax:

	BASE	{absolute_or_virtual_address}

Example:

	BASE	02000_0000h

This will make your executable base address start at 512M. The base address is the address of the first section. Each section is aligned at 4K and PE files also have an additional 4K header before the first section starts.
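The arithmetic in the paragraph above can be checked with a few lines of Python. Python is used here only as a calculator and is not part of Sol_Asm; the helper name align_up is illustrative and the 4K value is taken from the text above.

```python
# Value from the BASE example above: 02000_0000h
BASE_ADDRESS = 0x2000_0000
SECTION_ALIGN = 0x1000          # the 4K alignment mentioned in the text


def align_up(value: int, alignment: int = SECTION_ALIGN) -> int:
    """Round value up to the next multiple of alignment (hypothetical helper)."""
    return (value + alignment - 1) // alignment * alignment


# Base address expressed in megabytes (1M = 1024 * 1024 bytes)
base_in_mb = BASE_ADDRESS // (1024 * 1024)
print(base_in_mb)               # 512

# A section that needs 0x1234 bytes occupies two 4K pages once aligned
print(hex(align_up(0x1234)))    # 0x2000
```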

BASE has the same effect as an ORG and a DISP with the same value.

3.6.2 ORG

If you want a piece of code to be located at an absolute address (for OS development) or if you want to "jump around" inside code positions you can use the ORG directive:

Syntax:

	ORG	{absolute_address}

The ORG directive moves both the current address counter and output pointer in output file to the specified address.

Code and data will be generated at the new address and offset in output file.

3.6.3 DISP

The DISP directive will move the output pointer backward by a certain amount. The reason for this is to avoid having zeroes at the start of the output file after an ORG directive.

Syntax:

	DISP	{negative_move_size}

For example:

	org	0B000h
	disp	0B000h

These are the first lines of the Solar_OS System32 module.

This means that code is made to run at absolute address 0xB000 but the output pointer remains at:

	output_position = + B000h (because of org) - B000h (because of disp) = 0

And this way the generated binary will not contain 0xB000 zeroes or garbage at start of file.
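The interaction of ORG and DISP can be pictured with a toy model of the two counters described above. This is a Python sketch of the bookkeeping only, under the assumption that ORG sets both counters and DISP subtracts from the output pointer alone. The class and method names are invented for illustration and say nothing about how Sol_Asm is implemented.

```python
class OutputModel:
    """Toy model of the address counter and the output pointer."""

    def __init__(self) -> None:
        self.address = 0    # current address counter (run time address)
        self.output = 0     # output pointer (offset inside the output file)

    def org(self, absolute_address: int) -> None:
        # ORG moves both the address counter and the output pointer.
        self.address = absolute_address
        self.output = absolute_address

    def disp(self, move_size: int) -> None:
        # DISP moves only the output pointer backward.
        self.output -= move_size

    def emit(self, byte_count: int) -> None:
        # Generating code or data advances both counters.
        self.address += byte_count
        self.output += byte_count


model = OutputModel()
model.org(0xB000)
model.disp(0xB000)
print(hex(model.address), model.output)   # 0xb000 0

model.emit(16)                            # 16 bytes of generated code
print(hex(model.address), model.output)   # 0xb010 16
```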

3.7 Encoding Modes

Sol_Asm can encode 16bits, 32bits and 64bits ASM.

You switch between 16/32/64 bits encoding with:

	.USE16		- encode 16 bits
	.USE32		- encode 32 bits (default)
	.USE64		- encode 64 bits 
Note:

3.8 Include files

Your main asm file can include other asm files and so on.

Syntax:

	include		{include_path_and_file_name}

Alternatively you can include binary files.

Syntax

	incbin		{include_path_and_file_name}

You can also fine tune the binary include:

Syntax

	incfrom		{start_pos}, {size}, {include_path_and_file_name}

In this case all 3 parameters must be present.

Include size can be "?" if you want to include until the end of the file.

Examples:

	incfrom		512,1027,help2.txt	; skip 512 bytes, include 1027 bytes
	incfrom		128,?,help2.txt		; skip 128 bytes, include rest of file
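The byte range selected by incfrom can be sketched in Python as follows. This is only a model of the description above: the function name mirrors the directive for readability, None stands in for the "?" size, and the sample file is created on the fly for the demonstration.

```python
import tempfile
from pathlib import Path
from typing import Optional


def incfrom(start_pos: int, size: Optional[int], file_name: str) -> bytes:
    """Return the bytes an incfrom-like include would select.

    size=None plays the role of "?" (include until the end of the file).
    """
    data = Path(file_name).read_bytes()
    if size is None:
        return data[start_pos:]
    return data[start_pos:start_pos + size]


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "help2.txt"
    sample.write_bytes(bytes(range(256)) * 8)      # 2048 bytes of sample data

    part = incfrom(512, 1027, str(sample))         # skip 512, include 1027
    rest = incfrom(128, None, str(sample))         # skip 128, include the rest

print(len(part), len(rest))                        # 1027 1920
```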
Notes:

Chapter 4. Language elements

4.1 Numbers

SOL_ASM accepts numbers in the following formats:

Notes:

Examples:

	111_000_101b		- binary
	0FFFF_C0_00h		- hexadecimal
	1_000_000		- decimal

	10.2345			- floating point

Numbers do not have to start with "0" or a digit but that is good practice.
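To make the literal styles in the examples concrete, here is a small Python sketch that converts them to values. It covers only what the examples show (the "b" and "h" suffixes, "_" as a digit separator, plain decimal and floating point) and is an assumption about the notation, not a description of the Sol_Asm parser.

```python
def parse_number(text: str):
    """Convert one literal, written as in the examples above, to a value."""
    digits = text.replace("_", "").lower()   # "_" only groups digits
    if digits.endswith("h"):                 # check "h" first: 0ABh is hex
        return int(digits[:-1], 16)
    if digits.endswith("b"):
        return int(digits[:-1], 2)
    if "." in digits:
        return float(digits)
    return int(digits, 10)


print(parse_number("111_000_101b"))          # 453
print(hex(parse_number("0FFFF_C0_00h")))     # 0xffffc000
print(parse_number("1_000_000"))             # 1000000
print(parse_number("10.2345"))               # 10.2345
```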

Limitations:

4.2 Expressions

Expressions are statements like:

	(5+4)*7
	((PCI_DEVICES_MAX*PCI_ITEM_SIZE)/(4096+4096))+1
	((ETH_RX_APPS_MAX*4)/4096)+1
	<1 SHL 5>

Expressions can contain numbers, operators, parentheses and symbols.

Operators are:

Operator    Description                                Priority
"*"         multiplication                             1
"/"         division                                   1
+           addition                                   2
-           subtraction                                2
x SHL n     shift x left n times                       1
x SHR n     shift x right n times                      1
x ROL n     rotate x left n times                      1
x ROR n     rotate x right n times                     1
x XOR y     binary XOR                                 1
x AND y     binary AND                                 1
x OR y      binary OR                                  1
NOT x       binary NOT                                 2
-           unary minus                                1
RND N       obtain a random number in range [0...N]    1

Variables recognized in expressions

Variable        Description                             Priority
$               current address                         2
$adr            current address                         2
$$              current section base addr               2
$$$             format base addr                        2
$ofs            current offset in section               2
$rva symbol     RVA of symbol                           2
$pass           current pass nr                         2
$style token    token style (token, string modrm)       2
$type token     token type (register, label, etc)       2
$size token     token size (8,16,32,64 bits)            2
$value token    token value (reg code, label addr)      2

Limitations:

For example:

	< 1 SHL 5 >

The above expression does contain spaces and was therefore enclosed in < and >
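For readers who want to check what such expressions evaluate to, the following Python sketch models the shift and rotate operators and recomputes two of the sample expressions. It assumes 32-bit values and integer division; the manual does not state the operand width, so treat both as assumptions. ETH_RX_APPS_MAX uses the value 1024 from the EQU example in chapter 7.

```python
MASK32 = 0xFFFF_FFFF    # assumption: expression values are 32 bits wide


def shl(x: int, n: int) -> int:
    return (x << n) & MASK32


def shr(x: int, n: int) -> int:
    return (x & MASK32) >> n


def rol(x: int, n: int) -> int:
    n %= 32
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32


def ror(x: int, n: int) -> int:
    n %= 32
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32


ETH_RX_APPS_MAX = 1024

print((5 + 4) * 7)                              # 63
print(((ETH_RX_APPS_MAX * 4) // 4096) + 1)      # 2
print(shl(1, 5))                                # 32
print(hex(rol(0x8000_0001, 1)))                 # 0x3
print(hex(ror(1, 1)))                           # 0x80000000
```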

4.3 ModRM Expressions

ModRM expressions are used by the CPU in complex effective address calculations. Sol_Asm handles this kind of expression differently from normal expressions.

The generic layout is:

Syntax

	[{base_reg} + {scale}*{index_reg} + {displacement}]

Example

	mov	eax,[esi + 4*ecx + 1234h]
	mov	eax,[esi + INFO_CTX.name_len]

In the first example above:

	base_reg 	= esi
	scale		= 4
	index_reg	= ecx
	displacement	= 1234h

As per CPU specifications, the scale can be omitted or must be one of: 2, 4, 8.
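The address that the CPU computes from such an expression can be sketched in Python. The function below is an illustration of the base + scale*index + displacement formula for 32-bit addressing, not Sol_Asm code, and the register values in the usage lines are made up.

```python
def effective_address(base: int, index: int = 0, scale: int = 1,
                      displacement: int = 0) -> int:
    """Compute base + scale*index + displacement, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):       # an omitted scale counts as 1
        raise ValueError("scale must be omitted (1) or one of 2, 4, 8")
    return (base + scale * index + displacement) & 0xFFFF_FFFF


# mov eax,[esi + 4*ecx + 1234h] with esi = 1000h and ecx = 3 (sample values)
address = effective_address(base=0x1000, index=3, scale=4, displacement=0x1234)
print(hex(address))                     # 0x2240
```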

Limitations:
Notes

4.4 Small strings

This is a special kind of string that can be used as an instruction operand.

For example:

	mov	eax,"abcd"
	cmp	al,"-"

You can use the SWAP modifier to reverse the string

For example:

	cmp	eax," rox"		; compare with "xor " in reverse because of endian issues
	cmp	eax, swap "xor "	; same as above but much easier to read
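The "endian issue" mentioned in the comments can be reproduced in Python. The sketch assumes, as those comments suggest, that a small string is packed with its first character in the highest byte of the operand, while a dword loaded from memory has its first byte in the lowest position. The function names are invented for this illustration.

```python
def small_string(text: str) -> int:
    """Operand value of a quoted string (assumed: first char in highest byte)."""
    return int.from_bytes(text.encode("ascii"), "big")


def swap_small_string(text: str) -> int:
    """Operand value when the SWAP modifier reverses the string first."""
    return small_string(text[::-1])


def load_dword(memory: bytes) -> int:
    """Value a little-endian CPU sees after mov eax,[memory]."""
    return int.from_bytes(memory[:4], "little")


in_register = load_dword(b"xor ")                  # text as it sits in memory

print(hex(in_register))                            # 0x20726f78
print(small_string(" rox") == in_register)         # True
print(swap_small_string("xor ") == in_register)    # True
print(small_string("xor ") == in_register)         # False
```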
Limitations:

4.5 User defined symbols

They are used as names for labels, procedures, etc in the program.

For Example:

	System32_Start:

User defined symbols are case sensitive and can contain underscores "_", digits and special characters, but can not contain CR, LF, space, "<", ">" or comma.

They do not have to start with a letter... (but that is good practice).

The max symbol size is 128 bytes.

4.5 Comments

4.5.1 Line comments

This kind of comment starts with the ";" character and extends until the end of the line.

For example:

	; this is a single comment on a line

	mov	eax,1		; this comment is at end of line

4.5.2 Block comments

Block comments are made with: "/*" and "*/"

For example:

; comment out this debug code
/*
	ODS_str	<13,10,"+++ Equ_Create">
	ODS_token
	
	; notice here how block comments can be nested
	/*
	ODS_fmt	<13,10,09,"equ_create: value=%x">, eax
	*/

*/
Note:

4.6 Very Long Lines

You can continue a long line on the next line with the "\" symbol.

For Example:

invoke	CreateWindowExA,0,class_name,wnd_title,\
			WS_OVERLAPPED+WS_CAPTION+WS_SYSMENU+\
			WS_THICKFRAME+WS_MINIMIZEBOX+WS_MAXIMIZEBOX,\
			32,64,320,240,0,0,[module_handle],0
Note:

4.7 Keywords

Keywords are:

For Example:

	MOV, XOR, ADD, SUB, JMP, CALL		- are opcode mnemonics 
	EAX, ECX, ST0, MM0, RAX, XMM1		- are register names
	PROC, STRUC, .entry, ORG, INVOKE	- are SOL_ASM directives

Keywords are case insensitive.

4.8 Special symbols

SOL_ASM rarely treats a symbol in a special way. All symbols are born equal :P However there are exceptions:

Special Character       Description                                                          Notes
SPACE, TAB or ","       are used as separators for tokens                                    can not be part of user tokens
CR, LF                  line end and separators for tokens                                   can not be part of user tokens
":"                     defines a code label when used as suffix after a user symbol         can be part of user tokens
"$"                     means current address counter                                        can be part of user tokens
"?"                     means "do not care" / "non initialized" in data define statements    can be part of user tokens
"."                     hints a section selection when followed by a section name;           can be part of user tokens
                        means a structure member name separator in {structure}.{member}
" " (double quotes)     encloses a string                                                    can be part of user tokens
' ' (single quotes)     encloses a string                                                    can be part of user tokens
< >                     means LITERAL: multiple tokens enclosed by < > and separated by      has use restrictions
                        spaces or comma will be considered as one ;)
"[" "]"                 encloses a Mod_RM address expression                                 has use restrictions
"{" "}"                 used to enclose structure initialization statements;                 can be part of user tokens but has use restrictions
                        also used instead of < >

The following symbols have a special meaning only inside a MACRO body:

Special Character       Description                                                          Notes
"@"                     means MLOCAL when used as a prefix inside a MACRO                    can be part of tokens
"&"                     triggers MARG check and expansion even when inside a token           can be part of tokens

As you can see Sol_Asm is relatively tolerant toward the use of special symbols inside user defined tokens.

4.8.1 Build Time

The special symbol "$time" means current build time in OS_TIME format and it creates a data definition.

STRUC OS_TIME
	year		dw	?
	month		dw	?
	day_of_week	dw	?
	day_of_month	dw	?
	
	hour		dw	?
	minute		dw	?
	second		dw	?
	mili_sec	dw	?
ENDS

When included in the source, this data definition will be updated by SOL_ASM at each compile time.

For example:

build_time:
	;-------------------------------------
	; compile time symbol, 
	; value is filled in by assembler
	;-------------------------------------
	$time				
	db	0	
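The 16 bytes that $time stands for can be pictured with Python's struct module. This is a sketch of the OS_TIME layout shown above (eight 16-bit words, little-endian). The day_of_week numbering (0 = Sunday) is an assumption borrowed from the Windows SYSTEMTIME convention, which has the same field order; the manual does not state it.

```python
import struct
from datetime import datetime


def pack_os_time(moment: datetime) -> bytes:
    """Pack a datetime into the eight words of the OS_TIME layout."""
    day_of_week = (moment.weekday() + 1) % 7     # assumption: 0 = Sunday
    return struct.pack(
        "<8H",
        moment.year, moment.month, day_of_week, moment.day,
        moment.hour, moment.minute, moment.second,
        moment.microsecond // 1000,              # mili_sec
    )


stamp = pack_os_time(datetime(2018, 5, 9, 12, 30, 45, 250000))
print(len(stamp))                                # 16
print(struct.unpack("<8H", stamp))               # (2018, 5, 3, 9, 12, 30, 45, 250)
```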

Chapter 5. Data definitions

5.1 Define initialized data

You can define initialized data like this:

Syntax:

	{label_name}	db	{data_item}	; define byte 	8 bits
	{label_name}	dw	{data_item}	; define word	16 bits
	{label_name}	dd	{data_item}	; define dword	32 bits

	{label_name}	dq	{data_item}	; define qword	64 bits
	{label_name}	dt	{data_item}	; define tword	80 bits
	{label_name}	do	{data_item}	; define oword	128 bits

Notes:

Additionally "db" does accept ASCII strings.

For example:

	my_dwords	dd	1,2,7,0FACE_BABEh,1356789,11

	my_string	db	"This is a message",0 

You can use the "?" special character to define non initialized data.

For example:

	my_var_db	db	?
	my_var_dw	dw	?
	my_var_dd	dd	?
Limitations:

5.2 Define unicode strings

You can define a Unicode string like this:

Syntax:

	{label_name}	du		"type your utf-8 string here",0	

The parser will read and interpret utf-8 encoded code points from the quoted string and translate them to 16 bits words.
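The translation described above can be imitated in Python to see which 16 bits words end up in the output. The helper name du_words is invented for this sketch. Note that for code points above U+FFFF Python emits UTF-16 surrogate pairs; whether Sol_Asm does the same is not stated in this manual, so the example stays within the 16-bit range.

```python
import struct


def du_words(text: str) -> list:
    """Return the 16-bit words for a string, going through UTF-8 first."""
    utf8_bytes = text.encode("utf-8")        # what the source file contains
    decoded = utf8_bytes.decode("utf-8")     # code points read by the parser
    utf16 = decoded.encode("utf-16-le")      # two bytes per word
    return list(struct.unpack("<%dH" % (len(utf16) // 2), utf16))


# "A", e with acute accent, euro sign, followed by the terminating 0
words = du_words("A\u00e9\u20ac") + [0]
print([hex(word) for word in words])         # ['0x41', '0xe9', '0x20ac', '0x0']
```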

5.3 Define Floating point data

Syntax:

	{label_name}	real4		{data_item}	; define REAL4 number	- 32bits
	{label_name}	real8		{data_item}	; define REAL8 number	- 64bits
	{label_name}	real10		{data_item}	; define REAL10 number	- 80bits	

For example:

	test1		real4		10.2345
	test2		real4		0.7785
	test3		real4		1_277_789.534
	test4		real4		999_123_456_789.37

	test1b		real8		10.2345
	test2b		real8		0.77854321773
	test3b		real8		1_277_789.534
	test4b		real8		999_123_456_789.37

	test1c		real10		10.2345
	test2c		real10		0.77854321773
	test3c		real10		1_277_789.534
	test4c		real10		999_123_456_789.37	

SOL_ASM performs real number conversions into the highest floating point precision available (80bits) and stores the result in requested format. Because of this "test4" above can not retain all defined digits but "test4c" can do it.
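The loss of digits in "test4" can be observed with Python's struct module, which can store a number as a 32-bit or a 64-bit float and read it back. Python has no portable 80-bit type, so the real10 case is not shown. This is only an illustration of IEEE single and double precision, not of the Sol_Asm conversion routine.

```python
import struct

value = 999_123_456_789.37

# Store as REAL4 (32 bits) and as REAL8 (64 bits), then read back
as_real4 = struct.unpack("<f", struct.pack("<f", value))[0]
as_real8 = struct.unpack("<d", struct.pack("<d", value))[0]

print(as_real4 == value)          # False: 24 bits of mantissa are not enough
print(abs(as_real4 - value))      # the error is in the thousands
print(as_real8 == value)          # True: a Python float is already 64 bits
```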

5.4 Reserve non initialized data

You can reserve data with the following keywords:

	{label_name}	rb	{count} 	; reserve byte(s)  =   8 x  bits
	{label_name}	rw	{count} 	; reserve word(s)  =  16 x  bits	
	{label_name}	rd	{count} 	; reserve dword(s) =  32 x  bits
	{label_name}	rq	{count} 	; reserve qword(s) =  64 x  bits
	{label_name}	rt	{count}		; reserve tbytes   =  80 x  bits
	{label_name}	ro	{count}		; reserve owords   = 128 x  bits

You can reserve structures like this:

	rs	{structure_name},{count} 	- reserve {count} structures	

For example:

	rb	1024		; reserve 1024 bytes	
	rw	17		; reserve   17 words
	rd	23		; reserve   23 dwords
	rs	WNDCLASS,77	; reserve   77 WNDCLASS structures	

5.5 Fill data buffers

You can fill initialized data buffers with the following keywords:

 {label_name}	fb	{count},{fill_value} 	; fill bytes	
 {label_name}	fw	{count},{fill_value} 	; fill words		
 {label_name}	fd	{count},{fill_value} 	; fill dwords	
 {label_name}	fq	{count},{fill_value} 	; fill qwords	

And for structures:

 {label_name}	fs	{struc},{count},{fill_value}	; fill structures 

5.6 Structure data definitions

Structure definitions are automatically promoted as data types and you can define a structure like this

Syntax:

	{label_name} {structure_name}	{data_item}

For example:

	my_class	WNDCLASS	?	
	my_ps		PAINTSTRUCT	?

This defines one WNDCLASS structure at label "my_class" with initial value unknown, and one PAINTSTRUCT structure at label "my_ps".

5.7. Structure Member data initializations

Considering the structure:

	STRUC POINT_3D
		x	dd	?
		y	dd	?
		z	dd	?
	ENDS

You can initialize structure members like this:

	my_pt_1		POINT_3D	{ 1 2 3 }				

	my_pt_2		POINT_3D	{  y = 2  z = 7  x = 1 }

	my_pt_3		POINT_3D {  
					x = 2  
					y = 7  
					z = 1 
				}

You can initialize sub structure members by name like this:

	STRUC POINT_2D
		x	dd	?
		y	dd	?
	ENDS
	
	STRUC CIRCLE_2D
		color		dd	?
		center		rs	POINT_2D,1
		radius		dd	?
	ENDS	
	
	my_circle_var	CIRCLE_2D {
					color = 00_7F_FF_3Fh
					center.x = 100
					center.y = 200
					radius = 77
				}

You can nest {} like this

	
	my_circle_var	CIRCLE_2D {
					color = 00_7F_FF_3Fh
					{ 100 200 }
					radius = 77
				}
	; or by name
	my_circle_var	CIRCLE_2D {
					color = 00_7F_FF_3Fh
					{ x = 100 y = 200 }
					radius = 77
				}				

You can also use {} for members that are made of multiple items but are not typed as structures (RB, RW, RD, RQ, RO) like this:

	
struc GUID
	dd1	dd	?
	dw1	dw	?
	dw2	dw	?
	bytes	rb	8
ends

my_guid	GUID { aaaa_bbbbh,cccch,ddddh { 1 2 3 4 5 6 7 8 } }
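The bytes produced by the GUID initialization above can be previewed with Python's struct module. The sketch assumes little-endian storage and no padding between members, which matches the member sizes listed in the structure (4 + 2 + 2 + 8 = 16 bytes) but is not confirmed by this manual.

```python
import struct

# dd1, dw1, dw2, then the 8 items of the "bytes" member
guid_bytes = struct.pack(
    "<IHH8B",
    0xAAAA_BBBB, 0xCCCC, 0xDDDD,
    1, 2, 3, 4, 5, 6, 7, 8,
)

print(len(guid_bytes))      # 16
print(guid_bytes.hex())     # bbbbaaaaccccdddd0102030405060708
```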
	

Chapter 6. General Code Syntax

SOL_ASM follows Intel style ASM syntax as opposed to AT&T syntax. The syntax reflects my personal preferences, which result from developing extensive applications in ASM. The following sub chapters present the most notable syntax issues.

6.1 Default ASM instruction syntax

Each ASM source line has this default layout:

Syntax:

	{label} {instruction} {parameter1} , {parameter2} {; comments}

For example:


read_pixel:	mov	eax,[esi]	; 32 bits ARGB format

All elements can be missing but if a {parameter} is present then {instruction} must also be present.

Directives do not have to follow this syntax.

6.2 Offset keyword and use of []

There is no "offset" keyword. The name of a variable or label automatically means "offset of". As a consequence you must always use brackets to obtain the "contents of" a variable.

For example:

	.data
		my_var		dd	37h
	.code
		mov	esi,my_var
		mov	edx,[my_var]
		mov	edx,[esi]		
	...

In the above example the first MOV will fill the ESI register with the offset of my_var (for example with 0x402000).

The second MOV will fill EDX with the content of my_var (37h in this example). The third MOV will do the same, but by using ESI as a pointer to my_var.

Notice the similarity of the second MOV with: MOV EDX,[ESI]

6.3 Size overrides

When needed or when wanted the user can override the operand size of encodings.

Available overrides are:

	byte		- force   8 bits
	word		- force  16 bits
	dword		- force  32 bits
	qword		- force  64 bits
	tbyte		- force  80 bits
	oword		- force 128 bits

	small		- force low word of symbol 

6.4 Structure members

Let us assume we have defined the following structure:

	STRUC INFO_CTX
		info_name		rb	128
		info_dword		dd	?
		info_word		dw	?
		info_byte		db	?	
	ENDS

And then we reserve a vector of 1024 such structures:

	my_info		rs	INFO_CTX,1024

Then the following rules apply for accessing structure members:

	mov	esi,my_info
	mov	eax,[esi + INFO_CTX.info_dword]		
	mov	[esi + INFO_CTX.info_word],2		; will move WORD 2
	mov	[esi + INFO_CTX.info_byte],1		; will move BYTE 1

	; go to next item in vector
	add	esi, size INFO_CTX

Observe how the structure member size will hint instructions for operand size when possible. This greatly reduces the need for "dword / word / byte" modifiers.

For example:

	movzx	eax,[esi + INFO_CTX.info_byte]
	movzx	eax,[esi + INFO_CTX.info_word]

is equivalent to:

	movzx	eax,byte [esi + INFO_CTX.info_byte]
	movzx	eax,word [esi + INFO_CTX.info_word]

But you do not have to use "byte" and "word" hints because of the structure that provides this information.

However in this example:

	mov	byte [esi],4

SOL_ASM will require the "byte" user size override / hint because there is no structure member hint available
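The values behind INFO_CTX.info_dword, INFO_CTX.info_word and "size INFO_CTX" can be worked out by hand or with the Python sketch below. It assumes that members are laid out one after another with no padding; the manual does not describe alignment rules for structures, so treat the numbers as an illustration. The helper name layout is invented.

```python
def layout(members):
    """Return ({member: offset}, total_size) for (name, size_in_bytes) pairs."""
    offsets = {}
    position = 0
    for name, size in members:
        offsets[name] = position
        position += size
    return offsets, position


INFO_CTX = [
    ("info_name", 128),     # rb 128
    ("info_dword", 4),      # dd
    ("info_word", 2),       # dw
    ("info_byte", 1),       # db
]

offsets, total_size = layout(INFO_CTX)
print(offsets)              # {'info_name': 0, 'info_dword': 128, 'info_word': 132, 'info_byte': 134}
print(total_size)           # 135

# Offset of info_word in the third item of the my_info vector (index 2)
print(2 * total_size + offsets["info_word"])    # 402
```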

6.5 Multiple instructions on the same line

You can write multiple assembly instructions on the same line. Sol_Asm will know when one instruction ends and the next one starts.

For Example

	push ebx  	push esi	push edi

	; init
	mov eax,1 	mov ecx,17	mov ebx,3
loop:
	xor ecx,ebx  	sub ebx,edx
	dec ecx 	jnz loop
 
	pop edi 	pop esi 	pop ebx
Notes:

6.7 Empty Spaces

In this development stage Sol_Asm can be very annoying about white space requirements. This behaviour is in part because the parser always considers spaces as token separators, no matter what. This helps parsing speed and eases debugging, but it also causes some problems.

It is my intention to remove those limitations in later versions, but for now you will have to know and respect them.

6.7.1. Expressions and spaces

The expression parser does handle white spaces, but the high level tokenizer breaks expressions on spaces. Because of this you must avoid spaces in expressions or, if you need spaces, enclose the whole expression in < and > or { and }.

;---------------
; this is OK
;---------------
mov	eax, (7*4)+(5*PACKET_SIZE)	; this is an expression with no spaces inside
mov	ecx, WND_CHILD+WND_MINIMIZE	; this is an expression	with no spaces inside
mov	ecx, size MY_STRU		; this is not an expression
mov	al, byte [esi]			; this is not an expression

;--------------------------------------------------------------
; this is NOT OK because expressions can not contain spaces
;--------------------------------------------------------------
mov	eax,(7*4) + (5 * PACKET_SIZE)
mov	eax,1 SHL 18					; this expression needs spaces
mov	ecx,WND_RESIZE OR WND_CHILD OR WND_MINIMIZE

;-----------------------------------------------
; this is made  OK by the use of < and >
;-----------------------------------------------
mov	eax, < (7*4) + (5 * PACKET_SIZE) >
mov	eax, < 1 SHL 18 >
mov	ecx, < WND_RESIZE OR WND_CHILD OR WND_MINIMIZE >
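Why the spaces matter can be shown with a deliberately simplified tokenizer written in Python. It is a model of the behaviour described in this section (spaces, tabs and commas always separate tokens, while < and > group everything between them into one token), not the actual Sol_Asm tokenizer. It ignores strings, { } and the .IF case, where < and > are comparison operators.

```python
def tokenize(line: str) -> list:
    """Split one source line into tokens, treating < ... > as one token."""
    code = line.split(";", 1)[0]            # drop a trailing line comment
    tokens, current, in_literal = [], "", False
    for ch in code:
        if in_literal:
            if ch == ">":                   # end of the literal
                tokens.append(current.strip())
                current, in_literal = "", False
            else:
                current += ch
        elif ch == "<":                     # start of a literal
            if current:
                tokens.append(current)
            current, in_literal = "", True
        elif ch in " \t,":                  # separators always end a token
            if current:
                tokens.append(current)
            current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


print(tokenize("mov eax,1 SHL 18"))
# ['mov', 'eax', '1', 'SHL', '18']  -> the expression is broken apart

print(tokenize("mov eax, < 1 SHL 18 >"))
# ['mov', 'eax', '1 SHL 18']        -> the expression stays in one piece
```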
	
Notes for expressions

6.7.2 .IF and Spaces

Runtime conditionals like .IF, .While or .Repeat need spaces around parentheses, conditions and logical operators.

;---------------
; this is OK 
;---------------
.if ( eax == 1 .and. ebx == 5 ) .or. ( [status] == 1 .and. [errors] == 0 )
	...	
.endif


;------------------
; this is NOT OK
;------------------
.if (eax==1 .and. ebx==5).or.([status]==1.and.[errors]==0)
	...	
.endif

;-----------------------------------------------------------------
; here use {} because < and > are conditional operators also
;-----------------------------------------------------------------
.if ( eax < { 7FFFFh SHR 5 } ) .and. ( edx > { 1 SHL 7} )
	...	
.endif
Notes for .IF

Chapter 7. Directives

7.1 EQUATES

Syntax:

	{symbol_name}	EQU	{value or expression}

Examples:

	equ1			equ	40
	ETH_RX_APPS_MAX		EQU	1024
	ETH_MEM_BLOCKS		EQU	((ETH_RX_APPS_MAX*4)/4096)+1	
	equ_28			EQU	< 1 SHL 28 >

Equates can not be redefined or double defined. However you can use the assignment operator for this:

Syntax:

	{symbol_name}	=	{value or expression}

Examples:

	x = y + 1
	y = 7
Note

For example, the following code will force Sol_Asm to make 8 passes, until y = 7 and no longer changes its value:

#if $pass == 1
	y = 0
#endif

#if y < 7
	y = y+1
#endif

#echo " y=%x",y
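The count of 8 passes can be reproduced with a small Python simulation. It assumes that symbol values carry over from one pass to the next and that the assembler stops after the first pass in which no symbol changed; this is an inference from the sentence above, not a documented rule. The function names are invented.

```python
def run_pass(pass_number: int, symbols: dict) -> None:
    """Replay the conditional block above for one assembler pass."""
    if pass_number == 1:            # #if $pass == 1
        symbols["y"] = 0
    if symbols["y"] < 7:            # #if y < 7
        symbols["y"] = symbols["y"] + 1


def count_passes() -> int:
    """Repeat passes until a pass leaves every symbol unchanged."""
    symbols = {}
    pass_number = 0
    while True:
        pass_number += 1
        before = dict(symbols)
        run_pass(pass_number, symbols)
        if symbols == before:       # nothing changed: values are stable
            return pass_number


passes = count_passes()
print(passes)                       # 8
```

Seven passes are needed for y to reach 7, and one more pass confirms that its value no longer changes.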

7.2 LABELS

Labels are defined in two modes:

Syntax:

{label_name}:
{label_name}	{data definition keyword}	{data_items}

For example:

	mov	ecx,nr_of_items
	mov	esi,items_ptr
my_loop:
	; perform some actions here
	add	[esi+ITEM.quantity],1
	
	; next item
	add	esi,size ITEM
	dec	ecx
	jnz	my_loop

In the above code sequence "my_loop" is a code label and serves as a target for the JNZ instruction.

or

.data
	my_account_balance	dd	1234_5678h
.code	
	mov	ecx,nr_of_invoices
	mov	esi,invoices_ptr
my_loop:
	; perform some actions here
	mov	eax,[esi+INVOICE.total]
	sub	[my_account_balance],eax
	
	; next invoice
	add	esi,size INVOICE
	dec	ecx
	jnz	my_loop	

In the above code sequence "my_account_balance" is a data label and serves as a parameter for the SUB instruction.

7.2.1 Labels scope

Labels defined outside of a procedure are global in name scope. Global labels can not be double defined.

Labels defined inside PROC ... ENDP construct are local in namespace to the procedure. Hence there can be multiple labels with the exact same name as long as they reside in different procedures.
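A minimal sketch of this rule, using only the PROC syntax described in section 7.4. The procedure names Count_A and Count_B and the argument "count" are hypothetical, and the loops assume that count is greater than zero:

```asm
PROC Count_A stdcall
	ARG	count

	mov	ecx,[count]
next_item:				; local to Count_A
	dec	ecx
	jnz	next_item
	ret
ENDP

PROC Count_B stdcall
	ARG	count

	mov	ecx,[count]
next_item:				; same name, but local to Count_B
	dec	ecx
	jnz	next_item
	ret
ENDP
```

Each JNZ should resolve to the "next_item" of its own procedure. Defining "next_item" a second time outside of any PROC would be a double definition of a global label.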

7.3 STRUCTURES

Structures are defined like this:

Syntax:

STRUC {structure_name}
	{member_name1}	{data_definition_keyword}	{data_item}
	...
	{member_name2}	{data_reserve_keyword}		{count}
	...
ENDS

For example:

	
	STRUC ETH_PACKET
		packet_ptr		dd	?
		packet_id		dd	?
		packet_mac_src		rb	16
		packet_mac_dest		rb	16
	ENDS


	STRUC ETH_DRV
		drv_id			dd	?
		drv_name		rb	128
	
		status			dd	?
		
		packets_buff		rs	ETH_PACKET,1024
	ENDS

As you can see structures can contain other structures. Once a structure is defined it can be used in subsequent data definitions.

Its members can be accessed like this:

	.data
		my_eth	ETH_DRV 	?
	.code
	
	; via pointer
	mov 	esi, my_eth
	mov	eax,[esi + ETH_DRV.status]
	
	; or by direct access
	mov	[my_eth.status], 1
	mov	eax,[my_eth.status]

You can define a LOCAL variable in a PROC as having STRUC type and access it like this:

	PROC my_proc stdcall
		ARG arg1, arg2
		LOCAL my_eth :ETH_DRV, wc :WNDCLASSEX
		
		; note the space between my_eth and :ETH_DRV it is required now
		mov	[my_eth.status],2
		mov	[wc.cbSize], size WNDCLASSEX
		
		; or via pointer
		lea	esi,my_eth
		mov	[esi+ETH_DRV.status],4
		ret
	ENDP

Structure size can be obtained like this:

	add	esi, SIZE ETH_DRV

Also you can obtain the offset of a member inside a structure like this:

	mov	eax, ETH_DRV.status

Note:

Hence this code is also valid:

	mov eax, ETH_DRV

And it will move the size of ETH_DRV structure into eax.

For clarity reasons the use of SIZE is recommended whenever possible.

You can access the members of a nested structure like this:

Example:

.data
	my_driver 	rs	ETH_DRV,16
.code
	mov	esi,my_driver
	mov	eax,[esi + ETH_DRV.packets_buff.packet_id]
	...
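As a further sketch, the same offsets can be combined with SIZE to step through the nested ETH_PACKET array by hand. This assumes that a member offset such as ETH_DRV.packets_buff can be used as an immediate operand of ADD, in the same way it is used with MOV above:

```asm
	mov	esi,my_driver
	add	esi,ETH_DRV.packets_buff	; esi -> first ETH_PACKET of the first ETH_DRV
	add	esi,size ETH_PACKET		; esi -> second ETH_PACKET
	mov	eax,[esi + ETH_PACKET.packet_id]
```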

7.3.1 UNIONS

You can define unnamed UNIONS inside a structure.

Syntax:

	UNION
		{member_name1}	{data_definition_keyword}	{data_item}
		...
		{member_name2}	{data_reserve_keyword}		{count}
	ENDU

For Example:

	struc pixel_format
		flags1   dd   ?

		union
			r_mask   dd   ?
			y_mask   dd   ?
			union
				rx_mask      dd   ?
				ry_mask      dd   ?
			endu      
		endu

		flags2   dd   ?

		union
			g_mask   dd   ?
			u_mask   dd   ?      
		endu
	ends 	

And you can access any UNION member just like any other structure member.
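For instance, a sketch that uses the pixel_format structure above. The variable name my_fmt and the mask value are only for illustration:

```asm
.data
	my_fmt	pixel_format	?
.code
	; r_mask and y_mask are members of the same UNION,
	; so they are expected to share the same storage
	mov	[my_fmt.r_mask],00FF0000h
	mov	eax,[my_fmt.y_mask]
```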

7.4 PROCEDURES

Procedures are defined like this:

Syntax:

PROC {proc_name} {proc_call_convention_type}
	USES	{uses_list}
	ARG	{arg_list}
	LOCAL	{local_list}

	; some code

proc_label:	
	...

	ret
ENDP

For example:


	PROC Test_01 stdcall
		USES	esi,edi
		ARG	wnd_handle, wnd_action
		LOCAL	count, my_var1, my_var2


		mov	esi,[wnd_handle]
		mov	ecx,100
		mov	[count],0		; start the sum from a known value

	loop_here:
		mov	eax,[esi]
		test	eax,eax
		jz	finish

		add	[count],eax

		dec	ecx
		jnz	loop_here

	finish:
		mov	eax,[count]
		ret
	ENDP

Known calling conventions

SOL_ASM will automatically generate PROLOGUE and EPILOGUE code and will generate code for handling of USES, ARG and LOCAL variables as needed.

Known calling conventions are:

Additionally you can use the "varg" statement to mark a procedure that takes a variable number of arguments.

Notes:

For PROC's defined as NOFRAME Sol_Asm will not emit prologue and epilogue code but will emit PUSH/POP code for USES statements if present. In this case you should write the prologue and epilogue code yourself.

Default arguments and locals sizes:

This can be overridden if they have a structure type, like this:

	PROC Test_02 stdcall
		USES	esi,edi
		ARG	wnd_handle, wnd_action
		LOCAL	my_var2 :MCTX, l_point :POINT_3D
		...
		ret
	ENDP

7.4.1 PROC Local buffers

You can define a local procedure buffer like this:

	PROC Test_02 stdcall
		USES	esi,edi
		ARG	wnd_handle, wnd_action
		LOCAL	my_var1,  my_buff [32],  my_var2 :MY_CTX [32]
		...
		ret
	ENDP

This defines a buffer of 32 dwords starting at "my_buff" and a buffer (vector) of 32 * SIZE MY_CTX bytes at "my_var2".

Notes:

For example: in PROC Test_02 above incrementing address from "my_buff" will hit "my_var1" and not "my_var2"
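A minimal sketch of working with such a buffer, here clearing it with REP STOSD. The procedure name Test_03 is hypothetical, and the sketch assumes that a PROC may declare LOCAL without ARG and that LEA accepts a local name in the same form used for structure locals in section 7.3:

```asm
	PROC Test_03 stdcall
		USES	edi
		LOCAL	my_buff [32]

		lea	edi,my_buff		; edi -> start of the 32 dword buffer
		xor	eax,eax
		mov	ecx,32			; number of dwords to clear
		cld				; store upwards from my_buff
		rep stosd
		ret
	ENDP
```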

	PROC Wnd_Proc1 win64
		ARG	hwnd, wmsg, wparam, lparam
		LOCAL	tmp_hdc

		;-------------------------
		; spill is usually needed
		;-------------------------
		mov     [hwnd],rcx
		mov     [wmsg],rdx
		mov     [wparam],r8
		mov     [lparam],r9
		
		...
		
		ret
	ENDP
Notes:

7.5 INVOKE

Procedures or imported functions can be used with INVOKE syntax:

Syntax:

	INVOKE {function_name}, {param1},{param2}, ... {param_N}
	
	; or with dynamic function in register
	mov	rbx,[my_function]
	INVOKE	{abi_name},rbx,{param1},{param2}, ..., {param_N}
Known ABI names:
Notes:

For example:

	invoke	Str_Printf,ods_fname,ods_fname_fmt,[pass_nr]
	invoke	OS_File_Create,ods_fname
	mov	[ods_fhandle],eax

	mov	eax,[My_Dynamic_Proc]	
	invoke	stdcall,eax,ecx,edx
	
	; use ADDR to get the address of a local variable in a PROC
	PROC my_proc stdcall
		ARG 	arg1, arg2
		LOCAL	wc :WNDCLASSEX
		
		mov	[wc.cbSize],size WNDCLASSEX
		...
		invoke	RegisterClassA, ADDR wc
		...
		
		ret
	ENDP	

Sol_Asm handles the calling convention details based on each procedure definition or on the import function hints.

CINVOKE

CINVOKE is a variation of INVOKE that assumes the CDECL convention and does not perform parameter count checking.
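A minimal sketch, assuming CINVOKE takes the same argument list as INVOKE. My_C_Printf and sz_fmt are hypothetical names used only for illustration:

```asm
.data
	sz_fmt	db	"value=%u count=%u",0
.code
	; CDECL: the caller removes the arguments from the stack,
	; and the number of parameters is not checked
	cinvoke	My_C_Printf, sz_fmt, eax, ecx
```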

7.6 MACROS

SOL_ASM contains a MACRO processor that supports nested and recursive macros with VARARG and checked arguments.

7.6.1 Define MACRO

A MACRO is defined like this:

Syntax:

	MACRO {macro_name}
		MARG	{ marg_list [:REQ] [:VARARG] }

		; some code

	@macro_label:
		...

	ENDM

For example:

;---------------------------------------
; define and output a simple string
; note: @ means local symbol for macros
;---------------------------------------
MACRO ODS_str
	MARG	mpar1
	#ifdef SHOW_DEBUG

		jmp	@over1
			@mstring1	db	mpar1,0
		@over1:

		pushad
		invoke	Str_Len,@mstring1
		invoke	OS_File_Write_Dbg,[ods_fhandle],@mstring1,eax
		popad

	#endif
ENDM

And can then be used like this:

ODS_str	<13,10,"-------- Listing Sections -------">
Notes:

Inside a MACRO the "@" prefix means that the symbol is local to this MACRO and will get a different name each time the MACRO is expanded.

Limitations:

7.6.2 VARARG MACRO's

A macro can have a variable number of arguments.

For example:

;---------------------------------------
; define and output a formatted string
; note: @ means local symbol for macros
;---------------------------------------
MACRO ODS_fmt
	MARG	mfmt, arg_list :VARARG

	jmp	@over1	
		@mstring1	db	mfmt,0
	@over1:

	pushad
	invoke	Str_Printf, sz_buff1, @mstring1, arg_list
	invoke	OS_File_Write_Dbg, [ods_fhandle], sz_buff1, eax
	popad

ENDM

And can then be used like this:

ODS_fmt	<13,10,"Section:%u RVA=%x VSIZE=%x Name=%s">,ecx,[esi+PE_SECT.rva],[esi+PE_SECT.vsize],esi

7.6.3 MACROS with :REQ

The ":REQ" MARG type can be used to force MACRO parameter number check up to a specific argument position.

For example:

MACRO MTEST
	MARG	a1,a2,a3,a4 :req , a5

	mov	eax,a1
	mov	ebx,a2
	mov	ecx,a3
	mov	edx,a4

ENDM

On macro invocation this will check for 4 macro arguments. And because of this "a5" can be missing but "a4" can not.
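For instance, with the MTEST macro above, the following invocations would be expected to behave as commented (a sketch; the exact error message is not reproduced here):

```asm
	MTEST	1, 2, 3, 4		; OK: a1..a4 are present, a5 is omitted
	MTEST	1, 2, 3, 4, 5		; OK: a5 is accepted but not used by the body
	MTEST	1, 2, 3			; expected to be rejected: a4 is required
```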

7.6.4 Nested MACROS

You can define a macro inside another macro... and so on.

For example:

MACRO M2 
	MARG arg1,arg2

	mov	eax,arg1

	MACRO M3 
		MARG arg3,arg4
		mov	eax,arg3
		push	arg4
	ENDM
	
	push	eax
	push	arg2

ENDM

On first invocation of M2 only its body will be generated and M3 will be defined but not expanded.
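A short usage sketch for the macros above. The argument values are arbitrary:

```asm
	M2	1, ebx		; generates the body of M2 and defines M3
	M3	2, ecx		; M3 can be invoked from this point on

	; expected code:
	;	mov	eax,1		(from M2)
	;	push	eax
	;	push	ebx
	;	mov	eax,2		(from M3)
	;	push	ecx
```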

7.6.5 Using "&" in MACROS

In MACRO body the "&" character will trigger a MARG check and expansion even if found in the middle of another token or string.

For example

MACRO M4
	MARG arg1, arg2

in_label_&arg1:
	mov	eax,<&arg1>
	db	" In strings: &arg2",0

ENDM
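A sketch of how an invocation of M4 would be expected to expand. The arguments 7 and hello are arbitrary, and the expansion is shown as comments because it is inferred from the description of "&" and is not actual Sol_Asm output:

```asm
	M4	7, hello

	; expected to expand roughly to:
	;
	; in_label_7:
	;	mov	eax,<7>
	;	db	" In strings: hello",0
```

Note that "in_label_&arg1" does not use the "@" prefix, so invoking M4 twice with the same first argument would define the same label twice.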

7.6.6 Recursive Macros

A macro can invoke itself recursively.

For example:

MACRO MPUSH
	MARG	p1,p2,p3,p4

	#ifnb <&p1>
		push	p1
		MPUSH	p2,p3,p4
	#endif
ENDM 
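A usage sketch for MPUSH. The recursion stops when the first argument becomes blank, so an invocation with three arguments would be expected to expand as shown in the comments:

```asm
	MPUSH	eax, ebx, ecx

	; expected expansion, one push per non-blank argument:
	;	push	eax
	;	push	ebx
	;	push	ecx
```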
Notes:

7.6.7 Using EXITM in macros

EXITM can be used to return a token from a MACRO expansion.

For example

MACRO RV
	MARG func, params

	invoke func,params

	; return something from macro
	exitm eax
ENDM

; later on in code
	...
	mov	ecx,RV GetModuleHandle
	invoke	ExitProcess, < RV GetModuleHandleA >
	push	RV GetModuleHandleA
	...

Note:

7.6.8 The REPT Macro

You can use REPT to repeat a series of instructions.

For example

	x = 7

	REPT 12
		shl	eax,x
		add	ecx,3
		x = x+1
	ENDM
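A smaller sketch of the same idea, with the expected result written out by hand in the comments. It relies on the assignment operator being re-evaluated on each repetition:

```asm
	x = 7

	REPT 3
		shl	eax,x
		x = x+1
	ENDM

	; expected to assemble the same as:
	;	shl	eax,7
	;	shl	eax,8
	;	shl	eax,9
```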

7.6.9 The FOR Macro

You can use FOR to repeat a series of instructions for each item in a list.

Syntax:

	FOR {item} IN: {items list} {REV} DO
		{ for macro body }
	ENDM

Sol_Asm will expand the {for macro body} for each element in {items list} and will replace any occurrence of {item} in the {macro body} with current {items list} element.

The "REV" keyword is optional. If present, the {items list} is parsed in reverse order.

FOR can be used to iterate the variable parameters of a MACRO.

For example

MACRO my_invoke
	MARG func :req, params :vararg

	FOR item IN: params REV  DO
		push   item
	ENDM

	call	func
ENDM 

The above sample will define your own INVOKE like macro and you can later on use it like this:

	my_invoke	My_Func,eax,0,1,"123",[ecx]
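Because of the REV keyword the parameters are pushed from last to first, so the invocation above would be expected to expand to the sequence below. This is a hand-written sketch, not actual Sol_Asm output:

```asm
	; my_invoke My_Func,eax,0,1,"123",[ecx]
	;
	;	push	[ecx]
	;	push	"123"
	;	push	1
	;	push	0
	;	push	eax
	;	call	My_Func
```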

7.7 Conditional Assembly

You can conditionally eliminate a block of source code at compile time by using the following directives:

Directive               Description
#ifdef {symbol}         if symbol is defined
#ifndef {symbol}        if symbol is not defined
#ifb {token}            if token is blank
#ifnb {token}           if token is not blank
#if_used {symbol}       if symbol is used in code
#if_not_used {symbol}   if symbol is not used in code
#if {condition}         if condition is true

Syntax:


	#ifdef {symbol_name}

		; code block for true
		....
	#else
		; code block for false

	#endif

For example:


	;------------------------------------------------
	; this checks the command line /binary option
	;------------------------------------------------
	#ifdef /binary
		org	0B000h
		disp	0B000h
	#endif
	
	#if $ >= 512
		#echo "boot sector address overflow: %x", $
	#endif

Observe how command line options are automatically promoted to EQU symbols and can be tested with #ifdef.

#ifdef can be nested on multiple levels so the following example is valid also.

	#IFDEF INTEL
		mov	eax,1
		#IFDEF WIN32
			mov	esi,32h
			#ifdef	LUCKY
				mov	ecx,33h
			#else
				mov	ecx,11h
			#endif
		#ELSE
			mov	esi,16h
		#ENDIF
		mov	edi,88h
	#ELSE
		mov	eax,2

		#IFDEF WIN32
			mov	esi,32h
			#ifdef	LUCKY
				mov	ecx,33h
			#else
				mov	ecx,11h
			#endif
		#ELSE
			mov	esi,16h
		#ENDIF	
		mov	edi,77h
	#ENDIF
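The blank tests are mostly useful inside macros. A minimal sketch, assuming that #ifb accepts the same "<&arg>" form that #ifnb uses in the macro chapter and that it can be combined with #else. The macro name MZERO is hypothetical:

```asm
MACRO MZERO
	MARG	reg

	#ifb <&reg>
		xor	eax,eax		; no argument given: clear eax by default
	#else
		xor	reg,reg		; clear the register that was passed in
	#endif
ENDM

	MZERO	ecx		; expected: xor ecx,ecx
	MZERO			; expected: xor eax,eax
```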

7.8 Runtime High Level .IF and friends

You can use runtime high level .IF .ELSEIF .ELSE .ENDIF constructs in SOL_ASM.

Sol_Asm will generate the needed compare and jump code and the labels internally. This internal code generation is performed much faster than an equivalent MACRO expansion.

Syntax:


	.IF {operand1} {condition_a} {operand2}

		; code block for {condition_a} true
		....

	.ELSEIF  {operand3} {condition_b} {operand4}

		; code block for {condition_b} true
		...

	.ELSE
		; code block for all above conditions false
		...

	.ENDIF 

For example:

	.if [parse_mode] == 1
		.if [parse_status] == 1
			mov	ecx,1
		.elseif [parse_status] == 2
			mov	ecx,2
		.elseif [parse_status] <= 7
			mov	ecx,7
		.else
			mov	ecx,-1
		.endif
	.elseif [parse_mode] == 2
		mov	edx,2
	.elseif eax == swap "xor "
		mov	edx,7
	.else
		mov	edx,-1
	.endif	

Known condition operators are:

Operator       Description          Flag checked
"=="           equal                ZF = 1
"!="           not equal            ZF = 0
"<"            unsigned smaller     CF = 1
">"            unsigned greater     (NBE)
"<="           smaller or equal     (BE)
">="           greater or equal     (NC)
"zero?"        Z flag               (Z)
"!zero?"       not Z                (NZ)
"carry?"       Carry                (C)
"!carry?"      not Carry            (NC)
"sign?"        S flag               SF
"!sign?"       not signed           SF
"Overflow?"    OF = 1               OF
"!Overflow?"   OF = 0               OF
"parity?"      P = 1                PF
"!parity?"     P = 0                PF

Note:
Limitations:

7.8.1 Using multiple conditions

You can use multiple conditions in .IF like this:

Example

	.if ( eax == 1 .or. ecx == 2 ) .and. esi != 7
		...
	.elseif dl == "a" .or. dl == "b" .or. dl == "s"
		... 
	.endif
Note:

7.8.2 Using signed conditions

By default all comparisons in a .IF are unsigned.
You can use signed conditions in .IF by prefixing the condition with the signed keyword, like this:

Example

.if signed edx >= [edi + HTML_CTX.wnd_dy]
	; flag done		
	mov	eax,1
	ret
.endif
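The difference matters for values with the top bit set. A small sketch for 32 bit code, with arbitrary values:

```asm
	mov	eax,-1			; 0FFFFFFFFh

	.if eax > 1			; unsigned compare: 0FFFFFFFFh > 1 is true
		mov	ecx,1
	.endif

	.if signed eax > 1		; signed compare: -1 > 1 is false
		mov	ecx,2
	.endif
```

After this sequence ecx is expected to hold 1, because only the unsigned block is executed.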

7.8.3 .REPEAT .UNTIL

You can use high level .REPEAT ... .UNTIL constructs. Sol_Asm will generate the needed code.

Syntax

	.REPEAT 
		{repeat body}
	.UNTIL {condition}	
Notes

Example

	mov	ecx,17
	.repeat
		mov	edx,0
		.repeat 
			inc	edx
		.until edx > 7

		dec	ecx
	.until ecx == 0
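For comparison, a hand-written sketch that behaves like the example above. The label names are hypothetical and the code actually generated by Sol_Asm may differ:

```asm
	mov	ecx,17
outer_top:
	mov	edx,0
inner_top:
	inc	edx
	cmp	edx,7
	jbe	inner_top		; repeat until edx > 7 (unsigned)

	dec	ecx
	cmp	ecx,0
	jne	outer_top		; repeat until ecx == 0
```

The condition is tested at the end, so the body of a .REPEAT always runs at least once.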

7.8.4 .WHILE .ENDW

You can use high level .WHILE ... .ENDW constructs. Sol_Asm will generate the needed code.

Syntax

	.WHILE {condition} 
		{while body}
	.ENDW	
Notes

Example

	mov	ecx,17
	.while ecx > 1
	
		mov	edx,0
		.while edx < 7 
			inc	edx
		.endw
		
		dec	ecx
	.endw
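For comparison, a hand-written sketch that behaves like the example above. The label names are hypothetical and the code actually generated by Sol_Asm may differ:

```asm
	mov	ecx,17
outer_top:
	cmp	ecx,1
	jbe	outer_end		; leave when "ecx > 1" is false (unsigned)

	mov	edx,0
inner_top:
	cmp	edx,7
	jae	inner_end		; leave when "edx < 7" is false (unsigned)
	inc	edx
	jmp	inner_top
inner_end:

	dec	ecx
	jmp	outer_top
outer_end:
```

Here the condition is tested before each iteration, so the body may not run at all.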

7.9 ENUMS

ENUM is a kind of auto generated EQU sequence. Sol_Asm will auto increment the values and will check for limits.

You can define ENUMS like this:

Syntax

	ENUM {enum_name},{start_value},{max_value}
		{enum name items}
	ENDE	

Example

ENUM Modes,77h,0ffh
	MODE_1
	MODE_2
	MODE_3
	MODE_21
ENDE

Sol_ASM will generate: MODE_1 EQU 77h , MODE_2 EQU 78h ... and so on for each ENUM item in sequence and will check for limits.
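Written by hand, the ENUM above corresponds to the following equates. This is only a sketch: the limit check against the max value has no direct EQU equivalent.

```asm
	MODE_1		EQU	77h
	MODE_2		EQU	78h
	MODE_3		EQU	79h
	MODE_21		EQU	7Ah
```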

Note:

7.10 DEFINE text equates

DEFINE creates symbolic constants for text or strings. It behaves like a kind of EQU for strings and tokens.

This allows you to:

Syntax

	DEFINE {symbolic_name},{text}

An alternative name for DEFINE is TEQU

Example

	define	text1	"planet earth"
	define	text2	< swap "ecx" >
	define	text3	ebx
	define	text4	[esi+4]
	define	text5	STRCUT has_ebx_inside,5,3

	define	and	xor

	...

.data
	my_string	db	text1	; in fact "planet earth"
.code

	mov	eax,text2	; in fact mov eax,swap "ecx"
	mov	eax,text3	; in fact mov eax,ebx
	mov	eax,text4	; in fact mov eax,[esi+4]

	mov	eax,text5	; in fact mov eax,ebx		

	and	eax,eax		; in fact XOR eax,eax			
Notes

Textequ Types

Defined text equates have some subtle types attached:

7.11 STRING Functions

String functions allow you to operate on strings in text equates.

The following functions are available:

Function   Description                          Notes
STRCUT     Extract a sub string from a string
STRADD     Add two strings
STRLEN     Obtain length of string              the result is a numeric token

7.11.1 STRCUT

STRCUT will extract a sub string from a source string

Syntax

	STRCUT {source},{start_pos},{length}

Example

 define	ebx1	STRCUT has_ebx_inside,5,3	; ebx		type token 
 define	ebx2	STRCUT "has_ebx_inside",6,3	; "ebx"		type string
 define	ebx3	STRCUT [ebx+ecx],1,3		; [ebx]		type ModRM

The result of STRCUT has the same type as the source

7.11.2 STRADD

STRADD will add two strings together.

Syntax

	STRADD {string1},{string2}

Example

 define	txt1	STRADD "planet"," earth"	; "planet earth" 	string 
 define	txt2	STRADD in,voke			; invoke		token
 define	txt3	STRADD [ebx],[+ecx]		; [ebx+ecx]		ModRM

The result of STRADD has the same type as string2

7.11.3 STRLEN

STRLEN will return the length of a string.

Syntax

	STRLEN {string1}

Example

 len1	equ	STRLEN "planet"	
 len2	equ	STRLEN invoke
 len3	equ	STRLEN [ebx+ecx]

 len4	equ	STRLEN STRADD "planet"," earth"	
 define	txt1	STRCUT "has_ebx_inside",6, STRLEN "ebx"
Notes

7.12 #ECHO

The #ECHO directive allows you to emit formatted message text at compile time. This can be used to debug macros or to inform the user about compile stages.

Syntax

	#ECHO {format string},{arg1},{arg2},... 

Example

MY_EQU equ 1234
define my_str " this is a string message"

.code
 ...

 #echo "\n code end=%x section base=%x, my_equ=%u string=%s",$,$$$,MY_EQU,my_str 

Notes

As a format specifier you can use one of the following:

Format   Description
%x       hexadecimal number
%u       unsigned decimal number
%d       signed decimal number
%s       an ASCII null terminated string
\n       new line (CR+LF)
\t       TAB
%%       the "%" ASCII char itself
\\       the "\" ASCII char itself

7.13 OPTION

The OPTION directive is used to set optional compiler behaviour.

Syntax

	OPTION {option_type}, [ {option_value} ] 

The following options are available:

Option                 Description
list_on                activates listing output
list_off               deactivates listing output
proc_align { value }   sets the alignment for PROC (default is 16 bytes)
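A minimal sketch of how these options might be used. The data label big_table is hypothetical, and the comma form of proc_align is assumed from the syntax line above:

```asm
	option	list_off		; stop listing output here

	big_table	rb	4096	; expected to be left out of the listing

	option	list_on			; resume listing output

	option	proc_align, 32		; assumed form: align each PROC to 32 bytes
```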

7.14 #LOAD

This directive allows you to read a value from compiled code or data at compile time.

Syntax

	#LOAD {equ_name}, [byte/word/dword/qword] {address}  

For Example:

	my_db	db	1
	
	#load	x,byte my_db
	#echo " x=%x",x
Notes

7.15 #STORE

This directive allows you to write a value to compiled code or data at compile time.

Syntax

	#STORE {address}, [byte/word/dword/qword] {value}  

For Example:

	my_db	db	1
	
	#store	my_db, byte 55h
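The two directives can be combined to check a patch at compile time. A sketch, assuming that #STORE is applied before a later #LOAD of the same address within one pass:

```asm
	my_db	db	1

	#store	my_db, byte 55h		; patch the assembled byte
	#load	x, byte my_db		; read it back into x
	#echo " x=%x",x			; expected to report 55 if the store was applied first
```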
Notes

Chapter 8. Resource compiler

Sol_Asm does contain a mini resource compiler.

It can parse some RC script elements and can generate "in memory" templates for them.

In resource scripts Sol_ASM does support C style hexadecimal constants.

8.1 Resource ID's

You can define a resource ID like this:

Syntax:

	#define		{ID_name} {ID_value}

For Example:

	#define IDD_DLG1 1000
	#define IDC_BTN1 1001
	#define IDC_EDT1 1002
	#define IDC_BTN2 1003
Note:

8.2 Dialogs

You can define a DIALOG like this:

Syntax:

	{dialog_id} 	DIALOGEX {dlg_x},{dlg_y},{dlg_dx},{dlg_dy}
	CAPTION		{caption string}
	STYLE		{style value}
	BEGIN
		{ control definitions }
	END

You can define a CONTROL like this:

Syntax:

	CONTROL {caption},{id},{"class"},{flags},{x},{y},{dx},{dy},{flags_ex}

For Example:

#define IDD_DLG1 1000

#define IDC_BTN1 1001
#define IDC_EDT1 1002
#define IDC_BTN2 1003
#define IDC_STC1 1004

IDD_DLG1 	DIALOGEX 	57,7,258,158
CAPTION 	"Sol_Asm Dialog 01"
STYLE		0x10CF0000

BEGIN
 CONTROL "Save",	IDC_BTN1,"Button",	0x50010000,	134,114,50,13,	0x00000000
 CONTROL "Exit",	IDC_BTN2,"Button",	0x50010000,	196,112,42,15,	0x00000000
 CONTROL "Name",	IDC_STC1,"Static",	0x50000000,	12,24,22,8,	0x00000000
 CONTROL "Text Edit",	IDC_EDT1,"Edit",	0x50010000,	50,22,134,11,	0x00000200
END

8.3 MENUS

You can define a MENU like this:

Syntax:

	{menu_id} 	MENUEX 
	BEGIN
		POPUP {"text"},{id}

		BEGIN
			MENUITEM {"text"},{id}
		END
	END

For Example:

SEPARATOR EQU 0

#define IDR_MENU 	10000
#define IDM_File 	10001
#define IDM_File_Open 	10004
#define IDM_File_New 	10005
#define IDM_File_Exit 	10009
#define IDM_Edit	10002
#define IDM_Edit_Cut	10006
#define IDM_Edit_Copy	10007
#define IDM_Edit_Paste	10008

IDR_MENU MENUEX
BEGIN
	POPUP "File",IDM_File

	BEGIN
		MENUITEM "Open",IDM_File_Open
		MENUITEM "New",IDM_File_New
		MENUITEM SEPARATOR
		MENUITEM "Exit",IDM_File_Exit
	END

	POPUP "Edit",IDM_Edit
	BEGIN
		MENUITEM "Cut",IDM_Edit_Cut
		MENUITEM "Copy",IDM_Edit_Copy
		MENUITEM "Paste",IDM_Edit_Paste
	END	
END

8.4 Emit Compiled Resources

You can emit a compiled resource as a data item like this:

Syntax:

	EMIT_RSRC {resource_id}

For Example:

align 32

my_dialog:
	EMIT_RSRC IDD_DLG1

align 32

my_menu:

	EMIT_RSRC IDR_MENU	

and in your code you can write:

	...
   	invoke	DialogBoxIndirectParamA,[hInstance],my_dialog,0,Dlg_Proc,0

	...
	invoke	LoadMenuIndirectA,my_menu
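The Dlg_Proc referenced above is not shown in this chapter. A minimal sketch of such a dialog procedure, assuming that EndDialog has been imported from user32.dll and using the standard Win32 value of WM_CLOSE (10h):

```asm
WM_CLOSE	EQU	10h

PROC Dlg_Proc stdcall
	ARG	hwnd, msg, wparam, lparam

	.if [msg] == WM_CLOSE
		invoke	EndDialog,[hwnd],0
		mov	eax,1			; message was handled
		ret
	.endif

	xor	eax,eax				; message was not handled
	ret
ENDP
```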

Chapter 9. Listing

Sol_Asm can produce a listing file when the "-list" command line option is used.

Listing columns format:

{include_level} {macro_level} {flag} {address} {program text} {opcodes}

Include Level column

Shows the depth of include file nesting.

Macro level column

Shows the depth of macro expansion nesting

Flag column

It is an internal flag to Sol_Asm and changes often for debugging. Currently it shows if there is a need for a new pass to solve a symbol.

Address column

Shows the address for current line being assembled. For OBJ formats it shows the offset in section since the final address will be setup by the linker.

Program text column

Shows the program source text.

This includes:

Opcode Column

It shows the CPU opcodes or data generated by Sol_Asm for each source line as a series of hexadecimal bytes.

Opcode column is aligned to column 128 if possible and expands up to column 224.

If more opcodes are needed then a new row is generated. If more than 4 rows are needed then an ellipsis "..." is shown and further opcodes are not shown anymore.

Limitations:

Listing Example:

1 0 0 00401047	
1 0 0 00401047		;--------------------------
1 0 0 00401047		; make up a build date 
1 0 0 00401047		;--------------------------
1 0 0 00401047		mov	esi,build_time				BE 3C A4 42 00 
1 0 0 0040104C		
1 0 0 0040104C		xor	eax,eax					33 C0 
1 0 0 0040104E		xor	edx,edx					33 D2 
1 0 0 00401050		xor	ecx,ecx					33 C9 
1 0 0 00401052		
1 0 0 00401052		mov	ax,[esi + OS_TIME.year]			66 8B 46 00 
1 0 0 00401056		mov	cx,[esi + OS_TIME.month]		66 8B 4E 02 
1 0 0 0040105A		mov	dx,[esi + OS_TIME.day_of_month]		66 8B 56 06 
1 0 0 0040105E	
1 0 0 0040105E		invoke	Str_Printf,sz_tmp1,sz_fmt_bld1,eax,ecx,edx
1 0 0 0040105E	push edx  						52 
1 0 0 0040105F	push ecx  						51 
1 0 0 00401060	push eax  						50 
1 0 0 00401061	push sz_fmt_bld1  					68 69 A0 42 00 
1 0 0 00401066	push sz_tmp1  						68 00 16 43 00 
1 0 0 0040106B	call Str_Printf  					E8 B0 07 00 00 
1 0 0 00401070	add esp, 00000014h  					83 C4 14 

Appendix.1 Other issues

A1.1 Namespaces

Sol_Asm does use separated NAMESPACES for:

Because of this you can have a PROC with the same name as a STRUC but not two PROC's or two STRUC's with the same name.

However for now this is not under the control of the programmer, and hence it is advised to avoid such coding practice because you cannot control the order in which Sol_Asm searches the separate namespaces.

It is my intention to provide a mechanism for controlling and defining namespaces to the user.

A1.2 System requirements

CPU

Sol_Asm requires a 386 CPU as a minimum and benefits from newer, more advanced CPUs.

Memory

Sol_Asm preallocates approximately 24 megabytes at startup.

Each section gets 1M at define time and that is eventually reallocated when needed.

Additional memory is allocated when needed for files, imports, macros etc.

OS

Sol_Asm was tested on WinXP, Solar OS and WinXP64, but it should also work on Win95, Win98, Win2k, Win2003 and Vista.

Starting from version 14.02 Sol_Asm also runs on Linux and on UNIX like OSes that can link Sol_Asm OBJ against a limited set of LIBC functions.

A version for Mac OS X is also available in ELF OBJ format. You can use Agner Fog's OBJCONV program to convert it to MACH-O and link to LIBC to obtain the executable on your Mac.

A1.3 Speed testing

Speed testing was performed on two big projects: Sol_Asm itself and Solar_OS.

Synthetic testing was performed on files with 10.000 or 100k PROC's

For Example:

Solar Assembler version 0.10.01
Copyright (C) 2004-2008 Bogdan Valentin Ontanu, All rights reserved.
Build on 2008_2_23  at 7:14:23

Assembling file: sol_asm2.asm
Assembler  pass: 1
Assembler  pass: 2
Assembler  pass: 3
Assembler  pass: 4
Assembler lines: 67866
Output    bytes: 192512
Assembler  time: 406 ms
---------------------------

4 passes x 67.866 lines = 271.464 lines in 406 ms --> 668.630 lines per second

For Example:

Solar Assembler version 0.10.01
Copyright (C) 2004-2008 Bogdan Valentin Ontanu, All rights reserved.
Build on 2008_2_23  at 7:14:23

Assembling file: system_32.asm
Assembler  pass: 1
Assembler  pass: 2
Assembler  pass: 3
Assembler lines: 111403
Output    bytes: 534016
Assembler  time: 578 ms
---------------------------

3 passes x 111.403 lines = 334.209 lines in 578 ms --> 578.216 lines per second

These are real projects with many PROCs, STRUCs, MACROs and code.

Testing was performed on a laptop with an Intel Core 2 Duo CPU at 2 GHz and 1 GB of RAM, running WinXP 32.

Appendix.2 Known keywords

Warning:

Registers:

	

8 bit registers
-------------------------------              
"al"    "r8l"       "spl"
"cl"    "r9l"       "bpl"
"dl"    "r10l"      "sil"
"bl"    "r11l"      "dil"
"ah"    "r12l"
"ch"    "r13l"
"dh"    "r14l"
"bh"    "r15l"
              
16 bits registers
-------------------------------
"ax"    "r8w"      "es" 
"cx"    "r9w"      "cs" 
"dx"    "r10w"     "ss" 
"bx"    "r11w"     "ds" 
"sp"    "r12w"     "fs" 
"bp"    "r13w"     "gs" 
"si"    "r14w" 
"di"    "r15w" 
       
32 bits registers
-------------------------------
"eax"     "r8d"  
"ecx"     "r9d"  
"edx"     "r10d" 
"ebx"     "r11d" 
"esp"     "r12d" 
"ebp"     "r13d" 
"esi"     "r14d" 
"edi"     "r15d" 
       
64 bits registers
-------------------------------
"rax"     "r0"      "r8" 
"rcx"     "r1"      "r9" 
"rdx"     "r2"      "r10"
"rbx"     "r3"      "r11"
"rsp"     "r4"      "r12"
"rbp"     "r5"      "r13"
"rsi"     "r6"      "r14"
"rdi"     "r7"      "r15"
       

MMX registers       
--------------------------------
"mm0"  
"mm1"  
"mm2"  
"mm3"  
"mm4"  
"mm5"  
"mm6"  
"mm7"  
       
FPU registers       
--------------------------------
"st0"  
"st1"  
"st2"  
"st3"  
"st4"  
"st5"  
"st6"  
"st7"  
       
XMM registers       
--------------------------------
"xmm0" 
"xmm1" 
"xmm2" 
"xmm3" 
"xmm4" 
"xmm5" 
"xmm6" 
"xmm7" 
       
"xmm8" 
"xmm9" 
"xmm10"
"xmm11"
"xmm12"
"xmm13"
"xmm14"
"xmm15"

Instructions and directives


0	mov					 
1	lea					 
2	movzx					 
3	movsx					
4	bswap					 
5	xchg					 
6	xor					 
7	cmp					 
8	add					 
9	sub					 
10	or					 
11	and					 
12	sbb					 
13	adc					 
14	shl					 
15	shr					 
16	sar					 
17	rol					 
18	ror					 
19	rcl					 
20	rcr					 
21	sal					 
22	shld					 
23	shrd					 
24	test					 
25	not					 
26	neg					 
27	inc					 
28	dec					 
29	div					 
30	idiv					 
31	mul					 
32	imul					 
33	call					 
34	jmp					 
35	loop					 
36	ret					 
37	retn					 
38	int					 
39	int3					 
40	into					 
41	iret					 
42	iretd					 
43	hlt					 
44	leave					 
45	push					 
46	pushad					 
47	pusha					 
48	pushfd					 
49	pushf					 
50	pop					 
51	popad					 
52	popa					 
53	popfd					 
54	popf					 
55	jo					 
56	jno					 
57	jc					 
58	jnc					 
59	jb					 
60	jnb					 
61	jnae					 
62	jae					 
63	jz					 
64	jnz					 
65	je					 
66	jne					 
67	jbe					 
68	jnbe					 
69	jna					 
70	ja					 
71	js					 
72	jns					 
73	jpe					 
74	jpo					 
75	jl					 
76	jnl					 
77	jnge					 
78	jge					 
79	jle					 
80	jnle					 
81	jng					 
82	jg					 
83	rep					 
84	movsb					 
85	movsd					 
86	movsw					 
87	stosb					 
88	stosd					 
89	stosw					 
90	lodsb					 
91	lodsd					 
92	lodsw					 
93	scasb					 
94	scasd					 
95	nop					 
96	clc					 
97	stc					 
98	daa					 
99	das					 
100	cbw					 
101	cdq					 
102	cld					 
103	cmc					 
104	aaa					 
105	aas					 
106	lahf					 
107	lock					 
108	cpuid					 
109	rdtsc					 
110	aad					 
111	aam					 
112	out					 
113	in					 
114	finit					 
115	fninit					 
116	fld					 
117	fild					 
118	fst					 
119	fstp					 
120	fistp					 
121	fadd					 
122	faddp					 
123	fiadd					 
124	fsub					 
125	fisub					 
126	fdiv					 
127	fdivrp					 
128	fmul					 
129	fmulp					 
130	fimul					 
131	fxch					 
132	fucompp					 
133	fclex					 
134	fnclex					 
135	fnop					 
136	fchs					 
137	fabs					 
138	ftst					 
139	fxam					 
140	fld1					 
141	fldl2t					 
142	fldl2e					 
143	fldpi					 
144	fldlg2					 
145	fldln2					 
146	fldz					 
147	f2xm1					 
148	fyl2x					 
149	fptan					 
150	fpatan					 
151	fxtract					 
152	fprem1					 
153	fdecstp					 
154	fincstp					 
155	fprem					 
156	fyl2xp1					 
157	fsqrt					 
158	fsincos					 
159	frndint					 
160	fscale					 
161	fsin					 
162	fcos					 
163	emms					 
164	sidt					 
165	lidt					 
166	lgdt					 
167	sgdt					 
168	cli					 
169	sti					 
170	wbinvd					 
171	xlat					 
172	db					 
173	dw					 
174	dd					 
175	dq					 
176	dt					 
177	do					 
178	real4					 
179	real8					 
180	real10					 
181	rb					 
182	rw					 
183	rd					 
184	rq					 
185	rt					 
186	ro					 
187	rs					 
188	equ					 
189	align					 
190	proc					
191	uses					 
192	arg					 
193	local					 
194	endp					 
195	.if					 
196	.elseif					 
197	.else					 
198	.endif					 
199	#ifdef					 
200	#ifndef					 
201	#else					 
202	#endif					
203	#ifnb					 
204	#ifb					 
205	#if_used				 
206	#if_not_used				
207	macro					 
208	endm					 
209	exitm					 
210	rept					 
211	invoke					 
212	cinvoke					 
213	cdecl					 
214	stdcall					 
215	include					 
216	incbin					 
217	incfrom					 
218	import_dll				 
219	from_dll				 
220	import_lib				 
221	from_lib				 
222	import_func				 
223	import					 
224	extern					 
225	export					 
226	alias					 
227	struc					 
228	struct					 
229	ends					 
230	enum					 
231	ende					 
232	.entry					 
233	org					 
234	disp					 
235	.use16					 
236	.use32					 
237	.use64					 
238	section					 
239	class_code				 
240	class_data				 
241	class_imports				 
242	class_relocs				 
243	class_bss				 
244	class_exports				 
245	class_rsrc				 
246	#define					 
247	begin					 
248	end					 
249	dialogex				 
250	caption					 
251	style					 
252	control					 
253	menuex					 
254	popup					 
255	menuitem				 
256	emit_rsrc				 
257	.echo					 
258	$time					 

Appendix.3 Sample programs

A win32 sample application

;------------------------------------------------------
; Sol_Asm assembler sample
; Copyright (c) 2004-2008, Bogdan Valentin Ontanu
; All rights reserved.
;------------------------------------------------------

;----------------------------
; define imports
;----------------------------
from_dll 	kernel32.dll
	import	ExitProcess
	import	GetStdHandle

from_dll	user32.dll
	import	MessageBox alias MessageBoxA

;-------------------
; define sections
;-------------------
section "code" 		class_code
section "data"  	class_data
section "idata" 	class_imports


.data
	sz_message	db	"First Win32 PE application",0
	sz_title	db	"Sol_ASM",0

.code
	;------------------------
	; define entry point
	;------------------------
	.entry Start

Start:
	;-----------------------------
	; the classical message box
	;-----------------------------
	invoke	MessageBox, 0, sz_message, sz_title, 3

	;--------------------------
	; done here, exit nicely
	;--------------------------
	invoke	ExitProcess,0
	ret

Assuming the file is named test_win32.asm and Sol_Asm is in your path, you can build this sample with the following command:

	sol_asm2  test_win32.asm test_win32.exe -pe32

The resulting executable should display a message box when run.