PureBasic Survival Guide V - Advanced PureBasic
 PureBasic Survival Guide a tutorial for using purebasic for windows 5.20b2 Part V - Advanced PureBasic v6.13 23.06.2013

5.1 Expression evaluation

Type conversion

(It looks like 4.50rc2 finally fixed the broken type conversion rules! Yippie!)

PureBasic does things differently. Sometimes very differently. With the arrival of v4.00 I tried one more time to understand it, and this is what came out...

1. expressions are evaluated from left to right
2. if a part of the expression contains an operator with a higher priority, that part is evaluated before continuing with the next part
3. the type used depends on the variables, functions and keywords used
4. some keywords / operators will change the way an expression is evaluated
5. types 'change upwards' in this sequence: long > quad > float > double
6. types never change 'downwards'
7. the types bytes and words are always converted to longs
8. PureBasic uses bankers' rounding when converting floating point to integer
9. dividing an integer by another integer does not change the type, ie. is rounded down

The rules above apply to all expressions, here are some samples...

Code:

a.l = <exp>
The destination variable a.l causes <exp> to be evaluated as a long UNTIL something in the expression forces a change in type.

Code:

a.l = 10*22+PeekQ(...)
First, a.l means a long, 10*22 would be processed as a long, PeekQ() would turn the type to a quad, finally the result is converted back to long to fit in a.l.

Code:

a.l = 10*22+Cos(...)
Again, a.l means the type is set to long, 10*22 would be processed as a long, Cos() would turn the type to a float, finally the result is converted back to long to fit in a.l.

Bankers' rounding

Frankly, I'm not sure anymore if PureBasic uses bankers' rounding or not. My brain simply fails... just keep in mind that real world floating point numbers cannot be exactly represented by their binary counterpart inside the CPU (or the part inside the CPU called FPU that does the actual calculations) so perhaps it doesn't matter that much...

But, as of 4.51, there's still an inconsistency... It looks like the FPU applies bankers' rounding at runtime, whilst the compiler applies 'round half up' to numeric (non-variable) expressions at compile time...

zero.f = 0
onehalf.f = 1.5
three.f = 3
four.f = 4
;
i.l = 1.5 + 3.0
Debug i                     ; compiler rounds up, result is 5
;
j.l = 1.5 + 4.0
Debug j                     ; compiler rounds up, result is 6
;
k.l = onehalf + three
Debug k                     ; fpu rounds down (bankers rounding), result  is 4
;
l.l = onehalf + four
Debug l                     ; fpu rounds up (bankers rounding), result is 6
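
If the rounding mode matters to you, one way around the ambiguity is to make it explicit instead of relying on the implicit float-to-long conversion. The following is only a minimal sketch using the Int() and Round() commands; the variable name and the value 4.5 are just for illustration:

```purebasic
; sketch: explicit rounding instead of implicit float-to-long conversion
x.f = 4.5                            ; illustration value, exactly representable as a float
;
Debug Int(x)                         ; keeps the integer part only
Debug Round(x, #PB_Round_Down)       ; rounds down to the next lower whole number
Debug Round(x, #PB_Round_Up)         ; rounds up to the next higher whole number
Debug Round(x, #PB_Round_Nearest)    ; rounds to the nearest whole number
```

Round() returns a float, so the result still has to be stored in an integer variable if that is what you need.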

Triggering type changes

It's important to realize here that for example a function like Cos() changes the used type to a floating point. Using a floating point number such as 3.0 would do the same thing, but dividing one integer by another integer does not! That's why you get different answers in the following example. Code:

z.f = 0
a.l = 2/3 + 2/3 + z.f
b.l = 2/3.0 + 2/3 + z.f
c.l = z.f + 2/3 + 2/3
d.l = z.f + 2/3 + 2/3 + 2/3
;
Debug z    ; 0
Debug a    ; 0
Debug b    ; 1
Debug c    ; 1
Debug d    ; 2
Let's have a good look at the above and start with a.l...
a.l = 2/3 + 2/3 + z.f
Dividing an integer by another integer does not change the type, so the expression stays in integer mode and thus is rounded down. 2/3 in integer is rounded down, ie. 0.66 becomes 0, which results in:
a.l = 0 + 0 + 0.0 = 0
Now the last addition, adding a float, changes the whole type to a float, but alas... it's too late, we've done all evaluations and zero will be the outcome.

Let's try the next line:

b.l = 2/3.0 + 2/3 + z.f
Here 2 is divided by 3.0. The 3.0 turns the whole expression into a floating point evaluation, so instead of rounding down we keep the value. As we're now in floating point mode, we'll evaluate all further expressions at least as floats, so any subsequent 2/3's will not be rounded down but kept as they are, which leads us to:
b.l = 0.66 + 0.66 + 0.0 = 1.33 = 1
When the result of the expression, 1.33, is stored in the integer variable b.l it's going to use bankers rounding: 1.33 will be rounded down, thus the result will be 1.

The next line:

c.l = z.f + 2/3 + 2/3
When parsed from left to right the first part of the expression encountered is a float, thus the whole expression type is changed upwards to a float. Subsequent parts stay in floating point mode, so the result would be:
c.l = 0.0 + 0.66 + 0.66 = 1.33 = 1
The next line:
d.l = z.f + 2/3 + 2/3 + 2/3
The final line shows the impact of bankers rounding again:
d.l = 0.0 + 0.66 + 0.66 + 0.66 = 2.0 = 2

Functions and function parameters

Code:

a.f = 10*22+Cos(z.f)
Here the a.f turns all into floats, so 10*22+Cos(...) would be processed as a float, finally the result is stored as a float. Note that not only the Cos() itself, but also the parameters of Cos() can force a typechange!

Code:

a.f = Cos(z.f)+10*22
b.f = Cos(z.d)+10*22
In the above the expression behind a.f is evaluated as a float, whilst the one behind b.f is evaluated as a double: when the .d is encountered, the type is changed upwards once more, from regular floating point .f to double precision floating point .d.

Type forcing

Code:

a.f = 10*22
Again, a.f means a float so 10*22 would be processed as a float. This also means you can do *some* typeforcing, as follows:

Code:

z.d = 0
a.l = z.d + 12000*Cos(x.f)
See? a.l means long, so we start in 'long' mode, we then run into z.d which means we continue in double mode, so the following 12000*Cos(x) is evaluated in double mode, before being turned back into a long. This also allows you to do some optimising, if you have an expression which contains different types, sorting them can speed up things:

Code:

a.l = 12000 * 22 + 10 * t.l + 10 * 22 * Cos(x)
b.l = Cos(x) * 10 * 22 + 12000 * 22 + 10 * t.l
The expression behind a.l is faster than the one behind b.l. On an Amd64-3000 the difference was roughly 6% with the debugger switched on.
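
If you want to check this on your own machine, a rough timing loop along these lines should do. This is only a sketch: the loop count and the values of x and t are arbitrary assumptions, and the outcome will depend on CPU, compiler version and whether the debugger is on.

```purebasic
; sketch: compare the two orderings, results vary per machine and version
#LOOPS = 5000000                     ; arbitrary number of iterations
;
Define a.l, b.l, n.l
Define t.l = 7                       ; arbitrary test values
Define x.f = 0.5
;
start = ElapsedMilliseconds()
For n = 1 To #LOOPS
  a = 12000 * 22 + 10 * t + 10 * 22 * Cos(x)     ; integer parts first
Next n
time_a = ElapsedMilliseconds() - start
;
start = ElapsedMilliseconds()
For n = 1 To #LOOPS
  b = Cos(x) * 10 * 22 + 12000 * 22 + 10 * t     ; float part first
Next n
time_b = ElapsedMilliseconds() - start
;
MessageRequester("timing", "a: " + Str(time_a) + " ms, b: " + Str(time_b) + " ms")
```

A MessageRequester() is used for the output so the test also works with the debugger switched off.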

Code:

a.f = 3
b.l = 1 + 2 / a.f
Notice the mathematical priority! It's NOT going to do something like ( 1 + 2 ) / 3 but it does 1 + ( 2 / 3 ). However, it will not start out evaluating as a float, as typing still works from left to right. The next sample clearly proves that. Notice the different variable outputs...

Code:

; survival guide 5_1_100 type conversion
; pb 4.40b3
;
a.f = 3
b.l = 2/3 + 2/a + 2/a
c.l = 2/a + 2/a + 2/a
d.l = 2/a + 2/3 + 2/3
e.l = 2/3 + 2/3 + 2/a
;
Debug a
Debug b
Debug c
Debug d
Debug e
In the code above 2/3 would not change the type into a float, 2/a would. Here's how the calculation would actually go:
; survival guide 5_1_110 type conversion
; pb 4.50rc2
;
a.f = 3
b.l = 2/3 + 2/a + 2/a     ; b = 0 + 0.66 + 0.66 = 1.33 = 1
c.l = 2/a + 2/a + 2/a     ; c = 0.66 + 0.66 + 0.66 = 2
d.l = 2/a + 2/3 + 2/3     ; d = 0.66 + 0.66 + 0.66 = 2
e.l = 2/3 + 2/3 + 2/a     ; e = 0 + 0 + 0.66 = 0.66 = 1
;
Debug a                   ; 3.0
Debug b                   ; 1
Debug c                   ; 2
Debug d                   ; 2
Debug e                   ; 1
Here is another example, using a float variable to force the type to float.
a.f = 0
b.l = 2/3 + 2/3 + 2/3 + 2/3
c.l = a.f + 2/3 + 2/3 + 2/3 + 2/3
And one more (all these samples were the result of trying to figure out how things work(ed) or were (are?) broken)...
; survival guide 5_1_130 type conversion
; pb 4.40b3
;
a.f = 3.0
;
f.l = 2/3 + 2/3 + 2/3
g.l = 2/3 + 2/3 + 2/3.0
h.l = 2/3 + 2/3 + 2/a
;
Debug f
Debug g
Debug h
;
i.l = 2/3 + 2/3.0 + 2/3
j.l = 2/3 + 2/a + 2/3
;
Debug i
Debug j
;
k.l = 2/3.0 + 2/3 + 2/3
l.l = 2/a + 2/3 + 2/3
;
Debug k
Debug l

Evaluations in expressions

PureBasic is NOT C(++) and does NOT allow things like:

a = a + 5*(b=c) + 6*(c=d)
Previously, the compiler would not throw an error. As of 5.11 it does. If you're desperate for this coding style use Bool()...
a = a + 5*(Bool(b=c)) + 6*(Bool(c=d))
... or go the long way (which is easier to read and probably executes just as fast, if not faster):
If b = c
a = a+5
EndIf
If c = d
a = a+6
EndIf
Note that expressions are evaluated left to right, but Procedure() parameters are passed on from right to left! See here for an example.

5.2 Unicode

If you're only using built-in string functions and do not touch the strings in memory, don't care how things are written to files and don't have to deal with other applications that do or do not use Unicode, well, then this section is not for you. Go away :-)

The rest of us, stick around. This is important.

First, two quite important hints...
• If you want to enter Unicode characters in the editor, you have to switch this option on! File / Preferences / Editor / Use UTF8.
• If you want your programs to use Unicode, you have to switch this option on! Compiler / Compiler Options / Create Unicode Executable.
Done that? Okay, I'm stepping into the danger zone here... I'm not the expert on Unicode or similar things, and I may go entirely wrong (sorry). So feel free to correct me if I made any mistakes. Pay close attention to the samples in this section, and the way you should run them (with Unicode on or off). Good luck!

Multi byte

... is the all-encompassing name for any and all systems using more than one byte to encode a character. Where good old ASCII (in its extended 8 bit forms) used 8 bits / 1 byte for a maximum of 256 characters, multi-byte systems can encode more than 256 characters. Note that some multi-byte encoding schemes are called... multi-byte! Keeps things wonderfully clear, doesn't it? :-)

Some well known (yeah, right) encoding schemes and character sets:

• UTF8, UTF16, UTF32
• UCS2, UCS4
• DBCS
• DWCS
• BIG5
• ASCII
• UNICODE

Unicode

I'm going to keep things simple, and not entirely correct, as (from our simple point of view) it doesn't matter much. Unicode is a collection of characters. Period. That's all there is to it. It encompasses many different characters from many different languages. There are a few issues with Unicode (see http://www.jbrowse.com/text for nitty gritty details). However, for most of us, Unicode is good enough.

Notice that you may not have loaded all 'characters' on your Windows box. So, Unicode works, but it doesn't display properly. In those cases you may have to add sets of characters; this may especially be the case for Arabic and Asian languages.

Unicode is the character set. However, we can ENCODE it in different ways. That's where the real fun starts.

Wide character

Windows does Unicode. Well it doesn't. No it does. Oh hell who knows :-) This is what I think it is, but it may be wrong... The term 'wide character' actually stands for '2 bytes'. That's obviously not the same as 'Unicode'...

Windows does the following 'wide character' encodings to represent Unicode characters: (Remember: Unicode is the character set, not the encoding mechanism.)

• Windows 9x / ME: DBCS (ain't got a clue)
• Windows NT / 2K: UCS2 (DWCS?)
• Windows XP: UCS2 (though some do not agree on this, and they may be right :-))
Confused yet? No? Darn, I have to try harder :-)

When you're dealing with Windows (XP), the terms wide character, UCS2, DWCS, Unicode and UTF16 are often interchanged and pretty much mean the same (to average Joe The Windows Programmer). Yet they actually all mean something different...

• ASCII - 1 byte for each character, cannot be used to represent Unicode text
• DBCS - 2 bytes, ain't got a clue what it does, but it isn't supported by PureBasic
• DWCS - 2 bytes, probably the same as UCS2, can do a subset of Unicode (as it isn't full UTF16)
• UCS2 - 2 bytes, can do a subset of Unicode (as it isn't full UTF16)
• UTF16 - 2 or 4 bytes for each character, can do full Unicode
• UTF8 - 1 to 4 bytes (the original design allowed up to 6), can do full Unicode
XP / Vista may support UTF16, or it may not. The jury is still out on that one. I personally have a strong suspicion it's not (full) UTF16. I've yet to run into some reference that would indicate 'Unicode' programs under Windows have variable character length in memory.

NT and Windows 2K are questionable. To err on the safe side I would not use Unicode on these boxes.

And as for the old 16 bit software: stick to non-Unicode when writing programs for Windows 9x / Me. Unicode on those platforms is definitely unsupported.

PureBasic's (Windows') Unicode

To turn your program into Unicode, all you have to do is use the /UNICODE flag when calling the compiler (or just tick the box under Compiler Options), and all regular functions that use strings suddenly move to Unicode... now that's easy :-) However, Unicode has some impact on the strings in memory and in files. That may deserve some attention...

PureBasic Unicode strings in memory

• regular strings are encoded in (a variant on) UTF16 (XP) or UCS2 (NT / 2K)
• each character takes 2 bytes
• strings can contain zeroes

PureBasic Unicode strings in files

• regular strings are encoded in UTF8
• each character takes 1 to 4 bytes (the original UTF8 design allowed up to 6)
• ASCII characters 0..127 are represented by one byte
• all other characters take multiple bytes
• UTF8 characters can not contain zeroes

Testing (non) Unicode

Ready? Let's start with an example... We'll find the char type .c very useful when it comes to Unicode. Here's a bit of code that shows string behaviour. Run it once with Unicode switched on and once with Unicode switched off (Ascii mode). You'll find the option to toggle between the two under Compiler / Compiler Options / Create Unicode Executable.

; survival guide 5_2_200 char
; pb 4.40b3
;
;
Structure CHAR                     ; not yet declared in v4.00, so here i'll do it
c.c
EndStructure
;
a.s = "test"
l = Len(a)                         ; always reports 4, the number of characters in the string
*p_c.CHAR = @a
;
; here's one way to figure out the size of a char
;
c_size = StringByteLength(" ")     ; returns size of char, could be either 1 or 2
;
; or use the compiler constant #pb_compiler_unicode
;
If #PB_Compiler_Unicode
c_size = 2
Else
c_size = 1
EndIf
;
b.s = ""
For n = 1 To l
b = b+Chr(*p_c\c)               ; build up a regular string using contents in memory
*p_c = *p_c+c_size               ; move the pointer in bytes per character
Next n
;
;
*p_b.BYTE = @a                     ; start again at the beginning
h.s = ""
For n = 1 To (l+1)*c_size          ; from start to end including terminating zero
h = h+"$"+Hex(*p_b\b)+" "
*p_b = *p_b+1                      ; move the pointer in bytes regardless of mode
Next n
AddGadgetItem(1,-1,"in hex: "+h)
;
Repeat
Until WaitWindowEvent() = #PB_Event_CloseWindow
As you can see, the above will work in Unicode mode as well as in regular mode. Instead of using .BYTE structures or PeekA() we will have to switch to .CHAR or PeekC(), as well as increase counters with either 1 or 2 depending on Unicode mode... as each 'character' can be either 1 (non-Unicode) or 2 (Unicode) bytes long.

There's also a little compiler constant that helps us here: #PB_Compiler_Unicode is 1 if the program is compiled in Unicode mode, and 0 if it is compiled in non-Unicode or ASCII mode.

Memory usage

Using Unicode affects all string functions, and the way strings are stored in memory. In non-Unicode mode:

a.s = "test"                       ; will take 5 bytes: $74 $65 $73 $74 $00
The same in 'Unicode' (actually UCS2):
a.s = "test"                       ; will take 10 bytes: $74$00 $65$00 $73$00 $74$00 $00$00
The same applies to fixed length strings as well. If you use arrays of bytes (see here) in structures make sure you reserve enough bytes.

A file in UTF8 looks pretty much like a regular Ascii file, unless there are special characters in there which do not exist in the regular Ascii character set. In these cases a character in UTF8 can take up more than a single byte (up to 4 bytes; the original design theoretically allowed up to 6).

A Unicode string in memory looks pretty much like a regular Ascii string, except it is 'interleaved' with zeroes. 'Okay' in Unicode would take up 10 bytes: $4F 00 6B 00 61 00 79 00 00 00. The same string 'Okay' in Ascii would take up 5 bytes: $4F 6B 61 79 00. Remember strings are zero terminated!
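
To see those bytes for yourself, a small dump routine like the following could be used. It is only a sketch: it assumes PeekA() and the predefined Character structure are available in your version, and it prints to the debug window rather than to a gadget.

```purebasic
; sketch: dump the raw bytes of a string, including the terminating zero
a.s = "Okay"
bytes = StringByteLength(a) + SizeOf(Character)   ; string plus its terminator (1 or 2 bytes)
h.s = ""
For i = 0 To bytes - 1
  h = h + RSet(Hex(PeekA(@a + i)), 2, "0") + " "  ; one unsigned byte, two hex digits
Next i
Debug h
```

Run it in both modes and compare the output with the byte sequences listed above.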

PureBasic has a dedicated command to determine how much space a string takes in memory:

a.s = "test"
Debug StringByteLength(a)
Debug StringByteLength(a,#PB_Ascii)
Debug StringByteLength(a,#PB_UTF8)
Debug StringByteLength(a,#PB_Unicode)
Note that it doesn't calculate the space for terminating zeroes, in Ascii mode there is one terminating zero, in Unicode mode there are two.
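
The differences between the formats only become visible once a string contains a character outside the plain 0..127 range. A small sketch, assuming character 233 maps to 'é' (true in Unicode mode, and in non-Unicode mode on a typical Western Windows code page):

```purebasic
; sketch: byte lengths of a string with one non-Ascii character
a.s = "caf" + Chr(233)                     ; assumption: 233 is 'é'
;
Debug Len(a)                               ; 4, the number of characters
Debug StringByteLength(a, #PB_Ascii)       ; one byte per character
Debug StringByteLength(a, #PB_UTF8)        ; the 'é' should need two bytes here
Debug StringByteLength(a, #PB_Unicode)     ; two bytes per character
```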

Flags for PeekS(), WriteString() etc.

You can specify the string format with specific parameters.

In non-Unicode mode:

a.s = "test"                         ; takes up 5 bytes, 4 for 'test' and 1 for a zero
WriteString(1,a)                     ; takes up 4 bytes, 4 for 'test'
WriteString(1,a,#PB_Ascii)           ; takes up 4 bytes
WriteString(1,a,#PB_UTF8)            ; takes up 4 bytes (no special characters)
WriteString(1,a,#PB_Unicode)         ; takes up 8 bytes, 2 for each character
PokeS(@a,"x",1)                      ; writes 2 bytes, 1 for 'x' and 1 for a zero
PokeS(@a,"x",1,#PB_Ascii)            ; writes 2 bytes, 1 for 'x' and 1 for a zero
PokeS(@a,"x",1,#PB_UTF8)             ; writes 2 bytes, 1 for 'x' and 1 for a zero (no special characters)
PokeS(@a,"x",1,#PB_Unicode)          ; writes 4 bytes, 2 for 'x' and 2 for 2 zeroes
In Unicode mode:
a.s = "test"                         ; takes up 10 bytes, 8 for 'test' and 2 for two zeroes
WriteString(1,a)                     ; takes up 4 bytes, 4 for 'test'
WriteString(1,a,#PB_Ascii)           ; takes up 4 bytes
WriteString(1,a,#PB_UTF8)            ; takes up 4 bytes (no special characters)
WriteString(1,a,#PB_Unicode)         ; takes up 8 bytes, 2 for each character
PokeS(@a,"x",1)                      ; writes 4 bytes, 2 for 'x' and 2 for 2 zeroes
PokeS(@a,"x",1,#PB_Ascii)            ; writes 2 bytes, 1 for 'x' and 1 for a zero
PokeS(@a,"x",1,#PB_UTF8)             ; writes 2 bytes, 1 for 'x' and 1 for a zero (no special characters)
PokeS(@a,"x",1,#PB_Unicode)          ; writes 4 bytes, 2 for 'x' and 2 for 2 zeroes
This applies to WriteString(), ReadString(), PokeS() and PeekS(). Check the helpfile for more details.
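
As a rough illustration of the file side, the sketch below writes a line in UTF8 and reads it back with the same flag. The file name is made up for this example, and the file is placed in the temporary directory and deleted again afterwards.

```purebasic
; sketch: write and read a string using an explicit format flag
file$ = GetTemporaryDirectory() + "sg_utf8_test.txt"   ; hypothetical file name
;
If CreateFile(0, file$)
  WriteStringN(0, "test", #PB_UTF8)      ; the string plus an end of line
  CloseFile(0)
EndIf
;
If ReadFile(0, file$)
  Debug Lof(0)                           ; file size in bytes, including the end of line
  Debug ReadString(0, #PB_UTF8)          ; read it back with the same format
  CloseFile(0)
  DeleteFile(file$)
EndIf
```

Using the same flag for writing and reading keeps the result independent of the Unicode compiler switch.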

5.3 WinAPI and DLL's

There's again terminology here that may be confusing. Let's clear that first...

WinAPI

PureBasic allows calling Windows routines (the so called WinAPI, or 'Windows application programming interface') directly. A number of these WinAPI calls are recognized automatically.

#SM_CMONITORS = 80
Debug GetSystemMetrics_(#SM_CMONITORS)
WinAPI calls can be recognized by the underscore following the call, as shown above. Move the cursor over the GetSystemMetrics_() part and press F1. If you have installed WIN32.HLP or the Windows Platform SDK then you will see that the function is actually called GetSystemMetrics without the underscore.

Using WinAPI, you can do everything that is possible with Windows. The regular PureBasic gadget commands 'hide' the sometimes unfriendly calls to Windows from our eyes. I may be exploring a little WinAPI left and right... but for the moment stay with me, okay? Oh... you already left... :-)

The nice thing about PureBasic is that it automatically supports a large number of WinAPI calls, simply by using the API name followed by an underscore. In those cases, we do not have to 'open' a 'DLL'... PureBasic did that already... If a WinAPI function is not recognized we can always open and use them ourselves, if we know where to look and what to do.

PureBasic handles Unicode vs. non-Unicode for 'recognized' WinAPI functions. Try this with Unicode on and off:

MessageBox_(0,"hello world","winapi",0)
The interesting thing is: there is no WinAPI function called MessageBox... there's one called MessageBoxA and one called MessageBoxW... But that's a different story... or is it?

No, it isn't. In Unicode mode PureBasic turns your MessageBox_() call into a MessageBoxW() call, whilst in non-Unicode mode the MessageBoxA() function is called. But only for all functions that PureBasic is natively aware of, ie. all standard Windows functions. If you use external (third party) DLL's this may not work and you may have to figure out yourself which function you need to call.

DLL

Windows comes with a number of DLL's, and many applications bring their own as well. DLL's are files containing a number of functions that you can call from within your own program. The big advantage is obvious: sometimes you won't have to write your own code, as the right DLL with the right function may already be available to you. Using DLL's has another big advantage: DLL's look the same to every language, to every program.

Listing all functions within a DLL

It may not be entirely useful, but sometimes it helps to know what's in a DLL... because either documentation is incorrect, or you're one of those that likes to use undocumented features...

With the arrival of v4.00 it is now possible to use Unicode. As some enterprising developers may already know, many WinApi functions do exist in two tastes: Ascii and Wide. This may not even be documented (as I found out with the WINTAB32.DLL, where all documentation never mentions the existence of A and W variations).

Here's an example: (run with Unicode off)

; survival guide 5_3_100 dll functions
; pb 4.40b3

; works on 4.31 and 4.40b3, doesn't work on 4.40b1

user32_nr.i = 1
user32_h.i = OpenLibrary(user32_nr,"USER32.DLL")

If user32_h
  ;
  ExamineLibraryFunctions(user32_nr)
  While NextLibraryFunction()>0
    Debug LibraryFunctionName()
  Wend
  ;
  CloseLibrary(user32_nr)            ; only close the library if it was opened
EndIf
Run the code above and you will see a large number of functions with an A and a W variant.
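
To get a feeling for the numbers without scrolling through the whole list, the same loop can simply count names by their last letter. This is only a sketch, and a crude one: a name ending in A or W is not guaranteed to be an Ascii or Wide variant, so treat the counts as an indication.

```purebasic
; sketch: count exported names ending in A or W (a rough indication only)
count_a = 0
count_w = 0
;
If OpenLibrary(0, "USER32.DLL")
  If ExamineLibraryFunctions(0)
    While NextLibraryFunction()
      name$ = LibraryFunctionName()
      If Right(name$, 1) = "A"
        count_a = count_a + 1
      ElseIf Right(name$, 1) = "W"
        count_w = count_w + 1
      EndIf
    Wend
  EndIf
  CloseLibrary(0)
EndIf
;
Debug count_a
Debug count_w
```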

There are different ways to use a function, for example CallFunction(), CallFunctionFast() and prototypes, all of which are covered below.

More variants can be found in the PureBasic documentation.

CallFunction

Here's an example of CallFunction, which we could use for, for example, the WinSock functions. First open the dll...

ws_winsock_nr = 1
ws_winsock_h = OpenLibrary(ws_winsock_nr,"WSOCK32.DLL")
Once opened we find and call the function by its name...
ws_retval = CallFunction(ws_winsock_nr, "WSAStartup")
When we are done, we are supposed to close the library...
CloseLibrary(ws_winsock_nr)
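
Because CallFunction() takes the function name as a string, you have to pick the A or W variant yourself. One possible approach, sketched here with a made-up constant name #MB_NAME$, is to let the compiler choose the name based on #PB_Compiler_Unicode:

```purebasic
; sketch: pick the Ascii or Wide variant at compile time
CompilerIf #PB_Compiler_Unicode
  #MB_NAME$ = "MessageBoxW"          ; hypothetical constant name, wide variant
CompilerElse
  #MB_NAME$ = "MessageBoxA"          ; ascii variant
CompilerEndIf
;
If OpenLibrary(0, "user32.dll")
  ; the '@' passes the address of each string, in the format of the current mode
  CallFunction(0, #MB_NAME$, 0, @"hello world", @"callfunction", 0)
  CloseLibrary(0)
EndIf
```

The strings are stored in the format of the current compiler mode, so they match the variant that was selected.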

CallFunctionFast

The CallFunction method above has as a disadvantage: the 'lookup' of the function name at runtime will take some time. To speed things up a little, we could also store the address of the function:

*ws_wsastartup = GetFunction(ws_winsock_nr,"WSAStartup")
GetFunction() checks if the function exists, and if so returns an address. Use that address as a parameter for CallFunctionFast() and pass the appropriate parameters.
ws_retval = CallFunctionFast(*ws_wsastartup,$101,@ws_wsadata)
Let's try that again: open a DLL, find a function, call that function.

; survival guide 5_3_400 callfunctionfast
; pb 4.40b3 non-unicode
;
OpenWindow(1,100,100,100,100,"window")
;
user32_nr = 1
user32_h = OpenLibrary(user32_nr,"USER32.DLL")
p_messageboxa = GetFunction(user32_nr,"MessageBoxA")
CallFunctionFast(p_messageboxa,0,@"hello world 1",@"test",0)
If you run the code above in Unicode mode, it won't work properly, which is logical as we're calling the Ascii version of the function. More about that later, as we are working our way towards... Prototypes!

Note: as of 4.40b1 you can only pass 'integers' using CallFunctionFast(). If you want to pass strings directly as parameters, you need to add the '@' character to pass the memory location of that string.

CallFunctionFast(p_messageboxa,0,@"hello world 1",@"test",0)     ; works on 4.40b1 and later
CallFunctionFast(p_messageboxa,0,"hello world 1","test",0)       ; works only on versions before 4.40b1

Prototypes and pseudotypes

Pseudotypes and prototypes... No. They're not the sleazebags hanging around your local grocery store, nor some of your aspiring wannabe-manager colleagues, or the legally blonde over there (those are more stereotypes :-))... But then, what are they? And why would we use them? Let's do things the wrong way around, we start with how and then why...

Prototypes

The easiest way to think of prototypes is as 'wrappers' that turn WinAPI or DLL functions into regular procedures. In other words, instead of using a CallFunction or CallFunctionFast we call the function as we would call a procedure. And whilst doing that, we can do some funny things...

First the basics with the WinAPI function MessageBoxA(). In a normal direct call, we would open the DLL, look for it, then call it. Try the next snippet with Unicode disabled.

; survival guide 5_3_500 prototypes
; pb 4.40b3 non-unicode
;
OpenWindow(1,100,100,100,100,"window")
;
OpenLibrary(0, "user32.dll")
*mb2 = GetFunction(0,"MessageBoxA")
;
CallFunctionFast(*mb2,0,@"so much work for hello world 2",@"callfunctionfast",0)
We could also 'wrap it', sort of, so we could use it just like the built-in underscore variant:

; survival guide 5_3_510 prototypes
; pb 4.40b3 non-unicode
;
OpenWindow(1,100,100,100,100,"window")
;
OpenLibrary(0, "user32.dll")
Prototype mb4(w_parent_h,message.s,title.s,flags)
Global mb4.mb4 = GetFunction(0,"MessageBoxA")
;
mb4(0,"so much work for hello world 3","prototype string",0)
So, after defining it as a 'prototype' and creating a variable with that type, we can call mb4() as if it was a PureBasic command, one of our own procedures, or a built-in WinAPI call (using the underscore). Duh? Duh.

Okay... that didn't make much sense, or did it? Think of it as another way to write a procedure. The following is a rough equivalent (but not the same as we haven't done pseudotypes yet)...

; survival guide 5_3_520 prototypes
; pb 4.40b3 non-unicode
;
OpenWindow(1,100,100,100,100,"window")
;
OpenLibrary(0, "user32.dll")
Global *mb6 = GetFunction(0,"MessageBoxA")
;
Procedure mb6(w_parent_h,message.s,title.s,flags)
  CallFunctionFast(*mb6,w_parent_h,@message,@title,flags)
EndProcedure
;
mb6(0,"so much work for hello world 4","procedure",0)
Again, it's only a rough visualization, because now it's time for...

Pseudotypes

Pseudotypes turn prototypes into the real contenders for the sexiest code on earth. Well. Not exactly. Sort of. If at all. I suppose. (It's been a long day, can't you tell?)

A pseudotype tells the compiler to convert the given parameter to another parameter, but it only works in combination with prototypes. The following code would only work with Ascii parameters.
If we would run it in Unicode mode, all sorts of unexpected things could happen:

; survival guide 5_3_600 pseudotypes
; pb 4.40b3 non-unicode
;
OpenWindow(1,100,100,100,100,"window")
;
OpenLibrary(0, "user32.dll")
Prototype mb4(w_parent_h,message.s,title.s,flags)
Global mb4.mb4 = GetFunction(0,"MessageBoxA")
;
mb4(0,"so much work for hello world 5","prototype string",0)
Obviously you'd say that we could use a Unicode approach when running in Unicode mode, and a regular Ascii approach when not in Unicode mode. Hmmm. Now what about DLL's that only provide one flavour, either Unicode or non-Unicode? Well, that's easy. We let PureBasic do the work for us.

; survival guide 5_3_610 pseudotypes
; pb 4.40b3 non-unicode
;
OpenWindow(1,100,100,100,100,"window")
;
OpenLibrary(0, "user32.dll")
Prototype mb5(w_parent_h,message.p-ascii,title.p-ascii,flags)
Global mb5.mb5 = GetFunction(0,"MessageBoxA")
;
mb5(0,"so much work for hello world 6","prototype pseudotype",0)
In the above, we tell PureBasic that the parameters are strings, and they have to be converted to Ascii, no matter what. So the above will run in Unicode as well as non-Unicode mode!

There are three types of pseudotypes:

• p-ascii - to force the parameter to 8 bit Ascii
• p-unicode - to force the parameter to 16 bit Unicode (DWCS)
• p-bstr - to force the parameter to BSTR format (Visual Basic)

To wrap this up, here's a sample with different approaches:

; survival guide 5_3_700 prototypes and pseudotypes
; pb 4.40b3 non-unicode
;
OpenWindow(1,100,100,100,100,"window")
;
OpenLibrary(0, "user32.dll")
;
; example 1: built-in purebasic winapi... doesn't work for non-windows
; functions for which you have to open the dll yourself...
;
MessageBox_(0,"so much work for hello world 1","built-in underscore",0)
;
; example 2: opening it as if it was any other dll function:
;
*mb2 = GetFunction(0,"MessageBoxA")
;
CallFunctionFast(*mb2,0,@"so much work for hello world 2",@"callfunctionfast",0)
;
; example 3: we define a function called mb3() that accepts four
; parameters, we are actually wrapping the MessageBoxA function
;
; note: purebasic enforces parameter types a little stronger now, so we
; have to pass the proper variable type
;
Prototype mb3(w_parent_h,*message,*title,flags)
Global mb3.mb3 = GetFunction(0,"MessageBoxA")
;
mb3(0,@"so much work for hello world 3",@"prototype pointer",0)
;
; example 4: purebasic automatically translates strings into pointers
; (to these strings) when doing api calls, so although the function
; MessageBoxA needs pointers to strings we could define the function
; like this with strings as parameters
;
Prototype mb4(w_parent_h,message.s,title.s,flags)
Global mb4.mb4 = GetFunction(0,"MessageBoxA")
;
mb4(0,"so much work for hello world 4","prototype string",0)
;
; example 5: using pseudotypes, a pseudotype 'changes'
; the type of parameter to another type, so you sort of pass the wrong
; one, and tadaaa, it's suddenly the right one
;
Prototype mb5(w_parent_h,message.p-ascii,title.p-ascii,flags)
Global mb5.mb5 = GetFunction(0,"MessageBoxA")
;
mb5(0,"so much work for hello world 5","prototype pseudotype",0)
;
; example 6: a throwback to the old days, you don't want to be seen
; doing this as it is sooooo not-coooooool... (but hey, it works :-))
;
Global *mb6 = GetFunction(0,"MessageBoxA")
;
Procedure mb6(w_parent_h,message.s,title.s,flags)
  CallFunctionFast(*mb6,w_parent_h,@message,@title,flags)
EndProcedure
;
mb6(0,"so much work for hello world 6","procedure",0)

Why Prototypes?

Although at first rather complex, prototypes and pseudotypes turn out to be rather simple. But why go through all the effort for just an alternative way to call functions? Well...
Prototypes allow you a few things that CallFunctionFast etc. cannot:

• it is easier to use non-Unicode functions in Unicode programs and vice versa
• you can check on the number of parameters passed to the function
• you can return and handle other types of variables, such as floats, doubles, or quads

5.4 Structures and pointers

Structures and pointers allow 'fancy' stuff, and for serious WinAPI programming they are essential. Usage can be quite complex, so don't worry if you don't get it right the first time. In fact, if you don't need them, you might not even bother with them... Yet once you master them, they become an essential component towards clean programming.

Structures

A structure is a variable type that allows us to combine many different variables and types and treat them as if they were one single thing. (Go on, read that line again.) As a structure is fixed, it provides a stable and implicitly documented way to store, retrieve and exchange information, especially in combination with a pointer. On a 32 bit platform:

; survival guide 5_4_100 structures
; pb 4.40b3 win32
;
Structure sample
  a.l                         ; +0    4 bytes
  b.w                         ; +4    2 bytes
  c.b                         ; +6    1 byte
  d.f                         ; +7    4 bytes
  e.s                         ; +11   4 bytes (8 bytes in win64)
  f.b[10]                     ; +15   10 bytes
EndStructure
;
x.sample
;
x\a = 1
x\b = 1
x\c = 1
x\d = 1.1
x\e = "test"
PokeS(@x\e,"test")
;
Debug SizeOf(sample)          ; size of the structure
Debug SizeOf(x)               ; size of the typed variable x.sample
Debug OffsetOf(sample\d)      ; relative position of the field \d
; Debug OffsetOf(x\d)         ; OffsetOf() does NOT work on the variable itself, use the structure name
First we defined the structure and all the fields it contains. Then we created a variable and 'typed' it 'sample'. Now we can fill in any of the parts of the structure. On a 64 bits platform e.s would take 8 bytes, see strings in structures.

The first four bytes contain the long for the x\a field, then 2 bytes for a word for x\b, 1 for a byte for x\c, 4 for a float for x\d, etcetera...
I've listed above how much space each variable type takes, and how far from the beginning of the structure it is stored. The length of a structure can be retrieved using SizeOf() with either the structure name or the variable name. A structure can contain bytes, words, longs, strings, floats, or even other structures or pointers. (See 'Pointers inside structures' below for more details on pointers in structures.)

SizeOf() and OffsetOf()

The command SizeOf() returns the size of a structure or variable (type). If a structure is 25 bytes long, then a variable typed with that same structure is also 25 bytes long. Makes sense, doesn't it? :-) See the little sample code above. You can use SizeOf() on both the variable 'x' and the structure 'sample' itself.

The command OffsetOf() tells you how many bytes from the start of the structure a certain field is located. In the example above, the field 'd' should be located 7 bytes after the start of the structure, so OffsetOf(sample\d) returns 7. OffsetOf() does NOT work on the typed variable, only on the structure itself. (Which is why I commented out the last line in the example above.)
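A little sketch of why OffsetOf() is worth the trouble: instead of hard-coding a '+7' somewhere, let the compiler work out where the field lives. The structure name 'demo' and its fields are made up for this illustration, they're not part of the samples above.

```purebasic
; sketch only: 'demo' is a hypothetical structure, not one from the guide
Structure demo
  a.l   ; long
  b.w   ; word
  c.b   ; byte
  d.f   ; float
EndStructure
;
x.demo
x\d = 1.5
;
; read the field d through its offset instead of a hard-coded number
Debug PeekF(@x + OffsetOf(demo\d))
;
; how many bytes are left from the start of d to the end of the structure
Debug SizeOf(demo) - OffsetOf(demo\d)
```

If you later add or remove a field in front of 'd', the code above keeps working without a single change, which is the whole point.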
Code:

; survival guide 5_4_200 structures
; pb 4.40b3 win32 non-unicode
;
Structure xx      ; a structure
  *p.LONG         ; +0 4 bytes for a pointer to a long
  *q.STRING       ; +4 4 bytes for a pointer to a string
  l.l             ; +8 4 bytes for a long
  z.s             ; +12 4 bytes for a zero terminated string
  f.s{10}         ; +16 10 bytes for a fixed length string
  c.c[20]         ; +26 20 bytes for an array of chars
EndStructure
;
Debug SizeOf(xx)  ; the whole structure is 46 bytes long
;
Debug OffsetOf(xx\p)
Debug OffsetOf(xx\q)
Debug OffsetOf(xx\l)
Debug OffsetOf(xx\z)
Debug OffsetOf(xx\f)
Debug OffsetOf(xx\c)

Running exactly the same code under a 64 bits version of Windows with Unicode on would result in the same code with a different structure size:

Code:

; survival guide 5_4_210 structures
; pb 4.40b3 win64 unicode
;
Structure xx      ; a structure
  *p.LONG         ; +0 8 bytes for a pointer to a long
  *q.STRING       ; +8 8 bytes for a pointer to a string
  l.l             ; +16 4 bytes for a long
  z.s             ; +20 8 bytes for a zero terminated string
  f.s{10}         ; +28 20 bytes for a fixed length string
  c.c[20]         ; +48 40 bytes for an array of chars
EndStructure
;
Debug SizeOf(xx)  ; the whole structure is 88 bytes long
;
Debug OffsetOf(xx\p)
Debug OffsetOf(xx\q)
Debug OffsetOf(xx\l)
Debug OffsetOf(xx\z)
Debug OffsetOf(xx\f)
Debug OffsetOf(xx\c)

Arrays in structures

You can include arrays in structures, but they will have a fixed length, and it is not possible to change their size with ReDim after the structure has been built. Arrays in structs use a different type of brackets than regular arrays...

Code:

; win32
;
e.b[10]       ; takes 10 bytes
f.s[10]       ; takes 40 bytes (10 pointers to strings, see below)
g.s{10}[10]   ; takes 100 bytes OR 200 bytes (depending on unicode mode, see below)
h.q[10,10]    ; takes 10x10x8 = 800 bytes (an array of 10 by 10 quads, a quad is 8 bytes)

Strings in structures

Strings need a little more attention when used in structures. All regular variable types are stored in memory as 'one block' of data. Let's take the variable 'x' from the sample above.
It's located in memory at:

Code:

address = @x

You will notice that there are only 4 bytes (under 32 bits) or 8 bytes (under 64 bits) reserved for a string, regardless of the string length. That is because the string itself is not stored inside the structure, only a pointer to it. And, you guessed it, in 32 bit Windows such a pointer takes 4 bytes (32 bits) whilst in 64 bits Windows it's (insert appropriate drumroll) 8 bytes (64 bits).

Most languages do it differently, they store a string directly in memory as a 'byte array'. (They simply have no clue what strings are, the poor bastards...) This is also possible in PureBasic. We simply reserve some space (as much as we need, don't forget the space for any terminating zeroes)...

Code:

..
e.b[10]
..

... reserves 10 bytes in the struct. We could write some data in there using PokeS(). We can read it using PeekS(). The space above could be used to store a 10 character string in non-Unicode mode, or a 5 character string in Unicode mode. Remember, when using PokeS() it writes the terminating zero as well, thus leaving us effectively 9 characters to store in non-Unicode mode using PokeS(), or 4 characters in Unicode mode!

We could use a fixed length string or an array of chars as well, and thus always get the same number of characters. How much space they occupy then obviously depends on the Unicode mode. If you're not doing Unicode, one character will be one byte.

Code:

..
x.b[10]   ; 10 bytes whatever the mode, 10 characters in non-unicode, 5 in unicode
y.c[10]   ; 10 bytes in non unicode, 20 bytes in unicode, always 10 characters
z.s{10}   ; 10 bytes in non unicode, 20 bytes in unicode, always 10 characters
..
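Here's a small sketch of the 'byte array as string storage' idea, writing with PokeS() and reading back with PeekS(). The structure 'record' and its fields are hypothetical, and I'm assuming the optional Length and Flags parameters of PokeS() / PeekS() behave as described in the 5.x manual (Length limits the number of characters, the terminating zero is written on top of that).

```purebasic
; sketch only: 'record' is a made-up structure
Structure record
  id.l
  name.b[10]    ; 10 raw bytes inside the structure
EndStructure
;
r.record
r\id = 1
;
; write at most 9 characters as ascii, PokeS() adds the terminating zero,
; so we never write more than the 10 bytes we reserved
PokeS(@r\name[0], "PureBasic rocks", 9, #PB_Ascii)
;
; read it back as ascii, up to the terminating zero
Debug PeekS(@r\name[0], -1, #PB_Ascii)
```

By forcing #PB_Ascii the field holds one byte per character, whatever the Unicode setting of the program is. Without that flag you'd have to do the '9 or 4 characters' math from above yourself.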
Two more examples of the differences between fixed length strings and regular strings:

Code:

; survival guide 5_4_300 zero terminated strings
; pb 4.40b3
;
Structure xx
  i.i   ; +0 4 bytes (32 bits) or 8 bytes (64 bits) just a dummy field
  a.s   ; +4 4 bytes (32 bits) or +8 8 bytes (64 bits) pointer to the actual string
EndStructure
;
x.xx
x\i = 1234
x\a = "ABCD"        ; store a regular string
;
; where is the structure?
;
Debug SizeOf(xx)    ; oh horror :-) the size depends on the platform (but not on unicode mode)...
;
; win32 non-unicode = 4 + 4 = 8 bytes
; win32 unicode     = 4 + 4 = 8 bytes
; win64 non-unicode = 8 + 8 = 16 bytes
; win64 unicode     = 8 + 8 = 16 bytes
;
Debug @x            ; the address of the variable or structure (in this case @x = x)
;
; regular string, now where is the actual data for x\a.s ?
;
Debug x\a           ; second field of the structure
Debug @x\a          ; this returns the address where the string is actually stored
Debug PeekS(@x\a)   ; the actual string
;
; stored inside the structure is actually a pointer to the actual string
;
*p.INTEGER = PeekI(@x+OffsetOf(xx\a))   ; at @x+OffsetOf(xx\a) you find the actual pointer to the string
Debug *p            ; should be the same as @x\a
Debug PeekS(*p)

Code:

; survival guide 5_4_410 fixed length strings
; pb 4.40b3
;
Structure xx
  i.i       ; +0 4 bytes (32 bits) or 8 bytes (64 bits) just a dummy field so there is something in the structure
  b.s{10}   ; +4 (32 bits) or +8 (64 bits) 10 bytes (non-unicode) or 20 bytes (unicode) contains the actual string
EndStructure
;
x.xx
x\i = 1234
x\b = "DEFG"        ; store a fixed length string
;
; where is the structure?
;
Debug SizeOf(xx)    ; oh horror :-) the size depends on platform and unicode mode...
;
; win32 non-unicode = 4 + 10 = 14 bytes
; win32 unicode     = 4 + 20 = 24 bytes
; win64 non-unicode = 8 + 10 = 18 bytes
; win64 unicode     = 8 + 20 = 28 bytes
;
Debug @x            ; the address of the variable or structure (in this case @x = x)
;
; fixed length string, now where is the actual data for x\b ?
;
Debug x\b           ; second field of the structure
Debug @x\b          ; this returns the address where the string is actually stored
Debug PeekS(@x\b)   ; the actual string
;
; stored inside the structure is the actual data
;
*p.INTEGER = @x+OffsetOf(xx\b)   ; not the pointer but the actual data
Debug *p            ; should be the same as @x\b (that's @x+4 on 32 bits)
Debug PeekS(@x\b)

With / EndWith

With the arrival of PB v4.00 our lives have become a little easier. If we need to fill a large number of fields of a struct, we don't have to specify the variable name each time, it's enough to specify the field. Below you will find two variables with the structure player, and you can see the different approach to filling them.

Code:

Structure player
  x.l
  y.l
  lives.l
  bullets.l
EndStructure
;
player1.player   ; create var player1 with struct player
player2.player   ; same for player 2
;
With player1
  \x = 10
  \y = 10
  \lives = 3
  \bullets = 100
EndWith
;
player2\x = 10
player2\y = 10
player2\lives = 3
player2\bullets = 100

Structures are great little beasts as they help us organize our data. They can also help us exchange data with other programs or the OS. But we need something more, a variable type (well, sort of) that indirectly points to a real variable, and if we change it, we are actually changing the variable it points to. Did that make much sense? Nope. Not. Not yet :-) Just one more (very important) thing...

IMPORTANT: WITH / ENDWITH CANNOT BE NESTED. This is done by design. Don't even ask for it.

Pointers

In PureBasic a pointer is a special flavour of an integer. Prior to PB4.30, pointers were always longs, these days they are .i integers, ie. their size depends on the platform and code. A pointer is used as a variable that points to a certain location in memory. Mmm...

Code:

a$ = "test"
a.l = @a$   ; not wise as this cannot be guaranteed to work in a 64 bit environment
a.i = @a$   ; now the pointer size is automatically adjusted
a.q = @a$   ; we *could* use quads, but why not use the adaptive .i type?
*a = @a$    ; ah, the real thing, a serious pointer, using the default type
Just like the '$' symbol, in PureBasic the '*' symbol is part of the variable name, so the three variables above, a$, a and *a are not the same thing. We could use a long to store the address of the string a$, or we could use a pointer. So, what's the difference? For you experts, first another little example:

Code:

b.l = 256
*c.LONG
*c = @b
*c\l = 3
Debug b

Ah, the specialists will have noticed that *c is a pointer of type LONG. LONG is a rather simple structure, with only one field. (There are a few of these and they can make life easier.) It's one of the default structures within PureBasic so you don't have to declare it, but if you would it would look something like this:

Code:

Structure LONG
  l.l
EndStructure

A pointer points to a spot in memory where something is stored. When we define the pointer, we tell PureBasic what type of variable we expect on that spot in memory. In fact, *c.LONG is not just a pointer, it's a structured pointer. (We're revisiting the subject later. Don't worry.)

So, by specifying a field of a structured pointer, we're changing the contents of the memory the pointer is pointing to. By leaving out the field (or when specifying the type), we tell the pointer where to point to. Let's take that little snippet again and explain what is going on.

Code:

b.l = 256   ; new long variable b is created and filled with 256
*c.LONG     ; a pointer of type LONG is created
*c = @b     ; the pointer now points to the spot in memory where b.l is stored
*c\l = 3    ; we change the contents of the spot in memory where the field 'l' is located
Debug b

Yes. It's not easy. But it can come in very handy. Fortunately, PureBasic is a little inconsistent, otherwise things would be too easy... :-)

Before we go on, let's think this over, and realize some very important facts...

• on a 32 bits platform a pointer is 32 bits long, on a 64 bits platform a pointer will be 64 bits long (duh!)
• on a 64 bit platform a long isn't long enough to contain a pointer (poor joke, but still a long is only 32 bits, not 64)
• in PureBasic pointers behave like regular integer variables in all situations, except in combination with structs / types

Duh. Okay, let's start again, and see if we grab it this time. First a normal single pointer. Let's say there is a string in memory, and we want every letter 'A' to turn into a 'B'. Try this with Unicode off...

Code:

; win32 non-unicode
;
a.s = "Test ABAB"
Debug a
p.i = @a
Repeat
  b.b = PeekB(p)
  If b = 'A'
    PokeB(p,'B')
  EndIf
  p = p+1
Until b = 0
Debug a

Code:

; win32 non-unicode
;
a.s = "Test ABAB"
Debug a
*p = @a
Repeat
  b.b = PeekB(*p)
  If b = 'A'
    PokeB(*p,'B')
  EndIf
  *p = *p+1
Until b = 0
Debug a

The top block uses a regular integer, the bottom a pointer. As you can see, the pointer is treated and used like a normal integer. The only advantage in this case is that it's easier to recognize it's something that points to some spot in memory. Of course, this isn't the best way to use a pointer. Don't worry, it's coming up! But, as I'm inconsistent as hell, here's a sneak preview...

Code:

; win32 non-unicode
;
a.s = "Test ABAB"
Debug a
*p.BYTE = @a
Repeat
  If *p\b = 'A'
    *p\b = 'B'
  EndIf
  *p+1
Until *p\b = 0
Debug a

Structured pointers (pointers to structures)

Ah, finally something fancy! For every normal variable you declare, some space is reserved in memory. When you define a pointer and type it as a struct, it does NOT reserve space in memory, instead it keeps pointing to whatever it was pointing at, but we can use the structure fields to access the data in memory... Sounds complex? It's rather easy...
Here's an example:

Code:

; survival guide 5_4_440 structured pointers
; pb 4.40b3 win32 non-unicode
;
Structure sample
  a.l       ; +0 4 bytes
  b.w       ; +4 2 bytes
  c.b       ; +6 1 byte
  d.f       ; +7 4 bytes
  e.s       ; +11 4 bytes (on 32 bits Windows)
  f.b[10]   ; +15 10 bytes
EndStructure
;
x.sample
;
x\a = 1
x\b = 2
x\c = 3
x\d = 4.571253
x\e = "test"
;
*p.sample = @x      ; this is what it's all about
;
Debug x\b           ; 1. read the value of x\b...
Debug PeekW(@x\b)   ; 2. or... read the value of x\b...
Debug PeekW(@x+4)   ; 3. or... read the value of x\b...
Debug PeekW(*p+4)   ; 4. or... read the value of x\b...
Debug *p\b          ; 5. or... read the value of x\b :-)

Look at the code. First we define a structure. Then we create a variable x with that structure. At that moment a block of 25 bytes will be set aside for all parts of x.sample. Following that, we store some information in those fields. Then we declare a pointer *p, and we let it point to the place in memory where the fields of x reside using @x. We also tell the compiler we expect to find a structure of type 'sample' on that spot in memory. Now let's see how we can read the value from x\b...

1. We can use the value of x\b directly:

Code:

Debug x\b           ; 1. read the value of x\b...

2. @x\b returns the spot where that field is stored in memory, so we can read that spot using:

Code:

Debug PeekW(@x\b)   ; 2. or... read the value of x\b...

3. We know the field \b is located at @x+4 (see the code above). So we can read that point in memory using:

Code:

Debug PeekW(@x+4)   ; 3. or... read the value of x\b...

4. The pointer's value is actually the location of x, we know where the field \b is located, so we can read the right spot:

Code:

Debug PeekW(*p+4)   ; 4. or... read the value of x\b...

5. But... why would we keep track of the exact spot where x\b is located? If we changed the struct we might have to make changes all throughout our code, so let the compiler handle it:

Code:

Debug *p\b          ; 5. or... read the value of x\b :-)

Another way to think of a structured pointer is like this: the pointer points to a specific place in memory. Each field of the structure is more or less 'mapped' onto that part of memory. Moving the pointer makes each field map to a different part in memory. By accessing the field, we are accessing that place in memory. Have another look at the little sneak preview I gave a little back...

Code:

; non-unicode
;
a.s = "Test ABAB"
Debug a
*p.BYTE = @a
Repeat
  If *p\b = 'A'
    *p\b = 'B'
  EndIf
  *p+1
Until *p\b = 0
Debug a

This structured pointer has a single field 'b'. By moving the pointer around, we move the place where that field maps to around. And then we can do nasty things to the memory it is mapped to...

Pointers are a very strong concept and a powerful tool. Make sure you understand them before claiming to be an experienced programmer :-)

Standard structures

PureBasic provides many pre-defined structures (hit [Alt] + [S] in jaPBe). A few special ones are of extra interest:

Code:

; non-unicode
;
a.s = "ABCD"
;
*x.BYTE = @a
*y.WORD = @a
*z.LONG = @a
;
Debug *x\b
Debug PeekB(@a)
;
Debug *y\w
Debug PeekW(@a)
;
Debug *z\l
Debug PeekL(@a)

These come in handy as replacements for peek / poke, and when dealing with procedure parameters 'by reference'.

Code:

*a.BYTE      ; use *a\b
*b.WORD      ; use *b\w
*c.LONG      ; use *c\l
*d.INTEGER   ; use *d\i
*e.FLOAT     ; use *e\f
*f.DOUBLE    ; use *f\d
; *g.STRING  ; no, sorry, there's no *g\s
*h.QUAD      ; use *h\q

Remember: A pointer points to a spot in memory where something is stored. When we define the pointer, we tell PureBasic what type of variable we expect on that spot in memory. In other words, the length of a pointer itself is always the same! 32 bits (4 bytes) on win32, and 64 bits (8 bytes) on win64, regardless of what it points to.
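The 'moving the pointer moves the mapping' idea also works on things bigger than a byte. Here's a minimal sketch that walks through an array of longs with a *p.LONG pointer. The array 'value' and the variable 'total' are made up for the occasion, and I'm assuming (as PureBasic does for plain numeric arrays) that the elements sit next to each other in memory.

```purebasic
; sketch only: names are made up, elements assumed to be stored back to back
Dim value.l(3)
value(0) = 10
value(1) = 20
value(2) = 30
value(3) = 40
;
*p.LONG = @value(0)    ; map the field \l onto the first element
total.l = 0
For n = 0 To 3
  total + *p\l         ; read whatever the field is mapped onto right now
  *p + SizeOf(LONG)    ; move the mapping one long further
Next n
Debug total            ; 100
```

Note the use of SizeOf(LONG) instead of a hard-coded 4: the pointer itself may be 4 or 8 bytes, but the step size depends on what it points to, not on the pointer.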
Pointers inside structures

Pointers are somewhat inconsistent in structures, as you have to use the asterisk inside the structure when defining them, but you can't use the asterisk outside the structure as part of the field name... Huh? Yep. Really.

Code:

Structure xx
  *p.LONG     ; see? a pointer is defined within the structure
EndStructure
;
l.l = 1234    ; create a long
v.xx          ; create a variable of type 'xx'
v\p = @l      ; set the 'pointer' field 'p' to the address of 'l'
Debug v\p\l   ; ... and display its contents

A pointer that is a structure field loses its preceding asterisk, but still acts as a pointer.

Pointers to strings

Again: a pointer points to a spot in memory where something is stored. When we define the pointer, we tell PureBasic what type of variable we expect on that spot in memory. You can define a pointer to a string using

Code:

*g.STRING

But you CAN NOT manipulate that string in any normal way. Remember: PureBasic manages all string related memory, and with zero terminated strings the place where they are and the space they occupy may vary. If we work directly on the string (using a regular variable such as 'a.s' or a structure string field) then PureBasic still knows what's going on. If we access a string directly in memory then we could run into problems... What if PureBasic decides to mess around with memory and clear out unused memory because some strings became smaller? Ouch! (This kind of background activity is called 'garbage collection' by the way, and is fully automatic in PureBasic.) This is why there is no way to do this:

Code:

a.s = "ABCD"
*g.STRING = @a
*g\s = "DEFG"   ; error, this won't be accepted!

A few basic rules when accessing (zero terminated) strings directly in memory:

• never change the length of the string
• do not write (terminating) zeroes into them
• find the address and immediately do something with it
• careful with string functions that may (or will!) cause garbage collection

Byref vs. Byval

In many languages parameters can be passed on to a procedure, either 'by value' or 'by reference'. By value means the parameter itself is not passed but its value is. When inside the procedure, the original variable passed is not changed. Here's an example:

Code:

Procedure bv(x.l)   ; every parameter in PureBasic is passed by value (ByVal)
  x = 2
  Debug x
EndProcedure
;
z.l = 1
Debug z
bv(z)
Debug z

The above will show 1, 2 and 1. The value of z was passed to the procedure, and put in a new variable x. Changing x inside the procedure doesn't affect z.

In some languages you can pass parameters by reference. That simply means that changes inside the procedure would affect the variable outside. In PureBasic this is not possible, so the following code WILL NOT WORK:

Code:

Procedure br(ByRef x.l)   ; this will not work!
  x = 2
  Debug x
EndProcedure
;
z.l = 1
Debug z
br(z)
Debug z

Basics that would support the ByRef keyword would show 1, 2 and 2.

A procedure can normally only return one value using ProcedureReturn. But what if you want to return more than one value, or if you want to change the variables you used as parameters? The answer is: use a pointer.

Code:

Procedure br(*x.LONG)   ; let's simulate a ByRef
  *x\l = 2
  Debug *x\l
EndProcedure
;
z.l = 1
Debug z
br(@z)
Debug z

Instead of passing the value of variable z, we pass the address where it is located. Inside the procedure we don't use the address, but what is located on that address. (See 'Standard structures' above for the LONG structure being used.) See also the part on local vs. global variables.

As of PB v4.00 arrays and linked lists can be global or local in PureBasic.

Code:

Global Dim a.l(10,10)
Global NewList b.l()

Arrays and linked lists are passed by reference, not by value! Please note: in older versions of PureBasic arrays and linked lists were global. That's no longer the case. Keep in mind that you can pass them to a procedure, but they are passed by reference, not by value.
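To put the two ideas side by side, here's a short sketch: one procedure that hands back two results through pointers, and one that receives a linked list by reference. All names (divide, fill, numbers, and so on) are made up for the example.

```purebasic
; sketch only: procedure and variable names are hypothetical
;
; two results are 'returned' through structured pointers
Procedure divide(a.l, b.l, *quotient.LONG, *remainder.LONG)
  *quotient\l  = a / b
  *remainder\l = a % b
EndProcedure
;
; a list is handed over by reference, note the List keyword
Procedure fill(List target.l())
  AddElement(target())
  target() = 42
EndProcedure
;
q.l
r.l
divide(17, 5, @q, @r)   ; pass the addresses, not the values
Debug q
Debug r
;
NewList numbers.l()
fill(numbers())         ; the procedure works on our list, not on a copy
Debug ListSize(numbers())
```

For plain variables you do the 'by reference' work yourself with @ and a structured pointer, for lists (and arrays) PureBasic does it for you.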
See also here for how to use them as parameters in procedure calls and the use of the Array and List keyword in the procedure definition.

IDE Structure Viewer

The PB IDE has a built-in structure viewer, which allows you to quickly find all fields for a predefined structure. Take for example the .RECT structure. Start the PB IDE and hit [Alt] + [S] which will start the structure viewer. Type 'rect' followed by [Enter] in the bottom text box, and it will show you all fields of the .RECT structure.

There are many predefined Windows structures PureBasic is already aware of, which saves a lot of declaring. If we want to use the .RECT structure we can simply use it like this:

Code:

r.RECT

... and then fill in the fields. Of course, we're lazy and we will let PureBasic list all fields for us. Move the cursor to an empty line and start the structure viewer again using [Alt] + [S]. Find the fields for the 'rect' structure, then click on 'insert'. The structure viewer will ask you for a name, enter 'q' and the following will be inserted into the IDE:

Code:

q.rect
q\left =
q\top =
q\right =
q\bottom =

It's then an easy job to fill in the fields. Unfortunately this doesn't work for the structures we've built ourselves. External tools such as CodeCaddy may help a little here.

5.5 Linked lists

Linked lists are, well, linked lists :-) Think about them as a train, each wagon is connected front and end to other wagons in the same train. You can move forward or backward through the string of wagons, until you reach the front or the end of the train. You can insert new wagons, and delete existing ones. Keep in mind that it is faster to 'walk' through a linked list using FirstElement(), NextElement() and PreviousElement() than it is to select subsequent elements with SelectElement().

Well, that was pretty much not very helpful :-) Let's try it again. (And don't forget to check out the PureBasic helpfile / function...)

Let's start over again. A linked list is a train, a chain of wagons.
Let's say each wagon, also known as an 'element', is an .i integer. (You could just as well store full complex structures in each wagon aka. field.) Each wagon has a number, so by specifying the wagon number we can find our data back. (A sort of 'pointer' or 'index', similar to the ones used in arrays.) Of course, at the start the train is empty. There are no wagons. Well, there's pretty much no train :-)

Code:

NewList x.i()

Okay. There's our train. Now let's add some wagons.

Code:

NewList x.i()
;
AddElement(x())
AddElement(x())
;
Debug ListSize(x())

Yep. That's two wagons. The AddElement() instruction adds an empty field either at the start of the list (if it's an empty list) or just after the currently selected wagon / field. And then makes the new one the active one. (Okay, I'll drop the wagon analogy now, it's stupid and you got it by now, I hope :-))

Here's another example, and we're going to store some data in the list then retrieve it.

Code:

; survival guide 5_5_200 linked lists
; pb 4.40b3
;
Structure load
  animal.s
  number.l
EndStructure
;
NewList wagon.load()
;
AddElement(wagon())        ; a new and empty list, so we'll create the first element in this list
wagon()\number = 6
wagon()\animal = "dog"     ; the first element of the list contains 6 dogs :-)
;
AddElement(wagon())        ; ah, another wagon, immediately after the first one
wagon()\number = 1
wagon()\animal = "cat"     ; and this one will contain a cat
;
n = ListSize(wagon())
;
For nn = 0 To n-1
  SelectElement(wagon(),nn)
  Debug "wagon: "+Str(nn)
  Debug "number: "+Str(wagon()\number)
  Debug "animal: "+wagon()\animal
  Debug ""
Next nn

Hmmm. Too easy. Let's use the same train, but we'll add a wagon with elephants in the middle, and then we go fancy and go through the list in a different manner, just for kicks. We're one hell of a cool train driver / circus director...
Code:

; survival guide 5_5_210 linked lists
; pb 4.40b3
;
Structure load
  animal.s
  number.l
EndStructure
;
NewList wagon.load()
;
AddElement(wagon())             ; a new and empty list
wagon()\number = 6
wagon()\animal = "dog"          ; the very first element of the list
;
AddElement(wagon())             ; ah, another wagon
wagon()\number = 1
wagon()\animal = "cat"          ; and this one will contain a cat
;
n = ListSize(wagon())
;
SelectElement(wagon(),1)        ; we'll move to wagon 1
InsertElement(wagon())          ; and insert a wagon directly in front of it
wagon()\number = 2
wagon()\animal = "elephant"     ; and we'll put two elephants in it
;
ResetList(wagon())              ; move before the front
While NextElement(wagon())      ; move to the next element
  Debug "wagon: "+Str(ListIndex(wagon()))   ; where are we?
  Debug "number: "+Str(wagon()\number)      ; how many animals?
  Debug "animal: "+wagon()\animal           ; what kind of animals?
  Debug ""
Wend

Sometimes arrays are better. Sometimes linked lists are better. It's a matter of application and preference. Please note that linked lists in procedure calls are passed by reference, and check out the use of the List keyword in the procedure definition. (And yeah, I plead guilty, I kept using that train as an example :-))

ForEach / Next

ForEach / Next allows you to quickly affect all elements in a list. See below for a comparison with a regular While / Wend and NextElement().

Code:

; survival guide 5_5_220 foreach next
; pb 4.40b3
;
Structure specs
  name.s
  x.l
  y.l
  z.l
  color.l
EndStructure
;
NewList star.specs()
;
; the universe contained 100 white stars
;
For n = 1 To 100
  AddElement( star() )
  ;
  With star()
    \x = 10
    \y = 10
    \z = 10
    \color = RGB(255,255,255)
  EndWith
  ;
Next n
;
; then, one day, they all changed into red giants...
;
ResetList(star())               ; why walk through all elements...
While NextElement(star())
  star()\color = RGB(255,0,0)
Wend
;
; and finally turned to blue
;
ForEach star()                  ; if you can do it all at once?
  star()\color = RGB(0,0,255)
Next

I was hoping I could avoid linked lists...
but I couldn't :-) And the new With / EndWith instructions of PB v4.00 make some very nice things possible... Sometimes, just sometimes, linked lists are soooo nice. Just keep in mind that they are a little slower than an array.

Please note: arrays and linked lists are passed by reference, not by value. See also here for how to use them as parameters in procedure calls.

5.6 Maps (hash tables)

Maps or hash tables are one dimensional tables of unique elements. Each element in the table has a unique name (the 'key'), and can be quickly found back by that name. Don't count on any particular order though, as the elements are stored by hash and not in the order you added them. With arrays or linked lists you have to do the lookup yourself, and in arrays and linked lists you can store duplicates. Maps can contain regular variables or structures. Here's a simple map:

Code:

; 4.40b1
;
NewMap flight.s()
flight("195") = "arrived"
flight("127") = "boarding"
flight("195") = "unloading"
flight("127") = "gate closed"
Debug MapSize(flight())

As you can see, map elements are automatically added and updated. The element with name "195" didn't exist, so it's added and set to "arrived". A bit later we overwrite that with "unloading".

Hash tables are not just simple lists or arrays. A hash table (internally) doesn't even look things up by the string itself! What it does is it creates a 'hash' (a number computed from the contents of the key), then it looks in the table if an element for that hash already exists.
For all practical purposes that doesn't make any difference in using those tables though :-) Maps can also contain structures, as seen here:

Code:

; survival guide 5_6_300 maps aka hash tables
; 4.40b1
;
Structure info
  name.s
  age.l
EndStructure
;
NewMap employee.info()
employee("Hans")\name = "Niks"
employee("Hans")\age = 20
employee("Willem")\name = "van Zanten"
employee("Willem")\age = 53
employee("Hans")\name = "Jansen"
employee("Hans")\age = 30
;
Debug "find one element"
FindMapElement(employee(),"Hans")
With employee()
  Debug \name
  Debug \age
EndWith
;
Debug "list all elements"
ResetMap(employee())
While NextMapElement(employee())
  With employee()
    Debug \name
    Debug \age
  EndWith
Wend

It's important to realize map keys are unique, so although at first glance we seem to add three employees to the list, we're actually only adding two... The second "Hans" is replacing the first... In those situations it may be better to use a linked list or array.

Note that when passing maps as procedure parameters you need to precede them with the keyword Map in the procedure definition.

When / why / where use maps?

• alphanumeric elements (the key can be anything)
• auto updating
• no need for look ups through the table
• keys are unique

When / why / where not to use maps?

• if the key element may exist more than once (use list or array)
• if the resulting map needs to have more than one dimension
• if the elements need to be in a particular order, or sorted in more than one way
• if you do not want to convert a 'numeric' key to an alphanumeric one (using Str() or something similar)

Hash tables aka maps might even be easier to use than good ol' arrays'n'lists...

5.7 Threads

Windows not only allows you to run multiple programs at the same time ('processes') but it also allows a single program to 'spawn off' parts of itself, parts that do something in the background then get deleted after they're done.
For example one part of your program could be updating the screen whilst another part is busy analyzing information. These are called 'threads'.

Obviously, trying to access the same thing twice at the same time can be a dangerous hobby (two people grabbing the same, last can of beer just after all shops have closed)... PureBasic has a compiler option 'threadsafe' that will help fixing up the regular PureBasic keywords / commands, but it's still the user that has to carefully think over how, when and where to use threads. (Hey, what's new. Doesn't fix the beer problem either :-)) PureBasic has no 'Critical Section' command, but you can achieve the same effect using a mutex.

Use the threadsafe option when using threads. Bugs are hard to find and fix without it.

You just gotta' love the Internet, and the chance it gives to people to voice a different opinion. I was working on database stuff using the built-in commands, and followed a link to the SQLite page which in turn led me here: http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf. So, threads are evil, huh? Let's use them! :-)

Creating and managing threads

This little example will have two threads. First there's the regular program, which sends messages to the debugger window, then there's a created thread which does exactly the same thing. Watch the output.

Code:

Procedure thread1_procedure(parameter.l)
  For n = 1 To 200
    Debug "thread1 "+Str(n)
  Next n
EndProcedure
;
thread1_nr.l = CreateThread(@thread1_procedure(),0)
For n = 1 To 100
  Debug "main "+Str(n)
Next n

A thread will continue running until it is either done (and hits the 'EndProcedure' statement), until its parent is done, or until it's killed. You can pass a single parameter to the thread, but you cannot return one. CreateThread() returns a number which can be used to identify the different threads.
This result is used for manipulating existing threads:

• IsThread() - check if a thread exists
• KillThread() - kills a thread
• PauseThread() - pauses a thread
• ResumeThread() - resumes a paused thread
• WaitThread() - waits until the specified thread is done, or a specified time has passed
• ThreadPriority() - retrieve or change the priority of a thread

Synchronizing threads

To avoid problems when two threads try to access the same data you can use a Mutex object. Once a mutex is created, each thread can try and 'lock' it. If some other thread tries to lock the mutex, that thread will have to wait until the mutex is no longer 'locked'.

In the next example two threads are created, and both send out some information to the debugger. Without the LockMutex() both threads will have the same chance of sending characters, meaning you could get any combination of 'A' and 'B's.

Code:

; survival guide 5_7_100 threads
; pb 4.40b3
;
Global mutex1
mutex1 = CreateMutex()
;
Procedure thread1_procedure(x.l)
  Repeat
    ; LockMutex(mutex1)
    Debug Chr(x)
    Debug Chr(x)
    Debug Chr(x)
    ; UnlockMutex(mutex1)
  ForEver
EndProcedure
;
CreateThread(@thread1_procedure(),'A')
CreateThread(@thread1_procedure(),'B')
;
For n = 1 To 10
  Delay(100)
Next n

If you run the above code again, this time uncommenting the LockMutex() and UnlockMutex() lines, you will see that each thread finishes its block of three characters before letting the other thread do its job. Why? Whilst the first thread keeps the mutex locked, the second thread stays on hold on the LockMutex() line. Then the moment the mutex becomes available due to the UnlockMutex() in the first thread, the second thread immediately grabs and locks the mutex, thus blocking the first thread. And then it's all the other way around...

LockMutex() halts the current thread until the mutex is free, whilst TryLockMutex() does not wait: it simply tells you whether it got the lock or not. Both need UnlockMutex() to unlock the mutex and make it available again.
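Here's a small sketch of the difference between the two, using TryLockMutex() in the main code while a thread holds the mutex, and WaitThread() to tidy up afterwards. The procedure name 'worker' and the delays are made up, and which of the two messages you get depends on timing, so don't take the outcome for granted.

```purebasic
; sketch only: 'worker' and the delay values are hypothetical
Global mutex1 = CreateMutex()
;
Procedure worker(*dummy)
  LockMutex(mutex1)
  Delay(200)                 ; pretend to be busy whilst holding the mutex
  UnlockMutex(mutex1)
EndProcedure
;
thread = CreateThread(@worker(), 0)
Delay(50)                    ; give the worker a chance to grab the mutex
;
If TryLockMutex(mutex1)      ; does not wait, just reports success or failure
  Debug "got the mutex"
  UnlockMutex(mutex1)
Else
  Debug "mutex is busy, doing something else instead"
EndIf
;
WaitThread(thread)           ; wait until the worker is done
FreeMutex(mutex1)
```

Use LockMutex() when the thread has nothing better to do than wait, and TryLockMutex() when it could do something useful in the meantime (keeping a window responsive, for example).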
Variable scope in threads

Threads follow the same rules as normal code when it comes to variable scope, so if a thread calls a procedure, all variables local to that procedure stay local. If two threads both call that procedure at the same time, that procedure will run twice, each with its own set of local variables.

Threads can access global variables as well, but so can other threads, and so can your main code. If you want a thread to have its own set of global variables (so global throughout all procedures but restricted to your thread) you need the keyword Threaded. Threaded works like Global but restricted to its own thread.

; survival guide 5_7_200 threaded
; pb 4.40b3
;
Threaded s.s = "start value"
;
Procedure change(*parameter)
  s = "thread "+Str(*parameter)
EndProcedure
;
Procedure thread(*parameter)
  Debug "*** thread "+Str(*parameter)
  Debug s
  change(*parameter)
  Debug s
EndProcedure
;
Debug "*** main"
Debug s
s = "main"
Debug s
;
CreateThread(@thread(), 1)
CreateThread(@thread(), 2)
;
MessageRequester("","Continue",#PB_MessageRequester_Ok)
;
Debug "*** main"
Debug s

As you can see, the variable s.s is global within each thread: it's changed from within the procedure change(), yet that does not affect s.s in the main code or in any other thread.

Windows mutex vs. PureBasic mutex

Note that the CreateMutex_() API and PureBasic 4's native CreateMutex() command are different beasts (though they can be used to achieve the same result): the CreateMutex() command deals with the management of (Windows) critical section objects rather than (Windows) mutex objects. In other words: whilst the PureBasic mutexes are restricted to a single process (multiple threads belonging to a single program), you could use WinAPI for interprocess communications (multiple programs) but at a price... the WinAPI mutexes are much slower!
; survival guide 5_7_300 mutex
; pb 4.40b3
;
; - creation and removal of mutexes
; - CreateMutex_()
; - CloseHandle_()
;
; note: purebasic 4.xx CreateMutex() command is not a wrapper for the Windows API CreateMutex_() !
;
#MUTEX_ALL_ACCESS = $1F0001
#WAIT_TIMEOUT     = $102
#WAIT_ABANDONED   = $80
#WAIT_FAILED      = $FFFFFFFF
#WAIT_OBJECT_0    = $0
;
; *** opening and closing a mutex
;
; mutexes are unique single objects that can be 'owned' by applications, they are exclusive and can never be owned
; by two different owners at the same time, thus 'mutually exclusive' or 'mutex'
;
; a return value of 0 means something went wrong with the creation of the mutex
;
mutex_name.s = "test"
;
mutex_h = CreateMutex_(0,0,@mutex_name)
If mutex_h = 0
lasterror = GetLastError_()
Debug "not created"
EndIf
;
; a mutex will be destroyed if nobody owns it, and 1. the creator exits, or 2. a closehandle closes it
; as usual the best way is to properly close a mutex if you no longer use it :-)
;
; one way to check for the existence of a mutex is opening it, in which case it returns a handle to the mutex
; if the mutex didn't exist (yet) or was destroyed via CloseHandle_() this will return 0
;
mutex2_h = OpenMutex_(#MUTEX_ALL_ACCESS,0,@mutex_name)
If mutex2_h = 0
Debug "could not open"
Else
CloseHandle_(mutex2_h)
EndIf
;
; trying to create a mutex that already exists does not create a second one, it returns a handle to the
; existing mutex, and GetLastError_() then reports #ERROR_ALREADY_EXISTS
;
; (this mechanism can be used to create 'single instance' software, where a new instance will exit as it already
; detects another version of itself in memory, by trying to create such a mutex and exiting when it already
; existed)
;
mutex3_h = CreateMutex_(0,0,@mutex_name)
lasterror = GetLastError_()
If lasterror = #ERROR_ALREADY_EXISTS
Debug "already existed"
EndIf
If mutex3_h
CloseHandle_(mutex3_h)
EndIf
;
; although windows seems to clean up abandoned mutexes, it's better to clean them up properly
;
CloseHandle_(mutex_h)
;
;
;
; *** mutex ownership
;
; only one instance (program) can 'own' a mutex, ownership can be established upon creation
; (the second parameter when set to '#True' makes the creator an owner)
; other threads / instances would open the existing mutex using OpenMutex_()
;
;   mutex4_h = OpenMutex_(#MUTEX_ALL_ACCESS,0,@mutex_name)
;
mutex4_h = CreateMutex_(0,#True,@mutex_name)
;
; this program (the current instance, the code you're reading :-)) can try to get ownership of the mutex
; by calling WaitForSingleObject_() with a proper handle and a timeout parameter
;
; WaitForSingleObject_() will wait the specified nr. of milliseconds before continuing
;
result = WaitForSingleObject_(mutex4_h,50)
Select result
Case #WAIT_FAILED
Debug "failed"
Case #WAIT_ABANDONED
Debug "other process abandoned mutex"
Case #WAIT_TIMEOUT
Debug "other process still owns mutex"
Case #WAIT_OBJECT_0
Debug "success"
EndSelect
;
; 'abandoned' is a special case, it means the mutex was found but the previous owner is gone now, and you
; have now become the owner of the mutex
;
; if you no longer need to own a mutex, releasing is easy
;
ReleaseMutex_(mutex4_h)
;
; using WaitForSingleObject_() and ReleaseMutex_() makes it possible to synchronize two applications with each
; other
;
; finally, when we're all done and about to exit the program we close the mutex
;
CloseHandle_(mutex4_h)
;
; a final note (thanks netmaestro!): it's worth noting at this point that despite their similar names,
; the CreateMutex_() API and PureBasic 4's native CreateMutex() command are very different: the CreateMutex()
; command deals with the management of critical section objects rather than (Windows) mutex objects
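
The 'single instance' trick mentioned in the comments above could look roughly like the sketch below. It is a Windows-only illustration, not a polished implementation: the mutex name is made up (pick something unique for your own program), and the constant is defined here only in case it isn't known yet (183 is the Windows value of ERROR_ALREADY_EXISTS).

```purebasic
; sketch only, Windows only
#ERROR_ALREADY_EXISTS = 183
;
instance_name.s = "survival_guide_single_instance"   ; hypothetical name
;
instance_h = CreateMutex_(0,0,@instance_name)
lasterror = GetLastError_()                          ; read it right after the call
;
If instance_h = 0
  Debug "could not create mutex"
ElseIf lasterror = #ERROR_ALREADY_EXISTS
  Debug "another instance is already running"
  CloseHandle_(instance_h)                           ; we only got a handle to the existing mutex
  End
Else
  Debug "first instance"
  MessageRequester("","Start me again while this box is open",#PB_MessageRequester_Ok)
  CloseHandle_(instance_h)
EndIf
```

The first instance keeps its handle open for as long as it runs, so any later instance finds the mutex already there and exits.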

5.8 Compiler directives

The If / Else / EndIf and Select / Case / Default / EndSelect commands are processed during the execution of your program. They allow 'flow control' during execution of your code.

There is a set of similar instructions that tell the compiler to compile certain parts of your code, depending on a condition. Here's an example. First run this from within the IDE (with the debugger on), then compile it to an executable and run that .exe file.

; survival guide 5_8_100 compiler directives
; pb 4.40b3
;
; run this with and without debugger
;
w_main_width = 200
w_main_height = 200
;
;
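CompilerIf #PB_Compiler_Debugger = 1
OpenWindow(1,10,10,w_main_width,w_main_height,"CompilerIf: debugger on",#PB_Window_SystemMenu)
CompilerElse
OpenWindow(1,10,10,w_main_width,w_main_height,"CompilerIf: debugger off",#PB_Window_SystemMenu)
CompilerEndIf
;
If #PB_Compiler_Debugger = 1
OpenWindow(2,250,10,w_main_width,w_main_height,"If: debugger on",#PB_Window_SystemMenu)
Else
OpenWindow(2,250,10,w_main_width,w_main_height,"If: debugger off",#PB_Window_SystemMenu)
EndIf
;
Repeat
event = WaitWindowEvent()
Until event = #PB_Event_CloseWindow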
Yes. You're not seeing double, but you may wonder what the difference is between CompilerIf and the regular If. Well, that's easy. If the compiler finds an 'agreeable condition' (try that one on your wife / girlfriend / partner m/v :-)) it will include that section in the compiled program. Otherwise it leaves that section out entirely.

Let's see what the compiler makes of the next little sample:

CompilerIf #PB_Compiler_Debugger = 1
a = 1
CompilerElse
a = 5 ; ***
CompilerEndIf
;
If #PB_Compiler_Debugger = 1
a = 1
Else
a = 5
EndIf
First, what would the above turn into with the debugger on?
a = 1
;
If 1 = 1
a = 1
Else
a = 5
EndIf
Wow. What happened to the line marked with '***'? It's gone! It's not even included in the final code, but the lines belonging to that regular 'If' are. Let's try that again, now with the debugger off...
a = 5 ; ***
;
If 0 = 1
a = 1
Else
a = 5
EndIf
So, the compiler directive CompilerIf allows us to include or exclude parts of the code, depending on the value of a constant. Remember that! Only constants (stuff starting with #).

You could, for example, write a procedure that has OS dependent, or CPU dependent, or even compiler version dependent parts. This will make it easier to develop and maintain different versions of your program. PureBasic offers you a whole set of constants. Place the cursor on CompilerIf and hit [F1]. At the bottom of that page you will find a list of constants.
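
As a rough sketch of what that could look like (the variable names os_name and cpu_name are made up for this example, the constants are the ones listed in the help file):

```purebasic
; sketch only: pick code depending on OS and CPU at compile time
CompilerSelect #PB_Compiler_OS
  CompilerCase #PB_OS_Windows
    os_name.s = "Windows"
  CompilerCase #PB_OS_Linux
    os_name.s = "Linux"
  CompilerCase #PB_OS_MacOS
    os_name.s = "MacOS"
  CompilerDefault
    CompilerError "unsupported OS"      ; stops the compilation with this message
CompilerEndSelect
;
CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
  cpu_name.s = "x64"
CompilerElse
  cpu_name.s = "not x64"
CompilerEndIf
;
Debug os_name+" / "+cpu_name
Debug #PB_Compiler_Version              ; compiler version as a number
```

Only one branch of each block ends up in the executable, so code for the other platforms doesn't even have to compile on the platform you're on.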

Check out the help file for the (somewhat) related instructions, such as CompilerSelect, CompilerError and EnableExplicit.

Not a compiler directive as such, but very useful in combination with them is the PureBasic constant #PB_Editor_CreateExecutable under Compiler / Compiler Options / Constants.

5.9 Command line options

Portable

The help file lists a number of command line options for the IDE. Good to know (for those running with restricted user rights on corporate machines, running from a USB stick etc.) is that PureBasic is a 'portable' program. Simply starting up the IDE with the additional parameter /PORTABLE will keep everything out of the registry, and will keep all configuration files inside the PureBasic directory.

PUREBASIC.EXE /PORTABLE

Command line compiler

I'm not entirely sure it's actually listed in the index of the PureBasic help file (it probably is, I just need better glasses :-)). Open the PureBasic help file [F1] and search for 'command line compiler'. There's a whole section dedicated to the parameters that you can pass on to the compiler. Some of them are quite interesting, but most of us will be satisfied with the regular options accessible from within the IDE under Compiler / Compiler Options.

(Better glasses indeed, it's listed under General Topics in the help file... sigh.)

5.10 Macros

Macros allow fancy and complex constructions, but they should be used with care: it is very easy to build overly complex code using (too) many (too) complex macros, making code hard to read and maintain. Always ask yourself: can I still read this code in two years' time? IMHO it is often a better choice to use a Procedure if you do not really need a macro.

Then again, macros allow things that are impossible otherwise, and they may speed up your code, so it's worth spending some time on them.

Without parameters

Here's an example macro:

Macro example1
a = 1
EndMacro
;
a = 2
example1
Debug a
The code within a macro definition is inserted at the place of the macro call. When you compile the above, PureBasic turns it into:
;
a = 2
a = 1
Debug a
Here's another example:
Macro example2
"test"
EndMacro
;
Debug example2
Which is turned into:
;
Debug "test"
Macros do not apply to literal strings, as the following example shows:
Macro example2
TEST
EndMacro
;
Debug "example2"
... which results in:
;
Debug "example2"

With parameters

You can pass parameters to a macro, as the following example shows:

Macro example3(var1,var2)
var1 = var1+var2
EndMacro
;
a = 1
b = 2
example3(a,b)
Debug a
Whatever you pass as var1 and var2 will replace those occurrences in the macro code, and subsequently the whole macro will be inserted in your code. The above would result in this:
;
a = 1
b = 2
a = a+b
Debug a
Note that you do not specify a type for macro parameters. Everything you pass on is passed on literally, as the following example illustrates:
Macro example3(var1,var2)
var1 = var1+var2
EndMacro
;
example3(a.i,b.l)
Debug a
The above turns into:
;
a.i = 1
b.l = 2
a.i = a.i+b.l
Debug a
As you can see, each occurrence of 'var1' is replaced with 'a.i' and so on.
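
That literal replacement can bite you when you pass an expression instead of a single variable. A small sketch (the macro names twice_bad and twice_ok are made up for this example):

```purebasic
; sketch only: parameters are pasted in literally, brackets are NOT added for you
Macro twice_bad(x)
  x*2
EndMacro
;
Macro twice_ok(x)
  (x)*2
EndMacro
;
Debug twice_bad(1+2)   ; becomes 1+2*2, which shows 5
Debug twice_ok(1+2)    ; becomes (1+2)*2, which shows 6
```

So when a macro parameter is used in a calculation, it's probably wise to wrap it in brackets inside the macro.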

Smart substitution and concatenation

The following example shows two more interesting aspects of macros: concatenation and smart substitution. Inside the macro definition you can 'glue' different parameters together using the '#' symbol. The compiler will not replace parts within a literal string, and it will not replace the 'b' from 'bug' in line 4... Well, perhaps smart is not the right word, 'literal' would be better... 'bug' is NOT the same thing as 'b', and if there's one thing computers are extremely good at, it's being literal (it's a major cause of faults :-)... but hey, even most girlfriends can be a little too literal ;-)).

Macro x(a,b,c)
Debug "abc"
Debug "a#b#c"
a#bug 5
a#b c
Debug c
De#b c
b#c#a = 1
EndMacro
;
bug5de = 0
x(de,bug,5)
Debug bug5de
The parameter 'a' is replaced with 'de', 'b' is replaced with 'bug' and 'c' with '5'. The # symbol glues parts left and right together... Thus, the code above will be turned into this:
;
bug5de = 0
Debug "abc"
Debug "a#b#c"
debug 5
debug 5
debug 5
debug 5
bug5de = 1
Debug bug5de
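
For a slightly more practical use of the '#' glue, here's a sketch that builds variable names from a parameter. The macro names make_counter and bump, and the 'Count' suffix, are made up for this example.

```purebasic
; sketch only: use '#' to build a variable name out of a macro parameter
Macro make_counter(name)
  Global name#Count.i
EndMacro
;
Macro bump(name)
  name#Count = name#Count + 1
EndMacro
;
make_counter(apple)    ; becomes: Global appleCount.i
make_counter(pear)     ; becomes: Global pearCount.i
bump(apple)            ; becomes: appleCount = appleCount + 1
bump(apple)
bump(pear)
Debug appleCount
Debug pearCount
```

Remember there can be no spaces around the '#', otherwise the parts are not glued together.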

Error checking

The syntax of a macro is NOT checked during definition. The following example will cause the compiler to complain about line 7, as this:

Macro example4
a = 1/a
pfffrt
EndMacro
;
a = 0
example4
... is turned into this:
;
a = 1/a
pfffrt
Fortunately, a popup window will show the macro and the line inside it that might have caused the error... however this is not failsafe, as the following example shows an error in line 9 though the real problem lies in line 2 inside the macro definition.
Macro example5
a = 1/a
EndMacro
;
a = 2
example5
;
a = 0
example5

Macros are great but use with caution.

5.11 Miscellaneous

Some of these subjects may have come up elsewhere, but you might have missed them. Or I might have missed them ;-) but I considered them important enough to bring up (again) here...

PB IDE Compiler options

Inside the PureBasic IDE under Compiler / Compiler Options you will find some options that may be either confusing, extremely powerful, or (probably) both at the same time :-)

The first tab Compiler Options allows you to enable or set a few important options:

PB IDE Constants

Under Compiler / Compiler Options / Constants you can tell the IDE to pass a few constants to the compiler. These constants are automatically adjusted and allow you to, for example, embed version numbering based on saves or builds. Using the #PB_Editor_CreateExecutable constant you can also detect whether your code is running as a stand alone program or was started during development with the built-in debugger.
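
A minimal sketch of how these constants might be used. The #PB_Editor_... constants only exist when they are ticked in the IDE, so this example first checks with Defined() whether they are there. The variable names are made up for this example.

```purebasic
; sketch only: the #PB_Editor_... constants exist only if enabled in the IDE
Define mode.s
;
CompilerIf Defined(PB_Editor_CreateExecutable, #PB_Constant)
  created_exe = #PB_Editor_CreateExecutable
CompilerElse
  created_exe = -1                      ; constant not enabled
CompilerEndIf
;
CompilerIf Defined(PB_Editor_BuildCount, #PB_Constant)
  build_nr = #PB_Editor_BuildCount
CompilerElse
  build_nr = -1                         ; constant not enabled
CompilerEndIf
;
Select created_exe
  Case 1
    mode = "stand alone executable"
  Case 0
    mode = "started from the IDE"
  Default
    mode = "unknown, constant not enabled"
EndSelect
;
MessageRequester("About",mode+", build "+Str(build_nr),#PB_MessageRequester_Ok)
```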

One tab further you can include version information which will show up in Windows. If you do so, it's probably best to be consistent and make it match your build numbers etc. :-)

Library Subsystem

I will have to rework this section, as there have been changes, but I'm not entirely clear on what exactly changed. More will follow once I've figured it out.

Is this all? Perhaps. Perhaps not. I just hope this writeup has been somewhat helpful for getting you on track. I certainly would have had use for this when I switched over from Gfa! Anyway, have fun with PureBasic and share your code with the rest of the PureBasic community... See ya'...

But wait! There's more! Have you checked out the next few pages? There's a little on 2D graphics, and whatever else I can cook up... Go on...