5.1 Expression evaluation
(It looks like 4.50rc2 finally fixed the broken type conversion rules! Yippie!)
PureBasic does things differently. Sometimes very differently. With the arrival of v4.00 I tried one more time to understand it, and this is what came out...
a.l = <exp>
The destination variable a.l causes <exp> to be evaluated as a long UNTIL something in the expression forces a change in type.
a.l = 10*22+PeekQ(...)
First, a.l means a long, so 10*22 is processed as a long. PeekQ() then turns the type into a quad, and finally the result is converted back to a long to fit in a.l.
a.l = 10*22+Cos(...)
Again, a.l means the type is set to long, so 10*22 is processed as a long. Cos() then turns the type into a float, and finally the result is converted back to a long to fit in a.l.
Frankly, I'm not sure anymore if PureBasic uses banker's rounding or not. My brain simply fails... just keep in mind that real-world floating point numbers cannot always be represented exactly by their binary counterpart inside the CPU (or the part inside the CPU called the FPU that does the actual calculations), so perhaps it doesn't matter that much...
But, as of 4.51, there's still an inconsistency... It looks like the FPU applies banker's rounding whilst the compiler applies 'round half up' to numeric (non-variable) expressions at compile time...
zero.f = 0
It's important to realize here that for example a function like Cos() changes the used type to a floating point. Using a floating point number such as 3.0 would do the same thing, but dividing one integer by another integer does not! That's why you get different answers in the following example. Code:
z.f = 0
Let's have a good look at the above and start with a.l...
a.l = 2/3 + 2/3 + z.f
Dividing an integer by another integer does not change the type, so the expression stays in integer mode and the result is rounded down. 2/3 in integer mode is rounded down, i.e. 0.66 becomes 0, which results in:
a.l = 0 + 0 + 0.0 = 0
The final addition of a float does change the whole type to a float, but alas... it's too late: the divisions have already been evaluated, and zero will be the outcome.
Let's try the next line:
b.l = 2/3.0 + 2/3 + z.f
Here 2 is divided by 3.0. The 3.0 turns the whole expression into a floating point evaluation, so instead of rounding down we keep the value. As we're now in floating point mode, all further parts are evaluated at least as floats, so any subsequent 2/3's will not be rounded down but kept as they are, which leads us to:
b.l = 0.66 + 0.66 + 0.0 = 1.32 = 1
When the result of the expression, 1.32, is stored in the integer variable b.l it's going to use banker's rounding; 1.32 is rounded down, so the result is 1.
The next line:
c.l = z.f + 2/3 + 2/3
When parsed from left to right, the first part of the expression encountered is a float, so the whole expression type is changed upwards to a float. Subsequent parts stay in floating point mode, so the result would be:
c.l = 0.0 + 0.66 + 0.66 = 1.32 = 1
The next line:
d.l = z.f + 2/3 + 2/3 + 2/3
The final line shows the impact of rounding again:
d.l = 0.0 + 0.66 + 0.66 + 0.66 = 1.98 = 2
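The four assignments discussed above can be collected into one small test. This is only a reconstruction from the lines quoted in the text (the form of the d.l line, with three divisions, is an assumption based on its result line). Since the conversion rules changed between compiler versions, no expected output is given; run it on your own version and compare with the walkthrough.

```purebasic
; Sketch: integer vs. float evaluation order (reconstructed, see note above)
z.f = 0

a.l = 2/3 + 2/3 + z        ; integer divisions come first
b.l = 2/3.0 + 2/3 + z      ; 3.0 switches to float early on
c.l = z + 2/3 + 2/3        ; the float variable comes first
d.l = z + 2/3 + 2/3 + 2/3  ; assumed form of the final line

Debug a
Debug b
Debug c
Debug d
```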
a.f = 10*22+Cos(z.f)
Here the a.f turns everything into floats, so 10*22+Cos(...) is processed as a float and the result is stored as a float. Note that not only Cos() itself, but also the parameters of Cos() can force a type change!
a.f = Cos(z.f)+10*22
In the above, the expression behind a.f is evaluated as a float, whilst the one behind b.f is evaluated as a double: when the .d is encountered, the whole type is again changed upwards from regular floating point .f to double precision floating point .d.
a.f = 10*22
Again, a.f means a float, so 10*22 is processed as a float. This also means you can do *some* type forcing, as follows:
z.d = 0
See? a.l means long, so we start in 'long' mode. We then run into z.d, which means we continue in double mode, so the following 12000*Cos(x) is evaluated in double mode before being turned back into a long. This also allows you to do some optimising: if you have an expression which contains different types, sorting them can speed things up:
a.l = 12000 * 22 + 10 * t.l + 10 * 22 * Cos(x)
The expression behind a.l is faster than the one behind b.l. On an Amd64-3000 the difference was roughly 6% with the debugger switched on.
a.f = 3
Notice the mathematical priority! It's NOT going to do something like ( 1 + 2 ) / 3 but it does 1 + ( 2 / 3 ). However, it will not start by typing the expression as a float, as typing still works from left to right. The next sample clearly proves that. Notice the different variable outputs...
; survival guide 5_1_100 type conversion
In the code above 2/3 would not change the type into a float, 2/a would. Here's how the calculation would actually go:
; survival guide 5_1_110 type conversion
Here is another example, using a float variable to force the type to float.
a.f = 0
And one more (all these samples were the result of trying to figure out how things work(ed) or were (are?) broken)...
; survival guide 5_1_130 type conversion
Evaluations in expressions
PureBasic is NOT C(++) and does NOT allow things like:
a = a + 5*(b=c) + 6*(c=d)
Previously, the compiler would not throw an error. As of 5.11 it does. If you're desperate for this coding style use Bool()...
a = a + 5*(Bool(b=c)) + 6*(Bool(c=d))
... or go the long way (which is easier to read and probably executes just as fast, if not faster):
If b = c
Note that expressions are evaluated left to right, but Procedure() parameters are passed on from right to left! See here for an example.
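The two styles mentioned above, Bool() and the plain If, can be put side by side. A minimal sketch with made-up values; it assumes a compiler version that already has Bool().

```purebasic
; Sketch: comparisons inside an arithmetic expression
b = 2 : c = 2 : d = 3

a = 1
a = a + 5*Bool(b = c) + 6*Bool(c = d)   ; Bool() yields 1 for true, 0 for false
Debug a                                 ; 6, i.e. 1 + 5*1 + 6*0

; the long way, same result
a = 1
If b = c : a + 5 : EndIf
If c = d : a + 6 : EndIf
Debug a                                 ; 6
```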
If you're only using built-in string functions, do not touch the strings in memory, don't care how things are written to files, and don't have to deal with other applications that do or do not use Unicode, well, then this section is not for you. Go away :-)
The rest of us, stick around. This is important.
Let's start with a few informative links...
... is the all-encompassing name for any and all systems using more than one byte to encode a character. Where good old ASCII used 8 bits / 1 byte for a maximum of 256 characters, you can now encode more than 256 characters. Note that some multi-byte encoding schemes are called... multi-byte! Keeps things wonderfully clear, doesn't it? :-)
Some well known (yeah, right) encoding schemes and character sets:
I'm going to keep things simple, and not entirely correct, as (from our simple point of view) it doesn't matter much. Unicode is a collection of characters. Period. That's all there is to it. It encompasses many different characters from many different languages. There are a few issues with Unicode (see http://www.jbrowse.com/text for nitty gritty details). However, for most of us, Unicode is good enough.
Notice that you may not have all 'characters' installed on your Windows box. In that case Unicode works, but it doesn't display properly, and you may have to add sets of characters; this is especially likely for Arabic and Asian languages.
Unicode is the character set. However, we can ENCODE it in different ways. That's where the real fun starts.
Windows does Unicode. Well it doesn't. No it does. Oh hell who knows :-) This is what I think it is, but it may be wrong... The term 'wide character' actually stands for '2 bytes'. That's obviously not the same as 'Unicode'...
Windows does the following 'wide character' encodings to represent Unicode characters: (Remember: Unicode is the character set, not the encoding mechanism.)
When you're dealing with Windows (XP), the terms wide character, UCS2, DWCS, Unicode and UTF16 are often interchanged and pretty much mean the same (to average Joe The Windows Programmer). Yet they actually all mean something different...
NT and Windows 2K are questionable. To err on the safe side I would not use Unicode on these boxes.
As for the old 16 bit software: stick to non-Unicode when writing programs for Windows 9x / Me. Unicode on those platforms is definitely unsupported.
PureBasic's (Windows') Unicode
To turn your program into Unicode, all you have to do is use the /UNICODE flag when calling the compiler (or just tick the box under Compiler Options), and all regular functions that use strings suddenly move to Unicode... now that's easy :-) However, Unicode has some impact on the strings in memory and in files. That may deserve some attention...
PureBasic Unicode strings in memory
Testing (non) Unicode
Ready? Let's start with an example... We'll find the char type .c very useful when it comes to Unicode. Here's a bit of code that shows string behaviour. Run it once in Unicode mode, and once in Ascii mode. You'll find the option to toggle between the two under Compiler / Compiler Options / Create Unicode Executable.
; survival guide 5_2_200 char
As you can see, the above will work in Unicode mode as well as in regular mode. Instead of using .BYTE structures or PeekA() we will have to switch to .CHAR or PeekC(), as well as increase counters by either 1 or 2 depending on Unicode mode... as each 'character' can be either 1 (non-Unicode) or 2 (Unicode) bytes long.
There's also a little compiler constant that helps us here: #PB_Compiler_Unicode is 1 if the program is compiled in Unicode mode, and 0 if it is compiled in non-Unicode or ASCII mode.
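A minimal sketch of how the constant and the character size can be used together. The variable names are made up for illustration; SizeOf(Character) is used so the step size follows the compile mode instead of being hard-coded as 1 or 2.

```purebasic
; Sketch: code that adapts to Unicode / Ascii mode
CompilerIf #PB_Compiler_Unicode
  Debug "compiled in Unicode mode"
CompilerElse
  Debug "compiled in Ascii mode"
CompilerEndIf

Debug SizeOf(Character)      ; bytes per character in the current mode

; walk through a string in memory, one character at a time
a.s = "test"
*p.Character = @a
While *p\c                   ; stop at the terminating zero
  Debug Chr(*p\c)
  *p + SizeOf(Character)     ; step 1 or 2 bytes, depending on the mode
Wend
```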
Using Unicode affects all string functions, and the way strings are stored in memory. In non-Unicode mode:
a.s = "test" ; will take 5 bytes: $74 $65 $73 $74 $00
The same in 'Unicode' (actually UCS2):
a.s = "test" ; will take 10 bytes: $74 $00 $65 $00 $73 $00 $74 $00 $00 $00
The same applies to fixed length strings as well. If you use arrays of bytes (see here) in structures make sure you reserve enough bytes.
A file in UTF8 looks pretty much like a regular Ascii file, unless there are special characters in there which do not exist in the regular Ascii character set. In these cases a character in UTF8 can take up more than a single byte (up to 4 bytes under the current standard; the original design theoretically allowed up to 6).
A Unicode string in memory looks pretty much like a regular Ascii string, except it is 'interleaved' with zeroes. 'Okay' in Unicode would take up 10 bytes $ 4F 00 6B 00 61 00 79 00 00 00. The same string 'Okay' in Ascii would take up 5 bytes $ 4F 6B 61 79 00. Remember strings are zero terminated!
PureBasic has a dedicated command to determine how much space a string takes in memory:
a.s = "test"
Note that it doesn't count the terminating zeroes: in Ascii mode there is one terminating zero, in Unicode mode there are two.
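The command is not named in the text as it survives here; StringByteLength() is the PureBasic function that fits the description, so the sketch below assumes that is the one meant. The optional format flag asks for the size in a specific encoding.

```purebasic
; Sketch: how many bytes does a string take (without terminating zeroes)?
a.s = "test"

Debug StringByteLength(a)               ; depends on the compile mode
Debug StringByteLength(a, #PB_Ascii)    ; 4
Debug StringByteLength(a, #PB_Unicode)  ; 8
Debug StringByteLength(a, #PB_UTF8)     ; 4, plain Ascii characters take 1 byte each
```

When reserving a buffer, add SizeOf(Character) (or 1 or 2 bytes for an explicit format) for the terminating zero.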
Flags for PeekS(), WriteString() etc.
You can specify the string format with specific parameters.
In non-Unicode mode:
a.s = "test" ; takes up 5 bytes, 4 for 'test' and 1 for a zero
In Unicode mode:
a.s = "test" ; takes up 10 bytes, 8 for 'test' and 2 for two zeroes
This applies to WriteString(), ReadString(), PokeS() and PeekS(). Check the helpfile for more details.
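A small sketch of the format flags with PokeS() and PeekS(). The buffer size of 64 bytes is an arbitrary choice for the example; the -1 length means 'up to the terminating zero'.

```purebasic
; Sketch: writing and reading a string in an explicit format
*buf = AllocateMemory(64)
If *buf
  PokeS(*buf, "test", -1, #PB_Ascii)     ; writes 4 bytes + 1 zero
  Debug PeekS(*buf, -1, #PB_Ascii)       ; test

  PokeS(*buf, "test", -1, #PB_Unicode)   ; writes 8 bytes + 2 zeroes
  Debug PeekS(*buf, -1, #PB_Unicode)     ; test

  FreeMemory(*buf)
EndIf
```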
There's again terminology here that may be confusing. Let's clear that first...
PureBasic allows calling Windows routines (the so-called WinAPI, or 'Windows application programming interface') directly. A number of these WinAPI calls are recognized automatically.
#SM_CMONITORS = 80
WinAPI calls can be recognized by the underscore following the call, as shown above. Move the cursor over the GetSystemMetrics_() part and press F1. If you have installed WIN32.HLP or the Windows Platform SDK then you will see that the function is actually called GetSystemMetrics without the underscore.
Using WinAPI, you can do everything that is possible with Windows. The regular PureBasic gadget commands 'hide' the sometimes unfriendly calls to Windows from our eyes. I may be exploring a little WinAPI left and right... but for the moment stay with me, okay? Oh... you already left... :-)
The nice thing about PureBasic is that it automatically supports a large number of WinAPI calls, simply by using the API name followed by an underscore. In those cases, we do not have to 'open' a 'DLL'... PureBasic did that already... If a WinAPI function is not recognized we can always open and use them ourselves, if we know where to look and what to do.
PureBasic handles Unicode vs. non-Unicode for 'recognized' WinAPI functions. Try this with Unicode on and off:
MessageBox_(0,"hello world","winapi",0)
The interesting thing is: there is no WinAPI function called MessageBox... there's one called MessageBoxA and one called MessageBoxW... But that's a different story... or is it?
It isn't. In Unicode mode PureBasic turns your MessageBox_() call into a MessageBoxW() call, whilst in non-Unicode mode the MessageBoxA() function is called. But only for functions that PureBasic is natively aware of, i.e. all standard Windows functions. If you use external (third party) DLL's this may not work and you may have to figure out yourself which function you need to call.
Windows comes with a number of DLL's, and many applications bring their own as well. DLL's are files containing a number of functions that you can call from within your own program. The big advantage is obvious: sometimes you won't have to write your own code, as there may be the right DLL with the right function available to you. Using DLL's has one more big advantage: DLL's look the same to every language, to every program.
It may not be entirely useful, but sometimes it helps to know what's in a DLL... because either documentation is incorrect, or you're one of those that likes to use undocumented features...
With the arrival of v4.00 it is now possible to use Unicode. As some enterprising developers may already know, many WinAPI functions exist in two flavours: Ascii and Wide. This may not even be documented (as I found out with WINTAB32.DLL, where the documentation never mentions the existence of A and W variations).
Here's an example: (run with Unicode off)
; survival guide 5_3_100 dll functions
Run the code above and you will see a large number of functions with an A and a W variant.
There are different ways to use a function, for example:
Here's an example of CallFunction, which we could use for, for example, the WinSock functions. First open the dll...
ws_winsock_nr = 1
Once opened we find and call the function by its name...
ws_retval = CallFunction(ws_winsock_nr, "WSAStartup")
When we are done, we are supposed to close the library...
The CallFunction method above has a disadvantage: the 'lookup' of the function name at runtime will take some time. To speed things up a little, we could also store the address of the function:
*ws_wsastartup = GetFunction(ws_winsock_pbh,"WSAStartup")
GetFunction() checks if the function exists, and if so returns an address. Use that address as a parameter for CallFunctionFast() and pass the appropriate parameters.
ws_retval = CallFunctionFast(*ws_wsastartup,$101,@ws_wsadata)
Let's try that again: open a DLL, find a function, call that function.
; survival guide 5_3_400 callfunctionfast
If you run the code above in Unicode mode, it won't work properly, which is logical as we're calling the Ascii version of the function. More about that later, as we are working our way towards... Prototypes!
Note: as of 4.40b1 you can only pass 'integers' using CallFunctionFast(). If you want to pass strings directly as parameters, you need to add the '@' character to pass the memory location of that string.
CallFunctionFast(p_messageboxa,0,@"hello world 1",@"test",0) ; works on 4.40b1 and later
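The open / find / call sequence described above can be sketched as follows. This is Windows only and not the original sample: the library number 0 and the variable names are made up, and the A or W variant is picked at compile time so the string pointers match the compile mode.

```purebasic
; Sketch: open a DLL, find a function, call it (Windows only)
CompilerIf #PB_Compiler_Unicode
  name.s = "MessageBoxW"     ; wide variant for Unicode strings
CompilerElse
  name.s = "MessageBoxA"     ; Ascii variant
CompilerEndIf

If OpenLibrary(0, "user32.dll")
  *fn = GetFunction(0, name)             ; 0 if the function does not exist
  If *fn
    ; strings are passed as addresses, hence the '@'
    CallFunctionFast(*fn, 0, @"hello world", @"test", 0)
  EndIf
  CloseLibrary(0)
EndIf
```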
Prototypes and pseudotypes
Pseudotypes and prototypes... No. They're not the sleazebags hanging around your local grocery store, nor some of your aspiring wannabe-manager colleagues, or the legally blonde over there (those are more stereotypes :-))...
But then, what are they? And why would we use them? Let's do things the wrong way around, we start with how and then why...
The easiest way to think of prototypes is as 'wrappers' that turn WinAPI or DLL functions into regular procedures. In other words, instead of using a CallFunction or CallFunctionFast we call the function as we would call a procedure. And whilst doing that, we can do some funny things...
First the basics with the WinAPI function MessageBoxA(). In a normal direct call, we would open the DLL, look for it, then call it. Try the next snippet with Unicode disabled.
; survival guide 5_3_500 prototypes
We could also 'wrap it', sort of, so we could use it just like the built-in underscore variant:
; survival guide 5_3_510 prototypes
So, after defining it as a 'prototype' and creating a variable with that type, we can call mb2() as if it was a PureBasic command, one of our own procedures, or a built-in WinAPI call (using the underscore).
Okay... that didn't make much sense, or did it? Think of it as another way to write a procedure. The following is a rough equivalent (but not the same as we haven't done pseudotypes yet)...
; survival guide 5_3_520 prototypes
Again, it's only a rough visualization, because now it's time for...
Pseudotypes turn prototypes into the real contenders for the sexiest code on earth. Well. Not exactly. Sort of. If at all. I suppose. (It's been a long day, can't you tell?)
A pseudotype tells the compiler to convert the given parameter to another parameter, but it only works in combination with prototypes. The following code would only work with Ascii parameters. If we would run it in Unicode mode, all sorts of unexpected things could happen:
; survival guide 5_3_600 psuedotypes
Obviously you'd say that we could use a Unicode approach when running in Unicode mode, and a regular Ascii approach when not in Unicode mode. Hmmm. Now what about DLL's that only provide one flavour, either Unicode or non-Unicode?
Well, that's easy. We let PureBasic do the work for us.
; survival guide 5_3_610 psuedotypes
In the above, we tell PureBasic that the parameters are strings, and they have to be converted to Ascii, no matter what. So the above will run in Unicode as well as non-Unicode mode!
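A rough sketch of a prototype combined with the p-ascii pseudotype, written along the lines of the description above rather than copied from the original sample. It is Windows only; the prototype name ProtoMessageBoxA and the variable mb are invented for this example.

```purebasic
; Sketch: prototype + pseudotype, always calling the Ascii variant (Windows only)
Prototype.i ProtoMessageBoxA(hwnd.i, text.p-ascii, caption.p-ascii, flags.l)

If OpenLibrary(0, "user32.dll")
  mb.ProtoMessageBoxA = GetFunction(0, "MessageBoxA")
  If mb
    ; the strings are converted to Ascii before the call,
    ; whatever mode the program itself is compiled in
    mb(0, "hello world", "prototype", 0)
  EndIf
  CloseLibrary(0)
EndIf
```

The call site reads like a normal procedure call, and no '@' is needed on the string parameters.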
There are three types of pseudotypes:
; survival guide 5_3_700 prototypes and pseudotypes
Although at first rather complex, prototypes and pseudotypes turn out to be rather simple. But why go through all the effort for just an alternative way to call functions? Well... Prototypes allow you a few things that CallFunctionFast etc. cannot:
5.4 Structures and pointers
Structures and pointers allow 'fancy' stuff, and for serious WinAPI programming they are essential. Usage can be quite complex, so don't worry if you don't get it right the first time. In fact, if you don't need them, you might not even bother with them... Yet once you master them, they become an essential component of clean programming.
A structure is a variable type that allows us to combine many different variables and types and treat them as if they were one single thing. (Go on, read that line again.)
As a structure is fixed, it provides a stable and implicitly documented way to store, retrieve and exchange information, especially in combination with a pointer. On a 32 bit platform:
; survival guide 5_4_100 structures
First we defined the structure and all the fields it contains. Then we created a variable and 'typed' it 'sample'. Now we can fill in any of the parts of the structure.
On a 64-bit platform e.s would take 8 bytes, see strings in structures.
The first four bytes contain the long for the x\a field, then 2 bytes for a word for x\b, 1 for a byte for x\c, 4 for a float for x\d, etcetera. I've listed above how much space each variable type takes, and how far from the beginning of the structure it is stored. The length of a structure can be retrieved using SizeOf() with either the structure name or the variable name. A structure can contain bytes, words, longs, strings, floats, or even other structures or pointers (see here for more details on pointers in structures).
The command SizeOf() returns the size of a structure or variable (type). If a structure is 25 bytes long, then a variable typed with that same structure is also 25 bytes long. Makes sense, doesn't it? :-) See the little sample code above. You can use SizeOf() on both the 'x.sample' or 'sample' itself.
The command OffsetOf() tells you how many bytes from the start of the structure a certain field is located. In the example above, the field 'd' should be located 7 bytes after the start of the structure, so OffsetOf(sample\d) returns 7. OffsetOf() does NOT work on the typed variable, only on the structure itself. (Which is why I commented out the last line in the example above.)
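A minimal sketch of SizeOf() and OffsetOf(). The structure 'demo' is a cut-down stand-in that only copies the first four fields described above (long, word, byte, float); it is not the original 'sample' structure, and it assumes the default layout without any alignment option.

```purebasic
; Sketch: sizes and offsets in a structure (hypothetical 'demo' structure)
Structure demo
  a.l   ; 4 bytes, offset 0
  b.w   ; 2 bytes, offset 4
  c.b   ; 1 byte,  offset 6
  d.f   ; 4 bytes, offset 7
EndStructure

x.demo

Debug SizeOf(demo)       ; 11, size of the structure
Debug SizeOf(x)          ; 11, same for a variable of that type
Debug OffsetOf(demo\d)   ; 7, only works on the structure name
```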
; survival guide 5_4_200 structures
Running exactly the same code under a 64-bit version of Windows with Unicode on would result in the same code with a different structure size:
; survival guide 5_4_210 structures
Arrays in structures
You can include arrays in structures, but they will have a fixed length: once the structure has been built it is not possible to change their size using ReDim(). Arrays in structs use a different type of brackets than regular arrays...
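A short sketch of a fixed array inside a structure. The structure and field names are made up; note the square brackets, and that [10] reserves exactly 10 elements, numbered 0 to 9.

```purebasic
; Sketch: fixed-size array as a structure field
Structure grid
  id.l
  cell.b[10]        ; 10 bytes, elements 0..9
EndStructure

g.grid
g\cell[0] = 1
g\cell[9] = 2

Debug g\cell[9]     ; 2
Debug SizeOf(grid)  ; 14, i.e. 4 for the long + 10 for the array
```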
Strings need a little more attention when used in structures. All regular variable types are stored in memory as 'one block' of data. Let's take the variable 'x' from the sample above. It's located in memory at:
address = @x
You will notice that there are only 4 bytes (under 32 bits) or 8 bytes (under 64 bits) reserved for a string, regardless of the string length. That is because the string itself is not stored inside the structure, but a pointer towards it. And, you guessed it, in 32 bit Windows such a pointer takes 4 bytes (32 bits) whilst in 64 bits Windows it's (insert appropriate drumroll) 8 bytes (64 bits).
Most languages do it differently, they store a string directly in memory as a 'byte array'. (They simply have no clue what strings are, the poor bastards...) This is also possible in PureBasic. We simply reserve some space (as much as we need, don't forget the space for any terminating zeroes)...
The space above could be used to store a 10 character string in non-Unicode mode, or a 5 character string in Unicode mode. Remember, when using PokeS() it writes the terminating zero as well, thus leaving us effectively 9 characters to store in non-Unicode mode using PokeS(), or 4 characters in Unicode mode!
We could use a fixed length string or an array of chars as well, and thus avoid the Unicode character size issue. How much space they occupy would then obviously depend on the Unicode mode. If you're not doing Unicode, one character will be one byte.
Two more examples of the differences between fixed length strings and regular strings:
; survival guide 5_4_300 zero terminated strings
With / EndWith
With the arrival of PB v4.00 our lives have become a little easier. If we need to fill a large number of fields of a struct, we don't have to specify the variable name each time; it's enough to specify the field. Below you will find two variables with the structure player, and you can see the different approaches to filling them.
Structure player
Structures are great little beasts as they help us organize our data. They can also help us exchange data with other programs or the OS. But we need something more: a variable type (well, sort of) that indirectly points to a real variable, so that if we change it, we actually change the variable it points to.
Did that make much sense? Nope. Not. Not yet :-) Just one more (very important) thing...
IMPORTANT: WITH / ENDWITH CANNOT BE NESTED.
This is done by design. Don't even ask for it.
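A minimal With / EndWith sketch. The fields of the player structure are invented for this example, as the original listing is not available here.

```purebasic
; Sketch: filling a structure with and without With / EndWith
Structure player
  name.s
  score.l
  lives.l
EndStructure

; the classic way: repeat the variable name every time
p1.player
p1\name  = "Alice"
p1\score = 100
p1\lives = 3

; using With: only the field is specified
p2.player
With p2
  \name  = "Bob"
  \score = 50
  \lives = 2
EndWith

Debug p1\name + " " + Str(p1\score)
Debug p2\name + " " + Str(p2\score)
```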
In PureBasic a pointer is a special flavour of an integer. Prior to PB4.30, pointers were always longs, these days they are .i integers, ie. their size depends on the platform and code.
A pointer is used as a variable that points to a certain location in memory. Mmm...
a$ = "test"
Just like the '$' symbol, in PureBasic the '*' symbol is part of the variable name, so the three variables above, a$, a and *a are not the same thing. We could use a long to store the address of the string a$, or we could use a pointer. So, what's the difference?
For you experts, first another little example:
b.l = 256
Ah, the specialists will have noticed that *c is a pointer of type LONG. LONG is a rather simple structure, with only one field. (There are a few of these and they can make life easier.) It's one of the default structures within PureBasic so you don't have to declare it, but if you did it would look something like this:
Structure LONG
A pointer points to a spot in memory where something is stored. When we define the pointer, we tell PureBasic what type of variable we expect on that spot in memory.
In fact, *c.LONG is not just a pointer, it's a structured pointer. (We're revisiting the subject later. Don't worry.) So, by specifying a field of a structured pointer, we're changing the contents of the memory the pointer is pointing to. By leaving out the field, or specifying a type, we tell the pointer where to point. Let's take that little snippet again and explain what is going on.
b.l = 256 ; new long variable b is created and filled with 256
Yes. It's not easy. But it can come in very handy. Fortunately, PureBasic is a little inconsistent, otherwise things would be too easy... :-) Before we go on, let's think this over, and realize some very important facts...
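The idea discussed above, written out as a small sketch (not the original listing): the pointer itself holds an address, while its field accesses whatever is stored at that address.

```purebasic
; Sketch: a structured pointer of type LONG
b.l = 256          ; a regular long variable
*c.LONG = @b       ; no field: we set where the pointer points to

Debug *c\l         ; 256, reading through the pointer
*c\l = 1           ; with field: we change the memory it points to
Debug b            ; 1, so b itself has changed
```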
; win32 non-unicode
The top block uses a regular integer, the bottom one a pointer. As you can see, the pointer is treated and used like a normal integer. The only advantage in this case is that it's easier to recognize that it is something that points to some spot in memory. Of course, this isn't the best way to use a pointer. Don't worry, it's coming up!
But, as I'm inconsistent as hell, here's a sneak preview...
; win32 non-unicode
Structured pointers (pointers to structures)
Ah, finally something fancy!
For every normal variable you declare, some space is reserved in memory. When you define a pointer and type it as a struct, it does NOT reserve space in memory, instead it keeps pointing to whatever it was pointing, but we can use the structure fields to access the data in memory... Sounds complex? It's rather easy...
Here's an example
; survival guide 5_4_440 fixed length strings
Look at the code. First we define a structure. Then we create a variable x with that structure. At that moment a block of 25 bytes will be set aside for all parts of x.sample. Following that, we store some information in those fields.
Then we declare a pointer *p, we let it point to the place in memory where the fields of x reside using @x. We also tell the compiler we expect to find a structure of type 'sample' on that spot in memory.
Now let's see how we can read the value from x\b...
1. We can use the value of x\b directly:
Debug x\b ; 1. read the value of x\b...
2. @x\b returns the spot where that field is stored in memory, so we can read that spot using:
Debug PeekW(@x\b) ; 2. or... read the value of x\b...
3. We know the field \b is located at @x+4 (see the code above). So we can read that point in memory using:
Debug PeekW(@x+4) ; 3. or... read the value of x\b...
4. The pointer's value is actually the location of x, we know where the field \b is located, so we can read the right spot:
Debug PeekW(*p+4) ; 4. or... read the value of x\b...
5. But... why would we keep track of the exact spot where x\b is located? If we changed the struct we might have to make changes all throughout our code, so let the compiler handle it:
Debug *p\b ; 5. or... read the value of x\b :-)
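The five ways of reading the same field can be tried in one go. The structure 'pair' below is a made-up two-field stand-in for the original 'sample' structure, and OffsetOf() replaces the hard-coded +4 so the sketch keeps working if the layout changes.

```purebasic
; Sketch: five ways to read the same field (hypothetical 'pair' structure)
Structure pair
  a.l
  b.w
EndStructure

x.pair
x\b = 1234
*p.pair = @x

Debug x\b                              ; 1. the field itself
Debug PeekW(@x\b)                      ; 2. the address of the field
Debug PeekW(@x + OffsetOf(pair\b))     ; 3. start of x plus the offset
Debug PeekW(*p + OffsetOf(pair\b))     ; 4. the same, via the pointer value
Debug *p\b                             ; 5. the structured pointer
```

All five lines print 1234.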
Another way to think of a structured pointer is like this: the pointer points to a specific place in memory. Each field of the structure is more or less 'mapped' onto that part of memory. Moving the pointer makes each field map to a different part in memory. By accessing the field, we are accessing that place in memory. Have another look at the little sneak preview I gave a little back...
; non-unicode
This structured pointer has a single field 'b'. By moving the pointer around, we move the place that field maps to. And then we can do nasty things to the memory it is mapped to...
Pointers are a very strong concept and a powerful tool. Make sure you understand them before claiming to be an experienced programmer :-)
PureBasic provides many pre-defined structures (hit [Alt] + [S] in jaPBe). A few special ones are of extra interest:
; non-unicode
These come in handy as replacements for peek / poke, and when dealing with procedure parameters 'by reference'.
*a.BYTE ; use *a\b
Remember:
A pointer points to a spot in memory where something is stored. When we define the pointer, we tell PureBasic what type of variable we expect on that spot in memory.
In other words, the length of a pointer itself is always the same! 32 bits (4 bytes) on win32, and 64 bits (8 bytes) on win64, regardless of what it points to.
Pointers are somewhat inconsistent in structures, as you have to use the asterisk inside the structure when defining them, but you can't use the asterisk outside the structure as part of the field name... Huh? Yep. Really.
Structure xx
A pointer that is a structure field loses its preceding asterisk, but still acts as a pointer.
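A minimal sketch of a pointer as a structure field. Only the structure name xx comes from the text; the field name p and the variables are made up.

```purebasic
; Sketch: a pointer inside a structure
Structure xx
  *p.LONG          ; asterisk when defining the field...
EndStructure

v.l = 5
y.xx
y\p = @v           ; ...but no asterisk when using it
y\p\l = 6          ; write through the pointer

Debug v            ; 6
```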
Again: a pointer points to a spot in memory where something is stored. When we define the pointer, we tell PureBasic what type of variable we expect on that spot in memory. You can define a pointer to a string using
*g.STRING
But you CAN NOT manipulate that string in any normal way. Remember: PureBasic manages all string related memory, and with zero terminated strings the place where they are and the space they occupy may vary. If we work directly on the string (using a regular variable such as 'a.s' or a structure string field) then PureBasic still knows what's going on. If we access a string directly in memory then we could run into problems... What if PureBasic decides to mess around with memory and clear out unused memory because some strings became smaller? Ouch! (This kind of background activity is called 'garbage collection' by the way, and is fully automatic in PureBasic.)
This is why there is no way to do this:
a.s = "ABCD"
A few basic rules when accessing (zero terminated) strings directly in memory:
Byref vs. Byval
In many languages parameters can be passed on to a procedure, either 'by value' or 'by reference'.
By value means that the parameter itself is not passed, but its value is. Inside the procedure, the original variable is not changed. Here's an example:
Procedure bv(x.l) ; every parameter in PureBasic is passed by value (ByVal)
The above will show 1, 2 and 1. The value of z was passed to the procedure, and put in a new variable x. Changing x inside the procedure doesn't affect z.
In some languages you can pass parameters by reference. That simply means that changes inside the procedure would affect the variable outside. In PureBasic this is not possible, so the following code WILL NOT WORK:
Procedure br(ByRef x.l) ; this will not work!
Basics that support the ByRef keyword would show 1, 2 and 2.
A procedure can normally only return one value using ProcedureReturn. But what if you want to return more than one value, or if you want to change the variables you used as parameters? The answer is: use a pointer.
Procedure br(*x.LONG) ; let's simulate a ByRef
Instead of passing the value of variable z, we pass the address where it is located. Inside the procedure we don't use the address itself, but what is located at that address. (See here for the structure being used.)
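A complete sketch of the simulated ByRef, following the procedure header quoted above. The body and the calling code are reconstructed for illustration.

```purebasic
; Sketch: simulating ByRef with a structured pointer
Procedure br(*x.LONG)
  *x\l = 2          ; change what is stored at the address we received
  Debug *x\l        ; 2
EndProcedure

z.l = 1
Debug z             ; 1
br(@z)              ; pass the address of z, not its value
Debug z             ; 2, the change is visible outside the procedure
```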
Global Dim a.l(10,10)
Arrays and linked lists are passed by reference, not by value!
Note: in older versions of PureBasic arrays and linked lists were global. That's no longer the case. Keep in mind that you can pass them to a procedure, but they are passed by reference, not by value. See also here for how to use them as parameters in procedure calls and the use of the Array and List keywords in the procedure definition.
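A short sketch of the Array and List keywords in a procedure definition. The procedure name fill and the values are made up; the number in Array a.l(2) is the number of dimensions, not a size.

```purebasic
; Sketch: arrays and lists as procedure parameters (passed by reference)
Procedure fill(Array a.l(2), List n.i())
  a(1, 1) = 42           ; changes the caller's array
  AddElement(n())        ; changes the caller's list
  n() = 7
EndProcedure

Dim a.l(10, 10)
NewList n.i()

fill(a(), n())

Debug a(1, 1)            ; 42
Debug ListSize(n())      ; 1
```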
IDE Structure Viewer
The PB IDE has a built-in structure viewer, which allows you to quickly find all fields for a predefined structure. Take for example the .RECT structure. Start the PB IDE and hit [Alt] + [S] which will start the structure viewer. Type 'rect' followed by [Enter] in the bottom text box, and it will show you all fields of the .RECT structure.
There are many predefined Windows structures PureBasic is already aware of, which saves a lot of declaring. If we want to use the .RECT structure we can simply use it like this:
r.RECT
... and then fill in the fields. Of course, we're lazy and we will let PureBasic list all fields for us. Move the cursor to an empty line and start the structure viewer again using [Alt] + [S]. Find the fields for the 'rect' structure, then click on 'insert'. The structure viewer will ask you for a name, enter 'q' and the following will be inserted into the IDE:
q.rect
It's then an easy job to fill in the fields.
Unfortunately this doesn't work for the structures we've built ourselves. External tools such as CodeCaddy may help a little here.
Linked lists are, well, linked lists :-) Think about them as a train, each wagon is connected front and end to other wagons in the same train. You can move forward or backward through the string of wagons, until you reach the front or the end of the train. You can insert new wagons, and delete existing ones.
Keep in mind that it is faster to 'walk' through a linked list using FirstElement(), NextElement() and PreviousElement() than it is to select subsequent elements with SelectElement().
Well, that was pretty much not very helpful :-) Let's try it again. (And don't forget to check out the PureBasic helpfile / function...)
Let's start over again. A linked list is a train, a chain of wagons. Let's say each wagon, also known as an 'element', is an .i integer. (You could just as well store full complex structures in each wagon aka. field.) Each wagon has a number, so by specifying the wagon number we can find our data back. (A sort of 'pointer' or 'index', similar to the ones used in arrays.)
Of course, at the start the train is empty. There are no wagons. Well, there's pretty much no train :-)
NewList x.i()
Okay. There's our train. Now let's add some wagons.
NewList x.i()
Yep. That's two wagons. The AddElement() instruction adds an empty field either at the start of the list (if it's an empty list) or just after the currently selected wagon / field. It then makes the new one the active one. (Okay, I'll drop the wagon analogy now, it's stupid and you got it by now, I hope :-))
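As a quick check of the above, here's a sketch that adds two elements and asks the list where it stands. ListSize() and ListIndex() are the names used in newer PureBasic versions, so that part is an assumption about the version you run:

```purebasic
NewList x.i()

AddElement(x())          ; list was empty: new element goes to the start
AddElement(x())          ; added after the current one, and becomes current

Debug ListSize(x())      ; 2 elements in the list
Debug ListIndex(x())     ; 1 (positions count from 0, so this is the second one)
```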
Here's another example, and we're going to store some data in the list then retrieve it.
; survival guide 5_5_200 linked lists
Hmmm. Too easy. Let's use the same train, but we'll add a wagon with elephants in the middle, and then we go fancy and go through the list in a different manner, just for kicks. We're one hell of a cool train driver / circus director...
; survival guide 5_5_210 linked lists
Sometimes arrays are better. Sometimes linked lists are better. It's a matter of application and preference.
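For reference, squeezing an element into the middle of a list and then walking it front to back could look like the sketch below. The values are made up; note that InsertElement() puts the new element before the current one:

```purebasic
NewList x.i()

AddElement(x()) : x() = 11
AddElement(x()) : x() = 22     ; current element is now 22

InsertElement(x())             ; new element lands before the current one
x() = 99                       ; list is now 11, 99, 22

; walk from the first element until NextElement() reports the end
If FirstElement(x())
  Repeat
    Debug x()                  ; 11, 99, 22
  Until NextElement(x()) = 0
EndIf
```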
Please note that linked lists in procedure calls are passed by reference, and check out the use of the List keyword in the procedure definition.
(And yeah, I plead guilty, I kept using that train as an example :-))
ForEach / Next allows you to quickly affect all elements in a list. See below for a comparison with a regular For / Next and SelectElement().
; survival guide 5_5_220 foreach next
I was hoping I could avoid linked lists... but I couldn't :-) And the new With / EndWith instructions of PB v4.00 make some very nice things possible... Sometimes, just sometimes, linked lists are soooo nice. Just keep in mind that they are a little slower than an array.
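A sketch of the two ways of visiting every element, under the assumption that ListSize() is available in your version. The list contents are invented for the example:

```purebasic
NewList x.i()
For n = 1 To 3
  AddElement(x())
  x() = n * 10
Next

; ForEach visits every element in turn
ForEach x()
  Debug x()                    ; 10, 20, 30
Next

; the same walk using For / Next and SelectElement() (positions start at 0)
For n = 0 To ListSize(x()) - 1
  SelectElement(x(), n)
  Debug x()                    ; 10, 20, 30
Next
```

The ForEach version is shorter and avoids the repeated lookup by position.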
Please note: arrays and linked lists are passed by reference, not by value. See also here for how to use them as parameters in procedure calls.
Maps or hash tables are one dimensional tables of unique elements; the order in which you add the elements is not kept. Each element in the table has a unique name, and can be quickly found back by its name. With arrays or linked lists you have to do the lookup yourself, and in arrays and linked lists you can store duplicates. Maps can contain regular variables or structures. Here's a simple map:
; 4.40b1
As you can see, map elements are automatically added and updated. The element with name "195" didn't exist, so it's added and set to "arrived". A bit later we overwrite that with "unloading".
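The add-then-overwrite behaviour can be sketched as follows. The map name ship() and the key "200" are my own inventions:

```purebasic
NewMap ship.s()

ship("195") = "arrived"        ; key did not exist yet: element is added
ship("195") = "unloading"      ; key exists: element is overwritten

Debug ship("195")              ; unloading
Debug MapSize(ship())          ; 1

; FindMapElement() checks for a key without creating it
If FindMapElement(ship(), "200") = 0
  Debug "no ship 200"
EndIf
```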
Hash tables are not just simple lists or arrays. A hash table (internally) doesn't even look things up by string at all! What it does is create a 'hash' (a number computed from the contents of the key), then it looks in the table to see if that hash already exists. For all practical purposes that doesn't make any difference in using those tables though :-)
Maps can also contain structures, as seen here:
; survival guide 5_6_300 maps aka hash tables
It's important to realize maps are unique, so although at first glance we seem to add three employees to the list, we're actually only adding two... The second "Hans" is replacing the first... In those situations it may be better to use a linked list or array.
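A sketch of the 'second Hans replaces the first' effect. The structure and its fields are hypothetical, picked just for this illustration:

```purebasic
; hypothetical structure for the example
Structure employee
  age.i
  city.s
EndStructure

NewMap staff.employee()

staff("Hans")\age = 30
staff("Piet")\age = 41
staff("Hans")\age = 52         ; same key: overwrites the first Hans

Debug MapSize(staff())         ; 2, not 3
Debug staff("Hans")\age        ; 52
```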
Note that when passing maps as procedure parameters you need to precede them with the keyword Map in the procedure definition.
When / why / where use maps?
Windows not only allows you to run multiple programs at the same time ('processes') but it also allows a single program to 'spawn off' parts of itself, parts that do something in the background then get deleted after they're done. For example one part of your program could be updating the screen whilst another part is busy analyzing information. These are called 'threads'.
Obviously, trying to access the same thing twice at the same time can be a dangerous hobby (two people grabbing the same, last can of beer just after all shops have closed)... PureBasic has a compiler option 'threadsafe' that will help fixing up the regular PureBasic keywords / commands, but it's still the user who has to carefully think over how, when and where to use threads. (Hey, what's new. Doesn't fix the beer problem either :-))
PureBasic has no 'Critical Section' command, but you can achieve the same effect using a mutex.
Use the threadsafe option when using threads. Bugs are hard to find and fix without it.
Just gotta' love the Internet, and the chance it gives to people to voice a different opinion. I was working on database stuff using the built-in commands, and followed a link to the SQLite page which in turn led me here:
So, threads are evil, huh? Let's use them! :-)
Creating and managing threads
This little example will have two threads. First there's the regular program, which sends messages to the debugger window, then there's a created thread which does exactly the same thing. Watch the output.
A thread will continue running until it is either done (and hits the 'EndProcedure' statement), until its parent is done, or until it's killed.
You can pass a single parameter to the thread, but you cannot return one. CreateThread() returns a number which can be used to identify the different threads. This result is used for manipulating existing threads, for example with WaitThread(), KillThread(), PauseThread(), ResumeThread() and IsThread().
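A minimal sketch of creating a thread and waiting for it. The procedure name worker() and the loop counts are made up; compile with the thread-safe option on, as strings are used inside the thread:

```purebasic
; the thread procedure takes exactly one parameter
Procedure worker(*value)
  Protected n.i
  For n = 1 To 5
    Debug "thread " + Str(n)
    Delay(10)
  Next
EndProcedure

thread.i = CreateThread(@worker(), 0)   ; returns 0 if the thread could not be created
If thread
  For n = 1 To 5
    Debug "main " + Str(n)
    Delay(10)
  Next
  WaitThread(thread)                    ; don't end the program before the thread is done
EndIf
```

The order in which the 'main' and 'thread' lines show up may differ from run to run.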
To avoid problems when two threads try to access the same data you can use a Mutex object. Once a mutex is created, each thread can try and 'lock' it. If some other thread tries to lock the mutex, that thread will have to wait until the mutex is no longer 'locked'.
In the next example two threads are created, and both send out some information to the debugger. Without the LockMutex() both threads will have the same chance of sending characters, meaning you could get any combination of 'A' and 'B's.
; survival guide 5_7_100 threads
If you run the above code again, this time uncommenting the LockMutex() and UnlockMutex() lines, you will see that each thread finishes its block of three characters before letting the other thread do its job. Why? Whilst the first thread keeps the mutex locked, the second thread stays on hold on the LockMutex() line. Then the moment the mutex becomes available due to the UnlockMutex() in the first thread, the second thread immediately grabs and locks the mutex, thus blocking the first thread. And then it's all the other way around...
LockMutex() halts the current thread, whilst TryLockMutex() does not. Both need UnlockMutex() to unlock the mutex and make it available again.
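Here's a different, deliberately boring sketch of the same locking idea: two threads bump one shared counter, and the mutex makes sure only one of them touches it at a time. All names are my own, it's not the listing from above:

```purebasic
Global mutex.i = CreateMutex()
Global counter.i = 0

Procedure worker(*value)
  Protected n.i
  For n = 1 To 1000
    LockMutex(mutex)       ; waits here while the other thread holds the lock
    counter + 1
    UnlockMutex(mutex)     ; hand the lock back
  Next
EndProcedure

t1.i = CreateThread(@worker(), 0)
t2.i = CreateThread(@worker(), 0)
WaitThread(t1)
WaitThread(t2)
FreeMutex(mutex)

Debug counter              ; 2000
```

Without the lock the two threads could overwrite each other's update and the total might come out lower.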
Threads follow the same rules as normal code when it comes to variable scope, so if a thread calls a procedure, all variables local to that procedure stay local. If two threads both call that procedure at the same time, that procedure will run twice, each with its own set of local variables.
Threads can access global variables as well, but so can other threads, and so can your main code. If you want a thread to have its own set of global variables (so global throughout all procedures but restricted to your thread) you need the keyword Threaded. Threaded works like Global but restricted to its own thread.
; survival guide 5_7_200 threaded
As you can see the variable s.s is global to each thread, as it's changed from within the procedure change() yet does not affect s.s in the main or any other thread.
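A numeric sketch of the Threaded keyword, with made-up names (counter, bump(), worker()). I'm assuming the thread-safe compiler option is switched on:

```purebasic
Threaded counter.i           ; every thread gets its own copy

Procedure bump()
  counter + 1                ; changes the copy of the calling thread only
EndProcedure

Procedure worker(*value)
  bump()
  bump()                     ; the worker's own counter is now 2
EndProcedure

counter = 10                 ; this is the main thread's copy
t.i = CreateThread(@worker(), 0)
WaitThread(t)

Debug counter                ; 10, untouched by the worker
```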
Note that the CreateMutex_() API and PureBasic 4's native CreateMutex() command are different beasts (though they can be used to achieve the same result): the CreateMutex() command deals with the management of (Windows) critical section objects rather than (Windows) mutex objects.
In other words: whilst the PureBasic mutexes are restricted to a single process (multiple threads belonging to a single program), you could use WinAPI for interprocess communications (multiple programs) but at a price... the WinAPI mutexes are much slower!
; survival guide 5_7_300 mutex
5.8 Compiler directives
There is a set of similar instructions that tell the compiler to compile certain parts of your code, depending on a condition. Here's an example. First run this from within the IDE (with the debugger on), then compile it to an executable and run that .exe file.
; survival guide 5_8_100 compiler directives
Yes. You're not seeing double, but you may wonder what the difference is between CompilerIf and the regular If. Well, that's easy. If the compiler finds an 'agreeable condition' (try that one on your wife / girlfriend / partner m/v :-)) it will include that section. Otherwise it will not.
Let's see what the compiler makes of the next little sample:
CompilerIf #PB_Compiler_Debugger = 1
First, what would the above turn into with the debugger on?
a = 1
Wow. What happened to the line marked with '***'? It's gone! It's not even included in the final code, but the lines belonging to that regular 'If' are. Let's try that again, now with the debugger off...
a = 5 ; ***
So, the compiler directive CompilerIf allows us to include or exclude parts of the code, depending on the value of a constant. Remember that! Only constants (stuff starting with #).
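To make the difference concrete, here's a small sketch of my own (not the listing above). Only one of the two assignments ends up in the program, whilst the regular If is always compiled in and tested at run time:

```purebasic
CompilerIf #PB_Compiler_Debugger
  a = 1                      ; compiled only when the debugger is on
CompilerElse
  a = 5                      ; compiled only when the debugger is off
CompilerEndIf

; a regular If is always part of the program and is evaluated when it runs
If a = 1
  Debug "compiled with the debugger on"
EndIf
```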
You could, for example, write a procedure that has OS dependent, or CPU dependent, or even compiler version dependent parts. This will make it easier to develop and maintain different versions of your program. PureBasic offers you a whole set of constants. Place the cursor on CompilerIf and hit [F1]. At the bottom of that page you will find a list of constants.
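An OS dependent part could be sketched like this, using CompilerSelect and the #PB_Compiler_OS constant. The variable name os is just for the example:

```purebasic
; only the branch matching the target OS ends up in the program
CompilerSelect #PB_Compiler_OS
  CompilerCase #PB_OS_Windows
    os.s = "Windows"
  CompilerCase #PB_OS_Linux
    os.s = "Linux"
  CompilerDefault
    os.s = "something else"
CompilerEndSelect

Debug os
```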
Check out the help file for the following (somewhat) related instructions...
#PB_Editor_CreateExecutable under Compiler / Compiler Options / Constants.
The help file lists a number of command line options for the IDE. Good to know (for those running with restricted user rights on corporate machines, running from a USB stick etc.) is that PureBasic is a 'portable' program. Simply starting up the IDE with the additional parameter /PORTABLE will keep everything out of the registry, and will keep all configuration files inside the PureBasic directory.
I'm not entirely sure it's actually listed in the index of the PureBasic help file (it probably is, I just need better glasses :-)). Open the PureBasic help file [F1] and search for 'command line compiler'. There's a whole section dedicated to the parameters that you can pass on to the compiler. Some of them are quite interesting, but most of us will be satisfied by the regular options accessible from within the IDE under Compiler / Compiler Options.
(Better glasses indeed, it's listed under General Topics in the help file... sigh.)
Macros allow fancy and complex constructions, but they should be used with care: it is very easy to build overly complex code using (too) many (too) complex macros, making code hard to read and maintain. Always ask yourself: can I read this code in two years' time? IMHO it may often be a better choice to use a Procedure() if you do not really need a macro.
Again, macros allow things impossible otherwise, and they may speed up your code, so it's worth spending some time on them.
Here's an example macro:
Macro example1
The code within a macro definition is inserted at the place of the macro call. When you compile the above, PureBasic turns it into:
;
Here's another example:
Macro example2
Which is turned into:
;
Macros do not apply to literal strings, as the following example shows:
Macro example2
... which results in:
You can pass parameters to a macro, as the following example shows:
Macro addvar1tovar2(var1,var2)
Whatever you pass as var1 and var2 will replace those occurrences in the macro code, and subsequently the whole macro will be inserted in your code. The above would result in this:
;
Note that you do not specify a type for macro parameters. Everything you pass on is passed on literally, as the following example illustrates:
Macro addvar1tovar2(var1,var2)
The above turns into:
;
As you can see, each occurrence of 'var1' is replaced with 'a.i' and so on.
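A runnable sketch of a macro with parameters. The macro name is borrowed from the text, but the body is my own assumption of what it might do:

```purebasic
; assumed body: add var1 to var2
Macro addvar1tovar2(var1, var2)
  var2 = var2 + var1
EndMacro

a.i = 3
b.i = 4

addvar1tovar2(a, b)          ; expands to: b = b + a
Debug b                      ; 7
```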
Smart substitution and concatenation
The following example shows two more interesting aspects of macros: concatenation and smart substitution. Inside the macro definition you can 'glue' different parameters together using the '#' symbol. The compiler will not replace parts within a literal string, and it will not replace the 'b' from 'bug' in line 4... Well, perhaps smart is not the right word, 'literal' would be better... 'bug' is NOT the same thing as 'b', and if there's one thing computers are extremely good at, it is being literal (it's a major cause of faults :-)... but hey, even most girlfriends can be a little too literal ;-)).
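To see the gluing on its own, here's a tiny sketch. XCase is a made-up macro name; the '#' glues the parameter onto 'Case' so we end up calling UCase() or LCase():

```purebasic
; hypothetical macro: Type#Case becomes UCase or LCase
Macro XCase(Type, Text)
  Type#Case(Text)
EndMacro

Debug XCase(U, "Hello")      ; expands to UCase("Hello") and shows HELLO
Debug XCase(L, "Hello")      ; expands to LCase("Hello") and shows hello
```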
Macro x(a,b,c)
The parameter 'a' is replaced with 'de', 'b' is replaced with 'bug' and 'c' with '5'. The # symbol glues parts left and right together... Thus, the code above will be turned into this:
The syntax of a macro is NOT checked during definition. The following example will cause the compiler to complain about line 7, as this:
Macro bad(a)
... is turned into this:
;
Fortunately, a popup window will show the macro and the line inside that might have caused the error... however this is not failsafe, as the following example shows an error in line 9 though the real problem lies in line 2 inside the macro definition.
Some of these subjects may have come up elsewhere, but you might have missed them. Or I might have missed them ;-) but I considered them important enough to bring up (again) here...
Inside the PureBasic IDE under Compiler / Compiler Options you will find some options that may be either confusing, extremely powerful, or (probably) both at the same time :-)
The first tab Compiler Options allows you to enable or set a few important options:
Under Compiler / Compiler Options / Constants you can tell the IDE to pass a few constants to the compiler. These constants are automatically adjusted and allow you to, for example, embed version numbering based on saves or builds. Using the #PB_Editor_CreateExecutable constant you can also detect if your code is running as a stand alone program, or was started during development using the built-in debugger.
One tab further you can include version information which will show up in Windows. If you do so, it's probably best to be consistent and make it match your build numbers etc. :-)
I will have to rework this section, as there have been changes, but I'm not entirely clear on what exactly. Will follow once I've figured it out.
Is this all? Perhaps. Perhaps not. I just hope this writeup has been somewhat helpful for getting you on track. I certainly would have had use for this when I switched over from Gfa! Anyway, have fun with PureBasic and share your code with the rest of the PureBasic community... See ya'...