Working with bitwise operators

Ramses12: Moderator; Posts : 20
Reputation : 1
Join date : 2010-04-10

Post n°1

Working with bitwise operators

Post by Ramses12 Sun May 23, 2010 3:04 am

Working with bitwise operators
001100010010011000110000

Compatibility: GM8 LITE
Tutorial Type: Walkthrough
Downloadable Content: http://solidfiles.com/d/oCIp/get
Skill Level: Intermediate/average
Objective: Bitwise operators and other different aspects concerning the binary system
Category: Mathematics, data management

Note that this is a mirror to a tutorial originally posted by me at http://gmc.yoyogames.com/index.php?showtopic=473795

Introduction
I've spent these days writing a tutorial, which was initially only a simple example showing how to modify and read a bit from an integer in Game Maker. Eventually this turned out to be a continuously growing weed, until it became what it is now: a big bush with portals to lots of interesting things like codes used in computer science or logarithmic functions.
This tutorial is mainly a collection of facts related to the binary in GM, including information about virtual data units of measure, number bases, the hex and ASCII codes, a complete description of bitwise operators in GM (usually a forsaken spot of GML), as well as a small collection of bit manipulation GML scripts.
Please tell me if you have something interesting that could be added to the tutorial.
As a note, remember that I am not a native english speaker, thus you may encounter spelling mistakes. So, let's get started.

Contents

Boolean variables, bits and their multiplies
Number bases; base 2
The Hex and ASCII codes
Bitwise operators in Game Maker
&, bitwise AND
|, bitwise OR
^, bitwise EXCLUSIVE OR
<<, bitwise SHIFT LEFT
>>, bitwise SHIFT RIGHT
~, bitwise NOT
Dealing with binary numbers; bitwise scripts
- bit_size
- bit_get
- bit_set
- bit_toggle
- bit_count
Working with bits; compiling booleans
Final words

Boolean variables, bits and their multiplies
According to Wikipedia, in computer science, the Boolean or logical data type is a primitive data type having one of two values: true or false, intended to represent the truth values of logic and Boolean algebra.
A data type is a way in which data can be stored. For example there are integers, strings and doubles. In some programming languaces like C++ there also exists another data type which is called bool. A bool (stands for boolean) is something that can take only two values: true and false (or 1 and 0). Because the boolean data type is a very easy to understand concept and is a good solution for a lot of things, GM users use booleans a lot, even if variables which hold booleans are actually integers. A lot of GM built-in functions accept booleans as arguments and return boolean values (true or false). So, shortly a boolean variable is a variables that is supposed to take the value or 1 or 0.
I shall quote Wikipedia again, saying that a bit is the basic unit of information in computing and telecommunications; it is the amount of information that can be stored by a digital device or other physical system that can normally exist in only two distinct states.
For a more user-friendly explination, I'd say that a bit is the smallest unit used by a computer; any sort of data in a computer is made of several 1s and 0s. These 1s and 0s are bits; thus a bit can take either the value of 1, or 0 (true/false, or sometimes high/low). The term of bit comes from binary digit.
To understand what bits truly are, you should understand that bits are used in two ways: in processing units like CPUs (CPU stands for Central Processing Unit) they are used to "think"; meaning that they process data which is stored - also in bits - in a memory, and data retrieved from [url="http://en.wikipedia.org/wiki/Input_device"]input devices[/url] like keyboards, mice, webcams etc. Usually a set bit (in bit manipulation a "set bit" is a bit with the digit value of 1 and a "clear bit" is a bit with the digit value of 0) is a positive electric pulse or voltage relative to a ground permanent voltage. If the ground voltage is met, that is a clear bit (false). The other type in which bits are met is the memory. A memory is a device used to store data, like a CD (Compact Disk) or a DVD (Digital Versatile Disk or Digital Video Disk, both variants are correct), or even more common, a HDD. In a HDD (abbreviation for Hard Disk Drive), all data is stored in a bunch of plates (circular rigid disks) made of several infinitesimal magnetic separate regions which act as data bits, and can be set or cleared by simply adding a magnetic value to them.
A group of 8 such bits (like for example 01001001) is called an octet, sometimes referred as "byte". You probably know that the multiples of bytes are kilobytes, megabytes, gigabytes, etc. But in computer terms, these are not the most correct units of measure. You see, for a computer (and so for a programmer), a "round number" is not something ending in a bunch of 0s, as it is for the rest of the world. As an example, the number 1024 is more round than 1000, and this is because 1024 is a power of 2 (1024=2¹⁰). Any computer is based on 2 and its powers. So, the truth is that there are two different scales. One is made for humans, and is used in daily conversations, and one is more unknown. So, the scale used for user-friendship is this:

[indent] 1 Byte = 8 bits;
1 Kilobyte (KB) = 1,000 Bytes.
1 Megabyte (MB) = 1,000 KB = 1,000,000 Bytes.
1 Gigabyte (GB) = 1,000 MB = 1,000,000 KB = 1,000,000,000 Bytes.
1 Terabyte (TB) = 1,000 GB = 1,000,000 MB = 1,000,000,000 KB = 1,000,000,000,000 Bytes.
1 Petabyte (PB) = 1,000 TB = 1,000,000 GB = 1,000,000,000 MB = 1,000,000,000,000 KB = 1,000,000,000,000,000 Bytes. [/indent] But the real scale used in computer technology is this: [indent] 1 Byte = 8 bits;
1 Kibibyte (KiB) = 1,024 Bytes.
1 Mebibyte (MiB) = 1,024 KiB = 1,048,576 Bytes.
1 Gibibyte (GiB) = 1,024 MiB = 1,048,576 KiB = 1,073,741,824 Bytes.
1 Tebibyte (TiB) = 1,024 GiB = 1,048,576 MiB = 1,073,741,824 KiB = 1,099,511,627,776 Bytes.
1 Pebibyte (PiB) = 1,024 TiB = 1,048,573 GiB = 1,073,741,824 MiB = 1,099,511,627,776 KiB = 1,125,899,906,842,624 Bytes [/indent]It is a common thing that sometimes there are made mistakes like saying that 1 Kilobyte is actually equal to 1024 bytes, but that is not true.
After all, this whole thing is based on 2 values: 1 and 0, and lots of combinations between them.

Number bases; base 2
To understand the concept of binary numeral system, you should be at least a bit familiar with the idea of a number base, or radix. The term of radix comes from latin, and means "root", "source" or "foundation". Usually the base elements are symbols; there can be used n symbols in an n base. Normally, we use base 10, thus there are 10 symbols: 0,1,2,3,4,5,6,7,8 and 9. These symbols have a specific order, which defines their succession. A different base might have a smaller amount of symbols or a larger one. Let's take base 8 as an example: There are 8 symbols - 0, 1, 2, 3, 4, 5, 6 and 7. When you reach 7, you get back to 0 and add 1 to the next slot. Thus the first ntegers in base 8 are 0, 1, 2, 3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15, 16, 17, 20 etc. Note that a base doesn't really have to use numbers. We can for instance create a number base with only these 3 symbols: x,y and z, in this order. Thus, the first numbers would look like this: x, y, z, xx, xy, xz, yx, yy, yz, zx, zy, zz, xxx, xxy, etc.
As a more advanced note, bases are not necessarely working only with symbols. For example, the clock's HMS system works in base 60; but this is a bit more abstract.
Returning to base 2, the binary system used by computers, it happens as usually: there are only 2 symbols, usually represented as 0 and 1. To write a number in base 2, you should keep track of the powers of 2. This series is crucial in being able to read binary numbers. Here are the first elements of the series: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192.
When writing a number in the binary system, you only have to know which of these numbers, added up, give the particular number. Take 45 for example: you can write it as a sum of powers of 2;
45=32+8+4+1
To find the binary code for 45:

Write down 6 zeros, which will correspond to the powers of 2 (including 0).
Find out which powers of two were sumed up. In 45's case, those are 5, 3, 2 and 0.
Change the 0s corresponding to those powers of 2 to 1s.

The binary code for 45 is 101101.
For practice's sake, let's take another number: 99. To find out how can the number be written as a sum of powers of 2, simply take the largest power of 2 which is smaller than the number - 64 - and then subtract it from the number. Repeat the process until you get 0.
99-64=35;
35-32=3;
3-2=1;
1-1=0;
The numbers are 64, 32, 2 and 1 which means 6, 5, 1 and 0. The binary code for 99 is 1100011. Try this yourself with some other numbers.
To decode a binary number, use the same method: add up powers of two which correspond to 1s. As an example, 11001=2⁴+2³+2⁰=16+8+1=25.

The Hex and ASCII codes
You might have heard of something called "the hex code". It usually appears as a series consisting of digits and letters grouped two by two: "54 68 61 74 27 73 20 68 65 78 2E". Indeed, that's hex. The term of hex comes from "hexadecimal". The code is actually base 16, written with digits and letters from A to F. The digit symbols from 0 to 9 represent their decimal value. Letters from A to F represend values from 10 to 15. Reading integers in the code is pretty easy actually:
A3D8=10*16³+3*16²+13*16¹+8*16⁰ which is 10*4096+3*256+13*16+8*1=40960+768+208+8=41944.
Hex code is commonly expressed in pairs of [url="http://en.wikipedia.org/wiki/Nibble"]nibbles[/url]. In this form, two hex digits represent a byte.
You might know about the hex code used in web colors. Those are 3 bytes, expressed in 3 pairs of hex digits. This is how it works: The color hex code is split into 3 groups of 2 digits. First group corresponds to the Red value

, the second corresponds to the Green value (G) and the third to the Blue value (B) - all of these ranging from 0 to 255. Since a byte can have 256 possible values, the web color palette contains 256*256*256 colors = 16,777,216 colors. This palette is usually called "truecolor", since it uses 3 bytes per pixel. Note that you can use hex code in Game Maker, if you write a $ symbol before the value. Concerning colors, you should be aware that the R and B values are inverted relative to a web color's hex code. So the web color #F236A7 (some sort of purple) is in GM $A736F2.
[url="http://en.wikipedia.org/wiki/Memory_address"]Memory locations[/url] are usually expressed in hex. Any kind of errors like "Access violation at address 00406880 in module 'Game_Maker.exe'. Read of address 20657371." points to a particular unique location in the memory. That is how programmers can know how to fix it.
Hex is also used in URLs, whenever there is a character that cannot be written (which is neither letter nor digit). In this case, the % special character is used, after which comes a hex byte corresponding to the wanted character. As an example, space is written as %20, which in hex is 32 - the character code for space.
Now that you know what's the hex code all about, we should advance to the next topic, the ASCII code. As hex corresponds to base 16, ASCII corresponds to base 128. "ASCII" is an abreviation for American Standard Code for Information Interchange. This code contains 128 characters, printable and non-printable, including all digits and lowercase/uppercase letters, as well as many other symbols. Each character has a unique code ranging from 0 to 255. Basically from 0 to 31 there are control characters, and from 32 to 127 there are printing characters. Since a complete description of ASCII code and its uses is not the purpose of this tutorial, see [url="http://en.wikipedia.org/wiki/ASCII"]this Wikipedia page[/url] for additional information. Usually the ASCII is used in its extended version, with 8 bits per character and 256 symbols, which allows it to have one character corresponding to one byte.
In Game Maker, when creating a font, it is asked for the so-called "character range". That actually referes to values in ASCII that should be drawn by GM. Use "all" if you want to be able to draw all characters.

Bitwise operators in Game Maker
Game Maker provides a series of operators allowing the programmer to deal with numbers directly in their binary form, as they are actually read by computers. Since numbers do not longer have to be converted to their decimal form, binary operations should work faster than decimal ones. However it seems like GM doesn't doesn't work directly in numbers' binary form, so bitwise operations are fake (this is not certain though). GM's bitwise operators are & (and), | (or), ^ (xor), << (shift left), >> (shift right) and ~ (not). These operators are fully explained in next chapters.

&, bitwise AND
When two booleans are combined using and, the result is true only if both booleans are true. There are 4 possibilities for a 1 bit operation:

Code:: 0&0=0; 0&1=0; 1&0=0; 1&1=1;

Since you can apply a bitwise and to any two integers, each bit of the first integer is combined with the corresponding bit in the second integer:

Code:: 1100& 1010 ---- 1000

Both integers should obviously passed in their decimal form, as the the result is returned as decimal.
If you use a float value, it shall be rouded towards the nearest integer.

|, bitwise OR
When two booleans are combined using or, the result is true if any of the two booleans is true. There are 4 possibilities for a 1 bit operation:

Code:: 0|0=0; 0|1=1; 1|0=1; 1|1=1;

Since you can apply a bitwise or to any two integers, each bit of the first integer is combined with the corresponding bit in the second integer:

Code:: 1100| 1010 ---- 1110

Both integers should obviously passed in their decimal form, as the the result is returned as decimal.
If you use a float value, it shall be rouded towards the nearest integer.

^, bitwise EXCLUSIVE OR
When two booleans are combined using exclusive or, the result is true if their values are different, and false if their value is the same. There are 4 possibilities for a 1 bit operation:

Code:: 0^0=0; 0^1=1; 1^0=1; 1^1=0;

Since you can apply a bitwise exclusive or to any two integers, each bit of the first integer is combined with the corresponding bit in the second integer:

Code:: 1100^ 1010 ---- 0110

Both integers should obviously passed in their decimal form, as the the result is returned as decimal.
If you use a float value, it shall be rouded towards the nearest integer.

<<, bitwise SHIFT LEFT
When an integer is binary shifted left at an amount of n, its bits are shifted left n times. This is similar to adding a 0 at the end of the number n times. As an example,

Code:: 10011<<0=10011; 10011<<1=100110; 10011<<2=1001100; 10011<<3=10011000;

Arithmetically, n<< x=n*2^x.

>>, bitwise SHIFT RIGHT
When an integer is binary shifted right at an amount of n, its bits are shifted right n times. This is similar to removing a bit from the end of the number n times. As an example,

Code:: 10011>>0=10011; 10011>>1=1001; 10011>>2=100; 10011>>3=10;

Arithmetically, n>>x=n div 2^x, where x div y is the value of x/y rounded in the direction of zero to the nearest integer. Note that n>>-x=0.
If you try to apply a right shifting to a negative integer, the game might throw a Floating point error, and eventually crash. The procedure works however for left shifting.

~, bitwise NOT
This operation, also called "complement", negates each bit in a given integer. Note that this is the only unary operator in GM which applies to a single value. Complementing is very simple: 1 becomes 0 and 0 becomes 1. The only thing that is tricky in unary negation is that the not is applied to the leading zeros as well. In this case, an infinite number of leading zeros from a positive integer changes to an infinite number of leading 1s, which makes it a negative integer. As an example, ~...0000010011=...1111101100.
Arithmetically, ~n=-n-1.

Dealing with binary numbers; bitwise scripts
Even if GM's built in operators cover almost everything concerning binary operations, these can be extended into bit manipulation algorythms. As an example, retrieving a particular bit at a given position in the given number, or finding the number of bits in a binary integer. Even if these things could be done by converting the decimal integer to a binary string ([url="http://www.gmlscripts.com/script/dec_to_bin"]gmlscripts.com[/url] is a good place to search for binary/hex encoding), and then manipulating the number directly as a binary string, there are ways of doing it easier. Here is a small collection of bit manipulation basic scripts I have made. These scripts use the standard "bit twiddling" algorythms, and neither of them converts the integer to a string before manipulating the binary value.
You can find a .gml file with the scripts in [url="http://solidfiles.com/d/xB1E"]here[/url].

bit_size
This script returns the number of bits used by the given integer. Since always a binary positive number begins with 1, the amount of used bits is determined by the largest power of 2 which is part of the number. To retrieve the power to wich a given base must be raised in order to return a particular number, you have to use a logarithmic function, in this case log2(x). Because the function returns an integer when the argument is the exact power of 2, I couldn't use ceil(x), so I have used floor(x+1), where x is of course the result of the logaritmic function. For a detailed explination concerning logarithms, simply check [url="http://en.wikipedia.org/wiki/Logarithm"]Wikipedia[/url].

Code:: /* Script: bit_size(n) Arguments; n, integer Returned type: integer Returned value: number of bits in n */ var n; n=argument0; if(n==0)return 0; return floor(log2(abs(n))+1);

bit_get
This script returns the bit located at a given position in a given integer. Note that n&1 is used to retrieve the first bit in an integer. You can use if(n&1) to check whether n is even or odd, since every even number encoded binary ends in 0 and any odd number in base 2 ends in 1. This way is probably shorter than using if(n mod 2).

Code:: /* Script: bit_get(n,pos) Arguments; n, integer pos, integer Returned type: boolean Returned value: bit at position pos in n */ var n,pos; n=argument0; pos=argument1; if(n<0)return !(abs(n)>>pos&1); return n>>pos&1;

bit_set
This script returns a copy of the given integer, whose bit located at the given position is replaced by a given value (if the value is larger than 0 the bit is considered 1, otherwise 0). The script uses two different algorythms in the two cases, even if they are in a way similar.

Code:: /* Script: bit_set(n,pos,bit) Arguments; n, integer pos, integer bit, boolean Returned type: integer Returned value: copy of integer n with bit at position pos changed to bit */ var n,pos,bit; n=argument0; pos=argument1; bit=argument2; if(bit)return n|(1<<pos);> return n& (~power(2,target));

bit_toggle
This script returns a copy of the given integer, whose bit located at the given position is negated (if it is 1, it becomes 0 and if it is 0 it becomes 1). While setting a particular bit to true requires using an or bitwise operator and setting it to false requires an and, toggling it uses xor.

Code:: /* Script: bit_toggle(n,pos) Arguments; n, integer pos, integer Returned type: integer Returned value: copy of integer n with negated bit at position pos */ var n,pos; n=argument0; pos=argument1; bit=argument2; return n^(1<<pos);>

bit_count
This script returns the amount of bits with the value of 1 if the number is positive and the amount if bits holding 0 if the number is negative. Note that in a positive number the amount of 0 bits is infinite because of leading zeros, and so it happens to negative numbers and 1s.

Code:: /* Script: bit_count(n) Arguments; n, integer Returned type: integer Returned value: number of 1 bits if n is positive and number of 0 bits otherwise */ var n,r; n=abs(argument0); r=0; repeat(bit_size(n)){ r+=n&1; n=n>>1; }; return r;

Working with bits; compiling booleans
Working with singular bits in Game Maker can come in hand in many situations. As an example, the xor bitwise operator can be used in ecrypting a string with a password (see [url="http://gmc.yoyogames.com/index.php?showtopic=464424"]this[/url]). Bitwise operators can also be used in all sorts of string conversions, color management, arythmetical operations, and yet lots of other things, even if GM does not work with the "deep layers of the computer's mind".
You can notice that sometimes there are used a lot of singular bools, which unfortunately are used as separate variables, variables designed to be integers and not bools, which takes more space than it should. However there are ways of optimising this, by assigning each bit in an integer or a string to one bool variable. In this case you have a "bool collection" which can be accessed through a read-bit/write-bit system. Using an ASCII string for this would probably be more optimised than using an integer, since at 8 bits per character, a byte would take the same space as an integer whose maximum value is 2⁸. At a much larger area - let's say 128 bits - the process would probably crash for integers, since GM can't process that large numbers. Instead an 128-bit string would be pretty easy to process, while it has only 16 characters. For strings, the binary data retrieving process is actually the same as for integers, since you can always use ord(str) to retrieve the ASCII code of a particular character and chr(val) to retrieve a character from the ASCII value. Since these function apply to only one character, you only have to deal with 8 bits at once. This technique can also be used for basic encryptions.

Final words
Writing this tutorial was actually fun for me. The motivation for writing this was probably the fact that I could find no other tutorial concerning bitwise operators and binary manipulation algorythms here at GMC. If you feel like something should be added to it, or something is not clear enough presented, please feel free to tell me; also please notice me if there are any mistakes in there; this thing is pretty big and I don't have enough time to check every line in there...
The scripts that are presented here are giving a basic hint on the proccess twiddling. Post if you need a similar script or anything; I'm willing to add as many things as needed in order to make this as complete as possible.
As for now, 01110100011010000110000101101110 01101011011100110010000001100110 01101111011100100010000001110010 01100101011000010110010001101001 01101110011001110010000001100001 01101110011001000010000001101000 01100001011101100110010100100000 01100001001000000110111001101001 01100011011001010010000001100100 011000010111100100100001

-Ramses

Master Game Community

Working with bitwise operators

Working with bitwise operators