Representation of Numbers

Contents

Introduction

When working with any kind of digital system (electronics or computers) in which numbers are being represented, it is important to understand the different ways in which these numbers are represented. Almost without exception, numbers are represented by two voltage levels which can represent a one or a zero (an interesting exception to this rule are newer memory devices that use one four (or more) possible voltage levels, thereby increasing the amount of information that can be stored by a single memory cell). The number system based on ones and zeroes is called the binary system (because there are only two possible digits). Before discussing the binary system, a review of the decimal (ten possible digits) system is in order, because many of the concepts of the binary system will be easier to understand when introduced alongside their decimal counterpart.

You are familiar with the decimal system. For instance, to represent the positive integer one hundred and twenty-five as a decimal number, we can write (with the postivie sign implied). The subscript 10 denotes the number as a base 10 (decimal) number.

12510 = 1·100 + 2·10 + 5·1 = 1·102 + 2·101 + 5·100

The rightmost digit is multiplied by 100, the next digit to the left is multiplied by 101, and so on. Each digit to the left has a multiplier that is 10 times the previous digit. Some observations:

Representing fractions is a simple extension of this idea. To wit,

25.43 10 = 2·10 + 5·1 + 4·0.1 + 3·0.01 = 2·101 + 5·100 + 4·10-1+ 3·10-2

The only pertinent observations here are:

After reading this dcoument you might want to learn something about binary arithmetic.

Binary Representation of positive integers

Binary representations of positive can be understood in the same way as their decimal counterparts. For example

8610 = 1·64 + 0·32 + 1·16 + 0·8 + 1·4 + 1·2 + 0·1
   or
8610 = 1· 26 + 0· 25 + 1· 24 + 0· 23 + 1· 22 + 1· 21 + 0· 20
   or
8610 = 1010110 2

The subscript 2 denotes a binary (i.e., base 2) number. Each digit in a binary number is called a bit. The number 1010110 is represented by 7 bits. Any number can be broken down this way, by finding all of the powers of 2 that add up to the number in question (in this case 26, 24, 22 and 21). You can see this is exactly analagous to the decimal deconstruction of the number 125 that was done earlier. Likewise we can make a similar set of observations:

Exercises (positive integers):

Convert 125 from decimal to binary 
Convert 96 from decimal to binary 
Convert 10011 from binary to decimal 
In 'C', an unsigned integer is usually 16 bits. What is the largest number that can be represented by an unsigned integer? 
Convert 37 to binary, shift it left by one and convert back to decimal.  What is the result

Hexadecimal, Octal, Bits, Bytes and Words.

It is often convenient to handle groups of bits, rather than individually. The most common grouping is 8 bits, which forms a byte. A single byte can represent 256 (28) numbers. Memory capacity is usually referred to in bytes. Two bytes is usually called a word, or short word (though word-length depends on the application). A two-byte word is also the size that is usually used to represent integers in programming languages. A long word is usually twice as long as a word. A less common unit is the nibble which is 4 bits, or half of a byte.

It is cumbersome for humans to deal with writing, reading and remembering individual bits, because it takes many of them to represent even fairly small numbers. A number of different ways have been developed to make the handling of binary data easier for us. The most common is hexadecimal. In hexadecimal notation, 4 bits (a nibble) are represented by a single digit. There is obviously a problem with this since 4 bits gives 16 possible combinations, and there are only 10 unique decimal digits, 0 to 9. This is solved by using the first 6 letters (A..F) of the alphabet as numbers. The table shows the relationship between decimal, hexadecimal and binary.

Decimal Hexadecimal Binary
0 0 0000
1 1 0001
2 2 0010
3 3 0011
4 4 0100
5 5 0101
6 6 0110
7 7 0111
8 8 1000
9 9 1001
10 A 1010
11 B 1011
12 C 1100
13 D 1101
14 E 1110
15 F 1111

There are some significant advantages to using hexadecimal when dealing with electronic representations of numbers (if people had 16 fingers, we wouldn't be saddled with the awkward decimal system). Using hexadecimal makes it very easy to convert back and forth from binary because each hexadecimal digit corresponds to exactly 4 bits (log 2(16) = 4) and each byte is two hexadecimal digit. In contrast, a decimal digit corresponds to log2(10) = 3.322 bits and a byte is 2.408 decimal digits. Clearly hexadecimal is better suited to the task of representing binary numbers than is decimal.

As an example, the number CA3 16 = 1100 1010 00112 (11002 = C16 , 10102 = A16, 00112 = 3 16). It is convenient to write the binary number with spaces after every fourth bit to make it easier to read. Converting back and forth to decimal is more difficult, but can be done in the same way as before.

323510 = C16·256 + A16·16 + 316·1 = C16 ·162 + A16 ·161 + 316 ·160
   or
323510 = 12·256 + 10·16 + 3·1 = 12·162 +10·161 +3·160

Octal notation is yet another compact method for writing binary numbers. There are 8 octal characters, 0...7. Obviously this can be represented by exactly 3 bits. Two octal digits can represent numbers up to 64, and three octal digits up to 512. A byte requires 2.667 octal digits. Octal used to be quiete common, it was the primary way of doing low level I/O on some old DEC computers. It is much less common today but is still used occasionally (e.g., to set read, write and execute permissions on Unix systems)

Summary of binary types:

Exercises (binary types):

Convert 2000 from decimal to hexadecimal 
Convert 3C from hexadecimal to decimal 
Convert 1010 0111 1011 from binary to hexadecimal 
Convert 7D0 from hexadecimal to binary 
If you shift a hexadecimal number to the left by one digit, how many times larger is the resulting number? 

Signed Binary Integers

It was noted previously that we will not be using a minus sign (-) to represent negative numbers. We would like to represent our binary numbers with only two symbols, 0 and 1. There are a few ways to represent negative binary numbers. The simplest of these methods is called ones complement, where the sign of a binary number is changed by simply toggling each bit (0's become 1's and vice-versa). This has some difficulties, among them the fact that zero can be represented in two different ways (for an eight bit number these would be 0000 0000 and 1111 1111)., we will use a method called two's complement notation which avoids the pitfalls of one's complement, but which is a bit more complicated.

To represent an n bit signed binary number the leftmost bit, has a special significance. The difference between a signed and an unsigned number is given in the table below for an 8 bit number.

The value of bits in signed and unsigned binary numbers
  Bit 7 Bit 6 Bit 5 Bit 4 Bit 3 Bit 2 Bit 1 Bit 0
Unsigned 27 = 128 26 = 64 25 = 32 24= 16 23= 8 22 = 4 21 = 2 20 = 1
Signed -(27) = -128 26 = 64 25 = 32 24 = 16 23 = 8 22 = 4 21= 2 20 = 1


Let's look at how this changes the value of some binary numbers

Binary Unsigned Signed
0010 0011 35 35
1010 0011 163 -93
1111 1111 255 -1
1000 0000 128 -128

If Bit 7 is not set (as in the first example) the representation of signed and unsigned numbers is the same. However, when Bit 7 is set, the number is always negative. For this reason Bit 7 is sometimes called the sign bit. Signed numbers are added in the same way as unsigned numbers, the only difference is in the way they are interpreted. This is important for designers of arithmetic circuitry because it means that numbers can be added by the same circuitry regardless of whether or not they are signed.

To form a two's complement number that is negative you simply take the corresponding positive number, invert all the bits, and add 1. The example below illustrated this by forming the number negative 35 as a two's complement integer:

3510 = 0010 00112
invert -> 1101 11002
add 1 -> 1101 11012


So 1101 1101 is our two's complement representation of -35. We can check this by adding up the contributions from the individual bits

1101 11012 = -128 + 64 + 0 + 16 + 8 + 4 + 0 + 1 = -35.

The same procedure (invert and add 1) is used to convert the negative number to its positive equivalent. If we want to know what what number is represented by 1111 1101, we apply the procedure again

? = 1111 11012
invert -> 0000 00102
add 1 -> 0000 00112


Since 0000 0011 represents the number 3, we know that 1111 1101 represents the number -3.

Note that a number can be extended from 4 bits to 8 bits by simply repeating the leftmost bit 4 times. Consider the following examples

Decimal 4 bit 8 bit
3 0011 0000 0011
-3 1101 1111 1101
7 0111 0000 0111
-5 1011 1111 1011

Let's carefully consider the last case which uses the number -5. As a 4 bit number this is represented as

1011 = -8 + 2 + 1 = -5

The 8 bit number is

1111 1011 = -128 + 64 + 32 + 16 + 8 + 2 + 1 = -5.

It is clear that in the second case the sum of the contributions from the leftmost 5 bits (-128 + 64 + 32 + 16 + 8 = -8) is the same as the contribution from the leftmost bit in the 4 bit representation (-8)

This process is refered to as sign-extension, and can be applied whenever a number is to be represented by a larger number of bits. Likewise you can remove all but one of the leftmost bits, as long as they are all the same, so the 8 bit number 000001112=710 can be replaced by 01112. Also 111110112=-510 can be replaced by 10112).

Most processors even have two separate instructions for shifting numbers to the right (which, you will recall, is equivalent to dividing the number in half). The first instruction is something like LSR (Logical Shift Right) which simply shifts the bits to the right and usually fills a zero in as the lefmost bit. The second instruction is something like ASR (Arithmetic Shift Right), which shifts all of the bits to the right, while keeping the leftmost bit unchanged. With ASR 1010 (-6) becomes 1101 (-3). Of course, there is only one instruction for a left shift (since LSL is equivalent to ASL).

Positive binary fractions

The representation of unsigned binary fractions proceeds in exactly the same way as decimal fractions. For example

0.62510 = 1·0.5 + 0·0.25 + 1·0.125 = 1· 2-1 + 0· 2-2 + 1· 2-3 = 0.1012


Each place to the right of the decimal point represents a negative power of 2, just as for decimals they represent a negative power of 10. Likewise, if there are m bits to the right of a decimal, the precision of the number is 2-m (versus 10-m for decimal). Though it is possible to represent numbers greater than one by having digits to the left of the decimal place we will restrict ourselves to numbers less than one. These are commonly used by Digital Signal Processors.

The largest number that can be represented by such a representation is 1-2-m , the smallest number is 2-m. For a fraction with 15 bits of resolution this gives a range of approximately 0.99997 to 3.05E-5.

Note that this representationis easily extended to represent all positive numbers by having the digits to the left of the decimal point represent the integer part, and the digits to the right representing the fractional part. Thus

6.62510 = 110.1012

Exercises (positive binary fractions):

Convert 0.100 1001 from binary to decimal 
Convert 0.111 1111 from binary to decimal
Convert 0.75 from decimal to an 8 bit Q7 binary fraction 
Convert 0.65625 from decimal to an 8 bit Q7 binary fraction 
Approximate 0.9 as an 8 bit Q7 binary fraction

Signed binary fractions

Signed binary fractions are formed much like signed integers. We will work with a single digit to the left of the decimal point, and this will represent the number -1 (= -(20)). The rest of the representation of the fraction remains unchanged. Therefore this leftmost bit represents a sign bit just as with two's complement integers. If this bit is set, the number is negative, otherwise the number is positive. The largest positive number that can be represented is still 1-2-m but the largest negative number is -1. The resolution is still 1-2-m.

There is a terminology for naming the resolution of signed fractions. If there are m bits to the right of the decimal point, the number is said to be in Qm format. For a 16 bit number (15 bits to the right of the decimal point) this results in Q15 notation.

Exercises (signed binary fractions):

Convert 1.100 1001 from binary to decimal
Convert 1.111 1111 from binary to decimal
Convert -0.75 from decimal to a binary fraction 
Convert -0.65625 from decimal to a binary fraction 
Approximate -0.9 as a binary fraction (use 8 bits) 

Signed binary fractions are easily extended to include all numbers by representing the number to the left of the decimal point as a 2's complement integer, and the number to the right of the decimal point as a positive fraction. Thus

-6.62510 = (-7+0.375)10 = 1001.0112


Note, that as with two's complement integers, the leftmost digit can be repeated any number of times without affecting the value of the number.

A Quicker Method for Converting Binary Fractions.

Another way to convert Qm numbers to decimal is to represent the binary number as a signed integer, and to divide by 2m; this is equivalent to shifting the decimal point m places to the right. To convert a decimal number to Qm, multiply the number by 2m and take the rightmost m digits. Note, this simply truncates the number; it is more elegant, and accurate, but slightly more complicated, to round the number.

Examples (all Q7 numbers):

Convert 0.100 1001 to decimal. Take the binary number 0100 1001 (=7310), and divide by 27=128. The answer is 73/128=0.5703125, which agrees with the result of the previous exercise (Positive Binary Fractions).
Convert 1.100 1001  to decimal. Take the two's complements binary number 1100 1001 (=-5510), and divide by 128. The answer is -0.4296875, which agrees with the result of the previous exercise (Signed Binary Fractions).
Convert 0.9 to Q7 format Multiply 0.9 by 128 to get  115.2. This is represented in binary as 111 0011, so the Q7 representation is 0.111 0011. This agrees with the result of the previous exercise (Positive Binary Fractions).
Convert -0.9 to Q7 format Multiply -0.9 by 128 to get  -115.2. The Q7 representation is 1.000 1101. This agrees with the result of the previous exercise (Signed Binary Fractions).

Floating point numbers

This section is not complete

IEEE single precision floating point

This section is not complete

IEEE double precision floating point

This section is not complete


Comments or Questions?

Erik Cheever
Professor Emeritus
Engineering Department
Swarthmore College