This notebook contains an excerpt from the Python Programming and Numerical Methods - A Guide for Engineers and Scientists; the content is also available at Berkeley Python Numerical Methods.
The copyright of the book belongs to Elsevier. We also have this interactive book online for a better learning experience. The code is released under the MIT license. If you find this content useful, please consider supporting the work on Elsevier or Amazon!
Base-N and Binary
The decimal system is a way of representing numbers that you are familiar with from elementary school. In the decimal system, a number is represented by a list of digits from 0 to 9, where each digit represents the coefficient for a power of 10.
EXAMPLE: Show the decimal expansion for 147.3.
\(147.3 = 1 \cdot 10^2 + 4 \cdot 10^1 + 7 \cdot 10^0 + 3 \cdot 10^{-1}\).
Since each digit is associated with a power of 10, the decimal system is also known as base10 because it is based on 10 digits (0 to 9). However, there is nothing special about base10 numbers except perhaps that you are more accustomed to using them. For example, in base3 we have the digits 0, 1, and 2, and the number \(121\ (base\ 3) = 1 \cdot 3^2 + 2 \cdot 3^1 + 1 \cdot 3^0 = 9 + 6 + 1 = 16\ (base\ 10)\).
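The positional expansion above can be sketched in a few lines of Python. The helper name `to_decimal` is an illustrative assumption, not something from the book, and it only handles bases 2 through 10 with nonnegative whole numbers. Python's built-in `int(text, base)` performs the same conversion.

```python
def to_decimal(digits, base):
    """Evaluate a string of digits written in the given base (2 to 10)."""
    value = 0
    # The rightmost digit is the coefficient of base**0, the next of base**1, ...
    for position, digit in enumerate(reversed(digits)):
        d = int(digit)
        if d >= base:
            raise ValueError(f"digit {d} is not valid in base {base}")
        value += d * base**position
    return value


print(to_decimal("121", 3))  # 1*9 + 2*3 + 1*1 = 16
print(int("121", 3))         # built-in conversion gives the same value
```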
Because it is useful to denote which representation a number is in, every number in this chapter is followed by its representation in parentheses (e.g., 11 (base10) means 11 in base10) unless the context is clear.
A very important representation of numbers for computers is base2, or binary. In binary, the only available digits are 0 and 1, and each digit is the coefficient of a power of 2. Each digit in a binary number is also known as a bit. Note that binary numbers are still numbers, and so addition and multiplication are defined on them exactly as you learned in grade school.
TRY IT! Convert the number 11 (base10) into binary. \(11 (base\ 10) = 8 + 2 + 1 = 1 \cdot 2^3 + 0 \cdot 2^2 +1 \cdot 2^1 +1 \cdot 2^0 = 1011 (base\ 2)\)
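One common way to carry out this conversion by hand is repeated division by 2, collecting the remainders. The sketch below assumes a nonnegative integer input, and `to_binary` is a hypothetical helper name. The built-in `bin` does the same job, with a `0b` prefix.

```python
def to_binary(n):
    """Return the binary digits of a nonnegative integer as a string."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        # The remainder is the next bit, starting from the least significant.
        n, remainder = divmod(n, 2)
        bits.append(str(remainder))
    return "".join(reversed(bits))


print(to_binary(11))  # 1011
print(bin(11))        # 0b1011
```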
TRY IT! Convert 37 (base10) and 17 (base10) to binary. Add and multiply the resulting numbers in binary. Verify that the result is correct in base10.
Convert to binary:
\(37\ (base\ 10) = 32 + 4 + 1 = 1 \cdot 2^5 + 0 \cdot 2^4 + 0 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 = 100101\ (base\ 2)\)
\(17\ (base\ 10) = 16 + 1 = 1 \cdot 2^4 + 0 \cdot 2^3 + 0 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 = 10001\ (base\ 2)\)
Get results of addition and multiplication in decimal:
\(37 + 17 = 54\)
\(37\times17 = 629\)
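Do addition in binary:
\(100101\ (base\ 2) + 10001\ (base\ 2) = 110110\ (base\ 2) = 32 + 16 + 4 + 2 = 54\ (base\ 10)\)
Do multiplication in binary:
\(100101\ (base\ 2) \times 10001\ (base\ 2) = 1001010000\ (base\ 2) + 100101\ (base\ 2) = 1001110101\ (base\ 2) = 512 + 64 + 32 + 16 + 4 + 1 = 629\ (base\ 10)\)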
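The arithmetic in this exercise can be checked with a short Python sketch. It relies only on the built-ins `int(text, 2)` to read a binary string and `format(value, "b")` to write one. The variable names are illustrative.

```python
a = int("100101", 2)  # 37 (base10)
b = int("10001", 2)   # 17 (base10)

total = a + b
product = a * b

# Show each result in binary and in decimal.
print(format(total, "b"), total)      # 110110 54
print(format(product, "b"), product)  # 1001110101 629
```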
Binary numbers are useful for computers because arithmetic operations on the digits 0 and 1 can be implemented using the logical operations AND, OR, and NOT, which computers can perform extremely fast.
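To illustrate how addition reduces to logic, here is a minimal sketch of a ripple-carry adder that uses only Python's `and`, `or`, and `not` on single bits. The function names `xor`, `full_adder`, and `add_binary` are assumptions made for this example. Real hardware builds the same idea from logic gates rather than Python booleans.

```python
def xor(a, b):
    """Exclusive OR built from AND, OR, and NOT."""
    return (a or b) and not (a and b)


def full_adder(a, b, carry_in):
    """Add three bits (as booleans); return (sum_bit, carry_out)."""
    partial = xor(a, b)
    total = xor(partial, carry_in)
    carry_out = (a and b) or (partial and carry_in)
    return total, carry_out


def add_binary(x, y):
    """Add two binary strings bit by bit, from the least significant bit."""
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)
    carry = False
    result = []
    for bit_x, bit_y in zip(reversed(x), reversed(y)):
        total, carry = full_adder(bit_x == "1", bit_y == "1", carry)
        result.append("1" if total else "0")
    if carry:
        result.append("1")
    return "".join(reversed(result))


print(add_binary("100101", "10001"))  # 110110
```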
Unlike humans, who can abstract numbers to arbitrarily large values, computers have a fixed number of bits that they are capable of storing at one time. For example, a 32-bit computer can represent and process 32-digit binary numbers and no more. If all 32 bits are used to represent nonnegative integers, then there are \(2^{32} = 4,294,967,296\) numbers the computer can represent, ranging from 0 to \(\sum_{n=0}^{31} 2^{n} = 2^{32} - 1 = 4,294,967,295\). This is not very many numbers at all and would be completely insufficient for most useful arithmetic. For example, you could not compute the perfectly reasonable sum \(0.5 + 1.25\) using this representation because all the bits are dedicated to integers.
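The counts for a 32-bit nonnegative integer representation can be checked directly. This is a small sketch with illustrative variable names. Python integers are not limited to 32 bits, so the limit is only being computed here, not enforced.

```python
bits = 32

# Number of distinct bit patterns, and the largest value with all bits set to 1.
count = 2**bits
largest = sum(2**n for n in range(bits))

print(count)    # 4294967296
print(largest)  # 4294967295

# The sum 0.5 + 1.25 is not an integer, so an integer-only
# representation has no bit pattern for it.
value = 0.5 + 1.25
print(value, value.is_integer())  # 1.75 False
```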