Computing Limits

From ComputingForScientists

Jump to: navigation, search

Contents

  1. Computing Limits
    1. Objectives
    2. Motivation
    3. A Simple Spreadsheet
    4. Memory Limits
      1. Variable Name and Value Limits
      2. Precision Limits
      3. A More Realistic Spreadsheet
      4. IEEE 754 Floating Point Example
    5. Time Limits
    6. Time and Memory Limits Example
  2. Problems
    1. A Simpler Spreadsheet
    2. Variable Name Limits
    3. Memory Limits
    4. IEEE 754 Floating Point
  3. Activities
    1. Spreadsheet Limits
    2. Time Limits
    3. Pythagorean Theorem
    4. Fermat's Last Theorem
  4. Resources

1. Computing Limits

1.1. Objectives

To explain limitations of what can be computed due to:

  1. memory limitations
  2. precision limitations
  3. computing speed limitations

1.2. Motivation

When designing a computational experiment, computing limitations must be considered to ensure the experiment will yield meaningful results. In this section, we introduce a simple spreadsheet encoding rule to demonstrate several computing limitations.

1.3. A Simple Spreadsheet

In this section, we introduce a simple spreadsheet encoding rule in order to demonstrate several computing limitations.

Suppose a spreadsheet internally represents each decimal number in a cell as a 64-bit pattern. When the file is opened, the operating system copies all of the bit patterns in the spreadsheet from slow memory (usually a disk) into fast memory (RAM). The file must also contain information so the spreadsheet can determine which cell is associated with each pattern. If the spreadsheet had only one value, say 100,000,000, it would need to store the cell name (e.g., A1) along with the the location of the bit pattern associated with this name.

If we encoded the variable name information in 7-bit ACSII, the file would need to contain the following information (the first row is the human readable version and the second row is what is written in the file):

    A        1        100000000
 1000001  110001 101111101011110000100000000

In reality, most programs use a standard size for each element above to simplify decoding. If the encoding rule was that the variable name information was encoded in 8-bit ASCII and the cell value as a 64-bit binary representation of the cell value, the file would actually contain extra zeros (the first row is the human readable version and the second row is what is written in the file):

  A         1                                  100000000
01000001  00110001 0000000000000000000000000000000000000101111101011110000100000000

When the file is read, the spreadsheet reads the first 16 bits to determine the cell name (e.g., A1) in which to place the number found by converting the next 64 bits to an integer. If the end of the file is not found after the 80th bit, it reads the next 16 bits to determine the cell in which to place the next number found in the next 64 bits.

The above is not actually how information is encoded in a spreadsheet file. In addition, the spreadsheet file has other information (for example, what color each cell is). However, this simple spreadsheet encoding is sufficient to explain how information could be encoded and will be used in the following discussion of computing limits.

1.4. Memory Limits

In the simple spreadsheet example, the limitation on the number of cells that could be read into RAM is the amount of available RAM memory (say 232 bits) required for each cell (16 bits + 64 bits = 80 bits), or 232 /80 = 53,687,091. (This is an upper limit. Not all of the RAM on a computer is available for use by a given program. The operating system and other programs, such as a web browser, take up RAM space as does other hardware, such as the graphics card.)

1.4.1. Variable Name and Value Limits

If the simple spreadsheet encoding method is used, there is a limitation on the number of variable names that can be used. The variable name is represented by 16 bits, and there are 216 unique bit patterns that can be associated with a variable name.

The limitation on the possible values of the number associated with each variable name is it must be an integer between 0 and 264-1 (1.8446744e+19).

1.4.2. Precision Limits

In the simple spreadsheet example, we were limited to storing integers only in the cell array. This means our precision is limited in two ways:

  1. Only integer values may be stored, so if you entered 1.1, the saved file would only contain the binary representation of the decimal number 1 (63 zeros and a 1).
  2. The number must be in the range of 0 through 264-1 = 1.8446744e+19 (i.e., 1.8446744e+19).

Both of these limitations will lead to round-off errors. In the second case, suppose you said the value of element C1 is the value of A1/B1 and A1 = 1 and B1 = 2. The value stored would be 0.5 rounded, or 1. Only when A1 and B1 are integers will there be no round-off errors.

1.4.3. A More Realistic Spreadsheet

A more realistic spreadsheet encoding would allow non-integer values. To allow for this, we need to modify how cell values are encoded. However, there are many ways to do this, and in the early 1980s there was an effort to create a standard encoding rule [1]. The result of this effort was a standard called IEEE 754 floating point. The standard said that to represent a non-integer number as a bit pattern, write the sign of the number as the first bit (0 for + and 1 for -), the "exponent" as the next 11 bits, and then the fraction as the next 52 bits. For example, the IEEE 754 representation of 1.1 is:

sign       exponent                      significand
0       01111111111 10001100110011001100110011001100110011001100110011010

In order to translate this to a decimal number, we must first convert the sign, exponent, and significand to decimal. The sign and exponent are integers, so the method of translation to decimal is the same as that used previously.

To translate the significand to decimal, we must add to one a fraction determined by adding

1 + s1/21 + s2/22 + ... + s52/252

Where the value of the numerator s is zero or one, with values taken from the significand pattern working from left to right). The first five numerators are 1, 0, 0, 0, and 1. The last two numerators are 1 and 0, giving

1 + 1/21 + 0/22 + 0/23 + 0/24 + 1/25 + ... + 1/251 + 0/252 = 1.000000000000000888178419700125232338905

(this is the method of creating a fractional decimal number from a binary pattern). The previous table, with values written in decimal, is:

signd        exponentd            significandd
 0            1023    1.000000000000000888178419700125232338905

The rule for decoding this is to multiply the three items together: -1signd ·2(exponentd-1023)·significandd. The result is:

-10 x 21023-1023·1.1000000000000000888178419700125232338905

or:

1.1000000000000000888178419700125232338905

What can be concluded is that the decimal number cannot be exactly represented in IEEE 754 in binary. If you encode the number 1.1 in IEEE 754 in binary and someone decodes it, they will think that you actually wrote:

1.1000000000000000888178419700125232338905

instead of:

1.1000000000000000000000000000000000000000

If the computation of 1.1·1.1 is made by hand, the result is:

1.2100000000000000000000000000000000000000

but on a computer that uses IEEE 754 floating point, the result is:

1.2100000000000001865174681370262987911701

Because of the multiplication, the error part of the result is now larger (that starts in the 16th decimal place) than the error part in the representation of 1.1 (that starts in the 17th decimal place).

1.4.4. IEEE 754 Floating Point Example

TODO

1.5. Time Limits

Each computer sold has a limit on the number of floating point operations per second (FLOPS) that can be performed. Suppose it took a human 2 seconds to multiply the floating point numbers 1.1·1.1. This is one floating point operation in two seconds, or 0.5 FLOPS (1 floating point ÷ 2 seconds).

This number is typically four times the "clock speed" of the processor. Modern computers typically have clock speeds of 2 GHz, or 2·109 cycles per second and the number of floating point operations that can be performed per second is usually 4x larger than the clock speed, corresponding to 8·109 FLOPS.

This number is very large but still puts a limit on the scale of a scientific problem that can be solved. As an example, consider computing digits of π. The following calculation must be made:

$$\pi = \frac{4}{1} - \frac{4}{3} + \frac{4}{5} - \frac{4}{7} + \frac{4}{9} - \frac{4}{11} + \frac{4}{13} - \cdots$$

Each division operation corresponds to a floating point operation as does each addition or subtraction.

1.6. Time and Memory Limits Example

In the introduction, we discussed how population growth could be computed by repeating the formula:

Pnext = Pnow + αPnow

where Pnext is the population in the next time interval, Pnow is the population in the current time interval, and α is the growth rate. Suppose the number of bacteria doubled every second, corresponding to α = 1. If the initial population was 2, the population in the next second is computed as:

Pnext = 2 + 1·2 = 4

and the population in the third second is:

Pnext = 4 + 1·4 = 8

Each of the above two calculations requires two floating point operations: a multiplication and an addition.

2. Problems

2.1. A Simpler Spreadsheet

2.2. Variable Name Limits

Suppose the simple spreadsheet used only 8 bits to represent the variable name. How many unique variable names (cells) could a spreadsheet have?

2.3. Memory Limits

If a computer has 100 MB of RAM available, what is the largest spreadsheet file that could be opened (assuming the file was encoded in the simple spreadsheet format)?

2.4. IEEE 754 Floating Point

Decode the following IEEE 754 floating point number into decimal (to three decimal places only!).

0 10000000000 0001100110011001100110011001100110011001100110011010

3. Activities

3.1. Spreadsheet Limits

Open up a blank spreadsheet and place the value 100 in as many cells as possible. Your instructor will show you how to do this. When does the program tell you that there are too many values? When you save a file with less than this number, how big is the file on disk? Is this file smaller or larger than the file that would be create using the simple spreadsheet encoding rule given earlier?

3.2. Time Limits

Use the formula $$\pi = \frac{4}{1} - \frac{4}{3} + \frac{4}{5} - \frac{4}{7} + \frac{4}{9} - \frac{4}{11} + \frac{4}{13} - \cdots$$ to compute π to 10 decimal places using a calculator. How many FLOPS were you able to perform?

3.3. Pythagorean Theorem

Suppose that you were asked to verify that the following equation is true for some positive integer values of a, b, and c using a computer.

a2 + b2 = c2

What would you do? How many positive values of a, b, and c could you check in a year using a computer with the following specifications:

  • 2 GHz clock speed.
  • The computer can store positive integers between 0 and 264-1.

Do the same for the case where

  • The computer can store positive integers between 0 and 232-1.

3.4. Fermat's Last Theorem

In number theory, Fermat's Last Theorem (sometimes called Fermat's conjecture, especially in older texts) states that no three positive integers a, b, and c can satisfy the equation an + bn = cn for any integer value of n greater than two. [2]

Suppose that you were asked to verify that the following equation is true. What would you do? Could you ever verify that it is true for all possible values of a, b, and c?

4. Resources

Personal tools