Internal data representation

Data is stored internally as bits: electric circuits that are either on (1) or off (0).
On external storage devices, bits are recorded as magnetic or reflective spots.
 
8 bits make up 1 byte.
 
When string data is stored in memory using ASCII, each character or symbol requires 1 byte of storage.
 
The following representation of a byte is how the character “a” is stored in ASCII.

 0 1 1 0 0 0 0 1


ASCII stands for American Standard Code for Information Interchange. It is a standard coding scheme used by most personal computers to represent letters, numerals, and symbols, as well as keys on the keyboard such as Enter and Backspace.
 
When we refer to an ASCII code we usually use the decimal equivalent of the binary (base 2) number represented by the ONs and OFFs (1s and 0s). So we would say that the character “a” is represented by the ASCII code 97. To convert a binary number to decimal we need to understand how numbers in different bases work.
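The mapping between a character, its decimal code, and its bit pattern can be checked with a few lines of Python. This is only an illustrative sketch; the language and the variable names are choices made here, not part of these notes.

```python
# Look up the ASCII code of a character and show its 8-bit pattern.
ch = "a"
code = ord(ch)                # decimal code of the character
bits = format(code, "08b")    # eight binary digits, padded with leading zeros

print(ch, code, bits)         # a 97 01100001
```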
 
The number 789 in base ten is equal to


   9 * 10^0 =   9 (any number raised to the power of zero equals 1)
 + 8 * 10^1 =  80 (any number raised to the power of one equals itself)
 + 7 * 10^2 = 700
 
9 + 80 + 700 = 789
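The place-value breakdown above can be sketched in Python as follows. The function name expand is hypothetical and exists only for this illustration; it lists the value contributed by each digit, starting from the rightmost one.

```python
def expand(digits, base):
    """Return the place-value terms of a digit string, rightmost digit first."""
    terms = []
    for power, digit in enumerate(reversed(digits)):
        # Each digit is multiplied by the base raised to its position.
        terms.append(int(digit, base) * base ** power)
    return terms

terms = expand("789", 10)
print(terms)         # [9, 80, 700]
print(sum(terms))    # 789
```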


 
The same can be done with a binary (base 2) number

        0      1      1      0      0      0      0      1

      2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0


      1 * 2^0 =  1
    + 0 * 2^1 =  0
    + 0 * 2^2 =  0
    + 0 * 2^3 =  0
    + 0 * 2^4 =  0
    + 1 * 2^5 = 32
    + 1 * 2^6 = 64
    + 0 * 2^7 =  0
 
     1 + 32 + 64 = 97
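The same conversion can be written as a short loop. The sketch below (variable names are illustrative) adds up the powers of two for each bit that is on, then compares the result with Python's built-in base-2 conversion.

```python
pattern = "01100001"

# Walk the bits from right to left; position 0 is the rightmost bit.
value = 0
for position, bit in enumerate(reversed(pattern)):
    value += int(bit) * 2 ** position

print(value)             # 97
print(int(pattern, 2))   # 97, using the built-in conversion
print(chr(value))        # a
```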

The ASCII coding scheme uses only 7 bits of a byte (codes 0 through 127). Computers also use codes 128 through 255 to represent additional characters, but these assignments are not standardized, so you may find differences between charts.
 
Because 256 codes are not sufficient to represent all the characters used in Asian languages, a newer standard has emerged. The "Unicode" character set defines well over 100,000 characters, and each character takes from one to four bytes of storage depending on the encoding used (two or four bytes in the UTF-16 encoding). Unicode is now the dominant standard and is supported by the Microsoft Windows and Mac OS operating systems. Visual Studio uses Unicode to store characters in memory.
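To see how storage size varies, the sketch below encodes four sample characters and counts the bytes each one needs. The encodings UTF-8 and UTF-16 and the sample characters are choices made for this illustration; the characters are written as escape codes so the example does not depend on the editor's own encoding.

```python
# a, e with acute accent, a Chinese character, and an emoji
samples = ["a", "\u00e9", "\u4e2d", "\U0001F600"]

sizes = {}
for ch in samples:
    # Map the code point to (bytes in UTF-8, bytes in UTF-16).
    sizes[ord(ch)] = (len(ch.encode("utf-8")), len(ch.encode("utf-16-le")))

for code_point, (n8, n16) in sizes.items():
    print(f"U+{code_point:04X}: {n8} byte(s) in UTF-8, {n16} byte(s) in UTF-16")
```

The letter a still fits in one byte under UTF-8, while the emoji needs four bytes in either encoding.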

Numbers are stored in binary as well. The same bit pattern

        0      1      1      0      0      0      0      1

      2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0

is also equivalent to the unsigned integer 97. (An integer is a whole number.)
 

How the contents of an area of memory are interpreted depends on how that area is defined. If it is defined as character data, it will be treated as the ASCII character "a".
If it is defined as an unsigned integer, it will be treated as the number 97.
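A small Python sketch makes the point concrete: one byte is created, then read once as text and once as a number. The variable names are illustrative only.

```python
raw = bytes([0b01100001])                 # one byte with the bit pattern 01100001

as_text = raw.decode("ascii")             # interpret the byte as character data
as_number = int.from_bytes(raw, "big")    # interpret the byte as an unsigned integer

print(as_text)      # a
print(as_number)    # 97
```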
 
In practice an unsigned integer uses at least 2 bytes (16 bits). With all 16 bits on,

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

two bytes can store a number as large as 65,535:

2^15 + 2^14 + 2^13 + 2^12 + 2^11 + 2^10 + 2^9 + 2^8 + 2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^2 + 2^1 + 2^0

= 32768 + 16384 + 8192 + 4096 + 2048 + 1024 + 512 + 256 + 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1

= 65,535
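The 16-bit maximum can be verified three ways in Python: by summing the powers of two, by the shortcut 2^16 - 1, and by converting sixteen 1 bits directly. This is a sketch for checking the arithmetic, with names chosen for illustration.

```python
# Powers of two for bit positions 15 down to 0.
powers = [2 ** n for n in range(15, -1, -1)]
total = sum(powers)

print(powers[0], powers[-1])   # 32768 1
print(total)                   # 65535
print(2 ** 16 - 1)             # 65535
print(int("1" * 16, 2))        # 65535
```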

 
Signed integers store the sign in the leftmost bit, which leaves one bit fewer for the size of the number.


A signed integer that uses 2 bytes can store any whole number between -32,768 and 32,767.

A signed integer that uses 4 bytes can store any whole number between -2,147,483,648 and 2,147,483,647.

The number of bytes used to store a particular numeric data type (and therefore the range of values) will vary on different platforms.
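The signed ranges quoted above can be reproduced with the sketch below. It assumes the two's complement scheme, which these notes do not name but which is consistent with the ranges given; the function name signed_range is hypothetical.

```python
def signed_range(num_bytes):
    """Smallest and largest values of a two's complement integer (assumed scheme)."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

print(signed_range(2))   # (-32768, 32767)
print(signed_range(4))   # (-2147483648, 2147483647)

# The same sixteen 1 bits read as unsigned and as signed.
raw = bytes([0xFF, 0xFF])
print(int.from_bytes(raw, "big", signed=False))   # 65535
print(int.from_bytes(raw, "big", signed=True))    # -1
```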

 

Numbers with fractional values are stored differently. Some of the bits are used to store the digits of the number and others are used to indicate the position of the decimal point, which is why these are called floating point numbers.

The more memory available to store a number, the larger or more precise it can be.
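The effect of extra memory on precision can be seen by storing the same value in 4 bytes and in 8 bytes. The sketch below uses Python's standard struct module; the format codes "f" and "d" select single and double precision, and the value 0.1 is just a sample.

```python
import struct

value = 0.1

# Round-trip the value through 4 bytes (single) and 8 bytes (double).
single_bytes = struct.pack("<f", value)
double_bytes = struct.pack("<d", value)
as_single = struct.unpack("<f", single_bytes)[0]
as_double = struct.unpack("<d", double_bytes)[0]

print(len(single_bytes), as_single)   # 4 bytes: close to 0.1 but not equal
print(len(double_bytes), as_double)   # 8 bytes: the value survives unchanged
```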

When declaring numeric variables, which type should they be?

When decimal positions, very large numbers, or very small numbers are needed, use floating point variables.
 
Working with integers is faster than working with single or double precision numbers, so if floating point numbers are not needed, using integers will improve speed. Integers also often take up less memory.

 
On some large mainframe systems data is stored using a different coding scheme: EBCDIC (Extended Binary Coded Decimal Interchange Code).
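To illustrate how the two schemes differ, the sketch below encodes the letter a both ways. It uses Python's cp037 codec, which is one of several EBCDIC code pages; picking that particular code page is an assumption made for this example.

```python
ascii_code = "a".encode("ascii")[0]
ebcdic_code = "a".encode("cp037")[0]   # cp037 is one EBCDIC code page

print(ascii_code, format(ascii_code, "08b"))     # 97 01100001
print(ebcdic_code, format(ebcdic_code, "08b"))   # 129 10000001
```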