Overview of This Lesson

In Java (and other programming languages) we need to be specific about the kind of data we're using. Before we get into programming with Java, we need to learn about these different kinds of data and how they're used in your program code.

In Java there are two sets of types: primitive data types and object types (these are sometimes also called reference types). Primitive types are probably the most familiar to you - they mostly include numbers and characters. For example, if you wanted to ask a customer for the number of skeins of yarn they want to purchase, they would enter a number like 1 or 15. Similarly, if you asked that same customer for the shipping address, they would enter various kinds of characters like digits and letters, and maybe even symbols like the # (number sign or hashtag symbol) or the period. These are simple pieces of data that could be stored in one of Java's primitive types.

Object/reference types would only be familiar to you if you've done some object-oriented programming before. Object or reference types are not just simple pieces of data: they are more complex structures. In an object-oriented language, programs are designed around classes and objects. For example, if you wanted to write a program that allowed students to register for courses, the objects might include Student, Course, Registration Form, and Fee Payment. Classes are the templates or definitions of objects in a program. For example, you might code a Student class. From this Student class, your program can create objects for each specific student that wants to register for a course. In this case, the type of data being used is of the Student object type.

In this session, we will focus on the primitive data types. Most of these are numeric, but there are a couple of others, as you will see.

Pre-Requisites

Before doing this tutorial, make sure you've gone through the Writing your First Java Program tutorial.

Review of Exponential and Scientific Notation

To understand how Java stores certain kinds of numbers, it's important to understand exponential and scientific notation. You probably learned this in a secondary school math class, but if you need a review, don't skip this section of the lesson.

Scientific Notation

Let's examine a number such as 123.456. To convert this into scientific notation, you would move the decimal point over enough places to ensure that there is only one digit to the left of it. So 123.456 would become 1.23456. We moved the decimal point 2 places to the left to arrive at the number 1.23456. However, 1.23456 doesn't have the same value as 123.456, so we have to indicate that we've moved the decimal point to arrive at the smaller number. When you move the decimal point to the left, you are really just dividing the number by some power of 10, so to get back to the original value you multiply by that same power of 10. We would therefore represent 123.456 in scientific notation as 1.23456 x 10^2 (said as "1.23456 times 10 to the power of 2").

We chose 10^2 because we moved the decimal point two positions to the left, which is the same as dividing by 100. Therefore, 1.23456 x 10^2 = 1.23456 x 100 = 123.456.

How would you convert the value .0023 to scientific notation? First we would have to move the decimal point 3 places to the right, turning the number into 2.3 (we drop the "leading zeros", which are 0's that appear to the left of the first non-zero digit). By moving the decimal point to the right instead of the left, we are in fact multiplying the number by a power of 10 (in this case, .0023 * 1000 = 2.3). Therefore, when we represent this in scientific notation, we have to use a negative exponent: 10^-3. The value .0023 ends up being represented in scientific notation as 2.3 x 10^-3.

Exercise:

A. For practice, try converting these to scientific notation:

  1. 34,982.58
  2. .004785
  3. .123467

B. Now try converting these values from scientific notation into regular decimal numbers:

  1. 9.7824 x 10^3
  2. 2.222 x 10^7
  3. 1.987 x 10^-2
  4. 5.5 x 10^-4

Answers:

A.

1. 34,982.58 = 3.498258 x 10^4
2. .004785 = 4.785 x 10^-3
3. .123467 = 1.23467 x 10^-1

B.

1. 9.7824 x 10^3 = 9,782.4
2. 2.222 x 10^7 = 22,220,000
3. 1.987 x 10^-2 = 0.01987
4. 5.5 x 10^-4 = 0.00055

Exponential Notation

Exponential notation is worked out the same way as scientific notation - it just has a different format or syntax. To represent a number in exponential notation, you simply replace the "x 10" with the letter "e" or "E" and write the exponent (in regular script instead of superscript). For example, to represent 123.456 or 1.23456 x 10^2 in exponential notation, you would write 1.23456e2 or 1.23456E2. To represent .0023 or 2.3 x 10^-3 you would write 2.3e-3 or 2.3E-3.
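
Java accepts exponential notation directly in source code, so the two examples above can be checked with a short program. This is only a sketch: the class name ExponentialNotationDemo and its constant names are made up for illustration, and it uses the double type, which is introduced later in this lesson.

```java
// Hypothetical demo class; all names here are for illustration only.
class ExponentialNotationDemo {
    // The same two values, written in exponential and in regular decimal notation.
    static final double LARGE_EXPONENTIAL = 1.23456e2;
    static final double LARGE_DECIMAL = 123.456;
    static final double SMALL_EXPONENTIAL = 2.3E-3; // an upper-case E works too
    static final double SMALL_DECIMAL = 0.0023;

    public static void main(String[] args) {
        // Both notations describe exactly the same number, so the comparisons are true.
        System.out.println(LARGE_EXPONENTIAL == LARGE_DECIMAL); // prints true
        System.out.println(SMALL_EXPONENTIAL == SMALL_DECIMAL); // prints true
    }
}
```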

Exercise:

Convert the following into exponential notation:

  1. 34,982.58
  2. .004785
  3. .123467

Answers:

1. 34,982.58 = 3.498258e4
2. .004785 = 4.785e-3
3. .123467 = 1.23467e-1

Now that you understand how a decimal number looks in scientific or exponential notation, we can talk more about Java types and how numbers are represented and stored in Java programs.

Numeric Data Types

There are two categories of numeric types in Java: integers and floating point numbers. Integers are what you might commonly know as "whole numbers". Floating-point numbers are numbers with a decimal point or fractional portion.

Integers

There are actually four types of integers in Java: byte, short, integer, and long integer. Each one is a different size and holds a different range of values:

Java Integer Data Types

| Type Name | Java Keyword | Size | Allowed Values |
| --- | --- | --- | --- |
| Byte | byte | 1 byte | -128 to +127 |
| Short | short | 2 bytes | -32,768 to +32,767 |
| Integer | int | 4 bytes | -2,147,483,648 to +2,147,483,647 (that's about ±2 billion) |
| Long Integer | long | 8 bytes | -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807 (that's about ±9 quintillion) |

Now you might think, "to make it easier we should just always store everything as a Long Integer!" but that's not a good idea: if you're storing a lot of small numbers, that's a huge waste of space. Similarly, you might assume that you should always use Byte or Short for smaller numbers, since they are smaller in size. In Java, this isn't true either! In Java, it's actually the most efficient to store all integers as the Integer (int) type, unless the value is too big to fit (in which case you would use a Long Integer). Why?

In Java, when you type any literal integer value in your code (such as 5 or -99999), the compiler always treats that number as an Integer/int type. Integer (int) is always the default for integer literals. The only exception is if your number is too big for an integer: in this case, you must append an upper- or lower-case L to your number, which makes it a Long Integer (long).

For example, 999999 will be treated as an int and 999999999999999999L (999 quadrillion, and I left out the commas on purpose - you'll see why later) will be treated as a long. You'll see in an upcoming lesson what happens when you use 999999999999999999 (a value too big for an int and without the L at the end).
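
The ranges in the table and the L suffix can be seen in a small program. This is a sketch rather than part of the lesson's own code: the class name IntegerLiteralDemo and the constant names are invented for the example. The MIN_VALUE and MAX_VALUE constants it prints come from Java's standard Byte, Short, Integer, and Long classes.

```java
// Hypothetical demo class; the names SMALL and LARGE are for illustration only.
class IntegerLiteralDemo {
    static final int SMALL = 999999;               // fits in an int, so no suffix is needed
    static final long LARGE = 999999999999999999L; // too big for an int, so it needs the L

    public static void main(String[] args) {
        // Each integer type's smallest and largest allowed values.
        System.out.println("byte:  " + Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);
        System.out.println("short: " + Short.MIN_VALUE + " to " + Short.MAX_VALUE);
        System.out.println("int:   " + Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);
        System.out.println("long:  " + Long.MIN_VALUE + " to " + Long.MAX_VALUE);

        // The long literal is larger than anything an int can hold.
        System.out.println(LARGE > Integer.MAX_VALUE); // prints true
    }
}
```

An upper-case L is usually the safer choice, since a lower-case l is easy to mistake for the digit 1.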

Floating Point Numbers

Floating point numbers may have a fractional or decimal portion, so sometimes they are referred to as "decimal numbers". Note however, they may have a zero value for the decimal/fractional portion. For example, 5.0 is considered a floating point number, because it includes the decimal point. In programming, some new students find the names "floating-point" and "decimal number" confusing. Usually when referring to numbers in programming, the term "decimal" refers to the number system: the number system we use in every-day life is the decimal number system (or "base 10" number system). In technology, you might also use the binary number system, hexadecimal number system, or octal number system. Therefore, when referring to data that can (or could) contain decimal values, we prefer to call it a floating-point number, and not a decimal number. We use the term "floating-point" because, if you recall scientific/exponential notation, the decimal point moves or floats as you convert from decimal notation to scientific or exponential notation and vice versa.

In Java, there are two types of floating-point numbers: single-precision floating point numbers (float) and double-precision floating point numbers (double).

Java Floating-Point Types

| Type Name | Java Keyword | Size | Negative Range | Positive Range | Significant Digits |
| --- | --- | --- | --- | --- | --- |
| Single Precision Floating-Point | float | 4 bytes | -3.4028235e38 to -1.4e-45 | 1.4e-45 to 3.4028235e38 | 7 |
| Double Precision Floating-Point | double | 8 bytes | -1.7976931348623157e308 to -4.9e-324 | 4.9e-324 to 1.7976931348623157e308 | 15 |

You'll notice that the ranges of values for both floating-point types are expressed in exponential notation. That's because these values are very small or very large.

You'll also notice that each one contains two sets of values: a negative range and a positive range. This is because it's not just about the highest or lowest number you can store: it's more about the amount of precision you can store. That's why both ranges include a power of 10 that is negative (very, very small values) and one that is positive (very, very large values).

Lastly, you'll notice that each of the floating point types can only store a certain number of significant digits. This is because of how floating point numbers are stored.
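
You can ask Java for these limits yourself. The sketch below uses the MIN_VALUE and MAX_VALUE constants from the standard Float and Double classes; the class name FloatingPointRangeDemo and its two helper methods are hypothetical names chosen for this example. Note that for floating-point types, MIN_VALUE is the smallest positive value (the one closest to zero), not the most negative one.

```java
// Hypothetical demo class; floatRange() and doubleRange() are illustrative helpers.
class FloatingPointRangeDemo {
    // Builds the positive range of the float type as text.
    static String floatRange() {
        return Float.MIN_VALUE + " to " + Float.MAX_VALUE;
    }

    // Builds the positive range of the double type as text.
    static String doubleRange() {
        return Double.MIN_VALUE + " to " + Double.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(floatRange());  // prints 1.4E-45 to 3.4028235E38
        System.out.println(doubleRange()); // prints 4.9E-324 to 1.7976931348623157E308
    }
}
```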

When the computer stores a floating-point number, it first converts it into a binary form of scientific notation. It uses some of the storage space for the sign (+ or -), some for the exponent, and the rest for the coefficient. The coefficient is the portion of the number that appears before the "e" or "E" in exponential notation, or before the x 10 in scientific notation. For example, in the numbers 2.3 x 10^-3 and 2.3e-3, 2.3 is the coefficient and -3 is the exponent. The computer has to convert a regular decimal number like 123.456 into this special notation before storing it, and since the decimal point is being moved, or "floating", we call it a floating-point number.

| Data Type | Description | # bits for Coefficient | # bits for Exponent | # bits for Sign |
| --- | --- | --- | --- | --- |
| float | Single-precision floating-point | 23 | 8 | 1 |
| double | Double-precision floating-point | 52 | 11 | 1 |

The number of bytes a floating-point number requires is an indication of its precision. A single-precision floating-point number in Java is 4 bytes (32 bits): 1 bit is used to store the sign of the number, 8 bits are used to store the exponent, and the remaining 23 bits are used to store the coefficient. A double-precision floating-point number takes up 8 bytes (64 bits) of space: 1 bit is for the sign of the number, 11 bits are for the exponent, and the remaining 52 bits are for the coefficient. Since the amount of space allowed for the coefficient is greater in a double-precision number, you can store a number with greater precision in a double type than you can in a float type. If you were going to store the value for PI (3.14159265....) in Java, you would get more significant digits if you stored it as a double than if you stored it as a float.
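
If you are curious, you can look at the three parts of a stored float. The sketch below relies on the standard Float.floatToIntBits() method, which returns the 32 stored bits as an int; everything else (the class name FloatBitsDemo, the helper method names, and the sample value) is an assumption made for this illustration. It also uses bit operators (>>> and &) that this lesson does not cover, so treat it as a peek under the hood rather than something you need to write yourself.

```java
// Hypothetical demo class; the helper names are for illustration only.
class FloatBitsDemo {
    // The sign is the single highest bit: 0 for positive, 1 for negative.
    static int signBit(float value) {
        return Float.floatToIntBits(value) >>> 31;
    }

    // The next 8 bits hold the exponent (stored with 127 added to it).
    static int exponentBits(float value) {
        return (Float.floatToIntBits(value) >>> 23) & 0xFF;
    }

    // The lowest 23 bits hold the coefficient's fractional part.
    static int coefficientBits(float value) {
        return Float.floatToIntBits(value) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        // -0.15625 is -1.25 x 2^-3 in binary scientific notation.
        float value = -0.15625f;
        System.out.println(signBit(value));         // prints 1 (negative)
        System.out.println(exponentBits(value));    // prints 124 (that is -3 + 127)
        System.out.println(coefficientBits(value)); // prints 2097152 (the .25 of 1.25, scaled to 23 bits)
    }
}
```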

This means that for numbers that require more precision, you should use double, and for numbers that require less precision, you should use float. Furthermore, like the int type for integers, double is the default type for floating point literals, so it's often more efficient to simply use double.
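
The difference in precision is easy to see with PI. In this sketch the class name PiPrecisionDemo and the constant names are made up for illustration. It also uses something not yet covered in this lesson: appending f to a floating-point literal makes it a float instead of the default double.

```java
// Hypothetical demo class; PI_FLOAT and PI_DOUBLE are illustrative names.
class PiPrecisionDemo {
    // The same digits stored in each floating-point type.
    static final float PI_FLOAT = 3.14159265358979f; // the f suffix makes this literal a float
    static final double PI_DOUBLE = 3.14159265358979; // no suffix: double is the default

    public static void main(String[] args) {
        // The float version keeps only about 7 significant digits.
        System.out.println(PI_FLOAT);
        System.out.println(PI_DOUBLE);

        // The float lost some digits, so the two stored values differ.
        System.out.println(PI_FLOAT == PI_DOUBLE); // prints false
    }
}
```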

Null Values for Numeric Types

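One last thing to cover about numeric data in Java is null (or "empty") values. For numeric types, this is pretty straightforward: the null value is 0. For example, a customer might have an outstanding balance of 0, or you might have 0 students in a class. Note that this is not the same thing as Java's null keyword, which applies only to object/reference types; a primitive numeric variable always holds a number.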

Note that some programmers tend to use the value 0.0 for floating-point null value literals. This has to do with how Java treats literal numeric values and performs conversions; this will be covered in an upcoming lesson.

Exercise

Which data type would you use for each scenario: integer or floating point?

  1. A number for the month (e.g. 1 for January, 2 for February, etc).
  2. The number of students in a class.
  3. The length of a room.
  4. A value for the cost of show tickets, which is $10.
  5. A number for the amount of money donated to charity.

Answers:

  1. Month number: integer, because months have no fractions: there's no month 3.5, for example.
  2. Number of students: integer, you can't have a portion of a student!
  3. Length of a room: floating point. Measurements are almost always floating point numbers because there is likely to be a fractional/decimal portion.
  4. Cost of tickets: floating point. Currency values are almost always floating point numbers because they have dollars and cents (the cents are the decimal portions). In the example, it does say the price is $10 (which is an integer) but what if the price goes up 50 cents? You would have to change the data type if you wanted to store a new ticket price of 10.50 and that's not considered good programming.
  5. Amount of money: floating point. Again, currency is almost always a floating point number.

The Boolean Type

The Boolean type in Java (the keyword used is boolean) is a type that holds exactly 1 bit of information (although due to how memory works in Java, it actually consumes a whole byte of space, and sometimes more). Boolean values (in any programming language) consist only of two values: true and false. In Java, the values true and false are keywords of the language.

Boolean values are often used in relational expressions that are used in conditional statements, if statements, and loops. These are covered in upcoming lessons.
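
As a small preview, here is a sketch of boolean values in code. The class name BooleanDemo, the variable name, and the hasOutstandingBalance() helper are all hypothetical, and the > comparison is one of the relational expressions covered in an upcoming lesson.

```java
// Hypothetical demo class; the names are for illustration only.
class BooleanDemo {
    // A relational expression such as "balance > 0" produces a boolean value.
    static boolean hasOutstandingBalance(double balance) {
        return balance > 0;
    }

    public static void main(String[] args) {
        boolean registered = true; // true and false are written without quotes
        System.out.println(registered);                  // prints true
        System.out.println(hasOutstandingBalance(0.0));  // prints false
        System.out.println(hasOutstandingBalance(25.5)); // prints true
    }
}
```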

The Character Type

There is an additional primitive type in Java called the Character or char type. The char type is used for storing and representing single characters, as opposed to a set of characters (a set of characters is a String!). When we write out a char value, we surround it in single-quotes (instead of double-quotes like we do with String values). This helps us to visually see that a value is a char and not a string, but it's also required Java syntax, just as double-quotes are the proper syntax for string values.

One interesting thing to note about the char type is that it is considered one of the integer types. In a later session when we talk about casting and converting data from one type to another, this will be apparent. For now, just keep in mind that the char type is in the category of integer primitives.

The char type in Java is 2 bytes, so it is capable of representing up to 65,536 characters using the 16-bit Unicode standard (although Java does support the Unicode Supplementary Characters, it's beyond the scope of this course). Special Unicode characters won't usually display on the console but you can display them in dialogs and GUI components.

To display a specific Unicode character, use the \u escape sequence. For example, \u00f7 will display the division sign:

System.out.println('\u00f7');

We'll learn more about Characters and the char type in upcoming lessons.

Examples of char values:

Note that a value such as '123.456' is invalid because a char type can consist of only one character or escape sequence.
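
Here is a short sketch with a few valid char values. The specific characters, the class name CharExamplesDemo, and the codeOf() helper are illustrative choices made for this example. The helper also gives an early look at why char counts as an integer type: every character is stored as a numeric code.

```java
// Hypothetical demo class; the constants and codeOf() are for illustration only.
class CharExamplesDemo {
    static final char LETTER = 'x';
    static final char DIGIT = '7';         // a character, not the number 7
    static final char SYMBOL = '#';
    static final char TAB = '\t';          // one escape sequence counts as one character
    static final char DIVISION = '\u00f7'; // the division sign, written as a Unicode escape

    // Returns the numeric code that Java stores for a character.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(LETTER);        // prints x
        System.out.println(SYMBOL);        // prints #
        System.out.println(codeOf('A'));   // prints 65
        System.out.println(codeOf(DIGIT)); // prints 55, not 7
    }
}
```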

Exercise

What do you think is the output of the following programs? Try and figure it out before you try the code yourself.

// one
public class TryChars1 {
    public static void main(String[] args) {
        System.out.println('x');
        System.out.println('y');
        System.out.println('z');
    }
}

// two
public class TryChars2 {
    public static void main(String[] args) {
        System.out.println('x');
        System.out.println('\n');
        System.out.println('z');
    }
}

What is the problem with this code segment?

// one
public class TryCharsError {
    public static void main(String[] args) {
        System.out.println('xyz');
    }
}

What about Strings?

In Java, any series of characters that is to be treated as "text" is considered a String. Strings are not actually primitive types in Java: they are object/reference types. We'll explain what this means more in an upcoming lesson. For now we'll just learn how Strings work as pieces of data.

What is a String?

A string is any set of letters, numbers, or symbols that are considered "text" and are not for calculation or numeric comparison. For example, a customer's first name would be stored as a string. A customer's outstanding balance would not be stored as a string because at some point we'd probably calculate with that value, so we'd store it as a number (likely a floating-point number). Additionally, we might want to compare a customer balance to see if it were greater than 0 - 0 is a number so balance would have to be a number, too (you can't compare numbers with strings in Java).

A customer's phone number is usually stored as a string; even though a phone number can be stored as just a set of digits (your program code could format a phone number with brackets and dashes, so you don't need to store those), you don't calculate with phone numbers (e.g. you don't add two phone numbers together or multiply a phone number by some other value). The same is sometimes true of ID numbers such as a product ID or student ID. Sometimes if a programmer is concerned with saving storage space, s/he might store a phone number or ID value as an integer instead of a string. There really isn't a right or wrong when it comes to storing those kinds of values - it really all depends on what the designers of the system have discussed and decided.

The String Type

In Java, strings are not actually considered a primitive data type; strings are an object (or reference) type. We will learn more about objects and classes later in the course, but it is important to know that a string is technically an object, not a primitive data type. You can often treat a string like a primitive data type, which can make things a bit confusing! For now, we will keep in mind that a string is an object, although we will cover some aspects of strings as we cover the rest of the primitive data types.

String Values

When we refer to a string literal, we are referring to a string value that you write explicitly in code or on the computer screen. String literals are always surrounded in double-quotes like "this". When you start writing Java code, this becomes extremely important! If you don't put double-quotes around your string values in your code, Java won't understand what you're typing! Note that this convention might vary from language to language: some languages also allow strings in single-quotes, and some languages, such as JavaScript, also have special Template Strings that are enclosed within backticks (`). In Java, "double-quotes" are for Strings, and 'single-quotes' are for single char values.

Examples of Strings:

Note that it doesn't matter if the character or string is made up of digits! Any digits enclosed in quotes are either a character or a string, not a number. For example, 123.456 is a number, but "123.456" is a string.
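
The sketch below shows a few String values and what the quotes change. The values, the class name StringExamplesDemo, and the constant names are illustrative choices made for this example. It also uses the + operator, which has not been covered yet: with two numbers it adds them, but when one side is a String it joins the pieces together as text.

```java
// Hypothetical demo class; all names and values are for illustration only.
class StringExamplesDemo {
    static final String COURSE = "PROG10082";
    static final String PRICE = "$5.95";
    static final String DIGITS = "123"; // digits in double-quotes are text, not a number

    static final int SUM = 123 + 1;       // two numbers: arithmetic
    static final String JOINED = "123" + 1; // a String and a number: joined as text

    public static void main(String[] args) {
        System.out.println(COURSE); // prints PROG10082
        System.out.println(PRICE);  // prints $5.95
        System.out.println(SUM);    // prints 124
        System.out.println(JOINED); // prints 1231
    }
}
```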

Exercise

What do you think is the output of the following programs? Try and figure it out before you try the code yourself.

// one
public class TryStrings1 {
    public static void main(String[] args) {
        System.out.println("x");
        System.out.println("yz");
    }
}

// two
public class TryStrings2 {
    public static void main(String[] args) {
        System.out.println("Hello");
        System.out.println("\n");
        System.out.println("World");
    }
}

// three
public class TryStrings3 {
    public static void main(String[] args) {
        System.out.println("Hello\n\nWorld");
    }
}

From examples two and three, which one do you think is most efficient, and why?

Example three is more efficient: every time a program executes a method (println() is a method) it takes some extra processing, so it's more efficient to invoke the method only once. Imagine you had to carry three things from one room to another: It's faster to just pick up all three things at once and carry them in one trip. You would be wasting time if you picked up one thing, took it to the next room, then went back for the second thing, took the second thing to the next room, then went back for the third thing.

Why does example three have two newlines (\n) and example two has only one?

The println() method prints the value you give it, and it always automatically adds a newline to the end of that value. So in example two, when you print "Hello", the cursor sits on the next line, below the H, waiting for the next output. Then the next println() prints \n (the newline character), which moves the cursor down one more line. Of course, println() then adds its own newline to the end, so the cursor moves down twice, leaving two blank lines after "Hello". Finally, println() prints "World" on the fourth line of the output.

In the third example, we only use one println(), so nothing adds a newline after "Hello" for us. The first \n ends the "Hello" line and the second \n creates a blank line, which gives one blank line between "Hello" and "World". To replicate the output of example two exactly, you would need three newlines ("Hello\n\n\nWorld").
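
To make the counting concrete, the sketch below writes out the full text each example sends to the console and counts the newlines in it. It rests on one assumption: the line break that println() adds is written here as \n, although the real one depends on your operating system. The class name PrintlnNewlineDemo, the constant names, and the countNewlines() helper are hypothetical, and the helper uses a loop, which is covered in a later lesson.

```java
// Hypothetical demo class; all names are for illustration only.
class PrintlnNewlineDemo {
    // Assumption: each println() call is shown as its text followed by one \n.
    static final String EXAMPLE_TWO = "Hello\n" + "\n\n" + "World\n"; // three println() calls
    static final String EXAMPLE_THREE = "Hello\n\nWorld\n";           // one println() call

    // Counts how many newline characters a piece of text contains.
    static int countNewlines(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // print() does not add a newline of its own, so the text appears exactly as written.
        System.out.print(EXAMPLE_TWO);
        System.out.println(countNewlines(EXAMPLE_TWO));   // prints 4
        System.out.print(EXAMPLE_THREE);
        System.out.println(countNewlines(EXAMPLE_THREE)); // prints 3
    }
}
```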

Exercises

For each of the following literal values, identify whether the value is a string, character, integer number, floating point number, or invalid.

  1. "Programming is fun."
  2. "PROG10082"
  3. 10082
  4. "10082"
  5. 10,082
  6. "$5.95"
  7. '2'
  8. 2
  9. 2.99991
  10. '\t\n'
  11. "\t\n"

Answers:

  1. "Programming is fun." is a String
  2. "PROG10082" is a String
  3. 10082 is an integer (int) number
  4. "10082" is a String
  5. 10,082 is invalid: a numeric literal can't have commas. If this is a string, it should be enclosed in double-quotes, e.g. "10,082".
  6. "$5.95" is a String
  7. '2' is a char or Character
  8. 2 is an integer (int) number
  9. 2.99991 is a floating point number
  10. '\t\n' is invalid: A char or Character value can only consist of one character, or one escape sequence. This has two escape sequences.
  11. "\t\n" is a String