================================= Simple variables and data types ================================= Fundamental concepts ==================== As explained in the first introductory lecture, there are two motivations for writing programs: software engineering for automation, formalization to help people think. Most languages designed for software engineering, including Java, are called “imperative” languages: a program tells the computer what to do, step by step, like a recipe tells the cook what to do. All imperative languages are based on two fundamental concepts: 1) the computer proceeds through the program step-wise; so there is a conceptual “cursor” that designates “the current point of execution” over time; and 2) the computer is equipped with *memory* where values can be stored and retrieved. So when you use an imperative language, you have two mechanisms at your disposal to tell the computer what to do: statements that change the *flow of control*, like “``if``” seen in the last lecture, and statements that *modify the state of the memory*, like the assignment. In hardware, the memory is a circuit that can only store unstructured binary values: very long strings of digits 0 and 1. To ease the task of the programmer, language designers have created the concept of *variable* to add logical structure to the memory circuits. A variable is a combination of a name for an area of memory; and a type which indicates the computer how to interpret the raw binary digits stored in that area. Practical introduction ====================== Variables in a program define space in a computer's memory where values can be stored and later retrieved. In languages like Java, and contrary to other languages like Python, variables must be *declared* before they can be used: the program must “announce” that a variable exists and give it a name before subsequent steps can store values in it or read from it. Moreover, *each variable can only store a limited set of different values.* For example, one variable could only hold integral numbers between -2147483648 and 2147483647, and another only values between -128 and 127. The programmer selects the set of values that can be potentially held in a variable by selecting a *type* for the variable when it is declared. The language will then automatically determine an area in memory of sufficient size to hold the variable. Declarations ============ The general form for a variable declaration is defined thus: *Declaration*: Syntax: ``;`` Semantics: the construct declares a variable named by the identifier on the right, that can potentially store values determined by the type on the left. Each language has different types for variables already *built-in*, ie provided “out of the box”, always available to programs. For example, in Java, we often use the built-in types “``int``” for integral values (like 123) and “``double``” for approximate real values (“numbers with a comma”, like 3.5). So we can write: .. code:: java int i; double x; int myName; to declare 3 variables, named “``i``”, “``x``” and “``myName``”, and the respective types ``int``, ``double`` and ``int``. Built-in types in Java ====================== Java provides us with the following built-in types (non-exhaustive): ============== ============================== Type name Set of allowed values ============== ============================== ``int`` any integral values between -2147483648 (:math:`-2^{31}`) and 2147483647 (:math:`2^{31}-1`) ``double`` a finite number of approximations of real values between :math:`-1.8\times 10^{308}` and :math:`+1.8\times 10^{308}`, and representations for positive and negative infinities ``boolean`` either ``true`` or ``false`` ``char`` any valid Unicode__ character (65536 possible values) ``byte`` any integral value between -128 and 127 ``long`` any integral value between :math:`-2^{63}` and :math:`2^{63}-1` ============== ============================== .. __: https://en.wikipedia.org/wiki/Unicode Values for variables, literal forms =================================== Values can be stored in variables using either computations from other values and variables (eg. ``a + b``), or using a *literal form*: a value encoded directly in the program. Like for other language constructs, literal forms have a syntax and a semantics. Some examples: *Literal integers*: Syntax: :sub:`[` ``-`` :sub:`]?` :sub:`[` :sub:`]*` (An optional minus sign, followed by one or more digits) Semantics: the conventional arithmetic interpretation of the literal digits in base 10. Examples: ``-123``, ``0``, ``456``. *Literal Booleans*: Syntax: ``true`` Semantics: the Boolean “truth” value. Syntax: ``false`` Semantics: the Boolean “falsehood” value. *Literal doubles*: Syntax: :sub:`[` :sub:`]?` :sub:`[` :sub:`]?` :sub:`[` ``.`` :sub:`]?` :sub:`[` ``e`` :sub:`[` :sub:`]?` :sub:`]?` :sub:`[` ``d`` :sub:`]?` (An optional sign, followed by an optional integer, followed optionally by a dot and an integer, followed optionally by “``e``”, an optional sign and an integer, the entire form optionally terminated by the letter “``d``”; the ``d`` is only optional if there is a dot “``.``” already in the form) Semantics: the conventional scientific interpretation of the number in base 10; with the number after ``e`` used as a power multiplier. Examples: ``0d``, ``-1d``, ``.5d``, ``0.123``, ``3.14e20`` (means :math:`3.14\times10^{20}`). Assignments and type compatibility, casts ========================================= What happens if the value on the right side of an assignment is not part of the set of possible values for the variable on the left? For example: .. code:: java int x; x = 3.1415; (the ``double`` approximation of 3.1415 is not an integral number, so it cannot directly fit in ``x``) In general, if the value on the right is of a different type than the value on the left, the following rules apply: - if the set defined by the type on the right is a subset of the set defined by the type on the left, we say that “the type on the right is *compatible* with the assignment” and the assignment works as expected. For example: .. code:: java double y; y = 123; here the value of “``123``” has type ``int`` which is different from ``double`` on the left, but since the set of values for ``int`` is contained in the set of values for ``double`` the assignment can proceed silently. - if the set on the right is not included in the set on the left, the language processor *rejects* the assignment with a conversion error. This is what happens with “``int x = 3.1415;``” above. Sometimes, the need arises to *force an assignment*: when the programmer knows better than the language that a specific expression on the right will always evaluate to a value that happens to fit on the left, or when the programmer wants to force an approximation of a value. The main Java construct to force an assignment is inherited from C and C++, and defined as follows: *Cast expression*: Syntax: ``(`` ``)`` (An opening parenthesis, followed by a type, followed by a closing parenthesis, followed by an expression) Semantics: the expression on the right is evaluated. Then its value is forcefully transformed into the type specified on the left. The resulting value is compatible with the new type. The conversion can incur an approximation, for example converting from ``double`` to ``int`` will round the number by removing the digits after the comma. Example: .. code:: java int x; x = (int)3.1415; The value of “3.1415” is first forced into type ``int`` by rounding down to 3, so it becomes compatible with the assignment which can then proceed silently. Assignments and declarations ============================ The simple assignment statement in Java is defined with a semicolon: *Assignment statement:* Syntax: ``=`` ``;`` (An identifier, followed by one “``=``” sign, followed by an expression, followed by a semicolon) Semantics: evaluate the expression on the right, then assign the result to the variable designated by the name on the left. Using the constructs seen so far it becomes easy to read the following examples: .. code:: java int i; i = 123; double x; x = 3.1415; This combination of a declaration followed by an assignment is so common that Java also provides a *combined form*. This is defined as follows: *Combined declaration and assignment*: Syntax: ``=`` ``;`` Semantics: this form is equivalent to: ``;`` ``=`` :sub:`[` ``=`` :sub:`]?` :sub:`[` ``,`` :sub:`[` ``=`` :sub:`]?` :sub:`]*` ``;`` (A type, followed by an identifier, followed optionally by ``=`` and an expression, followed by a repetition of zero or more occurrences of a comma and an identifier followed optionally by ``=`` and an expression, the whole eventually terminated by a semicolon) Semantics: equivalent to :sub:`[` ``=`` :sub:`]?` ``;`` repeated once for every identifier-expression pair in the comma-separated list. Example: ``int i = 1, j, k = 2;`` is equivalent to the 3 declarations above. Important concepts ================== - *variables* and *types* - *set of allowed values for a type* - *built-in* types - ``int``, ``double`` - type *compatibility* in assignments (general concept) - *cast expression* - *combined form* (general concept) - *combined declaration and assignment* - *combined declaration of multiple variables* Further reading =============== - Absolute Java, section 1.2 (pp 16-23) and sections 3.1-3.2 (pp. 25-27) - Introduction to Programming, sections 2.2.2 and 2.2.3 (pp. 25-27) ---- Copyright and licensing ======================= Copyright © 2014, Raphael ‘kena’ Poss. Permission is granted to distribute, reuse and modify this document according to the terms of the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit `http://creativecommons.org/licenses/by-sa/4.0/ `_.