Components of Runtime Environment (RTE)

Main components of the runtime environment are:
1.Static area: allocated at load/startup time. 
Examples: global/static variables and load-time constants.

2.Stack area: used for allocating local variables and to store
procedure return addresses. More generally, any data that follows a 
LIFO rule can  be stored on the stack.

3.Heap: area that is allocated dynamically for data that does not follow
the LIFO lifetime rule. Examples: all objects in Java, lists in
Scheme. 

Some languages use the heap for everything, while C++ and Java use a
combination of stack and heap.
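
As a rough illustration of the three areas, the C++ sketch below places
one object in each. The names call_count, sum_to and make_boxed are made
up for this example, and where a compiler actually keeps each object can
vary (for instance, a local may live only in a register).

```cpp
#include <cassert>
#include <memory>

// Static area: storage for this global is reserved when the program is loaded.
int call_count = 0;

// Stack area: `total` and `i` belong to one call of sum_to and disappear
// when that call returns (LIFO lifetime).
int sum_to(int n) {
    assert(n >= 0);
    ++call_count;
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// Heap: the int must outlive the call that creates it, so it cannot be a
// stack local. It is freed when the returned unique_ptr is destroyed.
std::unique_ptr<int> make_boxed(int value) {
    return std::make_unique<int>(value);
}
```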


Layout for different components of the runtime environment:

                    -------------------     
        High       |      stack        |    
       Address     -------------------     
                   |                   |
                   |    free space     |
                   |                   |
                    -------------------     
                   |      heap         |    
                    -------------------     
                   |    static area    |   
                    ------------------- 
         Low       |     code area     |
        Address     -------------------

Note that the stack "grows down": at the beginning of program
execution the stack is empty, and the top of the stack is positioned at
the highest address available to the program. As procedure calls
are made and variables allocated on the stack, additional space is created
on the stack by _decrementing_ the stack top. The size of the stack at
any time is given by highest_address - stack_top.
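
The bookkeeping described above can be sketched with a toy model in which
addresses are plain integers. ToyStack and kHighestAddress are names
invented for this sketch, and the value 0x1000 is arbitrary.

```cpp
#include <cassert>
#include <cstddef>

// Highest address available to the program in this toy model (arbitrary value).
constexpr std::size_t kHighestAddress = 0x1000;

struct ToyStack {
    // Empty stack: the top sits at the highest address.
    std::size_t stack_top = kHighestAddress;

    // Allocate `bytes` by decrementing the stack top. Returns the new top,
    // which is the lowest address of the block just allocated.
    std::size_t allocate(std::size_t bytes) {
        assert(bytes <= stack_top);  // do not run below address 0
        stack_top -= bytes;
        return stack_top;
    }

    // Free the most recently allocated `bytes` by incrementing the stack top.
    void release(std::size_t bytes) {
        assert(bytes <= size());
        stack_top += bytes;
    }

    // Size of the stack at any time: highest_address - stack_top.
    std::size_t size() const { return kHighestAddress - stack_top; }
};
```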

Also keep in mind that with today's OSes, memory addresses refer to
virtual memory. We never talk about physical memory addresses, which
are managed by the OS. 

Procedures and the environment
----------------------------------

An Activation Record (AR) is created for each invocation of a procedure.
Note that if a procedure is called multiple times, each call results
in an independent AR. 
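
One way to observe that each call has its own AR is to record the address
of a local variable in every active call of a recursive function. The
sketch below does this; descend and all_distinct are hypothetical helper
names. It only checks that the addresses differ, since the direction and
spacing of the addresses depend on the compiler and platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Records the address of the local `marker` in every active call.
// All calls are still active when the deepest one records its address,
// so every recorded address belongs to a different, live AR.
void descend(int depth, std::vector<std::uintptr_t>& addresses) {
    assert(depth >= 0);
    int marker = depth;  // lives in this call's AR
    addresses.push_back(reinterpret_cast<std::uintptr_t>(&marker));
    if (depth > 0) {
        descend(depth - 1, addresses);
    }
}

// Returns true if no two recorded addresses are equal.
bool all_distinct(const std::vector<std::uintptr_t>& addresses) {
    for (std::size_t i = 0; i < addresses.size(); ++i) {
        for (std::size_t j = i + 1; j < addresses.size(); ++j) {
            if (addresses[i] == addresses[j]) return false;
        }
    }
    return true;
}
```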

Below is the structure of AR:
                    -------------------     |
                   |  return value     |    |
                    -------------------     |
                   | actual parameter  |    | 
Frame Pointer ----> -------------------     |  Direction of stack growth
                   |  local variables  |    |
                    -------------------     |
                   |temporary variables|    V
                    -------------------    

Frame pointer is also called Base Pointer (base of AR) or Environment
pointer. It may be abbreviated as FP, BP or EP.

Languages differ in terms of where ARs are allocated.
In Fortran 77 ARs are allocated statically, which means that the data 
on the AR for one call of a procedure will be overwritten by a subsequent
call. This will be fine if the previous call is no longer active, but
obviously won't work right if the previous call has not returned. As
a result, this approach cannot support recursive functions.
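
The problem can be imitated in C++ by giving a recursive function static
locals, so that all calls share one set of storage. This is only a sketch
of the effect, not of how a Fortran 77 compiler works; fact_stack and
fact_static are names chosen for the example.

```cpp
#include <cassert>

// Stack-allocated AR: every call has its own `n` and `rest`.
int fact_stack(int n) {
    assert(n >= 0);
    if (n <= 1) return 1;
    int rest = fact_stack(n - 1);
    return n * rest;
}

// Statically allocated "AR": a single shared slot for `n` and `rest`.
int fact_static(int n_arg) {
    assert(n_arg >= 0);
    static int n;
    static int rest;
    n = n_arg;
    if (n <= 1) return 1;
    rest = fact_static(n - 1);  // the inner call overwrites n
    return n * rest;            // n no longer holds this call's argument
}
```

With these definitions fact_stack(4) is 24, while fact_static(4) is 1,
because by the time each outer call multiplies, the shared n has been
overwritten with 1 by the innermost call.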

Functional languages (Scheme, ML) and some OO languages (Smalltalk) are
heap-oriented: most data, including AR, may be allocated dynamically.

Most mainstream languages (Java, C, C++) allocate ARs on the stack.

Simple stack-based allocation
------------------------------

Local variables are allocated at a fixed offset on the stack. They are
accessed using this constant offset from BP. 

Example: to load a local variable stored 12 bytes below BP into the 
EBX register on x86 architecture, one would use an instruction such as
    mov -0xc(%ebp),%ebx 

Example: { int x; int y; { int z; } { int w; } }
 
When you enter the block containing "z":    
Base of AR (BP)-------->  --------------- 
                          |      x        |    
                           ---------------     
                          |      y        |    
                           ---------------     
                          |      z        |   
                           --------------- 
                      
When you exit the block containing "z":
Base of AR (BP)-------->  --------------- 
                          |      x        |    
                           ---------------     
                          |      y        |    
                           ---------------     
                          
When you enter the block containing "w":
Base of AR (BP)-------->  --------------- 
                          |      x        |    
                           ---------------     
                          |      y        |    
                           ---------------     
                          |      w        |   
                           --------------- 
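
The way a compiler could hand out these slots can be sketched as follows.
SlotAllocator is a hypothetical helper, and offsets are counted in units
of one variable rather than in bytes.

```cpp
#include <cassert>
#include <vector>

// Assigns slot numbers (offsets from BP, one slot per variable) to the
// locals of a single procedure.
class SlotAllocator {
public:
    // Entering a block: remember where its variables will start.
    void enter_block() { saved_.push_back(next_); }

    // Declaring a variable: give it the next free slot.
    int declare() { return next_++; }

    // Leaving a block: its slots become free for a later sibling block.
    void exit_block() {
        assert(!saved_.empty());
        next_ = saved_.back();
        saved_.pop_back();
    }

private:
    int next_ = 0;
    std::vector<int> saved_;
};
```

Applied to the example above, x and y get slots 0 and 1, and z and w both
get slot 2, since the block containing z has been exited before the block
containing w is entered.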


Steps involved in a procedure call
----------------------------------
Caller:
1.Save registers.
2.Evaluate actual parameters, push on the stack. 
3.Push l-values for call-by-reference (CBR), r-values for call-by-value (CBV).
4.Allocate space for return value on stack.
5.Call: Save return address, jump to the beginning of called function.

Callee
1.Save BP (control link field in AR).
2.Move SP to BP.
3.Allocate storage for locals and temporaries (Decrement SP).
4.Local variables accessed as [FP-k], parameters using [FP+l].


Steps in return
---------------
Callee
1.Copy return value into its location on AR.
2.Increment SP to deallocate locals/temporaries.
3.Restore BP from Control link.
4.Jump to return address on stack.

Caller
1.Copy return values and parameters.
2.Pop parameters from stack.
3.Restore saved registers.
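
The call and return sequences can be traced on a toy machine whose memory
is an array of words and whose stack grows toward lower indices. This is a
sketch under assumptions: Machine, callee_add and call_add are invented
names, register saving is omitted, the return address is a dummy number,
and the frame layout simply follows the order of the steps listed above
(parameters, return-value slot, return address, saved BP, locals).

```cpp
#include <cassert>
#include <vector>

struct Machine {
    static constexpr int kWords = 64;
    std::vector<int> mem = std::vector<int>(kWords, 0);
    int sp = kWords;  // index of the word on top of the stack
    int bp = kWords;  // base pointer of the current AR

    void push(int word) { assert(sp > 0); mem[--sp] = word; }
    int pop() { assert(sp < kWords); return mem[sp++]; }
};

// Callee: computes a + b using one local variable. On entry the stack
// holds, from high to low: a, b, return-value slot, return address.
void callee_add(Machine& m) {
    m.push(m.bp);   // 1. save caller's BP (control link)
    m.bp = m.sp;    // 2. move SP to BP
    m.sp -= 1;      // 3. allocate one local (decrement SP)
    // 4. local at [BP-1]; parameters at [BP+4] (a) and [BP+3] (b)
    m.mem[m.bp - 1] = m.mem[m.bp + 4] + m.mem[m.bp + 3];

    m.mem[m.bp + 2] = m.mem[m.bp - 1];  // copy return value into its slot
    m.sp = m.bp;                        // increment SP to deallocate locals
    m.bp = m.pop();                     // restore BP from the control link
    m.pop();                            // pop return address (no real jump here)
}

// Caller: performs the call add(a, b) with call-by-value parameters.
int call_add(Machine& m, int a, int b) {
    m.push(a);      // evaluate actual parameters and push them
    m.push(b);
    m.push(0);      // allocate space for the return value
    m.push(1234);   // save return address (dummy value in this model)
    callee_add(m);
    int result = m.pop();  // copy the return value
    m.pop();               // pop parameters
    m.pop();
    return result;
}
```

After call_add returns, sp and bp are back at their values from before the
call, which is the property the matching push and pop steps are meant to
guarantee.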

Example:
int x;
void p(int y)
{ int i = x;
  char c; ...
}
void q(int a)
{ int x;
  p(1);
}
main()
{ q(2);
  return 0;
}

Diagram of run-time stack:

    main's BP-->|------------------|<----------+
    main's AR   |      temp        |           |
                |------------------|           |
                |       2          |<----------|---parameter to q
                |------------------|           | 
                |   Ret Addr       |           |Control link
      q's BP--->|------------------|           |
      q's AR    |        *---------+-----------+
                |------------------|<----+     
                |        x         |     |
                |------------------|     |
                |        1         |<----|-------parameter to p
                |------------------|     |
                |   Ret Addr       |     |Control link
      p's BP--->|------------------|     |
      p's AR    |        *---------+-----+ 
                |------------------|
                |       i          |
                |------------------|
                |       c          |
                |------------------|

When p returns, the reverse process takes place in the above diagram:
p's local variables are deallocated and control returns to the point of
the call in q.
              
Note: 
** The same concept is known by three different names:
   Frame pointer (FP) == Base pointer (BP) == Environment pointer (EP)
   FP is the term commonly used when discussing stack frames.
   BP is the term used in Intel architecture specifications.
   EP is the term used in our course textbook.
** The stack grows downward. The heap grows upward.
** Global variables are in the static area. They are not shown on the stack.
** main returns 0 to the compiler-generated code that called main.


Nested procedures
----------------- 
If there are no nested procedures, only two kinds of variables are visible
inside any procedure: local and global.
With nested procedures, a procedure can also access the local variables
of the procedures that enclose it.
Example:
int p(int n) {
  int x = 1;
  int  q(int y) {
      ... x
  }
  q(3);
}
main() {
   ...p(5);
}


Diagram of run-time stack:

    main's BP-->|------------------|<----------+
    main's AR   |       5          |           |
                |------------------|           | 
                |   Ret Addr       |           |Control link
      p's BP--->|------------------|           |
      p's AR    |        *---------+-----------+
                |------------------|<----+<----+     
                |       x=1        |     |     |
                |------------------|     |     |
                |        3         |     |     |
                |------------------|     |     |
                |   Ret Addr       |     |Control link
      q's BP--->|------------------|     |     |  
      q's AR    |        *---------+-----+     |
                |------------------|           | Access Link
                |       *----------+-----------+
                |------------------|
 

The access link is used to find non-local variables: if a variable is not
found in the current scope, follow the access link to the AR of the
enclosing procedure and look for the variable there. Maintaining and
following access links adds overhead, so this scheme is less efficient
than one that needs only control links.

Note: ** Nested procedures are not allowed in Java and C, which avoids
this complication; we can always do away with nested procedures by using
the parameter passing mechanism. ** The only benefit of nested procedures
is access to variables of surrounding procedures without having to 
pass these values explicitly as parameters. In C/Java, to have the same
effect, you need to pass such variables explicitly as parameters. This
is a relatively minor inconvenience, but it makes the runtime
implementation significantly simpler and more efficient.

Note that in general, an access link may point to the AR of a surrounding
procedure that is arbitrarily deep in the stack. For instance, in the
above example, if q is recursive, then the access links of all these
recursive calls to q will all point to the AR for p.
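
Since C++ has no nested procedures, the sketch below builds the ARs for p
and q by hand to show how an access link is used. FrameP, FrameQ, run_p
and run_q are invented names, and q is made recursive here (unlike in the
example above) so that the point about recursive calls can be observed.

```cpp
#include <cassert>

// AR of p: holds its local x.
struct FrameP {
    int x;
};

// AR of q: control link to the caller's AR, access link to the AR of the
// enclosing procedure p, and the parameter y.
struct FrameQ {
    const void* control_link;
    FrameP* access_link;
    int y;
};

// Body of q. It reads the non-local x through the access link and calls
// itself until y reaches 0. Returns the sum of the x values it read.
int run_q(const void* caller, FrameP* enclosing, int y) {
    assert(y >= 0 && enclosing != nullptr);
    FrameQ self{caller, enclosing, y};
    int seen = self.access_link->x;  // non-local access to x
    if (self.y == 0) return seen;
    // The recursive call's control link is this AR, but its access link
    // is still the AR of p.
    return seen + run_q(&self, self.access_link, self.y - 1);
}

// Body of p: int x = 1; q(3);
int run_p() {
    FrameP self{1};
    // Called directly from p, so control link and access link coincide.
    return run_q(&self, &self, 3);
}
```

Here run_p() returns 4: each of the four calls to q (y = 3, 2, 1, 0)
reads the same x = 1 from p's AR.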


Implementation Aspects of OO-Languages
---------------------------------------

Allocation of space for data members: The space for data members is 
laid out the same way it is done for structures in C or other languages.
Specifically:
 -- the data members are allocated next to each other
 -- some padding may be required in between fields, if the
    underlying machine architecture requires primitive types to
    be aligned at certain addresses
 -- at runtime, there is no need to look up the name of a field and
    identify the corresponding offset into a structure; instead, we
    can statically translate field names into relative addresses,
    with respect to the beginning of the object.
 -- data members for a derived class immediately follow the data
    members of the base class
 -- multiple inheritance requires more complicated handling; we
    will not discuss it here

Example:

class B {
      int i;
      double d;
      char c;
      float f;
}

Layout of objects of type B:
     ---------------
  0: | int i       | // Assumption: Integer requires 4 bytes
     ---------------
  4: | XXXXXXXXXXX | // pad, assuming double's are to be 
     --------------- // aligned on 8-byte boundaries
  8: | double d    |
     |             | // Assumption: Double requires 8 bytes
     ---------------
 16: | char c|XXXXX| // Assumption: char needs 1 byte, 3 bytes are padded
     ---------------
 20: | float f     | // Assumption: float to be aligned on 4-byte boundaries,
     --------------- // and require 4-bytes of space.
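
The offsets in this diagram can be reproduced with a small layout
routine. It is a sketch of the general idea (place each field at the
first suitably aligned offset after the previous one), not the rule of
any particular compiler; Field, align_up and layout are names made up
here, and the sizes and alignments are the assumptions stated in the
diagram.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Field {
    std::size_t size;   // bytes occupied by the field
    std::size_t align;  // required alignment in bytes (a power of two)
};

// Round `offset` up to the next multiple of `align`.
std::size_t align_up(std::size_t offset, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

// Compute the offset of each field, inserting padding where needed.
std::vector<std::size_t> layout(const std::vector<Field>& fields) {
    std::vector<std::size_t> offsets;
    std::size_t next = 0;  // first byte not yet used
    for (const Field& f : fields) {
        std::size_t offset = align_up(next, f.align);
        offsets.push_back(offset);
        next = offset + f.size;
    }
    return offsets;
}
```

For the fields of B, given as {4,4}, {8,8}, {1,1}, {4,4} (size, alignment
for int, double, char, float), layout yields the offsets 0, 8, 16 and 20.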

class C {
  int k, l;
  B b;
}

     ---------------
  0: | int k       |
     ---------------
  4: | int l       |
     ---------------
  8: | int i       |
     ---------------
 12: | XXXXXXXXXXX |
     ---------------
 16: | double d    |
     |             |
     ---------------
 24: | char c|XXXXX|
     ---------------
 28: | float f     |
     ---------------

class D: public C {
   double x;
}

     ---------------
  0: | int k       |
     ---------------
  4: | int l       |
     ---------------
  8: | int i       |
     ---------------
 12: | XXXXXXXXXXX |
     ---------------
 16: | double d    |
     |             |
     ---------------
 24: | char c|XXXXX|
     ---------------
 28: | float f     |
     ---------------
 32: | double x    |
     |             |
     ---------------

Implementation of Virtual Functions
------------------------------------

Approach 1: Lookup type info at runtime, and then call the function
   defined by that type.
   Problem: very expensive, and requires type info to be maintained at runtime

Approach 2: Treat function members like data members: Allocate storage for
   them within the object. Put a pointer to the function in this location,
   and translate calls to the function to make an indirection through this
   field.
   Benefit: No need to maintain type info at runtime. Implementation of
   virtual methods is fast, as it requires only a dereferencing of the
   field that stores the pointer to the member function to be invoked.
   Problem: Potentially a lot of space is wasted in each object, even though
   all objects of the same class have identical values for these pointers.

Approach 3: Introduce additional indirection into approach 2: Store a pointer
   to a table in the object, and this table holds the actual pointers to
   virtual functions. Now we use only one word of storage in each object.

class B {
   int i ;
   char c ;
   virtual void g();
   virtual void h() ;
  }

  B b1, b2;

      b1:
       +-------------+
       |    i        |
       |-------------|
       |    c        |
       |-------------|
       | VMT ptr  ---|----------------+
       +-------------+                |
                                      |
                                      |
                                      |            Virtual Method Table (VMT)
                                      |                 for class B
                                      |                 +-------------+
                                      +---------------->|ptr to B's g |
                                      |                 |-------------|
                                      |                 |ptr to B's h |
      b2:                             |                 |-------------|
       +-------------+                |
       |    i        |                |
       |-------------|                |
       |    c        |                |
       |-------------|                |
       | VMT ptr  ---|----------------+
       +-------------+                
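
Approach 3 can be written out by hand, without using the virtual keyword,
to make the indirection visible. This is only a sketch: the names VMT,
B_g, B_h, make_B and call_h are invented, and g and h return int here
(rather than void) so that the effect of a call can be observed.

```cpp
#include <cassert>

struct B;  // forward declaration, needed by the function pointer types

// One table per class: pointers to the class's virtual functions.
struct VMT {
    int (*g)(B*);
    int (*h)(B*);
};

struct B {
    int i;
    char c;
    const VMT* vmt;  // the single word of per-object overhead
};

// Hypothetical bodies for B's g and h.
int B_g(B* self) { return self->i; }
int B_h(B* self) { return self->i + 1; }

// The one table shared by every object of class B.
const VMT B_vmt = {B_g, B_h};

// Creating an object stores a pointer to the class's table in it.
B make_B(int i, char c) { return B{i, c, &B_vmt}; }

// The call b->h() becomes: load the VMT ptr, load slot h, call through it.
int call_h(B* b) {
    assert(b != nullptr && b->vmt != nullptr);
    return b->vmt->h(b);
}
```

Two objects created with make_B hold the same vmt pointer, matching the
diagram in which b1 and b2 point to a single table.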

Impact of subtype principle on Implementation of OO-Languages
-------------------------------------------------------------

The subtype principle requires that any piece of code that operates on
an object of type B can work "as is" when given an object belonging
to a subclass of B. This implies that runtime representation used for
objects of a subtype A must be compatible with those for objects of
the base type B. 

Note that the way the fields of an object are accessed at runtime is
using an offset from the start address for the object. For instance,
b1.i will be accessed using an expression of the form (we use C-like
syntax here)
       *(&b1+0)
where 0 is the offset corresponding to the field i. Similarly, the field
b1.c will be accessed using the expression
       *(&b1+1)
(This assumes that addresses are given in terms of words, and a single
word of memory can store an integer variable. If we use byte addressing
instead, and if all integers require 4 bytes, then the access would use
an expression of the form *(&b1+4).)

Similarly, an invocation of the virtual member function b1.h() will be 
implemented at runtime using an instruction of the form:
   call *(*(&b1+2)+1)
where:
    &b1+2 gives the location where the VMT ptr is located
    *(&b1+2) gives the value of the VMT ptr, which corresponds
             to the location of the VMT table
    *(&b1+2) + 1 yields the location within the VMT table where the
             pointer to virtual function h is stored. (Note that the
             pointer to h is stored at offset 1 from the base of VMT.)

Viewed in light of the way member fields and operations are accessed,
the subtype principle imposes the following constraints:
 -- any field of an object of type B must be stored at the same
    offset from the base of any object that belongs to a subtype of B
 -- the VMT ptr must be present at the same offset from the base of 
    any object of type B or one of its subclasses
 -- the location of virtual function pointers within the VMT should remain
    the same for all virtual functions of B across all subclasses of B.

As a result, we must use the following layout for an object of type A
defined as follows:

class A: public B {
      float f;
      void h(); // Redefines h, but reuses implementation of g from B
      virtual void k();
}

A a;
       a's layout:                           Virtual Method Table (VMT)  
       +-------------+                            for class A            
       |    i        |                           +-------------+         
       |-------------|                    /----->|ptr to B's g |         
       |    c        |                   /       |-------------|         
       |-------------|                  /        |ptr to A's h |         
       | VMT ptr  ---|-----------------/         |-------------|         
       |-------------|                           | ptr to A's k|         
       |  float f    |                           +-------------+         
       +-------------+                
                                      
Note that in order to satisfy the constraint that VMT ptr appear at the
same position in objects of type A and B, it is necessary for the data
field f in A to appear after the VMT field.

A couple of other points:
a) non-virtual functions are statically dispatched, so they do not
   appear in the VMT table
b) when a virtual function f is NOT redefined in a subclass, the VMT table
   for that class is initialized with an entry to the function f defined
   in its superclass.
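
The layout constraints can be checked with a hand-built model in which
each VMT is an array of function pointers indexed by slot number. This is
a sketch, not what a C++ compiler emits: the struct A below embeds B as
its first member to imitate "B's fields first", the slot names kG, kH, kK
are invented, and the functions return int only so that the dispatch
result can be observed.

```cpp
#include <cassert>

struct B;
using Method = int (*)(B*);

struct B {
    int i;
    char c;
    const Method* vmt;  // same offset in objects of type B and of type A
};

// Layout of A: everything B has, in the same order, then A's own field.
struct A {
    B base;
    float f;
};

// Slot positions within the VMT; g and h keep their slots in every subclass.
enum Slot { kG = 0, kH = 1, kK = 2 };

int B_g(B*) { return 1; }
int B_h(B*) { return 2; }
int A_h(B*) { return 20; }
int A_k(B*) { return 30; }

const Method B_table[] = {B_g, B_h};
const Method A_table[] = {B_g, A_h, A_k};  // g inherited, h redefined, k added

// Code written for B: it works unchanged when handed the B part of an A.
int call_g(B* b) { assert(b != nullptr); return b->vmt[kG](b); }
int call_h(B* b) { assert(b != nullptr); return b->vmt[kH](b); }
```

With a B object whose vmt is B_table, call_h reaches B's h; with an A
object whose vmt is A_table, the same call_h reaches A's h, while call_g
still reaches B's g because A's table reuses that entry.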