IP Pascal Language

 

IP Pascal complies with the complete ISO 7185 Pascal language at level 0.

IP Pascal implements many extensions to the original Pascal language. We have extended the language very carefully to keep compatibility with the original. For example, as in many Pascals, a file can be opened by name. However, unlike many Pascals, an "advanced string type" is not required to do this; ordinary Pascal string types are acceptable. There are many other such examples.


Basic language extensions

Alphanumeric goto labels:

label terminate;

...

goto terminate;

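Numeric constants in other bases:

$feeb  Hexadecimal
&76    Octal
%0110  Binary

Character escapes in strings, by name or by number:

writeln('hello\sub\ff\0\$a5');

Any number base can be used.

Boolean operators on integers:

var a: integer;

writeln(a or $a5);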

A forwarded procedure can repeat its parameter list at the definition:

procedure wrtout(a: integer; c: char); forward;

...

procedure wrtout(a: integer; c: char);

begin

   ...

end;

The "halt" procedure terminates the program:

procedure error;

begin

   writeln('Program terminating');
   halt

end;

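Program header files are connected to the names given on the command line:

c:\TEST> copyfile myfile.txt newfile.txt

program copyfile(output, infile, outfile);

var c: char;

begin

   while not eof(infile) do
      if eoln(infile) then begin

         readln(infile); { skip the line end so the loop advances }
         writeln(outfile)

      end else begin

         read(infile, c);
         write(outfile, c)

      end;
   writeln('Copy complete')

end.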

Additional predefined header files:

error
list
command

program echo(command, output);

var c: char;

begin

   writeln('Command line: ');
   while not eoln(command) do begin

      read(command, c);
      write(c)

   end;
   writeln

end.

Opening a file by name:

var f: text;

begin

   assign(f, 'myfile.txt');
   reset(f);

   ...

repeat

   assign(f, name);
   reset(f);

   ...

   close(f)

until done;

update(f)    

position(f, p)    

location(f)    

 

type byte = 0..255;
     bytfil = file of byte;

var bf: bytfil;

begin

   write(bf, $a5);

   ...

var r: packed record a: 0..3; b: 0..255; c: (one, two, three) end;

Gives a record with the memory format:

[Figure: memory layout of record r]

Note that byte field "b" crosses a byte boundary. Boolean fields can be used as "shims" to adjust the appearance of the record. This capability to control exactly how a record is laid out is equivalent to the "union" construct in C.

 

type number = (one, two, three);

var n: number;

begin

   n := number(1);

   ...

type byte = 0..255;

var a, b: integer;

begin

   a := byte(b+1);

   ...

The ability to specify exactly the precision needed helps greatly in embedded applications on small processors (8 or 16 bit) where it would be inefficient to perform all expression operations at full integer precision. This is equivalent in function to the C language use of different size types such as "char", "int" and "long" using rules of promotion that only promote to the largest element in the expression.

 

var x: integer;

procedure a(view b: integer);

begin

   ...

end;

begin

   a(x*5)

A view parameter has all of the efficiency of a var parameter, but uses value passing rules. For example:

 

type string10 = packed array [1..10] of char;

procedure a(view s: string10);

...

Normally, a value parameter of an array such as "string10" would be inefficient because of the need to copy the entire array to the value parameter. However, typing the parameter as "view" leaves it up to the compiler how to most efficiently pass the parameter, by value or reference. The effect is the same, since the parameter cannot be modified, just as in a value parameter.

View parameters are also useful for cases where we want to be sure that a parameter is not modified at all within a procedure.

Any attempt to modify a view parameter within the routine is flagged as an error by the compiler.

 

fixed a: integer = 1;
      b: array [1..10] of integer = array 5, 6, 8, 2, 3, 5, 9, 1, 12, 85 end;
      c: record a: integer; b: char end = record 1, 'a' end;
      d: array [1..5] of packed array [1..5] of char = array

         'tooth',
         'trap ',
         'same ',
         'fall ',
         'radio'

      end;

Fixed types have a declaration syntax similar to "var" declared types, except for the appearance of the initializer (= ...). Fixed types can be used anywhere a var type can, but fixed types cannot be modified or threatened (ISO 7185 contains the definition of a "threatened" variable, including appearing as a "for" index, "var" parameter, etc.).

Fixed types can be structured to any complexity, but cannot be files, pointers, or variant records, because these are all dynamically changeable constructs. The syntax of the initializer makes it clear what type of structure is being initialized, array or record.

 

Declarations can appear in any order:

program test;

var a: integer;

label 99;

procedure x; begin end;

const b = 5;

...

program test;

#include myfile.pas

The include operator is limited in function: it must be flush left (the first character on the line must be "#"), and only whole lines can be inserted at a line boundary. The include file mechanism is not the preferred way to break programs into files; that is the purpose of modularity. However, the include operator has occasional uses.


General arrays

One of the practical difficulties of Pascal is the appearance of arrays as fixed types. This leads to constructs such as:

 

type string10 = packed array [1..10] of char;

procedure wrtspc(s: string10);

...

The greatest impact of fixed array types is on the creation of general purpose libraries of code, which cannot anticipate what array lengths might be required. Also impacted is the efficiency of dynamically stored strings, such as large lists of strings, which must take more space than necessary because of a fixed type.

The ISO 7185 "conformant array" construct, which was not specified as required by the standard, was an attempt to deal with the library problem, but does not address dynamic storage issues.

Many compilers deal with the problem by importing Basic-like "strings" into Pascal, which solves the problem only for one class of arrays (character strings). It is also common to allow the user to specify the exact length of the dynamically allocated object in a "new" statement, which unfortunately has the side effect of negating type safety.

The extended Pascal standard also addresses the problem of dynamic length arrays with "schemas", or parameterizable types.

I studied the problem of dynamic length arrays for several years, and identified several requirements that any such solution must have to be acceptable.

In short, I took the problem quite seriously, and created a number of candidate schemes to solve the problem, along with other approaches, and rated each one according to the above requirements.

The winning scheme, the one that appears in IP Pascal, was certainly not the most obvious scheme. It is similar to the approach taken in the language Oberon, and in fact is functionally identical to the method used in C, but with type safety included. It has strong limitations on what you can do with it, but after using it for several years I have found that it can perform all the functions of more complex schemes by using the right techniques. This is mainly because the system is a "building block", that you can use to form more complex structures yourself, vs. a complex and general scheme.

General arrays use the principle that the address of an object (a pointer to it) is not affected by its type, and that such a pointer can be formed or destroyed at will during the program run. A pointer to such a dynamic type can, therefore, also contain the parameters of its type, such as a length.

A general array has the form:

type ga = array of char;

I.e., the index specification of the array is left off.

The primary rule of general arrays is that they can never be statically allocated. Thus:

var gav: array of char;

Is illegal. This would seem to be a great limitation, but it turns out to be what makes the entire general array scheme work, as will be clear shortly.

A general array has an implied index type of integer, and the first array element is always 1. The compatibility rules of general arrays are:

 Two arrays are assignment compatible if:

  1. One or both of the arrays is a general array.
  2. The arrays have compatible base types.
  3. Both arrays have the same packed/unpacked status.

So, for example, the following declarations are compatible:

 

type a = packed array [1..100] of char;

     b = packed array of char;

 

The following declarations are not compatible:

 

type a = packed array [1..10] of integer;
     b = packed array of boolean;

So, for example, we can create a general array string output routine:

 

type string = packed array of char;

var mystring: packed array [1..8] of char;

procedure wrtstr(var s: string);

var i: integer;

begin

   for i := 1 to max(s) do write(s[i]);
   writeln

end;

begin

   mystring := 'hi there';
   wrtstr(mystring);

end.

The function "max" returns the highest index of any general array. A general array can appear as a "var" parameter because such a parameter is referenced by address and is not statically allocated.

More interesting is the example:

 

type string = packed array of char;

procedure wrtstr(view s: string);

var i: integer;

begin

   for i := 1 to max(s) do write(s[i]);
   writeln

end;

begin

   wrtstr('hi there');

end.

The first thing that is different with this example is that the "view" declaration is used for the general array. This allows it to be a reference parameter, but still use value passing rules (because the view parameter cannot be modified). Since value passing rules are in effect for the general string parameter, it can be passed a constant string.
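As a further sketch of the same technique, the function below counts how often a character occurs in a string of any length. It relies only on "view", "max" and the general array type as described above. The names "cntdemo", "count", "c" and "n" are invented for this illustration and are not part of IP Pascal.

```pascal
program cntdemo(output);

type string = packed array of char;

{ count the occurrences of c in s; works for any string length }
function count(view s: string; c: char): integer;

var i, n: integer;

begin

   n := 0;
   for i := 1 to max(s) do
      if s[i] = c then n := n+1;
   count := n

end;

begin

   { a constant string can be passed because s is a view parameter }
   writeln(count('hi there', 'h'))

end.
```

The string 'hi there' contains the character 'h' twice, so the program should report a count of 2.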

Besides parameters, general arrays can be dynamically allocated as pointers:

 

type a = array of integer;

var p: ^a;
    i: integer;

begin

   new(p, 1000); { allocate general array for 1000 integers }
   for i := 1 to 1000 do p^[i] := 0;

   ...

The form of "new" used to allocate a general array is similar to that of a variant record allocation with a tagfield. However, the length is neither required nor allowed in a "dispose" statement.

Dynamically allocated general arrays are a general building block for data structures of any complexity. For example, it is not possible to create a general array that has more than one dimension, but such arrays can be easily built:

 

type string = packed array of char;
     pstring = ^string;

var a: array [1..1000] of pstring;
    i: integer;

begin

   for i := 1 to 1000 do new(a[i], 100);

   ...

In fact, this can be used to create an array whose base elements vary in length from each other, impossible in a normal matrix structure.
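A minimal sketch of such a ragged structure follows. It assumes that the length passed to "new" may be a variable and that "max" can be applied to a dereferenced pointer; neither is shown explicitly above, so treat both as assumptions. The program name and the row count of 5 are arbitrary.

```pascal
program ragged(output);

type string  = packed array of char;
     pstring = ^string;

var a:    array [1..5] of pstring;
    i, j: integer;

begin

   { row i is allocated with exactly i characters }
   for i := 1 to 5 do begin

      new(a[i], i);
      for j := 1 to max(a[i]^) do a[i]^[j] := '*'

   end;
   { each row carries its own length, so no separate length table is kept }
   for i := 1 to 5 do begin

      for j := 1 to max(a[i]^) do write(a[i]^[j]);
      writeln

   end;
   { no length is given when a general array is disposed }
   for i := 1 to 5 do dispose(a[i])

end.
```

Under those assumptions the program writes five lines of one to five asterisks.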

In short, general arrays give the benefits of more complex dynamic array schemes by placing the work of creating complex structures onto the programmer. This is exactly how C performs the same function. There is no "dynamic matrix" declaration in C, nor can dynamic length arrays be statically allocated in original ANSI C. The programmer uses "malloc"ed array structures to build these constructs. IP Pascal gives these same capabilities, but with full type safety. Further, the general array scheme uses the minimum amount of extra data, a single word to describe the length of an array, and only uses that where required, leaving standard fixed arrays exactly as they are in ISO 7185 Pascal. Finally, the scheme is completely compatible with ISO 7185 Pascal right down to assignment compatibility between the general array and standard array types.


 

Modularity

 

IP Pascal implements modularity as a formal language construct. The linkages between modules are fully type checked. IP Pascal modules are designed so that it is not necessary to have both a declaration module and an implementation module. Any module can be used for declarations as well as being the implementation of the module. If the user desires, the implementation code can be stripped from a module so that it does become a declaration module only.

The basic module construct is:

 

module mymod(input, output);

const one = 1;

type string = packed array of char;

procedure wrtstr(view s: string); forward;

private

var s: ^string; { a general array cannot be statically allocated }

procedure wrtstr(view s: string);

begin

end;

begin

   { initialization }

end;

begin

   { finalization }

end.

A module appears very much like a program, which in IP Pascal is just considered a special type of module. A module has header files, but cannot automatically parse command line parameters like a program can. Modules can get access to the input, output, error, list and command files just as a program can. For a module to use any of the files, or perform default input and output, the file must appear in the header, just as in a program.

Declarations appear in a module just as in a program. IP Pascal implements a new keyword "private" that demarcates between declarations that other modules can use, and declarations that are only to appear within the module. Procedures and functions can be forwarded from the public area to the private area. The private keyword is not restricted to modules; it can also be used in programs.

Modules have two main body blocks, known as the initialization and finalization blocks. The normal main block that you are familiar with from the program construct is the initialization block. The second, optional block is the finalization block. When multiple modules appear in a program image, each of the initialization blocks in the image is executed before the program module gains control. After the program module exits, each module has its finalization section executed. The order in which the initialization and finalization sections are executed is specified at link time, and will be covered below in "The Modular System".

A module is compiled on its own as a unit. A separate module that wants to use the module includes it in a "uses" statement.

 

program test;

uses extlib, gralib;

...

Appearance in a uses statement effectively includes all of the non-private elements of the used module into the using module. Either a program or a module can have a uses statement. IP Pascal automatically keeps track of used modules to prevent duplicate declarations, and recursive or circular uses references are valid. IP Pascal knows how to extract declaration information from either the source code or the compiled code, and will automatically choose whichever is available and most recently modified. The compilation proceeds faster if the precompiled declarations are used.
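A minimal sketch of a module and a program that uses it, assuming the public routines are declared with "forward" and defined after "private" as described above. The names "counter", "bump", "total" and "usecount" are invented for illustration, and the two units would be compiled separately.

```pascal
{ --- first unit: the module --- }
module counter(output);

procedure bump; forward;
function total: integer; forward;

private

var n: integer; { visible only inside the module }

procedure bump;

begin

   n := n+1

end;

function total: integer;

begin

   total := n

end;

begin

   n := 0 { initialization: runs before the program block }

end;

begin

   writeln('counter finished at ', total) { finalization }

end.

{ --- second unit: the program --- }
program usecount(output);

uses counter;

begin

   bump;
   bump;
   writeln(total)

end.
```

The program sees "bump" and "total" but not "n", so the count can only change through the module's own routines.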


Tasking Modules

 

Besides the program and module code units, IP Pascal implements a full source level system to implement multiple tasking and threads. In many or even most languages, threading and multitasking are left up to the programmer to add on with procedures like "threadstart", "lock" and "unlock". This ad-hoc methodology works, but leads to deadlocks, data corruption and similar problems when it does not. Multithreading is a challenging programming technique to get right, and generates problems that are very difficult to debug.

IP Pascal implements multitasking/multithreading as a formal language construct, and takes the set up, tear down and synchronisation of multiple tasks and threads over on behalf of the programmer. The result is code that is "correct by construction", with a clean set of rules governing the use of multithreading/multitasking.

The "process" module looks very much like a program module:

 

process mythread(input, output);

type string = packed array of char;

var a: integer;

procedure x; begin end;

begin

   ...

end.

A process module can have file parameters, but the programmer should pick one process or program module to handle I/O so that output is not mixed up.

Like a program module, a process only has an initialization section. When the entire program image is started, each process receives its own thread, and thus each process runs until the program completes. The program can optionally arrange to signal each process that it is to terminate.

A process module cannot have a uses statement that includes a program or standard module. The program and standard modules form a group that is shared within the primary program thread. Instead, a process module can only use one of two types of modules, the monitor and share modules.

A monitor appears very much as a standard module:

monitor x(input, output);

private

var y: integer;

procedure x(var a: integer);

begin

   y := a

end;

begin

   { initialization }

end;

begin

   { finalization }

end.

Monitors can contain header files, and have both initialization and finalization sections, as standard modules do. There are two basic rules that a monitor must obey:

 

  1. Monitors cannot contain any non-private variable declarations, only private ones.
  2. Monitors cannot contain any non-private functions or procedures that have pointer parameters, nor parameters that are structures containing pointers.

The way monitors work is as follows. A multitask lock exists for the monitor. Each non-private call to the monitor acquires the lock on entry, and releases the lock on exit.  Because variable data can only be declared as private within the monitor, the data for the module can only be accessed by the monitor procedures and functions while under protection of the multitask lock associated with the module. Monitors are not allowed to pass pointers into or out of the monitor, since that would effectively allow the access of monitor data or data from another process or program outside the locking system.

Because monitors are multitask protected, the routines in a monitor can be accessed by any module in the system, including a program, standard module, process, other monitor or share module.
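As a sketch of how a process might share data through a monitor, the two units below keep a running sum that any task can add to. The names "tally", "add", "get" and "worker" are hypothetical. The sketch also assumes that public routines are declared with "forward" as in standard modules, and that the header can be omitted when no files are used.

```pascal
{ --- first unit: the monitor --- }
monitor tally;

procedure add(n: integer); forward;
function get: integer; forward;

private

var sum: integer; { private, so only reachable under the monitor lock }

procedure add(n: integer);

begin

   sum := sum+n { the lock is held from entry to exit of this call }

end;

function get: integer;

begin

   get := sum

end;

begin

   sum := 0 { initialization }

end;

begin

   { finalization }

end.

{ --- second unit: a process that uses the monitor --- }
process worker;

uses tally;

var i: integer;

begin

   for i := 1 to 100 do add(1)

end.
```

Both rules are respected here: the only variable is private, and no parameter is a pointer or contains one.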

Because monitors have a certain amount of overhead due to the locking system, another type of module known as a "share" is available:

 

share x(input, output);

const a = 1;

procedure x; begin end;

.

A share module cannot have any variables outside of the variables local to the procedures and functions it contains. Because it has no global data, it has no need for initialization and finalization sections, and the share module ends with a single '.'. Because share modules contain only procedures, functions, and constant declarations, share modules can also be used by any other module type in the system. Shares are a good place to put libraries of procedures, functions, constants, types and fixed declarations so they are usable anywhere in the system without the overhead of multitask locking.
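A small sketch of a library placed in a share follows. The names "convlib", "ndigits", "digits", "hexdigit" and "hexdemo" are invented for this example. It also assumes that a packed array of char can be given a string constant as its "fixed" initializer, and that the header can be omitted when no files are used.

```pascal
{ --- first unit: the share --- }
share convlib;

const ndigits = 16;

fixed digits: packed array [1..16] of char = '0123456789abcdef';

{ return the hexadecimal digit character for a non-negative v }
function hexdigit(v: integer): char;

begin

   hexdigit := digits[v mod ndigits+1]

end;

.

{ --- second unit: a program that uses the share --- }
program hexdemo(output);

uses convlib;

begin

   writeln(hexdigit(10))

end.
```

Since "digits" is a fixed declaration and "hexdigit" keeps no state of its own, the share holds no variable data, which is why it needs no lock.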


The Modular System

The system of modules is very important in IP Pascal. Modules have a special format in memory, and form a "stack" that implements the program:

[Figure: the module stack]

Much like the layout of a Pascal program, the module stack places the most basic modules at the bottom, and the most complex modules at the top. Thus, the basic I/O modules are placed at the bottom, and the module that contains the "program" block is at the top. When a program is executed, it actually successively activates its module stack. Each module gets control, executes its initialization code, then calls the next higher module. When that module returns, it executes its finalization code and returns itself. Thus, the chain of control ripples up the module stack until the end is reached, then ripples back down.
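The ripple of control can be imitated in plain ISO 7185 Pascal, with one procedure standing in for each module. This is only an analogy for the order of execution, not how IP Pascal builds the stack; the procedure names "sysmod", "gramod", "prog" and "cap" are stand-ins chosen for this sketch.

```pascal
program ripple(output);

{ stands in for the module above the program: it simply returns }
procedure cap;

begin
end;

{ stands in for the program module: its initialization is its body }
procedure prog;

begin

   writeln('program body');
   cap

end;

{ stands in for a graphics level module }
procedure gramod;

begin

   writeln('graphics initialization');
   prog; { call the next higher module }
   writeln('graphics finalization')

end;

{ stands in for the basic I/O module at the bottom of the stack }
procedure sysmod;

begin

   writeln('system initialization');
   gramod; { call the next higher module }
   writeln('system finalization')

end;

begin

   sysmod

end.
```

The initialization lines are written from the bottom of the stack upward, then the program body runs, and the finalization lines are written in the reverse order on the way back down.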

The program module is the top of the module stack. It runs its initialization, which is actually its program body, and when it exits, the entire program completes and exits (this is arranged via a special module known as the "cap", whose only job is to simply return). Process modules also have only an initialization, that appears to never exit. However, the actual initialization for a process module is to spawn a thread to execute the initialization code. When the finalization is executed, it terminates the thread forcibly.

IP Pascal implements its own OS support code via the module system. The Pascal I/O is written in Pascal itself, including the code to implement things like writeln, readln, page, etc. Although these are compiler built in calls, IP Pascal actually breaks them down into standard Pascal compatible calls so that the support library can execute them.

IP Pascal's I/O mechanisms are built using a successive series of more primitive modules. The lowest module is the OS "wrapper" module. This module implements the basic OS call translation and any startup mechanisms needed. The next module, "system", implements a set of I/O calls that represent the minimum functionality that IP Pascal needs to express all its basic serial and command line I/O. This module translates the IP Pascal way of doing things to the particular OS in use.

Higher level functions

 Because the calls for various implementation levels of graphics are compatible, the module stack is easily upgraded to graphics mode use.

Modules have the ability to "override" each others calls. So for example, a higher level module such as Gralib, the graphics level support library, can override the basic I/O calls contained in System. This is how the ordinary writeln, readln, and other common Pascal statements are redefined so that they end up outputting graphically drawn characters onto a graphics window.

Extending the system

As another example, here we have added a module to add Internet functionality to the stack.

Tcplib adds Internet (Tcp/IP) capability to the basic I/O of IP Pascal. In this way, standard IP Pascal files, using writeln, readln, and standard Pascal functions, can be redefined into Internet data links by specifying the link parameters such as the destination IP address and port in a special file name syntax during an assign.

Custom modules

Because we define all of the details of the base implementation modules for IP Pascal, and modules are implemented using standard IP Pascal syntax, it's a very easy matter to completely customize an implementation of IP Pascal if you desire. You can move a program to an operating system IP Pascal does not yet run on, or implement it on an embedded system with completely custom I/O. You can also decide how complex you wish the implementation to be. The minimum serial level I/O can be implemented, or a full windowed graphical system, or anything in between.


The companion assembler/linker

IP Pascal comes complete with its own assembler and linker. The assembler and linker are designed to be both machine and operating system independent. The assembler is a full macro assembler with what I call "zero pass" methodology. Instead of making one, two, or some other number of "passes" over the source, as (the assembler) reads the source only once and compiles all of the expressions into the linker format with full complexity. ln (the linker) can then resolve any complexity of expression, including address dependencies, at link time. There are no pass dependencies in as/ln.

On the 80x86 family, we use a mode known as "type free", similar to what AT&T and GAS use. There are no "type" concepts associated with symbols. Instead, each instruction is typed for operation and address length by the instruction notation itself, using Intel standard postfixes:

 

b    8 bit byte.
w    16 bit word.
d    32 bit double word.
q    64 bit quad word.

So that:

 

movw    [eax],$56

 

Is a "self typed" instruction.

Similar to the expression compilation of as/ln, macros are also precompiled for speed. Macros are stored as a series of strings with instructions on how to insert the parameters to a macro in dictionary form. The result is that no searching is performed when a macro is expanded, greatly accelerating the processing of macros.


 For more information contact: Scott A. Moore samiam@moorecad.com