Scheme the programming language [1, 4] is a LISP-derivative that has wide use as a educational and scientific programming language. Scheme has been used as the first language students have been taught in HUT CS lab courses due to its simplicity, elegancy and functionality.
Agent-based operating systems are a complete counterpart for file-oriented operating systems (such as Plan 9) [3]. In a fully agent-based operating system a process-like entity called agent can model everything: all database services, kernel services, computational services etc. are, or can be viewed as agents. The result is a network-oriented, load-balanced operating system [3].
The aim of the project is to implement a simulation of a planned agent-based operating system called SCHEMESTATION that uses Scheme as its primary systems and application programming language. At the moment, there is no real implentation --- neither simulation nor native --- of SCHEMESTATION. The operating system is however specified in the Operating system specification.
A functioning SCHEMESTATION simulator will provide a basis for scientific operating systems research, involving distributed (migrated) agent based programming and trust based security Scheme among others. It can also be used as part of Scheme language education as it provides means for compiling, debugging and running Scheme code.
The SCHEMESTATION project involves writing a Scheme [cross-]compiler in Scheme. The users will be able to compile their own Scheme programs to native SCHEMESTATION byte code, run the code as agents, watching their execution.
The implemented simulation will naturally serve as an invaluable resource of information when a stand-alone version of SCHEMESTATION is implemented.
Any intermediate computer science student (or a person with equal skills) should be able to use the system for agent-style Scheme programming, and any computer science researcher (or a person with equal skills) should be able to understand the structure of the system, and possibly write extensions to it.
The simulator can be used as an educational tool due to the programming and debugging facilities it assumedly provides.
The system can be used in both research and education, since extensions to it can be written easily. Studying the structure of the system might also have some educational value.
Agent based systems can be implemented on the top of this simulator, which makes this simulator an interesting test bed for agent programming reasearch.
A separate document (Terminology specifications) describes the terminology used throughout the documentation.
This document specifies the functionality that the SCHEMESTATION simulation --- built during the class Tik-76.115 Individual Project --- will provide. Some features must be also specified from the internal point of view. This is because a functional specification of a operating system cannot be restricted only to a high-level user interface. The SCHEMESTATION OS itself is specified in more detail in Operating system specification. However, this document reiterates some fundamental characteristics of the SCHEMESTATION OS where necessary.
This document is structured as follows.
The Summary draws things together and concludes the document.
Operating system specification is the authoritative specification of the abstract structure of the SCHEMESTATION operating system. The simulator implemented during the project will be an implementation of this specification. The overall conceptual structure of the implemented simulator is shown in Fig. 1.
The simulator consists of SCHEMESTATION domain simulators that execute as single-threaded processes on the top of a UNIX system. The link with the UNIX system is represented by the link between the blue rectangle representing a UNIX system and the rest of the simulator.
Domain simulators connect to other domain simulators via TCP/IP networking; thus a network interface is a conceptual part of any SCHEMESTATION domain simulator. Networking facility is connected with messaging system that facilitates intra-domain and inter-domain agent communications. Networking facility is also used as an interface for external agents and domains (domains that are not implemented as domain simulators).
Message passing system needs address system to be functional. The address system is also shown in the figure, linked with the messaging system.
The messaging system servers the set of agents that is run in a domain simulator. (The multiple lines of the actors rectangle describes the prularity of agents; this applies to other symbols as well.) Agents are not a module running on the UNIX per se, but an abstract object that exists inside a domain, and thus also in a domain simulator. Agents have their local states encoded in heaps, which are thunks of list-structured garbage-collected typed memory. These heaps are also shown in the figure. Behaviour of agents is described in terms of byte code programs.
Byte code programs can be produced externally using a cross-compiler and an assembler that together compile an extended dialect of Scheme to SCHEMESTATION byte code programs. These programs can be fed to the system using some external interface; for example, they can be made available through an object server [see Operating system specification and below].
A virtual machine is used to execute the byte code instructions. Scheduling execution activity is performed by a scheduler.
User's interface to a set of SCHEMESTATION domains is a terminal. Terminals are SCHEMESTATION domains that provide a graphical user interface. Terminals are implemented on the top of X11 windowing and are run as independent UNIX processes, in the same way as non-terminal SCHEMESTATION domains are run.
The SCHEMESTATION will run its own byte code specified in Virtual machine definition and VM instruction numbers. The instructions are fetched, decoded and executed by a virtual machine. The scheduler maintains a run queue and assures that all the agents get running time according to their requirements.
The virtual machine instruction set [9] specification is open; there is no obstacle for implementing other compilers for this system. The instruction set is however favourable for LISP-style programming languages due to its built-in heap manipulation instructions.
The list of agents will be kept in the scheduler [3] module. The list of agents' addresses --- either local or remote for migrated agents --- will be kept in the addressing module. Local address specifies an agent entry in the scheduler's queue. The agents can't access these addresses directly, they use an agent specifier which is translated to an address by the address system.
Every (non-special) agent has a garbage-collected, list-structured, typed heap. The linearizer, garbage-collector and heap module will provide this functionality.
Agents are persistent. In the simulation simulation this means, that every agent entry from the scheduler and every heap object is saved on to disk when system is shut down. They are reloaded to memory during the system wake-up.
There will be a special kernel agent in each domain that acts as an interface between the other agents and the kernel. The agents can send special message to the kernel agent in order to get services like agent-creation, child-removal, messaging, manipulation of agent's priorities and such.
The kernel interface will be specified in more detail in Operating system specification.
The compiler will consist of a lexer, parser and assembly producer parts [2, 11].
The assembler compiles SCHEMESTATION assembly language defined in Virtual machine definition to SCHEMESTATION bytecode defined in VM instruction numbers and Virtual machine definition.
The agents have randomly generated 256-bit agent specifiers that are used for addressing them. As guessing the address of an agent A is virtually impossible, the system ensures that agents not trusted by the agent A are unable to send messages to the agent A. If an agent decides to trust the agent A it can reveal its own address to the agent A.
The addressing system keeps a list of used agent specifiers and the corresponding local or remote addresses. The local address is a pointer to the scheduler agent entry and the remote address is a (domain address, agent entry pair). The data structure for the list has to be hashed array [8, 12] for optimal performance.
In this simulation every domain simulator process listens on a TCP/IP port, and accepts connection from other domains. These TCP/IP sockets are then used to carry the SCHEMESTATION messages. The convention is to use the ip-address of the unix host as the first four bytes of the domain address and the listened port as the next two bytes as the domain address. The rest is padded with zero.
When a domain needs to send a packet to domain it is not yet connected, it opens a socket [6, 7, 5] . The parties of the communication then initiate a hand-shake [10] as specified in [Networking Specification]. List of opened connections are then kept in the networking module so that a new connection is not needed for every packet. A linked list [8] will be adequate data structure for this.
Every agent has its own message queue, where the arrived messages are added. An agent can send a message by contacting the messaging module. For migrated agents, there is a remote address entry in the address table. If a message is sent to a migrated agent, it is passed forward the remote location.
The reliability of messaging in this simulation is assured by sending the message over TCP/IP. If abnormal conditions are encountered (such as a malfunctional unix host), packets may be lost. The messaging in this simulation is reliable in "normal external conditions".
On boot-up, the simulator can be instructed to connect a special domain, router through which it will routa all its packets. The router is just an ordinary domain, since every domain routes forward all the packets not belonging to itself. By default every simulated domain acts like this, but simulators can be instructed to use a router for all of their networking with command line arguments.
More accurate description about the networking can be found in Networking Specification and Networking Implementation.
The user's interface to a running simulation is a SCHEMESTATION terminal. The user can start a terminal session from the UNIX command prompt by specifying the domain the terminal should be attached to. The terminal then connects to the domain specified and is seen as an agent from the remote domain's point of view. The remote domain can then control the terminal.
The user can use the terminal to launch new agents and to utilize the already present agents in the system. (Typically, he would have rights to spawn a compiler, a debugger and to connecto to several common services.)
There can be different types of terminal (text-based versus graphical). The required terminal is a resizable text-based terminal that is implemented on the top of an X11R6 windowing system. An artistic vision of the text-based terminals' outlook (assuming that colors are included) is presented in Fig. 2.
The state of the simulation can be monitored. This includes getting information on the number of agents running, load of the VM, number of address entries in the addressing system and the number of network connections in the networking module. This functionality will be provided by a special agent connecting to the simulation through a interface that the messaging system provides.
As the core system has been implemented to the functional stage, a natice SCHEMESTATION application will be implemented. The application will be a remote chess game, that two players can play from different terminals. The boards will be displayed on the terminals, and the users will be able to move the pieces by typing the standard chess notation moves.
The chess game will check that the made moves are legal and transmit the moves.
The application will provide the means for an external agent to connect it; in practice this means that eg. a GNU Chess can act as a player.
When a new domain is created (by starting a new SCHEMESTATION simulation process on unix host), the following happens:
vherva@Schemestation.cs.hut.fi:~/Schemestation/bin>Schemestation -p 7777This instructs the Schemestation to listen to the port 7777. As no other domain address is specified via the -d switch, a domain address, whose first four bytes contain the local ip address (130.233.40.74) and the next two bytes the port 7777, is used. The Schemestation-simulator checks whether that port is already listened or whether a domain with that domain address has been saved (frozen) to disk.
If the user just wanted to wake up a frozen domain he would enter the same command and when then simulator would find a frozen domain on disk by that address it would restart it (load the address table, agents etc. from disk).
The shutdown can be either temporary or final. Temporary shutdown involves saving the system on to disk and the final forcing all the agents to migrate away and shutting the domain permanently.
The exact command line arguments for this as well as the details of the terminal user interface will be decided later, since the terminal is only a minor part of this project.
It is possible that the cross-compiler can be ported to the SCHEMESTATION operating system (this is not trivial, though). If this succeeds, then the compiler will have a SCHEMESTATION terminal -based user interface and compilation can be done using this interface, directly on the top of the simulation itself.
SS> (addresses) ((address-system 12312312443242234234) (agent-invoker 71234899861349969676) (object-server 92873479879823794798)) SS> (message 2312312443242234234 ("get" "monitor-bc")) t SS> SS> -- message received: ("monitor-bc" %byte-code%) SS> (message 71234899861349969676 ("invoke" %byte-code%)) t SS> ss>-- message received ("state-of-the-system" ("agents-running" 10) ("load-of-system" 0.23) ("network-connections" 5) ("addresses" ("local" 7) ("remote"3))) SS>
Every code module will be specified in detail with a separate document before the implementation. These documents will contain the description of the functionality the module is required to implement, and a description of the interface to the module as well a brief plan of the internal structure.
Another document will be written about the implementation of the module. It will contain the exact interface (function prototypes and a brief description of their semantics as well as documentation of data structures used). Also an overview of the structure and implementation of module has to be included.
To make code browsing easier, a web-based code-browser (in url http://Schemestation.cs.hut.fi/cgi-bin/code-browser/status.pl?index=1) will be developed as a side-effect of this project. There is a more detailed document in Code-Browser Utility. The aim is to make finding functions and their headers, searching code and getting oversight to the system for both the members of the project and the outsiders. The browser will hopefully help in the future develpoment of the SCHEMESTATION project, when new programmers try to study the system.
Certain instructions are followed writing the code. These include uniform indentation, naming policy and well-formed commentation (partly in purpose to make it possibly to use the comments with the code-browser.) Project coding policy defines the coding discipline in detail.
The portability of the source will be considered throughout the project. The language used is strictly ansi C - no Gnu extensions to C apart from long long will be used. The code must be compilable with the gcc --ansi switch without warnings. No such system calls or library functions, whose functioning on any considerable unix system is doubted, will be used unless absolutely necessary. Such architecture depended issues as the byte-order or length of interger on the target system should not have impact on the functionality of the system.
The Makefile will be made in Gnu Make format.
The portability will be ensured compiling on the following systems
The terminal will use the X11R6 window system.
The Scheme part of the source has to be R4RS compliant if it is intended to be runnable with other compilers than SCHEMESTATION Scheme compiler (for example the source of the compiler itself). Other Scheme source needs to be runnable with SCHEMESTATION Scheme compiler - the Scheme code and SCHEMESTATION bytecode are portable as specified in Operating system specification. Due to this, there are no major portability issues with the SCHEMESTATION Scheme source.
The perl-utilities developed within this project will be functional wherever at least version 5 of perl is available.
The project does not involve testing the source with different compilers, only Gnu C Compiler will be used. This however does not mean that the code was not intended to compile with other compilers.
Porting the system to other operating systems, such as MacOs, Plan 9, Amoeba, or perhaps Windows, could be possible, but is not considered during this project.
Agents failing to perform their tasks due to logical errors in the algorithms they use or due to invalid byte code instructions etc. will be handled by the simulator as specified by the Operating system specification.
Since users can only impact the system by sending messages to agents (in addition to giving the command line arguments), they cannot cause any more serious errors than agents.
Any special agents providing special services such as the kernel agent, will quietly ignore erroneous request messages.
The coding discipline [Project coding policy] and the comprehensive documentation will ensure that the code resulting of this project will be maintainable and easy to understand.
A simulator running as the only cpu-intensive process of the unix-machine with Pentium 166 CPU (which is our test machine) should be able to do at least 200 000 SCHEMESTATION virtual machine instructions (the average case, not just NOP) per second. This does not include the stop-and-copy garbage-collection nor the overhead caused by the system, it is just the speed of the virtual machine. That rate is adequate for testing educational Scheme programs. In practice, the system operating at that rate would execute an empty loop,
(define (f i) (if (= 0 i) 0 (f (- i 1)))) (f 100000)that generates some 10-20 SCHEMESTATION instructions with the SCHEMESTATION Scheme compiler, in 5-10 secs. This is 2-5 times slower than for example VSCM Scheme implementation for SGI Indy workstation.
Issue | Aim | Actions | Measuring |
---|---|---|---|
Portability | Runnable in any standard unix machine, see requirements |
|
|
Extensibility | An advanced computer science student should be able to code an external agent to the system. |
| If the demo application can be implemented, the extensions are adequate |
Usability | An intermediate computer science student should be able to code a SCHEMESTATION actor. The programming interfaces, external interfaces and internal programming interfaces (such as the kernal actor interface) have to be usable. |
|
|
Performance | The performance of the system must be acceptable - running multiple actors (5-10) simultaneosly should yield a decent response time per actor (less than half a second). |
|
|
Stability | The system will not fail under any normal circumstances (the unix system is stable) |
| See whether the system can be halted with test agents. |
Documentation | Sufficient for using, studying and extending the system |
|
The number undocumented interfaces and features minimal |
|
SCHEMESTATION is an agent based operating system. An agent is a entity with internal state, cpapbility to send and receive messages. It usually does some kind of computational tasks, it might for example serve as a filtering data buffer.
Each agent belongs to a domain, that includes addressing system, messaging system and some sort of scheduler to divide the cpu capacity assigned to the domain.
There is no main memory, instead, every agent has its own linearized, garbage-collected, typed heap, that includes the data and the bytecode.
The agents migrate from domain to another transparently by sending their linearized heaps as messages. The agents are addressed with agent specifiers. If a message is sent to an agent that has migrated, the addressing system of the domain returns transparently the new remote address.
The communication is secured by crypting the communication (not implemented in this prototype) and implementing trust based security between domains.
Non-SCHEMESTATION programs can connect the system and act as an agent. These might include data bases, terminals, compilers etc.
The domain and the agents are persistent meaning that they can be temporarily shut down and later woken up with the same internal state.
This project involves implementing a simulation of the SCHEMESTATION operating system. The simulation is ran as a user process in a posix-compliant unix host with Gnu tools. Part of this simulation is a Scheme compiler written in Scheme, that produces SCHEMESTATION bytecode. The byte code is executed by the SCHEMESTATION virtual machine, that is controlled by the scheduler.
The simulator can be used as an instrument of education in HUT CS, due to the programming and debugging facilities it provides. The system can be used in both research and education, since it extension to it can be easily written. Studying the structure of the system might also have some educational value.
The user interface to the system is a X11 based terminal through which the user can send messages to agents - which is essentially everything user needs to do to controls the system.
Essetial quality issues include the adequitity of the documentation, the readability of the code, extensibility, portability, usability by means that that the intended users can use the system for beneficial activity.