SchemeStation Functional Specification
Document formatted by vherva at Fri Apr 24 11:23:41 1998 on the host schemestation. This document is produced by the SchemeStation project during the Tik-76.115 course.
| ||||||||||||||||||||||||||||||
1 Introduction | ||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||
1.1 Background | ||||||||||||||||||||||||||||||
Scheme the programming language [1, 4] is a LISP-derivative that has wide use as a educational and scientific programming language. Scheme has been used as the first language students have been taught in HUT CS lab courses due to its simplicity, elegancy and functionality. Agent-based operating systems are a complete counterpart for file-oriented operating systems (such as Plan 9) [3]. In a fully agent-based operating system a process-like entity called agent can model everything: all database services, kernel services, computational services etc. are, or can be viewed as agents. The result is a network-oriented, load-balanced operating system [3]. The aim of the project is to implement a simulation of a planned agent-based operating system called SCHEMESTATION that uses Scheme as its primary systems and application programming language. At the moment, there is no real implentation --- neither simulation nor native --- of SCHEMESTATION. The operating system is however specified in the Operating system specification.
| ||||||||||||||||||||||||||||||
1.2 Project motivation | ||||||||||||||||||||||||||||||
A functioning SCHEMESTATION simulator will provide a basis for scientific operating systems research, involving distributed (migrated) agent based programming and trust based security Scheme among others. It can also be used as part of Scheme language education as it provides means for compiling, debugging and running Scheme code. The SCHEMESTATION project involves writing a Scheme [cross-]compiler in Scheme. The users will be able to compile their own Scheme programs to native SCHEMESTATION byte code, run the code as agents, watching their execution. The implemented simulation will naturally serve as an invaluable resource of information when a stand-alone version of SCHEMESTATION is implemented.
| ||||||||||||||||||||||||||||||
1.3 Target Users | ||||||||||||||||||||||||||||||
Any intermediate computer science student (or a person with equal skills) should be able to use the system for agent-style Scheme programming, and any computer science researcher (or a person with equal skills) should be able to understand the structure of the system, and possibly write extensions to it. The simulator can be used as an educational tool due to the programming and debugging facilities it assumedly provides. The system can be used in both research and education, since extensions to it can be written easily. Studying the structure of the system might also have some educational value. Agent based systems can be implemented on the top of this simulator, which makes this simulator an interesting test bed for agent programming reasearch.
| ||||||||||||||||||||||||||||||
1.4 Terminology | ||||||||||||||||||||||||||||||
A separate document (Terminology specifications) describes the terminology used throughout the documentation.
| ||||||||||||||||||||||||||||||
1.5 The Structure and Scope of This Document | ||||||||||||||||||||||||||||||
This document specifies the functionality that the SCHEMESTATION simulation --- built during the class Tik-76.115 Individual Project --- will provide. Some features must be also specified from the internal point of view. This is because a functional specification of a operating system cannot be restricted only to a high-level user interface. The SCHEMESTATION OS itself is specified in more detail in Operating system specification. However, this document reiterates some fundamental characteristics of the SCHEMESTATION OS where necessary. This document is structured as follows.
| ||||||||||||||||||||||||||||||
2 General description | ||||||||||||||||||||||||||||||
Operating system specification is the authoritative specification of the abstract structure of the SCHEMESTATION operating system. The simulator implemented during the project will be an implementation of this specification. The overall conceptual structure of the implemented simulator is shown in Fig. 1.
The simulator consists of SCHEMESTATION domain simulators that execute as single-threaded processes on the top of a UNIX system. The link with the UNIX system is represented by the link between the blue rectangle representing a UNIX system and the rest of the simulator. Domain simulators connect to other domain simulators via TCP/IP networking; thus a network interface is a conceptual part of any SCHEMESTATION domain simulator. Networking facility is connected with messaging system that facilitates intra-domain and inter-domain agent communications. Networking facility is also used as an interface for external agents and domains (domains that are not implemented as domain simulators). Message passing system needs address system to be functional. The address system is also shown in the figure, linked with the messaging system. The messaging system servers the set of agents that is run in a domain simulator. (The multiple lines of the actors rectangle describes the prularity of agents; this applies to other symbols as well.) Agents are not a module running on the UNIX per se, but an abstract object that exists inside a domain, and thus also in a domain simulator. Agents have their local states encoded in heaps, which are thunks of list-structured garbage-collected typed memory. These heaps are also shown in the figure. Behaviour of agents is described in terms of byte code programs. Byte code programs can be produced externally using a cross-compiler and an assembler that together compile an extended dialect of Scheme to SCHEMESTATION byte code programs. These programs can be fed to the system using some external interface; for example, they can be made available through an object server [see Operating system specification and below]. A virtual machine is used to execute the byte code instructions. Scheduling execution activity is performed by a scheduler. User's interface to a set of SCHEMESTATION domains is a terminal. Terminals are SCHEMESTATION domains that provide a graphical user interface. Terminals are implemented on the top of X11 windowing and are run as independent UNIX processes, in the same way as non-terminal SCHEMESTATION domains are run.
| ||||||||||||||||||||||||||||||
2.1 Virtual machine and Scheduler | ||||||||||||||||||||||||||||||
The SCHEMESTATION will run its own byte code specified in Virtual machine definition and VM instruction numbers. The instructions are fetched, decoded and executed by a virtual machine. The scheduler maintains a run queue and assures that all the agents get running time according to their requirements. The virtual machine instruction set [9] specification is open; there is no obstacle for implementing other compilers for this system. The instruction set is however favourable for LISP-style programming languages due to its built-in heap manipulation instructions.
| ||||||||||||||||||||||||||||||
2.2 Agents and heaps | ||||||||||||||||||||||||||||||
The list of agents will be kept in the scheduler [3] module. The list of agents' addresses --- either local or remote for migrated agents --- will be kept in the addressing module. Local address specifies an agent entry in the scheduler's queue. The agents can't access these addresses directly, they use an agent specifier which is translated to an address by the address system. Every (non-special) agent has a garbage-collected, list-structured, typed heap. The linearizer, garbage-collector and heap module will provide this functionality. Agents are persistent. In the simulation simulation this means, that every agent entry from the scheduler and every heap object is saved on to disk when system is shut down. They are reloaded to memory during the system wake-up.
| ||||||||||||||||||||||||||||||
2.3 Kernel agent | ||||||||||||||||||||||||||||||
There will be a special kernel agent in each domain that acts as an interface between the other agents and the kernel. The agents can send special message to the kernel agent in order to get services like agent-creation, child-removal, messaging, manipulation of agent's priorities and such. The kernel interface will be specified in more detail in Operating system specification. | ||||||||||||||||||||||||||||||
2.4 Compiler | ||||||||||||||||||||||||||||||
The system includes a Scheme compiler. The compiler recognizes the
standard Scheme syntax and generates
SCHEMESTATION assembly
language. The compiler will be written in Scheme so that it can be
cross-compiled with itself to a native
SCHEMESTATION agent. The
cross-compilation is not a requirement of this project, but
this our aim.
The compiler will consist of a lexer, parser and assembly producer parts [2, 11]. The assembler compiles SCHEMESTATION assembly language defined in Virtual machine definition to SCHEMESTATION bytecode defined in VM instruction numbers and Virtual machine definition.
| ||||||||||||||||||||||||||||||
2.5 Messaging, Networking and Addressing | ||||||||||||||||||||||||||||||
As Operating system specification states; an important part of the
SCHEMESTATION is
messaging between agents. The simulation will provide the means for
this, so that messages can be passed from agent to another inside a
domain, or between domains. The domains can reside on different
unix-machines - the networking module creates
TCP/IP-connections between machines to route the messages.
The agents have randomly generated 256-bit agent specifiers that are used for addressing them. As guessing the address of an agent A is virtually impossible, the system ensures that agents not trusted by the agent A are unable to send messages to the agent A. If an agent decides to trust the agent A it can reveal its own address to the agent A.
The addressing system keeps a list of used agent specifiers and the corresponding local or remote addresses. The local address is a pointer to the scheduler agent entry and the remote address is a (domain address, agent entry pair). The data structure for the list has to be hashed array [8, 12] for optimal performance. In this simulation every domain simulator process listens on a TCP/IP port, and accepts connection from other domains. These TCP/IP sockets are then used to carry the SCHEMESTATION messages. The convention is to use the ip-address of the unix host as the first four bytes of the domain address and the listened port as the next two bytes as the domain address. The rest is padded with zero. When a domain needs to send a packet to domain it is not yet connected, it opens a socket [6, 7, 5] . The parties of the communication then initiate a hand-shake [10] as specified in [Networking Specification]. List of opened connections are then kept in the networking module so that a new connection is not needed for every packet. A linked list [8] will be adequate data structure for this. Every agent has its own message queue, where the arrived messages are added. An agent can send a message by contacting the messaging module. For migrated agents, there is a remote address entry in the address table. If a message is sent to a migrated agent, it is passed forward the remote location. The reliability of messaging in this simulation is assured by sending the message over TCP/IP. If abnormal conditions are encountered (such as a malfunctional unix host), packets may be lost. The messaging in this simulation is reliable in "normal external conditions". On boot-up, the simulator can be instructed to connect a special domain, router through which it will routa all its packets. The router is just an ordinary domain, since every domain routes forward all the packets not belonging to itself. By default every simulated domain acts like this, but simulators can be instructed to use a router for all of their networking with command line arguments. More accurate description about the networking can be found in Networking Specification and Networking Implementation.
| ||||||||||||||||||||||||||||||
2.6 Terminal | ||||||||||||||||||||||||||||||
The user's interface to a running simulation is a SCHEMESTATION terminal. The user can start a terminal session from the UNIX command prompt by specifying the domain the terminal should be attached to. The terminal then connects to the domain specified and is seen as an agent from the remote domain's point of view. The remote domain can then control the terminal. The user can use the terminal to launch new agents and to utilize the already present agents in the system. (Typically, he would have rights to spawn a compiler, a debugger and to connecto to several common services.) There can be different types of terminal (text-based versus graphical). The required terminal is a resizable text-based terminal that is implemented on the top of an X11R6 windowing system. An artistic vision of the text-based terminals' outlook (assuming that colors are included) is presented in Fig. 2.
| ||||||||||||||||||||||||||||||
2.7 Debuging and monitoring the system | ||||||||||||||||||||||||||||||
There will be a special agent providing debugging service. The agent
connects directly to the virtual machine through the interface
provided by the messaging system. The users can send a message to this
agent in order to study the execution of his agents. The debugging
agent will then instruct the VM to execute the bytecode of the agent
step by step. Information of the actions taken by the debugged agent
will be prvided for the user throught the terminal by the debugging agent.
The state of the simulation can be monitored. This includes getting information on the number of agents running, load of the VM, number of address entries in the addressing system and the number of network connections in the networking module. This functionality will be provided by a special agent connecting to the simulation through a interface that the messaging system provides.
| ||||||||||||||||||||||||||||||
2.8 Object server | ||||||||||||||||||||||||||||||
Instead of a file system the
SCHEMESTATION OS in this
simulation has agents functioning as
object servers (the OS specification does not require this; storing
information can be done in any other manner). The agents can send requests
to these servers to either fetch (data in the form of linearized
heaps) or store objects. The messages include the type of request
(store ot fetch) and optionally the data.
| ||||||||||||||||||||||||||||||
2.9 Applications | ||||||||||||||||||||||||||||||
As the core system has been implemented to the functional stage, a natice SCHEMESTATION application will be implemented. The application will be a remote chess game, that two players can play from different terminals. The boards will be displayed on the terminals, and the users will be able to move the pieces by typing the standard chess notation moves. The chess game will check that the made moves are legal and transmit the moves. The application will provide the means for an external agent to connect it; in practice this means that eg. a GNU Chess can act as a player.
| ||||||||||||||||||||||||||||||
3 Functionality | ||||||||||||||||||||||||||||||
3.1 Initial Boot-Up | ||||||||||||||||||||||||||||||
3.1.1 Purpose | ||||||||||||||||||||||||||||||
There are two kinds of boot-ups for SCHEMESTATION domains: (1) the
initial boot-up, and (2) system wake-up. During the initial boot-up, a
domain that has not existed before is created. System wake-ups occur
when a domain that has been temporarily freezed starts working again
and the persistent agents stored continue execution from the point
where they were left when the system went down. In the simulation the
freezing essentially means saving the entire state of the simulator
to disk: agents and their heaps, addressing system and all such. Network
connections need not to be saved as they are automatically opened upon request.
When a new domain is created (by starting a new SCHEMESTATION simulation process on unix host), the following happens:
| ||||||||||||||||||||||||||||||
3.1.2 User action | ||||||||||||||||||||||||||||||
The user starts the SCHEMESTATION
simulation executable on command
line specifying and unique (not formerly used) domain address.
vherva@Schemestation.cs.hut.fi:~/Schemestation/bin>Schemestation -p 7777This instructs the Schemestation to listen to the port 7777. As no other domain address is specified via the -d switch, a domain address, whose first four bytes contain the local ip address (130.233.40.74) and the next two bytes the port 7777, is used. The Schemestation-simulator checks whether that port is already listened or whether a domain with that domain address has been saved (frozen) to disk. If the user just wanted to wake up a frozen domain he would enter the same command and when then simulator would find a frozen domain on disk by that address it would restart it (load the address table, agents etc. from disk).
| ||||||||||||||||||||||||||||||
3.2 System shutdown | ||||||||||||||||||||||||||||||
System shutdown is initiated by an agent with sufficient
priviledges. The agent sends a special message to the kernel agent in
order to start the shutdown. The semantics of the message can be
found in Operating system specification.
The shutdown can be either temporary or final. Temporary shutdown involves saving the system on to disk and the final forcing all the agents to migrate away and shutting the domain permanently.
| ||||||||||||||||||||||||||||||
3.3 Invoking a Terminal | ||||||||||||||||||||||||||||||
The user can start a terminal program by entering the name of terminal
executable and specifying the domain it will be attached to. This
would invoke a X11 window, through which the user could send messages
to the agents.
The exact command line arguments for this as well as the details of the terminal user interface will be decided later, since the terminal is only a minor part of this project.
| ||||||||||||||||||||||||||||||
3.4 Compiling Scheme programs | ||||||||||||||||||||||||||||||
The SCHEMESTATION Scheme compiler
will have a simple command line interface. The user simply types the
name of the compiler executable and specifies the Scheme program to be
compiled. The result will be a file containing a binary
SCHEMESTATION
executable object. Scheme source files have the suffix .ssscm. Byte
code files have the suffix .ssbc. It is also possible to dump out some
intermediate files during the compilation, for example a symbolic byte
code file (assembly file). Assembly files contain the suffix .ssasm.
It is possible that the cross-compiler can be ported to the SCHEMESTATION operating system (this is not trivial, though). If this succeeds, then the compiler will have a SCHEMESTATION terminal -based user interface and compilation can be done using this interface, directly on the top of the simulation itself.
| ||||||||||||||||||||||||||||||
3.5 Debugging agents, monitoring the system, using object server and sending messages to the agents | ||||||||||||||||||||||||||||||
All these actions simply involve sending messages to the proper
agents. The user can do this using the terminal. The exact UI of the
terminal is not yet specified, but using for example the monitoring
agent would involve fetching the the bytecode of the monitoring agent
from the object server, typing the run command and specifying the
code to be ran. The session might look something like this:
SS> (addresses) ((address-system 12312312443242234234) (agent-invoker 71234899861349969676) (object-server 92873479879823794798)) SS> (message 2312312443242234234 ("get" "monitor-bc")) t SS> SS> -- message received: ("monitor-bc" %byte-code%) SS> (message 71234899861349969676 ("invoke" %byte-code%)) t SS> ss> | ||||||||||||||||||||||||||||||
4 External Interfaces | ||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||
4.1 Non-SCHEMESTATION Domain | ||||||||||||||||||||||||||||||
Every simulated domain will act as a proper
SCHEMESTATION domain
[Operating system specification]. The SCHEMESTATION
domains uses The SCHEMESTATION
packet protocol [Networking Specification] for their
communication. In the simulation this protocol is used over TCP/IP, as
defined in Networking Specification. Any external entity capable
of using this protocol (SSPP on TCP/IP) can connect to the simulated
SCHEMESTATION network and act
as a domain. The external entity may
begin to provide services that show as agents after this.
The agent-like services have to obey the messaging protocol - this
means that every service must have an address for communication.
| ||||||||||||||||||||||||||||||
4.2 External agents | ||||||||||||||||||||||||||||||
The kernel will provide an interface for external agents to directly
connect to the system. The external agent can show as members of that
domain in the SCHEMESTATION system.
The communication between the
external agent and the domain will be done through a unix socket. The
networking module will provide the way for the external agents to
connect and register themselves. The protocol will be the same as in
the communication between domains in this simulation (see | ||||||||||||||||||||||||||||||
5 Other requirements | ||||||||||||||||||||||||||||||
5.1 Documentation | ||||||||||||||||||||||||||||||
One of the main aspects of the project is the understandability of the
system. The quality of the documentation is essential to ensure that
the further development of the system can take place after this
project. Documentation will be made in a special html-like ssdoc-format,
which is automatically converted into html-format. The purpose of this
convention is to make documentation easier; the visual appearance of
the documents is provided automatically by the conversion script. It also
provides the means for automatically embedding xfig-figures,
LaTex-equations and references to other documents in this project.
The conversion script with assistance from the code-browser will
provide a way for automatically embedding a generated documentation of
a code module interface to a document, if the code module is suitably
commented. Further documentation on the documentation system can be
found in Documentation.
Every code module will be specified in detail with a separate document before the implementation. These documents will contain the description of the functionality the module is required to implement, and a description of the interface to the module as well a brief plan of the internal structure. Another document will be written about the implementation of the module. It will contain the exact interface (function prototypes and a brief description of their semantics as well as documentation of data structures used). Also an overview of the structure and implementation of module has to be included. To make code browsing easier, a web-based code-browser (in url http://Schemestation.cs.hut.fi/cgi-bin/code-browser/status.pl?index=1) will be developed as a side-effect of this project. There is a more detailed document in Code-Browser Utility. The aim is to make finding functions and their headers, searching code and getting oversight to the system for both the members of the project and the outsiders. The browser will hopefully help in the future develpoment of the SCHEMESTATION project, when new programmers try to study the system. Certain instructions are followed writing the code. These include uniform indentation, naming policy and well-formed commentation (partly in purpose to make it possibly to use the comments with the code-browser.) Project coding policy defines the coding discipline in detail. | ||||||||||||||||||||||||||||||
5.2 Portability | ||||||||||||||||||||||||||||||
Since a standard is hard to define, the system on which
the code ditribution of
SCHEMESTATION can be compiled
is defined to be a unix system with the following capabilities:
The portability of the source will be considered throughout the project. The language used is strictly ansi C - no Gnu extensions to C apart from long long will be used. The code must be compilable with the gcc --ansi switch without warnings. No such system calls or library functions, whose functioning on any considerable unix system is doubted, will be used unless absolutely necessary. Such architecture depended issues as the byte-order or length of interger on the target system should not have impact on the functionality of the system. The Makefile will be made in Gnu Make format. The portability will be ensured compiling on the following systems
The terminal will use the X11R6 window system. The Scheme part of the source has to be R4RS compliant if it is intended to be runnable with other compilers than SCHEMESTATION Scheme compiler (for example the source of the compiler itself). Other Scheme source needs to be runnable with SCHEMESTATION Scheme compiler - the Scheme code and SCHEMESTATION bytecode are portable as specified in Operating system specification. Due to this, there are no major portability issues with the SCHEMESTATION Scheme source. The perl-utilities developed within this project will be functional wherever at least version 5 of perl is available. The project does not involve testing the source with different compilers, only Gnu C Compiler will be used. This however does not mean that the code was not intended to compile with other compilers. Porting the system to other operating systems, such as MacOs, Plan 9, Amoeba, or perhaps Windows, could be possible, but is not considered during this project. | ||||||||||||||||||||||||||||||
5.3 Error recovery | ||||||||||||||||||||||||||||||
The simulator itself should be reliable and work correctly in normal
conditions (i.e. it can fail only when there is a hardware or
software failure somewhere else affecting the system).
Agents failing to perform their tasks due to logical errors in the algorithms they use or due to invalid byte code instructions etc. will be handled by the simulator as specified by the Operating system specification. Since users can only impact the system by sending messages to agents (in addition to giving the command line arguments), they cannot cause any more serious errors than agents. Any special agents providing special services such as the kernel agent, will quietly ignore erroneous request messages.
| ||||||||||||||||||||||||||||||
5.4 Usability | ||||||||||||||||||||||||||||||
To ensure good usability of the system, the following issues have to
be considered:
| ||||||||||||||||||||||||||||||
5.5 Extensibility | ||||||||||||||||||||||||||||||
The source code will be made as open as possible for future
extensions. The documentation will try to provide as good basis for
future development as possible. All code modules will be documented as
specified in Documentation. The external
interfaces of SCHEMESTATION can
be used to expand the system with external modules.
| ||||||||||||||||||||||||||||||
5.6 Maintainability | ||||||||||||||||||||||||||||||
Throughout the development the code will be kept consistent using the
CVS concurrent version control. The code will be committed only when
the changes are discovered correct (this means that the "current"
version in the version control system should always be functional).
The previous versions of any part of the code can be extracted on demand.
The coding discipline [Project coding policy] and the comprehensive documentation will ensure that the code resulting of this project will be maintainable and easy to understand.
| ||||||||||||||||||||||||||||||
5.7 The Performance of the system | ||||||||||||||||||||||||||||||
The basis of the system is the virtual machine that executes
SCHEMESTATION byte code.
The machine executes one instruction at the
time, rather than trying to compile the code. This sets the major
limitation to the permormence of the system.
A simulator running as the only cpu-intensive process of the unix-machine with Pentium 166 CPU (which is our test machine) should be able to do at least 200 000 SCHEMESTATION virtual machine instructions (the average case, not just NOP) per second. This does not include the stop-and-copy garbage-collection nor the overhead caused by the system, it is just the speed of the virtual machine. That rate is adequate for testing educational Scheme programs. In practice, the system operating at that rate would execute an empty loop, (define (f i) (if (= 0 i) 0 (f (- i 1)))) (f 100000)that generates some 10-20 SCHEMESTATION instructions with the SCHEMESTATION Scheme compiler, in 5-10 secs. This is 2-5 times slower than for example VSCM Scheme implementation for SGI Indy workstation.
| ||||||||||||||||||||||||||||||
5.8 Quality Criteria | ||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||
6 Summary | ||||||||||||||||||||||||||||||
SCHEMESTATION is an agent based operating system. An agent is a entity with internal state, cpapbility to send and receive messages. It usually does some kind of computational tasks, it might for example serve as a filtering data buffer. Each agent belongs to a domain, that includes addressing system, messaging system and some sort of scheduler to divide the cpu capacity assigned to the domain. There is no main memory, instead, every agent has its own linearized, garbage-collected, typed heap, that includes the data and the bytecode. The agents migrate from domain to another transparently by sending their linearized heaps as messages. The agents are addressed with agent specifiers. If a message is sent to an agent that has migrated, the addressing system of the domain returns transparently the new remote address. The communication is secured by crypting the communication (not implemented in this prototype) and implementing trust based security between domains. Non-SCHEMESTATION programs can connect the system and act as an agent. These might include data bases, terminals, compilers etc. The domain and the agents are persistent meaning that they can be temporarily shut down and later woken up with the same internal state. This project involves implementing a simulation of the SCHEMESTATION operating system. The simulation is ran as a user process in a posix-compliant unix host with Gnu tools. Part of this simulation is a Scheme compiler written in Scheme, that produces SCHEMESTATION bytecode. The byte code is executed by the SCHEMESTATION virtual machine, that is controlled by the scheduler. The simulator can be used as an instrument of education in HUT CS, due to the programming and debugging facilities it provides. The system can be used in both research and education, since it extension to it can be easily written. Studying the structure of the system might also have some educational value. The user interface to the system is a X11 based terminal through which the user can send messages to agents - which is essentially everything user needs to do to controls the system. Essetial quality issues include the adequitity of the documentation, the readability of the code, extensibility, portability, usability by means that that the intended users can use the system for beneficial activity.
| ||||||||||||||||||||||||||||||
7 References: | ||||||||||||||||||||||||||||||
|
© SchemeStation project 1997-1998 | [Back to the document index] |