Future Plans

The interface

This release is called "BCR" in recognition of the fact that there remains work to be done, and to prevent having several different "CDS interfaces" floating around.  Many apparent shortcomings of the BCR interface have already been addressed by Elepar's CDS description, but even that has had limited external involvement, and has not been implemented at the time of this writing.

The correct approach to designing a stable interface would be to involve potential users, ideally after they have experimented with current implementations like BCR.  Though the common approach would be to create some sort of standards body that meets in person at regular intervals to hammer out details, it may be just as productive (or more so) to handle this in a distributed fashion on mailing lists.  A specific forum, called API_design, has been set up on SourceForge to address CDS interface design issues.  Consider this web page food for thought for discussion there.

One goal of an ultimate CDS interface might be to minimize the reasons people might have for choosing some other interface, such as PVM, MPI, UDP/IP, or distributed shared memory.  For example, some of the features suggested in the Elepar design, over and above those in BCR:

BCR may already have suitable (or superior) alternatives to the MPI one-sided operations, but it is not clear how features like collective communication and distributed I/O would map to CDS, or whether they should remain higher-level functions.  The difficulty is that CDS makes no assumption that it will execute on a high-performance system, or that there will be any recognizable topology among the CCEs.  Similarly, how much support should there be for "stdin" and "stdout"?  How much process (or CCE) control should be possible?

There are also binding questions.  Because of its use of pointers, CDS targets C-like languages, though it is usable (and arguably as efficient as message passing) in Fortran by using the "copyfm" and "copyto" operations exclusively to access regions.  Are there better solutions?  Can Fortran pointers fulfill the needs of CDS region IDs?


Related to the above, is it sensible to consider a Linux kernel-level interface to CDS functionality?  In some sense, CDS can be considered an amalgam and generalization of SysV constructs like shared memory and message segments, and could serve as an alternative to sockets.  It can also be considered a position-independent thread interface.  It could truly set Linux apart as a network-ready OS.

Completion, Optimization and Porting

Some tweaks don't even require changes in the interface per se, just in how it is interpreted and implemented.  For example, the "objname" argument in an enlist is currently interpreted as a filename on the target machine which stores the executable for the process which will become the CCE.  The caller shouldn't need to know the file structure on the target machine, or whether the target CCE will be implemented as a process or a thread, or whether it is already running (and blocked at enlist) or needs to be initiated.  To handle these, each machine/program should have an independent means of mapping incoming CCE "objnames" to a thread, and a means to determine how to initiate it if it isn't already waiting.  (A potential race condition must be avoided, where the thread has begun but hasn't yet reached the "init" when the CCE request comes in.)
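One way to picture the mapping and the race described above is a small per-machine table keyed by objname, where each entry records whether its target has not been started, has been started but has not yet reached "init", or is already waiting.  The following is only a sketch under stated assumptions: every name in it (cce_register, cce_request, cce_reached_init, the states and counters) is hypothetical and is not part of BCR, the actual launching of a process or thread is reduced to a counter, and locking is omitted, so a real implementation would have to guard the table with a mutex.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical states for a locally registered CCE target (not part of BCR). */
typedef enum {
    CCE_NOT_STARTED,  /* no idle instance exists; one must be initiated  */
    CCE_STARTING,     /* initiated, but has not yet reached "init"       */
    CCE_WAITING       /* reached "init" and is waiting for a request     */
} cce_state;

typedef struct {
    const char *objname;   /* logical name used by remote callers        */
    cce_state   state;
    int         pending;   /* requests that arrived before "init"        */
    int         launches;  /* stand-in for actually starting the target  */
    int         served;    /* requests handed over to a target           */
} cce_entry;

#define MAX_ENTRIES 16
static cce_entry table[MAX_ENTRIES];
static size_t    n_entries;

/* Register a local name; returns the entry, or NULL if the table is full. */
cce_entry *cce_register(const char *objname)
{
    if (n_entries == MAX_ENTRIES) return NULL;
    cce_entry *e = &table[n_entries++];
    e->objname = objname;
    e->state = CCE_NOT_STARTED;
    e->pending = e->launches = e->served = 0;
    return e;
}

cce_entry *cce_lookup(const char *objname)
{
    for (size_t i = 0; i < n_entries; i++)
        if (strcmp(table[i].objname, objname) == 0) return &table[i];
    return NULL;
}

/* An incoming CCE request names an objname.  Returns 0, or -1 if unknown. */
int cce_request(const char *objname)
{
    cce_entry *e = cce_lookup(objname);
    if (e == NULL) return -1;
    switch (e->state) {
    case CCE_WAITING:      /* target already waiting: hand over at once   */
        e->served++;
        e->state = CCE_NOT_STARTED;
        break;
    case CCE_STARTING:     /* the race window: queue, do not start again  */
        e->pending++;
        break;
    case CCE_NOT_STARTED:  /* initiate the target and queue the request   */
        e->launches++;
        e->pending++;
        e->state = CCE_STARTING;
        break;
    }
    return 0;
}

/* Called when a target reaches "init".  Returns 0, or -1 if unknown. */
int cce_reached_init(const char *objname)
{
    cce_entry *e = cce_lookup(objname);
    if (e == NULL) return -1;
    assert(e->pending >= 0);
    if (e->pending == 0) {         /* nobody asked yet: wait for a request */
        e->state = CCE_WAITING;
        return 0;
    }
    e->pending--;                  /* serve one queued request             */
    e->served++;
    if (e->pending > 0) {          /* more callers queued: start another   */
        e->launches++;
        e->state = CCE_STARTING;
    } else {
        e->state = CCE_NOT_STARTED;
    }
    return 0;
}
```

The point of the CCE_STARTING state is that a request arriving in the window between initiation and "init" is queued rather than triggering a second, redundant start.  The caller only ever supplies the logical name; whether the target is a process or a thread, and where its executable lives, stays a local decision.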

Some features specifiable in the "init" call have not been implemented.  These include garbage collection (i.e. automatically compacting the comm heap when it becomes fragmented) and error regions (i.e. informing a CCE of failures in region delivery or enlistment through automatically generated regions sent to a specified cell).  Some modifications to the interface may be required to make these operate correctly (e.g. to ensure that pointers within region IDs aren't changing due to garbage collection without the user's being informed).
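To illustrate why compaction interacts with region IDs, here is a minimal sketch of one well-known way to keep identifiers stable while storage moves: the identifier is an index into a table, and only the table entry is rewritten when a region is relocated.  This is not how BCR represents region IDs (which, as noted, contain pointers); the heap size, the slot table, and the names region_alloc, region_free, region_ptr and heap_compact are all hypothetical, chosen only for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEAP_BYTES  256
#define MAX_REGIONS 8

/* Hypothetical slot table: a handle is an index here, not a raw pointer. */
typedef struct {
    size_t offset;  /* current position of the region in the heap */
    size_t size;    /* 0 means the slot is unused                 */
} region_slot;

static unsigned char heap[HEAP_BYTES];
static region_slot   slots[MAX_REGIONS];
static size_t        heap_top;   /* first unused byte at the end of the heap */

/* Bump-allocate a region; returns a handle >= 0, or -1 on failure. */
int region_alloc(size_t size)
{
    if (size == 0 || size > HEAP_BYTES - heap_top) return -1;
    for (int h = 0; h < MAX_REGIONS; h++) {
        if (slots[h].size == 0) {
            slots[h].offset = heap_top;
            slots[h].size = size;
            heap_top += size;
            return h;
        }
    }
    return -1;
}

/* Release a region; its bytes become a hole until the next compaction. */
void region_free(int h)
{
    assert(h >= 0 && h < MAX_REGIONS);
    slots[h].size = 0;
}

/* Current address of a region; valid only until the next compaction. */
void *region_ptr(int h)
{
    assert(h >= 0 && h < MAX_REGIONS);
    return slots[h].size ? heap + slots[h].offset : NULL;
}

/* Slide live regions toward the start of the heap, lowest offset first. */
void heap_compact(void)
{
    size_t dst = 0;
    for (;;) {
        int next = -1;
        for (int h = 0; h < MAX_REGIONS; h++) {
            if (slots[h].size != 0 && slots[h].offset >= dst &&
                (next < 0 || slots[h].offset < slots[next].offset))
                next = h;
        }
        if (next < 0) break;        /* every live region has been moved */
        memmove(heap + dst, heap + slots[next].offset, slots[next].size);
        slots[next].offset = dst;   /* handle unchanged, address updated */
        dst += slots[next].size;
    }
    heap_top = dst;
}
```

In this sketch the handle survives compaction while any raw pointer obtained earlier from region_ptr does not, which is exactly the hazard mentioned above: either the user must re-derive pointers after each compaction, or the interface must tell the user when one has happened.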

Even without extensions to the interface, the existing BCR could be further optimized for different platforms.  For example, nothing more sophisticated than spinlocks or UDP/IP (with some rather basic flow control and reliability algorithms) is currently used for communication.  Implementing lock-free queues and OS-bypass access to high-performance networks could increase performance significantly.  The shared memory allocation logic is also ad hoc, and could probably be significantly optimized.  Send and recv are not yet optimized in any way, although they are designed to allow significant optimization by circumventing much of the intervening copying.
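As an illustration of the kind of lock-free queue meant here, the following is a minimal single-producer, single-consumer ring buffer written with C11 atomics.  It is a generic sketch, not code from BCR: the names spsc_ring, ring_push and ring_pop and the slot count are hypothetical, and it assumes exactly one thread pushes and exactly one thread pops.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 8   /* capacity; must be a power of two */
static_assert((RING_SLOTS & (RING_SLOTS - 1)) == 0,
              "RING_SLOTS must be a power of two");

/* Hypothetical single-producer / single-consumer queue of pointers. */
typedef struct {
    void         *slot[RING_SLOTS];
    atomic_size_t head;   /* next index to read; written only by the consumer  */
    atomic_size_t tail;   /* next index to write; written only by the producer */
} spsc_ring;

void ring_init(spsc_ring *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side.  Returns false if the ring is full. */
bool ring_push(spsc_ring *r, void *item)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_SLOTS) return false;
    r->slot[tail & (RING_SLOTS - 1)] = item;
    /* Release: the slot contents become visible before the new tail does. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side.  Returns false if the ring is empty. */
bool ring_pop(spsc_ring *r, void **item)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail) return false;
    *item = r->slot[head & (RING_SLOTS - 1)];
    /* Release: the slot is handed back to the producer only after the read. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

Neither side ever spins on a lock: each index has a single writer, so a full or empty ring is reported immediately and the caller decides whether to retry.  Queues with several producers or consumers need a more elaborate scheme than this.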


CDS is the proper level at which to build security into many systems.  Its method of initiating/enlisting CCEs may even help to facilitate the generation and passing of keys, etc.