CDS and BCR Frequently Asked Questions
What is Cooperative Data Sharing?
Cooperative Data Sharing (CDS) is a family
of APIs that utilize certain techniques (like dynamic shared memory allocation,
shared public queues, and user-controlled copy-on-write protocols) to provide
efficient, flexible, and portable communication in a wide variety of high-latency
and low-latency situations. Instead of requiring the user/programmer
to decide on a restrictive style of communication interface (e.g. shared
memory or message passing) beforehand, and then forcing them to figure
out how to optimize each subsequent communication based on the relative
positioning of the communicating entities and the interface chosen,
CDS simply lets the programmer specify the behavior (i.e. semantics) required
of each communication in their program. CDS figures out how to provide
that behavior most efficiently at run time, based on the relative positioning
of the communicating entities (and perhaps other factors).
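As a purely illustrative sketch (every name below is invented for this FAQ, not the real CDS API), the run-time decision might look like this: the programmer states only the semantics each communication needs, and the runtime picks the cheapest mechanism for wherever the entities happen to be placed.

```python
# Hypothetical sketch -- these names are invented, not the actual CDS API.
def deliver(buf, semantics, same_processor):
    """Return the data under the requested semantics, as cheaply as placement allows."""
    if semantics == "private_copy" or not same_processor:
        # A physical copy is required, either by the requested semantics
        # or because the entities are separated by a real channel.
        return bytearray(buf)
    # On the same processor, a zero-copy shared view satisfies
    # "shared_view" semantics with no data movement at all.
    return memoryview(buf)

src = bytearray(b"payload")
shared = deliver(src, "shared_view", same_processor=True)   # no copy made
copied = deliver(src, "private_copy", same_processor=True)  # copy required
```

The point is that the call site is the same in every case; only the runtime's choice of mechanism changes with placement.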
What is BCR?
BCR (which stands for "Before CDS Redesign",
or "C-1, D-1, S-1") is one particular member ("flavor") of the CDS communication
interface family. It already has some known shortcomings (as its
name implies), but also has the advantage of being the only member that
is available now, and is open source. Someday, when the interface
is more evolved, the term "CDS" may refer to a specific interface, but
until then, each will have some sort of identifying name.
How is CDS (or BCR) better than message passing
(say, MPI)?
Message passing interfaces are designed and
optimized around some basic assumptions--such as that the "sender" and
"receiver" of communicated data will be on separate processors, potentially
separated by high latency channels, and that it is therefore acceptable
to physically copy data from one entity to the other. As a result,
even when the communicating processes are on a single processor, and even
when the interface is configured with so-called "shared memory" options,
that copying of data from one process to another is still required for
each communication, simply so that the communication behaves as the
programmer expects. That is often unneeded overhead. Message passing
interfaces also often restrict the programmer to a specific set of ways
that they can interact, yielding problems when the program needs some other
way (e.g. a more demand-driven style). CDS copies data only when
necessary--either because the programmer's specified behavior for that
communication requires it, or because the communicating entities really are
separated by a high-latency channel. And the CDS programmer has a wide
variety of options for the behavior of each and every communication.
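The cost difference is easy to demonstrate outside of any particular interface. The sketch below (plain Python, standing in for neither CDS nor MPI specifically) contrasts message-passing-style delivery, where the receiver always gets an independent copy, with shared access, where no copy is made:

```python
import pickle

buf = bytearray(b"data data data")

# Message-passing style: serialize and deserialize, as a send/receive pair
# would -- the receiver ends up with an independent copy of the bytes.
received = pickle.loads(pickle.dumps(buf))
received[0] = ord("X")   # mutating the copy does not affect the sender

# Shared-memory style: a view over the same bytes, with no copy at all.
view = memoryview(buf)
view[1] = ord("Y")       # mutating the view is visible through buf
```

When both parties are on one processor and copy semantics are not actually required, the first path is pure overhead; when they are far apart, it is unavoidable. A semantics-driven runtime can pick between them per communication.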
How is CDS (or BCR) better than distributed
shared memory?
Shared memory interfaces make a different
set of assumptions--e.g. that there will be relatively low latency between
communicating processors. As such, they don't usually provide constructs
to help hide latency (such as queues, and/or the ability for the programmer
to specify where data will be needed next). These interfaces also
often assume homogeneous platforms, and so make no provision for converting
data as it is communicated between machines with different
data representations. CDS addresses these issues.
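Representation conversion amounts to agreeing on a wire format and translating to and from each machine's native form. A minimal illustration using Python's standard struct module (illustrative only; this is not the CDS conversion mechanism itself):

```python
import struct

value = 0x12345678
# The sender writes a fixed, machine-independent byte order (big-endian here).
wire = struct.pack(">I", value)

# A receiver that naively read the wire bytes in the other byte order
# would reconstruct the wrong integer...
misread = struct.unpack("<I", wire)[0]
# ...but unpacking with the agreed wire order recovers the value on
# any machine, regardless of its native representation.
recovered = struct.unpack(">I", wire)[0]
```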
Are there more general reasons to use CDS?
CDS can free the programmer from restrictive
assumptions which complicate the programming process. For example,
if one is using a message passing interface, it can become a programming
hassle if the problem at hand calls for a demand-driven style where data
must be made available to any process that needs it. In CDS, it's
very simple. Likewise, if one is using a shared memory interface
and needs to queue up regions for a particular process to access in
order as it has time, that becomes a programming chore, but in CDS,
it's natural.
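For example, the demand-driven pattern is essentially a shared queue that idle workers pull from as they have time. A minimal sketch with Python threads standing in for communicating processes (the real CDS mechanism differs; this only shows the pattern):

```python
import queue
import threading

work = queue.Queue()          # shared queue: data available to any taker
results = []
results_lock = threading.Lock()

def worker():
    # Each worker pulls the next region whenever it has time for it.
    while True:
        region = work.get()
        if region is None:    # sentinel: no more work
            return
        with results_lock:
            results.append(region * region)

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for region in range(10):
    work.put(region)          # made available to whichever worker demands it
for _ in threads:
    work.put(None)
for t in threads:
    t.join()
```

No worker is told in advance which regions it will process; demand alone drives the distribution.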
Portability has been a key lever for
software productivity over the decades, and has instigated
the development of high-level languages and standards. It allows
software development costs to be amortized over a broader and/or longer
useful life. CDS provides such portability in communication.
In addition to the portability between high- and low-latency environments,
which CDS handles better than either message passing or shared memory,
there's another kind of portability related to granularity. That
is, because of the overhead inherent in a message passing interface,
the programmer must try to make the number of processes roughly match the
number of processors to get optimal performance, but the number of
processors is itself an architectural trait. In CDS, such numerical
matching becomes less important, because when multiple computing entities
end up on the same processor, their communication overhead is minimal,
making them almost like components in a single entity instead of individual
entities.
Are there any drawbacks to CDS (or BCR) over
message passing or shared memory?
Of course, though we expect to see those lessen
over time (see future plans). Right now, BCR has moderately good performance
in many cases, but has not been tuned for specific architectures as MPI
has (over many years), so it will depend in large part on what you want to
do. CDS currently has no equivalent functionality to MPI-IO or many
collective operations, though it may actually have superior performance
in the long run by leaving these to higher-level functions.
Similar arguments can be made for shared
memory. CDS has a weak release consistency model, and implementation
of even that could be further optimized. If you expect stronger consistency
models, and you already have a shared memory solution that works well and
that is not likely to need porting to other environments, BCR may not be
the answer for you. However, using CDS in some distributed memory
machines may provide better performance than using the vendor's "native"
shared memory interfaces, because of CDS's superior ability to hide latency.
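Release consistency, roughly, guarantees only that writes made before a synchronization release become visible to whoever performs the next matching acquire. The pattern can be sketched with ordinary Python threading primitives (illustrative only; CDS's actual model and API differ):

```python
import threading

data = {}
lock = threading.Lock()
published = threading.Event()

def writer():
    with lock:             # acquire
        data["x"] = 42     # writes inside the critical section...
    published.set()        # ...need only be visible after the release

def reader(out):
    published.wait()
    with lock:             # a matching acquire observes the released writes
        out.append(data["x"])

seen = []
w = threading.Thread(target=writer)
r = threading.Thread(target=reader, args=(seen,))
w.start(); r.start()
w.join(); r.join()
```

Code that already synchronizes at acquire/release boundaries runs correctly under such a model; code that assumes every write is instantly visible everywhere does not, which is the trade-off the paragraph above describes.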