Version 50, last updated by cyril.briquet at May 11, 2010 21:25 UTC

CanoPeer: P2P Grid middleware for Java apps

CanoPeer is a P2P grid middleware for Java applications.

The application model is the Bag of Tasks (often referred to as parameter sweep). An application submitted to a P2P grid based on the CanoPeer middleware is a set of independent tasks (JAR files) that process input data files (each input data file may be associated with one or more tasks of the set).
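To make the application model concrete, here is a minimal sketch of a Bag of Tasks as a plain data structure. The class names (Task, BagOfTasks, BagOfTasksDemo) and the file names are hypothetical and are not part of the CanoPeer API; the sketch only illustrates that tasks are independent, that each is a JAR file, and that one input data file may be shared by several tasks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of a Bag of Tasks; illustrative only, not CanoPeer code.
final class Task {
    final String jarFile;           // JAR file holding the task's code
    final List<String> inputFiles;  // input data files processed by this task

    Task(String jarFile, List<String> inputFiles) {
        this.jarFile = jarFile;
        this.inputFiles = Collections.unmodifiableList(new ArrayList<String>(inputFiles));
    }
}

final class BagOfTasks {
    private final List<Task> tasks = new ArrayList<Task>();

    void add(Task task) { tasks.add(task); }

    int size() { return tasks.size(); }

    // Counts how many tasks of the bag read the given input data file.
    int tasksUsing(String inputFile) {
        int count = 0;
        for (Task t : tasks) {
            if (t.inputFiles.contains(inputFile)) count++;
        }
        return count;
    }
}

final class BagOfTasksDemo {
    // Builds a parameter sweep: the same JAR applied to each input file,
    // plus one file shared by every task.
    static BagOfTasks sweep(String jar, String sharedFile, List<String> inputs) {
        BagOfTasks bag = new BagOfTasks();
        for (String input : inputs) {
            bag.add(new Task(jar, Arrays.asList(sharedFile, input)));
        }
        return bag;
    }

    public static void main(String[] args) {
        BagOfTasks bag = sweep("sweep.jar", "common.dat",
                               Arrays.asList("a.dat", "b.dat", "c.dat"));
        System.out.println(bag.size() + " tasks, "
            + bag.tasksUsing("common.dat") + " of them read common.dat");
    }
}
```

Because the tasks do not depend on one another, a scheduler is free to run them in any order and on any node.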

Academic research on CanoPeer

CanoPeer Publications

Deployment of a P2P grid based on the CanoPeer middleware

A P2P Grid based on the CanoPeer middleware can be deployed on clusters of desktop computers (e.g. with 512MB RAM, running Linux or Mac OS X) acting as worker nodes. Each cluster acts as a Peer in the P2P grid and can comprise any number of worker nodes. One of the worker nodes, called the Peer, acts as a head node or submission interface. The other nodes, called Resources, run JAR files sent by the Peer of the cluster they belong to.

P2P Grid

The example above displays a P2P Grid comprised of 4 clusters, with the top-left cluster processing a Bag of Tasks with its Resource as well as with the Resources of the bottom-left and bottom-right clusters:

  • top-left cluster: 1 User Agent (submitting a Bag of 5 tasks), 1 Peer, 1 Resource

  • top-right cluster: 1 Peer, 1 Resource

  • bottom-right cluster: 1 User Agent (not currently submitting anything), 1 Peer, 4 Resources

  • bottom-left cluster: 1 Peer, 2 Resources

A Search Engine (not illustrated here) allows Peers to locate other Peers with which to possibly exchange computing time.

Execution of Java apps on a P2P Grid

JAR files are transparently run on nodes of the P2P Grid. These can be nodes of the local cluster, or nodes of remote clusters willing to exchange computing time. The CanoPeer middleware relies on bartering of computing time: no money or credits are involved in the exchange.

Data files are transparently pushed to the nodes with BitTorrent or FTP, without any manual setup. In other words, the CanoPeer middleware automatically deploys its own data transfer overlay of BitTorrent clients, BitTorrent trackers, FTP clients and FTP servers.

The main benefit of CanoPeer is the transparency of the grid: resource discovery, resource negotiation, task scheduling, transfer of input data files and retrieval of output data files are all fully automated. Users need only submit JAR files to a Peer node through a User Agent. Users can thus be totally unaware of the existence of the grid and can think of the Peer node simply as a submission interface, without knowing any internal detail.

Simulation of the scheduling behavior of the CanoPeer middleware

A CanoPeer grid can be deployed on a discrete-event simulator to study the scheduling behavior of peers.
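For readers unfamiliar with the technique, a discrete-event simulator keeps a queue of timestamped events and moves the simulated clock directly from one event to the next. The sketch below is a generic illustration of that loop and is not taken from the CanoPeer simulator; the names Event and TinySimulator are hypothetical, and the scenario (one Resource running tasks back to back) is an assumption chosen for brevity.

```java
import java.util.PriorityQueue;

// Generic discrete-event loop; illustrative only, not CanoPeer code.
final class Event implements Comparable<Event> {
    final double time;   // simulated time at which the event fires
    final String label;  // human-readable description of the event

    Event(double time, String label) {
        this.time = time;
        this.label = label;
    }

    @Override
    public int compareTo(Event other) {
        return Double.compare(time, other.time);
    }
}

final class TinySimulator {
    private final PriorityQueue<Event> queue = new PriorityQueue<Event>();
    private double clock = 0.0;

    // Schedules an event 'delay' time units after the current simulated time.
    void schedule(double delay, String label) {
        queue.add(new Event(clock + delay, label));
    }

    // Processes events in timestamp order; returns the time of the last event
    // (0.0 if no event was scheduled).
    double run() {
        while (!queue.isEmpty()) {
            Event e = queue.poll();
            clock = e.time;  // jump straight to the next event
            System.out.println(clock + ": " + e.label);
        }
        return clock;
    }

    // Simulates one Resource running tasks sequentially and returns the makespan.
    static double makespan(double[] taskDurations) {
        TinySimulator sim = new TinySimulator();
        double finish = 0.0;
        for (int i = 0; i < taskDurations.length; i++) {
            finish += taskDurations[i];
            sim.schedule(finish, "task " + i + " completed");
        }
        return sim.run();
    }

    public static void main(String[] args) {
        System.out.println("makespan = " + makespan(new double[] {3.0, 1.5, 2.0}));
    }
}
```

No real time elapses between events, which is what makes it practical to study scheduling behavior over long simulated periods.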

License

CanoPeer is released under a Free and Open Source license (currently GPL version 2).

Quick Demo for the Impatient

4 shells are required for this quick demo: 1 shell to build CanoPeer and launch the User Agent that will run a "Hello, Grid!" application, 3 shells to deploy a minimal 1-Peer Grid on localhost (1 Search Engine, 1 Peer, 1 Resource).

Check out CanoPeer from trunk (trust us, CanoPeer trunk currently has more features and bug fixes, and is more stable, than the current CanoPeer release) and build CanoPeer from the downloaded source code:

shell1$ cd /somewhere/nice

shell1$ svn co http://subversion.assembla.com/svn/canopeer/ canopeer

shell1$ cd canopeer/trunk

shell1$ ant dist examples

Deploy a minimal 1-Peer Grid on the localhost (1 Search Engine, 1 Peer, 1 Resource):

shell2$ cd /somewhere/nice/canopeer/trunk/bin/local

shell2$ ./LaunchSearchEngine.sh

shell3$ cd /somewhere/nice/canopeer/trunk/bin/local

shell3$ ./LaunchPeer.sh

shell4$ cd /somewhere/nice/canopeer/trunk/bin/local

shell4$ ./LaunchResource.sh

Run a "Hello, Grid!" application:

shell1$ cd apps/hellogrid

shell1$ ../../bin/local/LaunchUserAgent.sh hellogrid.jdf

(This is equivalent to the qsub command familiar to many cluster users.)

shell1$ more canopeer-user*

Congratulations, you've run your first application using the CanoPeer middleware, after successfully deploying a (very small, for the sake of example) P2P Grid :-)

Using the CanoPeer P2P Grid middleware

Precompiled binaries are released every now and then, and can be found in the files section.

Documentation is maintained as a PDF documentation book, published alongside releases in the doc/book directory. The documentation book is currently a work in progress and still far from complete.

Developing applications to run on the P2P grid

Examples of Grid applications can be found in the apps and examples directories.

A grid application comprises a set of tasks, each packaged as a JAR file. One or more input data files are assigned to each task for processing.

Each task is basically a collection of Java classes, one of them designated as the entry-point of the task (i.e. the grid equivalent of main()). To this end, the designated entry-point class must implement a simple interface (GridApplication) featuring:

  • 1 default (i.e. no-argument) constructor

  • 4 methods that are called at runtime (before task execution) by the grid middleware to set:

    • the input parameters (i.e. the grid equivalent of the command line args array);
    • the metadata describing the input data files (that are made locally available and can be accessed through InputStreams);
    • the path to the playpen directory (i.e. directory on the local filesystem that can be used by the task as temporary storage);
    • the identifier of the peer node of the cluster where the task is actually run
  • 1 method that is called at runtime (immediately prior to task execution) by the grid middleware to run the task (i.e. the grid equivalent of main())

  • 1 method that is called at runtime (after task completion) by the grid middleware to retrieve the output data files as byte arrays

Example of task entry-point class:

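import java.io.InputStreamReader;
// GridApplication and GridData are provided by the CanoPeer middleware
// (their import statements are omitted here).

public class HelloGrid implements GridApplication {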

  private GridData[] data;
  private static final int BUFFER_SIZE=128;
  private byte[] result;

  public HelloGrid() { this.result = null; }
  public void setParameters(Object[] parameters) {}
  public void setInputData(GridData[] data) { this.data = data; }
  public void setPlaypen(String playpen_dir) {}
  public void setSupplier(String supplier_id) {}

  public void compute() {
    final StringBuilder output = new StringBuilder();
    for (GridData data : this.data) { // simply echo the input files into the output file
      try {
        final InputStreamReader isr = new InputStreamReader(data.getInputStream());
        final char[] buffer = new char[BUFFER_SIZE];
        int read = isr.read(buffer, 0, BUFFER_SIZE);
        while (read != -1) {
          output.append(buffer, 0, read);
          read = isr.read(buffer, 0, BUFFER_SIZE);
        }
        isr.close();
        output.append("\n\n---\n\n");
      }
      catch (Exception e) {
        // getMessage() may return null, so convert it defensively
        this.result = String.valueOf(e.getMessage()).getBytes(); return;
      }
    }
    this.result = output.toString().getBytes();
  }

  public byte[][] getResult() { return new byte[][] { this.result }; }
}
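The call sequence listed before the HelloGrid example can be tried out without a grid by using a small local harness. The sketch below rests on several assumptions: the GridApplication and GridData declarations are stand-ins reconstructed from the HelloGrid example (the real types ship with CanoPeer and may differ, for instance in the exceptions they declare), and InMemoryData, ByteCounter, LocalHarness and the "local-peer" identifier are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Stand-ins reconstructed from the HelloGrid example; the real CanoPeer
// types may differ.
interface GridData {
    InputStream getInputStream() throws IOException; // 'throws' is an assumption
}

interface GridApplication {
    void setParameters(Object[] parameters);
    void setInputData(GridData[] data);
    void setPlaypen(String playpen_dir);
    void setSupplier(String supplier_id);
    void compute();
    byte[][] getResult();
}

// Hypothetical in-memory input file, for local testing only.
final class InMemoryData implements GridData {
    private final byte[] content;
    InMemoryData(String text) { this.content = text.getBytes(); }
    public InputStream getInputStream() { return new ByteArrayInputStream(content); }
}

// Trivial task that reports the total size, in bytes, of its input files.
final class ByteCounter implements GridApplication {
    private GridData[] data = new GridData[0];
    private byte[] result;

    public void setParameters(Object[] parameters) {}
    public void setInputData(GridData[] data) { this.data = data; }
    public void setPlaypen(String playpen_dir) {}
    public void setSupplier(String supplier_id) {}

    public void compute() {
        long total = 0;
        try {
            for (GridData d : data) {
                InputStream in = d.getInputStream();
                while (in.read() != -1) total++;
                in.close();
            }
            result = Long.toString(total).getBytes();
        } catch (IOException e) {
            result = String.valueOf(e.getMessage()).getBytes();
        }
    }

    public byte[][] getResult() { return new byte[][] { result }; }
}

final class LocalHarness {
    // Mimics the documented order of calls: the four setters, then compute(),
    // then getResult().
    static byte[][] runLocally(GridApplication app, Object[] params, GridData[] inputs) {
        app.setParameters(params);
        app.setInputData(inputs);
        app.setPlaypen(System.getProperty("java.io.tmpdir"));
        app.setSupplier("local-peer"); // hypothetical peer identifier
        app.compute();
        return app.getResult();
    }

    public static void main(String[] args) {
        GridData[] inputs = { new InMemoryData("Hello, "), new InMemoryData("Grid!") };
        byte[][] out = runLocally(new ByteCounter(), new Object[0], inputs);
        System.out.println("total input bytes: " + new String(out[0]));
    }
}
```

A harness of this kind only checks the task logic; it does not exercise data transfer, scheduling or any other middleware behavior.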

Submitting applications to a deployed P2P grid

$ /path/to/canopeer/bin/local/LaunchUserAgent.sh my-job-description-file.jdf

Also make sure that a userconfig.xml User Agent configuration file is present in the current directory, containing at least an empty <user-configuration /> tag.
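If you script your submissions, the presence of this file can be ensured programmatically. The following is a minimal sketch, assuming only what is stated above (a file named userconfig.xml holding an empty <user-configuration /> tag); the class and method names are hypothetical, and any further configuration elements are outside the scope of this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of the CanoPeer distribution.
final class UserConfigBootstrap {
    // Writes a minimal userconfig.xml into 'dir' unless one already exists.
    // Returns true if the file was created, false if it was already present.
    static boolean ensureUserConfig(Path dir) throws IOException {
        Path config = dir.resolve("userconfig.xml");
        if (Files.exists(config)) {
            return false; // never overwrite an existing configuration
        }
        Files.write(config, "<user-configuration />\n".getBytes(StandardCharsets.UTF_8));
        return true;
    }

    public static void main(String[] args) throws IOException {
        boolean created = ensureUserConfig(Paths.get("."));
        System.out.println(created ? "created userconfig.xml"
                                   : "userconfig.xml already present");
    }
}
```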

Deploying your own P2P Grid

First describe your grid configuration and topology in an XML deployment configuration file, e.g. gridconfig.xml. The format of this file is documented in the PDF documentation book (see the doc/book directory).

Then run the deployer:

$ ./bin/deploy-canopeer-grid.sh -a deploy -c gridconfig.xml

Building the P2P Grid middleware from source

CanoPeer targets Java 6 and has been tested with the Sun Java SE 6 JDK.

CanoPeer sources can be retrieved from the Subversion repository with the following command:

$ svn co http://subversion.assembla.com/svn/canopeer/ canopeer

You can then build the sources by running ant from the trunk directory (you'll need to install ant, of course).

Dependencies

CanoPeer depends on the following Open Source libraries:

All dependencies are included in the source tree, except JUnit, which is only required if you want to build the unit tests (i.e. if you don't intend to run unit tests, you don't need JUnit).

Acknowledgments

CanoPeer is the first P2P Grid middleware with embedded BitTorrent support for data transfers. It is the continuation of the Lightweight Bartering Grid project.

CanoPeer acknowledges the support of:

  • Assembla is offering free hosting of open source and community projects.

  • YourKit is kindly supporting open source projects with its full-featured Java Profiler.

    YourKit, LLC is creator of innovative and intelligent tools for profiling Java and .NET applications.

    Take a look at YourKit's leading software products: YourKit Java Profiler and YourKit .NET Profiler.

Contact

You are welcome to send your questions, successful use cases, feature requests, bug reports to: cyril dot briquet, followed by the at-sign, then by canopeer dot org