[picture:OPTI-NUM header graphic]


Newsletters
November 2008

Achieving Truly Scalable Code using Parallel Computing Toolbox

The previous parts of this newsletter discussed the need for parallel computing and how Parallel Computing Toolbox addresses that need. This section discusses how Parallel Computing Toolbox helps you manage scalability, so that your solution can grow with your available computing power.

Accessing Computing Resources

On the Desktop

Most parallel computing users need to prototype their parallel solution before scaling it up to a larger problem. Parallel Computing Toolbox gives you access to a cluster of up to 4 workers running on your own machine, called a local cluster, without the need for any other product. This allows you to prototype a parallel computing solution and determine whether your algorithm is suitable for parallel execution.
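As a minimal sketch of such a prototype, the following opens a pool of workers on the local cluster and runs an independent loop across them. It assumes the 2008-era toolbox interface (`matlabpool` with the built-in `local` configuration); later releases replaced `matlabpool` with `parpool`. The workload inside the loop is a placeholder chosen only for illustration.

```matlab
% Open 4 workers on this machine using the built-in 'local' configuration.
% (Assumes the 2008-era Parallel Computing Toolbox interface.)
matlabpool open local 4

n = 200;                 % number of independent trials (illustrative)
results = zeros(1, n);   % preallocate the output, sliced by the loop index

% The iterations do not depend on each other, so parfor is free to
% distribute them across the workers in any order.
parfor k = 1:n
    results(k) = max(abs(eig(rand(50))));   % placeholder workload
end

% Release the workers when finished.
matlabpool close
```

Replacing `parfor` with `for` gives a serial version of the same loop, which is a convenient way to compare timings while judging whether the algorithm benefits from parallel execution.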

In the Department

Once your problem needs to scale beyond your desktop, you need resources that give you access to the processor cores and memory of other machines. Departmental clusters are the usual starting point. They typically consist of several off-the-shelf machines (often desktop machines) connected via standard Ethernet cables. To take advantage of a departmental cluster, you could purchase MATLAB licenses (plus all the toolboxes your problem needs) for installation on each machine. You would then be responsible for partitioning the problem across the departmental machines, running the algorithm on each machine, and collecting the results.

Alternatively, you could use MATLAB Distributed Computing Server to manage, for the departmental cluster, the licensing of MATLAB and its toolboxes, the partitioning of the problem, the execution of the algorithms, and the collection of results. Provided you have Parallel Computing Toolbox on your desktop to connect to MATLAB Distributed Computing Server on the cluster, you are licensed to use on the server whatever toolboxes you can run on your desktop (some restrictions apply; see the Ineligible Programs list on The MathWorks web site).

You can set up the cluster to run MATLAB Distributed Computing Server’s job manager to make efficient use of the resources of a shared departmental cluster. The job manager provides a scheduling service for MATLAB tasks, with queues and task logs that let you manage batch operations from multiple users.
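The sketch below shows one way a batch of tasks might be queued on a job manager, assuming the 2008-era toolbox interface (`findResource`, `createJob`, `createTask`); later releases replaced `findResource` with `parcluster`. The job manager name `MyJobManager` and the host name `headnode` are hypothetical and stand in for the values of your own cluster.

```matlab
% Locate the job manager. 'MyJobManager' and 'headnode' are hypothetical
% placeholders; substitute the name and lookup host of your own cluster.
jm = findResource('scheduler', 'type', 'jobmanager', ...
                  'Name', 'MyJobManager', 'LookupURL', 'headnode');

% A job is a container for tasks that the job manager queues and logs.
job = createJob(jm);

% Add three independent tasks. Each calls rand with one output argument
% and a different matrix size as its input.
for n = [100 200 300]
    createTask(job, @rand, 1, {n});
end

submit(job);                        % place the job in the job manager's queue
waitForState(job, 'finished');      % block until every task has completed
out = getAllOutputArguments(job);   % 3-by-1 cell array, one row per task

destroy(job);                       % remove the job's data from the job manager
```

Because the job lives in the job manager's queue rather than in your MATLAB session, jobs submitted by several users are scheduled alongside each other on the same cluster.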

On the Campus

Once your problem grows large enough, you will require access to a dedicated computing cluster. Typically, this cluster becomes an IT resource, requiring special cooling, networking interconnects, and shared disk storage. More often than not, a dedicated computing cluster in your organisation will be supporting multiple applications. Consequently, a job scheduler is typically used to manage tasks from multiple users.

Parallel Computing Toolbox and MATLAB Distributed Computing Server provide built-in support for a number of commonly used cluster schedulers. The MATLAB Distributed Computing Server supported schedulers page provides a complete list. For schedulers that are not supported directly, The MathWorks provides a user-configurable generic scheduler interface.

On the Grid

Grid computing is a term used to describe computing resources shared across multiple organisations, often geographically distant. By using grids such as EGEE, researchers can access computing resources when they need them, without any single organisation spending vast sums of money on a massive computing resource. Grids enable collaborative resource-pooling, so that the pooled resources are used more efficiently across a number of research projects.

MATLAB Distributed Computing Server supports grid computing using EGEE. For more information, see the article “EGEE now running Matlab parallel computing products” in Scientific Computing’s HPC News.

In the Clouds!

Cloud computing is a term used to describe accessing computing resources on an as-needed basis. The most popular cloud computing infrastructure is Amazon’s Elastic Compute Cloud (EC2). Using EC2, you could have access to as many compute nodes as you need, exactly when you need them. You only pay for the use of the resource, and not for any administration and maintenance of a very large resource that may be under-utilised.

The MathWorks has written a white paper on how to access Amazon’s EC2 with MATLAB Distributed Computing products.

Switching Schedulers or Clusters and Using Configurations

You can switch between the schedulers described above with minimal code changes by using the configurations provided by Parallel Computing Toolbox. A configuration gives a scheduler a name on your local client and stores that scheduler’s specific settings (such as required file or path dependencies). You can then prototype on your local scheduler and later switch to, say, the ec2 scheduler (assuming you had configured a scheduler and named it ec2) simply by selecting a different configuration name.
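To illustrate, the sketch below wraps a parallel loop in a function that takes the configuration name as its only cluster-specific input. It assumes the 2008-era `matlabpool` interface; `runSweep` is a hypothetical helper, the workload is a placeholder, and `ec2` is assumed to be a configuration you have already defined, as in the paragraph above.

```matlab
function y = runSweep(configName)
% runSweep  Run a placeholder workload on the cluster described by the
% named configuration. (Hypothetical helper, for illustration only.)

% Open a pool of workers on whichever scheduler the configuration names.
matlabpool('open', configName);

n = 100;
y = zeros(1, n);         % preallocate the output, sliced by the loop index

parfor k = 1:n
    y(k) = sum(svd(rand(40)));   % placeholder workload
end

matlabpool('close');
end
```

Calling `y = runSweep('local')` prototypes on the desktop, while `y = runSweep('ec2')` would run the same loop on the scheduler behind the ec2 configuration. Only the configuration name changes; the algorithm code does not.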

 
