Golden Energy Computing Organization
Colorado School of Mines

FACULTY PROPOSALS FOR TIME ALLOCATION ON BLUEM

The CSM High Performance Computing Group and the Golden Energy Computing Organization (GECO) invite CSM faculty and associated research groups to submit proposals for use of Mines' high performance computing (HPC) platform, BlueM.

BlueM will allow researchers to run large simulations in support of the university's core research areas while working at the forefront of algorithm development. BlueM is a unique high performance computing system from IBM. The overall specifications are:

Feature            Value
Teraflop rating    154 teraflops (roughly 7x RA)
Memory             17.4 terabytes
Nodes              656
Cores              10,496
Disk               480 terabytes

BlueM is unique in configuration. It contains two independent compute partitions that share a common file system. The two partitions are built using different architectures, each optimized for a particular type of parallel application. The first partition, known as Mc2 (Energy), runs on an IBM Blue Gene/Q (BGQ). The second partition, known as AuN (Golden), uses the iDataplex architecture. For additional information see: http://hpc.mines.edu/bluem/ and http://hpc.mines.edu/bluem/description.html.

Proposals can be submitted using the following instructions and electronic form. This procedure is to be used by CSM faculty members to request a specified number of node-hours on BlueM. Allocations will be made on a semi-annual basis, and allocation awards will be valid for six months from the award date. Proposals will be evaluated based on:

  • science theme;
  • reasonableness of number of node-hours requested (each node has 16 cores);
  • code scalability;
  • number of students and post docs associated with the project;
  • clear tie to requested or existing external funding;
  • previous history of bringing funding to CSM using HPC;
  • faculty publications which rely on HPC; and
  • faculty achievements, awards and honors associated with HPC.

Importance of listing Grants and Publications

The form below includes fields for entering information about your proposals, funded research, and HPC-related publications. This information is important because it will be used as part of the justification for future machine and infrastructure updates. Entries will be reviewed, and additional information may be sought if they do not appear complete.

Calculation of node hours to request

BlueM is significantly different from Mio. As discussed at http://hpc.mines.edu/bluem/description.html, it contains two partitions. These differences should be kept in mind when calculating the number of node-hours to request.

The iDataplex partition is the most similar to our old machine RA and to Mio. It uses Intel processors, as did RA and as does Mio. The performance of an individual core may be 2-4x that of the RA cores, and there are twice as many cores per node (16). Most applications built on Mio will run on the iDataplex partition, but they will not see the best performance without a recompile. Also, the iDataplex nodes each have 64 Gbytes of memory.
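As a hypothetical recompile sketch (the compiler wrapper and flag here are assumptions; check the tools actually installed on AuN):

# Rebuild for the iDataplex nodes with an Intel compiler toolchain;
# -xHost targets the full instruction set of the build host.
mpicc -O3 -xHost -o mycode mycode.c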

The processors on the BGQ partition are not in the same family as the RA/Mio processors. Codes built on Mio will not run on the BGQ partition; they will need to be recompiled. Also, the processors are designed to be oversubscribed. That is, they run best if there are more than 16 MPI tasks running on each node or each task uses multiple threads. The number of MPI tasks times the number of threads per task can be up to 64. The performance per core on the BGQ partition will be comparable to RA, but there are many more cores available and the scaling will be much better.
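As a worked example (the run parameters here are hypothetical; substitute your own), a node-hour request can be justified with simple arithmetic:

# Hypothetical sizing: 10 production runs, each using 32 nodes for
# 12 wall-clock hours, plus roughly 20% margin for testing,
# debugging, and failed runs.
echo $(( 10 * 32 * 12 * 120 / 100 ))   # prints 4608 node-hours

On Mc2, also choose a tasks-times-threads product of up to 64 per node, for example 32 MPI tasks with 2 threads each.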

Below are two lists of programs and libraries that have been built on other Blue Gene machines. Some of the listed items are from an earlier model of the Blue Gene, the "P". Some of those listed have been ported but have not had a high level of optimization performed.

Researchers with previous allocations on BlueM can determine their usage by running the following commands on either AuN or Mc2:


/opt/utility/aunhrs ############
/opt/utility/mc2hrs ############

where ############ is the account number. These commands will appear to report account information back to the beginning of last year, but because these accounts have only been active on our current accounting system since the fall, only the hours used since August will be reported.

You can see the accounts you are authorized to use by running the following command on either AuN or Mc2:

sacctmgr list association cluster=mc2 user=$LOGNAME format=User,Account%20

This will return a list of your accounts. You can then see the association between an account number and your project title by running the following command, replacing ############ with a value from the previous command's output.

sacctmgr list account ############ format=Account%15,Desc%80
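
If you are authorized on several accounts, the two commands above can be combined. A minimal sketch (assuming the same format strings as above; run on either AuN or Mc2):

# For each account associated with your login, print its number and
# project description. The -n option suppresses the header line.
for acct in $(sacctmgr -n list association cluster=mc2 user=$LOGNAME format=Account%20); do
    sacctmgr list account $acct format=Account%15,Desc%80
done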

Successful proposals will result in an award of a fixed number of node-hours on BlueM, and faculty members will be able to track the node-hour usage of their group members through the commands given above. In the past we have been rather lax about enforcing limits on the time granted; because of various contractual obligations, we will now need to enforce them. Users will be allowed to run past their allocated hours, but at a low priority.

Faculty will be called upon to provide summaries of the research and publications generated, and to participate in HPC meetings and promotional activities.

Data Storage

The file system policy for BlueM is outlined at http://inside.mines.edu/mio/newpolicy.html. In particular, we note that the file system is not intended for long-term storage. If users wish to keep data past its immediate use, off-machine storage should be arranged. See http://hpc.mines.edu/tier2 for suggestions on archiving data.
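
For example, a minimal archiving sketch (the remote host and paths below are hypothetical, and $SCRATCH is assumed to point at your scratch directory; substitute your own off-machine storage, and see the link above for options):

# Bundle a project directory from scratch space and copy it off machine.
tar czf myproject.tar.gz $SCRATCH/myproject
scp myproject.tar.gz archive.example.edu:/archive/$USER/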

For questions please contact Dr. Timothy Kaiser at: tkaiser@mines.edu

Instructions

Fill in all of the information and hit "Submit". All fields not marked optional must be filled in. The larger text input boxes will scroll but can also be expanded. After you hit Submit you will see a summary of what you have entered. Please save this for your records.

There are two types of accounts available: normal and experimental. Experimental accounts are for people who are new to HPC and need to gain experience before submitting a request for a normal account. If you had a previous BlueM allocation but did not use a significant portion of it, you are expected to request an experimental allocation.

Faculty Data

Faculty Name:
Email:
Title of Project:
Academic Department:
Group's Technical Point of Contact:
The Technical Point of Contact is the
person you select as most knowledgeable
about the computational aspects of your
project. They will be a resource for
others in your group and for the people
in the HPC group in understanding issues.

Allocation Request

Account Type (See Instructions):
Node-Hours Requested (each node has 16 cores):

Project Overview

Provide a concise, one paragraph summary of the project.

Clarify the earth/energy/environment science tie to the proposed use of BlueM.

Explain the impact of the proposed research project.

Defend the number of node-hours requested. Be as quantitative as possible. Include a record of usage on similar projects if possible.

Number of students and post docs associated with the project:


Identify any external funding requiring this computer time, and the total project award to CSM by year. Has this funding been received yet? Explain whether or not GECO was specifically mentioned in the proposal to this funding agency.

List of collaborators and their organizations.

Code Information

Commercial codes or software packages to be used for this project
(must be supplied by faculty member)
Open Source codes or software packages to be used for this project

Brief description of codes (including web site if available):

Code Scalability Information
Attach a PDF description of scalability (optional).
(Optional) Scaling information: choose a *.pdf file to upload:

Average number of nodes to be used in operational runs:

Estimated wall time (hours) for completion of operational runs: hours

Does your code have checkpoint/restart capability?

Is there a version of your code that uses GPUs?
This does not affect the allocation; it will
only be used for future planning.

Anticipated maximum temporary storage requirements: Gbytes

Data Archiving Information

Data on the SCRATCH space of BlueM is subject to purging, and users must archive any data that is not in immediate use. Please estimate your yearly archival requirements and your plans for storing data.
Anticipated data archive requirements: Gbytes
Archival plan:

RECORD OF HPC FUNDING AND PUBLICATIONS

Provide a list of recently funded projects and projects in the proposal review process. Also provide a list of recent publications related to this project, including the Digital Object Identifier (DOI) number, and identify publications that explicitly acknowledge CSM HPC resources in the publication acknowledgements.

Proposals in Review and/or Recent (3 years) Funded Research
Title of Investigation | Status | Source of Funding | Amount ($K) | CSM Project # or CSM Proposal # | Start/Stop Dates of Investigation | # Students and Post Docs

HPC Related Publications
DOI # | CSM Resources Acknowledged | Citation

When you hit Submit you will be shown a copy of your input.
If you see an error, use the "Go Back" function of your browser,
correct the error, and resubmit.
