Operating Status

Downtimes: There is only one e-mail sent per downtime. Information concerning completion of the work or extension of the downtime will be posted on this page. Information on this page is also available via RSS feed.

11/11/2008 2:16:00 PM

DOWNTIME: Wednesday, November 12th, 8am-5pm

REASON: To install November 2008 security patches (Windows and Linux)

WHO WILL BE AFFECTED? All users who use CAC managed resources housed in Rhodes Hall, except for the following clusters: v4, USDA, Waller, Weblab (hadoop), and Kinglab.

WHAT WILL BE UNAVAILABLE?

  • ALL Windows LOGIN servers (all sessions will be lost)
  • ALL Windows batch cluster nodes (v3, development, cbsu1, cbsu2). ALL running batch jobs on these nodes will be cancelled
  • ALL managed servers (CBSUFSRV1, CBSUSRVnn, SCIDATAn, CATSnn, PETIE, ARECIBO, DSSLINUXn, etc)
  • ALL Web pages will be unavailable
  • ALL Database servers (ctcsql, etc)
  • Moab Scheduler for v4 and v4dev jobs (running jobs on v4 and v4dev will not be affected)

CAC Resource status can be found at http://www.cac.cornell.edu/datafeed/status.aspx

10/14/2008 10:30:00 AM

DOWNTIME: Wednesday, October 15th, 8am-5pm

REASON: To install October 2008 security patches (Windows and Linux)

WHO WILL BE AFFECTED? All users who use CAC managed resources housed in Rhodes Hall, except for the following clusters: v4, USDA, waller, and kinglab.

WHAT WILL BE UNAVAILABLE?

  • ALL Windows LOGIN servers (all sessions will be lost)
  • ALL Windows batch cluster nodes (v3, development, cbsu1, cbsu2). ALL running batch jobs on these nodes will be cancelled.
  • ALL managed servers (CBSUFSRV1, CBSUSRVnn, SCIDATAn, CATSnn, WEBLAB (hadoop), ARECIBO, DSSLINUXn, etc)
  • ALL Web pages will be unavailable: http://www.cac.cornell.edu/
  • ALL Database servers (ctcsql, etc)

9/16/2008 1:53:00 AM

CAC Fileservers and batch queue will be removed from service

To Current CAC Researchers:

v3linux will be removed from service on September 17. If you are using this resource, please move your jobs to v4 now so your production computing can continue without disruption.

To Former CTC Researchers:

All old (formerly CTC) accounts will be deleted as of October 1st. In order to continue to use CAC resources you will need to be added to a faculty sponsored project. Project information is listed here: http://www.cac.cornell.edu/services/projects.aspx

To Both Groups:

After October 1, 2008, we will no longer be able to recover files from the old CAC fileservers (ctcfsrv6 through ctcfsrv13). The fileservers will be reformatted, so after 10/1 there will be no way to get any files from these machines or from backup. Please contact help@cac.cornell.edu if you have any questions about recovering or moving your files.

9/8/2008 2:33:00 PM

CAC Computing Update

Resource and fileserver update.

New computing resource - v4

A new computing resource will be available on September 10, 2008. v4 has 1024 cores: 128 servers, each with two quad-core 2.5 GHz Intel Xeon processors and 16GB RAM. It is running Red Hat Enterprise Linux Server release 5.1. There will be a charge for using the v4 compute nodes. v4 nodes are accessed using the Moab scheduler.

Login node name: linuxlogin3
Batch queue names:
v4 (up to 128 servers)
v4dev (up to 2 servers, 1 hour time limit)
v4-64g (4 servers with 64GB RAM/server)


Batch documentation
Software
CAC Resources

Training on v4 September 11 & 12

From noon until 1pm on Thursday September 11 and Friday September 12, we will hold training sessions on using Moab to compute on the v4 nodes. The sessions will include a demo, a review of the available documentation, and time to try the scheduler with help from CAC consultants. Join us in room 514 Rhodes Hall. Bring your laptop if you have one. Registration is not necessary.

v3linux will be removed from service on September 17

This will allow current users one week to migrate to the new v4 resource. Please make use of the transition time to move to v4 as soon as possible so your production computing can continue without disruption. The v3 Windows queue will continue to be available without charge for the short term; soon it too will be moved to the Moab scheduler, and the operating system will be updated to Windows Server 2008 x64.

Fileserver update

After October 1, 2008, we will no longer be able to recover files from the old CAC fileservers (ctcfsrv6 through ctcfsrv13). Please contact help@cac.cornell.edu if you have any questions.

9/8/2008 2:25:00 PM

DOWNTIME: Wednesday, September 10th, 8am-5pm

REASON: To add the NEW v4 computing resource to production and to install September 2008 security patches (Linux and Windows)

WHO WILL BE AFFECTED? All users who use CAC, CBSU, WEBLAB, ARECIBO, DSS, ADMM and USDA resources housed in Rhodes Hall

WHAT WILL BE UNAVAILABLE?

  • ALL LOGIN servers (all sessions will be lost)
  • ALL batch cluster nodes (v3, development, v3Linux, v3linuxdev, admm, cbsu1, cbsu2, x64test, all running jobs will be cancelled).
  • ALL managed servers
  • ALL Web pages will be unavailable: http://www.cac.cornell.edu/
  • ALL Fileservers containing home directories
  • ALL Database servers

8/11/2008 1:40:00 PM

DOWNTIME: Wednesday, August 13th, 8am-5pm

REASON: To install August 2008 security patches (Linux and Windows)

WHO WILL BE AFFECTED? All users who use CAC, CBSU, WEBLAB, ARECIBO, DSS, ADMM and USDA resources housed in Rhodes Hall

WHAT WILL BE UNAVAILABLE?

  • ALL LOGIN servers (all sessions will be lost)
  • ALL batch cluster nodes (v3, development, v3Linux, v3linuxdev, admm, cbsu1, cbsu2, x64test, all running jobs will be cancelled).
  • ALL managed servers
  • ALL Web pages will be unavailable: http://www.cac.cornell.edu/
  • ALL Fileservers containing home directories
  • ALL Database servers

7/10/2008 8:25:00 AM

IMPORTANT DOWNTIME ANNOUNCEMENT

Downtime July 15-16 for power outage and to install security patches

WHEN? Tuesday, July 15th, 3pm – Wednesday, July 16th, 3pm (24 hours)

REASONS: Power outage for electrical repairs/upgrades in Rhodes Hall and to install July security patches (Linux and Windows)

WHO WILL BE AFFECTED? All users who use CAC, CBSU, WEBLAB, ARECIBO, DSS, ADMM and USDA resources housed in Rhodes Hall

WHAT WILL BE UNAVAILABLE?

  • ALL LOGIN servers (all sessions will be lost)
  • ALL batch cluster nodes (v3, development, v3Linux, v3linuxdev, admm, cbsu1, cbsu2, x64test, all running jobs will be cancelled).
  • ALL Web pages will be unavailable: http://www.cac.cornell.edu/ .
  • ALL Fileservers containing home directories
  • ALL Database servers

7/10/2008 8:21:00 AM

UPDATE: Project information available

MANAGEMENT CONSOLE FOR PROJECT PIs

If you are the Principal Investigator for one or more projects, you can now view your project settings and view the list of userids you have enabled. By July 18th you will be able to update project settings from this page as well; that functionality will be added incrementally over the next ten days.

http://www.cac.cornell.edu/services/projects/manage.aspx

7/1/2008 11:20:00 AM

UPDATE: Cost-recovery model transition

Update on new project transition and deadlines for those not moving to new system

Notice to Researchers:

We are now in the first week of moving to a cost-recovery model. As you have probably noticed, we still have some work to do on systems and interfaces. We are still waiting for power installation in order to install the v4 Linux cluster and four large shared memory nodes (16 cores, 64 GB memory). Until the new systems are available, anyone who has created a project and added users can continue to use the v3 Windows and Linux clusters at no charge. We will begin charging for consulting, cluster maintenance and storage as of today, July 1. We are awaiting final approval from the Cornell Division of Financial Affairs for the 2009 rates. As soon as the rates are approved we will update our web site. Any charges incurred starting July 1 will be billed at the new rates.

We will continue to improve the Web forms. In addition we plan to provide a web interface to allow PIs to view and manage their projects; we hope to have the first iteration next week, allowing the project PI to view project numbers, people added to the project(s), and limits set.

We welcome your feedback and will continue to make improvements. Regular updates will be provided as the changeover progresses.

If you are not moving to a new project:

  • July 1: You will not be able to submit any jobs to the Velocity queues (v3, development, v3linux, or v3linuxdev). Your login will remain active for a short period of time (1-2 months) to allow you time to move your data to local storage.
  • Sept 1: Your account will be disabled.
  • Oct 1: We will no longer be able to recover files.

As always, please contact us if you have any questions or concerns.

6/6/2008 6:12:00 PM

DOWNTIME: Wednesday, June 11, 8am-5pm

REASON: Patches will be applied to both Linux and Windows servers.

WHO WILL BE AFFECTED?

All users:

  • Fileservers containing home directories on H: and /home/nfs will be inaccessible.
  • ctclogina, ctcloginb, ctclogind, linuxlogin1, linuxlogin2 - all login sessions will be lost.
  • CAC batch machines (v3, development, v3linux, v3linuxdev, admm, cbsu1, cbsu2, x64test). All running jobs will be cancelled.
  • CAC Web pages will be unavailable: http://www.cac.cornell.edu/ .

To minimize interruptions in accessing files, the fileservers containing home directories on H: and /home/nfs will be patched first.

Database: The following machines will be inaccessible:

  • sqlsrv01, sqlsrv02, ctcsql, scidata1, scidata2, sonofstager

Specific groups: If you are using any of the following machines, you would make direct connections to them. These connections will be interrupted.

  • Cbsusrv01-03
  • Scicentr1-3
  • Weblab
  • Arecibo

5/16/2008 11:32:00 AM

Fileserver ctcfsrv13 experiencing hardware problems; only Linux users affected

ctcfsrv13 is experiencing hardware problems. This problem only affects linux users who have home folders on ctcfsrv13. Those users will be moved to other filesystems. If you think you are in this group, please contact us.

5/9/2008 1:35:00 AM

CAC downtime changes and updates

REASON: Patches and scheduler changes

Network maintenance 5/18/2008: Network connectivity to CAC managed servers will be lost from 7:00 - 7:15 AM. Running jobs will not be affected.

Velocity Scheduler (vsched) changes 5/14/2008: The changes are intended to provide better access while the number of nodes is temporarily reduced and demand is high.

  • The v2linux and v2linuxdev machines will be removed to allow space and power for new hardware
  • Cluster affiliation v3 will have the Max_Queue_Limit* raised from 244800 to 259200
  • Cluster affiliation v3linux will have the Max_Queue_Limit* changed from unlimited to 230400
* Max_Queue_Limit is the maximum number of node-wallclock minutes for all jobs in a given queue belonging to a single user, both running and waiting.
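To see how a set of jobs counts against Max_Queue_Limit, the arithmetic can be sketched as below. This is only an illustration: it assumes each job counts as requested nodes multiplied by requested wallclock minutes, and the function names and the (nodes, minutes) job layout are hypothetical, not part of vsched.

```python
# Limits quoted in the announcement, in node-wallclock minutes per user per queue.
MAX_QUEUE_LIMIT = {"v3": 259200, "v3linux": 230400}


def node_minutes(nodes: int, wallclock_minutes: int) -> int:
    """Node-wallclock minutes requested by one job (assumed: nodes * minutes)."""
    return nodes * wallclock_minutes


def fits_queue_limit(queue: str, jobs: list[tuple[int, int]]) -> bool:
    """Return True if all of a user's running and waiting jobs fit the limit.

    `jobs` is a list of (nodes, wallclock_minutes) pairs; this layout is an
    assumption for illustration, not the scheduler's own format.
    """
    total = sum(node_minutes(n, m) for n, m in jobs)
    return total <= MAX_QUEUE_LIMIT[queue]


# Example: 64 nodes for 48 hours plus 32 nodes for 24 hours.
jobs = [(64, 48 * 60), (32, 24 * 60)]
print(fits_queue_limit("v3", jobs))
```

Under these assumptions, the new v3 limit of 259200 node-minutes corresponds to 180 node-days of queued and running work per user, and the v3linux limit of 230400 to 160 node-days.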

Downtime changes 5/14/2008, 8am-5pm:

All users:

  • Fileservers containing home directories on H: and /home/nfs will be inaccessible.
  • ctclogina, ctcloginb, ctclogind, linuxlogin1, linuxlogin2 - all login sessions will be lost.
  • CAC batch machines (v3, development, v3Linux, v3linuxdev, admm, cbsu1, cbsu2, x64test). All running jobs will be cancelled.
  • CAC Web pages will be unavailable: http://www.cac.cornell.edu/ .

To minimize interruptions in accessing files, the fileservers containing home directories on H: and /home/nfs will be patched first.

Database: The following machines will be inaccessible:

  • sqlsrv01, sqlsrv02, ctcsql, scidata1, scidata2, sonofstager

Specific groups: If you are using any of the following machines, you would make direct connections to them. These connections will be interrupted.

  • Cbsusrv01-05
  • Scicentr2&3
  • Weblab
  • Arecibo

5/6/2008 3:23:00 AM

Network maintenance 5/18/2008

Network connectivity to CAC managed servers will be lost on Sunday, May 18, 2008, between 7:00 and 7:15 AM.

4/28/2008 2:00:00 PM

CAC cluster changes

REASON: v2linux will be removed and replaced with a new cluster

The CAC is in the process of acquiring a new linux cluster, called v4linux. v4linux will consist of 128 M600 Dell Blades, each with 2 x Quad-core 2.5GHz Intel Processors (8 cores), 16GB RAM, and 1 x 120GB hard drive, running RHEL 5 linux. We plan to use the Moab scheduler on v4linux.

To make room for v4, the v2linux and v2linuxdev clusters must be removed first. In order to continue to offer both Windows and Linux clusters in the interim, we are temporarily moving half of the v3 nodes to Linux, configured as closely as possible to the current v2linux nodes. The v3 nodes will be returned to Windows as soon as v4linux comes online. Our goal is to have as many systems ready by July 1st as possible.

TIMEFRAME:

  • Monday, April 28th - Friday, May 1: Half of the v3 nodes will move from Windows to Linux
  • Monday, April 28th: v3linux nodes begin to be available for test and migration
  • Wednesday, May 14th: v2linux and v2linuxdev will be removed from service

TEMPORARY CHANGES FOR V3 (WINDOWS) USAGE:

  • Fewer nodes will be available in this pool. The nodes will be returned after v4linux is installed

TRANSITION CHANGES FOR V2 (LINUX) USAGE:

  • Change the affiliation in your xml batch script from v2linux to v3linux
  • Change the affiliation in your xml batch script from v2linuxdev to v3linuxdev
  • The v3linux and v3linuxdev nodes have been configured as closely as possible to the v2linux nodes. The nodes have more memory and faster cpus than v2
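If you have many batch scripts to update, the affiliation change above could be applied with a small text substitution along these lines. This is a rough sketch: the sample XML fragment and its element names are hypothetical, since the batch script schema is not shown here, and the function name is made up for illustration. Review the result before submitting jobs.

```python
import re

# v2linux -> v3linux and v2linuxdev -> v3linuxdev, as listed above.
# The optional "dev" suffix is captured so it can be carried over.
_AFFILIATION = re.compile(r"\bv2linux(dev)?\b")


def migrate_affiliation(script_text: str) -> str:
    """Replace v2 Linux affiliation names with their v3 equivalents."""
    return _AFFILIATION.sub(
        lambda m: "v3linux" + (m.group(1) or ""), script_text
    )


# Hypothetical script fragment; the real element names may differ.
sample = "<job><affiliation>v2linuxdev</affiliation></job>"
print(migrate_affiliation(sample))
```

The substitution is purely textual, so it would also rewrite the names if they appear in comments or paths inside the script.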
4/7/2008 11:00:00 AM

DOWNTIME: Wednesday, April 9, 8am-5pm

REASON: Patches will be applied to both Linux and Windows servers. Velocity Scheduler policy changes will be applied.

WHO WILL BE AFFECTED?

All users:

  • Fileservers containing home directories on H: and /home/nfs will be inaccessible.
  • ctclogina, ctcloginb, ctclogind, linuxlogin1, linuxlogin2 - all login sessions will be lost.
  • CAC batch machines (v3, development, serial, v2Linux, v2linuxdev, admm, cbsu1, cbsu2, x64test). All running jobs will be cancelled.
  • CAC Web pages will be unavailable: http://www.cac.cornell.edu/ .

To minimize interruptions in accessing files, the fileservers containing home directories on H: and /home/nfs will be patched first.

Velocity Scheduler (vsched) users:

  • Cluster affiliations development and v2linuxdev will have the max_time_limit raised from 20 to 60 minutes
  • Cluster affiliation v3 will have the max_time_limit raised from 24 to 48 hours
  • Cluster affiliation v3 will have min_nodes reduced from 2 to 1
  • The serial affiliation will be removed. All serial nodes will be merged into v3. Jobs submitted to serial will fail with the message Affiliation "serial" does not exist. If you have been using the serial affiliation, simply change it to v3.
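The policy values above can be summarized as a simple pre-submission check. This is a sketch only: the function and table names are hypothetical, it is not how vsched itself validates jobs, and the default minimum of one node for affiliations other than v3 is an assumption.

```python
# Policy values after the change, taken from the list above.
MAX_TIME_MINUTES = {"development": 60, "v2linuxdev": 60, "v3": 48 * 60}
MIN_NODES = {"v3": 1}


def check_request(affiliation: str, nodes: int, minutes: int) -> list[str]:
    """Return a list of problems with a job request; empty means none found."""
    if affiliation == "serial":
        # The serial affiliation is removed; its nodes are merged into v3.
        return ['Affiliation "serial" does not exist; use v3 instead.']
    problems = []
    limit = MAX_TIME_MINUTES.get(affiliation)
    if limit is not None and minutes > limit:
        problems.append(f"requested {minutes} min exceeds limit of {limit} min")
    min_nodes = MIN_NODES.get(affiliation, 1)  # default of 1 is an assumption
    if nodes < min_nodes:
        problems.append(f"at least {min_nodes} node(s) required")
    return problems


print(check_request("v3", 1, 48 * 60))
print(check_request("serial", 1, 30))
```

Affiliations not listed in the table are not checked for a time limit, because their limits are not stated in this announcement.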

Database: The following machines will be inaccessible:

  • sqlsrv01, sqlsrv02, ctcsql, scidata1, scidata2, sonofstager

Specific groups: If you are using any of the following machines, you would make direct connections to them. These connections will be interrupted.

  • Cbsusrv01-03
  • Scicentr1-3
  • Weblab
  • Arecibo

3/6/2008 4:13:00 PM

DOWNTIME: Wednesday, March 12, 8am-5pm

REASON: Patches will be applied to both Linux and Windows servers.

WHO WILL BE AFFECTED?

All users:

  • Fileservers containing home directories on H: and /home/nfs will be inaccessible.
  • ctclogina, ctcloginb, ctclogind, linuxlogin1, linuxlogin2 - all login sessions will be lost.
  • CAC batch machines (v3, development, serial, v2Linux, v2linuxdev, admm, cbsu1, cbsu2, x64test). All running jobs will be cancelled.
  • CAC Web pages will be unavailable: http://www.cac.cornell.edu/ .

To minimize interruptions in accessing files, the fileservers containing home directories on H: and /home/nfs will be patched first.

Database: The following machines will be inaccessible:

  • sqlsrv01, sqlsrv02, ctcsql, scidata1, scidata2, sonofstager

Specific groups: If you are using any of the following machines, you would make direct connections to them. These connections will be interrupted.

  • Cbsusrv01-03
  • Scicentr1-3
  • Weblab
  • Arecibo

2/8/2008 2:43:00 PM

DOWNTIME: Wednesday, February 13, 8am-5pm

REASON: Patches will be applied to both Linux and Windows servers.

WHO WILL BE AFFECTED?

All users:

  • Fileservers containing home directories on H: and /home/nfs will be inaccessible.
  • ctclogina, ctcloginb, ctclogind, linuxlogin1, linuxlogin2 - all login sessions will be lost.
  • CAC batch machines (v3, development, serial, v2Linux, v2linuxdev, admm, cbsu1, cbsu2, x64test). All running jobs will be cancelled.
  • CAC Web pages will be unavailable: http://www.cac.cornell.edu/ .

To minimize interruptions in accessing files, the fileservers containing home directories on H: and /home/nfs will be patched first.

Database: The following machines will be inaccessible:

  • sqlsrv01, sqlsrv02, ctcsql, scidata1, scidata2, sonofstager

Specific groups: If you are using any of the following machines, you would make direct connections to them. These connections will be interrupted.

  • Cbsusrv01-03
  • Scicentr1-3
  • Weblab
  • Arecibo

1/7/2008 11:41:00 AM

DOWNTIME: Wednesday, January 9, 8am-5pm

REASON: Patches will be applied to both Linux and Windows servers.

WHO WILL BE AFFECTED?

All users:

  • Fileservers containing home directories on H: and /home/nfs will be inaccessible.
  • ctclogina, ctcloginb, ctclogind, linuxlogin1, linuxlogin2 - all login sessions will be lost.
  • CAC batch machines (v3, development, serial, v2Linux, v2linuxdev, admm, cbsu1, cbsu2, x64test). All running jobs will be cancelled.
  • CAC Web pages will be unavailable: http://www.cac.cornell.edu/ .

To minimize interruptions in accessing files, the fileservers containing home directories on H: and /home/nfs will be patched first.

Database: The following machines will be inaccessible:

  • sqlsrv01, sqlsrv02, ctcsql, scidata1, scidata2, sonofstager

Specific groups: If you are using any of the following machines, you would make direct connections to them. These connections will be interrupted.

  • Cbsusrv01-03
  • Scicentr1-3
  • Weblab
  • Arecibo

1/2/2008 4:12:00 PM

ctcloginb is available again.

This machine is back online at this time.

Please report any problems with this machine.

1/2/2008 11:06:00 AM

ctcloginb is unavailable.

ctcloginb is experiencing hardware trouble. CAC systems staff is working on it now.

12/10/2007 11:00:00 AM

DOWNTIME: Wednesday, Dec 12th, 8am-5pm

REASON: Patches will be applied to both Linux and Windows servers.

  • Fileservers containing home directories on H: and /home/nfs will be inaccessible.
  • ctclogina, ctcloginb, ctclogind, linuxlogin1, linuxlogin2 - all login sessions will be lost.
  • CAC batch machines (v3, development, serial, v2Linux, v2linuxdev, admm, cbsu1, cbsu2, x64test). All running jobs will be cancelled.
  • CAC Web pages will be unavailable: http://www.cac.cornell.edu/.
  • To minimize interruptions in accessing files, the fileservers containing home directories on H: and /home/nfs will be patched first.
  • The following database machines will be inaccessible: sqlsrv01, sqlsrv02, ctcsql, scidata1, scidata2, sonofstager
  • Specific groups: If you are using any of the following machines, you would make direct connections to them. These connections will be interrupted.
    • Cbsusrv01-03
    • Scicentr1-3
    • Weblab
    • Arecibo

11/9/2007 10:30:00 AM

DOWNTIME: Wednesday, Nov 14th, 8am-5pm

REASON: Patches will be applied to both Linux and Windows servers.

  • Fileservers containing home directories on H: and /home/nfs will be inaccessible.
  • ctclogina, ctcloginb, ctclogind, linuxlogin1, linuxlogin2 - all login sessions will be lost.
  • CAC batch machines (v3, development, serial, v2Linux, v2linuxdev, admm, cbsu1, cbsu2, x64test). All running jobs will be cancelled.
  • CAC Web pages will be unavailable: http://www.cac.cornell.edu/.
  • To minimize interruptions in accessing files, the fileservers containing home directories on H: and /home/nfs will be patched first.
  • The following database machines will be inaccessible: sqlsrv01, sqlsrv02, ctcsql, scidata1, scidata2, sonofstager
  • Specific groups: If you are using any of the following machines, you would make direct connections to them. These connections will be interrupted.
    • Cbsusrv01-03
    • Scicentr1-3
    • Weblab
    • Arecibo

11/5/2007 3:30:00 PM

Sessions disconnected for more than 24 hours will be terminated

All sessions to login nodes that are disconnected for more than 24 hours will be terminated in order to improve system response time.