Friday, January 31, 2014

Architecting a High Performance Lustre Storage Solution



The Solution Architect group within Intel HPDD has recently updated the Architecting a High Performance Storage System technical white paper. This paper outlines a systematic, requirements-driven approach for designing Lustre-powered storage solutions.

See Architecting a High Performance Storage System

References:
  1. Intel® Solutions for Lustre* software

Wednesday, January 29, 2014

Installing Java 7 on CentOS 5 and 6

Step 1: Go to Oracle Java Download site and select the

Step 2: Unpack the Archive
# cd /usr/local/
# tar -zxvf jdk-7u51-linux-x64.tar.gz

Step 3: Set up the environment variables. Add the following to your .bashrc
export JAVA_HOME=/usr/local/jdk1.7.0_51
export JRE_HOME=/usr/local/jdk1.7.0_51/jre
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

Step 4: Check the version
# java -version
java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
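
If "java -version" reports a different JDK, the usual cause is that the new bin directory is not on the PATH of the current shell. The following is a minimal sketch of a check for that, assuming the install location used in Step 3; the helper name path_has_dir is hypothetical and not part of the JDK.

```shell
# Hypothetical helper: succeed if directory $1 is an entry in the
# colon-separated list $2 (defaults to the current PATH).
path_has_dir() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumed install location, taken from Step 3.
JAVA_HOME=${JAVA_HOME:-/usr/local/jdk1.7.0_51}

if path_has_dir "$JAVA_HOME/bin"; then
    echo "PATH contains $JAVA_HOME/bin"
else
    echo "PATH is missing $JAVA_HOME/bin" >&2
fi
```

Remember to run "source ~/.bashrc" or open a new shell after editing .bashrc, otherwise the check will still see the old PATH.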

Monday, January 27, 2014

Dell reportedly plans a major layoff of up to 30% of its workforce

If you thought that IBM selling its x86 business (Lenovo Plans to Acquire IBM’s x86 Server Business) was the only shock in the x86 market, you may want to take a look at Dell looks to be heading toward staff layoffs of 20 to 30 percent.


According to the article, sources familiar with the matter told The Register that, along with laying off employees who work in the PC department, Dell is also cutting staff from other departments such as enterprise software and storage.

However, Dell spokesmen have refuted the report, claiming that The Register's account of the layoffs was inaccurate.

Waiting for further updates.....

Sunday, January 26, 2014

Lenovo Plans to Acquire IBM’s x86 Server Business

Taken from the announcement from IBM "IBM to extend partnership with Lenovo"

Lenovo and IBM have entered into a definitive agreement in which Lenovo plans to acquire IBM’s x86 server business. This includes System x, BladeCenter and Flex System blade servers and switches, x86-based Flex integrated systems, NeXtScale and iDataPlex servers and associated software, blade networking and maintenance operations.

IBM will retain its System z mainframes, Power Systems, Storage Systems, Power-based Flex servers, and PureApplication and PureData appliances.

The agreement builds upon a longstanding collaboration that began in 2005 when Lenovo acquired IBM’s PC business, which included the ThinkPad line of PCs. In the period since the companies have continued to collaborate in many areas.

IBM will continue to develop and evolve its Windows and Linux software portfolio for the x86 platform. IBM is a leading developer of software products for x86 servers with thousands of products and tens of thousands of software developer and services professionals who build software for x86 systems.


For more detailed information, see
  1. Lenovo Plans to Acquire IBM’s x86 Server Business
  2. Lenovo to buy IBM's x86 server business for $2.3bn

Wednesday, January 22, 2014

Mellanox MLNX_OFED 2.1 is released

Product Updates

1. Mellanox OFED Version 2.1-1.0.0 for Linux Driver is now available.

The release has the following new features:

  • Signature Verbs (T10-PI) (beta level)
  • RoCE Time Stamping
  • PeerDirect™
  • Inline-Receive
  • Ethernet performance counters
  • Memory window
  • VMA 6.5.9 bundled with MLNX_OFED
  • DCT support (at beta level)
  • eIPoIB multicast support
  • Connect-IB – Added the ability to resize CAs
  • GA support for ConnectX-3 Pro
  • IPoIB – Reusing DMA mapped SKB buffers
  • IPoIB – Performance improvements when IOMMU is enabled
  • mlnx_en – reporting auto-negotiation support
  • mlnx_en – Transmit Packet Steering (XPS) support
  • mlnx_en – reporting 56GbE link speed support
  • mlnx_en – Receive Flow Steering (RFS) support in UDP
  • mlnx_en – Low latency Socket (LLS) support
  • mlnx_en – check for dma_mapping errors
  • eIPoIB – non-virtualized environment support
Related Links
  1. MLNX_OFED 2.1-1.0.0 Product Page
  2. MLNX_OFED 2.1-1.0.0 Driver Compatibility Matrix
  3. MLNX_OFED 2.1-1.0.0 Release Notes
  4. MLNX_OFED 2.1-1.0.0 User Manual
For more information,

  1. See Mellanox Product Updates 21 Jan 2014

Tuesday, January 21, 2014

SUSE Chalk Talk - Cloud Fundamentals

This chalk talk covers the fundamentals of cloud computing and its importance to the future of enterprise computing. Watch the Cloud Fundamental Chalk Talk

Saturday, January 18, 2014

Compiling gnuplot 4.6.4 on CentOS 5

I compiled gnuplot 4.6.4 on CentOS 5.4



Step 1: Install prerequisites, 
(a) wxGTK and wxGTK-devel,
(b) readline, readline-devel
(c) gd, gd-devel
# yum install wxGTK wxGTK-devel readline readline-devel gd gd-devel


Step 2: Compile gnuplot-4.6.4
# tar -zxvf gnuplot-4.6.4.tar.gz
# cd gnuplot-4.6.4
# ./configure --prefix=/usr/local/gnuplot-4.6.4
# make
# make install
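
Because gnuplot is installed under its own prefix, the binary is not on the default PATH. A minimal sketch of making it available, assuming the prefix from Step 2 (the variable name GNUPLOT_HOME is only illustrative):

```shell
# Assumed install prefix, matching the --prefix used in Step 2.
GNUPLOT_HOME=/usr/local/gnuplot-4.6.4
export PATH=$GNUPLOT_HOME/bin:$PATH

# Print the version only if the binary is actually found on PATH.
if command -v gnuplot >/dev/null 2>&1; then
    gnuplot --version
else
    echo "gnuplot not found on PATH" >&2
fi
```

To make this permanent, the two assignment lines can go into .bashrc.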

If you run into issues, take a look at the Fedora forum thread below.

References:
  1. http://forums.fedoraforum.org/showthread.php?p=1397790
  2. gnuplot homepage

Friday, January 17, 2014

Intel Recommended Reading List for Q1 2014


The Intel Recommended Reading List for 2014 is out. Intel Press books always grab my attention. You may want to take a look at Intel Press Book and Recommended List for Developer (pdf).

Tuesday, January 14, 2014

Interesting Cluster Supercomputing from Cray

Taken from Cray Site


The Cray CS300-AC cluster supercomputer offers energy efficient, air-cooled design based on modular, industry-standard platforms featuring the latest processor and network technologies and a wide range of datacenter cooling requirements.

With flexible configuration options, the Cray CS300-AC system enables users to design a supercomputer that meets their unique HPC requirements. The system is tightly integrated with Cray’s HPC cluster software stack addressing cluster communication efficiency, deployment and management. The Cray CS300-AC cluster supercomputer is designed to meet the challenges of capacity and data-intensive computing workloads.

Video: Introducing the Cray CS300-AC Cluster Supercomputer




The Cray CS300-LC cluster supercomputer delivers energy savings through the use of warm water heat exchangers instead of chillers. This innovative liquid-cooled design directly cools the compute processor and memory to more efficiently remove system heat. Based on modular, industry-standard platforms, the Cray CS300-LC system offers flexible configuration options with the latest processor technologies and multiple networking options.

Every Cray CS300-LC system is tightly integrated with Cray’s HPC cluster software stack addressing cluster communication efficiency, deployment and management. With an innovative liquid-cooling architecture, the Cray CS300-LC system delivers outstanding price/performance, easy remote management, monitoring and reporting for a lower TCO and faster ROI vs. traditional air-cooling architectures. The Cray CS300-LC is designed for capacity and data-intensive computing to help users solve massively parallel computing and Big Data analytics challenges.
Video: Introducing the Cray CS300-LC Cluster Supercomputer

Monday, January 13, 2014

Commonly used options and arguments for the “ls” command

ls has several useful options that control what it lists and how the output is formatted. Here are some that you can use. For example, I like
$ ls -ltF

-a --all List all files, including those whose names begin with a period (.)
-l Display results in long format
-r --reverse Reverse order while sorting
-S Sort results by file size
-t Sort by modification time.
-F --classify Append an indicator character to the end of each listed name
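
The options above can be combined. The following sketch tries a few of them in a scratch directory; the file names are made up for illustration.

```shell
# Work in a scratch directory so no real files are touched.
demo=$(mktemp -d)
cd "$demo" || exit 1

printf '%100s' '' > large.dat    # 100 bytes
printf '%10s'  '' > medium.dat   # 10 bytes
printf 'x'        > small.dat    # 1 byte
touch .hidden
mkdir subdir

ls -S *.dat     # sort by size, largest first
ls -rS *.dat    # same sort reversed, smallest first
ls -a           # also shows ., .. and .hidden
ls -F           # subdir is listed as subdir/
```

Single-letter options can be merged after one dash, which is why "ls -ltF" is the same as "ls -l -t -F".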
References:
  1. The Linux Command Line (by No Starch Press)

Saturday, January 11, 2014

Encountering bash: ELF: command not found

I encountered a user having the issue "ELF: command not found". There is a good writeup explaining this phenomenon, "bash: ELF: command not found", by Solution Blog.

Friday, January 10, 2014

Diagnosing Mellanox Fabrics and RDMA Aware Networks Programming User Manual

If you wish to diagnose the Mellanox/Voltaire Fabric, do the following

1. Reset all the fabric PM counters
# ibdiagnet -pc

2. Print any PM counter whose value is greater than the given threshold (here, a threshold of 1 for all counters)
# ibdiagnet -P all=1

You will find logs of the findings from ibdiagnet such as
# ls -l /var/tmp/ibdiagnet2/
ibdiagnet2.aguid
ibdiagnet2.db_csv
ibdiagnet2.debug
ibdiagnet2.log
ibdiagnet2.lst
ibdiagnet2.nodes_info
ibdiagnet2.pkey
ibdiagnet2.pm
ibdiagnet2.sm

3. To check out the errors in your Fabric Environment, check the generated ibdiagnet2.log file for more information.
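
For a quick overview of a long log, something like the following can count the flagged lines. This is only a sketch: it assumes that error and warning lines in ibdiagnet2.log start with "-E-" and "-W-" respectively, which may differ between ibdiagnet versions, and the helper name summarize_ibdiag_log is hypothetical.

```shell
# Hypothetical helper: count lines tagged as errors (-E-) and warnings (-W-)
# in an ibdiagnet log. The tag format is an assumption; adjust as needed.
summarize_ibdiag_log() {
    log=$1
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 1; }
    # grep -c prints 0 but exits non-zero when nothing matches, hence || true.
    errors=$(grep -c -e '^-E-' "$log" || true)
    warnings=$(grep -c -e '^-W-' "$log" || true)
    echo "errors=$errors warnings=$warnings"
}

# Default log location from the listing above.
if [ -r /var/tmp/ibdiagnet2/ibdiagnet2.log ]; then
    summarize_ibdiag_log /var/tmp/ibdiagnet2/ibdiagnet2.log
fi
```

To see the actual messages rather than counts, run grep with the same pattern but without -c.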

References:
  1. RDMA Aware Networks Programming User Manual (pdf)
  2. Mellanox OFED for Linux User Manual (pdf)

Tuesday, January 7, 2014

Commercial Download for Intel Compilers

If you are downloading the commercial version of the Intel Compilers, log on to the Intel Registration Center at https://registrationcenter.intel.com/RegCenter/MyProducts.aspx

Friday, January 3, 2014

Deleting PBS and MAUI Jobs which cannot be purged

If the pbs_mom on a compute node is lost and cannot be recovered (due to hardware or network failure), you can purge the stuck running job from the qstat or showq output as follows:

1. Shut down the pbs_server daemon on the PBS server
# service pbs_server stop
2. Remove the job spool files that belong to the hung job ID (for example, 4444)
# rm /var/spool/torque/server_priv/jobs/4444.headnode.SC
# rm /var/spool/torque/server_priv/jobs/4444.headnode.JB
3. Start the pbs_server daemon
# service pbs_server start
4. Restart the MAUI Daemon
# service maui restart
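
Since these steps delete files from the server spool, it can be worth printing the exact commands before running anything. The sketch below does only that. The spool path and the "headnode" server name follow the example above and are assumptions about your site; the function name purge_job_plan is hypothetical.

```shell
# Spool directory from the example above; override SPOOL if yours differs.
SPOOL=${SPOOL:-/var/spool/torque/server_priv/jobs}

# Print (do not run) the commands needed to purge job $1 on server $2.
purge_job_plan() {
    jobid=$1
    server=${2:-headnode}
    echo "service pbs_server stop"
    echo "rm $SPOOL/$jobid.$server.SC"
    echo "rm $SPOOL/$jobid.$server.JB"
    echo "service pbs_server start"
    echo "service maui restart"
}

purge_job_plan 4444
```

After reviewing the output, the same plan can be executed as root, for example with "purge_job_plan 4444 | sh -x".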
References:
  1. Deleting PBS/Maui Jobs

Wednesday, January 1, 2014

Compiling Chelsio iWARP Drivers (2.8.0.0) on Chelsio T4 Cards on CentOS 5

The following is a subset of the Chelsio 2.8.0.0 ReadMe.

The Chelsio Unified Wire software has been developed to run on 64-bit Linux based platforms. Following is the list of Drivers/Software and supported Linux distributions.

The OS I used was CentOS 5.8

|########################|#####################################################|
|   Linux Distribution   |                Driver/Software                      |
|########################|#####################################################|
|RHEL5.8,2.6.18-308.el5  |NIC/TOE,vNIC,iWARP,WD-UDP*,WD-TOE*,iSCSI Target*,    |
|                        |Bonding,IPv6,Bypass*,Sniffer & Tracer                |
|                        |UM(Agent,Client),UDP-SO,Filtering,TM                 |
|------------------------|-----------------------------------------------------|
|RHEL5.9,2.6.18-348.el5  |NIC/TOE*,vNIC*,iWARP*,WD-UDP*,WD-TOE*,iSCSI Target*, |
|                        |Bonding*,IPv6*,Bypass*,Sniffer & Tracer*,UDP-SO*,    |
|                        |Filtering*,TM*                                       |
|------------------------|-----------------------------------------------------|
|RHEL6.3,                |NIC/TOE,vNIC,iWARP,WD-UDP,WD-TOE*,iSCSI Target*,     |
|2.6.32-279.el6          |iSCSI Initiator*,FCoE Initiator*,                    |
|                        |Bonding,IPv6,Bypass*,Sniffer & Tracer,UDP-SO,        |
|                        |UM(Agent,Client,WebGUI),Filtering,TM                 |
|------------------------|-----------------------------------------------------|
|RHEL6.4,                |NIC/TOE,vNIC,iWARP,WD-UDP,WD-TOE,iSCSI Target,       |
|2.6.32-358.el6          |iSCSI Initiator,FCoE Initiator,Bonding,IPv6,Bypass,  |
|                        |Sniffer & Tracer,UDP-SO,UM(Agent,Client,WebGUI),     |
|                        |Filtering,TM,uBoot(DUD)                              |
|------------------------|-----------------------------------------------------|

Strangely, I was not able to compile with OFED 3.5.1. It seems that compat-rdma in OFED 3.5.1 has issues with CentOS 5.8. See Failed to build compat-rdma RPM when compiling OFED 3.5.1 on CentOS 5.8

I tried OFED 1.5.4.1, but errors occurred as well. OFED 1.5.3.2, however, compiled cleanly, and the Chelsio T420-BCH drivers built without problems against it. To download OFED 1.5.3.2, visit the OFED Downloads Site


Part 1

To compile from source

 i.  Download the tarball ChelsioUwire-x.x.x.x.tar.gz

ii. Untar the tarball
  
[root@host]# tar zxvfm ChelsioUwire-x.x.x.x.tar.gz
 
iii. Change your current working directory to Chelsio Unified Wire package
     directory. Build the source:

[root@host]# make
 
iv. Install the drivers, tools and libraries:
  
[root@host]# make install
 
v. The default configuration tuning option is Unified Wire.
   The configuration tuning can be selected using the following commands:

[root@host]# make CONF=<T5/T4 Configuration>
[root@host]# make CONF=<T5/T4 Configuration> install

(where <T5/T4 Configuration> is one of
UNIFIED_WIRE, HIGH_CAPACITY_TOE, HIGH_CAPACITY_RDMA, LOW_LATENCY, UDP_OFFLOAD, T5_WIRE_DIRECT_LATENCY)


Part 2  - Installing Individual Drivers

i. To build and install iWARP driver against outbox OFED:
[root@host]# make iwarp
[root@host]# make iwarp_install


Part 3a - Loading iWARP Drivers Manually

To load the iWARP driver we need to load the NIC driver & core RDMA drivers first:
[root@host]# modprobe cxgb4
[root@host]# modprobe iw_cxgb4
[root@host]# modprobe rdma_ucm
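
To confirm that all three modules are actually loaded, the kernel's module list can be checked. The following is a sketch that reads a /proc/modules-style file, where the first field of each line is the module name; the helper name missing_modules is hypothetical.

```shell
# Hypothetical helper: print each requested module that does NOT appear
# as the first field of any line in the given /proc/modules-style file.
missing_modules() {
    modfile=$1; shift
    for m in "$@"; do
        awk -v m="$m" '$1 == m { found = 1 } END { exit !found }' "$modfile" \
            || echo "$m"
    done
}

if [ -r /proc/modules ]; then
    missing_modules /proc/modules cxgb4 iw_cxgb4 rdma_ucm
fi
```

No output means all three modules are loaded. Any name printed is a module that still needs a modprobe.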

Part 3b - Loading iWARP Drivers Automatically
To load the Chelsio iWARP drivers automatically, add these lines to /etc/modprobe.conf

options iw_cxgb4 peer2peer=1
install cxgb4 /sbin/modprobe -i cxgb4; /sbin/modprobe -f iw_cxgb4; /sbin/modprobe rdma_ucm
alias eth1 cxgb4 # assuming eth1 is used by the Chelsio interface


Finally, reboot the system to load the new modules.

References:
  1. Chelsio 2.8.0.0 ReadMe