Saturday, March 21, 2015

Unable to Submit via Torque Submission Node - Socket_Connect Error for Torque 4.2.7

I am using Torque Server version 4.2.7 and was trying to configure a submission node. Here is a sample of my qmgr -c "p s" output. The firewall allows the necessary traffic.

# qmgr -c "p s"
.......... 
set server acl_hosts = submission_node.cluster.spms.ntu.edu.sg
set server acl_hosts += head_node.cluster.spms.ntu.edu.sg
set server submit_hosts = submission_node.cluster.spms.ntu.edu.sg
set server submit_hosts += head_node.cluster.spms.ntu.edu.sg
set server allow_node_submit = True 
.......
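
For reference, settings like the ones above are normally added with qmgr "set server" commands run as root on the head node. The sketch below only prints the commands for a given host so they can be reviewed first. The helper name print_submit_host_cmds is hypothetical and not part of Torque.

```shell
#!/bin/sh
# Hypothetical helper (not part of Torque): print the qmgr commands that
# would append one host to acl_hosts and submit_hosts.
print_submit_host_cmds() {
    host="$1"
    printf 'qmgr -c "set server acl_hosts += %s"\n' "$host"
    printf 'qmgr -c "set server submit_hosts += %s"\n' "$host"
}

# Dry run only: nothing is changed on the server.
print_submit_host_cmds submission_node.cluster.spms.ntu.edu.sg
```

Run the printed commands on the head node once they look right, then confirm with qmgr -c "p s".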

After I ssh into the submission_node and try to submit a job as a regular user, I get the errors below. Yes, the submission_node has been configured as a conventional client.

socket_connect error (VERIFY THAT trqauthd IS RUNNING)
Error in connection to trqauthd (15137)-[could not connect to unix socket /tmp/trqauthd-unix: 111]
socket_connect error (VERIFY THAT trqauthd IS RUNNING)
Error in connection to trqauthd (15137)-[could not connect to unix socket /tmp/trqauthd-unix: 111]
socket_connect error (VERIFY THAT trqauthd IS RUNNING)
Error in connection to trqauthd (15137)-[could not connect to unix socket /tmp/trqauthd-unix: 111]
Unable to communicate with head_node(10.10.10.20)
Communication failure. qsub: cannot connect to server head_node (errno=15137) could not connect to trqauthd
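
On Linux, error 111 is ECONNREFUSED, which is consistent with nothing listening on the trqauthd socket. A minimal sketch for checking this on the submission node is shown here. The socket path is taken from the error message above, and the helper name socket_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether a path is an existing unix socket.
# The default path comes from the error message in this post.
socket_status() {
    sock="${1:-/tmp/trqauthd-unix}"
    if [ -S "$sock" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# pgrep exits non-zero when no process named exactly trqauthd is found.
if pgrep -x trqauthd >/dev/null 2>&1; then
    echo "trqauthd process: running"
else
    echo "trqauthd process: not running"
fi
echo "trqauthd socket: $(socket_status)"
```

If the process is not running, the client commands cannot authenticate to the server, regardless of whether a socket file is still present.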

According to the Torque 4.2.7 documentation, the submission node must have the trqauthd init script in /etc/init.d if you are using RHEL / CentOS. You can simply scp /etc/init.d/trqauthd from the head node to the submission node.

From the head_node
# scp -v /etc/init.d/trqauthd root@submission_node:/etc/init.d/

Create the /etc/hosts.equiv file on the head_node
# touch /etc/hosts.equiv
Put the host name of the Submission_Node in /etc/hosts.equiv on the head_node
submission_node 
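
To avoid duplicate entries when this step is repeated, the host can be appended only if it is not already listed. This is a small sketch, not something the Torque documentation prescribes, and the helper name add_equiv_host is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append a host name to a hosts.equiv-style file
# only when no line already matches it exactly.
add_equiv_host() {
    file="$1"
    host="$2"
    touch "$file"
    grep -qxF "$host" "$file" || echo "$host" >> "$file"
}

# Example (run as root on the head node):
#   add_equiv_host /etc/hosts.equiv submission_node
```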

On the Submission_Node, start the trqauthd service
# service trqauthd start

Now try submitting a job as a normal user.
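
A quick way to test this is to pipe a trivial job into qsub and then list your jobs. The sketch below assumes the Torque client commands are in PATH on the submission node. The helper name submit_test_job and the "sleep 30" job are only illustrative.

```shell
#!/bin/sh
# Hypothetical helper: submit a trivial job and list the current user's jobs.
submit_test_job() {
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found in PATH"
        return 1
    fi
    # qsub reads the job script from standard input when no file is given.
    echo "sleep 30" | qsub || return 1
    qstat -u "$(id -un)"
}

submit_test_job || echo "test submission failed"
```

If the setup is working, qsub prints a job identifier instead of the socket_connect errors shown earlier.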
