
CPU PANIC on TruCluster 5.1A



Oops, I missed the subject line.
Regards,
Bala

>  -----Original Message-----
> From: 	Sathiamoorthy Balasubramaniyan (ext_TCS)  
> Sent:	Friday, March 05, 2004 7:33 PM
> To:	Tru64 List (E-mail)
> Subject:	
> 
> Hello managers,
> We had a catastrophe on our 2-node TruCluster 5.1A yesterday.
> Both nodes crashed with a CPU panic (vrele: bad ref count).
> The nodes then rebooted automatically 3 times in a span of 30 minutes,
> and the logs for each reboot show the following:
> 
> On node1:
> ---------
> Mar  4 11:33:25 node1 vmunix: vrele: bad ref count: type VDIR, usecount 0
> Mar  4 11:33:26 node1 vmunix: 	tag VT_CFS, fsid ee4bcf0d,a
> Mar  4 11:33:26 node1 vmunix: panic (cpu 1): vrele: bad ref count
> Mar  4 11:33:26 node1 vmunix: syncing disks... 
> Mar  4 11:33:26 node1 vmunix: Memory trolling not supported, cpu Major id 11, Minor id 9
> Mar  4 11:33:26 node1 vmunix: Alpha boot: available memory from 0x582e000 to 0xffff4000
> Mar  4 11:33:26 node1 vmunix: Compaq Tru64 UNIX V5.1A (Rev. 1885); Sat Aug 2 22:25:02 MEST 2003
> 
> On node2:
> ----------
> Mar  4 11:33:19 node2 vmunix: vrele: bad ref count: type VDIR, usecount 0
> Mar  4 11:33:19 node2 vmunix: 	tag VT_CFS, refcnt 1 pvp fffffc00c6a54a00
> Mar  4 11:33:19 node2 vmunix: 	type VDIR, usecount 1
> Mar  4 11:33:19 node2 vmunix: 	panic (cpu 0): vrele: bad ref count
> Mar  4 11:33:19 node2 vmunix: syncing disks... 
> Mar  4 11:33:20 node2 vmunix: Memory trolling not supported, cpu Major id 11, Minor id 14
> Mar  4 11:33:20 node2 vmunix: Alpha boot: available memory from 0x5824000 to 0xffff4000
> Mar  4 11:33:20 node2 vmunix: Compaq Tru64 UNIX V5.1A (Rev. 1885); Sun Aug 3 11:34:31 MEST 2003
> 
> UERF on both systems:
> ---------------------
> ----- EVENT INFORMATION -----
> 
> EVENT CLASS                             ERROR EVENT 
> OS EVENT TYPE                  302.     PANIC 
> SEQUENCE NUMBER              14969.
> OPERATING SYSTEM                        DEC OSF/1 
> OCCURRED/LOGGED ON                      Thu Mar  4 11:21:39 2004
> OCCURRED ON SYSTEM                      node2 
> SYSTEM ID                 x000B0022
> SYSTYPE                   x00000000
> PROCESSOR COUNT                  2.
> PROCESSOR WHO LOGGED      x00000000
> MESSAGE                                 panic (cpu 0): vrele: bad ref count
> 
> 
> System information:
> -------------------
> Compaq AlphaServer DS20E 666 MHz with 4 GB RAM on both machines.
> Operating system: Tru64 5.1A, TruCluster 5.1A with patch kit 4.
> 
> 
> The system also saved the vmzcore files, but I have no idea how to extract
> the relevant information from them.
> 
> Please help me with some information on how to find the cause of this error.
>  
> 
> Thanks in advance,
> Bala
> 
> 