DEC PW600au CPU Panic



Hi folks,

We have a DEC PW600au (Personal Workstation 600au) running Digital UNIX
V4.0E, which has crashed several times in the past few days.
According to the crash data, the culprit seems to be a 'decsound' process:

[...]
14712   decsound
_kernel_process_status_end:
_current_pid:  14712
_current_tid:  0xfffffc0009b52700
_proc_thread_list_begin:
thread 0xfffffc0009b52700 stopped at  [boot:1931 ,0xfffffc00003e05dc]   
Source not available
_proc_thread_list_end:

Could anyone give me a hint as to whether this is really a CPU problem,
or perhaps the sound card or something else?
I would also appreciate any hints on what I could do for further
analysis (what do 'Machine Check Code = 98' and 'too many Processor
corrected errors detected on cpu 0' indicate?).

I have attached a few excerpts from /var/adm/messages and
/var/adm/crash/crash-data.0.

Thanks in advance & a Happy New Year,
Volker


/var/adm/messages (excerpt)
-----%<-----------------------

Jan  2 21:08:40 cmcszsws vmunix: Environmental Monitoring Subsystem Configured.
Jan  2 21:08:44 cmcszsws vmunix: mmsessprobe: IRQ channel = 5
Jan  2 21:08:44 cmcszsws vmunix: mmsess0 at isa0
Jan  2 21:08:44 cmcszsws vmunix: mmsess sound driver V4.1 configured
Jan  2 23:12:37 cmcszsws vmunix: WARNING: too many Processor corrected errors detected on cpu 0. Reporting suspended.
Jan  3 05:54:31 cmcszsws vmunix: Machine Check Processor Fatal Abort
Jan  3 05:54:31 cmcszsws vmunix: Machine Check Code = 98
Jan  3 05:54:31 cmcszsws vmunix: Processor detected hard error
Jan  3 05:54:31 cmcszsws vmunix:        pal temp[0-1]           = ffffffffffffffff 000003ffc0005bb0
Jan  3 05:54:31 cmcszsws vmunix:        pal temp[2-3]           = fffffc00003dcd40 0000000000005200
Jan  3 05:54:31 cmcszsws vmunix:        pal temp[4-5]           = 0000000000000001 000000000000ff00
Jan  3 05:54:31 cmcszsws vmunix:        pal temp[6-7]           = 000000000000fff2 fffffc00003dc660
Jan  3 05:54:31 cmcszsws vmunix:        pal temp[8-9]           = 1f1e161514020100 fffffc00003dca80
Jan  3 05:54:31 cmcszsws vmunix:        pal temp[10-11]         = 000003ff80008adc fffffc00003dc8e0
Jan  3 05:54:31 cmcszsws vmunix:        pal temp[12-13]         = fffffc00003dccb0 fffffffffff8da00
Jan  3 05:54:31 cmcszsws vmunix:        pal temp[14-15]         = 0000000000f00270 0000000000f0380c
Jan  3 05:54:31 cmcszsws vmunix:        pal temp[16-17]         = 0000009806700001 0000000000000000
Jan  3 05:54:31 cmcszsws vmunix:        pal temp[18-19]         = 000000011ffff9a0 ffffffff91457a38
Jan  3 05:54:31 cmcszsws vmunix:        pal temp[20-21]         = 0000000007098000 fffffc00003dcce0
Jan  3 05:54:31 cmcszsws vmunix:        pal temp[22-23]         = fffffc000056d180 0000000000c99a38
Jan  3 05:54:31 cmcszsws vmunix:        shadow[0-1]             = 0000000000000000 0000000000000000
Jan  3 05:54:31 cmcszsws vmunix:        shadow[2-3]             = 0000000000000000 0000000000000000
Jan  3 05:54:31 cmcszsws vmunix:        shadow[4-5]             = 0000000000000000 0000000000000000
Jan  3 05:54:31 cmcszsws vmunix:        shadow[6-7]             = 0000000000000000 0000000000000000
Jan  3 05:54:31 cmcszsws vmunix:        Address of excepting instruction        = 000003ff80008adc
Jan  3 05:54:31 cmcszsws vmunix:        Summary of arithmetic traps    = 0000000000000000
Jan  3 05:54:31 cmcszsws vmunix:        Exception mask                 = 0000000000000000
Jan  3 05:54:32 cmcszsws vmunix:        Base address for PALcode       = 0000000000018000
Jan  3 05:54:32 cmcszsws vmunix:        Interrupt Status Reg           = 0000000100000000
Jan  3 05:54:32 cmcszsws vmunix:        CURRENT SETUP OF EV5 IBOX      = 0000004166020000
Jan  3 05:54:32 cmcszsws vmunix:        I-CACHE Reg Tag parity error   = 0000000000000000
Jan  3 05:54:32 cmcszsws vmunix:        D-CACHE error Reg              = 0000000000000000
Jan  3 05:54:32 cmcszsws vmunix:        Effective VA            = 000003ff808b40f4
Jan  3 05:54:32 cmcszsws vmunix:        reason for D-stream     = 0000000000014290
Jan  3 05:54:32 cmcszsws vmunix:        EV5 Secondary Cache address    = ffffff000001d04f
Jan  3 05:54:32 cmcszsws vmunix:        EV5 Secondary Cache TAG/Data parity     = 0000000000000000
Jan  3 05:54:32 cmcszsws vmunix:        EV5 BC_TAG_ADDR         = ffffff80054d6fff
Jan  3 05:54:32 cmcszsws vmunix:        EV5 EI_STAT_ADDR Phys addr of Xfer      = ffffff000877a00f
Jan  3 05:54:32 cmcszsws vmunix:        Fill Syndrome           = 0000000000000017
Jan  3 05:54:32 cmcszsws vmunix:        EI_STAT reg             = fffffff945ffffff
Jan  3 05:54:32 cmcszsws vmunix:        LD_LOCK                 = ffffff000e3e3a4f
Jan  3 05:54:32 cmcszsws vmunix:        PYXIS_DMA_DATA          = 0000000000000000
Jan  3 05:54:32 cmcszsws vmunix:        CIA/PYXIS ERR                  = 0000000000000000
Jan  3 05:54:32 cmcszsws vmunix:        CIA/PYXIS ERR STAT             = 0000000000000000
Jan  3 05:54:32 cmcszsws vmunix:        CIA/PYXIS ERR MASK             = 0000000000000b93
Jan  3 05:54:32 cmcszsws vmunix:        CIA/PYXIS ECC_SYN              = 0000000000000000
Jan  3 05:54:32 cmcszsws vmunix:        CIA/PYXIS MEM ERR0             = 000000000001d540
Jan  3 05:54:32 cmcszsws vmunix:        CIA/PYXIS MEM ERR1             = 0000000058000000
Jan  3 05:54:32 cmcszsws vmunix:        CIA/PYXIS PCI ERR0             = 0000000002010002
Jan  3 05:54:32 cmcszsws vmunix:        CIA/PYXIS PCI ERR1             = 0000000000000071
Jan  3 05:54:32 cmcszsws vmunix:        ISA bridge NMI status & control = 0000000000000000
Jan  3 05:54:32 cmcszsws vmunix:        CIA/PYXIS PCI ERR2             = 0000000000000071
Jan  3 05:54:33 cmcszsws vmunix: panic (cpu 0): Processor Machine Check
Jan  3 05:54:33 cmcszsws vmunix: syncing disks... device string for dump = SCSI 0 1004 0 0 0 0 0.
Jan  3 05:54:33 cmcszsws vmunix: DUMP.prom: dev SCSI 0 1004 0 0 0 0 0, block 131072
Jan  3 05:54:33 cmcszsws vmunix: device string for dump = SCSI 0 1004 0 0 0 0 0.
Jan  3 05:54:33 cmcszsws vmunix: DUMP.prom: dev SCSI 0 1004 0 0 0 0 0, block 131072
Jan  3 05:54:33 cmcszsws vmunix: Alpha boot: available memory from 0xae0000 to 0xfffe000
Jan  3 05:54:33 cmcszsws vmunix: Digital UNIX V4.0E  (Rev. 1091); Thu Apr 22 17:44:56 GMT 1999
Jan  3 05:54:33 cmcszsws vmunix: physical memory = 256.00 megabytes.
Jan  3 05:54:33 cmcszsws vmunix: available memory = 245.41 megabytes.
Jan  3 05:54:33 cmcszsws vmunix: using 975 buffers containing 7.61 megabytes of memory
Jan  3 05:54:33 cmcszsws vmunix: Digital Personal WorkStation 600au
Jan  3 05:54:33 cmcszsws vmunix: Firmware revision: 6.9-7
Jan  3 05:54:33 cmcszsws vmunix: PALcode: Digital UNIX version 1.22-0
Jan  3 05:54:33 cmcszsws vmunix: pci0 at nexus
Jan  3 05:54:33 cmcszsws vmunix: tu0: DECchip 21143: Revision: 3.0
Jan  3 05:54:33 cmcszsws vmunix: tu0: auto negotiation capable device
Jan  3 05:54:33 cmcszsws vmunix: tu0 at pci0 slot 3
Jan  3 05:54:33 cmcszsws vmunix: tu0: DEC TULIP (10/100) Ethernet Interface, hardware address: 00-00-F8-76-5E-93
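
For anyone who wants to compare the register values across several
crashes, here is a rough sketch (Python) that pulls the 'name = value'
pairs out of a machine check dump like the one above. It also tolerates
entries whose value has wrapped onto the following line. The function
names unwrap() and registers() are made up for this illustration, and
the prefix pattern assumes the syslog format shown in this excerpt.

```python
import re

# Syslog prefix in front of every kernel message, e.g.
# "Jan  3 05:54:31 cmcszsws vmunix: " (format assumed from the excerpt).
PREFIX = re.compile(r"^[A-Z][a-z]{2}\s+\d+\s+\d\d:\d\d:\d\d\s+\S+\s+vmunix:\s?")
HEX64 = re.compile(r"[0-9a-f]{16}")


def unwrap(lines):
    """Join continuation lines (no syslog prefix) onto the previous entry."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        if PREFIX.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] = entries[-1].rstrip() + " " + line.strip()
    return entries


def registers(entries):
    """Return {name: [values]} for entries of the form 'name = hex [hex]'."""
    regs = {}
    for entry in entries:
        body = PREFIX.sub("", entry)
        name, sep, value = body.partition("=")
        words = value.split()
        # Only keep entries whose values are all 16-digit hex words.
        if sep and words and all(HEX64.fullmatch(w) for w in words):
            regs[" ".join(name.split())] = [int(w, 16) for w in words]
    return regs


sample = [
    "Jan  3 05:54:31 cmcszsws vmunix: Machine Check Code = 98",
    "Jan  3 05:54:31 cmcszsws vmunix:        pal temp[0-1]           =",
    "ffffffffffffffff 000003ffc0005bb0",
    "Jan  3 05:54:32 cmcszsws vmunix:        Fill Syndrome           =",
    "0000000000000017",
]
regs = registers(unwrap(sample))
print(hex(regs["Fill Syndrome"][0]))  # prints 0x17
```

Lines such as 'Machine Check Code = 98' are skipped on purpose, since
their value is not a 16-digit register word.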

/var/adm/crash/crash-data.0 (excerpt)
-----%<------------------------------

[...]
_dump_begin: 
>  0 boot() ["../../../../src/kernel/arch/alpha/machdep.c":1931, 0xfffffc00003e05dc]
nmp = 0xfffffc000056cf50
rs = -4398040821936
mycpu = 1
rpb = 0xfffffc0000565a58
rpb_cpu = (nil)
item_list = struct {
    function = 18446739675665624712
    out_flags = 256347296
    in_flags = 4294966272
    rtn_status = 18446739675665891140
    next_function = 0x3ff9e037cc0
    input_data = 0
    output_data = 18446739675919288960
}

   1 panic(0x578, 0x16, 0xfffffc000ff00800, 0xfffffc000ff00800, 0x1ea6b59) ["../../../../src/kernel/bsd/subr_prf.c":755, 0xfffffc00002844b0]

   2 thread_block() ["../../../../src/kernel/kern/sched_prim.c":2159, 0xfffffc00002b8654]
thread = 0xfffffc0009b52700
new_thread = 0xfffffc00001d6100
mycpu = 0
myprocessor = 0xfffffc00001d6100
s = 5
pset = 0xfffffc000056cf50

   3 thread_preempt(thread = 0x26, processor = 0xfffffc00001d6100) ["../../../../src/kernel/kern/sched_prim.c":4048, 0xfffffc00002bb034]
s = 2
pset = 0xfffffc0000593560

   4 boot() ["../../../../src/kernel/arch/alpha/machdep.c":1876, 0xfffffc00003e04bc]
nmp = 0xfffffc000056cf50
rs = -4398040821936
mycpu = 5427584
rpb = 0xfffffc0000565a58
rpb_cpu = 0x1ea6b59
item_list = struct {
    function = 436338788
    out_flags = 1
    in_flags = 0
    rtn_status = 18446744069414584320
    next_function = 0x376a1f1600000001
    input_data = 6366207712153912073
    output_data = 4981061741327700809
}
   5 panic(0x0, 0x1f, 0x1a000000, 0x1a020064, 0x1) ["../../../../src/kernel/bsd/subr_prf.c":842, 0xfffffc0000284664]

   6 machcheck(0x1, 0x0, 0x6c994f, 0x20000001a, 0xffffffff91457930) ["../../../../src/kernel/arch/alpha/hal/eb164.c":3096, 0xfffffc000040ffa4]

   7 mach_error(0x6c994f, 0x20000001a, 0xffffffff91457930, 0xfffffc0000006068, 0xfffffc00003dc9f0) ["../../../../src/kernel/arch/alpha/hal/cpusw.c":1027, 0xfffffc00003f187c]

   8 _XentInt(0x8, 0x3ff80008adc, 0x3ffc0008720, 0x3ffc000b900, 0x120003c48) ["../../../../src/kernel/arch/alpha/locore.s":1339, 0xfffffc00003dc9ec]

_dump_end: 
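
Some fields in the dump above are printed as signed or unsigned decimal
numbers, which makes them hard to compare with the hexadecimal addresses
next to them. A small sketch (Python) for converting them to 64-bit hex;
the helper name as_hex64() is made up for this illustration. Converted
this way, rs = -4398040821936 turns out to be the same bit pattern as
nmp in the same frame. What that means, if anything, I cannot say.

```python
MASK64 = (1 << 64) - 1


def as_hex64(value):
    """Format a signed or unsigned decimal as a 64-bit hexadecimal word."""
    return f"0x{value & MASK64:016x}"


print(as_hex64(-4398040821936))        # rs in frames 0 and 4 -> 0xfffffc000056cf50
print(as_hex64(18446739675665624712))  # item_list.function, frame 0 -> 0xfffffc0000276e88
```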
----------------------------------------------------------------------
      __ __
     / //_ \  Volker Becker	        email: becker@xxxxxxxxxxxxxx
 ___/ //   /  Deutsche Flugsicherung    www:   http://www.dfs.de
 \   //__  \  Rintheimer Querallee 6    phone: +49-721-6903-326
  \_//_____/  D-76131 Karlsruhe	        fax:   +49-721-6903-247
----------------------------------------------------------------------