
"kernel stack not valid halt" / CDROM device name corruption problem



I have a problem on a DS10L while attempting to boot the Tru64 5.1b install CDROM.
If I boot from the hard disk, the CDROM drive works OK when mounted from Tru64.

When I power on the system, the CD drive is reported correctly by 'show dev' at the SRM console:

	dqb0.0.1.13.0              DQB0                        CD-224E  9.5B            

If I try to boot the GENERIC 5.1b boot-linked kernel from CDROM, I get an error "kernel stack not valid halt":

	>>>boot dqb0 -fl a -fi GENERIC                                                  
	(boot dqb0.0.1.13.0 -file GENERIC -flags a)                                     
	block 0 of dqb0.0.1.13.0 is a valid boot block                                  
	reading 15 blocks from dqb0.0.1.13.0                                            
	bootstrap code read in                                                          
	base = 2c0000, image_start = 0, image_bytes = 1e00(7680)                        
	initializing HWRPB at 2000                                                      
	initializing page table at 1ffee000                                             
	initializing machine state                                                      
	setting affinity to the primary CPU                                             
	jumping to bootstrap code                                                       
	                                                                                
	UNIX boot - Wednesday October 16, 2002                                          
	                                                                                
	Loading GENERIC ...                                                             
	Loading at fffffc0000310000                                                     
	Linking 205 objects: 205                                                        
	halted CPU 0                                                                    
	                                                                                
	halt code = 2                                                                   
	kernel stack not valid halt                                                     
	PC = 0                                                                          


Now if I do 'show dev' again, I get a weird corruption of the device name:

	dqb0.0.1.13.0              DQB0        CD/224G " " " " " " " "  ;.7B" "         
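
Comparing the good and corrupted 'show dev' strings character by character may help narrow this down. The sketch below XORs the two model strings and the two revision strings. The helper name `diff_bits` is hypothetical, and padding the good model field with spaces to match the corrupted one is an assumption about how the console lays out the field.

```python
def diff_bits(good: str, bad: str):
    """Return (index, good_char, bad_char, xor) for every position that differs."""
    return [
        (i, g, b, ord(g) ^ ord(b))
        for i, (g, b) in enumerate(zip(good, bad))
        if g != b
    ]

# Fields copied from the two 'show dev' listings.
# Assumption: the good model field is space-padded on the right.
model_good = "CD-224E    "
model_bad = 'CD/224G " "'
rev_good = "9.5B"
rev_bad = ";.7B"

model_diffs = diff_bits(model_good, model_bad)
rev_diffs = diff_bits(rev_good, rev_bad)

for label, diffs in (("model", model_diffs), ("revision", rev_diffs)):
    for i, g, b, x in diffs:
        print(f"{label}[{i}]: {g!r} -> {b!r}  xor=0x{x:02x}")
```

With these inputs, every differing character sits at an even offset and differs by the same single bit (0x02), and the even-offset characters that did not change ('C' and '2') already have that bit set. That would be consistent with one data bit reading as 1 in every 16-bit word coming back from the drive, but that interpretation is a guess from two strings, not a diagnosis.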

If I do an 'init' after the problem has occurred, I get a subsequent self-test failure:

	Testing the Disks (read only)                                                   
                                                                                
	*** Hard Error - Error #8 -                                                     
	Diagnostic Name        ID             Device  Pass  Test  Hard/Soft   1-JAN-2000
	exer_kid         00000317      dqb0.0.1.13.0     0     0     1    0     12:00:01
	Buffer counts differ - buf1:0, buf2:512, location:2a00                          
                                                                                
	*** End of Error ***                                                            
                                                                                

I thought this might be a problem with the CDROM drive, so I swapped it for a drive from another system where it works OK, but the problem remains. I also tried changing the drive ribbon cable, but that didn't help either.

The only way to get back to a 'valid' device name is to cycle the power supply.

Any ideas?  I'm wondering if this is a fault on the motherboard.

thanks,
	Iain