| Author |
Message |
Andrew Wasielewski
Guest
|
Posted:
Tue Sep 21, 2004 3:23 am Post subject:
SCSI RAID problems |
|
|
Hello SCSI friends,
I am having some strange/worrying problems with my SCSI RAID setup. Perhaps someone can help?
I have got a Gigabyte GA-KNXP Ultra m/b with embedded Adaptec AIC-7902 U320 SCSI controller. I have got a RAID 10 array made up of 4 x Maxtor Atlas IV 10k 36GB disks (SCSI IDs 3, 4, 5 & 6) on Channel A. The 72GB useable space is partitioned into a 4GB user partition + 4 others for paging file, apps & data (+ some free space left over). I run WinXP Pro SP1.
A little while ago I started to get signs of hardware problems:-
a. Errors in the Windows Event Log (ID 9) saying "The device, \Device\Scsi\a320raid1, did not respond within the timeout period." A few times the system crashed at about the same time;
b. Nasty "scrunching" sounds coming from the disks;
c. In the SCSI BIOS the array is tagged "Degraded". On drilling down, the disk on ID 4 is tagged "Degraded", all the others are "Optimal".
After a short while the errors & noise go away, so I assume the disk has finally died & prepare to RMA it. In the interim I get a different error in the event log (ID 7) saying "The device, \Device\Harddisk0\D, has a bad block." whenever (and only when) I try to back up drive C: to the Tandberg SLR7 tape drive using the WinXP Backup utility. (I get error ID 14 saying "The shadow copy of volume C: was aborted because of an IO failure." at the same time.) I can't find out much about these errors, other than that the first one usually means disk failure is imminent; however, as the O/S can only see the logical disk, not the individual physical devices, I put this down to a glitch from the disk failure. Apart from this, all apps appear to work normally as far as I can tell.
When I get the replacement disk I replace the "Degraded" one using the same SCSI ID & attempt a rebuild. However, this fails with "Due to read error on device ID 3". When I run Verify Media on disk ID 3 it finds 3 bad blocks; it says it has remapped these, but on re-running, both the array rebuild & Verify Media come up with the same errors. I put the "old" disk on Channel B, expecting it to be dead; however, it appears in the BIOS, and when I run Verify Media it comes up with 1 bad block. Again, remapping fails to fix it, but after a low-level format Verify Media finds no errors.
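To illustrate why a few bad blocks on ID 3 could stop the rebuild of ID 4, here is a toy model in Python. It assumes the rebuild is a straight block-by-block copy from the surviving copy of the data that aborts on the first unreadable block. That is an assumption for illustration only, not a description of how the Adaptec firmware actually works, and the function and block names are made up.

```python
def rebuild(source_blocks, bad_blocks):
    """Copy blocks from the surviving disk to the replacement disk.

    Returns (copied, failed_at). failed_at is the index of the first
    unreadable source block, or None if the whole copy completed.
    Assumption: the rebuild aborts on the first read error.
    """
    copied = []
    for index, block in enumerate(source_blocks):
        if index in bad_blocks:
            # The only remaining copy of this block cannot be read,
            # so there is nothing to write to the replacement disk.
            return copied, index
        copied.append(block)
    return copied, None


# Hypothetical 10-block source disk with one unreadable block.
source = [f"block-{i}" for i in range(10)]
copied, failed_at = rebuild(source, bad_blocks={6})
print(f"copied {len(copied)} blocks, failed at block {failed_at}")
```

In this model the state of the replacement disk is irrelevant: the copy can only be as good as the disk being read from, which would be consistent with a brand-new disk on ID 4 still failing to rebuild.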
So the disk I originally thought was dead now appears to be as good as new (so too, presumably, is the replacement); however, the *real* problem seems to lie with disk ID 3, even though the BIOS says it is optimal - and I can't rebuild the array. As the array is now running on only 3 disks out of 4 I am reluctant to do anything with ID 3 that might corrupt the array. Do I have any alternative other than to rebuild the array & reinstall everything? Fortunately I can still make backups of all partitions *except* C:, and I guess I can save anything I need from there on CD-R, but it is still a real pain...
Anyone have any ideas what is really going on, & how I can recover it? And how many, if any, dud disks do I have? The noises when the problem originally manifested definitely sounded very physical!
Thanks in advance for all help & advice.
Andrew |
|
|
 |
Anthony Preston
Guest
|
Posted:
Wed Sep 22, 2004 1:47 pm Post subject:
Re: SCSI RAID problems |
|
|
Are the disks hot?
I was getting SCSI disk errors because the 4 disks that I had configured were getting too hot and started to play up.
I have since added additional fans and now the disks are easy to touch and handle but before I added the fans I could hardly touch them.
"Andrew Wasielewski" <andrew@wasielewski.co.uk> wrote in message news:J_Gdndy_aMPG-9LcRVn-uQ@brightview.com...
[snip] |
|
|
 |
Mal
Guest
|
Posted:
Wed Sep 22, 2004 5:10 pm Post subject:
Re: SCSI RAID problems |
|
|
When you say you have a RAID 10 array setup ... do you mean RAID 1/0? ...
The only RAID setups I've come across are either RAID 0, RAID 1, RAID 1/0 or
RAID 5.
It sounds like you have a RAID 0/1 array, going by the drives you have and the
space available (RAID 5 would give you 108GB -- 3 x 36GB, with one drive's
worth of space going to parity) ... how is the mirroring set up? ... If the
drives on SCSI IDs 5 & 6 are a mirror of drives 3 & 4 then you should be able
to restore from them. If that's the case you could try rebuilding the array
using your original drive 4 and the replacement while keeping drive 3 intact.
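The capacity arithmetic behind that guess can be sketched in a few lines of Python. This is an idealised calculation that ignores controller metadata and formatting overhead; the function name `usable_gb` is made up for the example, and the RAID 5 figure assumes all four disks are in the set with one disk's worth of space used for parity.

```python
def usable_gb(level: str, disks: int, size_gb: int) -> int:
    """Idealised usable capacity in GB for a few common RAID levels."""
    if level == "0":
        # Pure striping: every disk contributes all of its space.
        return disks * size_gb
    if level == "1":
        # Pure mirroring: every disk holds the same data.
        return size_gb
    if level in ("10", "0+1"):
        # Striping plus mirroring: half the disks hold the second copy.
        if disks % 2:
            raise ValueError("striping + mirroring needs an even number of disks")
        return (disks // 2) * size_gb
    if level == "5":
        # Striping with parity: one disk's worth of space holds parity.
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * size_gb
    raise ValueError(f"unsupported RAID level: {level}")


for level in ("0", "10", "5"):
    print(f"RAID {level}: {usable_gb(level, disks=4, size_gb=36)}GB")
```

With four 36GB disks this gives 72GB for striping plus mirroring, which matches the 72GB reported in the original post.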
let us know how you get on
Mal |
|
|
 |
Tim Kelley
Guest
|
Posted:
Thu Sep 23, 2004 12:51 am Post subject:
Re: SCSI RAID problems |
|
|
In article <J_Gdndy_aMPG-9LcRVn-uQ@brightview.com>, Andrew Wasielewski wrote:
| Quote: | I am having some strange/worrying problems with my SCSI RAID setup.
Perhaps someone can help?
I have got a Gigabyte GA-KNXP Ultra m/b with embedded Adaptec AIC-7902
U320 SCSI controller. I have got a RAID 10 array made up of 4 x Maxtor
Atlas IV 10k 36GB disks (SCSI IDs 3, 4, 5 & 6) on Channel A. The 72GB
useable space is partitioned into a 4GB user partition + 4 others for
paging file, apps & data (+ some free space left over). I run WinXP Pro
SP1.
|
When you're having mystifying problems, try looking at heat and power
...
That's a lot of drives, and if you have a lot of other stuff, perhaps
your power supply can't handle it? That can cause all manner of
weirdness.
Are they too hot? What does the temp sensor on the drives say?
--
_ _ _ _ _ _ _ _ _ _ _ _ _
/ \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \
( t | i | m | @ | i | t | . | k | p | t | . | c | c )
\_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/
GPG key fingerprint = 1DEE CD9B 4808 F608 FBBF DC21 2807 D7D3 09CA 85BF |
|
|
 |
Andrew Wasielewski
Guest
|
Posted:
Thu Sep 23, 2004 3:09 am Post subject:
Re: SCSI RAID problems |
|
|
Disks don't feel noticeably warm, let alone hot. Case is a Lian-Li PC-71 with plenty of fans, so I don't think that can be it...
"Anthony Preston" <preston_PANTSanthony@hotmail.com> wrote in message news:41514a19@news.comindico.com.au...
Are the disks hot?
I was getting SCSI disk errors because the 4 disks that I had configured were getting too hot and started to play up.
I have since added additional fans and now the disks are easy to touch and handle but before I added the fans I could hardly touch them.
"Andrew Wasielewski" <andrew@wasielewski.co.uk> wrote in message news:J_Gdndy_aMPG-9LcRVn-uQ@brightview.com...
[snip] |
|
|
 |
Andrew Wasielewski
Guest
|
Posted:
Thu Sep 23, 2004 3:21 am Post subject:
Re: SCSI RAID problems |
|
|
RAID 0+1 is another name for my setup, i.e. striping + mirroring. Since the
4 disks are identical I presume there is a 1-to-1 correspondence between
the blocks on one side of the mirror and the other, as the mirroring logic
doesn't care whether & how they are striped. However, in that case I don't
know how the disks are paired off. There wasn't anywhere to specify it in
the array setup in the SCSI BIOS, & I can't see anything that displays it.
I am wary about finding out the hard way by disconnecting disk ID 3, as if
that turns out to be the currently non-redundant disk I don't want to risk
corrupting the array irretrievably. Or will it simply fail to recognise the
array at all in that case, until I put back the missing disk?
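To make the worry concrete, here is a minimal Python sketch that enumerates the three possible ways the four disks could be paired off and checks which pairings would still have a complete copy of the data if ID 3 were pulled while ID 4 is already out. It assumes each disk has exactly one counterpart holding the same blocks, as presumed above; the pairings are hypothetical, since the controller does not show the real one. The fact that the rebuild of ID 4 stopped on a read error from ID 3 hints that 3 and 4 may be counterparts, but that is an inference, not something the BIOS confirms.

```python
DISKS = (3, 4, 5, 6)  # SCSI IDs of the array members


def mirror_pairings(disks):
    """Yield every way to split four disks into two mirrored pairs."""
    first = disks[0]
    for partner in disks[1:]:
        rest = tuple(d for d in disks if d not in (first, partner))
        yield ((first, partner), rest)


def survives(pairing, missing):
    """Data is complete as long as each pair keeps at least one member."""
    return all(any(d not in missing for d in pair) for pair in pairing)


# ID 4 is already degraded; what happens if ID 3 is disconnected too?
for pairing in mirror_pairings(DISKS):
    status = "data intact" if survives(pairing, missing={3, 4}) else "data lost"
    print(pairing, status)
```

Only the pairing in which 3 and 4 mirror each other loses data in this model, which is exactly the case where ID 3 is the non-redundant disk.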
"Mal" <mincer2000@hotmail.com> wrote in message
news:415179b2$0$20244$cc9e4d1f@news-text.dial.pipex.com...
| Quote: | [snip]
|
|
|
|
 |
Tim
Guest
|
Posted:
Thu Sep 23, 2004 4:31 am Post subject:
Re: SCSI RAID problems |
|
|
Hi,
I can't see the original post for this due to ISP clobbering the
newsgroup....
What type of raid controller is it?
- Tim
"Andrew Wasielewski" <andrew@wasielewski.co.uk> wrote in message
news:wtGdnQLHvuJ_lc_cRVn-tw@brightview.com...
| Quote: | [snip]
|
|
|
|
 |
Andrew Wasielewski
Guest
|
Posted:
Thu Sep 23, 2004 10:25 am Post subject:
Re: SCSI RAID problems |
|
|
It's a Gigabyte GA-KNXP Ultra m/b with embedded Adaptec AIC-7902 - effectively the same thing as a 39320A-R card AFAIK.
Also posted on alt.comp.periphs.mainboard.gigabyte. I can't find an "Adaptec" Usenet group - does anyone know of one?
Thanks
"Tim" <Tim@NoSpam.com> wrote in message news:cit5ar$pj$1@lust.ihug.co.nz...
| Quote: | Hi,
I can't see the original post for this due to ISP clobbering the
newsgroup....
What type of raid controller is it?
- Tim
[snip]
|
|
|
|
 |
Folkert Rienstra
Guest
|
Posted:
Fri Sep 24, 2004 1:48 am Post subject:
Re: SCSI RAID problems |
|
|
"Tim" <Tim@NoSpam.com> wrote in message news:cit5ar$pj$1@lust.ihug.co.nz
| Quote: | Hi,
I can't see the original post for this due to ISP clobbering the newsgroup....
|
news:J_Gdndy_aMPG-9LcRVn-uQ@brightview.com
Some ISPs filter HTML posts.
You can find the message-ID in the header.
Cut and paste it to your address bar and type "news:" in front of it.
Or go to the topmost message and click the attribution line.
| Quote: |
What type of raid controller is it?
- Tim
"Andrew Wasielewski" <andrew@wasielewski.co.uk> wrote in message
news:wtGdnQLHvuJ_lc_cRVn-tw@brightview.com...
|
[snip] |
|
|
 |
|
|
|
|