Full Version: ASM diskgroups disappeared...
alp
Hello.
I'm in an awful situation. Today, after an update of the VMware ESX server, our RAC nodes failed. ASM failed to mount the diskgroups (I've checked the disk permissions, they are OK) with the following messages:

Mon Apr 6 15:11:41 2009
lmon registered with NM - instance id 2 (internal mem no 1)
Mon Apr 6 15:11:42 2009
Reconfiguration started (old inc 0, new inc 24)
ASM instance
List of nodes:
0 1
Global Resource Directory frozen
Communication channels reestablished
* allocate domain 1, invalid = TRUE
* domain 1 valid = 1 according to instance 0
* allocate domain 2, invalid = TRUE
* domain 2 valid = 1 according to instance 0
* allocate domain 3, invalid = TRUE
* domain 3 valid = 1 according to instance 0
Mon Apr 6 15:11:42 2009
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Mon Apr 6 15:11:42 2009
LMS 0: 0 GCS shadows cancelled, 0 closed
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Mon Apr 6 15:11:42 2009
LMS 0: 0 GCS shadows traversed, 0 replayed
Mon Apr 6 15:11:42 2009
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
LCK0 started with pid=15, OS id=4303
Mon Apr 6 15:11:43 2009
SQL> ALTER DISKGROUP ALL MOUNT
Mon Apr 6 15:11:43 2009
NOTE: cache registered group DATA number=1 incarn=0x43283dc2
NOTE: cache registered group FRA number=2 incarn=0x43583dc3
NOTE: cache registered group LOGS number=3 incarn=0x43583dc4
Mon Apr 6 15:11:43 2009
ERROR: no PST quorum in group 1: required 2, found 0
Mon Apr 6 15:11:43 2009
NOTE: cache dismounting group 1/0x43283DC2 (DATA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DATA was not mounted
Mon Apr 6 15:11:43 2009
ERROR: no PST quorum in group 2: required 2, found 0
Mon Apr 6 15:11:43 2009
NOTE: cache dismounting group 2/0x43583DC3 (FRA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup FRA was not mounted
Mon Apr 6 15:11:43 2009
ERROR: no PST quorum in group 3: required 2, found 0
Mon Apr 6 15:11:43 2009
NOTE: cache dismounting group 3/0x43583DC4 (LOGS)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup LOGS was not mounted

After rebooting each node, the ASM instances seem to have forgotten about the ASM diskgroups entirely.
select * from v$asm_diskgroup returned nothing.
I tried to recreate the LOGS diskgroup on the same disk, which made things worse: its data was lost (it is only REDO; I hope I can resurrect the FRA diskgroup and recover the database...). v$asm_disk shows the following:
SQL> select distinct path,group_number,INCARNATION,MOUNT_STATUS,HEADER_STATUS,MODE_STATUS from V$ASM_DISK;

PATH      GROUP_NUMBER  INCARNATION  MOUNT_S  HEADER_STATU  MODE_ST
/dev/sdc             0            0  CLOSED   PROVISIONED   ONLINE   // former DATA diskgroup member
/dev/sdd             1   3915939028  CACHED   MEMBER        ONLINE   // my attempt to recreate the diskgroup
/dev/sdf             0            0  CLOSED   PROVISIONED   ONLINE   // former FRA diskgroup member

kfed says the following on /dev/sdf:

kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 2147483648 ; 0x008: TYPE=0x8 NUMB=0x0
kfbh.check: 3801353830 ; 0x00c: 0xe2940e66
...
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
...
kfdhdb.compat: 168820736 ; 0x020: 0x0a100000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: FRA_0000 ; 0x028: length=8
kfdhdb.grpname: FRA ; 0x048: length=3
kfdhdb.fgname: FRA_0000 ; 0x068: length=8
kfdhdb.capname: ; 0x088: length=0
....

kfed says the following on /dev/sdc:
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 2147483648 ; 0x008: TYPE=0x8 NUMB=0x0
kfbh.check: 2743336053 ; 0x00c: 0xa383fc75
...
kfdhdb.compat: 168820736 ; 0x020: 0x0a100000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DATA_0000 ; 0x028: length=9
kfdhdb.grpname: DATA ; 0x048: length=4
kfdhdb.fgname: DATA_0000 ; 0x068: length=9
kfdhdb.capname: ; 0x088: length=0
...
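
Two header dumps like these can be compared field by field, as sketched below. This is only a sketch under assumptions: kfed_field_diff is a hypothetical helper name (not an Oracle tool), and it expects each dump to have been saved to a text file first with kfed read. Fields such as kfbh.check, kfdhdb.dskname and kfdhdb.grpname are expected to differ between disks; any other differing field is a candidate for a closer look.

```shell
# Hypothetical helper: print the kfed fields whose values differ between two
# saved dumps. Assumed preparation (not run here):
#   kfed read /dev/sdc > /tmp/sdc.kfed
#   kfed read /dev/sdf > /tmp/sdf.kfed
kfed_field_diff() {
    # $1, $2: text files holding "name: value ; offset: detail" lines
    awk '
        index($0, ":") == 0 { next }      # skip "..." and other non-field lines
        {
            name  = substr($0, 1, index($0, ":") - 1)
            rest  = substr($0, index($0, ":") + 1)
            semi  = index(rest, ";")
            value = (semi > 0) ? substr(rest, 1, semi - 1) : rest
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
        }
        FILENAME == ARGV[1] { first[name] = value; next }   # remember file 1
        (name in first) && first[name] != value {           # compare file 2
            printf "%s: %s -> %s\n", name, first[name], value
        }
    ' "$1" "$2"
}
```

Usage would look like kfed_field_diff /tmp/sdc.kfed /tmp/sdf.kfed; the output lists each differing field with the value from the first file followed by the value from the second.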

We don't use ASMLib.
The only good thing is that this database hasn't gone into production yet. However, a lot of work was done there and I have to recover it...
The backups were on the FRA diskgroup...

At the very least, is there any way to extract files from such misconfigured diskgroups?

P.S. I know that I need to contact Oracle Support, but our chiefs didn't want to spend money on Oracle support (so we even don't have Metalink) ... I know, it's silly, but that's how it is...
P.P.S. If it helps: I'm running CentOS 5.2 and Oracle 10gR2 (10.2.0.4.0), with RAC configured.
burleson
Hi Alp,

>> our chiefs didn't want to spend money on Oracle support (so we even don't have Metalink)

For the record, that's not "silly", that's INSANE, and you have my permission to tell them that I said so!

To spend money on RAC (assuming that they are not "borrowing" a RAC copy) without support is beyond malfeasance.

If you can, I would quit, because as soon as something goes wrong in production, you will without question be blamed . . .

**************************************
As to your issue, you need to troubleshoot why each node cannot see ASM via the CRS interface . . .

I would use Madhu Tumma's RAC troubleshooting checklist, from his book:

http://www.rampant-books.com/book_0401_10g_grid.htm

Steve
Hello,

This could potentially be a very convoluted problem, but to start simple, can you please tell me what happens when you run:

alter system set asm_diskstring = '/dev/sdc','/dev/sdd','/dev/sdf';
select * from v$asm_diskgroup;

You can find out more about the asm_diskstring parameter at http://www.dba-oracle.com/real_application...parameters.html
burleson
>> This could potentially be a very convoluted problem

Oh yeah! When you add in VMware, things get very complex.

You might find some tips in Dr. Scalzo's new book on Oracle on VMWare:

http://www.rampant-books.com/book_0801_oracle_vmware.htm

Andreas Fassl
QUOTE (burleson @ Apr 6 2009, 12:52 PM) *
>> This could potentially be a very convoluted problem

Oh yeah! When you add in VMware, things get very complex.

You might find some tips in Dr. Scalzo's new book on Oracle on VMWare:

http://www.rampant-books.com/book_0801_oracle_vmware.htm



And, worst of all, you won't get any support from Oracle.

Oracle on VMware isn't a supported configuration. Oracle has its own Xen-based VM solution.

From Metalink Note 249212.1:

"Support Status for VMware Virtualized Environments
--------------------------------------------------
Oracle has not certified any of its products on VMware virtualized
environments. Oracle Support will assist customers running Oracle products
on VMware in the following manner: Oracle will only provide
support for issues that either are known to occur on the native OS, or
can be demonstrated not to be as a result of running on VMware."

Next thing:

CentOS isn't a supported OS. Please use Red Hat, SUSE or Oracle Unbreakable Linux.

I hate to say this, but this is a typical example of saving money exactly where you shouldn't.

What I'd try to do:
- Try to bring the ASM diskgroups back to life (any more information about them?).
- If that doesn't work, build a supported single-instance configuration and reload the database from the backups (to be honest, I have a feeling I already know your answer on the backup question).

What cluster file system are you using? OCFS2?

What does the ASM-alert log tell you?

Best regards

Andreas
alp
QUOTE (Steve @ Apr 6 2009, 01:44 PM) *
Hello,

This could potentially be a very convoluted problem, but to start simple, can you please tell me what happens when you run:

alter system set asm_diskstring = '/dev/sdc','/dev/sdd','/dev/sdf';
select * from v$asm_diskgroup;


GROUP_NUMBER NAME SECTOR_SIZE BLOCK_SIZE ALLOCATION_UNIT_SIZE STATE TYPE TOTAL_MB FREE_MB REQUIRED_MIRROR_FREE_MB USABLE_FILE_MB OFFLINE_DISKS U COMPATIBILITY DATABASE_COMPATIBILITY
0 LOGS 512 4096 1048576 DISMOUNTED 0 0 0 0 0 N
0.0.0.0.0 0.0.0.0.0


So I see only the unsuccessfully recreated LOGS diskgroup and none of the old diskgroups.

The voting disk, the OCR configuration and the ASM spfile are stored on a shared ocfs2 filesystem. The RAC1 and RAC2 spfiles, controlfiles and data were stored on the DATA ASM diskgroup; backups and archivelogs were on the FRA ASM diskgroup.
And that's the biggest problem: the backups were also stored in ASM.
Andreas Fassl
QUOTE (alp @ Apr 6 2009, 03:21 PM) *
GROUP_NUMBER NAME SECTOR_SIZE BLOCK_SIZE ALLOCATION_UNIT_SIZE STATE TYPE TOTAL_MB FREE_MB REQUIRED_MIRROR_FREE_MB USABLE_FILE_MB OFFLINE_DISKS U COMPATIBILITY DATABASE_COMPATIBILITY
0 LOGS 512 4096 1048576 DISMOUNTED 0 0 0 0 0 N
0.0.0.0.0 0.0.0.0.0


So I see only the unsuccessfully recreated LOGS diskgroup and none of the old diskgroups.

The voting disk, the OCR configuration and the ASM spfile are stored on a shared ocfs2 filesystem. The RAC1 and RAC2 spfiles, controlfiles and data were stored on the DATA ASM diskgroup; backups and archivelogs were on the FRA ASM diskgroup.
And that's the biggest problem: the backups were also stored in ASM.


Hi Alp,

again, to be very, very honest: don't invest anything more in this setup, which isn't in production yet.

- VMware isn't supported as a virtualization platform by Oracle
- CentOS isn't supported by Oracle
- I doubt that all this is properly licensed (including the entitlement to use patch level 10.2.0.4)

To take a positive approach, you should treat this as a lesson in why supported, maintained configurations are important.

1) Choose a supported OS, for example Oracle Unbreakable Linux (a Red Hat derivative).
2) If you want to use a VM, use Oracle VM.

Neither is really expensive: there are no "licenses" to buy, only support contracts (because of their open source heritage).

3) Get a properly licensed setup plus a maintenance contract. Oracle support is available 24x7, follow-the-sun. And if you have an urgent problem, they really try to help you, including handing over to the next available specialist.

4) Plan and choose a reliable backup strategy. Don't keep your backups only inside the same box. Do regular exports. Do a full cold backup from time to time. Test the restore!!!! I've seen so many IDS situations (IDS = in deep shit) where customers realize that they never had a valid backup.

5) Be happy that this happened before the database went into production.

6) Check whether your hardware setup is really usable for RAC, including three Ethernet devices per "node".

7) And: I can't see any benefit in a virtualized production database server in a RAC configuration, except as some sort of training environment or proof of concept.

Best regards

Andreas
alp
QUOTE (Andreas Fassl @ Apr 6 2009, 02:17 PM) *
What does the ASM-alert log tell you?


Here is an alert.log extract (captured while starting one ASM instance):

Tue Apr 7 00:43:31 2009
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 eth1 10.2.2.0 configured from OCR for use as a cluster interconnect
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/product/10.2.0/db_1/dbs/arch
Autotune of undo retention is turned off.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
ksdpec: called for event 13740 prior to event group initialization
Starting up ORACLE RDBMS Version: 10.2.0.4.0.
System parameters with non-default values:
large_pool_size = 12582912
spfile = /u01/shared_config/spfile+ASM.ora
instance_type = asm
cluster_database = TRUE
instance_number = 1
remote_login_passwordfile= EXCLUSIVE
background_dump_dest = /u01/app/oracle/admin/+ASM/bdump
user_dump_dest = /u01/app/oracle/admin/+ASM/udump
core_dump_dest = /u01/app/oracle/admin/+ASM/cdump
asm_diskstring = /dev/sdc, /dev/sdd, /dev/sdf
asm_diskgroups = FRA, LOGS, DATA
Cluster communication is configured to use the following interface(s) for this instance
10.2.2.1
Tue Apr 7 00:46:30 2009
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
PMON started with pid=2, OS id=24670
DIAG started with pid=3, OS id=24672
PSP0 started with pid=4, OS id=24674
LMON started with pid=5, OS id=24676
LMD0 started with pid=6, OS id=24678
LMS0 started with pid=7, OS id=24680
MMAN started with pid=8, OS id=24684
DBW0 started with pid=9, OS id=24686
LGWR started with pid=10, OS id=24688
CKPT started with pid=11, OS id=24690
SMON started with pid=12, OS id=24694
RBAL started with pid=13, OS id=24699
GMON started with pid=14, OS id=24701
Tue Apr 7 00:46:31 2009
lmon registered with NM - instance id 1 (internal mem no 0)
Tue Apr 7 00:46:31 2009
Reconfiguration started (old inc 0, new inc 1)
ASM instance
List of nodes:
0
Global Resource Directory frozen
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Tue Apr 7 00:46:31 2009
LMS 0: 0 GCS shadows cancelled, 0 closed
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Tue Apr 7 00:46:31 2009
LMS 0: 0 GCS shadows traversed, 0 replayed
Tue Apr 7 00:46:31 2009
Submitted all GCS remote-cache requests
Fix write in gcs resources
Reconfiguration complete
LCK0 started with pid=15, OS id=24703
Tue Apr 7 00:46:32 2009
SQL> ALTER DISKGROUP ALL MOUNT
Tue Apr 7 00:46:32 2009
NOTE: cache registered group DATA number=1 incarn=0xcb971bfc
NOTE: cache registered group FRA number=2 incarn=0xcbc71bfd
NOTE: cache registered group LOGS number=3 incarn=0xcbc71bfe
Tue Apr 7 00:46:32 2009
ERROR: no PST quorum in group 1: required 2, found 0
Tue Apr 7 00:46:32 2009
NOTE: cache dismounting group 1/0xCB971BFC (DATA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DATA was not mounted
Tue Apr 7 00:46:32 2009
ERROR: no PST quorum in group 2: required 2, found 0
Tue Apr 7 00:46:32 2009
NOTE: cache dismounting group 2/0xCBC71BFD (FRA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup FRA was not mounted
Tue Apr 7 00:46:32 2009
NOTE: Hbeat: instance first (grp 3)
Tue Apr 7 00:46:36 2009
NOTE: start heartbeating (grp 3)
NOTE: cache opening disk 0 of grp 3: LOGS_0000 path:/dev/sdd
Tue Apr 7 00:46:36 2009
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache mounting (first) group 3/0xCBC71BFE (LOGS)
* allocate domain 3, invalid = TRUE
Tue Apr 7 00:46:37 2009
NOTE: attached to recovery domain 3
Tue Apr 7 00:46:37 2009
NOTE: cache recovered group 3 to fcn 0.16
Tue Apr 7 00:46:37 2009
NOTE: opening chunk 1 at fcn 0.16 ABA
NOTE: seq=6 blk=8
Tue Apr 7 00:46:37 2009
NOTE: cache mounting group 3/0xCBC71BFE (LOGS) succeeded
SUCCESS: diskgroup LOGS was mounted
Tue Apr 7 00:46:39 2009
NOTE: recovering COD for group 3/0xcbc71bfe (LOGS)
SUCCESS: completed COD recovery for group 3/0xcbc71bfe (LOGS)

alp
QUOTE (Andreas Fassl @ Apr 6 2009, 04:59 PM) *
5) Be happy that this happened before the database went into production.

I am happy :)

QUOTE
6) Check whether your hardware setup is really usable for RAC, including three Ethernet devices per "node".

Why three?
We only need 1 for mgmt+client access and one more for interconnect.

QUOTE
7) And: I can't see any benefit in a virtualized production database server in a RAC configuration, except as some sort of training environment or proof of concept.

Why do you think so? It is very convenient to have clustered hosts for system maintenance and fault tolerance. And it is much easier to manage virtual servers...
Andreas Fassl
QUOTE (alp @ Apr 6 2009, 04:00 PM) *
Here is an alert.log extract (captured while starting one ASM instance):


Ok,

let's give it a try:
->
Cluster communication is configured to use the following interface(s) for this instance
10.2.2.1

- Please check, if all the configured interfaces are up and running.

Please post the network setup.

Your application setup seems to be weird:

Archivelogs go to:
/u01/app/oracle/product/10.2.0/db_1/dbs/arch
ASM+ Logs go to:
/u01/app/oracle/admin/+ASM/bdump

How have you configured your oracle home directories?
Andreas Fassl
QUOTE (alp @ Apr 6 2009, 04:08 PM) *
Why three?
We only need 1 for mgmt+client access and one more for interconnect.


From the RAC FAQ:

How many NICs do I need to implement RAC?

At minimum you need 2: external (public), interconnect (private). When storage for RAC is provided by Ethernet based networks (e.g. NAS/nfs or iSCSI), you will need a third interface for I/O so a minimum of 3. Anything else will cause performance and stability problems under load. From an HA perspective, you want these to be redundant, thus needing a total of 6.


You mentioned fault tolerance: for that, at least the cluster interconnect should be redundant, which makes 3.

Steve
For another simple question...can you:

ls -ltr /dev/sd*
alp
QUOTE (Andreas Fassl @ Apr 6 2009, 05:16 PM) *
- Please check, if all the configured interfaces are up and running.


QUOTE
- 10.2.2.0 is not usable as an IP address.

Yes, it's the network address (10.2.2.0/255.255.255.0).
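
The arithmetic behind this can be checked quickly: AND-ing each octet of an interface address with the netmask yields the network address. A minimal sketch, assuming plain dotted-quad IPv4 input; network_address is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: derive the network address from an IPv4 address and
# a netmask by AND-ing them octet by octet.
network_address() (
    # The body runs in a subshell, so the IFS change stays local.
    IFS=.
    # Split both dotted quads into eight positional parameters:
    # $1..$4 are the address octets, $5..$8 the mask octets.
    set -- $1 $2
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

# network_address 10.2.2.1 255.255.255.0   prints 10.2.2.0
```

With the interconnect address 10.2.2.1 from the alert log and the mask 255.255.255.0, the result is 10.2.2.0, the same value the alert log reports for eth1 from the OCR.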


QUOTE
Please post the network setup.

The network setup is right; the DBMS has been working with it for more than a month. CRS starts and the nodes see each other, but CRS can't start the ASM instance.
$ oifcfg iflist -p -n
eth0 X.X.245.192 UNKNOWN 255.255.255.224
eth1 10.2.2.0 PRIVATE 255.255.255.0
$ ifconfig -a
eth0 Link encap:Ethernet HWaddr xx:xx:xx:xx:xx # external networking
inet addr:X.X.245.192 Bcast:X.X.245.214 Mask:255.255.255.224
inet6 addr: fe80::250:56ff:fe97:503d/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:19627 errors:0 dropped:0 overruns:0 frame:0
TX packets:16897 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2087002 (1.9 MiB) TX bytes:3267731 (3.1 MiB)
Base address:0x1480 Memory:f4820000-f4840000

eth0:1 Link encap:Ethernet HWaddr xx:xx:xx:xx:xx # vip address
inet addr:X.X.245.216 Bcast:X.X.245.223 Mask:255.255.255.224
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Base address:0x1480 Memory:f4820000-f4840000

eth1 Link encap:Ethernet HWaddr yy:yy:yy:yy:yy
inet addr:10.2.2.2 Bcast:10.2.2.255 Mask:255.255.255.0
inet6 addr: fe80::250:56ff:fe97:1f48/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:75547 errors:0 dropped:0 overruns:0 frame:0
TX packets:96186 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:19095688 (18.2 MiB) TX bytes:30757069 (29.3 MiB)
Base address:0x14c0 Memory:f4840000-f4860000

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:64436 errors:0 dropped:0 overruns:0 frame:0
TX packets:64436 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:3323490 (3.1 MiB) TX bytes:3323490 (3.1 MiB)

sit0 Link encap:IPv6-in-IPv4
NOARP MTU:1480 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

QUOTE
Your application setup seems to be weird:

Archivelogs go to:
/u01/app/oracle/product/10.2.0/db_1/dbs/arch
ASM+ Logs go to:
/u01/app/oracle/admin/+ASM/bdump

How have you configured your oracle home directories?

What is strange with such setup?
$ORACLE_HOME is /u01/app/oracle/product/10.2.0/db_1 (for both ASM and DB instances).
It doesn't matter how archive logs are configured on an ASM instance. And that's right, ASM logs go to /u01/app/oracle/admin/+ASM/bdump .
Andreas Fassl
QUOTE (Steve @ Apr 6 2009, 04:41 PM) *
For another simple question...can you:

ls -ltr /dev/sd*


And some output from the cluvfy utility would be very helpful.
------------------------------------------

cluvfy comp nodecon -n node1,node2

cluvfy comp admprv -n all -o user_equiv -verbose

cluvfy comp -list

alp
QUOTE (Steve @ Apr 6 2009, 05:41 PM) *
For another simple question...can you:

ls -ltr /dev/sd*

$ ls -ltr /dev/sd*
brw-r----- 1 ora10g disk 8, 80 Apr 6 19:08 /dev/sdf
brw-r----- 1 root disk 8, 64 Apr 6 19:08 /dev/sde
brw-r----- 1 ora10g disk 8, 32 Apr 6 19:08 /dev/sdc
brw-r----- 1 root disk 8, 2 Apr 6 19:08 /dev/sda2
brw-r----- 1 root disk 8, 0 Apr 6 19:08 /dev/sda
brw-r----- 1 root disk 8, 1 Apr 6 19:08 /dev/sda1
brw-r----- 1 root disk 8, 16 Apr 6 19:08 /dev/sdb
brw-r----- 1 ora10g disk 8, 48 Apr 7 01:46 /dev/sdd

As I mentioned, I tried to recreate the LOGS diskgroup (/dev/sdd), but that only created a new diskgroup and destroyed the old one.
This new diskgroup is operating normally...
Andreas Fassl
QUOTE (alp @ Apr 6 2009, 04:45 PM) *
The network setup is right; the DBMS has been working with it for more than a month. CRS starts and the nodes see each other, but CRS can't start the ASM instance.
$ oifcfg iflist -p -n
eth0 X.X.245.192 UNKNOWN 255.255.255.224
eth1 10.2.2.0 PRIVATE 255.255.255.0

I'm not the one having trouble. I don't know whether the network setup is still correct after the VMware upgrade. I only see an "UNKNOWN" in your list. Anything like that should be inspected.

BTW, Oracle doesn't recommend RAC on a VM for production environments at all, only for test and development.

You probably need to learn this the hard way.
alp
QUOTE (Andreas Fassl @ Apr 6 2009, 05:49 PM) *
cluvfy comp nodecon -n node1,node2

$ cluvfy comp nodecon -n oracle10rac1-n1,oracle10rac1-n2

Verifying node connectivity

Checking node connectivity...

Node connectivity check passed for subnet "X.X.245.192" with node(s) oracle10rac1-n2,oracle10rac1-n1.
Node connectivity check passed for subnet "10.2.2.0" with node(s) oracle10rac1-n2,oracle10rac1-n1.

Suitable interfaces for VIP on subnet "X.X.245.192":
oracle10rac1-n2 eth0:X.X.245.214 eth0:X.X.245.216
oracle10rac1-n1 eth0:X.X.245.213 eth0:X.X.245.215

Suitable interfaces for the private interconnect on subnet "10.2.2.0":
oracle10rac1-n2 eth1:10.2.2.2
oracle10rac1-n1 eth1:10.2.2.1

Node connectivity check passed.

Verification of node connectivity was successful.
QUOTE
cluvfy comp admprv -n all -o user_equiv -verbose


Verifying administrative privileges

Checking user equivalence...

Check: User equivalence for user "oracle"
Node Name Comment
------------------------------------ ------------------------
oracle10rac1-n2 passed
oracle10rac1-n1 passed
Result: User equivalence check passed for user "oracle".

Verification of administrative privileges was successful.

Cluster is operating normally...
alp
Ave!!!! I found one difference in the kfed listings: kfdhdb.acdb.ub2spare.
I set it to 0 in the kfed dump and wrote the dump back to disk with kfed. Now I can mount my FRA!

$ kfed read /dev/sdf > /tmp/sdf.noop.mod
$ vi /tmp/sdf.noop.mod    # changed kfdhdb.acdb.ub2spare to 0
$ kfed op=write dev=/dev/sdf text=/tmp/sdf.noop.mod CHKSUM=YES

Now alter diskgroup fra mount works!!!
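
Because kfed op=write overwrites the disk header in place, it may be worth saving the original header block before an edit like this. A minimal sketch, assuming the header is the first 4096-byte block of the device (4096 being the BLOCK_SIZE reported by v$asm_diskgroup earlier in the thread); backup_disk_header is a hypothetical helper name, not an Oracle tool.

```shell
# Hypothetical helper: copy the first 4096-byte block of a device (or file)
# to a backup file, refusing to overwrite an existing backup.
backup_disk_header() {
    src=$1     # device to read, e.g. /dev/sdf
    dest=$2    # backup file to create
    if [ -e "$dest" ]; then
        echo "refusing to overwrite existing backup: $dest" >&2
        return 1
    fi
    # Read exactly one 4096-byte block from the start of the source.
    dd if="$src" of="$dest" bs=4096 count=1 2>/dev/null
}
```

For example, backup_disk_header /dev/sdf /tmp/sdf.header.orig would be run before the kfed write. Copying the block back with dd is just as destructive as the write itself, so it should be treated as a last resort.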

Recovering db...