Hardware iSCSI MPIO in ESX 6?

December 24, 2015, 4:54 pm

Latest and popular articles on VMWare Virtualization

Is there a trick to getting hardware MPIO working on iSCSI adapters in ESXi 6?

I have two QLogic iSCSI adapters (vmhba33 / vmhba34) that can attach to a QNAP storage array, but the LUN I present can only be seen by one NIC at a time. The LUN was created to allow multiple connections. If vmhba33 is connected, then vmhba34 doesn't see it. If I disable vmhba33 (i.e., shut down the port on the switch), then vmhba34 will discover it. I can't seem to get both NICs to see it at the same time, though.

If I create a software iSCSI adapter and bind vmnic0 & vmnic1 to it, then I get MPIO.

There are tons of documents out there about creating multipath iSCSI connections, but they all rely on the software iSCSI adapter. I haven't found any good resources for hardware iSCSI.

Thanks!!

↧

Migration from iscsi to Fiber and 5.5 to 6

December 29, 2015, 7:23 am

I have a project underway to move from iSCSI-connected storage to fiber-connected storage using 2 datastores, adding 2 new hosts and removing 4 old hosts. I have 8 hosts that I need to change, but I do not have the downtime to just shut everything down, migrate, and bring it back up, so I am trying to find the best possible approach for this move. So far I have moved 3 hosts over to fiber, but not the other 5. What I am running into is that I have DRS off, so I have to manually manage my VMs so that I'm not overtaxing any one host. I was thinking of making a temporary cluster for the 4 old hosts that I do not plan to update or use; that way I can keep my most critical data/VMs there while I prep and move all my other VMs to the new 6.0.0u1 fiber-connected hosts. I'm not sure that is the right way to approach this, and I'm sure I have missed something while typing this all out, but I'm trying to describe what I have and what I am trying to accomplish. I'm just looking for thoughts or ideas, as I have never made this kind of move before. Thanks for your time.

↧

Unable to mount an NFS datastore from Solaris 11 host

December 29, 2015, 8:46 pm

I have a directory on a Solaris 11 server that has a bunch of ISOs in it, and I want to make those available to my vSphere servers (ESXi 5.5U3). I've been doing the same from a Solaris 10 server for several years (ESXi 4.1 & 5.5) without issue. I have been able to successfully mount the Solaris 11 share on multiple Linux & Solaris client systems, but any attempt to mount it on my ESXi server fails.

Here's what I see at the ESXi command line:

~ # esxcfg-nas -a -o 192.168.1.33 -s /ptmp/iso iso
Connecting to NAS volume: iso
Unable to connect to NAS volume iso: Sysinfo error on operation returned status : Unable to connect to NFS server. Please see the VMkernel log for detailed error information

Looking in the vmkernel log, I see these messages:

2015-12-30T04:03:20.252Z cpu4:1173363)NFS: 157: Command: (mount) Server: (192.168.1.33) IP: (192.168.1.33) Path: (/ptmp/iso) Label: (iso) Options: (None)

2015-12-30T04:03:20.252Z cpu4:1173363)StorageApdHandler: 698: APD Handle a3b73984-acaaad7e Created with lock[StorageApd0x410a7b]

2015-12-30T04:03:20.258Z cpu4:1173363)StorageApdHandler: 745: Freeing APD Handle [a3b73984-acaaad7e]

2015-12-30T04:03:20.258Z cpu4:1173363)StorageApdHandler: 808: APD Handle freed!

2015-12-30T04:03:20.258Z cpu4:1173363)NFS: 168: NFS mount 192.168.1.33:/ptmp/iso failed: Unable to connect to NFS server.

I worked through KB article 1003967, and netcat etc. all seem to work OK for me.
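The netcat-style reachability check mentioned above can also be scripted. The following is a minimal sketch, not taken from the KB article: it assumes the conventional rpcbind/portmapper (111) and NFS (2049) TCP ports, and the helper names `tcp_port_open` and `check_nfs_server` are made up for illustration. It only confirms that a TCP connection can be opened from the machine it runs on; it says nothing about export permissions or NFS version negotiation.

```python
import socket

# Assumed ports: 111 (rpcbind/portmapper) and 2049 (nfsd).
NFS_PORTS = {"rpcbind": 111, "nfs": 2049}


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_nfs_server(host, ports=NFS_PORTS, timeout=3.0):
    """Map each service name to whether its TCP port accepted a connection."""
    return {name: tcp_port_open(host, port, timeout) for name, port in ports.items()}
```

For the server in this post, `check_nfs_server("192.168.1.33")` would return a dictionary with one True/False entry per service. A result of all True would match what netcat already showed and would point away from basic connectivity as the cause.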

Any ideas?

Thanks.

↧

All shared datastores failed on the host - warning after removing iSCSI Adapter

January 1, 2016, 10:28 am

Hi,

On my three hosts, I removed the one shared datastore that was connected via iSCSI by unmounting the datastore and then disabling the iSCSI adapter. After rebooting, I get a warning of "All shared datastores have failed on the host".

Is there a way to clear that old datastore from the system so that I do not receive this message anymore?

Thanks!

↧

Tegile | Tintri | Compellent - Evaluating for next SAN refresh

June 21, 2014, 8:16 am

I have quotes for all three - Tegile (HA2100,HA2300), Tintri (T650), and Compellent (SC8000).

I love how easy the Tintri was to set up and how it performs automatic I/O alignment. The main things I dislike about it are the lack of enterprise-grade hardware (consumer-grade MLC, SATA vs. SAS) and the lack of compression on the mechanical disks. The Tegile has enterprise-grade hardware (SAS, eMLC) and does deduplication/compression on all disks (SSD and HDD).

As for the Compellent, every storage vendor tells me they are bad and not to go with them, but I honestly feel it's a case of "that's my competitor, buy my stuff!"

Dell gave me a fantastic deal on a Compellent array with 12 TB of SSD (a 6-pack of 400 GB SLC and a 6-pack of 1.6 TB eMLC) and a tray (x24) of 1 TB 7200 rpm disks. I dislike that this array moves data on a schedule, once a day, in 4K pages. However, its saving grace is the large amount of SSD and the fact that they are adding compression into T3 this September.

Does anyone have experience with any of them? Right now I'm an all-HDS shop (Hitachi, AMS and HUS). I dislike these SANs. They are solid but offer absolutely no innovation.

Compression/deduplication is a huge deal for me, as most of my data is cold and definitely very compressible. I'm done with paying for space I don't need. Compression and deduplication are the answer. I should be able to put a 30 TB box in a 4U space that has the efficiency/capacity of a 120 TB SAN. I know the Tegile can do this, but the entry-level model they quoted me is pricey and has a low amount of SSD.
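As a rough illustration of the 30 TB versus 120 TB comparison, the implied data reduction ratio can be worked out as follows. This is only back-of-the-envelope arithmetic; the function names are made up for the example, and real ratios depend entirely on how compressible and duplicated the data is.

```python
def required_reduction_ratio(effective_tb, physical_tb):
    """Return the reduction ratio needed for physical_tb to hold effective_tb of data."""
    if physical_tb <= 0:
        raise ValueError("physical_tb must be positive")
    return effective_tb / physical_tb


def effective_capacity_tb(physical_tb, ratio):
    """Return the logical capacity of physical_tb at a given reduction ratio."""
    return physical_tb * ratio


# Figures from the post: a 30 TB box standing in for a 120 TB SAN.
ratio = required_reduction_ratio(120, 30)
print(f"Required reduction ratio: {ratio:.1f}:1")
```

With the numbers from the post, the box would need to sustain a combined compression/deduplication ratio of 4:1 across the data set.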

Tegile is not as reasonable with pricing as they would try to tell you.

↧

vSphere 5.5 U1 and IBMv7000, very slow performance on SvMotion and clone (VAAI)

April 4, 2014, 6:41 am

Hello,

I recently installed a vSphere 5.5 U1 environment with an IBM Storwize v7000. Migration from the IBM DS3500 storage looked very promising, with around 200-250 MBps write transfer on the v7000. But after the migration, SvMotion between two v7000 LUNs disappointed me. I googled a bit and found similar cases linked to VAAI and hardware acceleration. Yes, all v7000 datastores show hardware acceleration support, but it's not usable. I have the latest 7.2.0.4 firmware level on the v7000 and the latest vSphere. When I set hardware acceleration to 0 under Advanced Settings -> DataMover, SvMotion is far, far better than before. The VAAI performance is a little frustrating. Has anyone had a similar experience with the v7000?

Thanks,

↧

how to enable all connected hard disks in VMware?

January 5, 2016, 11:44 pm

how to enable all connected hard disks in VMware?

↧

vmotion vs svmotion - Throttle via vDS Port Group

January 7, 2016, 7:12 pm

Greetings,

Recently I was attempting to storage vmotion a guest's disk from local storage to our SAN. It reached about 52% and we started to get flooded with server alerts of systems timing out. The svmotion task eventually timed out and the systems went back to responding normally.

We utilize 2 vDS in our environment, one for iSCSI traffic to the SAN and another for production network traffic. We have a port group assigned for "vmotion" traffic, which is on our production network. In an effort to throttle the storage vmotion, I modified the port group that we have for the "vmotion" VLAN.

While running the storage vmotion, I ran esxtop to monitor the vmnic and noticed it was not utilizing the production vDS at all; however, I was seeing traffic across the iSCSI vDS. When I ran a standard vmotion, moving a guest from one host to another, I could see the traffic hitting the vmnic assigned for vmotion.

So my question is: how can we throttle back storage vmotion traffic if it's not utilizing the vmnic that has been assigned the vmotion role?

Cheers

↧

Multipath probing in VMware ESXi 6.x

January 8, 2016, 4:58 am

Basically, I am trying to understand how an ESXi host selects a path during a path state change event, i.e. how ESXi weights a path based on all the RTPG responses it receives from the available target ports.

I have an ESXi 6.x host with clustered storage (NetApp cDOT).

The scenario is a takeover event in a 4-node cluster where:

The 4-node cluster consists of one pair (Node1, Node2) and another pair (Node3, Node4).

Node3 and Node4 are out of quorum (OOQ), which means they cannot sync with the other nodes in the cluster.

Node1 takes over Node2, i.e. Node2 goes down after transferring LUN ownership to Node1.

TPG ID:

Node1 (1000/0x03E8)

Node2 (1001/0x03E9)

Node3 (1010/0x03F2)

Node4 (1011/0x03F3)

For a given LUN on Node2, the initial ALUA states in the RTPG data look as below:

RTPG to Node1

RTPG Data:

Node1 (1000/0x03E8) - ANO

Node2 (1001/0x03E9) - AO

Node3 (1010/0x03F2) - ANO

Node4 (1011/0x03F3) - ANO

RTPG to Node2

RTPG Data:

Node1 (1000/0x03E8) - ANO

Node2 (1001/0x03E9) - AO

Node3 (1010/0x03F2) - ANO

Node4 (1011/0x03F3) - ANO

RTPG to Node3

RTPG Data:

Node1 (1000/0x03E8) - ANO

Node2 (1001/0x03E9) - AO

Node3 (1010/0x03F2) - ANO

Node4 (1011/0x03F3) - ANO

RTPG to Node4

RTPG Data:

Node1 (1000/0x03E8) - ANO

Node2 (1001/0x03E9) - AO

Node3 (1010/0x03F2) - ANO

Node4 (1011/0x03F3) - ANO

Questions:

----------------

1. During the transition stage, after a check condition on an I/O command followed by RTPG responses with the new AO and ANO paths as below, why does ESXi continue to route I/O through the same path, which is now marked as ANO?

Is it because the last reported RTPG, from Node4, says the Node2 port (1001/0x03E9) is AO?

The ALUA states in the RTPG data look as below, listed in the sequence in which they were sent and received in the trace I analyzed:

RTPG to Node1

RTPG Data:

Node1 (1000/0x03E8) - AO (Changed from ANO due to takeover)

Node2 (1001/0x03E9) - ANO (Changed from AO due to takeover)

Node3 (1010/0x03F2) - Unavailable

Node4 (1011/0x03F3) - Unavailable

RTPG to Node2

RTPG Data:

Node1 (1000/0x03E8) - AO (Changed from ANO due to takeover)

Node2 (1001/0x03E9) - ANO (Changed from AO due to takeover)

Node3 (1010/0x03F2) - Unavailable

Node4 (1011/0x03F3) - Unavailable

RTPG to Node3

RTPG Data:

Node1 (1000/0x03E8) - Unavailable

Node2 (1001/0x03E9) - AO (No Change in path states because Node3 is out of quorum)

Node3 (1010/0x03F2) - Unavailable

Node4 (1011/0x03F3) - Unavailable

RTPG to Node4

RTPG Data:

Node1 (1000/0x03E8) - Unavailable

Node2 (1001/0x03E9) - AO (No Change in path states because Node4 is out of quorum)

Node3 (1010/0x03F2) - Unavailable

Node4 (1011/0x03F3) - Unavailable

2. After the takeover is completed for Node2 (i.e. Node2 is completely down), RSCNs for Node2 were received from the switch, followed by RTPGs with the states below reported by the target. Why does ESXi go into an endless loop of path probing/RTPGs? Is it because the last reported RTPG, from Node4, says the Node2 port (1001/0x03E9) is AO when it is really down and the host knows about it from the RSCN it received?

The ALUA states in the RTPG data look as below, listed in the sequence in which they were sent and received in the trace I analyzed:

No RTPG is sent to Node2, as it is down after the takeover.

RTPG to Node1

RTPG Data:

Node1 (1000/0x03E8) - AO

Node2 (1001/0x03E9) - Unavailable

Node3 (1010/0x03F2) - Unavailable

Node4 (1011/0x03F3) - Unavailable

RTPG to Node3

RTPG Data:

Node1 (1000/0x03E8) - Unavailable

Node2 (1001/0x03E9) - AO (No Change in path states because Node3 is out of quorum)

Node3 (1010/0x03F2) - Unavailable

Node4 (1011/0x03F3) - Unavailable

RTPG to Node4

RTPG Data:

Node1 (1000/0x03E8) - Unavailable

Node2 (1001/0x03E9) - AO (No Change in path states because Node4 is out of quorum)

Node3 (1010/0x03F2) - Unavailable

Node4 (1011/0x03F3) - Unavailable

3. And why does ESXi always probe, or send RTPGs to, the paths only in the order below?

Node1 (1000/0x03E8)

Node2 (1001/0x03E9)

Node3 (1010/0x03F2)

Node4 (1011/0x03F3)
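To make the disagreement described in question 1 easier to see, the RTPG responses listed there can be tabulated per target port group. This is only a sketch for organizing the trace data from this post; the names `RTPG_AFTER_TAKEOVER` and `conflicting_states` are illustrative, and it does not model how the ESXi ALUA handling actually weighs the responses.

```python
from collections import defaultdict

# RTPG responses from question 1, keyed by the node the RTPG was sent to.
# Each value maps a target port group ID to the state that node reported.
RTPG_AFTER_TAKEOVER = {
    "Node1": {1000: "AO", 1001: "ANO", 1010: "Unavailable", 1011: "Unavailable"},
    "Node2": {1000: "AO", 1001: "ANO", 1010: "Unavailable", 1011: "Unavailable"},
    "Node3": {1000: "Unavailable", 1001: "AO", 1010: "Unavailable", 1011: "Unavailable"},
    "Node4": {1000: "Unavailable", 1001: "AO", 1010: "Unavailable", 1011: "Unavailable"},
}


def conflicting_states(responses):
    """Return {tpg_id: {state: [reporting nodes]}} for TPGs reported inconsistently."""
    by_tpg = defaultdict(lambda: defaultdict(list))
    for reporter, states in responses.items():
        for tpg_id, state in states.items():
            by_tpg[tpg_id][state].append(reporter)
    # Keep only the TPGs for which more than one distinct state was reported.
    return {
        tpg_id: dict(states)
        for tpg_id, states in by_tpg.items()
        if len(states) > 1
    }


for tpg_id, states in sorted(conflicting_states(RTPG_AFTER_TAKEOVER).items()):
    print(f"TPG {tpg_id} (0x{tpg_id:04X}):")
    for state, nodes in states.items():
        print(f"  {state:<11} reported by {', '.join(nodes)}")
```

Only TPG 1000 and TPG 1001 come out as conflicting: Node1 and Node2 report 1000 as AO and 1001 as ANO, while the out-of-quorum Node3 and Node4 still report 1001 as AO and 1000 as Unavailable.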

↧

IBM DS3400 Hardware Acceleration (VAAI)

March 12, 2015, 10:49 pm

Hi there,

I have been searching for whether Hardware Acceleration (VAAI) is supported on the IBM DS3400. It appears to be (VAAI-Block HW Assisted Locking, Full Copy, Block Zero) according to VMware's compatibility guide, as follows:

VMware Compatibility Guide: Storage/SAN Search

But in our environment, all datastores are either showing "Unknown" or "Not supported". I understand that I have to set "1" for both options from "Datamover" on the Advanced Settings from the host.

However, am I still missing something here? Are there any settings that have to be done on the DS3400?

Thanks,

Grant

↧

Datastore running out of disk space

January 8, 2016, 8:23 am

Is anyone else having an issue where their datastore is running out of space? Mine is losing about 1 GB per day. I am running ESXi 5.5 and have two VMs, each with 1 TB thick provisioned. Is there an area I can check to remove logs or something? I have checked to make sure there are no snapshots or log files. Running out of time and space...

↧

vmkfstools usage help for white space reclaiming of NetApp storage

January 12, 2016, 6:51 am

Hi all,

I hope you can help with a couple of questions, as we want to reclaim some white space from our filers, but I've got various bits of info from other sources and would like some clarification:

1 - I keep reading that with this tool you can release xx% of your free space. Does this mean that if I think I can reclaim 25% of my total storage of 4TB, I can run the command with -y 25 and reclaim 1TB, or will it reclaim 256GB (i.e. 25% of 1TB)? Can this reclamation process be repeated until I get close to reclaiming all of the white space (not all of it, I understand)?

2 - I have heeded the warnings about not trying to reclaim 100% of free space, because you can fill up your storage during the process. Can I tell vmkfstools to use an external HDD/storage area to run the reclamation, i.e. an old server with several large HDDs, all formatted to the required protocol?

As for everything else recommended or advised, we are ready to go. We're just a bit unsure of how often this can be done, how close to the full size we can go and, very importantly, the last part above: can we use an external source to do the work of the white space reclamation, i.e. like a page file/swap system?
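The two readings of the percentage in question 1 give quite different numbers. As a rough sketch using the figures from this post (4TB total, and the 1TB of free space implied by the "25% of 1TB" remark), the arithmetic looks like this. Which base the -y percentage applies to is exactly what is being asked, so both are computed; the function name is made up for illustration.

```python
def reclaim_target_gb(base_gb, percent):
    """Return the amount of space, in GB, that `percent` of `base_gb` represents."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return base_gb * percent / 100


TOTAL_GB = 4 * 1024  # 4TB of total storage, from the post
FREE_GB = 1 * 1024   # assumed: the 1TB implied by "25% of 1TB"
PERCENT = 25         # the value passed as -y 25

print("25% of total capacity:", reclaim_target_gb(TOTAL_GB, PERCENT), "GB")
print("25% of free space:    ", reclaim_target_gb(FREE_GB, PERCENT), "GB")
```

The first reading gives 1024 GB (1TB) and the second gives 256 GB, which are the two figures mentioned in question 1.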

Many thanks for any help you can provide,

Daiman

↧

Too Many VMDK files

January 12, 2016, 8:38 am

Hello All,

I have a server with an odd issue that I need some assistance with. This is a security camera server that keeps locking up due to a hard drive issue. We have a data drive assigned as the F drive, and from time to time it gets to a point where the server locks up. While troubleshooting, I find that the F drive cannot be accessed. The quick fix is for me to just remove that drive and add another one. The problem now is that there is a flat file and 4 or 5 delta files belonging to a disk that is no longer there. I placed the new drive in another datastore. There is also a VMDK file with an odd name, ...camera-ctk.vmdk. There is no hard drive with that name. Can these be removed with no issues, or is there a specific way to go about getting rid of them?

↧

SAN Fabric for VMware

January 14, 2016, 5:24 am

I'm on a contract now, and the group I'm working for is looking at a new SAN fabric design. There are many considerations for this proposed design, and it has left me with some unanswered questions where I could use some expert knowledge.

There are many hops between the vSphere hosts, which are HP blades, and the storage. The data will leave the blade and travel through the following path:

1) FCP out of the chassis via a brocade san module

2) into patch panel at top of rack

3) out of patch in another rack to a cisco 5k

4) FCP becomes FCoE out of this 5k and travels into another patch panel

5) FCoE out of patch panel into cisco 7k and then into patch panel

6) FCoE out of patch panel into another 5k where it then reverts back to FCP and enters patch panel

7) FCP out of patch panel into Storage array

I am concerned with the number of hops, even if some of them are just patch panels without live electronics.

I am also concerned with the conversion from FCP to FCoE and back to FCP.

All this movement for a VM to see its storage...

Is this design going to kill the performance of the new storage? I don't need recommendations for a new design; I need to understand whether the number of hops and the FCP-to-FCoE conversion are a bad move, and why.

↧

Datastore VMFS and physical volumes (LUNs) loses

January 15, 2016, 5:55 am

Hi,

In our VMware infrastructure we have some clusters with 8-9 nodes, datastores created on LUNs of a third-party storage array connected via SAN, and vSphere version 5.1.

The questions are the following:

1) When a datastore loses one or two LUNs and they are no longer available (due to a LUN deletion on the storage array side), what are our chances of recovering all the VMs that use the corrupted datastore, apart from backup/restore or filesystem journaling?

2) Does VMFS, from a volume manager point of view, have mechanisms to preserve/recover corrupted or missing data, as LVM does on AIX? FYI: AIX LVM applies a kind of data parity across all physical volumes and uses a quorum to recover data when a physical volume is missing.

3) Can you give us more information/documents about the anatomy of VMFS and how it works in more detail?

↧

Oracle Linux iscsi connection - direct connection or new hard disk?

January 18, 2016, 1:04 am

We are configuring a new iSCSI production environment. We've installed four Oracle Linux VMs, and I would like to ask which is the better option for attaching guest VMs to the iSCSI SAN:

1. Direct connection from each VM to SAN

2. Connect LUNs/volumes to vSphere (making a new datastore) and add them as a new hard disk to VMs

We have an HA cluster with two ESXi 6 hosts (vSphere Enterprise), both of them attached to iSCSI with four NICs (two NICs per switch). The storage we are using is three EqualLogic SANs.

↧

Profile-Driven Storage

January 18, 2016, 2:49 am

If not using Storage Profiles can the Profile-Driven Storage service be disabled? The Java machine it runs in appears to take up an enormous amount of memory.

↧

Read/Write Storage

January 18, 2016, 8:17 pm

Suppose I have two ESXi hosts in vCenter (not in a cluster configuration) and both read/write the same partition. Can the data become corrupted or not?

↧

Hosts disconnects from vCenter when scanning for new volumes

January 20, 2016, 12:41 am

Good day all

I am busy setting up a new environment end to end. In short: a new vCenter Server running VC 6.0 U1, new Dell M630 blades running ESXi 6.0 U1 with the latest patches and using QLogic 57810 HBAs, and a new NetApp 8040 SAN.

All the blades are being managed by the new VC. After assigning a few LUNs, I rescanned the HBAs for new storage devices and volumes. For some reason the host disconnects from VC every time I do so. Simply reconnecting the host puts it back in VC, but it disconnects again immediately once I scan for new volumes. The host remains up and pingable in the background but needs to be manually reconnected after every scan.

I'm not finding much related info elsewhere online so assuming it's not that common an anomaly. Any nudge in the right direction would be appreciated as deadlines are looming.

↧

What is the difference between Full clone vs native clone vs lazy clone

January 20, 2016, 12:44 am

When reading the VAAI NAS certification guide, I noticed that some test cases perform native full cloning or native lazy file cloning. I am really confused about the difference between full clone, native clone, and lazy clone. Can anyone help?

↧