2014
01.06

I have been working with vCloud for over 3 years now and directly administering several 1.5 and 5.1 cloud instances for over 2.5 years. Anyone who has worked with a scaling vCloud knows that you may run into some form of networking issue at some point or another. The most consistent issue I ran into in my 1.5 vClouds was that, under load, if several vApps were undeploying at once, the portgroups in vCenter would sometimes not get cleaned up. The main culprit was most likely vShield, which in the vCloud 1.5 era was limited to 32-bit and could quite easily run into memory issues.

So, to the point: as my clouds scaled, I found myself spending more and more time cleaning these portgroups up manually, which wasn’t complicated once you figured it out but was time-consuming.

The basic process was:

  •   Go to stranded items in the system admin pane on vCloud
  •   Select all the portgroup items there (they would appear as something like dvs-networkname-unique-hash)
  •   Try to delete them
  •   Watch the vCenter activity pane simultaneously and watch for errors
  •   When it errors, click the error to be taken to the portgroup (sometimes the stranded portgroups will remove themselves cleanly)

Then clean up what’s left in one of two ways:

  • If all that was there was a vShield Edge VM (prefixed with VSE), power it off and delete it from disk (you can delete it because the standard deployment process of vCloud is to create a new one from template)
  • If there were several VMs, disconnect the NICs and reconnect them from the vCloud interface (which was time-consuming)

The latter is the case I automated, as it was the more tedious one. Using the vCloud API, I wrote a script that finds the vApp with the issue, goes through each of its VMs, and disconnects and reconnects their NICs. This has saved me hours and hours of work because, while the portgroup issue may not happen often, it happened enough to be noticeable.

Here is the script:

#!/usr/bin/php
<?php 
   // Both <orgname> and <vapp> are required, so $argv must hold at least 3 entries
   if ($argc < 3) {
      echo "Usage: resetnics <orgname> <vapp>\n";
      exit(1);
   }

   $orgName = $argv[1];
   $vAppName = $argv[2];

   $home = getenv('HOME');
   require_once $home . '/etc/config.php';

   // Initialize Global parameters
   $httpConfig = array('ssl_verify_peer'=>false, 'ssl_verify_host'=>false);

   // login
   $service = VMware_VCloud_SDK_Service::getService();
   $service->login($server, array('username'=>$user, 'password'=>$pswd), $httpConfig);

   $orgRefs = $service->getOrgRefs($orgName);

   if (!empty($orgRefs))
   {  
      foreach ($orgRefs as $ref)
      {
         echo $ref->get_name() . ":\n";
         $sdkOrg   = $service->createSDKObj($ref);
         $vdcRefs  = $sdkOrg->getVdcRefs();
         $sdkVdc   = $service->createSDKObj($vdcRefs[0]); //our cloud only has one vDC per ORG currently, can cheat here
         $vAppRefs = $sdkVdc->getVAppRefs($vAppName);
         if(!empty($vAppRefs))
         {
            foreach ( $vAppRefs as $vARefs ) {
               echo "vApp " . $vARefs->get_name() . "\n"; 
               $sdkvApp = $service->createSDKObj($vARefs);
               $status = $sdkvApp->getStatus();
               $vmRefs = $sdkvApp->getContainedVmRefs();
               if(!empty($vmRefs)) {
                  foreach ($vmRefs as $vRefs) {
                     $sdkvm = $service->createSDKObj($vRefs);
                     $status = $sdkvm->getStatus();
                     //Disconnect NICS
                     echo $vRefs->get_name() . " Disconnecting NICS...\n";
                     $nics = $sdkvm->getVirtualNetworkCards();
                     $nics_rasd = $nics->getItem();
                     $updatedNics = new VMware_VCloud_API_RasdItemsListType();
                     foreach($nics_rasd as $nic_rasd) {
                        $nic_value = $nic_rasd->getAutomaticAllocation();
                        if($nic_value->get_valueOf() == "true") {
                           $nic_value->set_valueOf("false");
                           $nic_rasd->setAutomaticAllocation($nic_value);
                        }
                        $updatedNics->addItem($nic_rasd);
                     }
                     $task = $sdkvm->modifyVirtualNetworkCards($updatedNics);
                     $service->waitForTask($task);
                     //Reconnect Nics
                     echo $vRefs->get_name() . " Reconnecting NICS...\n";
                     $nics = $sdkvm->getVirtualNetworkCards();
                     $nics_rasd = $nics->getItem();
                     $updatedNics = new VMware_VCloud_API_RasdItemsListType();
                     foreach($nics_rasd as $nic_rasd) {
                        $nic_value = $nic_rasd->getAutomaticAllocation();
                        if($nic_value->get_valueOf() == "false") {
                           $nic_value->set_valueOf("true");
                           $nic_rasd->setAutomaticAllocation($nic_value);
                        }
                        $updatedNics->addItem($nic_rasd);
                     }
                     $task = $sdkvm->modifyVirtualNetworkCards($updatedNics); //restore original.
                     $service->waitForTask($task);
                  }
               }
            }
         }
      }
   }
   echo "\n";
   $service->logout();
?>

Given an Organization and vApp name, the script looks the vApp up in that Organization's vDC (only the first vDC is checked). Once the vApp is found, it cycles through the VMs of that vApp, fetches each VM's network cards as an OVF RASD Items List, and sets AutomaticAllocation to false on every NIC, which is the value this script uses to mark a NIC as disconnected. Once all the NICs on that VM have been modified, it applies the change back to vCloud and waits for the task to complete. It then reverses the change, applies it, and waits again. This leaves the vApp networking intact: there is no need to worry about the IP settings of the VMs, just disconnect, reconnect, done. After the script has finished with the vApp, the next time the user tries to start the vApp, or the next time you try to delete the stranded item, it should clear without fault.
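The disconnect and reconnect passes in the script are the same loop with the value flipped, so they could be folded into one helper. Below is a minimal sketch of that idea, not the script as I run it. FakeValue, FakeNic and setNicsConnected are hypothetical names made up for illustration; the two stand-in classes only mimic the accessors the script calls (getAutomaticAllocation, setAutomaticAllocation, get_valueOf, set_valueOf) so the sketch runs without a vCloud endpoint.

```php
<?php
// Hypothetical stand-in for the value object returned by
// getAutomaticAllocation(); NOT an SDK class.
class FakeValue {
    private $v;
    public function __construct($v) { $this->v = $v; }
    public function get_valueOf() { return $this->v; }
    public function set_valueOf($v) { $this->v = $v; }
}

// Hypothetical stand-in for one NIC RASD item; NOT an SDK class.
class FakeNic {
    private $alloc;
    public function __construct($connected) {
        $this->alloc = new FakeValue($connected ? "true" : "false");
    }
    public function getAutomaticAllocation() { return $this->alloc; }
    public function setAutomaticAllocation($value) { $this->alloc = $value; }
}

// Set AutomaticAllocation on every NIC item to "true" or "false" and
// return the items in their original order.
function setNicsConnected(array $nicItems, $connected) {
    $target = $connected ? "true" : "false";
    $updated = array();
    foreach ($nicItems as $nic) {
        $value = $nic->getAutomaticAllocation();
        if ($value->get_valueOf() != $target) {
            $value->set_valueOf($target);
            $nic->setAutomaticAllocation($value);
        }
        $updated[] = $nic;
    }
    return $updated;
}

$nics = array(new FakeNic(true), new FakeNic(true));
$nics = setNicsConnected($nics, false); // disconnect pass
$nics = setNicsConnected($nics, true);  // reconnect pass
foreach ($nics as $i => $nic) {
    echo "nic $i connected: " . $nic->getAutomaticAllocation()->get_valueOf() . "\n";
}
```

In the real script the items would come from getVirtualNetworkCards()->getItem(), and each pass would still need to be added to a VMware_VCloud_API_RasdItemsListType and applied with modifyVirtualNetworkCards followed by waitForTask. Like the original, this reconnects every NIC, including any that were already disconnected before the run.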

Now you may notice I don’t use the query service here. This is one of my older scripts, written before I had learned the query service; however, I still frequently work this way because it’s easy to grab one of my existing scripts and add new functionality to it in a short amount of time. Usually I’ll only switch to the query service if run time is a significant issue for what the script is doing.
