What is your Cloud Strategy?

Hello Again,

It is a new year, and a topic worth writing about started to take shape in my head over the holidays. In the past year and a half, I have discovered that it is extremely challenging to write original content. For those of you who do this on a regular basis, you have my utmost admiration.

In this entry, I will share some thoughts on why you need a Cloud Strategy and how to go about adopting a hybrid cloud approach. In a future post, I will share some approaches to a relatively easy application migration to the cloud.

As part of my new role as a Cloud Strategist, I spent a large part of 2019 advising decision makers at various enterprise businesses on adopting the Cloud model. I visited many customers over the course of the year, and Cloud Adoption and Cloud Migration were among the common themes that key decision makers were pondering. After many whiteboard sessions, many hours of contentious back-and-forth discussion and a lot of reading, I came up with this whiteboard for Cloud Strategy. It isn’t meant to go into significant detail; it is meant to help you formulate a high-level plan.

When I first started, I wasn’t really sure of the function of a Cloud Strategist. When I read the Gartner report The Cloud Strategy Cookbook, 2019, it served as validation that such a function is needed. The report further validated what we were advising customers.

Any migration conversation starts with the 6R’s of migration, which Gartner originally published in 2011 as the 5R’s. Every IT decision maker has a Cloud mandate. In many cases, they even have a mandate to exit private or on-premises data centers. That much is certain. What isn’t clear is the path to get there, especially to the people lower down in the organization who will eventually own this task.

image0

Start with the WHY?

In other words, what is the Business Outcome expected from this mandate? In the majority of cases, it falls under one of two categories.

  1. Cost savings ($)
    • Cloud Agility, Flexibility, Scale to drive down cost of supporting the business
    • Data Center consolidation/Evacuation
    • Migration of workloads
  2. Drive business innovation
    • Cloud EcoSystem to support faster development cycles
    • IT as a business enabler

It is important to communicate this widely across the organization that will support this endeavor.

Migrate to VM form factor? Container?

The first task at hand is to identify whether all new development will go to the cloud and whether it will use Cloud Native technologies such as containers, micro-services, managed services, serverless functions, and the alerting, monitoring and management functions that are commonly offered by every big cloud vendor. There are very few reasons not to go this route for new application development. Here is a sneak peek into the evolving eco-system.

That leaves us with how we can take the current monolithic applications to the cloud.

There are even some options to containerize existing VM applications and drop them into a public cloud provider in a ‘fat container’ format. GCP Anthos is one such solution. One can argue that there are inherent benefits to this approach, i.e. the IT team no longer manages the OS, patching or availability of these apps. However, this approach is riskier and less beneficial than adopting a true cloud native approach, which may require a complete redesign of the application.

Which TEAM will support this?

Digital Transformation first starts with People, then Processes and lastly the Tools. It isn’t that the tools are not important, but the mindset of the People needs to be adapted to thinking very differently.

The Fellowship of the Cloud Council

Most customers who successfully adopted a transformative approach started with a Cloud Council. This is a hybrid team that consists of team leads with varying expertise in networking, security, storage, virtualization, etc. This team typically reports to the project lead and could potentially report into the VP or CIO to provide updates. The task for this team is to create a standard framework or template that meets the requirements the team itself sets forth. It is important that the members of the Cloud Council function as one Hybrid Team, communicate relevant information to their own teams and continuously drive the projects assigned to them. The team must also communicate changes constantly, because chaos reigns supreme during any kind of transformation. Bringing the teams along keeps everyone in the IT organization motivated and helps them understand how their role is a critical part of the larger effort.

In House Cloud Expertise

If specific public cloud expertise is lacking in house, it is best to bring in partners who specialize in this area. This provides a valuable function – to learn from the missteps of others that have been supported by the partner.

Needless to say, the choice of partner is very important to the success of the project. Do not assume that existing data center focused channel partners are the default option. Cloud Adoption requires a different mindset, so dare to reconsider which partner will suit your needs best. It is time to put existing partnerships to the test. Many organizations fail to pick the appropriate partner, dooming their project from the very start.

If a cloud provider has already been chosen, a paid sales engagement with the cloud provider is also a very good approach, at least initially. If that choice is yet to be made, the Cloud Council needs to consider various aspects such as:

  • Workload requirements and nature of workload
  • Developer skillset and requirements
  • Eco-system of cloud provider
  • Vertical that the business is in and any impact the cloud provider choice may have – e.g. are they considered competitive to your core business?
  • Existing relationships with vendors
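
One way a Cloud Council might make this comparison concrete is a simple weighted scoring matrix. The sketch below is illustrative only: the criteria keys mirror the list above, while the weights, provider names and 1-5 scores are made-up placeholders rather than recommendations.

```python
# Hypothetical weights for the criteria listed above (they sum to 1.0).
WEIGHTS = {
    "workload_fit": 0.30,
    "developer_skills": 0.25,
    "ecosystem": 0.20,
    "competitive_risk": 0.15,      # higher score = less conflict with your core business
    "vendor_relationships": 0.10,
}


def weighted_score(scores):
    """Return the weighted sum of one provider's 1-5 scores."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def rank_providers(candidates):
    """Sort provider names from highest to lowest weighted score."""
    return sorted(candidates, key=lambda p: weighted_score(candidates[p]), reverse=True)


# Placeholder providers and scores, for illustration only.
candidates = {
    "provider_a": {"workload_fit": 4, "developer_skills": 3, "ecosystem": 5,
                   "competitive_risk": 2, "vendor_relationships": 4},
    "provider_b": {"workload_fit": 3, "developer_skills": 5, "ecosystem": 4,
                   "competitive_risk": 4, "vendor_relationships": 2},
}

for name in rank_providers(candidates):
    print(name, round(weighted_score(candidates[name]), 2))
```

The value of an exercise like this is less in the final number and more in forcing the council to agree on the weights up front.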

Next up is to socialize the Cloud Framework across the teams. The Cloud Framework must be treated as a continuously evolving document that defines specific guidelines and boundaries with respect to Security, RBAC, Identity and Access Management, Monitoring, Alerts and Visibility. In addition, well-defined response procedures and workflows must be highlighted. As adoption increases, this document will evolve.

Start Small: Fire bullets before Cannonballs

It is of utmost importance to start small. That means allowing one IT team to lead the first migration effort, rather than having the whole IT organization jump into a dev-ops mindset and make radical changes all at once. The latter leads to confusion and a lack of motivation among large teams. Legacy companies especially need to be aware that not all dev-ops practices will fit as naturally as they would for a SaaS provider, and they may need a lot of tweaking to be relevant.

Lastly, define the success criteria. It is important to have these in place before the actual migration starts. Some benefits are harder to quantify in the short term. For instance, it is harder to quantify the savings from moving just a couple of an application’s VMs. Criteria should be defined accordingly.

In Part 2 of this blog, we will delve into the different phases of migration.

PS: Hat-tip Prabhu Barathi @prabhu_b for reviewing my work and providing me valuable feedback.


Unboxing of VeloCloud Edge 510

 

Recently, I got an opportunity to get my hands on some edge hardware (Working for VirtZilla, it took 2+ years and an acquisition for me to say that!) and I decided to set it up at home.

I was one of the lucky few to get a VeloCloud Edge 510 device. This is mostly meant for branch offices, but this nifty little device can do many things. Before we get into it, let’s take a quick look at VeloCloud.

VMware acquired VeloCloud in December 2017. VeloCloud is a cloud networking services company that simplifies branch WAN networking. This acquisition is part of VMware’s overall SDDC strategy and continued push into networking.

The main value propositions of VeloCloud are:

  1. Improves business uptime by making internet links reliable and removing the dependence on expensive, dedicated MPLS circuits.
  2. Dynamic Multi-Path Optimization (DMPO) leverages multiple internet connections, including 4G LTE, to optimize utilization as well as route around failures.
  3. Assures application performance by prioritizing time-sensitive applications such as voice and video.
  4. Gives higher priority to SaaS and other cloud-based applications through VeloCloud Gateways, which can reside in public cloud providers such as AWS.

One unique feature that VeloCloud can deliver is the ability to switch uplinks upon failure without dropping voice calls. This is made possible by patented IP such as the aforementioned DMPO.

DMPO also reorders UDP-based flows such as voice and video. It can also work around a lossy network and improve performance by duplicating packets when loss is detected, which keeps the TCP sliding window size at its maximum.
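
To make the reordering and duplicate handling concrete, here is a toy sketch of a receive-side buffer that releases packets in sequence order and drops redundant copies. It is not VeloCloud’s DMPO implementation, whose internals are not covered in this post; the `ReorderBuffer` class and its sequence-number scheme are assumptions made purely for illustration.

```python
class ReorderBuffer:
    """Hold out-of-order packets and release them in sequence order."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq  # next sequence number to deliver
        self.pending = {}          # seq -> payload, waiting for a gap to be filled

    def receive(self, seq, payload):
        """Accept one packet; return the payloads that are now deliverable in order."""
        if seq < self.next_seq or seq in self.pending:
            return []  # already seen (e.g. a duplicated copy): drop it
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


buf = ReorderBuffer()
delivered = []
# Packet 1 arrives before packet 0, and packet 0 arrives twice.
for seq, payload in [(1, "b"), (0, "a"), (0, "a"), (2, "c")]:
    delivered.extend(buf.receive(seq, payload))
print(delivered)
```

A real implementation would also bound how long it waits for a missing packet, since voice and video cannot tolerate unbounded delay; that is left out here to keep the sketch short.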

Here is an example of what that could mean for a file transfer session:

Screen Shot 2018-04-11 at 11.49.39 PM

Now that we have covered what the VeloCloud SD-WAN solution does, let’s take a look at how easy it was to set up.

I received this device in the mail and opened it up. I was pleasantly surprised by the attention to detail in the packaging. Here are some photos of what it looks like:

img_7306.jpg

img_1371-e1523516861131.jpg

I plugged in the device and followed the instructions to connect the uplink port from my wireless router to GE3. I had to log in to the temporary Wi-Fi SSID and change a few settings to ensure there wasn’t any IP overlap with my existing network. Soon after, I was able to get an IP address. As part of this process, I had already been added to a VCO (VeloCloud Orchestrator). As an admin setting up a branch office, you would set up a new edge from the VCO by using a pre-created profile.

Screen Shot 2018-03-27 at 4.41.38 PM

Once the new edge was provisioned, I was able to generate an email with the activation key and a softlink to activate the physical edge. The activation itself is uneventful from that point on.

Screen Shot 2018-03-27 at 5.01.00 PM

Once the device is activated, you can control and manage it from the VCO. From there, you can change the interface configuration, set up primary and backup uplinks, or set up multiple active-active uplinks. Since I did not have multiple uplink connections, I used the USB port to set up a 4G LTE uplink as a second active uplink. I planned to do some failover tests at a later point.

Here are some screenshots of applications that I am running through my VeloCloud Edge, along with some quality scores. To make things more interesting, I started some Netflix and YouTube streams in the background. Within 3 minutes, those applications popped up as well.

Screen Shot 2018-04-12 at 1.16.12 AM

Another really useful tool is the edge’s display of uplink quality. The VeloCloud Enhancements bar includes remediation by DMPO.

Screen Shot 2018-04-11 at 11.00.58 PM

In summary, the VeloCloud Edge was very easy to set up and administer. In the second part of this blog, we will cover architecture and a link failure demo.

Leave your comments below.

PS: Hat tip to George Shih (VeloCloud SE) for helping me with this blog.

 

Posted in NSX, SD-WAN, VeloCloud

What’s new in NSX 6.4!

The NSX 6.4 release was announced in January. This is a dot release, but one that brings many major features. The new features are broadly classified into Core Feature enhancements, Advanced Micro-segmentation, and Ease of Use and Serviceability features.

Head over to the NSX 6.4 Official page here. In the meantime, here are some thoughts.

Context Aware Micro-segmentation

The primary new feature that 6.4 delivers is the ability to provide L7 granularity in the Distributed Firewall. DFW adds layer-7 based application context for flow control and micro-segmentation. Initially, approximately 60 commonly used application signatures are supported in this release.

This official VMware blog does a great job of explaining Context Awareness in great detail. At a very high level, users can now use the APP ID to define a more granular policy. This allows security policy based on applications, even if they don’t use the standard ports.

There is also RDSH, or Multi Session Identity Firewall, which allows application access on a shared desktop based on user ID. In other words, two users accessing the same desktop can be given access to different applications based on their user group affiliations. NSX previously supported Identity Firewall based on integration with Active Directory, which provides granularity at the Virtual Desktop level. See a detailed demo of RDSH here.

Ease of Use and Serviceability

Upgrade Co-ordinator: This feature is bound to significantly ease NSX system upgrade pains. Upgrades can now be co-ordinated and managed from the NSX Upgrade Co-ordinator, which offers a single pane to manage the upgrade of the various components. This handy feature also allows you to automate the upgrade process. The tool also performs a pre-check to ensure the upgrade will proceed only if the system is healthy to begin with.

Several features are now available under HTML5 (as well as Flash). They have the same feature functionality.

Here is a detailed video of 6.4 Upgrade Co-ordinator.

Upgrade Co-ordinator offers two modes: Custom upgrade plan and One Click Upgrade. It also provides an inventory of the NSX components and lists the current version and target version. An NSX Manager upgrade is required as a precursor to this step.

Screen Shot 2018-02-21 at 9.50.19 PM

One Click Upgrade pre-defines the upgrade sequence and displays the settings, which cannot be modified. The “Plan your upgrade” option allows you to choose which components you want to include in the upgrade and provides some other knobs.

Screen Shot 2018-02-21 at 10.09.11 PM

Lastly, this also allows you to monitor the upgrade progress for each component.

Packet Capture

NSX 6.4 now allows the user to capture packets from the Web UI itself. While this was previously supported via the CLI, users can now start a packet capture for debugging purposes from the Web client without necessarily being familiar with the CLI. There are 4 points along the data path where packets can be captured: Physical, VMKernel, vNIC and vDR port. A more expansive list is available via the CLI. The user must specify the direction of the traffic to be captured. Only one direction is supported per session, but separate sessions can be created for rx and tx and the captures then combined using Wireshark.
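
Combining the rx and tx captures is essentially a merge of two time-sorted streams, which Wireshark’s `mergecap` utility does on real capture files. The sketch below only illustrates the idea: the `(timestamp, direction, summary)` records are hand-written stand-ins, not parsed capture data.

```python
import heapq


def merge_captures(rx, tx):
    """Merge two time-sorted lists of (timestamp, direction, summary) records."""
    return list(heapq.merge(rx, tx, key=lambda record: record[0]))


# Made-up records standing in for two single-direction capture sessions.
rx = [(0.10, "rx", "SYN-ACK"), (0.30, "rx", "HTTP 200")]
tx = [(0.00, "tx", "SYN"), (0.20, "tx", "ACK"), (0.25, "tx", "HTTP GET")]

merged = merge_captures(rx, tx)
for timestamp, direction, summary in merged:
    print(f"{timestamp:.2f} {direction} {summary}")
```

The merge is only meaningful if both sessions were captured at the same point in the data path and over the same time window.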

Screen Shot 2018-02-27 at 11.44.15 AM

Support Bundle

A support bundle can now be collected from the UI and uploaded to a remote server directly. It can be collected for NSX Manager, Hosts, Controllers and Edges by selecting from a drop-down and choosing which objects to include.

Screen Shot 2018-02-27 at 11.50.01 AM

NSX Dashboard improvements

Lastly, as you have probably noticed since 6.3, the dashboard continues to get enhanced with handy widgets that are great for a single point of visibility into NSX component health. System Scale is a new widget which provides alerts and warnings when you approach scale limits. Here is a snapshot of the new and improved dashboard.

Screen Shot 2018-02-27 at 11.24.38 AM
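
The logic behind a scale widget like this can be pictured as a comparison of current object counts against supported limits. The sketch below is a guess at that idea, not NSX’s actual implementation: the 80% warning threshold, the object names and every number are hypothetical placeholders, so consult the published configuration maximums for real limits.

```python
def scale_status(current, limit, warn_ratio=0.8):
    """Classify one object count against its limit (thresholds are assumptions)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    ratio = current / limit
    if ratio >= 1.0:
        return "critical"
    if ratio >= warn_ratio:
        return "warning"
    return "ok"


def scale_report(usage, limits, warn_ratio=0.8):
    """Return a status per object type for everything present in `usage`."""
    return {name: scale_status(usage[name], limits[name], warn_ratio) for name in usage}


# Placeholder counts and limits, for illustration only.
usage = {"firewall_rules": 8500, "security_groups": 200, "prepared_hosts": 130}
limits = {"firewall_rules": 10000, "security_groups": 1000, "prepared_hosts": 128}

print(scale_report(usage, limits))
```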

In addition, a new tab for System Scale provides a global view of Object Types and their counts per NSX manager. This includes Firewall Rules, Security Groups, number of hosts prepared, etc.

Screen Shot 2018-02-27 at 11.55.05 AM

Other Blogs related to this topic:

VMware NSX-v 6.4 Released

https://www.vmguru.com/2018/01/nsx-v-6-4-is-here-and-massive/

 


Making the case for VMC – VMware Cloud on AWS

If you are in the data center industry, you have probably already heard about the partnership between VMware and AWS. The excitement around all of VMware’s cloud initiatives was palpable at VMworld 2017. As I rode the shuttle back to the hotel from VMworld, I overheard a common theme: customers talked about how VMware’s newfound vision would help them adapt to the changing data center landscape.

A couple of days later, as I started dropping in on discussions on the web, I realized that there was a lot of confusion about what VMC is and what it isn’t. In one LinkedIn discussion, many admins were comparing the price of running a VM in AWS to that of VMC. If you are asking this question, you have probably misunderstood the value of VMC.

VMC is not meant to compete against AWS’s native offering. Rather, it is offered as a low resistance, immediate path to adopt the elasticity of AWS, without having to retool and relearn new skills. Here is a recap of the three main use cases of VMC:

  1. Maintain and Expand: Some customers are looking to continue maintaining their DC and to expand services into the cloud. If some developers need access to native AWS services like Lambda or Kinesis, this is a really fast way to provide them with all of AWS’s offerings, without having to learn how to manage an AWS environment at scale. There are other appealing use cases, such as DRaaS.
  2. Consolidate and Migrate: Some other customers are looking to consolidate their data center and begin a migration process to the cloud. Many smaller customers are freezing their DC spend and expanding into the cloud. This may make sense depending on the size of the footprint and the type of data they handle. Even small data centers need a team with security, virtualization, networking and storage skill-sets. Smaller businesses could see significant savings in OpEx and CapEx by adopting a public cloud strategy. Migration of existing data center workloads is not a trivial problem. This solution will take them one step closer to that goal.
  3. Capacity Flex: This is another great use case for a customer that needs seasonal capacity. Rather than invest in data center hardware that may otherwise remain idle for large periods of time, they can cloud burst into AWS, but manage their environment with the same familiar tools and operational overhead: vSphere, vROps, Log Insight, etc.

The key takeaway here is the ability for a business to be able to benefit from an AWS footprint without having to re-architect their application, learn new skills and review deployment models.

DR as a Service alone is an important feature of VMC; it is unique and will appeal to IT operations staff. How many businesses today are confident of executing their DR run book? Based on the conversations I have had, the answer is low, around 20-30%. Imagine having an on-demand DR site where you don’t have idle capacity adding to your OpEx (not to mention CapEx), but which can be summoned in the case of a DR event. When I paint this picture for a VP of IT Operations, their eyes light up!

Having the ability to extend IP spaces from your data center to the cloud will take away the hassle of re-IP-ing workloads. Your DR strategy just became super simplified.

In my opinion, if you have a cloud native workload whose life-cycle is completely in the public cloud, it is better off on AWS EC2. If there are existing applications that were designed for your VMware private cloud but need the flexibility or services of a public cloud, they are good candidates for VMC.

This is the value that VMC brings. So the next time you find yourself asking “How does it compare in cost to AWS?”, you are asking the wrong question.

For additional information on VMC, please go to https://cloud.vmware.com/vmc-aws

Posted in DC migration, NSX, VMC

NSX Proactive Health Check

The idea for this post came from an activity that we engaged in with one of our premier customers who was just entering their Q4 peak. We offered them a proactive health check before they head into peak season.

Think of this as the equivalent of – “check your oil, tire inflation and head lights before you head into the hills” for your NSX footprint.

Here is a quick checklist of Health Check items. This list is not meant to be comprehensive, but it covers a few common sense checks that can be carried out before a forecasted peak or at a scheduled interval.

  1. NSX manager: Check that NTP, DNS and Syslog are set up properly.
     Screen Shot 2017-10-13 at 1.17.23 AM
  2. NSX manager: Regularly download the tech support bundle so that you have a known good state handy.
  3. NSX manager: Check CPU, memory and storage usage on NSX manager. Temporary peaks in CPU and memory are acceptable.
     Screen Shot 2017-10-13 at 1.15.12 AM
  4. NSX manager: Restore the NSX manager backup on a test NSX manager appliance to check the integrity of the file. This need not be a weekly or even monthly activity, but should be done as needed by the business (e.g. before an expected peak).
  5. Check system events under NSX manager on the vSphere plugin and address any critical alerts.
     Screen Shot 2017-10-13 at 1.30.22 AM
  6. Log in to the ESGs and run the “show highavailability” command. Look for the sync status between active and standby instances and for rx/tx errors. A few errors are acceptable, but a continuously increasing error count can point to a larger issue. (HA will be turned off if ECMP is enabled.)
     Screen Shot 2017-10-18 at 1.00.07 PM
  7. Log Insight: Log in to the appliance and look for critical errors. Also check that all ESXi hosts are reporting syslogs to Log Insight. Admins may have forgotten to add logging capability to newly added hosts.
  8. vRNI: If you also have vRealize Network Insight, check the critical errors on vRNI. Periodically review system alerts and ensure they have appropriate email notifications set up. Remember to use the custom alert feature; it is easy to set up and can make a big difference in avoiding a crisis.
  9. vRNI System Alerts: vRNI provides 110 system alerts (and growing) that are already set up and only need to be toggled on, with the appropriate notification option. These are of varying severity but provide a proactive way to monitor critical infrastructure.
     Screen Shot 2017-10-20 at 11.02.30 AM
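
The “few errors are fine, steadily climbing errors are not” rule from item 6 can be expressed as a small check over counter readings noted at intervals. This is a minimal sketch under stated assumptions: the readings are typed in by hand from repeated runs of the command, and `is_steadily_increasing` and its `min_samples` parameter are hypothetical names, since the command’s output format is not shown in this post.

```python
def is_steadily_increasing(samples, min_samples=3):
    """Return True if every reading is strictly higher than the one before it."""
    if len(samples) < min_samples:
        return False  # too few readings to call it a trend
    return all(later > earlier for earlier, later in zip(samples, samples[1:]))


# Hand-entered rx error counts from successive checks (illustrative values).
flat_after_blip = [3, 3, 3, 3]
climbing = [2, 5, 9, 14]

print(is_steadily_increasing(flat_after_blip))
print(is_steadily_increasing(climbing))
```

A counter that jumped once and then stayed flat is treated as acceptable here, while one that rises on every reading is flagged for a closer look.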

If anything else comes to mind, please leave a comment below and I will add it to this list.
