What is VMware Cloud on AWS, in my own words!!

Of course, most network folks are already aware of this concept; I am just posting it from my perspective, adding network info, because at the end of the day application performance depends not only on code; infrastructure also plays a key role!

In simple terms, AWS is a colo for VMware's cloud. On top of that, VMware does not have datacenters across the world, so they are leveraging AWS datacenters and calling it "VMware Cloud on AWS".

From my perspective, these are two different public cloud entities, which means someone who is an expert on AWS can't automatically work on VMware Cloud, and vice versa!

  1) VMware has placed its infrastructure in AWS and built a public cloud with vSphere, vSAN and NSX.

  2) You cannot consume AWS services (EC2/EBS and so on) directly from it, and the terminology in VMware is completely different.

  3) You just create an SDDC on VMware Cloud (the public cloud) and extend it to your on-prem vSphere.

  4) Once you create an SDDC on VMware Cloud, a VPC automatically gets created in AWS and you get access to AWS workloads.

  5) Latency between the VMware and AWS clouds is less than 1 ms (I was told this by an expert; I suggest we try it ourselves before concluding).

  6) How the AZ (Availability Zone) concept works: they place their infra in different AZs in the same location and extend vSAN and NSX.

  7) As of now, they have footprints in a couple of regions in the USA and Asia-Pacific (Australia); I don't have complete data.

8) What is AWS RDS on VMware? Well, I did a lab on this, and here is the interesting thing: you just create RDS in AWS and add the DB connection strings to the application servers which sit in the VMware cloud, and it communicates over the VPC 🙂.
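To make that concrete, the application server in the SDDC just points its DB connection string at the RDS endpoint; here is a hypothetical MySQL example (the endpoint name is made up):

jdbc:mysql://mydb.abc123xyz.us-west-2.rds.amazonaws.com:3306/appdb

That traffic then rides the connected VPC between the SDDC and the AWS account.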

Still, I am not sure why they call it AWS RDS on VMware... maybe they are coming up with some more options for this, like taking an RDS OVA and deploying it on on-prem VMware, but I am not sure how that would work. If that is the case, then we can truly call it AWS RDS on VMware.

My POV on VXLAN EVPN implementation/migration challenges!!!

I am not talking about the rich features, because we have seen a whole lot of articles on the VXLAN EVPN topic!!

I want to talk about the challenges, based on my work experience with ACI and VXLAN EVPN.

We have seen the MP-BGP L2VPN address families before, but this address family is a little crazy, with excellent features.

Agile implementation  – NO

Simple explanation:

In the legacy model, if we want to create a VLAN and inter-VLAN routing, it's so easy: it can be done with a few commands, and boom, you are ready!!
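For reference, the legacy model is roughly this (the VLAN ID and addressing are hypothetical):

vlan 10
interface vlan 10
  ip address 10.1.10.1/24
  no shutdown

Four lines, and the SVI is routing.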

In the VXLAN world, the number of steps increases (from 5 to 50?). Here is my simple calculation, just to configure a VLAN/VXLAN and enable inter-VLAN/VXLAN routing...

First you have to focus on two things: the UNDERLAY and the OVERLAY setup.

  • Create a VLAN and map it to a VNID
  • Decide whether to use vPC (a little tricky)
  • Configure the VXLAN tunnel interface (nve), associate the L2 VNI, and enable BGP for host reachability
  • Configure the EVPN pieces in MP-BGP
  • The same configuration must be applied on all leafs (of course, it depends, but in a virtualized world we obviously require VM mobility, so you will end up configuring all leafs)
  • On top of that, you have to ensure all features needed to support VXLAN EVPN are enabled on the switch

Hold on!! So far we have only configured VLAN/VXLAN bridging. We still need to enable inter-VLAN/VXLAN routing 🙂

  • Create a separate VLAN and map it to an L3 VNID (if you are using multiple tenants, you have to follow all these steps again 🙂)
  • Create an SVI for the L3 VNI and associate it with a VRF
  • Associate the L3 VNI with the nve interface
  • Enable EVPN for it in MP-BGP

Ohh!!! I am done with VLAN/VXLAN creation and inter-VLAN routing.
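The steps above, sketched in NX-OS-style CLI. This is a rough, unverified outline, and all VLAN IDs, VNIs, addresses, and the AS number are hypothetical:

feature nv overlay
feature vn-segment-vlan-based
nv overlay evpn

vlan 10
  vn-segment 10010
vlan 999
  vn-segment 50999

vrf context TENANT-1
  vni 50999
  rd auto
  address-family ipv4 unicast
    route-target both auto evpn

interface Vlan10
  vrf member TENANT-1
  ip address 10.1.10.1/24
  fabric forwarding mode anycast-gateway

interface Vlan999
  vrf member TENANT-1
  ip forward

interface nve1
  source-interface loopback0
  host-reachability protocol bgp
  member vni 10010
    mcast-group 239.1.1.10
  member vni 50999 associate-vrf

router bgp 65000
  neighbor 10.0.0.1 remote-as 65000
    address-family l2vpn evpn
      send-community extended

evpn
  vni 10010 l2
    rd auto
    route-target import auto
    route-target export auto

Repeat on every leaf, and that is before the underlay (IGP, loopbacks, PIM) is even counted.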

Complex: Partially YES

Obviously, looking at the above, do you think it's easy to troubleshoot?

I missed an important item: as we all know, VXLAN EVPN technology is very good at handling BUM traffic. Yes, but to handle this traffic you need to configure multicast in the underlay (and sometimes troubleshooting multicast is a nightmare).
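A minimal sketch of that underlay multicast piece, assuming PIM sparse mode with a static RP (the RP address, group range, and interface are hypothetical):

feature pim
ip pim rp-address 10.0.0.100 group-list 239.1.1.0/24
interface Ethernet1/1
  ip pim sparse-mode

Each L2 VNI under the nve interface is then tied to a group in that range, so BUM traffic can be flooded over the underlay multicast tree.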

Cost effective: NO

  • Again, a hardware-based solution?? Yes, we need to go for switches which support this technology (Cisco/Arista), plus training for the operations team, and so on...!!!
  • More time consumed on break-fix

Troubleshooting: Difficult

Of course, it depends on the type of issue, but remember you need to keep an eye on all of these areas:

underlay, overlay, and multicast.

Verify the BGP process, the L2VPN address family, nve peers, and so on. If you look at the route-table output below for one IP, you have to study a lot of stuff just to understand why an IP/MAC is not reachable.
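These are the NX-OS show commands I would typically start with (the host IP and multicast group are hypothetical; adjust names for your platform):

show bgp l2vpn evpn summary
show bgp l2vpn evpn 10.1.10.20
show nve peers
show nve vni
show l2route evpn mac-ip all
show ip mroute 239.1.1.10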

vxlan evpn


Source: the above pic is from Cisco.

Of course, Cisco has a couple of tools for automation (NFM, DCNM, and OAM, a feature on the switch), but I am not sure how useful they really are in day-to-day operations.

My suggestion: look at the SDN products (ACI/NSX/Nuage...) which can do this stuff for you without much manual intervention, with zero-touch provisioning... because the world is moving towards SD**.

In my next article, I will cover how to build a Nexus 9K virtual lab on VMware (so you can build VXLAN EVPN on virtual switches on your own).

I wrote this article keeping the operations team in mind as well!!

Looking for more comments/inputs that can help others !!!

My experience with ACI so far!!

Great to hear that Cisco announced ACI 3.0... let me briefly share my experience with ACI.

ACI is a fantastic product for greenfield deployments, but with brownfield deployments I had a bad experience... here are a couple of issues.

1) Endpoint fluctuation – if endpoints move frequently in the fabric, the endpoint process can crash and, eventually, the fabric crashes.

Commands to check endpoint moves on a leaf:

tail -f /var/log/dme/log/epm-trace.txt | grep "EP move" -B 1
tail -f epm-trace.txt | grep ""

2) VTEP IP assignment issue – when we added a new leaf to the fabric, an already-used VTEP IP was given to the new leaf, due to which my production leaf (which had that IP) went down. I am not sure how it assigned a used IP, because IPs are assigned to new leafs through DHCP; it seems ACI was not tracking the DHCP-assigned IPs?

3) When registering new leafs, 4-5 other leafs suddenly got rebooted, causing impact.

4) Directly connected route issue – if you mapped subnet-1 to VRF-1 and later decided to move subnet-1 to VRF-2, subnet-1 still shows as directly connected in VRF-1 as well as in VRF-2, which makes that subnet unreachable.

Workaround – erase the specific leaf's configuration (ensure you have dual-homed connectivity) and reload the leaf; that seems to fix it.

Being a CCIE guy, I am not trying to project ACI as a bad product. I assume Cisco will ensure ACI is the best product, because Cisco launched it with high expectations; soon we will see a stable product.

We wish that Cisco will bounce back with new code addressing all these issues in this competitive software-defined world.

Best of luck !!

ACI troubleshooting commands – Part 1

I noted these for myself during troubleshooting and thought I would share them with all of you, though they are not in any particular order. I suggest you use the APIC controller to get the right information for an endpoint (leaf/vPC).
Firmware path in ACI – cd firmware/fwrepo

show port-channel ext
show vlan ext (shows the s/w and h/w VLANs and the encap VLAN)
acidiag fnvread (VTEP information)

acidiag avread (application vector; gives APIC controller information: version/IP/VTEP pool...)
show coop internal info ip-db | grep -A 10 – check an endpoint on the spine

vsh_lc (to get into the line-card shell)
show system internal elmn info vlan bri

Typing just vsh gets you into NX-OS mode.

Endpoint move check:

tail -f /var/log/dme/log/epm-trace.txt | grep "EP move" -B 1

tail -f emp.txt | grep -C 10 "MAC ADDRESS"

show coop internal info global – shows which spine is primary
show oob – mgmt IPs (from the APIC)

show system internal epmc endpoint ip
show system internal ethpm event-history interface eth1/17

cat /mit/sys/lldp/inst/if-[eth1/1]/summary – wiring issues
cat /mit/uni/fabric/compcat-default/swhw-*/summary | grep model
cat /mit/sys/summary – OOB IP

iping -V xyz.vrf -c 1000

From APIC Controller:

show vlan-domain vlan 1226 – mapped to which EPG, and static bindings
show endpoints vlan 1226 – endpoints of a specific VLAN
show oob – from the APIC controller
show tenant XYZ application database – complete application profile and endpoints
show tenant XYZ application database epg phx-e2-tims-db5-2309 endpoints – specific endpoints
show tenant XYZ epg phx-e1-payb-app1-1213 detail – static bindings
show tenant XYZ endpoint vlan 1213 – to know the EPG
show vpc map – to check the list of vPCs configured
show tenant XYZ endpoints | egrep "|AEPg"
show tenant XYZ ip interface bridge-domain | egrep "x.x.x.x|Interface" – to know the BD
show ip interface bridge-domain | grep 10.0.10 -B 3 -A 3 – where you don't have the endpoint, only the subnet, and you want to find the BD/EPG (reverse engineering)

show tenant E1-eCP ip interface bridge-domain | grep -A3 -B3 10.20.182

show vlan ext

show port-channel ext

show vpc ext

show vlan id 75,81 ext

OSPF commands

show ip ospf database self-originated vrf common:intra-app-west-vrf
show ip ospf neighbors vrf common:internet-west-vrf
show ip ospf database vrf common:internet-west-vrf

To be continued in the next part!!!

How to change the set metric/metric type (OSPF/BGP) in ACI

It's pretty straightforward, but a little tricky when configuring route-maps in ACI.

Example: changing the set metric/type.

Click on L3-out

Click on Route  Maps/Profiles

Create the route map/profile – the name must be "default-import" or "default-export" (depending on your requirement); it should not be given any other name.

Example: if you want to set the metric/type for all outgoing routes from the fabric:

Create a route map named default-export.

Select "Match Prefix AND Routing Policy".

Create a route context (leave the default order, 0) – the name here can be anything.

Set the rule and rule name (can be anything) – set the metric to the value you want, and the metric type (type-1/type-2). You can also do more in this section (modify BGP attributes as well).
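For readers who think in classic CLI, what the GUI builds here is conceptually equivalent to a route map like this (the metric value is hypothetical):

route-map default-export permit 10
  set metric 100
  set metric-type type-1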


Troubleshooting SAN boot from UCS!!

Well!! A lot of articles have been published on this same topic, but I want to publish my own version, consolidating the available articles, adding my experience, and making it understandable for people with moderate knowledge of UCS.

I am not going to explain how to create a service profile, because plenty of videos are available for that.

Here is a simple diagram with an explanation:


MDS (native FC, NPIV mode) – Fabric Interconnect (native FC, NPV mode) – UCS

  1) Configure the FC link as an uplink interface on both FIs.

    – Equipment – FI section – click on FI-A, then FC Ports – select 1/1 as an uplink port (do the same for Fabric B).

  2) Create VSANs 10 and 20 on the MDS and FI, as per the diagram.

    – Example: on FI-A, 1/1 is mapped to VSAN 10, and on FI-B, 1/1 is mapped to VSAN 20.

    – Go to SAN (click on SAN Cloud) – click on Fabric A – click on VSANs – create VSAN 10 with FCoE VLAN 10 – select Fabric A (repeat the same step for Fabric B).

    – Go to SAN (click on SAN Cloud) – click on Fabric A – Uplink FC Interfaces – click on port 1/1 and map VSAN 10 (repeat for FI-B).

  3) Ensure that the MDS is configured in NPIV mode:

    – feature npiv

    – feature fport-channel-trunk

  4) Ensure that the FI is configured in "FC end-host mode" (which means NPV is enabled by default).

    – Check the NPV FLOGI – show npv flogi-table / show npv status

    – Configure the boot policies.

  5) Create the boot policies:

    – boot order 1: CD-ROM

    – boot order 2: HBAs (fc template)

    – 1st HBA (fc0), as primary – point it to the Fabric-A storage controller WWN (get it from the storage team)

    – 2nd HBA (fc1), as secondary – point it to the Fabric-B storage controller WWN (get it from the storage team)

  6) One more thing: you need to give the LUN ID for each WWN. By default, 0 or 1 is used by most storage vendors, but some organizations may assign specific LUN IDs to their LUNs; you should get this from the storage team. Because of a wrong LUN ID, I got a LUN access error (click the link below for troubleshooting).

  7) Reboot and open a KVM console to the blade. While booting, you should see the LUN mapped to the blade, which means you are good from the storage perspective.

  8) Map the ESXi/any OS image to the blade before booting (for a first-time installation).

  9) KVM console to the blade – click on Virtual Media – Activate Virtual Devices – CD/DVD – map the ISO image and install ESXi/any OS.

  10) If you are having any issue with the LUN, here is the link to the troubleshooting document, because that is where I got my solution.
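On the MDS side, the matching configuration is roughly the following; the zone members are hypothetical WWPNs, and zoning specifics will vary by environment:

feature npiv
feature fport-channel-trunk
vsan database
  vsan 10
  vsan 10 interface fc1/1
zone name BLADE1-BOOT vsan 10
  member pwwn 20:00:00:25:b5:00:00:01
  member pwwn 50:00:09:72:00:01:02:03
zoneset name FABRIC-A vsan 10
  member BLADE1-BOOT
zoneset activate name FABRIC-A vsan 10

The first pwwn is the blade's fc0 vHBA and the second is the storage controller port, matching the primary path in the boot policy; Fabric B gets the same treatment in VSAN 20.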


Is CCDE really worth doing given current market trends?

I have gone through the material, and the list of books that need to be referred to is fantastic; there is no doubt about the content, given the depth of each and every topic.

If I am not wrong, you need to refer to almost 15-20 books 🙂!! After you go through all these books, you will become a "network design expert" (even beyond that...).

Definitely, Cisco has come up with a different approach (business/functional requirements mapped to technical requirements), a vendor-agnostic certification, different from any CCIE certification. It covers almost all the technologies, be it datacenter, enterprise, or service provider.

Up to this point I am fine, but I am not sure why this course has not included virtualization and SDN technologies (SD-WAN, and more information on VXLAN and EVPN), because the market is being driven by SD** in next-generation networks??

Do we really need to go into depth on topics like MPLS L2/L3?? And spanning-tree technologies? They are slowly fading away, being replaced by spine-and-leaf architectures in datacenters, though I am not sure whether service provider and enterprise networks have moved past spanning tree. The same applies to FHRPs, because datacenters are preferring FabricPath. I would rather have seen more stress on depth in the VXLAN/NVGRE overlay protocols, designing datacenters with EVPN, and so on.

I think the certification should have covered cloud-specific designs as well, e.g. hybrid cloud designs (merging on-prem and public/private cloud designs) and OpenStack Neutron functionality, because Cisco is also focused on OpenStack private cloud networking (Metapod...).

And it should have included a couple of use cases/designs using SDN technologies (SD-WAN/enterprise/datacenter).

Of course, my perception of the CCDE course content might be wrong; if so, please correct me. One thing I can say is that the course outline is really impressive, but the worry is: why do we need to spend so much time on a couple of legacy technologies?

Finally, I appreciate Marwan and Orhan for publishing good books/videos/design articles.

This is just my opinion; I am not challenging the CCDE course content!!!

Looking for comments and inputs….!!

Next-generation network engineer skills!!

A couple of years back, while going through the Cisco Live library, I found the screenshot below. In the center could be ACI or any SDN product (is anyone aware of how many SDN products are available in the market?? Almost 140, but I know very, very few 🙂).

Next Generation skills

Recently we have seen a lot of articles saying that SDN/NFV/open networking is going to rule the networking world...!!!

In the future, market requirements are going to change. Will there no longer be any network engineers, the role being replaced by new ones, viz. cloud network/SDN experts/network programming experts??? Partially, yes.

What additional skills are required along with an expert-level certification (CCIE or any kind of expert-level certificate)? Here are my thoughts!!

We should know the functionality of the SDN (ACI/NSX...) and NFV products, and should know how we can integrate them with a legacy network.

Improve your virtualization skills (VMware/Hyper-V/KVM); just try to understand how these technologies function, because it might be useful when you are troubleshooting an end-to-end flow.

Improve your cloud networking skills (AWS/Azure), because in the future your management may ask you to integrate the existing datacenter with the cloud for various reasons (cost cutting, agility...), and DR/UAT/test environments may move to the cloud. So you need to know the options for extending network connectivity from on-prem to the cloud (viz. Direct Connect in AWS and ExpressRoute in Azure).

Understand open networking functionality, viz. OpenStack Neutron/OpenDaylight/OpenFlow.... To handle these technologies, you should be well versed in the Linux OS.

Last but not least, we should know programming: Python/JSON/XML. Of course, I understand this is a dry area for all of us, including me 🙂, but we have no other option.

I have seen a couple of articles asking whether we really need to know container functionality as well (viz. Docker)??? I am also not sure at this point.

In this post I may have covered only specific domains (datacenter, enterprise networks) and not much on service providers, but as per my knowledge, NFV plays a key role for service providers.

Cheers to the network folks. We have to accept the facts, get up to speed, and take the hot seat and grow before someone else occupies it in the SDN era.

http://sateeshkolagani.com/?m=201510 – you can look for the SDN vs NFV vs traditional networking post from a couple of months back (sorry, I forgot the source; it is depicted well in a diagram).

My research on infrastructure design for big data

The future is going to change for networking folks; we should be ready to handle application-aware networks and have a better understanding of application functionality to come up with the best network design.

As part of this transformation, fortunately/unfortunately 🙂 I got a chance to work on a Hadoop solution. During my research on the Internet/Google I came to know that to handle big data we require special hardware (compute/network/storage), and I also learned how big data works and why we need special infrastructure.

Hadoop is an open-source data-mining platform to process/convert large sets of a variety of unstructured data into structured data in a data lake. It integrates with big data platforms such as Cassandra/MongoDB/CouchDB, etc.; clusters are managed using Ambari, coordination is handled by ZooKeeper, and Sqoop takes care of data loads from an RDBMS into HDFS, and so on...

Today the market-leading Hadoop ecosystem distributions are:

  1. MapR
  2. Cloudera
  3. Hortonworks

The Hadoop ecosystem – please don't ask me how it functions 🙂


Here are my inputs on choosing the right hardware for a big data platform.

Key principles which should be considered while designing a Hadoop environment:

  • Usually not virtualized (a hypervisor only adds overhead)
  • Usually not blade servers (not enough local storage)
  • Usually not highly oversubscribed (significant east-west traffic)
  • Usually not SAN/NAS (local direct-attached storage is preferred)
  • Servers should have 10 Gig ports

Network options:

  1. To handle a Hadoop platform's high-density traffic, the datacenter would require 10/40 Gigabit ports and low-latency switches like the Cisco Nexus platform (5K/3K) and the UCS Common Platform Architecture to deliver high performance.
  2. A Cisco ACI kit (Nexus 9K), but I haven't seen the right use cases for ACI with big data.

Myself, I would prefer to go with option 1. If anyone is interested in next-generation networking, they can go with ACI, but they will definitely have more challenges while deploying and integrating.

Compute options:

  1. UCS C240 M3 rack servers
  2. UCS CPA (Common Platform Architecture), for example:

Two Cisco UCS 6296UP Fabric Interconnects

Eight Cisco Nexus 2232PP Fabric Extenders (two per rack)

64 Cisco UCS C240 M3 rack-mount servers (16 per rack)

A single domain scales up to 10 racks / 160 servers

Four Cisco R42610 standard racks

Of course, we don't need to go with all of the above-mentioned components. Initially, go with one rack (two FEXes) and a few rack servers, and keep adding servers as and when required.
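A quick back-of-the-envelope oversubscription check for one such rack, assuming each C240 M3 uses one 10G port per fabric and each 2232PP FEX has eight 10G uplinks to the FI:

16 servers x 10G = 160G of downlink per FEX
8 uplinks x 10G = 80G toward the Fabric Interconnect
160G / 80G = 2:1 oversubscription per fabric

For Hadoop's heavy east-west traffic, the closer this ratio stays to 1:1 (fewer servers per FEX, or more uplinks), the better.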

Recommended FEX connectivity by Cisco:


10 Hadoop Hardware Leaders:


  • Source: Cisco live BRKAPP-2033/BRKCOM-2011

Refreshing my memory on Cisco ACI (Part II)

I am trying to recollect some more points on ACI, continuing from my previous post.

ACI is like a simple modular switch; the diagram depicted below was supposed to be in Part 1 🙂

ACI - Modular switch

AVS – Application Virtual Switch, supported only on VMware (VEM)

  • Essentially a modified N1K VEM with an OpFlex agent (port groups backed by VXLANs)
  • The APIC also talks to the AVS/VEM over OpFlex and assigns it an IP address, just like any other fabric component

AVS flow

AVS switching modes:

  • Local switching: intra-EPG traffic is switched on the same host
  • FEX mode: all traffic is sent to the leaf for switching
  • Full switching: full APIC policy enforcement on the server


  • The X9700 is the only ACI-supported line card
  • NX-OS line cards are different from ACI line cards
  • Leaf and spine communicate over IS-IS (by default) and iBGP (configurable for route leaking)
  • Traffic is normalized into eVXLAN (the ACI VXLAN) at the leaf, and communication happens based on source and destination EPG
  • If a leaf does not know the destination MAC, traffic is sent to the spine
  • If even the spine does not know it, the frame is dropped by default; however, we can configure it to flood such frames
  • A leaf identifies a new host as it comes up via snooping and reports it to the spine through a communication protocol called COOP
  • Old entries on a leaf switch are removed after 5 minutes
  • The APIC is configurable through CIMC and KVM
  • The APIC further configures the spine and leaf switches, starting with IP assignment
  • Management IPs offered by the APIC to the fabric are only for management communication, not for any outside access
  • The APIC communicates with the fabric over a dedicated VRF called overlay-1
  • The VM kernel IP subnet should be different from the APIC IP assignment subnet
  • VLAN ID 4093 is required for the infrastructure network
  • The kernel of the APIC is CentOS
  • You cannot conf t on leaf switches