My research on infrastructure design for BigData

The coming trend is a big change for networking folks: we should be ready to handle application-aware networks and understand how applications work in order to come up with the best network design.

As part of this transformation, fortunately or unfortunately 🙂, I got a chance to work on a Hadoop solution. While researching on the Internet (Google), I came to know that handling big data requires special hardware (compute, network, storage), and I also learned how big data works and why it needs special infrastructure.

Hadoop is an open-source platform for storing and processing large sets of varied, unstructured data and converting them into structured data in a data lake. It is commonly used alongside NoSQL databases such as Cassandra, MongoDB, and CouchDB. Within the ecosystem, Ambari is used to manage the cluster, ZooKeeper handles coordination between services, Sqoop loads data from an RDBMS into HDFS, and so on.

Today the market-leading Hadoop distributions are:

  1. MapR
  2. Cloudera
  3. Hortonworks

Hadoop ecosystem, please don't ask me how it functions 🙂

Hadoop

Here are my inputs for choosing the right hardware for a big data platform.

Key principles to consider when designing a Hadoop environment:

  • Usually not virtualized (the hypervisor only adds overhead)
  • Usually not blade servers (not enough local storage)
  • Usually not highly oversubscribed (significant east-west traffic)
  • Usually not SAN/NAS (covered in the Cisco Live sessions listed under Source)
  • Servers should have 10 Gigabit ports
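To make the oversubscription point concrete, here is a minimal sketch of how the ratio for one access switch could be estimated. The function name and the example numbers (16 servers with one 10 Gigabit link each, 8 × 10 Gigabit uplinks) are illustrative assumptions, not figures from a real deployment.

```python
def oversubscription_ratio(servers, server_gbps, uplinks, uplink_gbps):
    """Return downstream/upstream bandwidth for one access switch.

    A result of 1.0 means no oversubscription; higher values mean the
    server-facing ports can offer more traffic than the uplinks carry.
    """
    if uplinks <= 0 or uplink_gbps <= 0:
        raise ValueError("uplinks and uplink_gbps must be positive")
    downstream = servers * server_gbps   # total server-facing bandwidth
    upstream = uplinks * uplink_gbps     # total uplink bandwidth
    return downstream / upstream


# Assumed example: 16 servers at 10G each, 8 uplinks at 10G each.
ratio = oversubscription_ratio(servers=16, server_gbps=10,
                               uplinks=8, uplink_gbps=10)
print(f"{ratio:.1f}:1")
```

With heavy east-west traffic, the lower this ratio the better; the acceptable value depends on the workload.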

Network options:

  1. To handle the Hadoop platform's heavy traffic, the data center needs 10/40 Gigabit ports and low-latency switches, such as the Cisco Nexus platform (5K/3K), together with the UCS Common Platform Architecture to deliver high performance.
  2. Cisco ACI kit (Nexus 9K), though I haven't yet seen good use cases for ACI with big data.

Personally, I prefer Option 1. Anyone interested in a next-generation network can go with ACI, but it will definitely bring more challenges during deployment and integration.

Compute options:

  1. UCS C240 M3 servers
  2. UCS CPA (Common Platform Architecture)

  • Two Cisco UCS 6296UP Fabric Interconnects
  • Eight Cisco Nexus 2232PP Fabric Extenders (two per rack)
  • 64 Cisco UCS C240 M3 Rack-Mount Servers (16 per rack)
  • Single domain: up to 10 racks, 160 servers
  • Four Cisco R42610 standard racks

Of course, we don't need all of the components listed above from day one. Start with one rack (two FEX) and a few rack servers, then keep adding servers as and when required.
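The growth path can be sketched with a little arithmetic. The snippet below is a rough planning aid, not a Cisco sizing tool: it uses the per-rack and per-domain figures listed above (16 servers and two FEX per rack, up to 10 racks and two fabric interconnects per domain), and the function name `plan_cluster` is hypothetical.

```python
import math

SERVERS_PER_RACK = 16            # from the component list above
FEX_PER_RACK = 2                 # two fabric extenders per rack
MAX_RACKS_PER_DOMAIN = 10        # single domain: up to 10 racks
INTERCONNECTS_PER_DOMAIN = 2     # two fabric interconnects per domain


def plan_cluster(servers):
    """Estimate racks, FEX, and domains needed for a server count."""
    if servers < 1:
        raise ValueError("servers must be at least 1")
    racks = math.ceil(servers / SERVERS_PER_RACK)
    domains = math.ceil(racks / MAX_RACKS_PER_DOMAIN)
    return {
        "racks": racks,
        "fex": racks * FEX_PER_RACK,
        "domains": domains,
        "fabric_interconnects": domains * INTERCONNECTS_PER_DOMAIN,
    }


# Start small, then grow to the 64-server and 160-server configurations.
for count in (8, 64, 160):
    print(count, plan_cluster(count))
```

For 64 servers this gives four racks and eight FEX, matching the list above; anything beyond 160 servers spills into a second domain under these assumptions.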

FEX connectivity recommended by Cisco:

Fex-connectivity

10 Hadoop Hardware Leaders:

http://www.informationweek.com/big-data/hardware-architectures/10-hadoop-hardware-leaders/d/d-id/1234772

  • Source: Cisco Live BRKAPP-2033/BRKCOM-2011
