Background
In the classic analytics space, companies run algorithmic analysis
and share the findings as a white paper. However, such findings go
stale quickly. Top-notch vendors such as SAP, Oracle, Microsoft and
IBM charge licensing costs for the database, the platform and
approximately 25 other products that they have acquired over more
than a decade. These big four are coming up with good solutions for
Big Data, but all of them are extremely costly and few people are
skilled in them. With the advent of Big Data and Hadoop-based
infrastructure, today's analytics companies need to have all the
skills: they should be able to set up a Big Data engineering lab
with engineers who know the best way to deal with large amounts of
data and who are data scientists as well. Finally, you need a
strong BI visualization framework or tool whose dashboards can be
seen on all devices.
Companies like BizViz, which have the hard-earned capability to
work end to end using home-grown utilities and open source
software, can do this job in the most economical manner possible
and with the best control over real-time analytics.
The motive behind collecting a huge amount of data is to gain
deeper analytical insight and to identify correlations in the data,
which is done in later phases. However, the data collected had a
very diverse and unstructured format and was of no use for
analytics in that form. The data was stored in Hadoop clusters (in
HDFS).
What is Big Data?
Well, everybody knows this by now. Let’s represent this figuratively as explained by our experts, clubbing all sorts of data into a data cobweb.
Why is Big Data Analytics required?
Everyone knows that data volumes are growing exponentially. What
is not so clear is how to unlock the value they hold. To improve
the health of a person we monitor all parameters, yet in an
organization more than 50% of the data is unstructured or
partially structured, and we do not use it to check the health of
the organization. The health of an organization is always
relative. With the Internet of Things, we need to keep a close
watch on what is happening in our business dimension vis-à-vis the
competition. Organizations that do not do Big Data Analytics will
probably perish in the next 10 years; put another way, the
organizations that did Big Data in the last 5-10 years are ruling
the internet world today. The Internet of Things will bring all
business platforms onto the internet, and Big Data will decide the
financial growth, competitiveness and target markets of any
progressive organization.
Key Skills to provide Big Data Analytics
These 4 skills come up repeatedly in this document. In my view
all 4 are important for delivering a profitable real-time Big
Data platform to an enterprise.
BizViz, either directly or through an SME partner, is able to
provide all 4 skills. This way the customer can be completely sure
that one vendor takes care of all 4 legs of Big Data Analytics and
that world-class support will be available for a long time.
What are the pre-requisites for a Big Data Implementation?
Different analysts have been talking about ROI from Big Data
implementations. Some say a 25% return, some say a return of 55
cents for every $1 invested. All these figures baffle us. In our
view there should be a minimum return of $30 for every $1 invested
within 5 years; otherwise something is grossly wrong. One of the
key points is to ensure that all of the following pre-requisites
are met before you start a Big Data project.
- You have a full-fledged BI implementation
system and you are happy with the ROI it has delivered to your
organization.
- You have done 'What-IF' analysis of your
key Financial Parameters and have been flexible enough to make
changes in your organization.
- You have been taking regular feedback or
Survey from your customers, partners, vendors etc.
- You have defined a clear problem which you
want to solve by Big Data Implementation.
- You have allocated a specific budget to
solve this problem. Part of the budget covers the time of internal
resources who can work with a Big Data partner or vendor like
BizViz.
Note: In case you have not done 1, 2 & 3,
we suggest you take BizViz's BI consultancy services, do a
complete BI implementation and ask your team to implement it.
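The ROI figures quoted at the start of this section are easier to
compare once they are put on the same footing. The following is a
minimal Python sketch, assuming each dollar figure means the gross
amount returned per $1 invested (the text does not say whether the
figures are gross or net). The function names are illustrative
only.

```python
def roi_percent(returned: float, invested: float) -> float:
    """Net gain as a percentage of the amount invested."""
    return (returned - invested) / invested * 100


def annualized_rate(returned: float, invested: float, years: float) -> float:
    """Compound yearly growth rate that turns `invested` into `returned`."""
    return (returned / invested) ** (1 / years) - 1


if __name__ == "__main__":
    # Assumption: both figures are gross amounts returned per $1 invested.
    for label, returned in [("55 cents per $1", 0.55), ("$30 per $1", 30.0)]:
        print(label, round(roi_percent(returned, 1.0), 1), "% ROI")
    # The $30 target is stated over 5 years, so annualize it.
    print("yearly rate:", round(annualized_rate(30.0, 1.0, 5), 3))
```

Read this way, a $30 return on $1 over 5 years implies the
investment roughly doubling in value every year, which shows how
demanding that target is.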
Exception to Pre-Requisites
Suppose you have identified a separate
module where you want to use a Hadoop-based data warehouse
directly to save cost per TB. Then you should go ahead, but this
is not a Big Data implementation project; it is more like a
POC.
1. Subject Matter Expert at Work
- The domain expert understands the problem
statement and defines the problem in a simpler way that the
technical team can use to solve it.
- Knows where the data is residing, which
data is useful and what data can be used for which type of
Analytics.
- Can design different Scorecards,
Algorithmic charts, Benchmarking Analysis reports & dashboards.
- Works with the end client to define the
real problem and agree on a scope, and drives the technical
team.
2. Unlock Big Data - Working on Data Collection Layer
- Understand existing data sources.
- Search and navigate data within existing
systems.
- Reading Web Data - Crawling or scraping
of data. Data can even be scraped from image, PDF, Doc, audio
and video files.
- Reading Social Media Data - we have a
connector through which we can read data from Facebook, Twitter,
LinkedIn etc. This is a tool developed by BizViz. More
details on this are in another blog on the BizViz website.
- Reading structured or unstructured data
from web data apps like Salesforce, Google Analytics etc.
- Providing end to end Survey Services
where data can be captured as a normal survey or as a text based
survey. BizViz has a complete end to end Survey Platform and
Services [www.BizVizsurvey.com] & dashboards.
- We have dynamic HTML5 forms where metric
data can be entered from mobile devices and used directly
for dashboards. This is a tool provided by BizViz which is
part of our HTML5 portal and runs on all devices.
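The web scraping step in the list above can be sketched in a few
lines. This is a minimal Python illustration, not the BizViz
crawler: it assumes the page has already been downloaded as an
HTML string, uses only the standard library, and the class and
function names are made up for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and ignores <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside script/style
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    page = "<body><h1>Q3 Results</h1><script>var x = 1;</script><p>Revenue grew.</p></body>"
    print(extract_text(page))
```

Text extracted this way is still raw. It feeds the clean-up and
classification steps of the Data Processing Layer.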
3. Data Processing Layer
Data Clean Up - The unstructured data can
be very large and most of it might be meaningless on its own, but
its collective message is meaningful and impactful. One needs to
filter this data: convert it to lower case, remove punctuation
marks, stem words so that exact matches work, and so on. Use NLP
(Natural Language Processing) techniques at this step.
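The clean-up steps named above (lower-casing, punctuation removal
and stemming) can be sketched as follows. This is a minimal
Python illustration under stated assumptions: the suffix-stripping
stemmer is a deliberately crude stand-in for a real stemming
algorithm, and the function names are hypothetical.

```python
import string

# Assumption: a tiny suffix list used only for illustration. A real
# pipeline would use a proper stemmer from an NLP library.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip one common suffix, keeping at least three leading characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean_text(text):
    """Lower-case, remove ASCII punctuation, split on whitespace, then stem."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return [crude_stem(token) for token in no_punct.split()]


if __name__ == "__main__":
    print(clean_text("Customers LOVED the dashboards, really!"))
```

After this step "Dashboards," and "dashboard" reduce to the same
token, which is what makes exact matching possible later on.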
Categorization & Classification of Data -
Use machine learning tools like Apache Mahout or Enterprise R.
These are different tools that provide different algorithms for
clustering and classification. Automated Text Conversion is also
used here for proper classification of unstructured data.
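To make the classification step concrete, here is a minimal
Python sketch of one common text classifier, multinomial Naive
Bayes with add-one smoothing. It is a small stand-in chosen for
illustration, not the Mahout or R implementation, and the labels
and sample tokens are invented.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """samples: iterable of (tokens, label). Returns the counts the model needs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def classify(tokens, model):
    """Return the label with the highest log-probability for the tokens."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        score = math.log(doc_count / total_docs)          # class prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train([
        (["late", "delivery", "refund"], "complaint"),
        (["broken", "refund", "angry"], "complaint"),
        (["great", "service", "thanks"], "praise"),
        (["love", "great", "dashboard"], "praise"),
    ])
    print(classify(["refund", "late"], model))
```

The tokens passed in would normally be the output of the clean-up
step, so that variants of the same word count as one feature.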
Finding the relations between different data
and pushing the result into the Hadoop File System. This step uses
various tools and paves the way for modern data warehousing, which
will change the way we think of a conventional database.
Hadoop Framework -
Hadoop based DWH Implementation Benefits:
- In Hadoop, you don’t need to know which questions will be asked before designing the data warehouse
- Simple Algorithms on Big Data outperform
complex models
- Powerful ability to analyse unstructured
data
- You can save Millions in TCO
- 10x faster, 100x cheaper long term
solution
- Maintains the same SLAs as you have been
maintaining
- Changes can be implemented without
impacting users
Data Organization Layer
- The relevant data can now be moved into
Hive [which again presents data as rows and columns, like MySQL
or Oracle].
- A query can be written on this data [Hive
Query Language to get relevant data].
- The latency of Hive is high, so a Data Mart
layer is created. In-memory engines like Spark/Shark are in the
R&D stage but will come out soon to give an in-memory flavour to
open source databases.
- Here tools like Cloudera Impala
can also be used as an MPP [massively parallel processing]
query engine.
- Here we can bring in data from other
structured databases as well.
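The query and data mart steps in the list above can be sketched
like this. The example is a minimal Python illustration in which
the standard library's sqlite3 module stands in for Hive, so that
it runs without a cluster. The SQL is kept to constructs that Hive
Query Language also accepts, and the table and column names are
hypothetical.

```python
import sqlite3


def build_mart(rows):
    """Load raw rows, then pre-aggregate them into a small data mart table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE feedback (region TEXT, category TEXT, score INTEGER)")
    conn.executemany("INSERT INTO feedback VALUES (?, ?, ?)", rows)
    # The mart stores one row per region/category so that dashboards
    # query a small summary table instead of the raw data.
    conn.execute("""
        CREATE TABLE feedback_mart AS
        SELECT region, category, COUNT(*) AS n, AVG(score) AS avg_score
        FROM feedback
        GROUP BY region, category
    """)
    result = conn.execute(
        "SELECT region, category, n, avg_score FROM feedback_mart "
        "ORDER BY region, category").fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    raw = [("EMEA", "complaint", 2), ("EMEA", "complaint", 4),
           ("APAC", "praise", 5)]
    for row in build_mart(raw):
        print(row)
```

Because the summary is computed once, the high query latency is
paid when the mart is built and not on every dashboard refresh.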
Data Connector Layer of Analytic Engine
- Once we have the relevant data, we can use
a connector to read it into an analytics engine. This can
be R Server or any third party analytic application like Tableau,
Jasper, SAP Business Objects or BizViz's own BizViz.
- A query can be written on this data [Hive
Query Language to get relevant data].
- The data is fed into R Server and the results are
pushed back into the server layer of the BI visualization framework.
The BizViz or UI Layer - Completely developed by BizViz.
- The entire business can be visualized
using our dashboarding product called BizViz. This actually
represents the entire Big Data Analytics Framework.
- BizViz has a Designer which helps us
to select different charts and design the UI layer as envisaged
by the SME.
- The analytics data is passed to the UI
server and on to the relevant charts, where predictive
components are handled.
- The normal data can be passed to relevant
charts by writing query services.
- The dashboard can be made on Benchmarking
data, Individual customer data, Group data, Survey data or a
combination of all of them.
- We have an HTML5 Designer and HTML5 based
technology, so the dashboards can run on all devices and the
latest browsers.
The Hosting and Display of Analytics/Dashboards - BizViz has an HTML5 Portal
- To display dashboards we need a portal
that can host dashboards.
- This portal can be hosted on premise or
in the cloud.
- The portal is developed in HTML5 so that
it can run on all devices and all the latest browsers. The
portal and dashboards run seamlessly on iPad and other mobile
devices.
- The portal has strong Security, User Admin
& Audit features.
Note: Setting up infrastructure involves
setting up Hadoop clusters, Hadoop administration and setting up
other servers for activities like BI, reporting, analytics and R
Server. This is part of our Big Data Services.
TCO of a Big Data Project
With BizViz the TCO is very low
compared to any other Big Data vendor. We are able to achieve this
for the following reasons -
- Quick time to value, since we have all 4 Big Data
Analytics capabilities in hand.
- World class support at offshore rates.
- Full Open Source Compatibility &
Integrated installation of other components.
- Enhanced Business Knowledge with flexible
BizViz Analytics platform.
- Reduced operational risk, since we have
experience of working on different Big Data projects.
- Strong delivery platform - HTML5
dashboards with world class UI [already being used by a few
Fortune 500 clients].
How to ensure ROI from a Big Data Project
In organizations across different
continents, the following is the current ratio of structured,
semi-structured and unstructured data. One can easily see that
when business decisions are based on only 50% of the data, most of
them are going to fall flat within some years.
With the world getting digitized and huge
amounts of data being added daily, this share will fall further,
from 50% to 20-25%, in the next few years. Now think of 2
competing companies, one that has invested in Big Data and
analyses 100% of its data and another that is doing traditional
BI. The change in business dynamics will be so great that the
second company will certainly be wiped out in a few years. So the
first goal is to survive, and to survive one will have to do Big
Data Analytics.
- Ensure that your team is not resistant to
change. One needs to apply the new findings intelligently.
- There is a lot of iterative and exploratory
analysis - take small steps and, once you get results, increase
the intensity.
- Start with a proper POC that doesn't
involve too much cost. The really high cost is that of real time
analytics. Before moving to real time, ensure the findings are
working for you.
- Go module by module. It is better to
implement Big Data for a new module.
- Once the solution is giving ROI, invest in
making it real time.
When do you stop a Big Data Project
- If Big Data is not resulting in any
ROI, then look at the problem statement & the recommendations from
the vendor.
- In case you have not implemented the
recommendations of the vendor, then that needs to be done first
for ROI. If you can't do it, then stop the Big Data project.
- In case you have implemented the
recommendations and you are getting either no results or
negative results, then change the vendor immediately.
Note1: Generally one should never stop
a Big Data project. One can change the vendor if there are no
signs of ROI within 12-18 months.
Note2: Since Big Data is currently new,
one will find thousands of Big Data vendors who are just
selling Big Data services without actually knowing it. They might
be strong on sales but don't actually have implementation
capabilities. Be aware that Big Data is not just Hadoop & every
new Big Data start-up may not know its complexities.
Big Data Comparison Table
The above table will give you an
excellent idea of who is providing what. Most of these are
licensed products and some of them are extremely high cost.
BizViz Value Statement
- No Software Licenses - No need to buy any
software licenses from any third party. BizViz does sell licenses
of its dashboarding product but those licenses come free along
with a large Big Data Project.
- Everything from BizViz - Get Big Data
Analytics from a company which does everything from requirements
to setting up a Big Data Lab to efficiently manage and upgrade
the system in any technology.
- Get up and running quickly - With
different tools built by BizViz on Big Data, any customer can
expect results in a few weeks or months.
- Highly Cost effective - Please see the TCO
section for more details.
- Experiment with analysis on different data
and combine them with other sources.
- Perfect merging of Traditional BI and Big
Data Approaches.
- Top end Business Visualization that runs
on all devices.
- Top end services from a team which has
worked in BI for the last 15 years.
- The BizViz team already has experience of Big
Data projects in Financial Services, Marketing Intelligence,
Education, Banking and the Automobile Industry.