The Magical World of Data Science
“Any sufficiently advanced technology is indistinguishable from magic.”
– Arthur C. Clarke
The advance of technology and its incorporation into our lives has led to the
generation of data in huge amounts. Data is extracted from various sources such
as social media, e-commerce sites, mobile phones, healthcare, internet searches,
online surveys, etc. This variety of data holds many hidden insights, which can
produce magical results with the help of the emerging field called ‘Data Science’.
Science has provided us with an empirical approach in which things are
discovered and described by observing and experimenting. Data Science is a form
of applied science that carefully studies large amounts of data using various
analytical techniques and then describes the results.
Earlier, business houses were totally dependent on the intelligence and gut
feeling of a small group of people. Nowadays, this work has been taken up by
intuitive tools and techniques, which have brought about a significant shift in
policy making in areas such as healthcare, travel, shopping, and social and
political issues.
Data Science performs its magic using various tools, producing analysis that
converts cumbersome stored data into useful insights.
A very interesting example of data science that we use in daily life is Google
Maps. Many people use Google Maps without knowing how it works: it is we, the
people using it, who provide the information about traffic and routes. Whenever
a car drives along a route and a person sitting in the car has a cell phone with
Google Maps installed and connected to the internet, data such as the speed of
the car and its route is sent to the Google Maps server, where it is analyzed to
produce real-time traffic reports for other users.
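The idea can be illustrated with a small Python sketch. It does not show how Google Maps is actually implemented; it only assumes that each phone sends a (road segment, speed) pair, and the road names, speeds and the 20 km/h threshold are all made up for the illustration. The server-side step is reduced to averaging the reported speeds per road and labelling each road.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical speed reports sent by phones: (road_segment, speed_kmh).
reports = [
    ("Ring Road", 12.0),
    ("Ring Road", 18.0),
    ("Ring Road", 15.0),
    ("Airport Road", 55.0),
    ("Airport Road", 65.0),
]


def traffic_summary(reports, slow_below_kmh=20.0):
    """Group reports by road segment and label each one by its average speed.

    The threshold is an assumed value chosen only for this example.
    """
    speeds_by_segment = defaultdict(list)
    for segment, speed_kmh in reports:
        speeds_by_segment[segment].append(speed_kmh)

    summary = {}
    for segment, speeds in speeds_by_segment.items():
        average = mean(speeds)
        label = "heavy traffic" if average < slow_below_kmh else "clear"
        summary[segment] = (average, label)
    return summary


# Print one line per road, as a stand-in for a traffic report.
for segment, (average, label) in traffic_summary(reports).items():
    print(f"{segment}: {average:.1f} km/h, {label}")
```

The more phones report from a road, the more reliable its average becomes, which is why the users themselves are the source of the traffic information.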
Various tools that can be used to show magical results with data sets:

| S.No. | Name of the Tool | Description |
|-------|------------------|-------------|
| 1 | RapidMiner | Not just a GUI; it also supports people who use Python and R for model building. The product line includes several products built for big data, visualization and model deployment, some of which (enterprise) require a subscription fee. |
| 2 | Rattle GUI | This GUI is built on the R programming language and is free software. It is more than just a data mining tool: Rattle supports various ML algorithms such as Tree, SVM, Boosting, Neural Net, Survival, Linear models, etc. |
| 3 | Orange | An open-source data visualization, machine learning and data mining toolkit. It has an extensive library for data mining tasks, including classification, regression and clustering methods. |
| 4 | Weka | Free software that provides options for data pre-processing, classification, regression, clustering, association rules and visualization. |

Puja Munjal
Assistant Professor
Dept. of Information Technology