Security Issues:
A Major Concern with the Growing Dependency on and Usage of Big Data Concepts
With the advance of technology, almost every kind of organization now uses
big data technology and works actively on big data projects: large and small
enterprises, government and private organizations, healthcare, the airline
industry, sensor networks, the education sector, the sales and marketing
departments of most companies, and various government digital initiatives.
Demonetization is a recent example of an event that generated a huge amount
of data. Analytics engineers are working day and night to uncover the facts and
details about black money defaulters. If analytics are not applied to the data
in time, the data becomes stale. At present, data is generated by three
explosions: the Cloud Computing Explosion, the Data Explosion, and the
Conversion Explosion. This data is stored in large repositories that grow every
second and already hold petabytes or even zettabytes.
Big Data Systems:
Big data systems can store very large amounts of data; can manage that data
across many systems; and provide some facility for data queries, data consistency,
and systems management.
The challenge is not to manage a boatload of data; many platforms can do
that. And it’s not just about analysis of very large data sets. Various data
management platforms provide the capability to analyze large amounts of data,
but their cost and complexity make them non-viable for most applications. The
big data revolution is not about new thresholds of scalability for storage and
analysis.
Big Data as a Specific Technology:
Big data as a specific technology is not Hadoop HDFS, Lustre, Google GFS, or
any other sharded storage system; it is more than managing big data sets. Nor
is it a MapReduce cluster, because it is more than how you query large data
sets. Big data combines the traits of several different technologies, which is
what attracts developers, data managers, and large-scale data analysts. It is
a collection of cost-effective technologies with different attributes and
capabilities that work together to deliver effective results.
As a data repository, a big data system can handle large amounts of data held
in distributed and redundant storage. It can process tasks in parallel and is
readily available as a commercial or open source product. It is extensible: its
basic capabilities can be augmented and altered easily. The big data revolution
rests on three pillars: big, cheap, and easy data management. Together they
make it possible to scale the data store at greatly reduced cost and to analyse
data of all complex types easily, effectively, and quickly. These are
characteristics that traditional databases generally lack.
Hadoop Framework:
Most big data systems use one or more components of the Hadoop framework and
extend some or all of its basic functionality. The components of the Hadoop
framework can be explained through how they work together, divided into five
layers:
Layer 1: Data Storage: HDFS
Layer 2: Data Processing: MapReduce
Layer 3: Task and Resource Management: YARN
Layer 4: Data Access: Pig
Layer 5: Orchestration: HBase
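To make the data processing layer more concrete, the MapReduce idea can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle, and reduce steps on a single machine; it does not use Hadoop itself, and the function names and sample records are assumptions made up for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence over a list of text records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical sample input, standing in for records read from storage.
records = ["big data security", "big data systems", "data security"]
print(word_count(records))
```

In a real cluster the map and reduce steps would run in parallel on many nodes, with the framework handling the grouping between them; the sketch only shows the shape of the computation.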
The Hadoop framework is much like a LAMP stack. Normally these pieces are
grouped together, but you can mix and match, or add onto the stack, as needed.
For example, there are optional data access services like Sqoop and Hive.
Lustre, GFS, and GPFS are data storage alternatives to HDFS. Or you can extend
HDFS functionality with tools like Scribe. The entire stack can be configured
and extended as needed.
This technology enables organizations to collect, manage, and analyze the huge
amounts of data lying in their repositories and to make better, more efficient
decisions for their growth. It has significantly changed the nature of
analytics, from descriptive analytics to disruptive analytics to predictive
analytics, so that this data can be leveraged in global business to improve the
economy. As the technology benefits companies in terms of growth prospects,
however, the repositories fill up with sensitive data, and this raises security
issues. It is a matter of serious concern, because the sensitive information of
companies and individuals is stored in repositories where it can be misused and
can enable fraud. A great deal of demographic information about individuals is
also stored in large repositories under the UIDAI scheme. At the same time,
because an individual's Aadhaar number is linked to the various government
schemes offered to citizens, this data circulates freely. A small human error
made out of unawareness is enough to make someone a victim.
With the increased adoption of cloud-based, web-based, and mobile
applications, sensitive data has become easily accessible from many types of
platforms. These platforms are highly vulnerable to hacking, especially if they
are low-cost or free.
Hackers and crackers keep their eyes on this flow of data. At any time and by
any means they may capture personal information and use it for malicious
attacks, fraud, and other activities that cause personal or financial harm.
This is why the protection of private and confidential information is gaining
more and more attention. Big data security ranks among the highest priorities
in IT strategy; given the rate at which data is exploding, protecting
information is of crucial importance. A lack of data security can lead to great
financial losses and reputational damage for a company. With continuously
exploding big data, the losses caused by poor IT security can exceed even the
worst expectations.
Challenges of Big Data Security:
· Single-level protection in most distributed systems.
· Evolution of NoSQL databases without built-in security features.
· Need for additional security measures in automated data transfer.
· Validation of the information received, to confirm that it remains trustworthy and accurate.
· Mining of personal information without asking users for permission or notifying them.
· Lack of separation between different levels of confidentiality within the company.
· Lack of routine audits performed on big data.
· Inconsistent monitoring and tracking of the origin of big data.
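Two of the challenges above, securing automated data transfer and validating that received information is still trustworthy, can be illustrated with a message authentication code. The following is a minimal sketch using Python's standard hmac and hashlib modules. The shared key, the record fields, and the function names are assumptions invented for this example; a real system would also need proper key management.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret; in practice it would come from a key store.
SHARED_KEY = b"demo-key-not-for-production"


def sign_record(record, key=SHARED_KEY):
    """Return an HMAC-SHA256 tag computed over a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def is_trustworthy(record, tag, key=SHARED_KEY):
    """Check that the record still matches the tag created by the sender."""
    expected = sign_record(record, key)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, tag)


# The sender signs the record before an automated transfer.
record = {"sensor_id": "s-17", "reading": 42.5}
tag = sign_record(record)

# The receiver accepts the untouched record and rejects a modified one.
tampered = {**record, "reading": 99.9}
print(is_trustworthy(record, tag))
print(is_trustworthy(tampered, tag))
```

A check like this only shows that the data was not altered after signing by someone without the key; whether the original reading was accurate is a separate question that needs validation at the source.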
The antivirus industry has worked together for years to deal with malicious
attacks and to provide maximum protection. Big data security can be improved by
focusing on application security rather than device security, by isolating the
devices and servers that contain critical data, by introducing real-time
security information and event management, and by providing both reactive and
proactive protection.
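As a rough illustration of what real-time security event monitoring involves, the sketch below raises an alert when one user has too many failed logins within a short time window. It is a toy example, not a description of any particular security product; the class name, the threshold, the window length, and the sample events are all assumptions.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Flag a user once `threshold` failed logins fall inside `window_seconds`."""

    def __init__(self, threshold=3, window_seconds=60):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # user -> timestamps of recent failures

    def observe(self, timestamp, user, success):
        """Process one login event; return True if it should raise an alert."""
        if success:
            return False
        recent = self._failures[user]
        recent.append(timestamp)
        # Drop failures that are older than the sliding window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) >= self.threshold


# Hypothetical event stream: (timestamp in seconds, user, login succeeded?).
events = [
    (0, "alice", False),
    (10, "alice", False),
    (15, "bob", False),
    (20, "alice", False),
    (200, "alice", False),
]
monitor = FailedLoginMonitor(threshold=3, window_seconds=60)
alerts = [(t, user) for t, user, ok in events if monitor.observe(t, user, ok)]
print(alerts)
```

Here only the third failure for "alice" within the window triggers an alert; the later failure at 200 seconds falls outside it. This is the reactive side of protection, while isolating critical servers and hardening applications belong to the proactive side.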
Ms. Arpana Chaturvedi
Assistant Professor
Dept. of Information Technology