
SEARCH KEYWORD -- Hadoop.Linux



  Why to opt for Hadoop?

Hadoop is an open source framework that stores and processes big data. The framework is written in Java for distributed processing and distributed storage of very large data sets. Hadoop is scalable: it stores and distributes large data sets across hundreds or thousands of servers that operate in parallel. Traditional database systems cannot process such large amounts of data, but Hadoop enables businesses to run applications involving thousands of terabytes of data. Hadoop is ...

       2015-09-22 10:17:43

  Build Hadoop environment in Linux

Hadoop standalone installation: 1. Install JDK. Install the JDK with the command below: sudo apt-get install sun-java6-jdk. To configure the Java environment, open /etc/profile and add the following lines (note that the shell does not allow spaces around the = sign): export JAVA_HOME=<Java installation directory> export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH" export PATH="$JAVA_HOME/bin:$PATH". To verify the installation, type java -version; if it prints Java version information, Java is installed successfully. 2. Install SSH. Install SSH with the command below: sudo ...

   Hadoop.Linux,Configuration     2013-07-31 23:22:27
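
The Java environment step in the excerpt above could be sketched as follows. This is a minimal sketch, not the article's exact configuration: the fallback directory /usr/lib/jvm/java-6-sun is an assumption about where the JDK package installs and should be replaced with the real installation directory.

```shell
# Assumed install location (hypothetical default); an existing JAVA_HOME is kept.
JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/java-6-sun}"
export JAVA_HOME

# Append any existing CLASSPATH only if one is already set.
export CLASSPATH=".:$JAVA_HOME/lib${CLASSPATH:+:$CLASSPATH}"

# Put the JDK's bin directory first so its java binary is found.
export PATH="$JAVA_HOME/bin:$PATH"

# Report whether a java binary is now reachable.
if command -v java >/dev/null 2>&1; then
    java -version
else
    echo "java not found on PATH; check JAVA_HOME=$JAVA_HOME"
fi
```

Lines like these would normally go at the end of /etc/profile, as the excerpt describes, and take effect at the next login.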

  Embrace open source

In the past few days, there has been a lot of tech news related to open source. For example, Microsoft enabled Linux on its Windows Azure cloud, Facebook open sourced its C++ library Folly, and Samsung joined the Linux Foundation. More and more big companies now realize the power of open source and are willing to contribute to the open source community. This benefits not only developers but also the companies themselves. By providing some open source libraries or projects, developer may reduce their...

   Open source,Microsoft,Samsung,Facebook,Linux     2012-06-06 05:37:59

  Top 5 Reasons Not to Use Hadoop for Analytics

As a former diehard fan of Hadoop, I LOVED the fact that you can work on up to Petabytes of data.  I loved the ability to scale to thousands of nodes to process a large computation job.  I loved the ability to store and load data in a very flexible format.  In many ways, I loved Hadoop, until I tried to deploy it for analytics.   That’s when I became disillusioned with Hadoop (it just "ain't all that"). At Quantivo, we’ve explored many ways to deploy H...

   Cloud computing,Hadoop,Analytics     2012-04-17 13:43:26

  Make Big Data Collection Efficient with Hadoop Architecture and Design Tools

Hadoop architecture and design is popular for spreading a small amount of code across a large number of computers. That is why big data collection can be made more efficient with Hadoop architecture and design. Hadoop is an open source system, so you are free to make changes and design new tools according to your business requirements. Here we will discuss the most popular tools in the Hadoop development category and how they help with big projects. Ambari and Hive– When you are designing...

   HADOOP ARCHITECTURE,HADOOP HIVE ARCHITECTURE,HADOOP ARCHITECTURE AND DESIGN     2015-09-17 05:24:44

  Hadoop or Spark: Which One is Better?

What is Hadoop? Hadoop is one of the widely used Apache-based frameworks for big data analysis. It allows distributed processing of large data sets across computer clusters. Its scalability leverages the power of one to thousands of systems for computing and storage. A complete Hadoop framework comprises various modules such as: Hadoop Yet Another Resource Negotiator (YARN), MapReduce (distributed processing engine), Hadoop Distributed File System (HDFS), and Hadoop Common. Thes...

   COMPARISON,HADOOP,SPARK     2018-11-22 07:08:57

  Linux Kernel is replacing HTTP link with HTTPS

The Linux kernel is in the process of replacing the HTTP links in its source code with HTTPS links. HTTPS is considered more secure than HTTP and can prevent many attacks, such as man-in-the-middle attacks. Currently, more than 150 patches have been submitted by Linux kernel developers to replace these HTTP links. Note that this replacement is not a manual search-and-replace process. Instead, scripts were created to find these links and try to find whethe...

   LINUX KERNEL,HTTP,HTTPS     2020-08-08 01:35:20
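
The link-finding step mentioned in the excerpt above could look roughly like this. It is only an illustrative sketch: the function names find_http_links and to_https are hypothetical, these are not the kernel developers' actual scripts, and the check that the excerpt begins to describe is not reproduced because the text is cut off.

```shell
# List the distinct http:// URLs found in files under a directory.
# -r: recurse, -h: omit file names, -o: print only the match, -E: extended regex.
find_http_links() {
    grep -rhoE 'http://[^[:space:]"<>)]+' "$1" | sort -u
}

# Print the HTTPS candidate for an HTTP URL (no network check is made).
to_https() {
    printf '%s\n' "$1" | sed 's|^http://|https://|'
}
```

Calling find_http_links on a source directory prints one URL per line, and each line can then be passed to to_https to get the candidate replacement. Whether a candidate should actually be used would still have to be decided separately.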

  Data governance Challenges and solutions in Apache Hadoop

Do you understand the meaning of data governance? It is regarded as one of the most critical functions of an organization that deals with sensitive enterprise data. If an organization wants to know who is accessing its sensitive data and what actions those users have taken, data governance is a good solution to consider. In this article, we will discuss data governance solutions and the challenges that organizations face when implementing data governance. We will also dis...

   HADOOP DEVELOPMENT,HADOOP INTEGRATION     2015-10-26 08:06:29

  Microsoft is the 17th largest contributor to Linux

The Linux Foundation has released its 2012 Linux white paper, which analyzes developers and contributors of the Linux kernel from 2.6.36 to 3.2. The top ten contributors are: Red Hat, Intel, Novell, IBM, Texas Instruments, Broadcom, Nokia, Samsung, Oracle, and Google. The software giant Microsoft ranked 17th, even though the company's CEO Steve Ballmer previously claimed that Linux is a cancer. Microsoft engineers have contributed 688 patches, which are mostly related to Hyper-V vir...

   Linux,Microsoft,Contribution     2012-04-05 07:43:19

  Shortest command on Linux

Usually when we log in to a Linux system, we type some frequently used Linux commands such as pwd, ls, ps, etc. All these commands are simple but powerful, with many different options. But do you know what the shortest command on Linux is? The answer is w. According to the Linux manual, w shows who is logged on and what they are doing on the system. w displays information about the users currently on the machine, and their processes. The header shows, in this order, the current time, how l...

   Linux,w,shortest command     2014-04-30 11:07:38