SEARCH KEYWORD -- PIG UDF



  Hey, you may be happy to know these mottos about programming languages

Different programming languages are similar in that they help people build things that work the way people want. But each has its own features that differentiate it from the others. The paradigm may differ (Java, for example, is object-oriented), and the syntax may differ too. A programming motto can often best describe the character of a language, for example "Write once, run anywhere" for Java. But today we are going to take a look at "unofficial mottos" about some p...

   motto,programming language     2014-04-05 20:02:13

  Cleansing data with Pig and storing JSON format to HBase with Pig UDF

Introduction: This post explains how to clean data and store it in JSON format in HBase. Hadoop architect experts also explain Apache Pig and its advantages in Hadoop. Read on to find out how they do it. The post contains steps for basic cleaning of duplicate data and for converting the data to JSON format for storage in HBase. Pig does have built-in libraries for parsing JSON, but it is important to manipulate the JSON data in Java code before storing it in HBase. Apache Pig...

   JSON,HADOOP ARCHITECT,APACHE HBASE,PIG UDF     2016-06-10 01:13:41

  Make Big Data Collection Efficient with Hadoop Architecture and Design Tools

Hadoop architecture and design is popular for distributing a small piece of code across a large number of computers. That is why big data collection can be made more efficient with Hadoop architecture and design. Hadoop is an open source system, so you are free to make changes and design new tools according to your business requirements. Here we will discuss the most popular tools in the Hadoop development category and how they help with big projects. Ambari and Hive: When you are designing...

   HADOOP ARCHITECTURE,HADOOP HIVE ARCHITECTURE,HADOOP ARCHITECTURE AND DESIGN     2015-09-17 05:24:44

  Top 5 Reasons Not to Use Hadoop for Analytics

As a former diehard fan of Hadoop, I LOVED the fact that you can work on up to Petabytes of data.  I loved the ability to scale to thousands of nodes to process a large computation job.  I loved the ability to store and load data in a very flexible format.  In many ways, I loved Hadoop, until I tried to deploy it for analytics.   That’s when I became disillusioned with Hadoop (it just "ain't all that"). At Quantivo, we’ve explored many ways to deploy H...

   Cloud computing,Hadoop,Analytics     2012-04-17 13:43:26

  Why is Great Design so Hard?

I want to take a slight detour from usable privacy and security and discuss issues of design. I was recently at the Microsoft Faculty Summit, an annual event where Microsoft discusses some of the big issues and directions they are headed. In one of the talks, a designer at Microsoft mentioned two data points I've informally heard before but had never confirmed. First, the ratio of developers to user interface designers at Microsoft was 50:1. Second, this ratio was better than any other comp...

   Apple,Microsoft,UI design     2011-03-28 02:06:31

  Games don’t need to be social

Social games have been a big trend in recent years. Zynga struck it big and now everyone else is trying to emulate them. Unfortunately, the first thing that pops into anyone's head when a Zynga game is mentioned is Facebook. Facebook is the platform upon which their success stories like FarmVille were built, but it’s not the reason for their success. Zynga's games work because they are fun. The social connectivity is merely a mechanism to share your enjoyment of the game with other...

   Game,Social,No need,Game design     2011-10-29 07:15:34

  Write Your Own R Packages

Introduction: A set of user-defined functions (UDFs) or utility functions helps simplify our code and avoids retyping the same things in daily analysis work. Previously, I saved all my R functions in a single R file. Whenever I wanted to use them, I could simply source the R file to import all the functions. This is a simple but imperfect approach, especially when I want to check the documentation of certain functions. It was quite annoying that you can’t just type ?func...

   DATA SCIENCE,R PROGRAMMING,DATA ENGINEERING     2019-10-19 07:20:52

  Twitter to sponsor Apache Software Foundation

Twitter recently committed to sponsoring the Apache Software Foundation and will become one of its official sponsors. The Apache Software Foundation is a nonprofit organization that provides organizational, management, legal and financial support for open source projects. As we all know, Twitter loves open source, and its engineers are often engaged in the open source community to provide technical support. The Twitter team is also responsible for the related construction of the o...

   Apache,ASF,Twitter,Sponsor     2012-04-20 12:08:06

  How Google Tests Software - Part Three

Lots of questions in the comments to the last two posts. I am not ignoring them. Hopefully many of them will be answered here and in following posts. I am just getting started on this topic. At Google, quality is not equal to test. Yes, I am sure that is true elsewhere too. “Quality cannot be tested in” is so cliché it has to be true. From automobiles to software, if it isn’t built right in the first place then it is never going to be right. Ask any car company that has ever h...

   Google,Software,Testing,Quality,Fidelity     2011-03-22 14:31:00

  Hadoop or Spark: Which One is Better?

What is Hadoop? Hadoop is one of the most widely used Apache frameworks for big data analysis. It allows distributed processing of large data sets across computer clusters. It scales from one to thousands of systems for computing and storage. A complete Hadoop framework comprises various modules, such as: Hadoop Yet Another Resource Negotiator (YARN), MapReduce (distributed processing engine), Hadoop Distributed File System (HDFS), and Hadoop Common. Thes...

   COMPARISON,HADOOP,SPARK     2018-11-22 07:08:57