
SEARCH KEYWORD -- data



  Why to opt for Hadoop?

Hadoop is an open source framework that stores and processes big data. It is written in Java and provides distributed storage and distributed processing of very large data sets. Hadoop is scalable: it stores and distributes large data sets across hundreds or thousands of servers that operate in parallel. Traditional database systems struggle to process data at this scale, but Hadoop enables businesses to run applications involving thousands of terabytes of data. Hadoop is ...

       2015-09-22 10:17:43

  Google has done more for the world with ngrams

Data is a valuable asset for a company in the Internet world. With user data, a company can gain many benefits: it can push targeted ads to users by analyzing their behavior, and it can even sell the data to third parties. Because data is so important to a company's success, some companies keep their data secret to gain an advantage over competitors. Google, however, seems to take a different approach. Google shared its ngrams text corpus publicly, which basically contains valuable info...

   Ngram,NLP,Data     2013-12-12 07:56:02

  Exit main thread and keep other threads running in C

In C, returning from the main function terminates the whole process. To let only the main thread exit while the other threads keep running, you can call thrd_exit in the main function. Check the following code: #include <stdio.h> #include <threads.h> #include <unistd.h> int print_thread(void *s) { thrd_detach(thrd_current()); for (size_t i = 0; i < 5; i++) { sleep(1); printf("i=%zu\n", i); } thrd_exit(0); } int main(void) { ...

   C LANGUAGE,MULTITHREAD,MAIN THREAD     2020-08-14 21:20:04

  Google releases Analytics real time API

According to TechCrunch, Google has finally released its Analytics Real Time API. Although the real-time feature launched two years ago, webmasters had no convenient way to adjust the data so that it could be viewed properly. Now developers can use the API to retrieve the data they want and use it as they see fit. For now, developers need to apply for access to the API. Once you have access, you can query your own real-time data and use it however you want. For ex...

   Google Analytics,API,Real time     2013-08-02 00:05:56

  How does PHP session work?

This article is about how PHP sessions work internally. Below are the steps: 1. Session support is loaded into the PHP core as an extension. When the session extension is loaded, PHP calls core functions to get the session save_handler, i.e. the interface or functions for reading and writing session data. By default, PHP handles session data by reading and writing files on the server. But PHP also supports custom methods for handling session data; we can use sess...

   PHP, session, mechanism     2012-12-28 13:36:49

  How Kafka achieves high throughput low latency

Kafka is a message streaming system with high throughput and low latency. It is widely adopted by many big companies. A well-configured Kafka cluster can achieve very high throughput with millions of concurrent writes. How does Kafka achieve this? This post will try to explain some of the techniques Kafka uses. Page cache + sequential disk write: every time Kafka receives a record, it eventually writes it to a file on disk. But if it wrote to disk every time it received a record, it would ...

   BIG DATA,KAFKA     2019-03-08 09:42:57

  Introduction to GoLang generics and advanced usage

Generics in Go allow you to write code that can work with multiple types of data, without having to write separate versions of the code for each type. This can make your code more flexible and easier to maintain, as you only need to write and test the code once, rather than maintaining multiple versions. To use generics in Go, you first need to define a type parameter, which is a placeholder for the type that the code will work with. For example, you might define a type parameter called "T" like...

   GOLANG,GENERICS     2022-12-17 05:12:21

  Cracking the Data Lineage Code

What is Data Lineage?  Data lineage describes the life-cycle of data, from its origins to how it is manipulated over time until it reaches its present form. The lineage explains the various processes involved in the data flow of an organization and the factors that influence each process. In other words, data lineage provides data about your data.  Data lineage helps organizations of all sizes handle Big Data, as finding the creation point of the data and its evolution provides valuabl...

   BIG DATA, DATA LINEAGE,BUSINESS     2019-08-08 12:41:42

  Google Chrome to support sync clipboard data among devices

Google has been working hard to make it possible to sync clipboard data between PCs and Android devices through Chrome. The feature is finally available in Chrome Canary 79 and will be released in a future version of Chrome, although it currently only supports syncing data from PC to Android and not vice versa. Until then, users can explore the feature in the latest Chrome Canary, version 79. There are three flags (chrome://flags) that control enablement of the feature...

   WINDOWS 10,CHROME CANARY,CLIPBOARD,CLIPBOARD SYNC     2019-09-15 07:18:26

  Data governance Challenges and solutions in Apache Hadoop

Do you understand the meaning of data governance? It is regarded as one of the most critical functions of an organization that deals with sensitive enterprise data. If an organization wants to know who is accessing its sensitive data and what actions those users have taken, data governance is a solution worth considering. In this article, we will discuss data governance solutions and the challenges organizations face when implementing data governance. We will also dis...

   HADOOP DEVELOPMENT,HADOOP INTEGRATION     2015-10-26 08:06:29