Big data performance evalution of map-reduce pig and hive
- Title
- Big data performance evalution of map-reduce pig and hive
- Creator
- Santosh Kumar J.; Raghavendra S.; Raghavendra B.K.; Meenakshi
- Description
- Big data is structured and unstructured data that traditional systems cannot process, characterized not only by its volume but also by its velocity and variety. Processing here means storing the data and analyzing it to extract knowledge that supports decisions. Every living and non-living thing, and every device, generates a tremendous amount of data every fraction of a second. Hadoop is a software framework that processes big data to extract knowledge from stored data, enhance business, and solve societal problems. Hadoop has two main components: HDFS for storage and MapReduce for processing. HDFS consists of a name node and data nodes; MapReduce consists of the Job Tracker and Task Tracker frameworks. When a client asks Hadoop to store data, the name node responds with the data nodes that have free space, and the client writes the data to those data nodes. Hadoop then copies the data blocks to other data nodes according to the replication factor to provide fault tolerance. The name node stores the metadata of the data nodes. Replication serves as a backup because HDFS uses commodity hardware for storage; the name node, being the single point of failure in Hadoop, is in turn backed up by a secondary name node. When a client wants to process data, it sends a request to the Job Tracker on the name node, which then communicates with the Task Trackers to carry out the task. All of these Hadoop components are frameworks that run on top of the operating system to use and manage system resources efficiently for big data processing. Big data processing performance is measured with benchmark programs. In our research work we compared the execution time of the word count benchmark program using Hadoop MapReduce Python JAR code, a Pig script, and a Hive query, all on the same input file, big.txt. Hive was much faster than Pig and the MapReduce Python JAR code: MapReduce took 1 min 29 s, Pig took 57 s, and Hive took 31 s. BEIESP.
- Source
- International Journal of Engineering and Advanced Technology, Vol-8, No. 6, pp. 2982-2985.
- Date
- 2019-01-01
- Publisher
- Blue Eyes Intelligence Engineering and Sciences Publication
- Subject
- CloudxLab; Hadoop JAR; HDFS; Hive; Pig
- Coverage
- Santosh Kumar J., Department of Computer Science and Engineering, K.S. School of Engineering and Management, Bangalore, India; Raghavendra S., Department of Computer Science and Engineering, CHRIST (Deemed to be University), Bangalore, India; Raghavendra B.K., Research Institute is an institution deemed to be university, Chennai, India, Bengaluru University, Bengaluru, India; Meenakshi, Jain University, Bengaluru, India
- Rights
- All Open Access; Gold Open Access
- Relation
- ISSN: 22498958
- Format
- Online
- Language
- English
- Type
- Article
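The word count benchmark named in the description can be illustrated with a minimal, single-process sketch of the map, shuffle, and reduce steps. This is not the authors' code and does not run on Hadoop; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions chosen for illustration only.

```python
import re
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in re.findall(r"[a-z0-9']+", line.lower()):
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reduce step: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in-process over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input; the study itself used the file big.txt.
    sample = ["big data is big", "Hive and Pig process big data"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster the map and reduce steps would run as separate tasks on different nodes, and the framework would perform the grouping between them; the sketch only shows the logic each step carries out.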
Citation
Santosh Kumar J.; Raghavendra S.; Raghavendra B.K.; Meenakshi, “Big data performance evalution of map-reduce pig and hive,” CHRIST (Deemed To Be University) Institutional Repository, accessed February 24, 2025, https://archives.christuniversity.in/items/show/16601.