Paper Abstract

GC-based MapReduce Job Memory Tuning

Author: LUO Yonggang (College of Computer Science, Sichuan University)

Received: 2014-12-10          Year, Volume(Issue): Pages: 2015, 47(6): 104-112

Journal: Advanced Engineering Sciences (工程科学与技术)

Keywords: MapReduce; Hadoop; Java virtual machine (JVM); garbage collection (GC); memory tuning; resource optimization

Funding: Natural Science Foundation project, 2013 (61272447); National Key Technology R&D Program (2012BAH18B05)

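Chinese Abstract (translated)

To address the difficulty of managing the memory resources of MapReduce jobs sensibly, this paper proposes an evaluation method and gives recommendations for optimized configuration. First, it analyzes the principles of memory allocation and garbage collection in the Java virtual machine and identifies important garbage-collection metrics. Second, it proposes three indicators and an evaluation method for assessing whether a memory allocation is reasonable. Finally, based on the evaluation results, it gives two kinds of configuration recommendations: (1) using a clustering algorithm and statistical information to estimate the object promotion threshold, which optimizes the JVM's object allocation and garbage-collection performance; and (2) using a regression model and a search algorithm to predict a reasonable memory configuration for the job. Experimental results show that the proposed method can automatically detect deficiencies in a job's memory configuration and give optimized configuration recommendations. Compared with machine-learning approaches, the proposed method does not require running a large number of tests, so it is well suited to production MapReduce cluster environments.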
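The abstract does not give the form of the regression model or the search procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes (hypothetically) that total GC pause in each generation is roughly linear in the reciprocal of that generation's size, fits one least-squares line per generation, and then scans candidate young/old splits of a fixed heap for the lowest predicted pause. All function names and the sample numbers in the usage note are illustrative and do not come from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def fit_pause_model(sizes_mb, pauses_s):
    """Assumed model (not the paper's): pause = a + b / size_mb."""
    return fit_line([1.0 / s for s in sizes_mb], pauses_s)


def predict_pause(model, size_mb):
    a, b = model
    return a + b / size_mb


def search_split(heap_mb, young_model, old_model, step_mb=64):
    """Grid search over young-generation sizes for a fixed total heap.

    Returns (young_mb, old_mb, predicted_total_pause_s) for the split
    with the lowest predicted total pause.
    """
    best = None
    for young_mb in range(step_mb, heap_mb, step_mb):
        old_mb = heap_mb - young_mb
        cost = (predict_pause(young_model, young_mb)
                + predict_pause(old_model, old_mb))
        if best is None or cost < best[2]:
            best = (young_mb, old_mb, cost)
    return best
```

As a usage illustration with made-up measurements, `fit_pause_model([128, 256, 512], [3.0, 2.0, 1.5])` and `fit_pause_model([512, 1024, 2048], [4.0, 3.0, 2.5])` give two models that can be passed to `search_split(1536, young_model, old_model)`. A chosen split could then be expressed through standard JVM options such as `-Xmx` (maximum heap) and `-Xmn` (young generation size).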

English Abstract

Different jobs require different amounts of memory, so it is difficult to assess whether a given memory allocation for a MapReduce job is reasonable. To address this problem, we present an assessment method and configuration recommendations for the memory settings of the JVMs in which a job's tasks run. First, based on an in-depth analysis of the JVM's memory allocation and GC workflow, we introduce several important GC metrics. Second, we introduce three indicators and a method, based on them, for evaluating the rationality of a memory allocation. Finally, we give two kinds of JVM configuration recommendations: (1) using the K-means algorithm and statistical information to estimate the size threshold above which objects should have been allocated in the old generation; and (2) modeling GC pause time and using a search algorithm to predict the sizes of the young and old generations. Experimental results show that the proposed approach can automatically find deficiencies in a job's memory configuration. Compared with machine-learning methods, it does not need to run a large number of test cases, so it can be applied to production MapReduce clusters.
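To make the first recommendation more concrete, here is a minimal sketch of how K-means might separate small from large objects by size. It is an assumption-laden illustration, not the authors' procedure: it uses one-dimensional K-means with k = 2, deterministic initialization at the minimum and maximum observed sizes, and takes the midpoint between the two centroids as the threshold. The function names and the sample sizes in the usage note are hypothetical.

```python
def kmeans_1d_two_clusters(values, max_iter=100):
    """1-D K-means with k = 2; returns (small_centroid, large_centroid)."""
    lo, hi = float(min(values)), float(max(values))
    if lo == hi:
        raise ValueError("need at least two distinct values")
    for _ in range(max_iter):
        # Assignment step: each value goes to the nearer centroid.
        small = [v for v in values if abs(v - lo) <= abs(v - hi)]
        large = [v for v in values if abs(v - lo) > abs(v - hi)]
        # Update step: each centroid becomes the mean of its cluster.
        new_lo = sum(small) / len(small)
        new_hi = sum(large) / len(large)
        if new_lo == lo and new_hi == hi:
            break  # converged: centroids no longer move
        lo, hi = new_lo, new_hi
    return lo, hi


def size_threshold(object_sizes_bytes):
    """Midpoint between the two centroids, used here as the size cutoff."""
    lo, hi = kmeans_1d_two_clusters(object_sizes_bytes)
    return (lo + hi) / 2
```

For example, calling `size_threshold([64, 128, 96, 80, 1048576, 2097152, 1572864])` on these invented sizes yields a cutoff between the small cluster and the megabyte-scale cluster. On HotSpot, a size cutoff of this kind is what the `-XX:PretenureSizeThreshold` option expresses for collectors that honor it; whether the paper maps its estimate onto that option is not stated in the abstract.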
