[Hive] Hive parameters explained in detail

Source: Internet
Author: User

Hive parameters fall into three categories. The first is system variable information, that is, system environment information. The second is env variable information, that is, the environment variables of the current user. The third is Hive parameter information, that is, the settings defined in the hive-site.xml file and in the current Hive session. This third category is made up of Hadoop HDFS parameters (taken directly from Hadoop), MapReduce parameters, metastore metadata storage parameters, metastore connection parameters, and Hive runtime parameters.
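
As a rough illustration of the third category, the sketch below reads name/value pairs from a Hadoop-style configuration file such as hive-site.xml. The sample XML and the helper name `parse_hive_site` are assumptions made for this example, not part of Hive; a real file would contain the site's own overrides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; a real hive-site.xml lists the site's own overrides.
SAMPLE_HIVE_SITE = """<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.exec.parallel</name>
    <value>false</value>
  </property>
</configuration>
"""


def parse_hive_site(xml_text):
    """Return {name: value} for every <property> element in the configuration."""
    root = ET.fromstring(xml_text)
    settings = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            # A missing <value> is treated as an empty string.
            settings[name.strip()] = (prop.findtext("value") or "").strip()
    return settings


if __name__ == "__main__":
    for name, value in sorted(parse_hive_site(SAMPLE_HIVE_SITE).items()):
        print(name, "=", value)
```

Values set in the current session override what the file provides, so a dictionary like this reflects only the file-level defaults.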

Parameter details for hive-0.13.1-cdh5.3.6
Parameter Default value Meaning
datanucleus.autoCreateSchema true Creates the necessary schema at startup if one does not exist. Set this to false after the schema has been created once.
datanucleus.autoStartMechanismMode checked Throws an exception if the metadata tables are incorrect. Possible values: checked, unchecked.
datanucleus.cache.level2 false Whether to use a level 2 cache. Turn this off if metadata is changed independently of the Hive metastore server.
datanucleus.cache.level2.type SOFT Type of level 2 cache: SOFT = soft-reference-based cache, WEAK = weak-reference-based cache, none = no cache.
datanucleus.connectionPoolingType BONECP Connection pool used for the metastore database connection.
datanucleus.fixedDatastore false
datanucleus.identifierFactory datanucleus1 Name of the identifier factory to use when generating table and column names for the metastore database.
datanucleus.plugin.pluginRegistryBundleCheck LOG Defines what happens when plugin bundles are found and are duplicated: EXCEPTION, LOG, or NONE.
datanucleus.rdbms.useLegacyNativeValueStrategy true
datanucleus.storeManagerType rdbms How the metadata is stored.
datanucleus.transactionIsolation read-committed Default transaction isolation level for identity generation.
datanucleus.validateColumns false Validates the existing schema against the code. Turn this on if you want to verify the columns of existing tables.
datanucleus.validateConstraints false Validates the constraints of existing tables.
datanucleus.validateTables false Validates existing tables.
dfs.block.access.key.update.interval 600
hive.archive.enabled false Whether archiving operations are permitted.
hive.auto.convert.join true Whether Hive enables the optimization that converts a common join into a map join based on the input file size.
hive.auto.convert.join.noconditionaltask true Whether Hive enables the optimization that converts a common join into a map join based on the input file size. If this parameter is on and the sum of the sizes of n-1 of the tables/partitions in an n-way join is smaller than the specified size, the join is converted directly to a map join (there is no conditional task).
hive.auto.convert.join.noconditionaltask.size 10000000 If hive.auto.convert.join.noconditionaltask is off, this parameter has no effect. If it is on and the sum of the sizes of n-1 of the tables/partitions in an n-way join is smaller than this size, the join is converted directly to a map join (there is no conditional task). The default is 10 MB.
hive.auto.convert.join.use.nonstaged false For conditional joins, if the input stream from a small alias can be applied directly to the join operator without filtering or projection, the alias need not be pre-staged in the distributed cache via a mapred local task. Currently, this does not work with vectorization or the Tez execution engine.
hive.auto.convert.sortmerge.join false Whether the join is automatically converted to a sort-merge join if the joined tables pass the criteria for a sort-merge join.
hive.auto.convert.sortmerge.join.bigtable.selection.policy org.apache.hadoop.hive.ql.optimizer.AvgPartitionSizeBasedBigTableSelectorForAutoSMJ
hive.auto.convert.sortmerge.join.to.mapjoin false Whether to convert a sort-merge join to a map join.
hive.auto.progress.timeout 0 How long to run the autoprogressor for the script/UDTF operators, in seconds. Set to 0 to run forever.
hive.autogen.columnalias.prefix.includefuncname false Whether automatically generated temporary column names include the function name; not included by default.
hive.autogen.columnalias.prefix.label _c Prefix of the temporary column names that Hive generates automatically.
hive.binary.record.max.length 1000 Maximum length of a Hive binary record.
hive.cache.expr.evaluation true If true, the evaluation result of a deterministic expression referenced twice or more is cached. For example, in a filter condition like "... where key + 10 > 10 or key + 10 = 0", "key + 10" is evaluated and cached once, then reused for the following expression ("key + 10 = 0"). Currently, this applies only to expressions in select or filter operators.
hive.cli.errors.ignore false
hive.cli.pretty.output.num.cols -1
hive.cli.print.current.db false Whether to display the name of the current database; not displayed by default.
hive.cli.print.header false Whether to print the header (for example, the column names) in query output; not printed by default.
hive.cli.prompt hive Prefix of the Hive command-line prompt; restart the client after changing it.
hive.cluster.delegation.token.store.class org.apache.hadoop.hive.thrift.MemoryTokenStore Storage class for Hive cluster delegation tokens.
hive.cluster.delegation.token.store.zookeeper.znode /hive/cluster/delegation ZooKeeper znode used for delegation token storage.
hive.compactor.abortedtxn.threshold 1000 Number of aborted transactions on a table or partition that triggers a major compaction.
hive.compactor.check.interval 300 Interval between checks for whether compaction is needed, in seconds.
hive.compactor.delta.num.threshold 10 Number of delta directories in a table or partition that triggers a minor compaction.
hive.compactor.delta.pct.threshold 0.1 Ratio of delta file size to base file size that triggers a major compaction.
hive.compactor.initiator.on false
hive.compactor.worker.threads 0
hive.compactor.worker.timeout 86400 In seconds.
hive.compat 0.12 Compatibility version.
hive.compute.query.using.stats false
hive.compute.splits.in.am true
hive.conf.restricted.list hive.security.authenticator.manager,hive.security.authorization.manager
hive.conf.validation true
hive.convert.join.bucket.mapjoin.tez false
hive.counters.group.name HIVE
hive.debug.localtask false
hive.decode.partition.name false
hive.default.fileformat TextFile Default file format. The default is TextFile.
hive.default.rcfile.serde org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe Default SerDe class for RCFile.
hive.default.serde org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Default SerDe class.
hive.display.partition.cols.separately true Whether to display partition columns separately.
hive.downloaded.resources.dir /tmp/${hive.session.id}_resources Directory where Hive stores downloaded resources.
hive.enforce.bucketing false Whether bucketing is enforced.
hive.enforce.bucketmapjoin false Whether bucket map join is enforced.
hive.enforce.sorting false Whether sorting is enforced at insertion time.
hive.enforce.sortmergebucketmapjoin false
hive.entity.capture.transform false
hive.entity.separator @ Separator used to construct names of tables and partitions. For example, dbname@tablename@partitionname.
hive.error.on.empty.partition false Whether to throw an exception if a dynamic partition insert generates empty results.
hive.exec.check.crossproducts true Whether to check for cross products (Cartesian products).
hive.exec.compress.intermediate false Whether intermediate results are compressed; the compression settings come from the Hadoop configuration variables mapred.output.compress*.
hive.exec.compress.output false Whether the final output is compressed.
hive.exec.concatenate.check.index true
hive.exec.copyfile.maxsize 33554432
hive.exec.counters.pull.interval 1000
hive.exec.default.partition.name __HIVE_DEFAULT_PARTITION__
hive.exec.drop.ignorenonexistent true Whether to ignore the error raised when dropping an object that does not exist. Ignored by default; if set to false, an error is reported.
hive.exec.dynamic.partition true Whether dynamic partitions are allowed. If so, the partition value need not be specified explicitly when inserting data.
hive.exec.dynamic.partition.mode strict Dynamic partition mode. strict requires at least one static partition value; nonstrict allows all partitions to be dynamic.
hive.exec.infer.bucket.sort false
hive.exec.infer.bucket.sort.num.buckets.power.two false
hive.exec.job.debug.capture.stacktraces true
hive.exec.job.debug.timeout 30000
hive.exec.local.scratchdir /tmp/hadoop
hive.exec.max.created.files 100000 Maximum number of HDFS files created by a MapReduce job.
hive.exec.max.dynamic.partitions 1000 Maximum total number of dynamic partitions.
hive.exec.max.dynamic.partitions.pernode 100 Maximum number of dynamic partitions created on each MapReduce node.
hive.exec.mode.local.auto false Whether Hive may run jobs in local mode automatically.
hive.exec.mode.local.auto.input.files.max 4 Maximum number of input files for local mode.
hive.exec.mode.local.auto.inputbytes.max 134217728 Maximum input size, in bytes, for local mode.
hive.exec.orc.default.block.padding true
hive.exec.orc.default.buffer.size 262144
hive.exec.orc.default.compress ZLIB
hive.exec.orc.default.row.index.stride 10000
hive.exec.orc.default.stripe.size 268435456
hive.exec.orc.dictionary.key.size.threshold 0.8
hive.exec.orc.memory.pool 0.5
hive.exec.orc.skip.corrupt.data false
hive.exec.orc.zerocopy false
hive.exec.parallel false Whether to execute jobs in parallel; disabled by default.
hive.exec.parallel.thread.number 8 Number of threads used for parallel execution; the default is 8.
hive.exec.perf.logger org.apache.hadoop.hive.ql.log.PerfLogger
hive.exec.rcfile.use.explicit.header true
hive.exec.rcfile.use.sync.cache true
hive.exec.reducers.bytes.per.reducer 1000000000 Size per reducer. The default is 1 GB, i.e., if the input size is 10 GB, 10 reducers are used.
hive.exec.reducers.max 999 Maximum number of reducers allowed. This parameter takes effect when mapred.reduce.tasks is set to a negative value.
hive.exec.rowoffset false
hive.exec.scratchdir /etc/hive-hadoop
hive.exec.script.allow.partial.consumption false
hive.exec.script.maxerrsize 100000
hive.exec.script.trust false
hive.exec.show.job.failure.debug.info true
hive.exec.stagingdir .hive-staging
hive.exec.submitviachild false
hive.exec.tasklog.debug.timeout 20000
hive.execution.engine mr Execution engine: mr or tez (Hadoop 2).
hive.exim.uri.scheme.whitelist hdfs,pfile
hive.explain.dependency.append.tasktype false
hive.fetch.output.serde org.apache.hadoop.hive.serde2.DelimitedJSONSerDe
hive.fetch.task.aggr false
hive.fetch.task.conversion minimal
hive.fetch.task.conversion.threshold -1
hive.file.max.footer 100
hive.fileformat.check true
hive.groupby.mapaggr.checkinterval 100000
hive.groupby.orderby.position.alias false
hive.groupby.skewindata false
hive.hadoop.supports.splittable.combineinputformat false
hive.hashtable.initialCapacity 100000
hive.hashtable.loadfactor 0.75
hive.hbase.generatehfiles false
hive.hbase.snapshot.restoredir /tmp
hive.hbase.wal.enabled true
hive.heartbeat.interval 1000
hive.hmshandler.force.reload.conf false
hive.hmshandler.retry.attempts 1
hive.hmshandler.retry.interval 1000
hive.hwi.listen.host 0.0.0.0
hive.hwi.listen.port 9999
hive.hwi.war.file lib/hive-hwi-${version}.war
hive.ignore.mapjoin.hint true
hive.in.test false
hive.index.compact.binary.search true
hive.index.compact.file.ignore.hdfs false
hive.index.compact.query.max.entries 10000000
hive.index.compact.query.max.size 10737418240
hive.input.format org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
hive.insert.into.external.tables true
hive.insert.into.multilevel.dirs false
hive.jobname.length 50
hive.join.cache.size 25000
hive.join.emit.interval 1000
hive.lazysimple.extended_boolean_literal false
hive.limit.optimize.enable false
hive.limit.optimize.fetch.max 50000
hive.limit.optimize.limit.file 10
hive.limit.pushdown.memory.usage -1.0
hive.limit.query.max.table.partition -1
hive.limit.row.max.size 100000
hive.localize.resource.num.wait.attempts 5
hive.localize.resource.wait.interval 5000
hive.lock.manager org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
hive.mapred.partitioner org.apache.hadoop.hive.ql.io.DefaultHivePartitioner
hive.mapred.reduce.tasks.speculative.execution true
hive.mapred.supports.subdirectories false
hive.metastore.uris thrift://hh:9083
hive.metastore.warehouse.dir /user/hive/warehouse
hive.multi.insert.move.tasks.share.dependencies false
hive.multigroupby.singlereducer true
hive.zookeeper.clean.extra.nodes false Whether to clean up extra nodes at the end of the session.
hive.zookeeper.client.port 2181 ZooKeeper client port number.
hive.zookeeper.quorum IP addresses of the ZooKeeper servers.
hive.zookeeper.session.timeout 600000 ZooKeeper client session timeout.
hive.zookeeper.namespace hive_zookeeper_namespace
javax.jdo.PersistenceManagerFactoryClass org.datanucleus.api.jdo.JDOPersistenceManagerFactory
javax.jdo.option.ConnectionDriverName Changed to: com.mysql.jdbc.Driver
javax.jdo.option.ConnectionPassword Changed to: hive
javax.jdo.option.ConnectionURL xxx
javax.jdo.option.ConnectionUserName xxx
javax.jdo.option.DetachAllOnCommit true
javax.jdo.option.Multithreaded true
javax.jdo.option.NonTransactionalRead true
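
The size rule described for hive.auto.convert.join.noconditionaltask.size in the table above can be sketched in a few lines. This is a simplified model of the rule as stated there, not Hive's actual optimizer code; the function name `fits_direct_mapjoin` is hypothetical, and the sketch assumes that the n-1 inputs considered are the ones that remain after leaving out the largest input.

```python
def fits_direct_mapjoin(input_sizes, threshold_bytes=10_000_000):
    """Simplified check of the n-1 size rule for an n-way join.

    input_sizes: sizes in bytes of the joined tables/partitions.
    threshold_bytes: stands in for hive.auto.convert.join.noconditionaltask.size.
    """
    if len(input_sizes) < 2:
        raise ValueError("a join needs at least two inputs")
    # Assumption: excluding the largest input gives the smallest possible n-1 sum.
    small_side_total = sum(input_sizes) - max(input_sizes)
    return small_side_total < threshold_bytes


if __name__ == "__main__":
    # One large table joined with two small ones (7 MB in total on the small side).
    print(fits_direct_mapjoin([5_000_000_000, 4_000_000, 3_000_000]))
    # The small side adds up to 11 MB, above the 10 MB default.
    print(fits_direct_mapjoin([5_000_000_000, 6_000_000, 5_000_000]))
```

With the default threshold, the first call reports that the join qualifies and the second reports that it does not.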

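The reducer arithmetic given for hive.exec.reducers.bytes.per.reducer and hive.exec.reducers.max can be worked through as follows. This is a minimal sketch under stated assumptions: the function name `estimate_reducers` is hypothetical, the estimate is rounded up, and at least one reducer is always returned; Hive's own calculation may take further factors into account.

```python
def estimate_reducers(input_bytes, bytes_per_reducer=1_000_000_000, max_reducers=999):
    """Estimate the reducer count from the input size.

    bytes_per_reducer stands in for hive.exec.reducers.bytes.per.reducer,
    max_reducers for hive.exec.reducers.max.
    """
    if bytes_per_reducer <= 0:
        raise ValueError("bytes_per_reducer must be positive")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    estimated = -(-input_bytes // bytes_per_reducer)
    # Assumption: never fewer than one reducer, never more than the cap.
    return max(1, min(max_reducers, estimated))


if __name__ == "__main__":
    print(estimate_reducers(10_000_000_000))     # 10 GB of input
    print(estimate_reducers(2_000_000_000_000))  # 2 TB of input, capped
```

For 10 GB of input the sketch yields 10 reducers, matching the example in the table; for 2 TB the uncapped estimate of 2000 is limited to 999.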