A Pig script does not need a file extension to run.
If the Pig script is named tempfile, with no extension, pig -f tempfile runs it directly. pig tempfile also runs it directly.
This means you can write the contents of a Pig script into a Python temporary file and invoke that file as it is.
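One way to invoke a Pig script from Python
Keep the Pig script in any file, write it into a Python temporary file (handled by the tempfile module) at execution time, and delete the temporary file when execution finishes. The process: create the temporary file with NamedTemporaryFile from the tempfile module. Its name is random by default, and the file can then be invoked directly through its name attribute (the name has no .pig extension). The Pig script content itself can be stored in any text file.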
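The steps above can be sketched roughly as follows. This is a minimal illustration, not the author's exact code: the function name run_pig_text and the runner parameter are hypothetical, and it assumes a pig executable is on the PATH when the default runner is used.

```python
import os
import subprocess
import tempfile


def run_pig_text(script_text, pig_bin="pig", runner=subprocess.call):
    """Write script_text to an extension-less temp file, run it with
    `pig -f <name>`, then delete the file.

    `runner` is a hypothetical hook (defaults to subprocess.call) so the
    command can be inspected without a Pig installation.
    """
    # delete=False keeps the file on disk after close(), so Pig can open it.
    tmp = tempfile.NamedTemporaryFile(mode="w", delete=False)
    try:
        tmp.write(script_text)
        tmp.close()  # flush and close so the whole script is on disk
        return runner([pig_bin, "-f", tmp.name])
    finally:
        if not tmp.closed:
            tmp.close()
        os.remove(tmp.name)  # clean up once execution has finished
```

The temporary file name is random and carries no .pig extension, which is fine because pig -f accepts any file name.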
The advantage of this process is that parameters are easy to pass in. Every parameter in the Pig script is written as a Python format specifier such as %s or %d. The script is read in as a string, and the actual values are substituted into that string with the % operator. This avoids the tedium of passing a large number of parameters through pig's -p option.
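As a small sketch of that substitution step: the template text, paths, and the function name render_script below are made up for illustration. One thing to keep in mind is that any literal percent sign in the Pig script has to be written as %% so that Python's % operator does not treat it as a specifier.

```python
# Hypothetical Pig script template; %d and %s are Python format specifiers.
TEMPLATE = (
    "set default_parallel %d;\n"
    "SET mapred.job.queue.name %s;\n"
    "logs = LOAD '%s' USING PigStorage(',');\n"
    "STORE logs INTO '%s';\n"
)


def render_script(parallel, queue, input_path, output_path):
    """Fill the template with actual values using the % operator."""
    return TEMPLATE % (parallel, queue, input_path, output_path)


script_text = render_script(10, "default", "/data/in", "/data/out")
```

The resulting script_text is what gets written into the temporary file.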
Cons: the approach is redundant and more trouble than it needs to be. Passing the strings in directly with -p and reading them in the script with default is also a perfectly good option.
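For comparison, the -p route might look like the sketch below. It assumes Pig's parameter substitution (-p name=value on the command line, %default and $name inside the script). The script text, parameter names, and the helper build_pig_command are hypothetical.

```python
# Hypothetical script using Pig parameter substitution instead of Python's %.
PIG_SCRIPT = (
    "%default parallel 10\n"
    "%default queue 'default'\n"
    "set default_parallel $parallel;\n"
    "SET mapred.job.queue.name $queue;\n"
)


def build_pig_command(script_path, params, pig_bin="pig"):
    """Build an argument list like: pig -p name=value ... -f script_path."""
    cmd = [pig_bin]
    for name, value in sorted(params.items()):  # sorted for a stable order
        cmd += ["-p", "%s=%s" % (name, value)]
    cmd += ["-f", script_path]
    return cmd


command = build_pig_command("job.pig", {"queue": "etl", "parallel": 20})
```

With only a couple of parameters this is simpler than templating. The temporary-file approach mainly pays off when the list of -p flags becomes long.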
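import tempfile

# config, job_queue, udf_jar_str, command_piece, args, logger, exec_command
# and task_id come from the surrounding project code.
# mode='w' accepts text; delete=False keeps the file on disk after close(),
# so it has to be removed explicitly once the job is done.
pig_script = tempfile.NamedTemporaryFile(mode='w', delete=False)
pig_script.write('set default_parallel %d; SET mapred.job.queue.name %s; %s %s' % (
    config.PIG_PARALLEL, job_queue, udf_jar_str, command_piece % args))
pig_script.flush()

# Build the pig command line; pig_script.name has no .pig extension.
command = '''%s -Dmapred.cache.files="%s,%s,%s,%s" -Dmapred.create.symlink=yes -Dmapred.child.java.opts=-Xmx%dm -f %s''' % (
    config.PIG_BIN, metadata_dir, quadkey_dir, region_template_dir, ipdb_file,
    config.PIG_TASK_MAX_MEM, pig_script.name)
if logger:
    logger.debug(command)
result = exec_command(command, task_id)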