文章详情

短信预约-IT技能 免费直播动态提醒

请输入下面的图形验证码

提交验证

短信预约提醒成功

源码解析springbatch的job运行机制

2022-11-13 14:42

关注

源码解析springbatch的job是如何运行的?

注,本文中的demo代码节选于图书《Spring Batch批处理框架》的配套源代码,并做并适配springboot升级版本,完全开源。

SpringBatch的背景和用法,就不再赘述了,默认本文受众都使用过batch框架。
本文仅讨论普通的ChunkStep,分片/异步处理等功能暂不讨论。

1. 表结构

Spring系列的框架代码,大多又臭又长,让人头晕。先列出整体流程,再去看源码。顺带也可以了解存储表结构。

框架使用的表

batch_job_instance
batch_job_execution
batch_job_execution_context
batch_job_execution_params
batch_step_execution
batch_step_execution_context
batch_job_seq
batch_step_execution_seq
batch_job_execution_seq

2. API入口

先看看怎么调用启动Job的API,看起来非常简单,传入job信息和参数即可

@Autowired
    @Qualifier("billJob")
    private Job job;
    
    @Test
    public void billJob() throws Exception {
        JobParameters jobParameters = new JobParametersBuilder()
                .addLong("currentTimeMillis", System.currentTimeMillis())
                .addString("batchNo","2022080402")
                .toJobParameters();
        JobExecution result = jobLauncher.run(job, jobParameters);
        System.out.println(result.toString());
 
        Thread.sleep(6000);
    }
 <!-- 账单作业 -->
    <batch:job id="billJob">
        <batch:step id="billStep">
            <batch:tasklet transaction-manager="transactionManager">
                <batch:chunk reader="csvItemReader" writer="csvItemWriter" processor="creditBillProcessor" commit-interval="3">
                </batch:chunk>
            </batch:tasklet>
        </batch:step>
    </batch:job>

org.springframework.batch.core.launch.support.SimpleJobLauncher#run

// 简化部分代码(参数检查、log日志)
@Override
public JobExecution run(final Job job, final JobParameters jobParameters){
	final JobExecution jobExecution;
	JobExecution lastExecution = jobRepository.getLastJobExecution(job.getName(), jobParameters);
       // 上次执行存在,说明本次请求是重启job,先做检查
	if (lastExecution != null) {
		if (!job.isRestartable()) {
			throw new JobRestartException("JobInstance already exists and is not restartable");
		}
		
		for (StepExecution execution : lastExecution.getStepExecutions()) {
			BatchStatus status = execution.getStatus();
			if (status.isRunning() || status == BatchStatus.STOPPING) {
				throw new JobExecutionAlreadyRunningException("A job execution for this job is already running: "
						+ lastExecution);
			} else if (status == BatchStatus.UNKNOWN) {
				throw new JobRestartException(
						"Cannot restart step [" + execution.getStepName() + "] from UNKNOWN status. ");
			}
		}
	}
	// Check jobParameters
	job.getJobParametersValidator().validate(jobParameters);
       // 创建JobExecution 同一个job+参数,只能有一个Execution执行器
	jobExecution = jobRepository.createJobExecution(job.getName(), jobParameters);
	try {
           // SyncTaskExecutor 看似是异步,实际是同步执行(可扩展)
		taskExecutor.execute(new Runnable() {
			@Override
			public void run() {
				try {
                       // 关键入口,请看[org.springframework.batch.core.job.AbstractJob#execute]
					job.execute(jobExecution);
					if (logger.isInfoEnabled()) {
						Duration jobExecutionDuration = BatchMetrics.calculateDuration(jobExecution.getStartTime(), jobExecution.getEndTime());
					}
				}
				catch (Throwable t) {
					rethrow(t);
				}
			}
			private void rethrow(Throwable t) {
                   // 省略各类抛异常
				throw new IllegalStateException(t);
			}
		});
	}
	catch (TaskRejectedException e) {
        // 更新job_execution的运行状态
		jobExecution.upgradeStatus(BatchStatus.FAILED);
		if (jobExecution.getExitStatus().equals(ExitStatus.UNKNOWN)) {
			jobExecution.setExitStatus(ExitStatus.FAILED.addExitDescription(e));
		}
		jobRepository.update(jobExecution);
	}
	return jobExecution;
}

3. 深入代码流程

简单看看API入口,子类划分较多,继续往后看

总体代码流程

JobExecution的处理过程

org.springframework.batch.core.job.AbstractJob#execute


@Ovrride
public final void execute(JobExecution execution) {
 
    // 同步控制器,防并发执行
    JobSynchronizationManager.register(execution);
    // 计时器,记录耗时
    LongTaskTimer longTaskTimer = BatchMetrics.createLongTaskTimer("job.active", "Active jobs",
            Tag.of("name", execution.getJobInstance().getJobName()));
    LongTaskTimer.Sample longTaskTimerSample = longTaskTimer.start();
    Timer.Sample timerSample = BatchMetrics.createTimerSample();
 
    try {
        // 参数再次进行校验
        jobParametersValidator.validate(execution.getJobParameters());
 
        if (execution.getStatus() != BatchStatus.STOPPING) {
 
            // 更新db中任务状态
            execution.setStartTime(new Date());
            updateStatus(execution, BatchStatus.STARTED);
            // 回调所有listener的beforeJob方法
            listener.beforeJob(execution);
 
            try {
                doExecute(execution);
            } catch (RepeatException e) {
                throw e.getCause(); // 搞不懂这里包一个RepeatException 有啥用
            }
        } else {
            // 任务状态时BatchStatus.STOPPING,说明任务已经停止,直接改成STOPPED
            // The job was already stopped before we even got this far. Deal
            // with it in the same way as any other interruption.
            execution.setStatus(BatchStatus.STOPPED);
            execution.setExitStatus(ExitStatus.COMPLETED);
        }
 
    } catch (JobInterruptedException e) {
        // 任务被打断 STOPPED
        execution.setExitStatus(getDefaultExitStatusForFailure(e, execution));
        execution.setStatus(BatchStatus.max(BatchStatus.STOPPED, e.getStatus()));
        execution.addFailureException(e);
    } catch (Throwable t) {
        // 其他原因失败 FAILED
        logger.error("Encountered fatal error executing job", t);
        execution.setExitStatus(getDefaultExitStatusForFailure(t, execution));
        execution.setStatus(BatchStatus.FAILED);
        execution.addFailureException(t);
    } finally {
        try {
            if (execution.getStatus().isLessThanOrEqualTo(BatchStatus.STOPPED)
                    && execution.getStepExecutions().isEmpty()) {
                ExitStatus exitStatus = execution.getExitStatus();
                ExitStatus newExitStatus =
                        ExitStatus.NOOP.addExitDescription("All steps already completed or no steps configured for this job.");
                execution.setExitStatus(exitStatus.and(newExitStatus));
            }
 
            // 计时器 计算总耗时
            timerSample.stop(BatchMetrics.createTimer("job", "Job duration",
                    Tag.of("name", execution.getJobInstance().getJobName()),
                    Tag.of("status", execution.getExitStatus().getExitCode())
            ));
            longTaskTimerSample.stop();
            execution.setEndTime(new Date());
 
            try {
                // 回调所有listener的afterJob方法  调用失败也不影响任务完成
                listener.afterJob(execution);
            } catch (Exception e) {
                logger.error("Exception encountered in afterJob callback", e);
            }
            // 写入db
            jobRepository.update(execution);
        } finally {
            // 释放控制
            JobSynchronizationManager.release();
        }
 
    }
 
}

3.1何时调用Reader?

在SimpleChunkProvider#provide中会分次调用reader,并将结果包装为Chunk返回。

其中有几个细节,此处不再赘述。

org.springframework.batch.core.step.item.SimpleChunkProcessor#process
 
	@Nullable
	@Override
	public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
 
		@SuppressWarnings("unchecked")
		Chunk<I> inputs = (Chunk<I>) chunkContext.getAttribute(INPUTS_KEY);
		if (inputs == null) {
			inputs = chunkProvider.provide(contribution);
			if (buffering) {
				chunkContext.setAttribute(INPUTS_KEY, inputs);
			}
		}
 
		chunkProcessor.process(contribution, inputs);
		chunkProvider.postProcess(contribution, inputs);
 
		// Allow a message coming back from the processor to say that we
		// are not done yet
		if (inputs.isBusy()) {
			logger.debug("Inputs still busy");
			return RepeatStatus.CONTINUABLE;
		}
 
		chunkContext.removeAttribute(INPUTS_KEY);
		chunkContext.setComplete();
 
		if (logger.isDebugEnabled()) {
			logger.debug("Inputs not busy, ended: " + inputs.isEnd());
		}
		return RepeatStatus.continueIf(!inputs.isEnd());
 
	}

3.2何时调用Processor/Writer?

在RepeatTemplate和外围事务模板的包装下,通过SimpleChunkProcessor进行处理:

 // 忽略无关代码...
	@Override
	public final void process(StepContribution contribution, Chunk<I> inputs) throws Exception {
 
		// 输入为空,直接返回If there is no input we don't have to do anything more
		if (isComplete(inputs)) {
			return;
		}
 
		// Make the transformation, calling remove() on the inputs iterator if
		// any items are filtered. Might throw exception and cause rollback.
		Chunk<O> outputs = transform(contribution, inputs);
 
		// Adjust the filter count based on available data
		contribution.incrementFilterCount(getFilterCount(inputs, outputs));
 
		// Adjust the outputs if necessary for housekeeping purposes, and then
		// write them out...
		write(contribution, inputs, getAdjustedOutputs(inputs, outputs));
 
	}
 
    // 遍历items,逐个item调用processor
	protected Chunk<O> transform(StepContribution contribution, Chunk<I> inputs) throws Exception {
		Chunk<O> outputs = new Chunk<>();
		for (Chunk<I>.ChunkIterator iterator = inputs.iterator(); iterator.hasNext();) {
			final I item = iterator.next();
			O output;
			String status = BatchMetrics.STATUS_SUCCESS;
			try {
				output = doProcess(item);
			}
			catch (Exception e) {
				
				inputs.clear();
				status = BatchMetrics.STATUS_FAILURE;
				throw e;
			}
			if (output != null) {
				outputs.add(output);
			}
			else {
				iterator.remove();
			}
		}
		return outputs;
	}

4. 每个step是如何与事务处理挂钩?

在TaskletStep#doExecute中会使用TransactionTemplate,包装事务操作

标准的事务操作,通过函数式编程风格,从action的CallBack调用实际处理方法

// org.springframework.batch.core.step.tasklet.TaskletStep#doExecute
    result = new TransactionTemplate(transactionManager, transactionAttribute)
				    .execute(new ChunkTransactionCallback(chunkContext, semaphore));
 
    // 事务启用过程
    // org.springframework.transaction.support.TransactionTemplate#execute
	@Override
	@Nullable
	public <T> T execute(TransactionCallback<T> action) throws TransactionException {
		Assert.state(this.transactionManager != null, "No PlatformTransactionManager set");
 
		if (this.transactionManager instanceof CallbackPreferringPlatformTransactionManager) {
			return ((CallbackPreferringPlatformTransactionManager) this.transactionManager).execute(this, action);
		}
		else {
			TransactionStatus status = this.transactionManager.getTransaction(this);
			T result;
			try {
				result = action.doInTransaction(status);
			}
			catch (RuntimeException | Error ex) {
				// Transactional code threw application exception -> rollback
				rollbackOnException(status, ex);
				throw ex;
			}
			catch (Throwable ex) {
				// Transactional code threw unexpected exception -> rollback
				rollbackOnException(status, ex);
				throw new UndeclaredThrowableException(ex, "TransactionCallback threw undeclared checked exception");
			}
			this.transactionManager.commit(status);
			return result;
		}
	}

5. 怎么控制每个chunk几条记录提交一次事务? 控制每个事务窗口处理的item数量

在配置任务时,有个step级别的参数,[commit-interval],用于每个事务窗口提交的控制被处理的item数量。

RepeatTemplate#executeInternal 在处理单条item后,会查看已处理完的item数量,与配置的chunk数量做比较,如果满足chunk数,则不再继续,准备提交事务。

StepBean在初始化时,会新建SimpleCompletionPolicy(chunkSize会优先使用配置值,默认是5)

在每个chunk处理开始时,都会调用SimpleCompletionPolicy#start新建RepeatContextSupport#count用于计数。

源码(简化) org.springframework.batch.repeat.support.RepeatTemplate#executeInternal


private RepeatStatus executeInternal(final RepeatCallback callback) {
	// Reset the termination policy if there is one...
       // 此处会调用completionPolicy.start方法,更新chunk的计数器
	RepeatContext context = start();
	// Make sure if we are already marked complete before we start then no processing takes place.
       // 通过running字段来判断是否继续处理next
	boolean running = !isMarkedComplete(context);
       // 省略listeners处理....
	// Return value, default is to allow continued processing.
	RepeatStatus result = RepeatStatus.CONTINUABLE;
	RepeatInternalState state = createInternalState(context);
	try {
		while (running) {
			
               // 省略listeners处理....
			if (running) {
				try {
                       // callback是实际处理方法,类似函数式编程
					result = getNextResult(context, callback, state);
					executeAfterInterceptors(context, result);
				}
				catch (Throwable throwable) {
					doHandle(throwable, context, deferred);
				}
                   // 检查当前chunk是否处理完,决策出是否继续处理下一条item
				// N.B. the order may be important here:
				if (isComplete(context, result) || isMarkedComplete(context) || !deferred.isEmpty() {
					running = false;
				}
			}
		}
		result = result.and(waitForResults(state));
           // 省略throwables处理....
		// Explicitly drop any references to internal state...
		state = null;
	}
	finally {
           // 省略代码...
	}
	return result;
}

总结

JSR-352标准定义了Java批处理的基本模型,包含批处理的元数据像 JobExecutions,JobInstances,StepExecutions 等等。通过此类模型,提供了许多基础组件与扩展点:

2013年, JSR-352标准包含在 JavaEE7中发布,到2022年已近10年,Spring也在探索新的批处理模式, 如Spring Attic /Spring Cloud Data Flow。 https://docs.spring.io/spring-batch/docs/current/reference/html/jsr-352.html

扩展

1. Job/Step运行时的上下文,是如何保存?如何控制?

整个Job在运行时,会将运行信息保存在JobContext中。 类似的,Step运行时也有StepContext。可以在Context中保存一些参数,在任务或者步骤中传递使用。

查看JobContext/StepContext源码,发现仅用了普通变量保存Execution,这个类肯定有线程安全问题。 生产环境中常常出现多个任务并处处理的情况。

SpringBatch用了几种方式来包装并发安全:

此处能看到,如果在JobExecution存储大量的业务数据,会导致无法GC回收,导致OOM。所以在上下文中,只应保存精简的数据。

2. step执行时,如果出现异常,如何保护运行状态?

在源码中,使用了各类同步控制和加锁、oldVersion版本拷贝,整体比较复杂(org.springframework.batch.core.step.tasklet.TaskletStep.ChunkTransactionCallback#doInTransaction)

1.oldVersion版本拷贝:上一次运行出现异常时,本次执行时沿用上次的断点内容

// 节选部分代码
oldVersion = new StepExecution(stepExecution.getStepName(), stepExecution.getJobExecution());
copy(stepExecution, oldVersion);
 
private void copy(final StepExecution source, final StepExecution target) {
	target.setVersion(source.getVersion());
	target.setWriteCount(source.getWriteCount());
	target.setFilterCount(source.getFilterCount());
	target.setCommitCount(source.getCommitCount());
	target.setExecutionContext(new ExecutionContext(source.getExecutionContext()));
}

2.信号量控制,在每个chunk运行完成后,需先获取锁,再更新stepExecution前Shared semaphore per step execution, so other step executions can run in parallel without needing the lockSemaphore (org.springframework.batch.core.step.tasklet.TaskletStep#doExecute)

// 省略无关代码
try {
	try {
        // 执行w-p-r模型方法
		result = tasklet.execute(contribution, chunkContext);
		if (result == null) {
			result = RepeatStatus.FINISHED;
		}
	}
	catch (Exception e) {
		// 省略...
	}
}
finally {
	// If the step operations are asynchronous then we need to synchronize changes to the step execution (at a
	// minimum). Take the lock *before* changing the step execution.
	try {
        // 获取锁
		semaphore.acquire();
		locked = true;
	}
	catch (InterruptedException e) {
		logger.error("Thread interrupted while locking for repository update");
		stepExecution.setStatus(BatchStatus.STOPPED);
		stepExecution.setTerminateOnly();
		Thread.currentThread().interrupt();
	}
	stepExecution.apply(contribution);
}
stepExecutionUpdated = true;
stream.update(stepExecution.getExecutionContext());
try {
    // 更新上下文、DB中的状态
	// Going to attempt a commit. If it fails this flag will stay false and we can use that later.
	getJobRepository().updateExecutionContext(stepExecution);
	stepExecution.incrementCommitCount();
	getJobRepository().update(stepExecution);
}
catch (Exception e) {
	// If we get to here there was a problem saving the step execution and we have to fail.
	String msg = "JobRepository failure forcing rollback";
	logger.error(msg, e);
	throw new FatalStepExecutionException(msg, e);
}

到此这篇关于springbatch的job是如何运行的?的文章就介绍到这了,更多相关springbatch job运行内容请搜索编程网以前的文章或继续浏览下面的相关文章希望大家以后多多支持编程网!

阅读原文内容投诉

免责声明:

① 本站未注明“稿件来源”的信息均来自网络整理。其文字、图片和音视频稿件的所属权归原作者所有。本站收集整理出于非商业性的教育和科研之目的,并不意味着本站赞同其观点或证实其内容的真实性。仅作为临时的测试数据,供内部测试之用。本站并未授权任何人以任何方式主动获取本站任何信息。

② 本站未注明“稿件来源”的临时测试数据将在测试完成后最终做删除处理。有问题或投稿请发送至: 邮箱/279061341@qq.com QQ/279061341

软考中级精品资料免费领

  • 历年真题答案解析
  • 备考技巧名师总结
  • 高频考点精准押题
  • 2024年上半年信息系统项目管理师第二批次真题及答案解析(完整版)

    难度     813人已做
    查看
  • 【考后总结】2024年5月26日信息系统项目管理师第2批次考情分析

    难度     354人已做
    查看
  • 【考后总结】2024年5月25日信息系统项目管理师第1批次考情分析

    难度     318人已做
    查看
  • 2024年上半年软考高项第一、二批次真题考点汇总(完整版)

    难度     435人已做
    查看
  • 2024年上半年系统架构设计师考试综合知识真题

    难度     224人已做
    查看

相关文章

发现更多好内容

猜你喜欢

AI推送时光机
位置:首页-资讯-后端开发
咦!没有更多了?去看看其它编程学习网 内容吧
首页课程
资料下载
问答资讯