

[Incremental File Backup System] Backup Implementation and Performance Optimization


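Performance Optimization

Original Approach

The original approach recursively scans every file in the data source. For each file it decides whether a backup is needed, and if so it copies the file immediately and inserts one row into the database. Writing to the database once per file means long communication time between the program and the database, long index maintenance time, more database log writes, and poor I/O efficiency. As the figure below shows, the whole backup took about an hour (backup directory size: 8.15 GB, file count: 211,470), which is not usable in practice.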

[Figure: backup duration with the original approach]

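Buffered Backup Approach

This approach uses buffers to hold the records that need to be inserted or updated, and performs a batch insert or batch update once a buffer has accumulated enough entries. As the figure below shows, the optimized program completes the backup in only 46 seconds, a large improvement over the original approach.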

[Figure: backup duration with the buffered approach]

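Advantages

  • High efficiency

Disadvantages

  • Database records are not written in real time. The original approach inserts a row right after each file is backed up, whereas this approach waits until enough records have accumulated and stores them in a batch. If the program is closed during a backup, the records still in the buffer are lost, so on the next run some files that were already copied will be copied again and replace their earlier copies. Note that the lost data is not data in the data source; it is only the records that were about to be written to the database
  • Memory usage is slightly higher than in the original approach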
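To make the trade-off concrete, here is a minimal sketch of the buffering idea. The BatchBuffer class, its batchSize threshold, and the sink callback are hypothetical names chosen for illustration and are not part of the project; in the real system the sink would be a batch insert or batch update against the database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: collects items and hands them to a sink in batches. */
final class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> items = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Adds one item and flushes automatically once the threshold is reached. */
    void add(T item) {
        items.add(item);
        if (items.size() >= batchSize) {
            flush();
        }
    }

    /** Writes out whatever is buffered; call it once more when the walk ends. */
    void flush() {
        if (items.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(items)); // pass a copy so the sink may keep it
        items.clear();
    }

    /** Number of items that would be lost if the process stopped right now. */
    int pending() {
        return items.size();
    }
}
```

With a batch size of 3, seven records trigger two automatic flushes and leave one record pending until the final flush() call. Whatever is still pending when the process is killed is never stored, which is the drawback described in the list above.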

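Implementing the Buffered Backup Approach

How the Backup Works

The principle is simple. When a file is backed up for the first time, its size, modification time, and MD5 hash are stored in the database. On the next backup the current state of the file is compared with the stored values. If neither the size nor the modification time has changed, the file is treated as unmodified and does not need to be replaced. If the size has changed, the file has been modified and must be replaced. If the modification time has changed but the size has not, the current MD5 hash of the file is compared with the one stored in the database, because an unchanged size does not prove that the content is unchanged. The same input always produces the same MD5 hash, so a different hash means the file really was modified, while an identical hash means the content is treated as unchanged.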
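The decision rule above can be sketched as a small function. This is an illustration under assumptions: ChangeDetector and needsBackup are hypothetical names, and the hash is computed with the JDK's MessageDigest rather than the DigestUtil helper that the project code uses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that applies the size / modification time / MD5 rule. */
final class ChangeDetector {

    /** Hex-encoded MD5 hash of the file content. */
    static String md5Hex(Path file) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Compares the file's current state with the values recorded at the last backup. */
    static boolean needsBackup(Path file, long recordedSize, long recordedModified,
                               String recordedMd5) throws IOException {
        long size = Files.size(file);
        long modified = Files.getLastModifiedTime(file).toMillis();
        if (size != recordedSize) {
            return true;   // size changed: the file was modified
        }
        if (modified == recordedModified) {
            return false;  // size and modification time both unchanged
        }
        // Same size but a different modification time: only the hash can tell
        return !md5Hex(file).equals(recordedMd5);
    }
}
```

The hash is computed only in the ambiguous case, so unchanged files cost two metadata reads and no content read.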

Controller

/**
 * Back up the specified data source.
 */
@GetMapping("/backupBySourceId/{sourceId}")
public Result backupBySourceId(@PathVariable Long sourceId) throws IOException {
    if (backupingSourceIDSet.contains(sourceId)) {
        throw new ClientException("This backup source is already being backed up, please try again later");
    }
    // Check that the backup source directory exists and prepare the backup target directories
    List<Task> taskList = backupService.checkSourceAndTarget(sourceId);
    if (taskList == null || taskList.size() == 0) {
        removeSourceIdFromBacking(backupingSourceIDSet, sourceId);
        return Results.failure();
    }
    // Start the backup
    backupingSourceIDSet.add(sourceId);
    CompletableFuture.runAsync(() -> {
        try {
            backupService.backupBySourceId(sourceId, taskList);
        } catch (ServerException e) {
            throw new RuntimeException(e);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }, executor).exceptionally(throwable -> {
        log.error(throwable.getMessage());
        removeSourceIdFromBacking(backupingSourceIDSet, sourceId);
        return null;
    });
    return Results.success();
}

/**
 * Remove a data source ID from the set of data sources that are being backed up.
 *
 * @param backupingSourceIDSet set of IDs of the data sources being backed up
 * @param sourceId             ID of the data source to remove
 */
private void removeSourceIdFromBacking(HashSet<Long> backupingSourceIDSet, Long sourceId) {
    // remove() does nothing when the ID is not in the set
    backupingSourceIDSet.remove(sourceId);
}

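The main details here are:

  • Before backing up, the controller checks whether the data source is already being backed up. backupingSourceIDSet can be thought of as a pool of backup IDs; an ID in the pool means that data source is currently being backed up. If it is, the request returns immediately with a message telling the user to try again later
  • Before the backup actually starts, the data source and the backup target directories are checked for existence, since the user may have forgotten to plug in a drive or may have mistyped a directory path
  • With a large amount of data the backup takes a while, but after clicking the backup button the user should be told whether the backup started successfully. CompletableFuture is therefore used to run the backup as an asynchronous task, and the request returns right away to report that the data source was queued for backup
  • After the backup finishes, the data source ID is removed from the backup ID pool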
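As an illustration of the ID-pool idea, the sketch below shows one possible variant, not the project's code. BackupGuard and tryStart are hypothetical names. It assumes a concurrent set so that checking and marking happen in a single atomic add(), and it releases the ID in whenComplete so that both success and failure clear it.

```java
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

/** Hypothetical guard that lets each data source run at most one backup at a time. */
final class BackupGuard {
    private final Set<Long> running = ConcurrentHashMap.newKeySet();

    /**
     * Starts the job asynchronously unless the source is already running.
     * Returns an empty Optional when the request is rejected.
     */
    Optional<CompletableFuture<Void>> tryStart(long sourceId, Runnable job, Executor executor) {
        if (!running.add(sourceId)) {
            return Optional.empty(); // add() returned false: the ID was already in the pool
        }
        CompletableFuture<Void> future = CompletableFuture.runAsync(job, executor)
                // Runs on success and on failure, so the ID is always released
                .whenComplete((ok, err) -> running.remove(sourceId));
        return Optional.of(future);
    }

    boolean isRunning(long sourceId) {
        return running.contains(sourceId);
    }
}
```

A controller built on this sketch would return the "already backing up" message when tryStart yields an empty Optional and a success response otherwise.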

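Service

The backup feature uses the following tables:

  • backup_source: stores the backup data sources
  • backup_target: stores the backup target directories, each linked to a data source; one data source can have many backup target directories
  • backup_task: stores the backup tasks
  • backup_file: stores the files that have been backed up
  • backup_file_history: stores the backup records of each backed-up file
  • sys_param: stores the files and directories that the system ignores during backup

The code below is where the actual business logic begins: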

/**
 * Back up the specified backup source.
 *
 * @param sourceId ID of the backup source
 * @param taskList backup tasks to run for this source
 */
@Override
public void backupBySourceId(Long sourceId, List<Task> taskList) throws IOException {
    // Update the backup count of the data source
    backupSourceService.updateBackupNum(sourceId);
    // Load the ignored file names and ignored directory names
    List<String> ignoreFileList = sysParamService.getIgnoreFileOrIgnoreDir(SystemParamEnum.IGNORE_FILE_NAME.getParamName());
    List<String> ignoreDirectoryList = sysParamService.getIgnoreFileOrIgnoreDir(SystemParamEnum.IGNORE_DIRECTORY_NAME.getParamName());
    // Run the backup, one asynchronous job per task
    CompletableFuture<?>[] futureArr = new CompletableFuture<?>[taskList.size()];
    for (int i = 0; i < taskList.size(); i++) {
        Task task = taskList.get(i);
        futureArr[i] = CompletableFuture.runAsync(() -> {
            try {
                backUpByTask(task, ignoreFileList, ignoreDirectoryList);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }, executor).exceptionally(e -> {
            log.error(e.getMessage());
            // The backup failed with an exception: remove the data source ID
            backupController.backupingSourceIDSet.remove(sourceId);
            Map<String, Object> dataMap = new HashMap<>();
            dataMap.put("code", WebsocketNoticeEnum.BACKUP_ERROR.getCode());
            dataMap.put("message", e.getMessage());
            webSocketServer.sendMessage(JSON.toJSONString(dataMap), WebSocketServer.usernameAndSessionMap.get("Admin"));
            return null;
        });
    }
    CompletableFuture.allOf(futureArr).join();
    // The backup is done: remove the data source ID
    backupController.backupingSourceIDSet.remove(sourceId);
}

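The method works as follows:

  1. Before running the backup, update the data source's backup count in the database
  2. Query sys_param for the files and directories to ignore, and skip them during the backup, because some files do not need to be backed up. For example, the .idea directory of a Java project is generated automatically when the project is opened in IDEA, and its contents differ between IDEA versions, so there is no need to back it up
  3. If one data source has to be backed up to several target directories at once, several threads are started so that each backup task runs separately, which improves backup efficiency. One backup task is responsible for backing up the data source to one target directory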
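Step 3 can be sketched in isolation as follows. ParallelTasks and runAll are hypothetical names. The sketch mirrors the pattern in the method above: each task gets its own CompletableFuture, a failure is handled per task in exceptionally, and allOf(...).join() waits for every task, so one failing target does not stop the others.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: runs independent backup tasks in parallel and waits for all. */
final class ParallelTasks {

    /** Returns the number of tasks that ended with an exception. */
    static int runAll(List<Runnable> tasks, Executor executor) {
        AtomicInteger failures = new AtomicInteger();
        CompletableFuture<?>[] futures = tasks.stream()
                .map(task -> CompletableFuture.runAsync(task, executor)
                        // Handle the failure here so that join() below does not throw
                        .exceptionally(e -> {
                            failures.incrementAndGet();
                            return null;
                        }))
                .toArray(CompletableFuture[]::new);
        CompletableFuture.allOf(futures).join();
        return failures.get();
    }
}
```

In the real method the exceptionally callback also notifies the front end; here it only counts failures to keep the sketch short.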

/**
 * Run the backup for one backup task.
 *
 * @param task                backup task
 * @param ignoreFileList      list of ignored file names
 * @param ignoreDirectoryList list of ignored directory names
 */
private void backUpByTask(Task task, List<String> ignoreFileList, List<String> ignoreDirectoryList) throws IOException {
    BackupSource backupSource = task.getSource();
    BackupTarget backupTarget = task.getTarget();
    // Collect the statistics (file count and byte count) of the data source
    BackupStatistic sta = new BackupStatistic(0, 0, 0, 0, new Date().getTime() / 1000);
    getStatisticMessage(new File(backupSource.getRootPath()), sta);
    String targetRootPath = getTargetRootPath(task, backupSource, backupTarget);
    // Insert the task into the database
    BackupTask backupTask = new BackupTask(backupSource.getRootPath(), targetRootPath,
            sta.totalBackupFileNum, 0, sta.totalBackupByteNum, 0L, 0, "0.0", "0.0", 0L, new Date());
    backupTaskService.save(backupTask);
    // Notify the front end that the task was created
    Map<String, Object> dataMap = new HashMap<>();
    dataMap.put("code", WebsocketNoticeEnum.BACKUP_START.getCode());
    dataMap.put("message", WebsocketNoticeEnum.BACKUP_START.getDetail());
    dataMap.put("backupTask", backupTask);
    webSocketServer.sendMessage(JSON.toJSONString(dataMap), WebSocketServer.usernameAndSessionMap.get("Admin"));
    log.info("Task created, starting backup");
    // Load the first-level (father_id = 0) backup file records of this data source
    QueryWrapper<BackupFile> backupFileQueryWrapper = new QueryWrapper<BackupFile>()
            .eq("backup_source_id", backupSource.getId())
            .eq("father_id", 0L)
            .select("id", "source_file_path", "target_file_path", "file_name");
    if (backupSource.getBackupType() == 0) {
        // Centralized backup: filter by target ID. Distributed backup: the target ID is not fixed, so load all records
        backupFileQueryWrapper.eq("backup_target_id", backupTarget.getId());
    }
    List<BackupFile> backupFileList = backupFileService.list(backupFileQueryWrapper);
    sta.second = new Date().getTime() / 1000;
    // Start the backup
    List<BackupFile> backupFileBuffer1 = new ArrayList<>();
    List<BackupFile> backupFileBuffer2 = new ArrayList<>();
    List<BackupFileHistory> backupFileHistoryBuffer1 = new ArrayList<>();
    List<BackupFileHistory> backupFileHistoryBuffer2 = new ArrayList<>();
    backUpAllFilesOfFatherFile(task, new File(backupSource.getRootPath()),
            backupSource, backupTarget, task.getTargetList(), sta, "", backupTask.getId(), backupTask.getCreateTime(),
            0L, backupFileList, ignoreFileList, ignoreDirectoryList,
            backupFileBuffer1, backupFileHistoryBuffer1,
            backupFileBuffer2, backupFileHistoryBuffer2);
    // Flush the data left in the buffers
    buffer1Process(backupFileBuffer1, backupFileHistoryBuffer1);
    Long targetId = backupSource.getBackupType() == 0 ? backupTarget.getId() : 0L;
    buffer2Process(backupSource.getId(), targetId, backupTask.getId(), backupSource, backupFileBuffer2, backupFileHistoryBuffer2);
    // Backup finished
    if (Cache.STOP_TASK_ID_SET.contains(backupTask.getId())) {
        // The run ended because the task was paused
        Cache.STOP_TASK_ID_SET.remove(backupTask.getId());
    } else {
        // The backup completed: mark the task as finished
        backupTask.setBackupStatus(2);
        backupTask.setFinishFileNum(sta.getTotalBackupFileNum());
        backupTask.setFinishByteNum(sta.getTotalBackupByteNum());
        backupTask.setEndTime(new Date());
        backupTask.setBackupTime(backupTask.getEndTime().getTime() - backupTask.getCreateTime().getTime());
        backupTaskService.updateById(backupTask);
        setProgress(backupTask);
        log.info("Sending task message: notify the front end that the backup task is complete");
        dataMap = new HashMap<>();
        dataMap.put("code", WebsocketNoticeEnum.BACKUP_SUCCESS.getCode());
        dataMap.put("message", WebsocketNoticeEnum.BACKUP_SUCCESS.getDetail());
        dataMap.put("backupTask", backupTask);
        webSocketServer.sendMessage(JSON.toJSONString(dataMap), WebSocketServer.usernameAndSessionMap.get("Admin"));
    }
}

/**
 * Collect the statistics of a directory:
 * 1. number of files to back up
 * 2. number of bytes to back up
 *
 * @param file directory to scan
 * @param sta  object that receives the statistics
 */
private void getStatisticMessage(File file, BackupStatistic sta) {
    File[] fileArr = file.listFiles();
    if (fileArr == null) {
        // Not a directory, or the directory cannot be read
        return;
    }
    for (File f : fileArr) {
        if (f.isDirectory()) {
            // Directory: recurse to count the files below it
            getStatisticMessage(f, sta);
        } else {
            // File: add it to the statistics
            sta.totalBackupFileNum++;
            sta.totalBackupByteNum += f.length();
        }
    }
}

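This method handles the backup of a single task. The flow is:

  1. Use the recursive method getStatisticMessage to count how many files there are in total under the root directory of the data source, which later makes it possible to visualize progress (with a large amount of data this method is slow and needs further optimization)
  2. Save the backup task to the database, then notify the front end over WebSocket that the backup has started, and tell it the total number of files and the total number of bytes that the task has to back up, similar to the figure below

[Figure: backup task progress shown in the front end]

  3. Load in one query the backup file records at the first depth level of the current data source; these records have father_id 0. In practice a directory can contain subdirectories and files, and those subdirectories can in turn contain subdirectories or files, so the structure can be seen as a file tree, which is where the notion of depth comes from
  4. Enter the recursive backup method backUpAllFilesOfFatherFile, which checks whether each directory and each file needs to be backed up
  5. After the backup completes, store the data left in the buffers in the database
  6. Update the backup task status in the database
  7. Notify the front end over WebSocket that the task has finished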
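Step 1 notes that the counting pass is slow on large trees. One direction that may be worth trying, shown here only as an unmeasured sketch, is Files.walkFileTree, which passes each file's attributes to the visitor so that the size does not have to be fetched with a separate call. TreeStats is a hypothetical name and is not part of the project.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical replacement for the counting pass, based on Files.walkFileTree. */
final class TreeStats {
    long fileCount;
    long byteCount;

    static TreeStats of(Path root) throws IOException {
        TreeStats stats = new TreeStats();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) {
                    stats.fileCount++;
                    stats.byteCount += attrs.size(); // size comes with the attributes
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Skip entries that cannot be read instead of aborting the whole walk
                return FileVisitResult.CONTINUE;
            }
        });
        return stats;
    }
}
```

Whether this is actually faster depends on the file system and platform, so it should be benchmarked against getStatisticMessage on a realistic data source before being adopted.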

/**
 * Back up all files of a parent directory into the target directory.
 *
 * @param fatherFile      parent directory being processed
 * @param backupSource    backup source
 * @param backupTarget    backup target
 * @param backupStatistic running statistics of the backup
 * @param middlePath      relative path accumulated during the recursion
 */
private void backUpAllFilesOfFatherFile(Task task, File fatherFile, BackupSource backupSource, BackupTarget backupTarget,
                                        List<BackupTarget> targetList, BackupStatistic backupStatistic, String middlePath,
                                        Long backupTaskId, Date taskBackupStartTime, Long fatherId,
                                        List<BackupFile> backupFileList, List<String> ignoreFileList, List<String> ignoreDirectoryList,
                                        List<BackupFile> backupFileBuffer1, List<BackupFileHistory> backupFileHistoryBuffer1,
                                        List<BackupFile> backupFileBuffer2, List<BackupFileHistory> backupFileHistoryBuffer2) {
    File[] sonFileArr = fatherFile.listFiles();
    if (sonFileArr == null) {
        // Not a directory, or the directory cannot be read
        return;
    }
    // Index the existing backup file records of this directory by file name
    HashMap<String, BackupFile> fileNameAndBackupFileMap = new HashMap<>();
    if (backupFileList != null) {
        for (BackupFile backupFile : backupFileList) {
            fileNameAndBackupFileMap.put(backupFile.getFileName(), backupFile);
        }
    }
    for (File file : sonFileArr) {
        if (Cache.STOP_TASK_ID_SET.contains(backupTaskId)) {
            // The task was paused: stop backing up and store the current state of the task
            BackupTask backupTask = new BackupTask();
            backupTask.setId(backupTaskId);
            backupTask.setBackupStatus(4);
            backupTask.setFinishFileNum(backupStatistic.getFinishBackupFileNum());
            backupTask.setFinishByteNum(backupStatistic.getFinishBackupByteNum());
            backupTask.setEndTime(new Date());
            backupTask.setBackupTime(backupTask.getEndTime().getTime() - taskBackupStartTime.getTime());
            backupTaskService.updateById(backupTask);
            // The fields set after updateById are only sent to the front end
            backupTask.setTotalFileNum(backupStatistic.getTotalBackupFileNum());
            backupTask.setTotalByteNum(backupStatistic.getTotalBackupByteNum());
            setProgress(backupTask);
            backupTask.setBackupSourceRoot(backupSource.getRootPath());
            backupTask.setBackupTargetRoot(backupTarget.getTargetRootPath());
            backupTask.setCreateTime(taskBackupStartTime);
            log.info("Sending task message: notify the front end that the task is paused");
            Map<String, Object> dataMap = new HashMap<>();
            dataMap.put("code", WebsocketNoticeEnum.BACKUP_STOP.getCode());
            dataMap.put("message", WebsocketNoticeEnum.BACKUP_STOP.getDetail());
            dataMap.put("backupTask", backupTask);
            webSocketServer.sendMessage(JSON.toJSONString(dataMap), WebSocketServer.usernameAndSessionMap.get("Admin"));
            break;
        }
        if (file.isDirectory()) {
            // Directory: make sure it exists under the target, then back up its files recursively
            if (isContainedInIgnoreList(ignoreDirectoryList, file)) {
                continue;
            }
            String targetFilePath = getTargetFilePath(backupSource, backupTarget, targetList, middlePath, file);
            // Check whether the backup file table already has a record for this directory
            BackupFile backupFile = fileNameAndBackupFileMap.get(file.getName());
            Long curBackupFileId = backupFile == null ? null : backupFile.getId();
            File targetFile = new File(targetFilePath);
            if (!targetFile.exists() && !targetFile.mkdirs()) {
                throw new ServiceException("Unable to create the directory, possibly due to insufficient permissions");
            }
            // Whether the directory was just created or already existed, store a record if the database has none
            if (curBackupFileId == null) {
                curBackupFileId = saveBackupFileDir(backupSource, backupTarget, targetFilePath, fatherId, file);
            }
            // Child records can only exist if this directory already had a record before this run
            boolean haveBackupFile = backupFile != null;
            List<BackupFile> children = null;
            if (haveBackupFile) {
                children = new ArrayList<>(backupFileService.list(new QueryWrapper<BackupFile>()
                        .eq("backup_source_id", backupSource.getId())
                        .eq("father_id", curBackupFileId)));
            }
            backUpAllFilesOfFatherFile(task, file, backupSource, backupTarget,
                    targetList, backupStatistic,
                    middlePath + file.getName() + File.separator, backupTaskId, taskBackupStartTime,
                    curBackupFileId, children,
                    ignoreFileList, ignoreDirectoryList,
                    backupFileBuffer1, backupFileHistoryBuffer1,
                    backupFileBuffer2, backupFileHistoryBuffer2);
        } else {
            // File: run the backup
            if (isContainedInIgnoreList(ignoreFileList, file)) {
                continue;
            }
            if (file.getName().contains(".DS_Store")) {
                // Skip the files created by the macOS Finder
                continue;
            }
            try {
                execSingleFileBackUp(task, backupSource, backupTarget, targetList, file.toString(),
                        backupStatistic, middlePath, backupTaskId, taskBackupStartTime, fatherId,
                        fileNameAndBackupFileMap, backupFileBuffer1, backupFileHistoryBuffer1,
                        backupFileBuffer2, backupFileHistoryBuffer2);
            } catch (SQLException | IOException e) {
                throw new RuntimeException(e);
            }
        }
    }
}

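This method recursively handles the backup of one directory. The logic is:

  1. Put the backup file records that belong to the directory into a map, which speeds up the later lookups when checking whether a file has been modified
  2. While looping over sonFileArr, first check whether the current task has been paused. If the task ID is in the paused-ID pool STOP_TASK_ID_SET, stop the current task, update the task status in the database, and notify the front end that the task was paused
  3. Determine whether the current child is a directory or a file. For a directory go to step 4, otherwise go to step 5
  4. Check whether the directory is ignored; if so, continue with the next entry. Create the directory under the backup target if it does not exist there. Check whether backup_file has a matching record and store one if it does not. Then query the child backup file records (children) of the directory and call backUpAllFilesOfFatherFile recursively
  5. Check whether the file is ignored; if so, continue with the next entry. Otherwise call execSingleFileBackUp to back up the single file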

/**
 * Back up a single file.
 * First determine whether the file has never been backed up or has been modified; if so, back it up.
 *
 * @param source               backup source
 * @param target               backup target
 * @param backupSourceFilePath path of the source file
 * @param backupStatistic      running statistics of the backup
 * @param middlePath           relative path accumulated during the recursion
 * @throws SQLException
 * @throws IOException
 */
private void execSingleFileBackUp(Task task, BackupSource source, BackupTarget target, List<BackupTarget> targetList,
                                  String backupSourceFilePath, BackupStatistic backupStatistic, String middlePath,
                                  Long backupTaskId, Date taskBackupStartTime, Long fatherId,
                                  HashMap<String, BackupFile> fileNameAndBackupFileMap,
                                  List<BackupFile> backupFileBuffer1, List<BackupFileHistory> backupFileHistoryBuffer1,
                                  List<BackupFile> backupFileBuffer2, List<BackupFileHistory> backupFileHistoryBuffer2) throws SQLException, IOException {
    // Source file
    File backupSourceFile = new File(backupSourceFilePath);
    Long targetId = source.getBackupType() == 0 ? target.getId() : 0L;
    BackupFile backupFileInDatabase = fileNameAndBackupFileMap.get(backupSourceFile.getName());
    if (backupFileInDatabase == null) {
        // The file has not been backed up yet
        // Resolve the backup target path
        String targetFilePath = getTargetFilePath(source, target, targetList, middlePath, backupSourceFile);
        int isCompress = 0;
        if (isNeedCompress(source, backupSourceFile)) {
            // Compress only when the data source enables compression and the file meets the 10 MB size condition
            isCompress = 1;
            targetFilePath = updateTargetFilePath(targetFilePath);
        }
        BackupFile backupFile = constructBackupFile(source, backupSourceFilePath, targetFilePath, targetId,
                fatherId, isCompress, backupSourceFile);
        String md5str;
        try (FileInputStream sourceFileInputStream = new FileInputStream(backupSourceFilePath)) {
            md5str = DigestUtil.md5Hex(sourceFileInputStream);
        }
        // The backupFileId (0L) is a placeholder; the real ID is set when buffer 1 is flushed
        BackupFileHistory backupFileHistory = constructBackupFileHistory(backupSourceFilePath, source.getId(), targetId,
                targetFilePath, 0L, backupTaskId, new Date(), backupSourceFile, md5str);
        addToBuffer1(backupFile, backupFileHistory, backupFileBuffer1, backupFileHistoryBuffer1,
                isCompress, backupSourceFile, targetFilePath);
    } else {
        // The record was found in the map
        addToBuffer2(source.getId(), targetId, backupTaskId,
                source, backupFileInDatabase,
                backupFileBuffer2, backupFileHistoryBuffer2);
    }
    // Report the copy progress about once per second
    backupStatistic.finishBackupFileNum++;
    backupStatistic.finishBackupByteNum += backupSourceFile.length();
    long curTime = System.currentTimeMillis();
    if ((curTime / 1000) != backupStatistic.second) {
        backupStatistic.second = curTime / 1000;
        BackupTask backupTask = new BackupTask();
        backupTask.setId(backupTaskId);
        backupTask.setBackupStatus(1);
        backupTask.setFinishFileNum(backupStatistic.finishBackupFileNum);
        backupTask.setFinishByteNum(backupStatistic.finishBackupByteNum);
        backupTask.setBackupTime(curTime - taskBackupStartTime.getTime());
        backupTaskService.updateById(backupTask);
        // The remaining fields are only for the front end and are not written to the database
        backupTask.setBackupSourceRoot(source.getRootPath());
        backupTask.setBackupTargetRoot(getTargetRootPath(task, source, target));
        backupTask.setTotalFileNum(backupStatistic.totalBackupFileNum);
        backupTask.setTotalByteNum(backupStatistic.totalBackupByteNum);
        backupTask.setCreateTime(taskBackupStartTime);
        setProgress(backupTask);
        log.info("Sending task message: notify the front end of the progress change");
        Map<String, Object> dataMap = new HashMap<>();
        dataMap.put("code", WebsocketNoticeEnum.BACKUP_PROCESS.getCode());
        dataMap.put("message", WebsocketNoticeEnum.BACKUP_PROCESS.getDetail());
        dataMap.put("backupTask", backupTask);
        webSocketServer.sendMessage(JSON.toJSONString(dataMap), WebSocketServer.usernameAndSessionMap.get("Admin"));
    }
}

/**
 * Handle a file that has no record in the database yet, which means it has definitely not been backed up.
 * 1. Back it up
 * 2. Queue the file record and its backup history record for a batch insert
 *
 * @param backupFile        record of the file to back up
 * @param backupFileBuffer1 buffer of file records waiting to be inserted
 */
private void addToBuffer1(BackupFile backupFile, BackupFileHistory backupFileHistory,
                          List<BackupFile> backupFileBuffer1, List<BackupFileHistory> backupFileHistoryBuffer1,
                          int isCompress, File backupSourceFile, String targetFilePath) {
    // Copy the file
    try {
        if (execBackupSingleFile(isCompress, backupSourceFile, targetFilePath)) {
            backupFileBuffer1.add(backupFile);
            backupFileHistoryBuffer1.add(backupFileHistory);
        } else {
            log.error("Backup failed");
        }
    } catch (Exception e) {
        log.error("File backup failed");
        throw new RuntimeException(e);
    }
    if (backupFileBuffer1.size() > this.BATCH_SIZE) {
        buffer1Process(backupFileBuffer1, backupFileHistoryBuffer1);
    }
}

private void addToBuffer2(Long backupSourceId, Long backupTargetId, Long backupTaskId, BackupSource backupSource,
                          BackupFile backupFileInDatabase,
                          List<BackupFile> backupFileBuffer2, List<BackupFileHistory> backupFileHistoryBuffer2) throws IOException {
    backupFileBuffer2.add(backupFileInDatabase);
    if (backupFileBuffer2.size() >= this.BATCH_SIZE) {
        buffer2Process(backupSourceId, backupTargetId, backupTaskId, backupSource, backupFileBuffer2, backupFileHistoryBuffer2);
    }
}

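The execSingleFileBackUp method decides whether a file has been backed up before, or has been modified since its last backup. A file that has never been backed up or that has been modified needs to be backed up. The flow is:

  1. Check whether fileNameAndBackupFileMap contains the current file name. If it does not, the file has never been backed up, so go to step 2; if it does, the file was backed up before, so go to step 3
  2. Build the backupFile and backupFileHistory objects, copy the file, and add both objects to buffer 1
  3. Take the backupFile from fileNameAndBackupFileMap and add it to buffer 2
  4. In addition to the steps above, notify the front end of the current backup progress about once per second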
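The once-per-second rule in step 4 works by comparing the current wall-clock second with the second of the last report. A minimal sketch, with the hypothetical class name ProgressThrottle and the time passed in as a parameter so that the behavior is easy to check:

```java
/** Hypothetical helper: allows at most one progress report per wall-clock second. */
final class ProgressThrottle {
    private long lastSecond = -1;

    /** Returns true when the given time falls in a different second than the last report. */
    boolean shouldReport(long nowMillis) {
        long second = nowMillis / 1000;
        if (second == lastSecond) {
            return false;
        }
        lastSecond = second;
        return true;
    }
}
```

In the backup loop the caller would pass System.currentTimeMillis() and send the WebSocket message only when shouldReport returns true. The check runs once per processed file, so a single very large file can delay the next report by more than a second.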

private void buffer1Process(List<BackupFile> backupFileBuffer1, List<BackupFileHistory> backupFileHistoryBuffer1) {
    // Batch insert the file records; the loop below relies on saveBatch having filled in their IDs
    backupFileService.saveBatch(backupFileBuffer1);
    for (int i = 0; i < backupFileHistoryBuffer1.size(); i++) {
        backupFileHistoryBuffer1.get(i).setBackupFileId(backupFileBuffer1.get(i).getId());
    }
    // Batch insert the backup history records
    backupFileHistoryService.saveBatch(backupFileHistoryBuffer1);
    backupFileHistoryBuffer1.clear();
    backupFileBuffer1.clear();
}

This method runs when buffer 1 is full. It simply batch-inserts the backup file records and the backup history records, then clears the buffers.

private void buffer2Process(Long backupSourceId, Long backupTargetId, Long backupTaskId, BackupSource backupSource,
                            List<BackupFile> backupFileBuffer2, List<BackupFileHistory> backupFileHistoryBuffer2) throws IOException {
    if (backupFileBuffer2.isEmpty()) {
        // Nothing to check, so skip the database query
        return;
    }
    List<BackupFile> updateBackupFileBuffer = new ArrayList<>();
    List<Long> backupFileIdList = backupFileBuffer2.stream().map(BackupFile::getId).collect(Collectors.toList());
    // Load the latest backup history record of each of these backup files
    Map<Long, BackupFileHistory> fileIdAndFileHistoryMap = new HashMap<>();
    List<BackupFileHistory> historyList = backupFileHistoryService.listLastBackupHistoryByBackupFileIdList(backupFileIdList);
    for (BackupFileHistory fileHistory : historyList) {
        fileIdAndFileHistoryMap.put(fileHistory.getBackupFileId(), fileHistory);
    }
    for (BackupFile backupFile : backupFileBuffer2) {
        String md5str = null;
        boolean isNeedBackup = true;
        BackupFileHistory fileHistory = fileIdAndFileHistoryMap.get(backupFile.getId());
        File backupSourceFile = new File(backupFile.getSourceFilePath());
        // Resolve the backup target path
        String targetFilePath = backupFile.getTargetFilePath();
        int isCompress = 0;
        if (isNeedCompress(backupSource, backupSourceFile)) {
            // Compress only when the data source enables compression and the file meets the 10 MB size condition
            isCompress = 1;
            targetFilePath = updateTargetFilePath(targetFilePath);
        }
        if (fileHistory != null) {
            boolean sameModifyTime = fileHistory.getModifyTime() == backupSourceFile.lastModified();
            boolean sameSize = fileHistory.getFileSize() == backupSourceFile.length();
            if (sameModifyTime && sameSize) {
                // Modification time and size both match the database: treat the file as unmodified
                isNeedBackup = false;
            } else if (sameSize) {
                // Same size but a different modification time: compare the MD5 hash as well.
                // The same input always yields the same MD5 hash, so a matching hash means no backup is needed
                try (FileInputStream in = new FileInputStream(backupSourceFile)) {
                    md5str = DigestUtil.md5Hex(in);
                }
                if (md5str.equals(fileHistory.getMd5())) {
                    isNeedBackup = false;
                }
            }
        }
        if (!isNeedBackup && !new File(targetFilePath).exists()) {
            // If the file is missing from the backup target directory, back it up as well
            isNeedBackup = true;
        }
        if (isNeedBackup) {
            Date startDate = new Date();
            try {
                // Make sure the parent directory of the target file exists
                // (it may have been deleted from the target after an earlier backup)
                File dir = new File(targetFilePath).getParentFile();
                if (dir != null && !dir.exists()) {
                    dir.mkdirs();
                }
                if (!execBackupSingleFile(isCompress, backupSourceFile, targetFilePath)) {
                    log.error("Backup failed");
                } else {
                    if (md5str == null) {
                        try (FileInputStream in = new FileInputStream(backupSourceFile)) {
                            md5str = DigestUtil.md5Hex(in);
                        }
                    }
                    // Save the backup history of the file.
                    // fileHistory is expected to exist for every record that reaches buffer 2
                    BackupFileHistory history = constructBackupFileHistory(backupFile.getSourceFilePath(), backupSourceId, backupTargetId,
                            targetFilePath, backupFile.getId(), backupTaskId, startDate, backupSourceFile, md5str);
                    history.setId(fileHistory.getId());
                    updateBackupFileHistory(history, backupFileHistoryBuffer2);
                    // Update the file record
                    BackupFile newBackupFile = new BackupFile();
                    // updateBatchById matches rows by ID
                    newBackupFile.setId(backupFile.getId());
                    // The file size may have changed
                    newBackupFile.setFileLength(backupSourceFile.length());
                    // If the file size changed, the compressed size changes as well
                    if (isCompress == 1) {
                        File targetFile = new File(targetFilePath);
                        newBackupFile.setFileLengthAfterCompress(targetFile.length());
                    }
                    // A file that used to be compressed may no longer qualify after the change, and vice versa
                    newBackupFile.setIsCompress(isCompress);
                    // Update the backup count of the file
                    int backupNum = backupFile.getBackupNum();
                    newBackupFile.setBackupNum(++backupNum);
                    // Update the last backup time of the file
                    newBackupFile.setLastBackupTime(new Date());
                    updateBackupFileBuffer.add(newBackupFile);
                }
            } catch (Exception e) {
                log.error("File backup failed");
                throw new RuntimeException(e);
            }
        }
    }
    // Batch update the backup file records
    if (!updateBackupFileBuffer.isEmpty()) {
        backupFileService.updateBatchById(updateBackupFileBuffer);
    }
    backupFileBuffer2.clear();
}

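This method runs when buffer 2 is full. It works as follows:

  1. Using the collection of backup files, batch-query the backup history record of each file and put the results into the map fileIdAndFileHistoryMap for later use
  2. Iterate over every backupFile in the buffer, get the matching fileHistory from fileIdAndFileHistoryMap, and use it to decide whether the file needs to be backed up again
  3. If it does, call execBackupSingleFile to copy it, and after a successful copy update the backup history and the backup file record. Note that updates are still batched: they are applied once enough records have accumulated

Note that the following code acts as a safety net against accidental deletion in the backup target directory: if the target directory has no corresponding file, the file is assumed to have been deleted by mistake and is backed up again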
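Step 1 depends on having exactly one history record per file ID. The project obtains this from listLastBackupHistoryByBackupFileIdList, whose implementation is not shown in this article. As an illustration only, the sketch below picks the newest record per file in memory; History, its fields, and latestByFileId are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical helper: keeps only the newest history record of each file. */
final class LatestHistory {

    /** Minimal stand-in for a backup history row. */
    static final class History {
        final long fileId;
        final long backupTime;
        final String md5;

        History(long fileId, long backupTime, String md5) {
            this.fileId = fileId;
            this.backupTime = backupTime;
            this.md5 = md5;
        }
    }

    /** Builds a fileId -> newest history map; on duplicate keys the later backupTime wins. */
    static Map<Long, History> latestByFileId(List<History> rows) {
        return rows.stream().collect(Collectors.toMap(
                h -> h.fileId,
                Function.identity(),
                (a, b) -> a.backupTime >= b.backupTime ? a : b));
    }
}
```

Doing the selection in SQL is usually preferable for large tables; the in-memory version is shown only because it makes the "one record per file" contract explicit.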

if (isNeedBackup == false) {
    // If the file is missing from the backup target directory, back it up as well
    File file = new File(targetFilePath);
    if (!file.exists()) {
        isNeedBackup = true;
    }
}

/**
 * Copy a single file.
 *
 * @param isCompress       whether to compress the file
 * @param backupSourceFile source file to back up
 * @param targetFilePath   path of the backup target file
 * @return true if the file was backed up, false if the backup failed
 * @throws IOException
 */
private boolean execBackupSingleFile(int isCompress, File backupSourceFile, String targetFilePath) throws IOException {
    try {
        if (isCompress == 1) {
            // Compress the file
            GzipCompressUtil.compressFile(backupSourceFile, targetFilePath);
        } else {
            // Plain copy
            backupWithFileChannel(backupSourceFile, new File(targetFilePath));
        }
    } catch (Exception e) {
        log.error("Failed to back up " + backupSourceFile + " to " + targetFilePath, e);
        return false;
    }
    return true;
}

/**
 * Copy source to target.
 *
 * @param source file to copy
 * @param target destination file
 * @throws IOException if the source is missing or the copy fails
 */
private static void backupWithFileChannel(File source, File target) throws IOException {
    if (!source.exists()) {
        // Report the problem to the caller so that the copy is not counted as successful
        throw new IOException("Backup source file does not exist: " + source);
    }
    try (FileChannel inputChannel = new FileInputStream(source).getChannel();
         FileChannel outputChannel = new FileOutputStream(target).getChannel()) {
        long size = inputChannel.size();
        long position = 0;
        // transferFrom may copy fewer bytes than requested, so loop until everything is copied
        while (position < size) {
            long copied = outputChannel.transferFrom(inputChannel, position, size - position);
            if (copied <= 0) {
                break;
            }
            position += copied;
        }
    }
}

This method mainly uses NIO to copy the file. If compression is selected, the file is instead compressed and written directly to the target path.
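The implementation of GzipCompressUtil is not shown in this article. The sketch below is only an assumption about what the compression branch could look like, using the JDK's GZIPOutputStream; GzipSketch and its methods are hypothetical names, and InputStream.transferTo and readAllBytes require Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical stand-in for the compression branch of the single-file copy. */
final class GzipSketch {

    /** Compresses source into target in gzip format. */
    static void compress(Path source, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            in.transferTo(out); // closing the gzip stream writes the trailer
        }
    }

    /** Reads a gzip file back into memory; only meant for small files and checks. */
    static byte[] decompress(Path gzipFile) throws IOException {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(gzipFile))) {
            return in.readAllBytes();
        }
    }
}
```

Because the compressed copy has a different size than the source, the backup code records the compressed length separately, as seen in setFileLengthAfterCompress above.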

/**
 * Check that the backup source directory exists and prepare the backup target directories.
 *
 * @param sourceId ID of the backup source
 * @return the backup tasks to run
 */
@Override
public List<Task> checkSourceAndTarget(Long sourceId) {
    BackupSource source = backupSourceService.getById(sourceId);
    if (source == null) {
        throw new ClientException("No backup source with this id exists in the database");
    }
    File sourceFile = new File(source.getRootPath());
    if (!sourceFile.exists()) {
        throw new ServiceException("The backup source directory does not exist, please check whether the backup source was deleted");
    }
    // Load all backup target directories of this source and prepare them
    List<BackupTarget> backupTargetList = backupTargetService.list(
            new QueryWrapper<BackupTarget>().eq("backup_source_id", source.getId()));
    if (backupTargetList.size() == 0) {
        throw new ClientException("No backup target directory is configured for this backup source, please configure one first");
    }
    // Target directories that are not usable
    List<BackupTarget> unNormalTargetList = new ArrayList<>();
    for (BackupTarget backupTarget : backupTargetList) {
        File file = new File(backupTarget.getTargetRootPath());
        if (!file.exists()) {
            boolean mkdir = file.mkdir();
            if (!mkdir) {
                unNormalTargetList.add(backupTarget);
                throw new ServiceException("Failed to create the target directory, please check that the backup target disk is connected to the computer");
            }
        }
    }
    backupTargetList.removeAll(unNormalTargetList);
    if (backupTargetList.size() == 0) {
        // No usable target directory: remove the source from the set of sources being backed up
        backupController.backupingSourceIDSet.remove(sourceId);
        return new ArrayList<>();
    }
    List<Task> taskList = null;
    if (source.getBackupType() == 0) {
        taskList = backupTargetList.stream().map(item -> new Task(source, item, null)).collect(Collectors.toList());
    } else if (source.getBackupType() == 1) {
        Task task = new Task(source, null, backupTargetList);
        taskList = new ArrayList<>();
        taskList.add(task);
    }
    return taskList;
}

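This method checks that the data source and the backup target directories are ready, and prepares the backup tasks.

Notes

The backup logic is fairly complex and the code may be optimized at any time, so the code in this article is for reference only. If you are interested in the latest code, please look it up in the Git repository.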


Reposted from: https://blog.csdn.net/laodanqiu/article/details/136544911
Copyright belongs to the original author, Hello Dam. In case of infringement, please contact us for removal.
