Development Notes: Compressing and Decompressing Strings in Java

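1. Scenario:

Database column lengths are limited, and a column's definition usually cannot be changed at will. A length that was adequate when the column was first defined may stop being adequate later: as the business changes or volume grows, the data stored in the column gets longer, and a write can fail because the value exceeds the column length. In that situation it makes sense to compress the string before writing it to the database and to decompress it after reading it back.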

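2. The CompressUtil class:

Implemented on Java 8 with the JDK's built-in gzip streams (java.util.zip). The class also uses Lombok for logging and Apache Commons Codec for Base64.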

import lombok.extern.slf4j.Slf4j;
import org.apache.commons.codec.binary.Base64;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

@Slf4j
public class CompressUtil {
    
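    public static String compress(String str) {
        // Nothing to compress: return null or the empty string unchanged.
        if (str == null || str.isEmpty()) {
            return str;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the gzip stream (via try-with-resources) writes the gzip trailer into out.
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(str.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            // Passing e as the last argument makes SLF4J log the stack trace.
            log.error("String compression failed, str:{}", str, e);
            throw new RuntimeException("String compression failed", e);
        }
        // Base64 turns the binary gzip output into plain ASCII text.
        return Base64.encodeBase64String(out.toByteArray());
    }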
    
    public static String uncompress(String compressedStr) {
        // Nothing to decompress: return null or the empty string unchanged.
        if (compressedStr == null || compressedStr.isEmpty()) {
            return compressedStr;
        }
        // Undo the Base64 text encoding to get the raw gzip bytes back.
        byte[] compressed = Base64.decodeBase64(compressedStr);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // try-with-resources closes the gzip stream even if reading fails.
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            int len;
            while ((len = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, len);
            }
            // Decode with the same charset that compress() used to encode.
            return out.toString(StandardCharsets.UTF_8.name());
        } catch (IOException e) {
            // Passing e as the last argument makes SLF4J log the stack trace.
            log.error("String decompression failed, compressedStr:{}", compressedStr, e);
            throw new RuntimeException("String decompression failed", e);
        }
    }
}
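
For readers who would rather not depend on Commons Codec and Lombok, the same round trip can probably be sketched with JDK classes only. This is a variant under stated assumptions, not the class above: `GzipBase64Sketch` is a hypothetical name, `java.util.Base64` stands in for the Commons Codec `Base64`, and I/O errors are rethrown as `UncheckedIOException` instead of being logged. Note also that `java.util.Base64` rejects malformed input with an `IllegalArgumentException`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical JDK-only variant of CompressUtil (name and error handling are assumptions).
final class GzipBase64Sketch {

    static String compress(String str) {
        if (str == null || str.isEmpty()) {
            return str;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the gzip stream flushes the remaining data and the gzip trailer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(str.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException("String compression failed", e);
        }
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    static String uncompress(String compressedStr) {
        if (compressedStr == null || compressedStr.isEmpty()) {
            return compressedStr;
        }
        byte[] compressed = Base64.getDecoder().decode(compressedStr);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            int len;
            while ((len = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException("String decompression failed", e);
        }
        // Decode with the same charset that compress() used.
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Calling `GzipBase64Sketch.uncompress(GzipBase64Sketch.compress(s))` should return a string equal to `s`, including for null and empty input, which are passed through unchanged.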

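3. Points to note:

1) CompressUtil uses the same character set (UTF-8) when compressing and when decompressing. This prevents results that differ from the original text because the two steps used different character sets.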
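
To make the charset point concrete, here is a minimal sketch of what goes wrong when the bytes are written with one charset and read with another. `CharsetMismatchDemo` and its method names are hypothetical and exist only for illustration; ISO-8859-1 is just one example of a mismatched charset.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: same bytes, decoded with matching and mismatched charsets.
final class CharsetMismatchDemo {

    // Encode with UTF-8 and decode with UTF-8: the text survives.
    static String sameCharset(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encode with UTF-8 but decode with ISO-8859-1: multi-byte characters are garbled.
    static String mismatchedCharset(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "\u5b57\u7b26\u4e32"; // three CJK characters
        System.out.println(text.equals(sameCharset(text)));       // true
        System.out.println(text.equals(mismatchedCharset(text))); // false
    }
}
```

Pure ASCII text would survive both paths, which is why this kind of bug often only shows up once non-ASCII data reaches production.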

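2) In a web project, the server may return the compressed string to the front end, and the front end may later send it back to the server in an ajax request. If the compressed bytes were put into a string directly, the content could be altered during HTTP transmission, and the server would then fail when decompressing it.

CompressUtil avoids this by encoding the compressed bytes with Base64.encodeBase64String and decoding them with Base64.decodeBase64, so the value that travels is plain ASCII text. Standard Base64 output can still contain '+', '/' and '=', so it should be URL-encoded when it is placed in a query string or a form-encoded body.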
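
One caveat is worth checking before relying on Base64 alone: the standard alphabet includes '+' and '/', and a '+' in a query string or form-encoded body is commonly decoded as a space. The sketch below compares the standard and URL-safe alphabets using the JDK's `java.util.Base64` (not the Commons Codec class used in CompressUtil); `Base64AlphabetDemo` is a hypothetical name.

```java
import java.util.Base64;

// Hypothetical demo class comparing the two Base64 alphabets in the JDK.
final class Base64AlphabetDemo {

    // Standard alphabet: may produce '+' and '/'.
    static String standard(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // URL-safe alphabet: uses '-' and '_' instead.
    static String urlSafe(byte[] data) {
        return Base64.getUrlEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xfb, (byte) 0xff};
        System.out.println(standard(data)); // +/8=
        System.out.println(urlSafe(data));  // -_8=
    }
}
```

If the compressed value only ever travels inside a JSON body or a database column, the standard alphabet is fine; the URL-safe form matters mainly when the value ends up in a URL or a form field.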

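3) What should happen when compression or decompression fails?
As the CompressUtil class shows, if compression or decompression fails with an exception, a runtime exception is thrown to the caller so that the caller notices promptly and can handle it.

How to handle it depends on the business scenario. In my case the utility is called from an MQ consumer, and exceptions are caught in one place in the MQ layer, so a failed compression is retried. If it still fails after several retries, an alert is raised and logged, and someone on the team follows up.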
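
The "retry a few times, then alert" behaviour described above is provided by the MQ layer in the author's setup, and its code is not shown. Purely as an illustration, a framework-free version might look like the sketch below. `RetrySketch` and `withRetry` are hypothetical names, and the final exception stands in for whatever alerting hook a real system would call.

```java
import java.util.function.Supplier;

// Hypothetical bounded-retry helper; real MQ clients usually provide their own redelivery.
final class RetrySketch {

    // Runs the action up to maxAttempts times and returns the first successful result.
    static <T> T withRetry(int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                // Remember the failure and try again until attempts run out.
                last = e;
            }
        }
        // All attempts failed: surface the last error so the caller can alert on it.
        throw new IllegalStateException("Giving up after " + maxAttempts + " attempts", last);
    }
}
```

A call such as `RetrySketch.withRetry(3, () -> CompressUtil.uncompress(payload))` would be one way to use it, where `payload` is an assumed variable. Keep in mind that decompressing corrupt input fails the same way every time, so retries only help with transient causes.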

4. Unit test:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class CompressUtilTest {
    @Test
    public void test1() {
        // Build a highly repetitive 100,000-character input.
        StringBuilder stringBuilder = new StringBuilder();
        for (int i = 0; i < 100000; i++) {
            stringBuilder.append("1");
        }
        String original = stringBuilder.toString();
        System.out.println(original.length());
        String compress = CompressUtil.compress(original);
        System.out.println("compress=" + compress);
        System.out.println(compress.length());
        String uncompress = CompressUtil.uncompress(compress);
        System.out.println(uncompress.length());
        // The round trip must return exactly the original string.
        assertEquals(original, uncompress);
    }
}

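Test 1: the 100,000-character input compresses to 180 characters, and decompressing it returns the original string. This input is a single repeated character, which is close to a best case for gzip, so real data will compress less.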
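
Because gzip adds a fixed header and trailer and Base64 expands the bytes by about a third, very short strings can come out longer than they went in. The sketch below measures this with JDK classes only; `CompressionGainDemo` and its methods are hypothetical names, and `java.util.Base64` is used in place of Commons Codec.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for checking whether compression actually shortens a value.
final class CompressionGainDemo {

    // Length of the Base64 text produced by gzip-compressing the UTF-8 bytes of str.
    static int compressedLength(String str) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(str.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(out.toByteArray()).length();
    }

    // Builds a string consisting of count copies of c.
    static String repeat(char c, int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String shortText = "abc";
        String longText = repeat('1', 100000);
        System.out.println(shortText.length() + " -> " + compressedLength(shortText));
        System.out.println(longText.length() + " -> " + compressedLength(longText));
    }
}
```

If the column also holds short values, it may be worth comparing lengths like this before deciding whether to store the compressed form.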