Installing the latest Tesseract-OCR on Linux

nanyue 2024-08-13 07:55:10

First, open the official Git organization: https://github.com/tesseract-ocr

Then pick a working directory on the server and clone the source:
git clone https://github.com/tesseract-ocr/tesseract.git

  • On a fresh machine, install the build tools and image library packages first
yum install autoconf automake libtool pkgconfig.x86_64 libpng12-devel.x86_64 libjpeg-devel libtiff-devel.x86_64 zlib-devel.x86_64
  • Install Leptonica (Tesseract 4.0 requires version 1.74 or later; 1.80.0 is used here)
wget http://www.leptonica.org/source/leptonica-1.80.0.tar.gz
tar -xzvf leptonica-1.80.0.tar.gz
cd leptonica-1.80.0
./configure --prefix=/usr/local/
make && make install
Configure the environment:
vim /etc/bashrc
Add:

PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig
export PKG_CONFIG_PATH
CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/usr/local/include/
export CPLUS_INCLUDE_PATH
C_INCLUDE_PATH=$C_INCLUDE_PATH:/usr/local/include/leptonica
export C_INCLUDE_PATH
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LD_LIBRARY_PATH
LIBRARY_PATH=$LIBRARY_PATH:/usr/local/lib
export LIBRARY_PATH
TESSDATA_PREFIX=/root/tesseract/
export TESSDATA_PREFIX

Finally, reload the file:

source /etc/bashrc
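
One side effect of the `VAR=$VAR:/path` pattern used above is that an empty variable ends up with a leading colon, and sourcing the file repeatedly keeps appending the same directory. The following is a minimal sketch of a helper that avoids both; `path_append` is a hypothetical name introduced here for illustration, not something the original setup or any of the tools provide.

```shell
# Append directory $2 to the colon-separated variable named $1.
# Skips the append if the directory is already listed, and avoids a
# leading ":" when the variable is empty or unset.
# (path_append is a hypothetical helper, not part of any tool.)
path_append() {
    eval "_cur=\${$1:-}"                      # read the current value by name
    case ":$_cur:" in
        *":$2:"*) ;;                          # already present, do nothing
        *) eval "$1=\"\${_cur:+\$_cur:}\$2\"" ;;
    esac
    export "$1"
}

# Same directories as the block above.
path_append PKG_CONFIG_PATH /usr/local/lib/pkgconfig
path_append LD_LIBRARY_PATH /usr/local/lib
path_append LIBRARY_PATH /usr/local/lib
```

The helper can be pasted into /etc/bashrc in place of the individual assignments if you prefer idempotent behaviour; the plain assignments remain perfectly usable.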

Go back to the tesseract directory and start the build:

./autogen.sh
./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib

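Note: at this point configure may fail with: configure: error: Leptonica 1.74 or higher is required. Try to install libleptonica-dev package.

Leptonica is clearly installed already, so why is it still reported as missing?

configure cannot locate it, so the following environment variables need to be set: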

vim /etc/profile

Append at the end of the file:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LIBLEPT_HEADERSDIR=/usr/local/include
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig

Finally, reload the file:

source /etc/profile
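
Before rerunning configure, it can help to confirm that pkg-config actually sees the Leptonica you installed. The following is a rough sketch, assuming Leptonica registered itself under the pkg-config module name `lept` (it installs a `lept.pc` file) and that GNU `sort -V` is available; `version_ge` is a hypothetical helper written for this check.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: the pkg-config module name is "lept".
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists lept; then
    found="$(pkg-config --modversion lept)"
    if version_ge "$found" 1.74; then
        echo "pkg-config sees Leptonica $found (new enough)"
    else
        echo "pkg-config sees Leptonica $found (older than 1.74)"
    fi
else
    echo "pkg-config cannot find lept.pc; check PKG_CONFIG_PATH"
fi
```

If the last message is printed, the usual cause is that PKG_CONFIG_PATH does not include /usr/local/lib/pkgconfig in the shell that runs configure.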

Then run the two steps again:

./autogen.sh
./configure

This time there is no Leptonica error. Next, build and install:

make && sudo make install

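In the end, a standalone test from code worked fine, but inside the real project it failed mysteriously: when hit at a breakpoint the service crashed outright, the failure could not be caught as an exception, and no error message could be found.

After a lot of searching on Baidu and Google, the likely cause turned out to be that the native Linux libraries Tess4J depends on were not on the system library path.

Run: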

cp /usr/local/lib/*.so.* /usr/lib64/
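
Copying the files works, but a common alternative is to register /usr/local/lib with the dynamic linker so that the libraries stay in one place. The sketch below is not what the original setup did: the `/etc/ld.so.conf.d/usr-local.conf` file name is arbitrary, the registration commands need root and are therefore left as comments, and `lib_in_cache` is a hypothetical helper that only inspects `ldconfig -p` output.

```shell
# Read `ldconfig -p` style output on stdin and succeed if library $1 is listed.
lib_in_cache() {
    grep "^[[:space:]]*$1\.so" >/dev/null
}

# As root, instead of copying the .so files (file name is an arbitrary choice):
#   echo /usr/local/lib > /etc/ld.so.conf.d/usr-local.conf
#   ldconfig

# Report whether the linker cache lists the two libraries built above.
if command -v ldconfig >/dev/null 2>&1; then
    for lib in libtesseract liblept; do
        if ldconfig -p | lib_in_cache "$lib"; then
            echo "$lib: found in linker cache"
        else
            echo "$lib: not in linker cache"
        fi
    done
fi
```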

If you see errors like:

Error in findTiffCompression: function not present
Error in pixReadStreamTiff: function not present
Error in pixReadStream: tiff: no pix returned
Error in pixRead: pix not read
Unsupported image type.

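or similar, then Leptonica was built without its image dependency libraries (PNG and JPEG, plus TIFF for the messages above). Install their development packages: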

yum install -y libtiff-devel libjpeg-devel libpng-devel

Go into the Leptonica source directory and rebuild it:

./configure
make
make install
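
To confirm that the rebuilt Leptonica really picked up the image libraries, you can inspect what the shared library links against. This is a minimal sketch under two assumptions: that Leptonica 1.80.0 was installed under /usr/local and its library is named `liblept.so`, and that a missing format shows up as a missing entry in `ldd` output. `missing_image_libs` is a hypothetical helper.

```shell
# Read `ldd` output on stdin and print each image library that is NOT linked.
missing_image_libs() {
    deps="$(cat)"
    for lib in libtiff libpng libjpeg; do
        case "$deps" in
            *"$lib"*) ;;            # linked, nothing to report
            *) echo "$lib" ;;       # not linked
        esac
    done
}

# Assumed install path; adjust if you used a different --prefix.
lept=/usr/local/lib/liblept.so
if [ -e "$lept" ]; then
    ldd "$lept" | missing_image_libs
fi
```

No output means all three libraries are linked. Any name that is printed points at a development package that was still missing when configure ran.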

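At this point essentially all of the problems are solved and the project runs normally.

  • Tesseract-OCR Chinese recognition and training a font library: https://www.jianshu.com/p/3326c7216696
  • Training a font library with Tesseract 5.0 to improve OCR accuracy: https://www.cnblogs.com/pyweb/p/11457519.html
  • Tesseract-OCR v5.0 Chinese recognition, training a custom font library: http://www.likecs.com/show-90988.html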

Tesseract build error: fatal error: allheaders.h: No such file or directory

Error description:

globaloc.cpp:25:33: fatal error: allheaders.h: No such file or directory  

#include          "allheaders.h"

Solution:

1. Confirm that Leptonica is installed.

2. Configure the environment variables.

vi /etc/profile

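Add the following, adjusting the path to wherever Leptonica's headers were actually installed (with --prefix=/usr/local/ that is /usr/local/include/leptonica):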
CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/usr/local/leptonica/include/leptonica
export CPLUS_INCLUDE_PATH
C_INCLUDE_PATH=$C_INCLUDE_PATH:/usr/local/leptonica/include/leptonica
export C_INCLUDE_PATH
 
source /etc/profile
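
If you are unsure where the header ended up, it can be located directly instead of guessing the path. The sketch below assumes Leptonica was installed somewhere under /usr/local; `find_lept_include_dir` is a hypothetical helper, and the script only prints the export lines rather than editing /etc/profile.

```shell
# Print the directory containing allheaders.h under root $1 (first match only).
# Returns non-zero if the header is not found.
find_lept_include_dir() {
    header="$(find "$1" -name allheaders.h -type f 2>/dev/null | head -n 1)"
    [ -n "$header" ] && dirname "$header"
}

# Assumed search root; change it if Leptonica was installed elsewhere.
if dir="$(find_lept_include_dir /usr/local)"; then
    echo "export CPLUS_INCLUDE_PATH=\$CPLUS_INCLUDE_PATH:$dir"
    echo "export C_INCLUDE_PATH=\$C_INCLUDE_PATH:$dir"
else
    echo "allheaders.h not found under /usr/local; is Leptonica installed?"
fi
```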

If the problem persists, reinstall by following this article:

http://www.cnblogs.com/rouge/p/7275391.html
