Python for Beginners: Implementing OCR

Published: 2023/12/16

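Introduction

OCR (Optical Character Recognition) is a technology that converts the text in images or scanned documents into editable, searchable text. Python has a number of libraries and tools for OCR. This article walks through a few simple steps for implementing OCR in Python.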

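Steps

Step 1: Install Tesseract OCR

First, an OCR engine must be installed on the computer. Here we use Tesseract OCR, an open-source OCR engine. On Windows, the simplest way to install it is to run the Windows installer available through the Tesseract OCR website.

On Debian- or Ubuntu-based Linux systems, Tesseract OCR can be installed with the package manager: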

sudo apt-get install tesseract-ocr

On macOS, Tesseract OCR can be installed with Homebrew:

brew install tesseract
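
After installing, it can help to confirm that the tesseract executable is reachable, since PyOCR's Tesseract wrapper generally expects to find it on the system PATH. The following is a minimal sketch that uses only the standard library; the helper name find_tesseract is made up for illustration.

```python
import shutil


def find_tesseract(command="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(command)


path = find_tesseract()
if path is None:
    print("tesseract was not found on PATH; check the installation")
else:
    print(f"tesseract found at: {path}")
```

If the executable is not found on Windows, the installation directory most likely still needs to be added to the PATH environment variable.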

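Step 2: Install the PyOCR Library

PyOCR is a Python library that acts as a wrapper around OCR engines such as Tesseract OCR. It can be installed from the command line with pip: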

pip install pyocr
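
The code in this article also imports PIL, which is provided by the Pillow package. A quick way to check that both modules are importable is sketched below, using only the standard library; missing_modules is a hypothetical helper name, not part of PyOCR.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["pyocr", "PIL"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("pyocr and PIL are both importable")
```

If PIL is reported as missing, running pip install Pillow should provide it.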

Step 3: Write the Python Code

With PyOCR, a short Python script is enough to extract the text from an image. The code is as follows:

import sys

import pyocr
import pyocr.builders
from PIL import Image

# Select an OCR engine; exit if none is installed
tools = pyocr.get_available_tools()
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)
tool = tools[0]

# Open the image
img = Image.open('test.png')

# Run OCR on the image
txt = tool.image_to_string(
    img,
    builder=pyocr.builders.TextBuilder()
)

# Print the recognized text
print(txt)

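In the code above, the image path is set to 'test.png', and calling image_to_string() converts the text in the image into editable, searchable text. PyOCR does not recognize text itself; it delegates the work to the OCR engine installed on the operating system, here Tesseract OCR.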
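
image_to_string() also accepts a lang argument, and PyOCR tools provide get_available_languages() to list the language data installed for the engine. The sketch below chooses a language before running recognition. Note the assumptions: pick_language and ocr_with_language are hypothetical helpers, and 'chi_sim' (Simplified Chinese) is only an example preference that works when the corresponding Tesseract language data is installed.

```python
def pick_language(available, preferred, fallback="eng"):
    """Choose `preferred` if installed, else `fallback`, else the first available language."""
    if preferred in available:
        return preferred
    if fallback in available:
        return fallback
    if available:
        return available[0]
    raise ValueError("No OCR languages are installed")


def ocr_with_language(path, preferred="chi_sim"):
    """Recognize text in the image at `path`, preferring the given language."""
    # Imported here so pick_language() can be used without PyOCR installed
    import pyocr
    import pyocr.builders
    from PIL import Image

    tools = pyocr.get_available_tools()
    if not tools:
        raise RuntimeError("No OCR tool found")
    tool = tools[0]

    lang = pick_language(tool.get_available_languages(), preferred)
    with Image.open(path) as img:
        return tool.image_to_string(
            img,
            lang=lang,
            builder=pyocr.builders.TextBuilder(),
        )
```

Calling print(ocr_with_language('test.png')) would then use Simplified Chinese when it is installed and fall back to English otherwise.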

Example 1: Using a Sample Image

The following example runs OCR on a sample image:

import sys

import pyocr
import pyocr.builders
from PIL import Image

# Select an OCR engine; exit if none is installed
tools = pyocr.get_available_tools()
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)
tool = tools[0]

# Open the image
img = Image.open('example.png')

# Run OCR on the image
txt = tool.image_to_string(
    img,
    builder=pyocr.builders.TextBuilder()
)

# Print the recognized text
print(txt)

In the code above, the image path is set to 'example.png', image_to_string() extracts the text from the image, and the recognized text is printed.
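
Because this example differs from the Step 3 script only in the file name, the recognition logic can be moved into a function and reused for several images. This is one possible arrangement, not the only one; ocr_paths and pyocr_recognize are hypothetical names introduced here for illustration.

```python
import sys
from pathlib import Path


def ocr_paths(paths, recognize):
    """Apply `recognize` to every existing file in `paths`; return {path: text}."""
    results = {}
    for p in paths:
        if not Path(p).is_file():
            print(f"Skipping missing file: {p}", file=sys.stderr)
            continue
        results[str(p)] = recognize(p)
    return results


def pyocr_recognize(path):
    """Recognize the text in one image using the first available PyOCR tool."""
    # Imported here so ocr_paths() can be tried with any recognizer function
    import pyocr
    import pyocr.builders
    from PIL import Image

    tools = pyocr.get_available_tools()
    if not tools:
        raise RuntimeError("No OCR tool found")
    with Image.open(path) as img:
        return tools[0].image_to_string(
            img,
            builder=pyocr.builders.TextBuilder(),
        )
```

For instance, ocr_paths(['test.png', 'example.png'], pyocr_recognize) would return a dictionary that maps each existing file to its recognized text and skip any file that is missing.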

Example 2: Using an Online Image

The following example runs OCR on an image fetched from the web:

import sys
from io import BytesIO

import pyocr
import pyocr.builders
import requests
from PIL import Image

# Select an OCR engine; exit if none is installed
tools = pyocr.get_available_tools()
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)
tool = tools[0]

# Download the online image and open it from memory
response = requests.get('http://www.example.com/image.jpg')
img = Image.open(BytesIO(response.content))

# Run OCR on the image
txt = tool.image_to_string(
    img,
    builder=pyocr.builders.TextBuilder()
)

# Print the recognized text
print(txt)

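In the code above, requests.get() downloads the online image, and BytesIO wraps the downloaded bytes in a file-like object that Image.open() can read. image_to_string() then extracts the text from the image, and the recognized text is printed.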
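
A download can fail or return something other than an image, in which case Image.open() would raise a less helpful error. The sketch below adds a timeout, a status check, and a Content-Type check before the bytes are handed to Pillow. The helper fetch_image_bytes and its http_get parameter are assumptions made for illustration, and some servers may not send an accurate Content-Type header.

```python
from io import BytesIO


def fetch_image_bytes(url, http_get=None, timeout=10):
    """Download `url` and return its body as a BytesIO, or raise on failure."""
    if http_get is None:
        # Imported lazily so the function can be exercised with a stand-in getter
        import requests
        http_get = requests.get

    response = http_get(url, timeout=timeout)
    response.raise_for_status()  # raises for 4xx/5xx responses

    content_type = response.headers.get("Content-Type", "")
    if not content_type.startswith("image/"):
        raise ValueError(f"Expected an image, got Content-Type: {content_type!r}")
    return BytesIO(response.content)
```

The result can be passed straight to Pillow, for example Image.open(fetch_image_bytes('http://www.example.com/image.jpg')).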

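Conclusion

With the steps and code examples above, OCR can be implemented in a few lines of Python, converting the text in an image into editable, searchable text.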