Calling the Baidu OCR API from Python 3 to Recognize Text from the Clipboard

Published: 2024/01/24


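This article shows how to call the Baidu OCR API from Python 3 to recognize text in an image taken from the clipboard. It includes two examples: one reads an image from the clipboard and recognizes it, and the other recognizes a batch of images and saves the results to text files.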

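Prerequisites

Before using the code in this article, complete the following:

Register for the Baidu OCR API and obtain the corresponding API Key and Secret Key
Install a Python 3.x environment and the required libraries (such as requests and Pillow)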
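Hard-coding the API Key and Secret Key in a script makes them easy to leak. The following is a minimal sketch of reading them from environment variables instead. The variable names BAIDU_OCR_API_KEY and BAIDU_OCR_SECRET_KEY and the helper load_credentials are assumptions made for this illustration; they are not defined by Baidu.

```python
import os


def load_credentials(environ=os.environ):
    """Return (api_key, secret_key) read from environment variables.

    BAIDU_OCR_API_KEY and BAIDU_OCR_SECRET_KEY are hypothetical names
    chosen for this sketch; pick whatever names suit your setup.
    """
    api_key = environ.get('BAIDU_OCR_API_KEY', '').strip()
    secret_key = environ.get('BAIDU_OCR_SECRET_KEY', '').strip()
    if not api_key or not secret_key:
        raise RuntimeError(
            'Set BAIDU_OCR_API_KEY and BAIDU_OCR_SECRET_KEY first')
    return api_key, secret_key
```

The environ parameter defaults to the real environment but accepts any mapping, which makes the helper easy to exercise without touching the shell.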

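Introduction to the OCR API

The Baidu OCR API is a text recognition service that recognizes the text in an image. It is used as follows:

Exchange the API Key and Secret Key for an access token
Send the image to be recognized to the Baidu server in a request
The Baidu server returns the recognition result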
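The step of sending the image can be sketched in isolation. This is a minimal sketch, assuming the endpoint expects the image as Base64 text in an application/x-www-form-urlencoded body under the field name image; build_form_body is a hypothetical helper name used only here.

```python
import base64
import urllib.parse


def build_form_body(image_bytes):
    """Return a form-encoded request body carrying the image as Base64.

    Raw image bytes are not valid form data, so they are Base64-encoded
    first and then percent-encoded by urlencode.
    """
    image_b64 = base64.b64encode(image_bytes).decode('ascii')
    return urllib.parse.urlencode({'image': image_b64})
```

When a dict is passed as the data argument of requests.post, the requests library performs the form encoding itself, so only the Base64 step has to be done by hand in that case.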

Calling the OCR API

The following shows how to call the OCR API from Python 3.

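import base64
from io import BytesIO

import requests
from PIL import Image

api_key = 'your API Key'
secret_key = 'your Secret Key'
token_endpoint = 'https://aip.baidubce.com/oauth/2.0/token'
api_endpoint = 'https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic'

def get_access_token():
    '''
    Exchange the API Key and Secret Key for an access token
    :return: access token (str)
    '''
    params = {
        'grant_type': 'client_credentials',
        'client_id': api_key,
        'client_secret': secret_key
    }
    response = requests.post(token_endpoint, params=params, timeout=10)
    response.raise_for_status()
    return response.json()['access_token']

def ocr(file_path):
    '''
    Recognize the text in an image with the OCR API
    :param file_path: path to the image file
    :return: recognized text (str), or '' if nothing was recognized
    '''
    with open(file_path, 'rb') as fp:
        img = Image.open(fp)
        # Convert the image to RGB so that it can be saved as JPEG
        img = img.convert('RGB')
        # Re-encode the image as JPEG bytes in memory
        output_buffer = BytesIO()
        img.save(output_buffer, format='JPEG')
        binary_data = output_buffer.getvalue()

    # Request headers
    headers = {
        'Content-Type': 'application/x-www-form-urlencoded'
    }

    # The access token goes in the URL query string.
    # When making many calls, fetch the token once and reuse it.
    params = {
        'access_token': get_access_token()
    }

    # The image is sent Base64-encoded in the form body
    data = {
        'image': base64.b64encode(binary_data).decode('ascii')
    }

    response = requests.post(api_endpoint, headers=headers, params=params, data=data, timeout=10)
    if response.status_code == 200:
        result = response.json()
        if 'words_result' in result and len(result['words_result']) > 0:
            return '\n'.join([r['words'] for r in result['words_result']])
    return ''

The code above first exchanges the API Key and Secret Key for an access token. It then re-encodes the image file as JPEG bytes, Base64-encodes them, and sends them to the OCR endpoint in a POST request. The endpoint returns a JSON string containing the recognition result, and the function joins the recognized lines with newlines.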
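The handling of the returned JSON can be separated from the network call, which makes it easier to check. The following is a minimal sketch, assuming that a failed request is reported through error_code and error_msg fields in the response body; OcrError and extract_text are hypothetical names, and the sample payload is hand-written rather than captured from the service.

```python
import json


class OcrError(Exception):
    """Raised when the response body carries an error instead of text."""


def extract_text(payload):
    """Join the recognized lines from a decoded JSON response.

    Assumes errors arrive as 'error_code' / 'error_msg' fields in the
    body; treat those field names as an assumption of this sketch.
    """
    if 'error_code' in payload:
        raise OcrError('{}: {}'.format(
            payload['error_code'], payload.get('error_msg', '')))
    return '\n'.join(item['words'] for item in payload.get('words_result', []))


# Hand-written sample payload, not real service output
sample = json.loads('{"words_result": [{"words": "hello"}, {"words": "world"}]}')
print(extract_text(sample))
```

Raising an exception for an error payload keeps a failed call distinguishable from an image that simply contains no text, which both come back as an empty string otherwise.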

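Example 1: Read an image from the clipboard and recognize it

The following example shows how to read an image from the clipboard and run text recognition on it. The pyperclip library must be installed first.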

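import os

import pyperclip
from PIL import Image, ImageGrab

file_name = 'temp.jpg'
# grabclipboard() returns an image, a list of file names, or None
clipboard_image = ImageGrab.grabclipboard()

if isinstance(clipboard_image, Image.Image):
    # JPEG has no alpha channel, so convert to RGB before saving
    clipboard_image.convert('RGB').save(file_name, 'JPEG')
    try:
        result = ocr(file_name)
    finally:
        # Remove the temporary file even if recognition fails
        os.remove(file_name)
    if result:
        pyperclip.copy(result)
        print('Recognition result copied to the clipboard:\n{}'.format(result))
else:
    print('No image found on the clipboard')

The code above first uses ImageGrab.grabclipboard() to fetch the clipboard contents and checks that they are an image, then saves the image to a temporary file and calls the ocr function on it. Finally, it copies the recognition result back to the clipboard and prints it.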
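When image files are copied in a file manager rather than image data, ImageGrab.grabclipboard() returns a list of file names instead of an image. The following is a minimal sketch of picking the image files out of such a list. The helper name image_paths_from_clipboard and the extension list are assumptions made for this illustration.

```python
import os

# Extensions treated as images in this sketch (an assumption, adjust as needed)
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.bmp')


def image_paths_from_clipboard(clipboard_content):
    """Return the image file paths found in the clipboard content.

    Anything that is not a list (an image object or None) yields [].
    """
    if not isinstance(clipboard_content, list):
        return []
    return [path for path in clipboard_content
            if isinstance(path, str)
            and os.path.splitext(path)[1].lower() in IMAGE_EXTENSIONS]
```

Each returned path can then be passed to the ocr function directly, since that function already takes a file path.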

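Example 2: Recognize a batch of images and save the results to text files

The following example shows how to recognize a batch of images and save the recognition results to text files.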

import os

input_dir = 'input'
output_dir = 'output'

# Create the output directory if it does not exist yet
os.makedirs(output_dir, exist_ok=True)

for file_name in os.listdir(input_dir):
    # Match the extension case-insensitively so that .JPG and .PNG are included
    if file_name.lower().endswith(('.jpg', '.png')):
        input_file_path = os.path.join(input_dir, file_name)
        output_file_path = os.path.join(output_dir, file_name + '.txt')
        result = ocr(input_file_path)
        with open(output_file_path, 'w', encoding='utf-8') as fp:
            fp.write(result)
        print('{} processed'.format(file_name))

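The code above lists the files in the input directory, runs recognition on each .jpg or .png file in turn, and saves each result to a text file in the output directory.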
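Before sending a whole directory to the API, it can help to see which files would be processed and where each result would be written. The following is a minimal sketch using pathlib; plan_batch is a hypothetical helper name, and it makes no network calls.

```python
from pathlib import Path


def plan_batch(input_dir, output_dir, extensions=('.jpg', '.png')):
    """Return sorted (image_path, text_path) pairs for a batch run.

    Extensions are compared case-insensitively, and each output file is
    named after the input file with '.txt' appended.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    pairs = []
    for path in sorted(input_dir.iterdir()):
        if path.is_file() and path.suffix.lower() in extensions:
            pairs.append((path, output_dir / (path.name + '.txt')))
    return pairs
```

Sorting the entries gives a stable processing order, which makes an interrupted batch easier to resume.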

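Summary

This article showed how to call the Baidu OCR API from Python 3 for text recognition, with two examples: recognizing an image taken from the clipboard, and recognizing a batch of images and saving the results to text files.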
