Android Image Text Recognition (Alibaba Cloud OCR API)

2026-04-03

I recently used Alibaba Cloud's OCR text recognition API in an Android app.


I used the general-purpose text recognition service. The implementation steps are as follows:

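1. Purchase Alibaba Cloud's general-purpose text recognition service

At the moment it costs 0 yuan and includes 500 calls. After the purchase succeeds, go to Console -> Cloud Marketplace to find the purchased API, and copy its APPCODE.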

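2. Submit the request according to the official API documentation

I use Retrofit to send the network request, with the interface defined as follows: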

interface AliService {
    // POST the JSON body to the general OCR endpoint; the APPCODE goes in the Authorization header.
    @POST("/api/predict/ocr_general")
    Call<HttpResult> getText(@Body RequestBody body,
                             @Header("Authorization") String authorization);
}

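Based on the sample JSON response in the official documentation, define a custom HttpResult class to receive the data. Remember to add getters, setters, and constructors: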

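public class HttpResult {
    private String request_id;
    private List<Bean> ret;
    private boolean success;

    public List<Bean> getRet() { return ret; }
    // Remaining getters, setters, and constructors omitted.
}

class Bean {
    private Rect rect;
    private String word;

    public String getWord() { return word; }
    // Remaining getters, setters, and constructors omitted.

    // Declared static so it can be instantiated without an enclosing Bean.
    static class Rect {
        private float angle;
        private float height;
        private float left;
        private float top;
        private float width;
    }
}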
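To make the shape of the parsed result more concrete, here is a minimal sketch in plain Java of turning the `ret` list into display text. The `Line` class and the `joinWords` helper are hypothetical stand-ins for `Bean` and for the loop used later in the callback; they are not part of the original project.

```java
import java.util.Arrays;
import java.util.List;

class OcrResultSketch {
    // Hypothetical stand-in for Bean: only the recognized text of one line.
    static class Line {
        final String word;

        Line(String word) {
            this.word = word;
        }
    }

    // Joins every recognized line with a trailing newline, skipping null entries.
    static String joinWords(List<Line> ret) {
        if (ret == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (Line line : ret) {
            if (line != null && line.word != null) {
                sb.append(line.word).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Line> ret = Arrays.asList(new Line("first line"), new Line("second line"));
        System.out.print(joinWords(ret));
    }
}
```

Using a StringBuilder avoids creating a new String on every iteration, which matters little for a few lines but is the more idiomatic form.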

Since the image is a Bitmap, it has to be Base64-encoded before it can be sent in the request.

public static String bitmapToBase64(Bitmap bitmap) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    // JPEG quality 40; a value of 100 would keep the highest quality.
    bitmap.compress(Bitmap.CompressFormat.JPEG, 40, bos);
    byte[] bytes = bos.toByteArray();
    // Variant without a prefix. NO_WRAP is required so the output contains no line breaks.
    return Base64.encodeToString(bytes, Base64.NO_WRAP);
    // Variant with a data-URI prefix, if one is needed (NO_WRAP is still required):
    // return "data:image/jpeg;base64," + Base64.encodeToString(bytes, Base64.NO_WRAP);
}
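The effect of the NO_WRAP flag can be illustrated with a small sketch that runs on a plain JVM. It uses `java.util.Base64` as a stand-in for `android.util.Base64`, which is an assumption made only so the example runs outside Android: the basic encoder emits no line breaks, while the MIME encoder breaks the output every 76 characters, similar in spirit to Android's default wrapping behavior.

```java
import java.util.Base64;

class Base64WrapDemo {
    // Encodes without any line breaks, comparable to android.util.Base64.NO_WRAP.
    static String encodeNoWrap(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Encodes MIME style: a CRLF is inserted after every 76 output characters.
    static String encodeWrapped(byte[] data) {
        return Base64.getMimeEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] fakeImage = new byte[200]; // stand-in for compressed JPEG bytes
        System.out.println("no-wrap has line break: " + encodeNoWrap(fakeImage).contains("\n"));
        System.out.println("wrapped has line break: " + encodeWrapped(fakeImage).contains("\n"));
    }
}
```

A line break inside the encoded string would end up as a raw control character inside the JSON string value of the request, which is not valid JSON, so the unwrapped form is the one to send.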

Build the request body from the request parameters listed in the official documentation:

Retrofit retrofit = new Retrofit.Builder()
        .baseUrl("https://tysbgpu.market.alicloudapi.com")
        .addConverterFactory(GsonConverterFactory.create())
        .build();
AliService aliService = retrofit.create(AliService.class);

// JSON body: the Base64 image plus the recognition options.
String body = "{\"image\":\"" + bitmapToBase64(bitmap) + "\","
        + "\"configure\":{\"min_size\":16,\"output_prob\":false,\"output_keypoints\":false,"
        + "\"skip_detection\":false,\"without_predicting_direction\":false}}";
RequestBody requestBody = RequestBody.create(
        okhttp3.MediaType.parse("application/json;charset=UTF-8"), body);

Call<HttpResult> call = aliService.getText(requestBody, "APPCODE " + APPCODE);
call.enqueue(new Callback<HttpResult>() {
    @Override
    public void onResponse(Call<HttpResult> call, Response<HttpResult> response) {
        // body() is null for non-2xx responses, so check it before parsing.
        HttpResult result = response.body();
        if (result == null || result.getRet() == null) {
            return;
        }
        // Join the recognized lines and update the UI.
        StringBuilder sb = new StringBuilder();
        for (Bean bean : result.getRet()) {
            sb.append(bean.getWord()).append("\n");
        }
        final String text = sb.toString();
        activity.runOnUiThread(new Runnable() {
            @Override
            public void run() {
                textView.setText(text);
            }
        });
    }

    @Override
    public void onFailure(Call<HttpResult> call, Throwable t) {
        Log.e(TAG, "onFailure: " + t.getMessage());
    }
});
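The body and header construction can be pulled out into small helpers that are easy to check in isolation. The sketch below is plain Java with hypothetical names (`buildBody`, `authorizationHeader`); the field names and values are copied from the request above. Concatenation is safe here only because the standard Base64 alphabet (A-Z, a-z, 0-9, +, /, =) contains no characters that need JSON escaping.

```java
class OcrRequestSketch {
    // Builds the JSON request body around an already Base64-encoded image.
    static String buildBody(String base64Image) {
        return "{\"image\":\"" + base64Image + "\","
                + "\"configure\":{\"min_size\":16,\"output_prob\":false,"
                + "\"output_keypoints\":false,\"skip_detection\":false,"
                + "\"without_predicting_direction\":false}}";
    }

    // Builds the Authorization header value: the literal "APPCODE", a space, then the code.
    static String authorizationHeader(String appCode) {
        return "APPCODE " + appCode.trim();
    }

    public static void main(String[] args) {
        // "QUJD" is the Base64 form of the bytes "ABC"; the app code is a placeholder.
        System.out.println(buildBody("QUJD"));
        System.out.println(authorizationHeader("your-app-code"));
    }
}
```

Trimming the code guards against whitespace picked up when copying it from the console.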

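That is the core code for calling the Alibaba Cloud OCR API. If you are not sure how to launch the camera, take a photo, and get the image back, read on.

3. Taking a photo with the camera on Android and getting the image back

① Declare the permissions in the manifest file AndroidManifest.xml.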

<uses-permission android:name="android.permission.INTERNET"/>
<uses-permission android:name="android.permission.CAMERA"/>
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"/>
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>

Declare the FileProvider inside the application element:

<!-- authorities: your own package name followed by ".provider" -->
<provider
    android:authorities="com.briana.aliocr.provider"
    android:name="androidx.core.content.FileProvider"
    android:exported="false"
    android:grantUriPermissions="true">
    <meta-data
        android:name="android.support.FILE_PROVIDER_PATHS"
        android:resource="@xml/file_paths" />
</provider>

Create a new XML file named file_paths.xml in res/xml, matching the @xml/file_paths reference in the provider declaration.

<?xml version="1.0" encoding="utf-8"?>
<resources>
    <paths>
        <!-- The path is resolved as Environment.getExternalStorageDirectory() + "/path/"; a path of "." covers the whole external storage. -->
        <external-path name="camera_photos" path="." />
    </paths>
</resources>

② Modify MainActivity as follows:

Before launching the camera, check whether the app holds the required permissions, and request them if it does not.

private static final int PERMISSIONS_REQUEST_CODE = 1;

// Returns true if all permissions are already granted; otherwise requests them and returns false.
private boolean hasPermission() {
    if (ContextCompat.checkSelfPermission(this, Manifest.permission.WRITE_EXTERNAL_STORAGE) != PackageManager.PERMISSION_GRANTED
            || ContextCompat.checkSelfPermission(this, Manifest.permission.READ_EXTERNAL_STORAGE) != PackageManager.PERMISSION_GRANTED
            || ContextCompat.checkSelfPermission(this, Manifest.permission.CAMERA) != PackageManager.PERMISSION_GRANTED) {
        ActivityCompat.requestPermissions(this,
                new String[]{Manifest.permission.WRITE_EXTERNAL_STORAGE,
                        Manifest.permission.READ_EXTERNAL_STORAGE,
                        Manifest.permission.CAMERA},
                PERMISSIONS_REQUEST_CODE);
        return false;
    } else {
        return true;
    }
}

Override onRequestPermissionsResult to check whether the user granted the requested permissions. If all of them were granted, call takePhoto() to take the picture.

@Override
public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions,
                                       @NonNull int[] grantResults) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults);
    if (requestCode == PERMISSIONS_REQUEST_CODE) {
        if (grantResults.length > 0) {
            // Stop as soon as any permission was denied.
            for (int grantResult : grantResults) {
                if (grantResult == PackageManager.PERMISSION_DENIED) {
                    return;
                }
            }
            takePhoto();
        }
    }
}
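The grant check above can be expressed as a small pure function, which makes the rule easier to read and test. This is a sketch in plain Java: `allGranted` is a hypothetical helper, and the two constants mirror the values of PackageManager.PERMISSION_GRANTED (0) and PackageManager.PERMISSION_DENIED (-1) so the example runs without the Android SDK.

```java
class PermissionResultSketch {
    // Same values as PackageManager.PERMISSION_GRANTED and PackageManager.PERMISSION_DENIED.
    static final int PERMISSION_GRANTED = 0;
    static final int PERMISSION_DENIED = -1;

    // True only if at least one result is present and every result is GRANTED.
    // An empty array is treated as a cancelled request.
    static boolean allGranted(int[] grantResults) {
        if (grantResults == null || grantResults.length == 0) {
            return false;
        }
        for (int result : grantResults) {
            if (result != PERMISSION_GRANTED) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] mixed = {PERMISSION_GRANTED, PERMISSION_DENIED, PERMISSION_GRANTED};
        System.out.println("mixed results allow the photo: " + allGranted(mixed));
    }
}
```

In the activity, the body of the `requestCode` branch could then reduce to calling takePhoto() when such a helper returns true.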

Launch the camera to take a photo, and keep track of the image path:

private static final int CAMERA_REQUEST_CODE = 2;
File mFile;
Uri imageUri;

private void takePhoto() {
    if (!hasPermission()) {
        return;
    }
    // Create <external storage>/img/<timestamp>.jpg as the output file.
    File path = new File(Environment.getExternalStorageDirectory(), "img");
    mFile = new File(path, System.currentTimeMillis() + ".jpg");
    try {
        if (!path.exists()) path.mkdir();
        if (!mFile.exists()) mFile.createNewFile();
    } catch (IOException e) {
        e.printStackTrace();
    }
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
        // Share the file through FileProvider as a content:// URI.
        String authority = getPackageName() + ".provider";
        imageUri = FileProvider.getUriForFile(this, authority, mFile);
    } else {
        imageUri = Uri.fromFile(mFile);
    }
    Intent intent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
    intent.putExtra(MediaStore.EXTRA_OUTPUT, imageUri);
    startActivityForResult(intent, CAMERA_REQUEST_CODE);
}
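The file-creation step ignores the return values of mkdir() and createNewFile(), so a failure would only surface later when the camera cannot write the image. As a sketch of a stricter variant, the hypothetical helper below (`createPhotoFile`, plain `java.io.File`, no Android classes) reports the failure immediately; the base directory is passed in as a parameter, which is an assumption made so the example can run against a temporary directory.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

class PhotoFileSketch {
    // Creates <baseDir>/img/<timestampMillis>.jpg, failing loudly if either step fails.
    static File createPhotoFile(File baseDir, long timestampMillis) throws IOException {
        File dir = new File(baseDir, "img");
        // mkdirs() also creates missing parent directories, unlike mkdir().
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IOException("Could not create directory: " + dir);
        }
        File photo = new File(dir, timestampMillis + ".jpg");
        if (!photo.exists() && !photo.createNewFile()) {
            throw new IOException("Could not create file: " + photo);
        }
        return photo;
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for the external storage root.
        File base = Files.createTempDirectory("ocr-demo").toFile();
        File photo = createPhotoFile(base, System.currentTimeMillis());
        System.out.println("created: " + photo.getAbsolutePath());
    }
}
```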

Override onActivityResult to load the image from the saved path, show it in the imageView, and then call the Alibaba Cloud API to recognize the text in it.

@Override
protected void onActivityResult(int requestCode, int resultCode, @Nullable Intent data) {
    super.onActivityResult(requestCode, resultCode, data);
    // Only continue if the user actually took a photo (not cancelled).
    if (requestCode == CAMERA_REQUEST_CODE && resultCode == RESULT_OK) {
        Bitmap photo = BitmapFactory.decodeFile(mFile.getAbsolutePath());
        if (photo == null) {
            return; // the file could not be decoded
        }
        imageView.setImageBitmap(photo);
        AliOcr aliOcr = new AliOcr();
        aliOcr.getText(this, photo);
    }
}

Add a click listener to the button so that tapping it takes a photo:

button.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View view) {
        takePhoto();
    }
});

Finally, the GitHub repository and download link are attached.

If this was helpful, please give it a like.
