Tesseract.js wraps a [webassembly port](https://github.com/naptha/tesseract.js-core) of the [Tesseract](https://github.com/tesseract-ocr/tesseract) OCR Engine.
-It works in the browser using [webpack](https://webpack.js.org/), esm, or plain script tags with a [CDN](#CDN) and on the server with [Node.js](https://nodejs.org/en/).
+Tesseract.js works in the browser using [webpack](https://webpack.js.org/), esm, or plain script tags with a [CDN](#CDN) and on the server with [Node.js](https://nodejs.org/en/).
After you [install it](#installation), using it is as simple as:
```javascript
@@ -72,6 +70,11 @@ npm install tesseract.js@3.0.3
yarn add tesseract.js@3.0.3
```
+## Project Scope
+Tesseract.js aims to bring the [Tesseract](https://github.com/tesseract-ocr/tesseract) OCR engine (a separate project) to the browser and Node.js, and works by wrapping a [WebAssembly port](https://github.com/naptha/tesseract.js-core) of Tesseract. This project does not modify core Tesseract features. Most notably, **Tesseract.js does not support PDF files and does not modify the Tesseract recognition model to improve accuracy.**
+
+If your project requires features outside of this scope, consider the [Scribe.js library](https://github.com/scribeocr/scribe.js). Scribe.js is an alternative library created to accommodate common feature requests that are outside of the scope of this repo. Scribe.js includes improvements to the Tesseract recognition model and supports extracting text from PDF documents, among other features. For more information see [Scribe.js vs. Tesseract.js](https://github.com/scribeocr/scribe.js/blob/master/docs/scribe_vs_tesseract.md).
+
## Documentation
* [Workers vs. Schedulers](./docs/workers_vs_schedulers.md)
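The Project Scope text added in this hunk states that Tesseract.js does not support PDF files. As a minimal sketch of how a caller might act on that, the snippet below checks the first bytes of an input for the PDF signature `%PDF-` before handing it to OCR. The helper name `looksLikePdf` is hypothetical and is not part of the Tesseract.js API, and the check assumes the signature sits at offset 0 of the file.

```javascript
// Hypothetical helper, NOT part of the Tesseract.js API.
// PDF files start with the ASCII signature "%PDF-" (assumed here to be at offset 0).
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d];

function looksLikePdf(bytes) {
  // Too short to contain the signature: cannot be identified as a PDF.
  if (!bytes || bytes.length < PDF_SIGNATURE.length) return false;
  // Compare the leading bytes against the signature one by one.
  return PDF_SIGNATURE.every((value, index) => bytes[index] === value);
}

// Illustrative inputs: the first bytes of a PDF ("%PDF-1.7") and of a PNG.
const pdfHeader = Uint8Array.from([0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x37]);
const pngHeader = Uint8Array.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

console.log(looksLikePdf(pdfHeader)); // true
console.log(looksLikePdf(pngHeader)); // false
```

A guard like this could run before recognition so that PDF inputs are rejected early or routed to another tool, such as the Scribe.js library that the added text recommends.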
@@ -152,19 +155,20 @@ npm start
The development server will be available at http://localhost:3000/examples/browser/basic-efficient.html in your favorite browser.
It will automatically rebuild `tesseract.min.js` and `worker.min.js` when you change files in the **src** folder.
-### Online Setup with a single Click
-
-You can use Gitpod(A free online VS Code like IDE) for contributing. With a single click it will launch a ready to code workspace with the build & start scripts already in process and within a few seconds it will spin up the dev server so that you can start contributing straight away without wasting any time.
-
-[](https://gitpod.io/#https://github.com/naptha/tesseract.js/blob/master/examples/browser/basic-efficient.html)
-
### Building Static Files
To build the compiled static files just execute the following:
```shell
npm run build
```
This will output the files into the `dist` directory.
+### Run Tests
+**Always confirm the automated tests pass before submitting a pull request.** To run the automated tests locally, run the following commands.