Releases · scribeocr/scribe.js

v0.8.0

09 Mar 09:39

Balearica

Latest

What's Changed

Added scribe CLI command
- If scribe.js is installed globally (npm i -g scribe.js-ocr), the scribe command can be used to process documents from the command line.
  - For example, scribe recognize analyst_report.png runs OCR on an image and saves the result as a PDF.
- This feature is still experimental and command/argument names and features may change without warning.
Added new intermediate data format .scribe for storing and loading document data.
- Because OCR is computationally expensive, it is often desirable to save results for later use without losing data.
- By saving results to .scribe files, results can be re-loaded later (e.g. to export with slightly different settings).
  - While several other output formats can be re-loaded later (notably .hocr and .pdf), only .scribe can be re-loaded without any data being lost in the export/import process.
  - .scribe files only contain the text layer; they do not contain embedded images or PDF files.
    - .scribe files can be loaded alongside image/PDF files to restore both image and text data.

Full Changelog: v0.7.4...v0.8.0

v0.7.4

03 Mar 08:08

Balearica

What's Changed

Fixed bug causing crash for certain PDF input documents.
Added support for combined bold + italic style (previously, text could be bold or italic but not both).
Added support for underline style.
- Underlined text is currently detected automatically when importing a text-native PDF or Abbyy XML file.
Disabled ligatures by default.
- To re-enable, set scribe.opt.ligatures to true.
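
As a minimal sketch of the ligature setting described above: the only name taken from these notes is `scribe.opt.ligatures`. The `scribe` object below is a stand-in (an assumption) so the snippet runs on its own; in a real project it would be the object provided by the scribe.js-ocr package.

```javascript
// Stand-in for the real scribe object so this sketch runs on its own.
// Assumption: in a real project, `scribe` comes from the scribe.js-ocr package.
const scribe = { opt: { ligatures: false } };

// Ligatures are disabled by default as of v0.7.4; opt back in explicitly.
scribe.opt.ligatures = true;

console.log(scribe.opt.ligatures);
```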

Full Changelog: v0.7.3...v0.7.4

v0.7.3

03 Mar 08:02

Balearica

What's Changed

Updated HTML export to support Node.js

Full Changelog: v0.7.2...v0.7.3

v0.7.2

20 Feb 04:25

Balearica

What's Changed

Added HTML output format (browser only).
- This implementation is preliminary and may change substantially in future versions.
Standardized fonts and font names

Full Changelog: v0.7.1...v0.7.2

v0.7.1

09 Feb 19:46

Balearica

What's Changed

Standardized fonts and font names

Full Changelog: v0.7.0...v0.7.1

v0.7.0

07 Jan 08:38

Balearica

What's Changed

Major rework of PDF export implementation.
- Writing to PDF is faster and uses less memory.
  - Documents that used to crash due to memory errors now run almost instantly.
- For many inputs, output PDF file sizes are now much smaller.
Fixed memory leaks within OCR module.
Misc bug fixes.

Full Changelog: v0.6.1...v0.7.0

v0.6.1

17 Dec 05:25

Balearica

What's Changed

Fixed Node.js support on Windows (#9)
Fixed platform-related installation issues (#27, #29)
Increased use of workers in Node.js version, enabling much better performance using a single process.

Full Changelog: v0.5.1...v0.6.1

v0.5.1

10 Dec 09:30

Balearica

What's Changed

Fixed bug causing crashes when recognizing certain PDFs using Node.js (#26)
Minor updates

Full Changelog: v0.5.0...v0.5.1

v0.5.0

25 Nov 09:08

Balearica

What's Changed

Added config argument to recognize, which allows for passing arguments to Tesseract.js (#22)
Added support for parsing PDF text at various orientations (90/180/270 degrees).
Minor improvements to OCR quality.
Various improvements to imports of HOCR and native PDF text.
Added saveAs utility function for saving files.
Added opt.kerning option that can be used to enable or disable kerning.

Full Changelog: v0.4.1...v0.5.0

v0.4.1

10 Nov 19:24

Balearica

What's Changed

Implemented parallel processing by default for Node.js version
- To restore the previous behavior (1 worker), set scribe.opt.workerN = 1 before calling any functions.
Non-default behavior for extracting text from PDF files is now handled by setting the properties of scribe.opt.usePDFText.
Added Nimbus Mono font (similar to Courier)
Improvements to text extraction from PDF files.
Improvements to text positioning.
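
The single-worker setting mentioned above can be sketched as follows. Only `scribe.opt.workerN` comes from these notes; the helper `useSingleWorker`, the stand-in object, and its initial `null` value are hypothetical and exist only so the snippet runs without the package installed.

```javascript
// Hypothetical helper (not part of scribe.js): forces single-worker mode
// on any object that exposes an `opt` settings object like scribe's.
function useSingleWorker(scribeLike) {
  scribeLike.opt.workerN = 1;
  return scribeLike;
}

// Stand-in for the real scribe object; the initial null value is an assumption.
const scribeStandIn = { opt: { workerN: null } };

// Apply the setting first, before any other scribe function is called.
useSingleWorker(scribeStandIn);

console.log(scribeStandIn.opt.workerN);
```

With the real library, the same assignment would be made on the imported `scribe` object before calling any of its functions, as the note above requires.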

Full Changelog: v0.3.1...v0.4.1

Note: This post combines the changes for 0.4.0 and 0.4.1 because 0.4.0 was the latest version for only a few hours.
