OpenALPR exits with segmentation fault with custom traineddata

Question

I'm using OpenALPR and I have trained the OCR to recognize Mandatory font. When I try to use that traineddata, alpr exits with segmentation fault.

I'm using version 1.2.0 and tesseract 3.03, with leptonica-1.71

When I run it with gdb, I get the following stack trace:

(gdb) bt
#0  0x00007ffff67ded7b in tesseract::Classify::ComputeCharNormArrays(FEATURE_STRUCT*, INT_TEMPLATES_STRUCT*, unsigned char*, unsigned char*) () from /usr/local/lib/libtesseract.so.3
#1  0x00007ffff67e3b6b in tesseract::Classify::CharNormTrainingSample(bool, int, tesseract::TrainingSample const&, GenericVector<tesseract::UnicharRating>*) () from /usr/local/lib/libtesseract.so.3
#2  0x00007ffff6806882 in tesseract::TessClassifier::UnicharClassifySample(tesseract::TrainingSample const&, Pix*, int, int, GenericVector<tesseract::UnicharRating>*) () from /usr/local/lib/libtesseract.so.3
#3  0x00007ffff67e1b22 in tesseract::Classify::CharNormClassifier(TBLOB*, tesseract::TrainingSample const&, ADAPT_RESULTS*) () from /usr/local/lib/libtesseract.so.3
#4  0x00007ffff67e1c95 in tesseract::Classify::DoAdaptiveMatch(TBLOB*, ADAPT_RESULTS*) () from /usr/local/lib/libtesseract.so.3
#5  0x00007ffff67e1f24 in tesseract::Classify::AdaptiveClassifier(TBLOB*, BLOB_CHOICE_LIST*) () from /usr/local/lib/libtesseract.so.3
#6  0x00007ffff67d993d in tesseract::Wordrec::call_matcher(TBLOB*) () from /usr/local/lib/libtesseract.so.3
#7  0x00007ffff67d9986 in tesseract::Wordrec::classify_blob(TBLOB*, char const*, C_COL, BlamerBundle*) () from /usr/local/lib/libtesseract.so.3
#8  0x00007ffff67d6bab in tesseract::Wordrec::classify_piece(GenericVector<SEAM*> const&, short, short, char const*, TWERD*, BlamerBundle*) () from /usr/local/lib/libtesseract.so.3
#9  0x00007ffff67c808d in tesseract::Wordrec::chop_word_main(WERD_RES*) () from /usr/local/lib/libtesseract.so.3
#10 0x00007ffff67d9821 in tesseract::Wordrec::cc_recog(WERD_RES*) () from /usr/local/lib/libtesseract.so.3
#11 0x00007ffff6716e62 in tesseract::Tesseract::recog_word_recursive(WERD_RES*) () from /usr/local/lib/libtesseract.so.3
#12 0x00007ffff6716ff5 in tesseract::Tesseract::recog_word(WERD_RES*) () from /usr/local/lib/libtesseract.so.3
#13 0x00007ffff6708160 in tesseract::Tesseract::tess_segment_pass_n(int, WERD_RES*) () from /usr/local/lib/libtesseract.so.3
#14 0x00007ffff66cf2a5 in tesseract::Tesseract::match_word_pass_n(int, WERD_RES*, ROW*, BLOCK*) () from /usr/local/lib/libtesseract.so.3
#15 0x00007ffff66cf482 in tesseract::Tesseract::classify_word_pass1(tesseract::WordData*, WERD_RES*) () from /usr/local/lib/libtesseract.so.3
#16 0x00007ffff66d26d6 in tesseract::Tesseract::classify_word_and_language(void (tesseract::Tesseract::*)(tesseract::WordData*, WERD_RES*), tesseract::WordData*) () from /usr/local/lib/libtesseract.so.3
#17 0x00007ffff66d2dea in tesseract::Tesseract::RecogAllWordsPassN(int, ETEXT_DESC*, GenericVector<tesseract::WordData>*) () from /usr/local/lib/libtesseract.so.3
#18 0x00007ffff66d3701 in tesseract::Tesseract::recog_all_words(PAGE_RES*, ETEXT_DESC*, TBOX const*, char const*, int) () from /usr/local/lib/libtesseract.so.3
#19 0x00007ffff66c203d in tesseract::TessBaseAPI::Recognize(ETEXT_DESC*) () from /usr/local/lib/libtesseract.so.3
#20 0x00000000004aadf4 in OCR::performOCR (this=0x93d7c0, pipeline_data=0x7fffe0a3a9b0) at /opt/openalpr/src/openalpr/ocr.cpp:79
#21 0x000000000048845b in plateAnalysisThread (arg=0x7fffffffd380) at /opt/openalpr/src/openalpr/alpr_impl.cpp:261
#22 0x00000000004df217 in tthread::thread::wrapper_function (aArg=0x93d210) at /opt/openalpr/src/openalpr/support/tinythread.cpp:169
#23 0x00007ffff6401182 in start_thread (arg=0x7fffe0a3b700) at pthread_create.c:312
#24 0x00007ffff590e00d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Is your traineddata working with Tesseract ? If you sure its not your fault, you can maybe post a bug report on OpenALPR project ! — Alto
I did a workaround. I was too big for a comment but is not so conclusive for an answer. — Diogo Melo

Diogo Melo Diogo Melo · Accepted Answer · 2015-01-16T02:46:42

I was debugging tesseract when I saw that the error occurs because shape_table_->getShape(id) was being called with id >= to shape_table_'s size, on file adaptmatch.cpp.

As a workaround, I changed the code to check the size first and skip the iteration instead of exit with segfault.

It is possible that this workaround will have undesirable consequences, but at least it stopped breaking. Here is the diff:

diff --git a/classify/adaptmatch.cpp b/classify/adaptmatch.cpp
index 0eaf144..b21d980 100644
--- a/classify/adaptmatch.cpp
+++ b/classify/adaptmatch.cpp
@@ -1148,7 +1148,7 @@ void Classify::ExpandShapesAndApplyCorrections(
     fontinfo_id = ClassAndConfigIDToFontOrShapeID(class_id, int_result.Config);
     fontinfo_id2 = ClassAndConfigIDToFontOrShapeID(class_id,
                                                    int_result.Config2);
-    if (shape_table_ != NULL) {
+    if (shape_table_ != NULL && fontinfo_id < shape_table_->NumShapes()) {
       // Actually fontinfo_id is an index into the shape_table_ and it
       // contains a list of unchar_id/font_id pairs.
       int shape_id = fontinfo_id;
@@ -1781,10 +1781,12 @@ void Classify::ComputeCharNormArrays(FEATURE_STRUCT* norm_feature,
         int font_set_id = templates->Class[id]->font_set_id;
         const FontSet &fs = fontset_table_.get(font_set_id);
         for (int config = 0; config < fs.size; ++config) {
-          const Shape& shape = shape_table_->GetShape(fs.configs[config]);
-          for (int c = 0; c < shape.size(); ++c) {
-            if (char_norm_array[shape[c].unichar_id] < pruner_array[id])
-              pruner_array[id] = char_norm_array[shape[c].unichar_id];
+          if (shape_table_->NumShapes() > fs.configs[config]) {
+            const Shape shape = shape_table_->GetShape(fs.configs[config]);
+            for (int c = 0; c < shape.size(); ++c) {
+              if (char_norm_array[shape[c].unichar_id] < pruner_array[id])
+                pruner_array[id] = char_norm_array[shape[c].unichar_id];
+            }
           }
         }
       }

OpenALPR exits with segmentation fault with custom traineddata

1 Answers