A Solution to Reconstruct Cross-Cut Shredded Text Documents Based on Character Recognition and Genetic Algorithm

The reconstruction of destroyed paper documents has attracted growing interest in recent years. The topic is relevant to forensics, investigative sciences, and archeology. Previous research on the reconstruction of cross-cut shredded text documents (RCCSTD) is mainly based on...

Bibliographic Details
Main Authors: Hedong Xu, Jing Zheng, Ziwei Zhuang, Suohai Fan
Format: Article
Language: English
Published: Wiley 2014-01-01
Series: Abstract and Applied Analysis
Online Access: http://dx.doi.org/10.1155/2014/829602
collection DOAJ
description The reconstruction of destroyed paper documents has attracted growing interest in recent years. The topic is relevant to forensics, investigative sciences, and archeology. Previous research on the reconstruction of cross-cut shredded text documents (RCCSTD) is mainly based on likelihood measures and traditional heuristic algorithms. This paper presents a feature-matching algorithm based on character recognition, supported by a purpose-built database of letters, which reconstructs the shredded document in three stages: row clustering, intrarow splicing, and interrow splicing. Row clustering groups the fragments with a clustering algorithm applied to their clustering vectors. Intrarow splicing is treated as a travelling salesman problem and solved with an improved genetic algorithm. Finally, interrow splicing assembles the document according to the line spacing and the proximity of the fragments. Computational experiments suggest that the presented algorithm achieves high precision and efficiency and that it may be applicable to cross-cut shredded text documents of different sizes.
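The intrarow splicing step described above can be illustrated with a minimal sketch. It assumes the fragments of one row are already clustered and that a hypothetical matrix `cost[i][j]` gives the mismatch when fragment `i` is placed directly to the left of fragment `j`. The genetic algorithm below is a plain textbook variant (order crossover, swap mutation, tournament selection, elitism), not the improved algorithm of the paper; all function names and parameters are illustrative assumptions.

```python
import random


def path_cost(order, cost):
    """Total mismatch between neighbouring fragments in a left-to-right order."""
    return sum(cost[a][b] for a, b in zip(order, order[1:]))


def order_crossover(p1, p2, rng):
    """Order crossover (OX): copy a slice from p1, fill the gaps in p2's order."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = p1[i:j + 1]
    kept = set(child[i:j + 1])
    fill = [g for g in p2 if g not in kept]
    for k in range(n):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child


def splice_row(cost, pop_size=60, generations=200, mutation_rate=0.2, seed=0):
    """Search for a low-cost left-to-right order of the fragments in one row.

    `cost` is an assumed n x n mismatch matrix (hypothetical input); the
    result is a permutation of range(n), not guaranteed to be optimal.
    """
    rng = random.Random(seed)
    n = len(cost)
    if n < 2:
        return list(range(n))
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda o: path_cost(o, cost))
        next_gen = population[:2]  # elitism: carry over the two best orders
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random candidates per parent.
            p1 = min(rng.sample(population, 3), key=lambda o: path_cost(o, cost))
            p2 = min(rng.sample(population, 3), key=lambda o: path_cost(o, cost))
            child = order_crossover(p1, p2, rng)
            if rng.random() < mutation_rate:
                a, b = rng.sample(range(n), 2)  # swap mutation
                child[a], child[b] = child[b], child[a]
            next_gen.append(child)
        population = next_gen
    return min(population, key=lambda o: path_cost(o, cost))
```

A call such as `splice_row(cost)` returns fragment indices in their proposed left-to-right order. The path is open rather than a closed tour because a text row has a left and a right end; in practice the mismatch matrix would come from edge features such as the recognized characters at each fragment boundary.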
id doaj-art-7f05415943d547559092abc07a90a410
institution Kabale University
issn 1085-3375
1687-0409
affiliation School of Information Science and Technology, Jinan University, Guangzhou 510632, China (Hedong Xu, Jing Zheng, Ziwei Zhuang, Suohai Fan)