Identity statement area
Reference TypeConference Paper (Conference Proceedings)
Last Update2017: administrator
Metadata Last Update2020: administrator
Citation KeyJulca-AguilarMaiaHira:2017:TeClCo
TitleText/non-text classification of connected components in document images
DateOct. 17-20, 2017
Access Date2021, Jan. 21
Number of Files1
Size1930 KiB
Context area
Author1 Julca-Aguilar, Frank Dennis
2 Maia, Ana Lucia Lima Marreiros
3 Hirata, Nina Sumiko Tomita
Affiliation1 University of São Paulo
2 State University of Feira de Santana, University of São Paulo
3 University of São Paulo
EditorTorchelsen, Rafael Piccin
Nascimento, Erickson Rangel do
Panozzo, Daniele
Liu, Zicheng
Farias, Mylène
Viera, Thales
Sacht, Leonardo
Ferreira, Nivan
Comba, João Luiz Dihl
Hirata, Nina
Schiavon Porto, Marcelo
Vital, Creto
Pagot, Christian Azambuja
Petronetto, Fabiano
Clua, Esteban
Cardeal, Flávio
Conference NameConference on Graphics, Patterns and Images, 30 (SIBGRAPI)
Conference LocationNiterói, RJ
Book TitleProceedings
PublisherIEEE Computer Society
Publisher CityLos Alamitos
Tertiary TypeFull Paper
History2017-08-22 02:22:06 :: -> administrator ::
2020-02-19 02:01:40 :: administrator -> :: 2017
Content and structure area
Is the master or a copy?is the master
Content Stagecompleted
Content TypeExternal Contribution
Keywordstext segmentation, connected component, convolutional neural network.
AbstractText segmentation is an important problem in document analysis applications. We address the problem of classifying connected components of a document image as text or non-text. Inspired by previous work in the literature, besides common size- and shape-related features extracted from the components, we also consider component images, with and without context information, as classifier inputs. Multi-layer perceptrons and convolutional neural networks are used to classify the components. High precision and recall are obtained for both text and non-text components.
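Illustrative sketch (not taken from the paper): the abstract mentions size- and shape-related features extracted from connected components. The minimal Python example below shows one way such features could be computed from a binary image. The 8-connectivity, the particular feature set (width, height, area, aspect ratio, extent), and the function names connected_components and shape_features are all assumptions made for illustration; the record does not specify which features or classifier inputs the authors actually used.

```python
from collections import deque


def connected_components(image):
    """Return the 8-connected foreground components (pixel value 1) as lists of (row, col)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def shape_features(pixels):
    """Compute simple bounding-box features for one component (assumed feature set)."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    area = len(pixels)
    return {
        "width": width,
        "height": height,
        "area": area,
        "aspect_ratio": width / height,
        "extent": area / (width * height),  # fraction of the bounding box that is filled
    }


# Toy binary image: a 2x2 blob and a 1x3 vertical stroke.
image = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
]
features = [shape_features(p) for p in connected_components(image)]
```

Feature vectors of this kind could then be passed to a classifier such as a multi-layer perceptron, while the cropped component images themselves would be the natural input for a convolutional network.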
source Directory Contentthere are no files
agreement Directory Content
agreement.html 21/08/2017 23:22 1.2 KiB 
Conditions of access and use area
Target FilePID4960469.pdf
Update Permissionnot transferred
Allied materials area
Next Higher Units8JMKD3MGPAW/3PJT9LS
Notes area