<?xml version="1.0" encoding="ISO-8859-1"?>
<metadatalist>
	<metadata ReferenceType="Conference Proceedings">
		<site>sibgrapi.sid.inpe.br 802</site>
		<identifier>8JMKD3MGPAW/3PFS8CH</identifier>
		<repository>sid.inpe.br/sibgrapi/2017/08.22.02.22</repository>
		<lastupdate>2017:08.22.02.22.06 sid.inpe.br/banon/2001/03.30.15.38 administrator</lastupdate>
		<metadatarepository>sid.inpe.br/sibgrapi/2017/08.22.02.22.06</metadatarepository>
		<metadatalastupdate>2020:02.19.02.01.40 sid.inpe.br/banon/2001/03.30.15.38 administrator {D 2017}</metadatalastupdate>
		<citationkey>Julca-AguilarMaiaHira:2017:TeClCo</citationkey>
		<title>Text/non-text classification of connected components in document images</title>
		<format>On-line</format>
		<year>2017</year>
		<date>Oct. 17-20, 2017</date>
		<numberoffiles>1</numberoffiles>
		<size>1930 KiB</size>
		<author>Julca-Aguilar, Frank Dennis,</author>
		<author>Maia, Ana Lucia Lima Marreiros,</author>
		<author>Hirata, Nina Sumiko Tomita,</author>
		<affiliation>University of São Paulo</affiliation>
		<affiliation>State University of Feira de Santana, University of São Paulo</affiliation>
		<affiliation>University of São Paulo</affiliation>
		<editor>Torchelsen, Rafael Piccin,</editor>
		<editor>Nascimento, Erickson Rangel do,</editor>
		<editor>Panozzo, Daniele,</editor>
		<editor>Liu, Zicheng,</editor>
		<editor>Farias, Mylène,</editor>
		<editor>Viera, Thales,</editor>
		<editor>Sacht, Leonardo,</editor>
		<editor>Ferreira, Nivan,</editor>
		<editor>Comba, João Luiz Dihl,</editor>
		<editor>Hirata, Nina,</editor>
		<editor>Schiavon Porto, Marcelo,</editor>
		<editor>Vital, Creto,</editor>
		<editor>Pagot, Christian Azambuja,</editor>
		<editor>Petronetto, Fabiano,</editor>
		<editor>Clua, Esteban,</editor>
		<editor>Cardeal, Flávio,</editor>
		<e-mailaddress>nina@ime.usp.br</e-mailaddress>
		<conferencename>Conference on Graphics, Patterns and Images, 30 (SIBGRAPI)</conferencename>
		<conferencelocation>Niterói, RJ</conferencelocation>
		<booktitle>Proceedings</booktitle>
		<publisher>IEEE Computer Society</publisher>
		<publisheraddress>Los Alamitos</publisheraddress>
		<tertiarytype>Full Paper</tertiarytype>
		<transferableflag>1</transferableflag>
		<contenttype>External Contribution</contenttype>
		<keywords>text segmentation, connected component, convolutional neural network</keywords>
		<abstract>Text segmentation is an important problem in document analysis applications. We address the problem of classifying connected components of a document image as text or non-text. Inspired by previous work in the literature, in addition to common size- and shape-related features extracted from the components, we also consider component images, with and without context information, as classifier inputs. Multi-layer perceptrons and convolutional neural networks are used to classify the components. High precision and recall are obtained for both text and non-text components.</abstract>
		<language>en</language>
		<targetfile>PID4960469.pdf</targetfile>
		<usergroup>nina@ime.usp.br</usergroup>
		<visibility>shown</visibility>
		<documentstage>not transferred</documentstage>
		<mirrorrepository>sid.inpe.br/banon/2001/03.30.15.38.24</mirrorrepository>
		<nexthigherunit>8JMKD3MGPAW/3PJT9LS</nexthigherunit>
		<nexthigherunit>8JMKD3MGPAW/3PKCC58</nexthigherunit>
		<hostcollection>sid.inpe.br/banon/2001/03.30.15.38</hostcollection>
		<agreement>agreement.html .htaccess .htaccess2</agreement>
		<lasthostcollection>sid.inpe.br/banon/2001/03.30.15.38</lasthostcollection>
		<url>http://sibgrapi.sid.inpe.br/rep-/sid.inpe.br/sibgrapi/2017/08.22.02.22</url>
	</metadata>
</metadatalist>