<?xml version="1.0" encoding="ISO-8859-1"?>
<metadatalist>
	<metadata ReferenceType="Conference Proceedings">
		<identifier>8JMKD3MGPAW/3S4ELD8</identifier>
		<repository>sid.inpe.br/sibgrapi/2018/10.24.00.46</repository>
		<metadatarepository>sid.inpe.br/sibgrapi/2018/10.24.00.46.42</metadatarepository>
		<site>sibgrapi.sid.inpe.br 802</site>
		<citationkey>KuboNazAguOliDua:2018:UsUNPr</citationkey>
		<author>Kubo, Diandra Akemi,</author>
		<author>Nazare, Tiago Santana de,</author>
		<author>Aguirre, Priscila Louise Ribeiro,</author>
		<author>Oliveira, Bruno Domingues,</author>
		<author>Duarte, Felipe Simões Lage Gomes,</author>
		<affiliation>Data Science Team - Itau Unibanco</affiliation>
		<affiliation>Data Science Team - Itau Unibanco</affiliation>
		<affiliation>Data Science Team - Itau Unibanco</affiliation>
		<affiliation>Data Science Team - Itau Unibanco</affiliation>
		<affiliation>Data Science Team - Itau Unibanco</affiliation>
		<title>The usage of U-Net for pre-processing document images</title>
		<conferencename>Conference on Graphics, Patterns and Images, 31 (SIBGRAPI)</conferencename>
		<year>2018</year>
		<editor>Ross, Arun,</editor>
		<editor>Gastal, Eduardo S. L.,</editor>
		<editor>Jorge, Joaquim A.,</editor>
		<editor>Queiroz, Ricardo L. de,</editor>
		<editor>Minetto, Rodrigo,</editor>
		<editor>Sarkar, Sudeep,</editor>
		<editor>Papa, João Paulo,</editor>
		<editor>Oliveira, Manuel M.,</editor>
		<editor>Arbeláez, Pablo,</editor>
		<editor>Mery, Domingo,</editor>
		<editor>Oliveira, Maria Cristina Ferreira de,</editor>
		<editor>Spina, Thiago Vallin,</editor>
		<editor>Mendes, Caroline Mazetto,</editor>
		<editor>Costa, Henrique Sérgio Gutierrez,</editor>
		<editor>Mejail, Marta Estela,</editor>
		<editor>Geus, Klaus de,</editor>
		<editor>Scheer, Sergio,</editor>
		<booktitle>Proceedings</booktitle>
		<date>Oct. 29 - Nov. 1, 2018</date>
		<publisheraddress>Porto Alegre</publisheraddress>
		<publisher>Sociedade Brasileira de Computação</publisher>
		<conferencelocation>Foz do Iguaçu, PR, Brazil</conferencelocation>
		<keywords>#deep-learning #computer-vision #image-processing.</keywords>
		<abstract>When processing documents in real-world scenarios, it is common to deal with artifacts that may hamper document analysis, such as stamps, noise and unusual backgrounds. To mitigate these problems, we propose the use of U-Net, a very successful biomedical image segmentation network, for handwritten and machine text segmentation. To do so, we trained one model for each type of text. One of the main advantages is that the models are trained on artificial data, avoiding the wearisome task of data labeling. For the machine text segmentation model, we evaluate its impact on both word and character recognition when combined with the Tesseract OCR model. For the handwritten segmentation model, we present qualitative results. Initial experiments indicate that both models are able to improve results in their respective applications.</abstract>
		<language>en</language>
		<tertiarytype>Industry Application Paper</tertiarytype>
		<format>On-line</format>
		<size>906 KiB</size>
		<numberoffiles>1</numberoffiles>
		<targetfile>sibgrapi_pi_cv.pdf</targetfile>
		<lastupdate>2018:10.24.00.46.42 sid.inpe.br/banon/2001/03.30.15.38 felipe.duarte@itau-unibanco.com.br</lastupdate>
		<metadatalastupdate>2020:02.20.22.06.51 sid.inpe.br/banon/2001/03.30.15.38 administrator {D 2018}</metadatalastupdate>
		<mirrorrepository>sid.inpe.br/banon/2001/03.30.15.38.24</mirrorrepository>
		<e-mailaddress>felipe.duarte@itau-unibanco.com.br</e-mailaddress>
		<usergroup>felipe.duarte@itau-unibanco.com.br</usergroup>
		<visibility>shown</visibility>
		<transferableflag>1</transferableflag>
		<hostcollection>sid.inpe.br/banon/2001/03.30.15.38</hostcollection>
		<documentstage>not transferred</documentstage>
		<nexthigherunit>8JMKD3MGPAW/3RPADUS</nexthigherunit>
		<agreement>agreement.html .htaccess .htaccess2</agreement>
		<lasthostcollection>sid.inpe.br/banon/2001/03.30.15.38</lasthostcollection>
		<url>http://sibgrapi.sid.inpe.br/rep-/sid.inpe.br/sibgrapi/2018/10.24.00.46</url>
	</metadata>
</metadatalist>