Metadata

1. Identity statement

Reference Type Conference Paper (Conference Proceedings)

Site sibgrapi.sid.inpe.br

Holder Code ibi 8JMKD3MGPEW34M/46T9EHH

Identifier 8JMKD3MGPEW34M/45E54GS

Repository sid.inpe.br/sibgrapi/2021/09.13.19.25

Last Update 2021:10.13.16.06.07 (UTC) administrator

Metadata Repository sid.inpe.br/sibgrapi/2021/09.13.19.25.26

Metadata Last Update 2022:09.10.00.16.17 (UTC) administrator

Citation Key CarvalhoBorg:2021:CoStTe

Title A Comparative Study of Text Document Representation Approaches Using Point Placement-based Visualizations

Format On-line

Year 2021

Access Date 2025, July 03

Number of Files 1

Size 4243 KiB

2. Context

Author 1 Carvalho, Hevelyn Sthefany Lima de; 2 Borges, Vinicius Ruela Pereira

Affiliation 1 University of Brasília; 2 University of Brasília

Editor Paiva, Afonso; Menotti, David; Baranoski, Gladimir V. G.; Proença, Hugo Pedro; Junior, Antonio Lopes Apolinario; Papa, João Paulo; Pagliosa, Paulo; dos Santos, Thiago Oliveira; e Sá, Asla Medeiros; da Silveira, Thiago Lopes Trugillo; Brazil, Emilio Vital; Ponti, Moacir A.; Fernandes, Leandro A. F.; Avila, Sandra

e-Mail Address hevelyn.sthefany@gmail.com

Conference Name Conference on Graphics, Patterns and Images, 34 (SIBGRAPI)

Conference Location Gramado, RS, Brazil (virtual)

Date 18-22 Oct. 2021

Publisher Sociedade Brasileira de Computação

Publisher City Porto Alegre

Book Title Proceedings

Tertiary Type Undergraduate Work

History (UTC) 2021-10-13 16:06:07 :: hevelyn.sthefany@gmail.com -> administrator :: 2021; 2022-09-10 00:16:17 :: administrator -> :: 2021

3. Content and structure

Is the master or a copy? is the master

Content Stage completed

Transferable 1

Keywords visualization word-embedding feature extraction text multidimensional scaling

Abstract In natural language processing, text representation plays an important role that can affect the performance of language models and machine learning algorithms. Basic vector space models, such as term frequency-inverse document frequency (TF-IDF), have become popular approaches to represent text documents. In recent years, approaches based on word embeddings have been proposed to preserve the meaning and semantic relations of words, phrases and texts. In this paper, we study the influence of different text representations on the quality of layouts generated by state-of-the-art visualizations based on point placement. For that purpose, a visualization-assisted approach is proposed to support users when exploring such representations in classification tasks. Experiments on two public labeled corpora were conducted to assess the quality of the layouts and to discuss possible relations to classification performance. The results are promising, indicating that the proposed approach can guide users to understand the relevant patterns of a corpus in each representation.

Arrangement urlib.net > SDLA > Fonds > SIBGRAPI 2021 > A Comparative Study...

doc Directory Content access

source Directory Content there are no files

agreement Directory Content

agreement.html 13/09/2021 16:25 1.3 KiB

4. Conditions of access and use

data URL http://urlib.net/ibi/8JMKD3MGPEW34M/45E54GS

zipped data URL http://urlib.net/zip/8JMKD3MGPEW34M/45E54GS

Language en

Target File WUW-9.pdf

User Group hevelyn.sthefany@gmail.com

Visibility shown

5. Allied materials

Mirror Repository sid.inpe.br/banon/2001/03.30.15.38.24

Next Higher Units 8JMKD3MGPEW34M/45PQ3RS

Citing Item List sid.inpe.br/sibgrapi/2021/11.12.11.46 90 sid.inpe.br/banon/2001/03.30.15.38.24 2

Host Collection sid.inpe.br/banon/2001/03.30.15.38

6. Notes

Empty Fields archivingpolicy archivist area callnumber contenttype copyholder copyright creatorhistory descriptionlevel dissemination documentstage doi edition electronicmailaddress group isbn issn label lineage mark nextedition notes numberofvolumes orcid organization pages parameterlist parentrepositories previousedition previouslowerunit progress project readergroup readpermission resumeid rightsholder schedulinginformation secondarydate secondarykey secondarymark secondarytype serieseditor session shorttitle sponsor subject tertiarymark type url versiontype volume