Identity statement area
Reference Type: Conference Proceedings
Last Update: 2018: administrator
Metadata Last Update: 2020: administrator
Citation Key: Cardenas:2018:MuHuAc
Title: Multimodal Human Action Recognition Based on a Fusion of Dynamic Images using CNN descriptors
Date: Oct. 29 - Nov. 1, 2018
Access Date: 2020, Dec. 04
Number of Files: 1
Size: 2214 KiB
Context area
Author: Cardenas, Edwin Jonathan Escobedo
Affiliation: Federal University of Ouro Preto
Editor: Ross, Arun
Gastal, Eduardo S. L.
Jorge, Joaquim A.
Queiroz, Ricardo L. de
Minetto, Rodrigo
Sarkar, Sudeep
Papa, João Paulo
Oliveira, Manuel M.
Arbeláez, Pablo
Mery, Domingo
Oliveira, Maria Cristina Ferreira de
Spina, Thiago Vallin
Mendes, Caroline Mazetto
Costa, Henrique Sérgio Gutierrez
Mejail, Marta Estela
Geus, Klaus de
Scheer, Sergio
Conference Name: Conference on Graphics, Patterns and Images, 31 (SIBGRAPI)
Conference Location: Foz do Iguaçu, PR, Brazil
Book Title: Proceedings
Publisher: IEEE Computer Society
Publisher City: Los Alamitos
History: 2018-09-10 20:04:03 :: -> administrator ::
2020-02-19 03:10:45 :: administrator -> :: 2018
Content and structure area
Is the master or a copy?: is the master
Document Stage: completed
Document Stage: not transferred
Content Type: External Contribution
Tertiary Type: Full Paper
Keywords: action recognition, dynamic images, RGB-D data, Kinect, CNN
Abstract: In this paper, we propose a dynamic-image-based approach to action recognition. Specifically, we exploit the multimodal information recorded by a Kinect sensor (RGB-D and skeleton joint data). We combine several ideas from rank pooling and skeleton optical spectra to generate dynamic images that summarize an action sequence into single flow images. We group our dynamic images into five groups: a dynamic color group (DC), a dynamic depth group (DD), and three dynamic skeleton groups (DXY, DYZ, DXZ). As an action is composed of different postures over time, we generate N different dynamic images capturing the main postures for each dynamic group. Next, we apply a pre-trained flow-CNN to extract spatiotemporal features with a max-mean aggregation. The proposed method was evaluated on a public benchmark dataset, UTD-MHAD, and achieved state-of-the-art results.
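The abstract's pipeline rests on two steps that can be sketched in NumPy: collapsing a frame sequence into a single dynamic image, and pooling per-image CNN features with a max-mean rule. This is a rough illustration, not the paper's implementation: the weights below are the standard closed-form approximate rank-pooling coefficients from the dynamic-image literature, and the exact max-mean combination (here, the average of element-wise max and mean) is an assumption.

```python
import numpy as np

def approximate_rank_pooling(frames):
    """Collapse a frame sequence of shape (T, H, W[, C]) into one dynamic image.

    Uses the closed-form approximate rank-pooling weights
    alpha_t = 2t - T - 1 (t = 1..T), a common stand-in for full rank
    pooling; the paper's construction may differ in detail.
    """
    frames = np.asarray(frames, dtype=np.float64)
    T = frames.shape[0]
    t = np.arange(1, T + 1)
    alpha = 2.0 * t - T - 1.0
    # Weighted sum over the time axis: later frames get larger weights.
    return np.tensordot(alpha, frames, axes=(0, 0))

def max_mean_aggregate(features):
    """Hypothetical max-mean aggregation of per-dynamic-image CNN
    features of shape (N, D): average of the element-wise max and the
    element-wise mean across the N dynamic images of a group."""
    features = np.asarray(features, dtype=np.float64)
    return 0.5 * (features.max(axis=0) + features.mean(axis=0))
```

In practice each of the five groups (DC, DD, DXY, DYZ, DXZ) would yield N dynamic images, each passed through the pre-trained flow-CNN before aggregation.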
Source Directory Content: there are no files
Agreement Directory Content:
agreement.html 10/09/2018 17:04 1.2 KiB
Conditions of access and use area
Target File: Multimodal_Human_Action_Recognition_Based_on_a_Fusion_of_Dynamic_Images_using_CNN_descriptors.pdf
Allied materials area
Next Higher Units: 8JMKD3MGPAW/3RPADUS
Notes area
Empty Fields: accessionnumber archivingpolicy archivist area callnumber copyholder copyright creatorhistory descriptionlevel dissemination doi edition electronicmailaddress group holdercode isbn issn label lineage mark nextedition notes numberofvolumes orcid organization pages parameterlist parentrepositories previousedition previouslowerunit progress project readergroup readpermission resumeid rightsholder secondarydate secondarykey secondarymark secondarytype serieseditor session shorttitle sponsor subject tertiarymark type url versiontype volume