Identity statement area
Reference Type: Conference Proceedings
Last Update: 2013
Metadata Last Update: 2020: administrator
Citation Key: Otiniano-RodríguezCáma:2013:FiSpRe
Title: Finger Spelling Recognition from RGB-D Information using Kernel Descriptor
Date: Aug. 5-8, 2013
Access Date: 2020, Dec. 05
Number of Files: 1
Size: 1580 KiB
Context area
Author: 1. Otiniano-Rodríguez, K.; 2. Cámara-Chávez, G.
Affiliation: 1. Federal University of Ouro Preto; 2. Federal University of Ouro Preto
Editor: Boyer, Kim; Hirata, Nina; Nedel, Luciana; Silva, Claudio
Conference Name: Conference on Graphics, Patterns and Images, 26 (SIBGRAPI)
Conference Location: Arequipa, Peru
Book Title: Proceedings
Publisher: IEEE Computer Society
Publisher City: Los Alamitos
History: 2013-07-12 22:55:42 :: -> administrator ::
2020-02-19 03:09:22 :: administrator -> :: 2013
Content and structure area
Is the master or a copy?: is the master
Document Stage: completed
Content Type: External Contribution
Tertiary Type: Full Paper
Keywords: sign language, finger spelling, support vector machine (SVM), bag-of-visual-words
Abstract: Deaf people use systems of communication based on sign language and finger spelling. Manual spelling, or finger spelling, is a system in which each letter of the alphabet is represented by a unique and discrete hand movement. RGB and depth images can be used to characterize the hand shapes corresponding to letters of the alphabet. The advantage of depth cameras over color cameras for gesture recognition is most evident during hand segmentation. In this paper, we propose a hybrid approach to finger spelling recognition using RGB-D information from the Kinect sensor. In a first stage, the hand area is segmented from the background using the depth map, and the precise hand shape is extracted using both the depth and color data from the Kinect sensor. Motivated by the performance of kernel-based features, owing to their simplicity and their ability to turn any type of pixel attribute into patch-level features, we use the gradient kernel descriptor for feature extraction from depth images. The Scale-Invariant Feature Transform (SIFT) is used to describe the content of the RGB image. The Bag-of-Visual-Words approach is then used to extract semantic information. Finally, these features are used as input to our Support Vector Machine (SVM) classifier. The performance of this approach is quantitatively and qualitatively evaluated on a dataset of real images of American Sign Language (ASL) hand shapes. Three experiments were performed: one combining RGB and depth information, and two using only RGB or only depth information. The database used is composed of 120,000 images. According to our experiments, our approach achieves an accuracy of 91.26% when RGB and depth information are combined, outperforming other state-of-the-art methods.
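The classification stage described in the abstract (local descriptors → visual vocabulary → Bag-of-Visual-Words histograms → SVM) can be sketched as follows. This is a minimal illustration assuming scikit-learn; the random descriptor generator stands in for SIFT / gradient kernel descriptors, and the vocabulary size and SVM settings are placeholders, not the paper's actual parameters.

```python
# Hedged sketch of the BoVW + SVM stage. The descriptors here are random
# stand-ins for the SIFT (RGB) and gradient kernel (depth) features used
# in the paper; all hyperparameters below are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def extract_descriptors(image_id, n=50, dim=128):
    # Placeholder for per-image local feature extraction.
    # Shifts the mean per class so the toy data is separable.
    return rng.normal(loc=image_id % 3, size=(n, dim))

def bovw_histogram(descriptors, vocab):
    # Quantize each local descriptor to its nearest visual word,
    # then build a normalized word-frequency histogram.
    words = vocab.predict(descriptors)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / hist.sum()

# 1. Build the visual vocabulary by clustering all training descriptors.
train_ids = list(range(30))
labels = [i % 3 for i in train_ids]  # stand-in letter classes
all_desc = np.vstack([extract_descriptors(i) for i in train_ids])
vocab = KMeans(n_clusters=16, n_init=10, random_state=0).fit(all_desc)

# 2. Encode each image as a BoVW histogram and train the SVM classifier.
X = np.array([bovw_histogram(extract_descriptors(i), vocab) for i in train_ids])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.score(X, labels))
```

In the paper's hybrid setting, the RGB and depth descriptors would each contribute to the representation before classification; here a single descriptor stream is shown for brevity.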
source Directory Content: there are no files
agreement Directory Content: agreement.html 12/07/2013 19:55 0.7 KiB
Conditions of access and use area
Target File: final paper.pdf
Allied materials area
Notes area