Patent Counselors IP (PCIP) - IP Counseling and Prosecution - Patent and Trademark - David Tran - 11538461 - Language Agnostic Missing Subtitle Detection

U.S. Patent No. 11,538,461 - Prepared by Attorney David Tran for Amazon.com, Inc. and filed by Weaver (WAVS IP)

Brief Description: Some implementations include methods for detecting missing subtitles associated with a media presentation. The methods may include receiving an audio component and a subtitle component associated with the media presentation, the audio component including an audio sequence, the audio sequence divided into a plurality of audio segments; evaluating the plurality of audio segments using a combination of a recurrent neural network and a convolutional neural network to identify refined speech segments associated with the audio sequence, the recurrent neural network trained based on a plurality of languages, the convolutional neural network trained based on a plurality of categories of sound; determining timestamps associated with the identified refined speech segments; and determining missing subtitles based on the timestamps associated with the identified refined speech segments and timestamps associated with subtitles included in the subtitle component.

This disclosure describes techniques for identifying missing subtitles associated with a media presentation. The media presentation may include an audio component and a subtitle component. The subtitle component may include timestamps associated with subtitles. The techniques may include receiving an audio sequence associated with the audio component. The audio sequence may be divided into a plurality of audio segments of a first duration. For example, the first duration may be 800 milliseconds (ms). Each of the audio segments may be processed using a voice activity detection (VAD) network to determine whether the audio segment is a speech segment. The VAD network may be configured to perform operations associated with a recurrent neural network. The VAD network may be trained to detect speech, using a plurality of different languages and a plurality of samples, so that the VAD network may be language agnostic.
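The timestamp comparison described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the per-segment speech flags (which stand in for the output of the VAD and sound-classification networks), and the rule that any overlap of at least 1 ms counts as coverage are all assumptions made for illustration. Only the 800 ms segment duration comes from the description.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start_ms, end_ms)

SEGMENT_MS = 800  # example first duration given in the description


def split_into_segments(total_ms: int, segment_ms: int = SEGMENT_MS) -> List[Interval]:
    """Divide an audio sequence of total_ms into fixed-duration segments."""
    return [(start, min(start + segment_ms, total_ms))
            for start in range(0, total_ms, segment_ms)]


def merge_speech_segments(segments: List[Interval],
                          is_speech: List[bool]) -> List[Interval]:
    """Join adjacent segments flagged as speech into longer intervals.

    is_speech is a hypothetical placeholder for the networks' per-segment
    decisions; no model is run here.
    """
    merged: List[Interval] = []
    for (start, end), flag in zip(segments, is_speech):
        if not flag:
            continue
        if merged and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end)  # extend the previous interval
        else:
            merged.append((start, end))
    return merged


def find_missing_subtitles(speech: List[Interval],
                           subtitles: List[Interval],
                           min_overlap_ms: int = 1) -> List[Interval]:
    """Return speech intervals that no subtitle interval overlaps.

    The overlap threshold is an assumed rule, not taken from the patent.
    """
    missing: List[Interval] = []
    for s_start, s_end in speech:
        covered = any(min(s_end, sub_end) - max(s_start, sub_start) >= min_overlap_ms
                      for sub_start, sub_end in subtitles)
        if not covered:
            missing.append((s_start, s_end))
    return missing


# Usage: a 4-second clip with one subtitle covering only the first utterance.
segments = split_into_segments(4000)
speech = merge_speech_segments(segments, [True, True, False, True, False])
print(find_missing_subtitles(speech, [(100, 1500)]))
```

In this example the second stretch of speech (2400 ms to 3200 ms) has no overlapping subtitle, so it is the only interval reported as missing.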


About Attorney-Client Relationship

The information provided on this website does not, and is not intended to, constitute legal advice. Contacting PCIP by phone, email or by using an online contact form does not establish an attorney-client relationship. Please do not send any confidential information to us until such time as a formal attorney-client relationship has been established.

Images courtesy of Pixabay.com

Contact Us

(408) 800-6223


2200 Eastridge Loop, #730102, San Jose, CA 95173
